Rework cluster metrics dashboard to support the modern clusters

Michael Vines 2020-03-11 10:21:53 -07:00
parent 0ef9d79056
commit 5f5824d78d
9 changed files with 58 additions and 55 deletions
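
Reviewer note: the renaming applied across the changed files follows one pattern, which can be sketched in Python as a reading aid. The helper `dashboard_url` and the constant `BASE_URL` are hypothetical and not part of this commit; the uid and slug patterns are inferred from the dashboard URLs and the `DASHBOARD=` values in the diffs.

```python
# Hypothetical helper, not part of this commit: derives the renamed
# dashboard URL for a release channel from the naming used in the diffs.
BASE_URL = "https://metrics.solana.com:3000/d"


def dashboard_url(channel: str) -> str:
    """Return the new dashboard URL for 'edge', 'beta' or 'stable'."""
    if channel not in ("edge", "beta", "stable"):
        raise ValueError(f"Invalid channel: {channel}")
    # The stable channel carries no suffix; edge and beta append "-<channel>".
    suffix = "" if channel == "stable" else f"-{channel}"
    uid = f"monitor{suffix}"             # previously "testnet<suffix>"
    slug = f"cluster-telemetry{suffix}"  # previously "testnet-monitor<suffix>"
    return f"{BASE_URL}/{uid}/{slug}"


for channel in ("edge", "beta", "stable"):
    print(channel, dashboard_url(channel))
```

Links that still use the old `testnet-*` uids are not covered by this sketch; whether Grafana keeps serving them after the rename is not something the diff shows.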

@ -24,8 +24,8 @@ solana transaction-count
Inspect the network explorer at
[https://explorer.solana.com/](https://explorer.solana.com/) for activity.
View the [metrics dashboard](https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta?var-testnet=testnet)
for more detail on cluster activity.
View the [metrics dashboard](https://metrics.solana.com:3000/d/monitor/cluster-telemetry) for more
detail on cluster activity.
## Confirm your Installation

@ -5,7 +5,7 @@ testnet participants, [https://discord.gg/pquxPsq](https://discord.gg/pquxPsq).
## Useful Links & Discussion
* [Network Explorer](http://explorer.solana.com/)
* [Testnet Metrics Dashboard](https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?refresh=60s&orgId=2)
* [Testnet Metrics Dashboard](https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge?refresh=60s&orgId=2)
* Validator chat channels
* [\#validator-support](https://discord.gg/rZsenD) General support channel for any Validator related queries.
* [\#tourdesol](https://discord.gg/BdujK2) Discussion and support channel for Tour de SOL participants ([What is Tour de SOL?](https://solana.com/tds/)).
@ -14,6 +14,6 @@ testnet participants, [https://discord.gg/pquxPsq](https://discord.gg/pquxPsq).
* [Core software repo](https://github.com/solana-labs/solana)
* [Tour de SOL Docs](https://docs.solana.com/tour-de-sol)
* [TdS repo](https://github.com/solana-labs/tour-de-sol)
* [TdS metrics dashboard](https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge?refresh=1m&from=now-15m&to=now&var-testnet=tds&orgId=2&var-datasource=TdS%20Metrics%20%28read-only%29)
* [TdS metrics dashboard](https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge?refresh=1m&from=now-15m&to=now&var-testnet=tds)
Can't find what you're looking for? Send an email to ryan@solana.com or reach out to @rshea\#2622 on Discord.

@ -6,7 +6,7 @@ description: Where to go after you've read this guide
* [Solana Docs](https://docs.solana.com/)
* [Network Explorer](http://explorer.solana.com/)
* [TdS metrics dashboard](https://metrics.solana.com:3000/d/testnet/testnet-monitor?refresh=1m&from=now-15m&to=now&orgId=2&var-datasource=Solana%20Metrics%20(read-only)&var-testnet=tds&var-hostid=All9)
* [TdS metrics dashboard](https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge?refresh=1m&from=now-15m&to=now&var-testnet=tds)
* Validator chat channels
* [\#validator-support](https://discord.gg/rZsenD) General support channel for any Validator related queries that dont fall under Tour de SOL.
* [\#tourdesol](https://discord.gg/BdujK2) Discussion and support channel for Tour de SOL participants.

@ -4,13 +4,14 @@
There are three versions of the testnet dashboard, corresponding to the three
release channels:
* https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge
* https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta
* https://metrics.solana.com:3000/d/testnet/testnet-monitor
* https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge
* https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta
* https://metrics.solana.com:3000/d/monitor/cluster-telemetry
The dashboard for each channel is defined from the
`metrics/testnet-monitor.json` source file in the git branch associated with
that channel, and deployed by automation running `ci/publish-metrics-dashboard.sh`.
`metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json` source
file in the git branch associated with that channel, and deployed by automation
running `ci/publish-metrics-dashboard.sh`.
A deploy can be triggered at any time via the `New Build` button of
https://buildkite.com/solana-labs/publish-metrics-dashboard.
@ -18,7 +19,7 @@ https://buildkite.com/solana-labs/publish-metrics-dashboard.
### Modifying a Dashboard
Dashboard updates are accomplished by modifying
`metrics/scripts/grafana-provisioning/dashboards/testnet-monitor.json`,
`metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json`,
**manual edits made directly in Grafana will be overwritten**.
* Check out metrics to add at https://metrics.solana.com:8888/ in the data explorer.
@ -32,13 +33,13 @@ Dashboard updates are accomplished by modifying
`Settings` menu for the dashboard
3. Edit dashboard as desired
4. Extract the JSON Model by selecting `JSON Model` in the `Settings` menu. Copy the JSON to the clipboard
and paste into `metrics/scripts/grafana-provisioning/dashboards/testnet-monitor.json`,
and paste into `metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json`,
5. Delete your development dashboard: `Settings` => `Delete`
### Deploying a Dashboard Manually
If you need to immediately deploy a dashboard using the contents of
`testnet-monitor.json` in your local workspace,
`cluster-monitor.json` in your local workspace,
```
$ export GRAFANA_API_TOKEN="an API key from https://metrics.solana.com:3000/org/apikeys"
$ metrics/publish-metrics-dashboard.sh (edge|beta|stable)

@ -11,13 +11,13 @@ fi
case $CHANNEL in
edge)
DASHBOARD=testnet-monitor-edge
DASHBOARD=cluster-telemetry-edge
;;
beta)
DASHBOARD=testnet-monitor-beta
DASHBOARD=cluster-telemetry-beta
;;
stable)
DASHBOARD=testnet-monitor
DASHBOARD=cluster-telemetry
;;
*)
echo "Error: Invalid CHANNEL=$CHANNEL"
@ -31,7 +31,7 @@ if [[ -z $GRAFANA_API_TOKEN ]]; then
exit 1
fi
DASHBOARD_JSON=scripts/grafana-provisioning/dashboards/testnet-monitor.json
DASHBOARD_JSON=scripts/grafana-provisioning/dashboards/cluster-monitor.json
if [[ ! -r $DASHBOARD_JSON ]]; then
echo Error: $DASHBOARD_JSON not found
fi

@ -21,7 +21,7 @@ with open(dashboard_json, 'r') as read_file:
data = json.load(read_file)
if channel == 'local':
data['title'] = 'Local Testnet Monitor'
data['title'] = 'Local Cluster Monitor'
data['uid'] = 'local'
data['links'] = []
data['templating']['list'] = [{'current': {'text': '$datasource',
@ -66,10 +66,9 @@ if channel == 'local':
'useTags': False}]
elif channel == 'stable':
# Stable dashboard only allows the user to select between the stable
# testnet databases
data['title'] = 'Testnet Monitor'
data['uid'] = 'testnet'
# Stable dashboard only allows the user to select between public clusters
data['title'] = 'Cluster Telemetry'
data['uid'] = 'monitor'
data['templating']['list'] = [{'current': {'text': '$datasource',
'value': '$datasource'},
'hide': 1,
@ -81,20 +80,26 @@ elif channel == 'stable':
'regex': '',
'type': 'datasource'},
{'allValue': None,
'current': {'text': 'testnet',
'value': 'testnet'},
'current': {'text': 'Developer Testnet',
'value': 'devnet'},
'hide': 1,
'includeAll': False,
'label': 'Testnet',
'multi': False,
'name': 'testnet',
'options': [{'selected': False,
'text': 'testnet',
'value': 'testnet'},
{'selected': True,
'text': 'testnet-perf',
'value': 'testnet-perf'}],
'query': 'testnet,testnet-perf',
'options': [{'selected': True,
'text': 'Developer Testnet',
'value': 'devnet'},
{'selected': False,
'text': 'Mainnet Beta',
'value': 'mainnet-beta'},
{'selected': False,
'text': 'Tour de SOL Testnet',
'value': 'tds'},
{'selected': False,
'text': 'Soft Launch Testnet',
'value': 'cluster'}],
'query': 'devnet,mainnet-beta,tds,cluster',
'type': 'custom'},
{'allValue': ".*",
'datasource': '$datasource',
@ -114,10 +119,9 @@ elif channel == 'stable':
'type': 'query',
'useTags': False}]
else:
# Non-stable dashboard only allows the user to select between all testnet
# databases
data['title'] = 'Testnet Monitor ({})'.format(channel)
data['uid'] = 'testnet-' + channel
# Non-stable dashboard includes all the dev clusters
data['title'] = 'Cluster Telemetry ({})'.format(channel)
data['uid'] = 'monitor-' + channel
data['templating']['list'] = [{'current': {'text': '$datasource',
'value': '$datasource'},
'hide': 1,
@ -129,8 +133,8 @@ else:
'regex': '',
'type': 'datasource'},
{'allValue': ".*",
'current': {'text': 'testnet',
'value': 'testnet'},
'current': {'text': 'Developer Testnet',
'value': 'devnet'},
'datasource': '$datasource',
'hide': 1,
'includeAll': False,
@ -140,7 +144,7 @@ else:
'options': [],
'query': 'show databases',
'refresh': 1,
'regex': 'testnet.*',
'regex': '(devnet|cluster|tds|mainnet-beta|testnet.*)',
'sort': 1,
'tagValuesQuery': '',
'tags': [],

@ -27,21 +27,21 @@
"title": "Stable",
"tooltip": "",
"type": "link",
"url": "https://metrics.solana.com:3000/d/testnet/testnet-monitor"
"url": "https://metrics.solana.com:3000/d/monitor/cluster-telemetry"
},
{
"icon": "dashboard",
"tags": [],
"title": "Beta",
"type": "link",
"url": "https://metrics.solana.com:3000/d/testnet-beta/testnet-monitor-beta"
"url": "https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta"
},
{
"icon": "dashboard",
"tags": [],
"title": "Edge",
"type": "link",
"url": "https://metrics.solana.com:3000/d/testnet-edge/testnet-monitor-edge"
"url": "https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge"
}
],
"panels": [
@ -4618,7 +4618,7 @@
},
"yaxes": [
{
"format": "µs",
"format": "\u00b5s",
"label": null,
"logBase": 1,
"max": null,
@ -5385,7 +5385,7 @@
},
"yaxes": [
{
"format": "µs",
"format": "\u00b5s",
"label": null,
"logBase": 1,
"max": null,
@ -5752,7 +5752,7 @@
},
"yaxes": [
{
"format": "µs",
"format": "\u00b5s",
"label": null,
"logBase": 1,
"max": null,
@ -6727,7 +6727,7 @@
},
"yaxes": [
{
"format": "µs",
"format": "\u00b5s",
"label": null,
"logBase": 1,
"max": null,
@ -10181,7 +10181,6 @@
"list": [
{
"current": {
"selected": true,
"text": "$datasource",
"value": "$datasource"
},
@ -10197,9 +10196,8 @@
{
"allValue": ".*",
"current": {
"selected": false,
"text": "testnet",
"value": "testnet"
"text": "Developer Testnet",
"value": "devnet"
},
"datasource": "$datasource",
"hide": 1,
@ -10210,7 +10208,7 @@
"options": [],
"query": "show databases",
"refresh": 1,
"regex": "testnet.*",
"regex": "(devnet|cluster|tds|mainnet-beta|testnet.*)",
"sort": 1,
"tagValuesQuery": "",
"tags": [],
@ -10269,7 +10267,7 @@
]
},
"timezone": "",
"title": "Testnet Monitor (edge)",
"uid": "testnet-edge",
"title": "Cluster Telemetry (edge)",
"uid": "monitor-edge",
"version": 2
}
}
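
Reviewer note: the widened `regex` on the `testnet` template variable decides which databases appear in the dashboard's selector. The sketch below compares the old and new patterns against a few database names. It approximates the filter with Python's `re.search`, which is an assumption about how Grafana applies the pattern, and the names in `names` are hypothetical examples rather than a list of real databases.

```python
import re

# Patterns copied from the old and new "regex" fields in the diff.
OLD_FILTER = re.compile(r"testnet.*")
NEW_FILTER = re.compile(r"(devnet|cluster|tds|mainnet-beta|testnet.*)")


def selectable(pattern, databases):
    """Return the names the pattern keeps (approximated with re.search)."""
    return [name for name in databases if pattern.search(name)]


# Hypothetical database names, for illustration only.
names = ["devnet", "mainnet-beta", "tds", "cluster", "testnet-perf", "_internal"]

print("old:", selectable(OLD_FILTER, names))
print("new:", selectable(NEW_FILTER, names))
```

Neither pattern is anchored, so under this approximation a name that merely contains `cluster` or `tds` would also be kept. Anchoring with `^` and `$` may be worth considering if that is not intended.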

@ -34,7 +34,7 @@ source lib/config.sh
if [[ ! -f lib/grafana-provisioning ]]; then
cp -va grafana-provisioning lib
./adjust-dashboard-for-channel.py \
lib/grafana-provisioning/dashboards/testnet-monitor.json local
lib/grafana-provisioning/dashboards/cluster-monitor.json local
mkdir -p lib/grafana-provisioning/datasources
cat > lib/grafana-provisioning/datasources/datasource.yml <<EOF

@ -106,7 +106,7 @@ function upload_results_to_slack() {
BUILDKITE_BUILD_URL="https://buildkite.com/solana-labs/"
fi
GRAFANA_URL="https://metrics.solana.com:3000/d/testnet-${CHANNEL:-edge}/testnet-monitor-${CHANNEL:-edge}?var-testnet=${TESTNET_TAG:-testnet-automation}&from=${TESTNET_START_UNIX_MSECS:-0}&to=${TESTNET_FINISH_UNIX_MSECS:-0}"
GRAFANA_URL="https://metrics.solana.com:3000/d/monitor-${CHANNEL:-edge}/cluster-telemetry-${CHANNEL:-edge}?var-testnet=${TESTNET_TAG:-testnet-automation}&from=${TESTNET_START_UNIX_MSECS:-0}&to=${TESTNET_FINISH_UNIX_MSECS:-0}"
[[ -n $RESULT_DETAILS ]] || RESULT_DETAILS="Undefined"
[[ -n $TEST_CONFIGURATION ]] || TEST_CONFIGURATION="Undefined"