Metrics
Testnet Grafana Dashboard
There are three versions of the testnet dashboard, corresponding to the three release channels:
- https://metrics.solana.com:3000/d/monitor-edge/cluster-telemetry-edge
- https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta
- https://metrics.solana.com:3000/d/monitor/cluster-telemetry
The dashboard for each channel is defined from the metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json source file in the git branch associated with that channel, and deployed by automation running ci/publish-metrics-dashboard.sh.
A deploy can be triggered at any time via the New Build button of https://buildkite.com/solana-labs/publish-metrics-dashboard.
Modifying a Dashboard
Dashboard updates are accomplished by modifying metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json. Manual edits made directly in Grafana will be overwritten.
- Check out metrics to add at https://metrics.solana.com:8888/ in the data explorer.
- When editing a query for a dashboard graph, use the "Toggle Edit Mode" selection behind the hamburger button to use raw SQL and copy the query into the text field. You may have to fix up the query with dashboard variables like $testnet or $timeFilter; check other functioning fields in the dashboard for examples.
- Open the desired dashboard in Grafana
- Create a development copy of the dashboard by selecting Save As.. in the Settings menu for the dashboard
- Edit dashboard as desired
- Extract the JSON Model by selecting JSON Model in the Settings menu. Copy the JSON to the clipboard and paste into metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json
- Delete your development dashboard: Settings => Delete
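Because the JSON Model is pasted in by hand, a truncated or partial paste is easy to miss. The following is a minimal sketch of a local sanity check, not part of this repository: the function name check_dashboard_json is hypothetical, and it assumes python3 is available on your PATH. It only confirms that the file parses as JSON; it does not validate the Grafana dashboard schema.

```shell
# Hypothetical helper (not in the repo): succeeds only if the file parses as JSON.
# Assumes python3 is installed; json.tool exits non-zero on a parse error.
check_dashboard_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Example usage from the repository root:
#   check_dashboard_json metrics/scripts/grafana-provisioning/dashboards/cluster-monitor.json \
#     && echo "JSON parses" || echo "JSON is malformed"
```

Running a check like this before committing catches syntax errors earlier than waiting for the automated deploy to fail.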
Deploying a Dashboard Manually
If you need to immediately deploy a dashboard using the contents of cluster-monitor.json in your local workspace:
$ export GRAFANA_API_TOKEN="an API key from https://metrics.solana.com:3000/org/apikeys"
$ metrics/publish-metrics-dashboard.sh (edge|beta|stable)
Note that automation will eventually overwrite your manual deploy.
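The two commands above depend on a token being exported and on one of three channel names. As a hedged sketch only, a small pre-flight function can catch a missing token or a mistyped channel before the publish script is invoked. The name preflight is hypothetical and this check is not part of publish-metrics-dashboard.sh; the channel names are taken from the usage line above.

```shell
# Hypothetical pre-flight check; not part of publish-metrics-dashboard.sh.
# Returns 0 only for a known channel with GRAFANA_API_TOKEN set.
preflight() {
  channel="$1"
  case "$channel" in
    edge|beta|stable) ;;  # channels listed in the usage line above
    *) echo "unknown channel: $channel" >&2; return 1 ;;
  esac
  if [ -z "${GRAFANA_API_TOKEN:-}" ]; then
    echo "GRAFANA_API_TOKEN is not set" >&2
    return 1
  fi
}

# Example usage:
#   preflight beta && metrics/publish-metrics-dashboard.sh beta
```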