4f7e45bb24
This change addresses a few minor issues:

1. When we send points to the `MetricsWriter`, we call `Instant::now()` twice, using the first result in the metrics stats and the second for `last_write_time`. On the next upload, we then use `last_write_time` as the reference point. Metrics are uploaded with a network call, which is far from instantaneous, so this creates a minor discrepancy in our time reporting. We do not need to call `Instant::now()` twice at all: the same value can serve both the stats and `last_write_time`.
2. We did not report metrics stats when no points had accumulated. It seems better to always report them, including when no points have been accumulated. In practice, this does not happen for the validator, as validators always report something during a 10-second accumulation interval.
3. We did not upload any points when the metrics thread was exiting. This may cause a small number of metrics not to be reported.
4. `collect_points()` always converted both `points` and `counters` into a vector of `DataPoint`, even if the final length exceeded the specified `max_points`. On `mainnet-beta` we see up to 5m points lost, so dropping them sooner could be a small optimization.
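The following is a minimal sketch of points 1, 2, and 4, not the crate's actual code. `Point`, `WriteStats`, `cap_points`, and `record_write` are hypothetical names introduced for illustration only. The sketch assumes the caller takes one timestamp per upload and passes it in, so the same value feeds both the stats and the next `last_write_time`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a metrics data point (not the crate's `DataPoint`).
#[derive(Debug, Clone, PartialEq)]
pub struct Point {
    pub name: String,
    pub value: i64,
}

/// Hypothetical per-upload statistics.
#[derive(Debug, PartialEq)]
pub struct WriteStats {
    pub points_written: usize,
    pub points_dropped: usize,
    pub since_last_write: Duration,
}

/// Keeps at most `max_points` entries and returns how many were dropped.
/// Applying the cap first means no later conversion work is spent on
/// points that would be discarded anyway.
pub fn cap_points(mut points: Vec<Point>, max_points: usize) -> (Vec<Point>, usize) {
    let dropped = points.len().saturating_sub(max_points);
    points.truncate(max_points);
    (points, dropped)
}

/// Builds the stats for one upload from a single timestamp `now` and
/// returns that same timestamp as the new `last_write_time`.
/// Stats are produced even when `points` is empty.
pub fn record_write(
    points: Vec<Point>,
    max_points: usize,
    last_write_time: Instant,
    now: Instant,
) -> (Vec<Point>, WriteStats, Instant) {
    let (kept, dropped) = cap_points(points, max_points);
    let stats = WriteStats {
        points_written: kept.len(),
        points_dropped: dropped,
        since_last_write: now.duration_since(last_write_time),
    };
    (kept, stats, now)
}
```

Passing `now` in as a parameter, rather than calling `Instant::now()` inside the function, keeps the two uses of the timestamp identical by construction and makes the behavior easy to check with fixed instants.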
Metrics
InfluxDB
To explore validator-specific metrics from mainnet-beta, testnet, or devnet, you can use Chronograf:
- https://metrics.solana.com:8888/ (production environment)
- https://metrics.solana.com:8889/ (testing environment)
For local cluster deployments you should use:
Public Grafana Dashboards
There are three main public dashboards for cluster-related metrics:
- https://metrics.solana.com/d/monitor-edge/cluster-telemetry
- https://metrics.solana.com/d/0n54roOVz/fee-market
- https://metrics.solana.com/d/UpIWbId4k/ping-result
For local cluster deployments you should use:
Cluster Telemetry
The cluster telemetry dashboard shows the current state of the cluster:
- Cluster Stability
- Validator Streamer
- Tower Consensus
- IP Network
- Snapshots
- RPC Send Transaction Service
Fee Market
The fee market dashboard shows:
- Total Prioritization Fees
- Block Min Prioritization Fees
- Cost Tracker Stats
Ping Results
The ping results dashboard displays relevant information about the Ping API.