* Add minimum queued block height to metrics
* Add metrics to sent block cache
* Add metrics for the last sent finalized height
* Refactor sent metrics
* Add new block channel metrics to a grafana dashboard
* Use a named CancelHeartbeatTask unit struct for the channel type
* Prefer cancel handles in selects, if both are ready
* Fix message metrics to just show the command name
* Add metrics for internal requests and responses
* Add internal requests and responses to the messages dashboard
* Add a canceled metric, and peer addresses to request and response metrics
* Add a canceled messages graph
* Add connection state metrics for currently open connections
* Fix the connection state graph with new metrics
* Always send an error before dropping pending responses
* Move error detail logging into `fail_with`
* Delete an unused timer future
* Make error strings in metrics less verbose
* Downgrade some error logs to info
* Remove a redundant expect
* Avoid unnecessary allocations for connection state metrics
* Fix missed updates to mempool and block gossip metrics
* Rename tx downloader & verifier metrics
* Add version to mempool metrics
* Add new metrics
* Make sure mempool gauges are zeroed when instances are dropped
* Updated mempool grafana dashboard
* Removed transaction verification dashboard; moved to mempool
* Update mempool dashboard
* Add reason to error labels in mempool dashboard
* Rename some metrics per review
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Check for state errors before updating metrics
Previously, the metrics would be updated for some rejected blocks.
* Clarify and expand block verification metrics
Rename checkpoint-specific metrics to clarify their purpose.
Add metrics for:
- finalized blocks on disk
- blocks verified using the full block verifier
(this metric was previously incorrectly called `zcash_chain_verified_block_height`)
* Update dashboard metric names
Also:
- add some extra block height metrics
- fix a dashboard name
* Add exact block heights to Grafana dashboards
* Add a missing comment
* grafana: use 0 decimals for metrics
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* grafana: show the entire height instead of abbreviated
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* grafana: show the entire height instead of abbreviated
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Fix typo in metric name
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Move height gauges to the state, so they are correct
If we update height gauges in futures, they can execute out of order,
so the metrics can be incorrect.
Instead:
- move the height gauges to the state, and update them based on the best tip
- move the verified block counts to the state
- continue to include all verified blocks on all non-finalized chains
(not just the best chain)
* Show exact checkpoint heights in the dashboard
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Add metrics gauges for the most recent peer network protocol version
This gague lets us join the initial seeds to the network protocol versions,
even if the peer upgrades and reconnects with a different version.
* Ensure dashboard peer network versions are unique
Otherwise, prometheus returns an error,
and the dashboard shows no data.
* Make seeder labels more readable
- put labels to the right of the graph
- remove default ports
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Add tracing and metrics for seed peer DNS resolution
* Add a grafana dashboard for seed peers
Currently this just shows the initial peer count from each seed.
* Add tracing and metrics for peer network protocol versions
* Update peers dashboard with network protocol versions
* Show peer network protocol versions for each seeder in dashboard
* Add per-seed filter to dashboard
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Minimal recent sync lengths implementation
Also includes metrics and logging, to make diagnosing bugs easier.
* Add logging to check what happens when Zebra reaches the chain tip
* Add tests for recent sync lengths
- initially empty
- pruned to correct length
- newest entries go first
* Drop a redundant `/` from a Cargo.lock URL
This seems to be a nightly or beta Rust change,
but hopefully stable just accepts it.
* Use metrics histograms to avoid overwriting values
* Add detailed syncer monitoring dashboard
* Increase the recent sync length to 4
This length makes it easier to distinguish between temporary and
sustained errors/syncs.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>