When `cargo run` is run in the workspace directory, it can see two
executables:
- `zebrad`
- `zebra_checkpoints`
Adding `default-run = "zebrad"` to `zebrad/Cargo.toml` makes the
workspace run `zebrad` by default. (Even though it's redundant for the
`zebrad` crate itself.)
Because the new version of the prometheus exporter launches its own
single-threaded runtime on a dedicated worker thread, there's no need
for the tokio and hyper versions it uses internally to align with the
versions used in other crates. So we don't need to use our fork with
tokio 0.3, and can just use the published alpha. Advancing to a later
alpha may fix the missing-metrics issues.
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.
To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.
Co-authored-by: teor <teor@riseup.net>
The metrics code becomes much simpler because the current version of the
metrics crate builds its own single-threaded runtime on a dedicated worker
thread, so no dependency on the main Zebra Tokio runtime is required.
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware. This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method. See
ddc64e8d4d
for more context on the Tower changes. To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
This makes two changes relative to the existing download code:
1. It uses a oneshot to attempt to cancel the download task after it
has started;
2. It encapsulates the download creation and cancellation logic into a
Downloads struct.
* Split tracing component code into modules.
* Repatriate Tracing and simplify config handling.
We upstreamed our Tracing component, expecting not to have to exert fine
control over the tracing settings. But this turned out not to be the case, and
now that we want to do other things (flamegraphs, journalctl, opentelemetry,
etc), we end up with really awkward code (as in the current flamegraph
handling).
This also makes use of the changes to `init()` to load the config early to pass
configuration data into the components, which avoids the need for the
refactoring in #775.
Finally, we restore support for the `-v` flag when the filter is unset. Closes#831.
* Disable tracing and metrics endpoints by default.
Closes#660.
* Switch back to upstream Abscissa.
* Integrate flamegraph support into the new Tracing component.
* Pass -v in acceptance tests to get info-level output.
* Clean up acceptance test code.
* Setup tracing-flame for use profiling zebrad
* start work on conditional flamegraph generation
* review time!
* update comments
* Update Cargo.toml
* disable default features for inferno
* reorganize
* missing one trait
* Apply suggestions from code review
* graceful shutdown!
* remove special case handling on ctrlc for cleanup
* rename signal fn to better represent its responsibility
* remove unused global hook for flushing flamegraph
* move tracing logic to the right file
* just copy linkerd's signal handling logic
* update book
* make zebrad app drop on shutdown normally
* Update zebrad/src/components/tokio.rs
Co-authored-by: teor <teor@riseup.net>
* Update zebrad/src/application.rs
Co-authored-by: teor <teor@riseup.net>
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* cleanup a little
* ooh yea there's an API for that
* setup env-filter for backup subscriber
* document env filter
* document return codes
* forgot to save
* Update book/src/applications/zebrad.md
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case
Co-authored-by: Jane Lusby <jane@zfnd.org>
We had a brief discussion on discord and it seemed like we had consensus on the
following versioning policy:
* zebrad: match major version to NU version, so we will start by releasing
zebrad 3.0.0;
* zebra-* libraries: start by matching zebrad's version, then increment major
versions of each library as we need to make breaking changes (potentially
faster than the zebrad version, always respecting semver but making no
guarantees about the longevity of major releases).
This commit sets all of the crate versions to 3.0.0-alpha.0 -- the -alpha.0
marks it as a prerelease not subject to perfect adherence to compatibility
guarantees.