Commit Graph

486 Commits

Author SHA1 Message Date
Kirill Fomichev 43e792b9a4
Update to vergen 5, add branch, commit time, and build target to the panic metadata, automatically update app version from crate version (#2029)
* build(deps): bump vergen from 3.2.0 to 5.1.1

* fix hardcoded version for Tracing struct

* add additional metadata

* remove extra allocations for metadata

* Remove zebrad code version from release checklist

The zebrad code automatically uses the crate version now.

* Sort panic metadata into rough categories

Co-authored-by: teor <teor@riseup.net>
2021-04-20 06:48:14 +10:00
teor 1e4e5924ca clippy: factor common code out of an if-else block 2021-04-14 23:16:45 -04:00
dependabot[bot] 7d36a5e2c3 build(deps): bump color-eyre from 0.5.10 to 0.5.11
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.10 to 0.5.11.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Changelog](https://github.com/yaahc/color-eyre/blob/v0.5.11/CHANGELOG.md)
- [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.10...v0.5.11)

Signed-off-by: dependabot[bot] <support@github.com>
2021-04-14 09:22:27 -04:00
teor a417c7c8c7 Use meaningful names for select! variables 2021-04-13 23:56:16 -04:00
Alfredo Garcia 5ec05e91e1 update version strings for v1.0.0-alpha.6 2021-04-08 18:48:34 -04:00
teor 5bf0a2954e
Update a test comment 2021-04-07 19:25:31 +10:00
teor bb3147dd1e
Update an outdated comment 2021-04-07 19:23:48 +10:00
teor 306fa88214 Document the correctness of Poll::Pending wakeups 2021-03-27 08:55:49 -04:00
teor 829a6f11c5 Document the behaviour of the `select!` macro 2021-03-27 08:55:49 -04:00
Deirdre Connolly 7efc700aca
Merge pull request #1713 from ZcashFoundation/use-groth16-batch-math
Use batch optimizations, load params in groth16::Verifier, verify Spend & Output descriptions in transaction verifier
2021-03-24 12:28:25 -04:00
Deirdre Connolly ca1d2de87d
Bump versions for v1.0.0-alpha.5 (#1932)
Zebra's latest alpha checkpoints on Canopy activation, continues our work on NU5, and fixes a security issue.

Some notable changes include:

## Added
- Log address book metrics when PeerSet or CandidateSet don't have many peers (#1906)
- Document test coverage workflow (#1919)
- Add a final job to CI, so we can easily require all the CI jobs to pass (#1927)

## Changed
- Zebra has moved its mandatory checkpoint from Sapling to Canopy (#1898, #1926)
  - This is a breaking change for users that depend on the exact height of the mandatory checkpoint.

## Fixed
- tower-batch: wake waiting workers on close to avoid hangs (#1908)
- Assert that pre-Canopy blocks use checkpointing (#1909)
- Fix CI disk space usage by disabling incremental compilation in coverage builds (#1923)

## Security
- Stop relying on unchecked length fields when preallocating vectors (#1925)
2021-03-22 22:05:01 -04:00
teor 74cc30c307 Change the cached sync tests to canopy
This change requires a cached state rebuild. The rebuilt state will be
significantly larger.
2021-03-18 10:13:47 +10:00
Alfredo Garcia d49eaab68e
Bump versions for zebrad 1.0.0-alpha.4 (#1913)
* Bump versions for zebrad 1.0.0-alpha.4

* add Cargo.lock
2021-03-16 21:12:37 -03:00
Jack Grigg bae9a7ecd5 Expose binary data in metrics
This enables slicing and aggregating metrics based on zebrad version:
https://www.robustperception.io/exposing-the-software-version-to-prometheus
2021-03-17 09:38:07 +10:00
dependabot[bot] b618f5b522 build(deps): bump tracing-subscriber from 0.2.16 to 0.2.17
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.16 to 0.2.17.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.16...tracing-subscriber-0.2.17)

Signed-off-by: dependabot[bot] <support@github.com>
2021-03-14 19:37:24 -04:00
teor 1e1859f5a3
Merge pull request #1887 from ZcashFoundation/revert-1877-revert-1789-large-sync-testnet-disable
Revert "Revert "Disable unreliable `sync_large_checkpoints_testnet`""

This disables the failing large testnet sync test.
2021-03-12 12:31:19 +10:00
teor d494af1e90 Document how the syncer resists memory DoS 2021-03-11 06:24:46 -05:00
teor c6358b157c Reduce inbound concurrency to limit memory usage
Inbound malicious blocks can use a large amount of RAM when
deserialized. Limit inbound concurrency, so that the total amount
of RAM remains small.
2021-03-11 06:24:46 -05:00
teor 475deaf655
Adjust the crawl interval and acceptance test timeout (#1878) 2021-03-11 07:53:37 +10:00
teor ac4611ffc4 Revert "Disable unreliable `sync_large_checkpoints_testnet` (#1789)"
This reverts commit bae49e54df.
2021-03-10 02:14:09 -05:00
dependabot[bot] 70327dc9f5 build(deps): bump once_cell from 1.6.0 to 1.7.0
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.6.0 to 1.7.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.6.0...v1.7.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-02-25 15:32:03 -05:00
teor fb6acfaff7
Update the acceptance test port range (#1812)
Windows can reserve or use ports up to 53500.

Windows and macOS sequentially allocate ephemeral ports,
starting at 41952.
2021-02-25 09:27:56 +10:00
dependabot[bot] dab65b33eb build(deps): bump once_cell from 1.5.2 to 1.6.0
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.5.2 to 1.6.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.5.2...v1.6.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-02-23 10:59:40 -05:00
teor 7558f74c78 Bump versions for zebrad 1.0.0-alpha.3 2021-02-23 10:39:13 -05:00
dependabot[bot] 304d7682f5 build(deps): bump tracing-subscriber from 0.2.15 to 0.2.16
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.15 to 0.2.16.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.15...tracing-subscriber-0.2.16)

Signed-off-by: dependabot[bot] <support@github.com>
2021-02-22 01:38:42 -05:00
Alfredo Garcia bae49e54df
Disable unreliable `sync_large_checkpoints_testnet` (#1789)
* delete `sync_large_checkpoints_testnet`
2021-02-19 21:40:01 +00:00
teor b0bc4a79c9 Disable conflict failure cleanup on macOS 2021-02-20 06:22:11 +10:00
teor af12f20732 Re-enable macOS conflict tests
We disabled these tests pending #1613. But the comment incorrectly said
we were waiting for #1631.
2021-02-20 06:22:11 +10:00
teor c51fd688ee Skip node2.is_running() on Windows
`node2.is_running()` can return `true` on Windows, even though `node2`
has logged a panic. This cleanup code only runs if `node2` fails to panic
and exit as expected. So it's ok for us to skip it.

See #1781 for details.
2021-02-19 18:03:07 +10:00
teor 631fe22422 Fix conflict test node termination
On Windows, if a process is killed after it is dead, it returns `true`
for `was_killed`. Instead, check if the process is running before killing
it.

Also make the section where processes are running as short as possible,
and include context for both processes in every error.
2021-02-19 18:03:07 +10:00
teor bcb9cead88
Kill running acceptance test nodes on error (#1770)
And show the output from those nodes.

These changes help us diagnose errors that happen while one or more
acceptance test nodes are running.
2021-02-19 06:46:27 +10:00
teor e61b5e50a2
Diagnostics for CI port conflict failures (#1766)
Log a "Trying..." message before each listener opens, to see if the
delay is inside Zebra, or in the test harness or OS.

Also report the configured and actual ports where possible, for better
diagnostics.
2021-02-18 12:15:09 -03:00
teor 972103d797 Fix tracing macro syntax 2021-02-17 11:09:22 -05:00
teor 253d1c02b3 Make sync logging a bit less verbose
And tweak some log content
2021-02-17 11:09:22 -05:00
teor cc7d5bd2ad
Update comments for the inbound service (#1740) 2021-02-16 06:14:40 +10:00
teor 372a432179
Update the call_all comment in Inbound (#1737) 2021-02-16 06:14:16 +10:00
teor e8d8a49f41
Increase the conflict acceptance test launch delay (#1736)
* Increase the conflict acceptance test launch delay

Also rename the tests - the listener is for the Zcash protocol,
but the state, metrics, and tracing are Zebra-specific.

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-02-16 05:52:30 +10:00
dependabot[bot] af11cc4815 build(deps): bump vergen from 3.1.0 to 3.2.0
Bumps [vergen](https://github.com/rustyhorde/vergen) from 3.1.0 to 3.2.0.
- [Release notes](https://github.com/rustyhorde/vergen/releases)
- [Commits](https://github.com/rustyhorde/vergen/compare/v3.1.0...v3.2.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-02-10 21:36:03 -05:00
teor 0b76352468
Document a state_contains bug (#1715)
* Document a state_contains bug in the syncer and Inbound
2021-02-10 09:05:14 +10:00
Deirdre Connolly 0c5daa8410 Bump versions for zebrad 1.0.0-alpha.2
Including tower-batch bump to 0.2.0, tower-fallback to 0.2.0, zebra-script to 1.0.0-alpha.3
2021-02-09 16:14:29 -05:00
teor dce11358d7
Log when the syncer awaits peer readiness (#1714) 2021-02-10 07:09:27 +10:00
Alfredo Garcia d7c40af2a8
Fix shutdown panics (#1637)
* add a shutdown flag in zebra_chain::shutdown
* fix network panic on shutdown
* fix checkpoint panic on shutdown
2021-02-03 19:03:28 +10:00
teor 6679a124e3 Require Inbound setup handlers to provide a result
Rather than having them default to `Ok(())`, which is incorrect
for some error handlers.
2021-02-03 08:32:10 +10:00
teor 09c8c89462 Make sure FailedInit never escapes Inbound::poll_ready 2021-02-03 08:32:10 +10:00
teor 134a5e78bd Consistently use `network_setup` for the Inbound Setup 2021-02-03 08:32:10 +10:00
teor 1c8362fe01 Remove unused imports 2021-02-03 08:32:10 +10:00
Jane Lusby 4cf331562c combine network setup into an exhaustive match 2021-02-03 08:32:10 +10:00
Jane Lusby 4d6ef89248 avoid using async blocks to avoid lifetime bug with generators 2021-02-03 08:32:10 +10:00
Jane Lusby 685a592399 Add clonable wrapper around TryRecvError 2021-02-03 08:32:10 +10:00
teor 6ffeb670ed Log the failed response in an unreachable panic 2021-02-03 08:32:10 +10:00
teor eac4fd181a Add a Setup enum to manage Inbound network setup internal state
This change encodes a bunch of invariants in the type system,
and adds explicit failure states for:
* a closed oneshot,
* bugs in the initialization code.
2021-02-03 08:32:10 +10:00
teor 32b032204a Consistently return Response::Nil during setup
And log an info-level message as a diagnostic, in case setup takes a
long time.
2021-02-03 08:32:10 +10:00
teor 94eb91305b Stop using ServiceExt::call_all due to buffer bugs
ServiceExt::call_all leaks Tower::Buffer reservations, so we can't use
it in Zebra.

Instead, use a loop in the returned future.

See #1593 for details.
2021-02-03 08:32:10 +10:00
teor 64bc45cd2e Fix state readiness hangs for Inbound
Use `ServiceExt::oneshot` to perform state requests.

Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
2021-02-03 08:32:10 +10:00
teor 4d1a2fd02e Make the Inbound invariant clearer 2021-02-03 08:32:10 +10:00
teor 2a25b9ee72 Remove services that are never `call`ed from Inbound
Uses the `ServiceExt::oneshot` design pattern from #1593.
2021-02-03 08:32:10 +10:00
Alfredo Garcia 4b34482264
Add hints to port conflict and lock file panics (#1535)
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflics
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests

* Add a full set of stderr acceptance test functions

This change makes the stdout and stderr acceptance test interfaces
identical.

* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path

* Use Display for state path logs

Avoids weird escaping on Windows when using Debug

* Add Windows conflict error messages

* Turn PORT_IN_USE_ERROR into a regex

And add another alternative Windows-specific port error

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
2021-01-29 22:36:33 +10:00
teor 24f1b9bad1
Document the Inbound service in the start module (#1653) 2021-01-29 22:19:06 +10:00
teor 21b0360114 Limit concurrent inbound gossipped block requests
Uses the "load shed directly" design pattern from #1618.
2021-01-29 11:02:26 +10:00
teor 3d9888f736 Rewrite a sync comment 2021-01-29 11:02:26 +10:00
Deirdre Connolly 1b09538277
Bump versions for zebrad 1.0.0-alpha.1 (#1646)
* Bump versions where appropriate

Tested with cargo install --locked --path etc

* Remove fixed panics from 'Known Issues'

* Change to alpha release series in the README

Co-authored-by: teor <teor@riseup.net>
2021-01-27 20:31:39 -05:00
teor 391c53aa60 Move BoxError to zebrad's lib.rs
For consistency with other crates.
2021-01-27 12:14:27 -08:00
teor 494a5130c1 Fix clippy "unnecessary Vec::push" lints 2021-01-22 11:51:30 -08:00
teor 00c3ad5d8e Fix clippy "unnecessary Ok" lints 2021-01-22 11:51:30 -08:00
teor eadbea1423 Construct structs with ..default() to avoid a lint 2021-01-19 11:02:20 -05:00
teor b1d28b73fd Stop disabling lints that no longer cause warnings on nightly 2021-01-19 11:02:20 -05:00
teor 258789ed9b Use the rustc unknown lints attribute
The clippy unknown lints attribute was deprecated in
nightly in rust-lang/rust#80524. The old lint name now produces a
warning.

Since we're using `allow(unknown_lints)` to suppress warnings, we need to
add the canonical name, so we can continue to build without warnings on
nightly.

But we also need to keep the old name, so we can continue to build
without warnings on stable.

And therefore, we also need to disable the "removed lints" warning,
otherwise we'll get warnings about the old name on nightly.

We'll need to keep this transitional clippy config until rustc 1.51 is
stable.
2021-01-19 11:02:20 -05:00
teor 8a7b0232f0
Stop failing acceptance tests if their directories already exist (#1588)
* Stop failing acceptance tests if their directories already exist
* Add an immutable config writing helper
  and use it in the cached sapling acceptance tests.

Also:
  * consistently create missing config and state directories
  * refactor the common config writing code into a separate function
  * only ignore NotFound errors in replace_config
  * enforce config immutability using the type system
2021-01-14 16:59:06 +10:00
teor 9cdf41f5f4
Panic if the lookahead limit is misconfigured (#1589) 2021-01-14 14:06:30 +10:00
teor 92d95d4be5 Refactor inbound members into a consistent order
And add download comments
2021-01-13 20:46:25 -05:00
teor fb76eb2e6b Add download and verify timeouts to the inbound service 2021-01-13 20:46:25 -05:00
teor 973aec8ccc Refactor sync members into a consistent order
And add comments about correctness and usage.
2021-01-13 20:46:25 -05:00
teor c2893dce51 Warn when the user's configured lookahead limit is ignored 2021-01-13 20:46:25 -05:00
teor 3699bbdae6 Add some additional sync correctness constraints
And adjust the sync restart delay as a consequence.
2021-01-13 20:46:25 -05:00
teor cef0a492d8 Add a timeout to sync service block verification
This timeout stops the sync service hanging when it is missing required
blocks, but the lookahead queue is full of dependent verify tasks, so the
missing blocks never get downloaded.
2021-01-13 20:46:25 -05:00
teor 730910cd99 Upgrade to tokio 0.3.6 from crates.io
And remove the tokio git dependency patch
2021-01-12 15:37:27 -05:00
dependabot[bot] e3e9990315 build(deps): bump inferno from 0.10.2 to 0.10.3
Bumps [inferno](https://github.com/jonhoo/inferno) from 0.10.2 to 0.10.3.
- [Release notes](https://github.com/jonhoo/inferno/releases)
- [Changelog](https://github.com/jonhoo/inferno/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jonhoo/inferno/compare/v0.10.2...v0.10.3)

Signed-off-by: dependabot[bot] <support@github.com>
2021-01-12 00:53:49 -05:00
teor caca450904
zebrad acceptance test cleanup (#1560)
Check misconfigured ephemeral doesn't create a state dir

Add extra misconfigured `zebrad` ephemeral mode checks:
* doesn't create a state directory
* doesn't create unexpected files or directories in the working directory

Check ephemeral doesn't delete an existing state directory

Refactor all the ephemeral configs and checks into a single test
function.

Also:
* cleanup acceptance tests using utility functions
* make some checks consistent between tests
* make error messages consistent

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2021-01-12 12:26:08 +10:00
teor c75cbdea79
Log configured network in every log message (#1568)
* Add the configured network to error reports
* Log the configured network at error level
* Create the global span immediately after activating tracing
And leak the span guard, so the span is always active.

* Include panic metadata in the report and URL
* Use `Main` and `Test` in the global span
`net=Mainnet` is a bit redundant
2021-01-12 07:46:56 +10:00
teor 40c427cf7c Make `cargo run` use `zebrad` by default
When `cargo run` is run in the workspace directory, it can see two
executables:
- `zebrad`
- `zebra_checkpoints`

Adding `default-run = "zebrad"` to `zebrad/Cargo.toml` makes the
workspace run `zebrad` by default. (Even though it's redundant for the
`zebrad` crate itself.)
2021-01-08 15:14:08 -08:00
teor b1f14f47c6
Rewrite GetData handling to match the zcashd implementation (#1518)
* Rewrite GetData handling to match the zcashd implementation

`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)

This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.

Also change Zebra's GetData responses to peer request so they ignore
missing blocks.

* Stop hanging on incomplete transaction or block responses

Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
   that were successfully received
2. if none of the expected blocks or transactions were received, return
   an error, and close the connection
2021-01-04 13:25:35 +10:00
Deirdre Connolly f5b3412a50 Get PathBuf even if /zebrad-cache exists for create_cached_database_height() 2020-12-30 21:32:55 -05:00
teor 6cd7f255d4
Cleanup acceptance tests using utility functions (#1537)
And add a missing network check.
2020-12-20 09:06:18 +10:00
teor 3355be4c41
Improve the random port function docs (#1536)
Also:
* rename the function
* create an alternative function for the common case.
2020-12-18 11:32:33 +10:00
teor 69fcf64d6c
Disable issue URLs for "duplicate hash" errors (#1517)
In our README, we tell users to ignore these errors, so we should also
disable the issue URL.

Also include the hash in the error. (We don't want the span active for
all messages, we just want the hash in the error.)
2020-12-16 08:14:42 +10:00
teor 008577561c Use a sleep future in the async acceptance tests
And wait slightly longer for `zebrad` to launch.

These fixes should reduce the failure rate of the acceptance tests on
busy machines.
2020-12-16 08:09:48 +10:00
teor 1b6bf7f105 Use random ports in the acceptance tests
This change avoids errors when tests are cancelled and re-run within a
short period of time, for example, using `cargo watch`.

It introduces a slight risk of port conflicts between the endpoint tests,
and with (ephemeral) ports used by other services. The risk of conflicts
across 2 tests is very low, and tests should be run in an isolated
environment on busy servers.
2020-12-16 08:09:48 +10:00
Alfredo Garcia 41833340c1
downgrade remaining version strings to 1.0.0-alpha.0 (#1488) 2020-12-15 11:21:00 +10:00
Deirdre Connolly 2d1698a120 Comment out Sentry stacktraces for now
While panic = abort, Sentry collects the same one-line stack trace for all panics,
making it incorrectly dedupe different errors into one.
2020-12-12 13:26:52 -05:00
Deirdre Connolly cff28f7ac8 Use the commit sha as the sentry release 2020-12-09 13:06:18 -05:00
Jane Lusby 400213e2b3 integrate sentry with our existing panic reporting logic 2020-12-09 13:06:18 -05:00
Deirdre Connolly f1ec1d626d Tidy for now 2020-12-09 13:06:18 -05:00
Deirdre Connolly 44e1051dee Debug 2020-12-09 13:06:18 -05:00
Deirdre Connolly 8b268e3f71 Don't keep guard around 2020-12-09 13:06:18 -05:00
Deirdre Connolly 25f6fd25b3 Test catching panic 2020-12-09 13:06:18 -05:00
Deirdre Connolly 6a17549945 Try sentry-tracing integration 2020-12-09 13:06:18 -05:00
Deirdre Connolly c03a3a2606 Pull DSN from runtime env, enable Sentry debug mode with RUST_LOG=debug 2020-12-09 13:06:18 -05:00
Deirdre Connolly 27e42f4ed5 Set up Sentry error collection via a feature flag 2020-12-09 13:06:18 -05:00
Deirdre Connolly 47d78d4cf4 Try sentry::init() 2020-12-09 13:06:18 -05:00
teor 16ffb1dbbf
Disable issue URLs on all timeouts (#1470)
This change helps prevent spurious bug reports.
2020-12-08 07:47:01 +10:00