Commit Graph

36 Commits

Author SHA1 Message Date
teor c5e6adc87a
Return an error instead of panicking on shutdown (#5530) 2022-11-02 02:43:13 +00:00
teor 6ad445eb97
fix(perf): Run CPU-intensive state updates in parallel rayon threads (#4802)
* Split disk reads from CPU-heavy Sprout interstitial tree cryptography

* Improve anchor validation debugging and error messages

* Work around a test data bug, and save some CPU

* Remove redundant checks for empty shielded data

* Skip generating unused interstitial treestates

* Do disk fetches and quick checks, then CPU-heavy cryptography

* Wrap HistoryTree in an Arc in the state

* Run CPU-intensive chain validation and updates in parallel rayon threads

* Refactor to prepare for parallel tree root calculations

* Run finalized state note commitment tree root updates in parallel rayon threads

* Update finalized state note commitment trees using parallel rayon threads

* Fix a comment typo and add a TODO

* Split sprout treestate fetch into its own function

* Move parallel note commitment trees to zebra-chain

* Re-calculate the tree roots in the same parallel batches

* Do non-finalized note commitment tree updates in parallel threads

* Update comments about note commitment tree rebuilds

* Do post-fork tree updates in parallel threads

* Add a TODO for parallel tree updates in tests

* Fix broken intra-doc links

* Clarify documentation for sprout treestates

* Sort Cargo.toml dependencies
2022-07-22 12:19:11 -04:00
teor cf4b2f7a67
feat(verify): Concurrently verify proof and signature batches (#4776)
* Initialize the rayon threadpool with a new config for CPU-bound threads

* Verify proofs and signatures on the rayon thread pool

* Only spawn one concurrent batch per verifier, for now

* Allow tower-batch to queue multiple batches

* Fix up a potentially incorrect comment

* Rename some variables for concurrent batches

* Spawn multiple batches concurrently, without any limits

* Simplify batch worker loop using OptionFuture

* Clear pending batches once they finish

* Stop accepting new items when we're at the concurrent batch limit

* Fail queued requests on drop

* Move pending_items and the batch timer into the worker struct

* Add worker fields to batch trace logs

* Run docker tests on PR series

* During full verification, process 20 blocks concurrently

* Remove an outdated comment about yielding to other tasks
2022-07-18 08:43:29 +10:00
teor 9b9cd55097
fix(batch): Improve batch verifier async, correctness, and performance (#4750)
* Use a new channel for each batch

* Prefer the batch timer if there are also new batch requests

* Allow other tasks to run after each batch

* Label each batch worker with the verifier's type

* Rename Handle to ErrorHandle, and fix up some docs

* Check batch worker tasks for panics and task termination

* Use tokio's PollSemaphore instead of an outdated Semaphore impl

* Run all verifier cryptography on a blocking thread

Also use a new verifier channel for each batch.

* Make flush and drop behaviour consistent for all verifiers

* Partly fix an incorrect NU5 test

* Switch batch tests to the multi-threaded runtime

* Export all verifier primitive modules from zebra-consensus

* Remove outdated test code in tower-batch

* Use a watch channel to send batch verifier results

* Use spawn_blocking for batch fallback verifiers

* Spawn cryptography batches onto blocking tokio threads

* Use smaller batches for halo2

* Minor tower-batch cleanups

* Fix doc link in zebra-test

* Drop previous permit before acquiring another to avoid a deadlock edge case
2022-07-18 08:41:18 +10:00
teor 00aa5d96a3
Consolidate standard lints into a cargo config file (#3386)
* Move standard lints into .cargo/config.toml

* Ignore "wrong self convention" in a futures-based trait

This lint might only trigger on beta or nightly at the moment.

* Warn if future incompatible code is added to Zebra

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-01-24 16:25:06 +00:00
Deirdre Connolly 89b0403582
Enforce Rust edition 2021 (#3332)
* Rust edition 2021: zebra-network, cargo fix --edition and clippy --fix

* Rust edition 2021: zebra-chain, cargo fix --edition

* Rust edition 2021: tower-batch, cargo fix --edition

* Rust edition 2021: tower-fallback, cargo fix --edition

* Rust edition 2021: zebra-client, cargo fix --edition

* Rust edition 2021: zebra-consensus, cargo fix --edition

* Rust edition 2021: zebra-rpc, cargo fix --edition

* Rust edition 2021: zebra-state, cargo fix --edition

* Rust edition 2021: zebra-state, cargo fix --edition

* Rust edition 2021: zebra-test, cargo fix --edition

* Rust edition 2021: zebra-utils, cargo fix --edition

* Rust edition 2021: zebrad, cargo fix --edition

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-01-14 12:10:18 +00:00
Janito Vaqueiro Ferreira Filho 0960e4fb0b
Update to Tokio 1.13.0 (#2994)
* Update `tower` to version `0.4.9`

Update to latest version to add support for Tokio version 1.

* Replace usage of `ServiceExt::ready_and`

It was deprecated in favor of `ServiceExt::ready`.

* Update Tokio dependency to version `1.13.0`

This will break the build because the code isn't ready for the update,
but future commits will fix the issues.

* Replace import of `tokio::stream::StreamExt`

Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.

* Use `IntervalStream` in `zebra-network`

In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.

* Use `IntervalStream` in `inventory_registry`

In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.

* Use `BroadcastStream` in `inventory_registry`

In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used
instead. This also requires changing the error type that is used.

* Handle `Semaphore::acquire` error in `tower-batch`

Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.

* Handle `Semaphore::acquire` error in `zebrad` test

On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.

* Update some `zebra-network` dependencies

Use versions compatible with Tokio version 1.

* Upgrade Hyper to version 0.14

Use a version that supports Tokio version 1.

* Update `metrics` dependency to version 0.17

And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.

* Use `f64` as the histogram data type

`u64` isn't supported as the histogram data type in newer versions of
`metrics`.

* Update the initialization of the metrics component

Make it compatible with the new version of `metrics`.

* Simplify build version counter

Remove all constants and use the new `metrics::increment_counter!` macro.

* Change metrics output line to match on

The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.

* Update `sentry` to version 0.23.0

Use a version compatible with Tokio version 1.

* Remove usage of `TracingIntegration`

This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.

* Add sentry layer to tracing initialization

This seems like the replacement for `TracingIntegration`.

* Remove unnecessary conversion

Suggested by a Clippy lint.

* Update Cargo lock file

Apply all of the updates to dependencies.

* Ban duplicate tokio dependencies

Also ban git sources for tokio dependencies.

* Stop allowing sentry-tracing git repository in `deny.toml`

* Allow remaining duplicates after the tokio upgrade

* Use C: drive for CI build output on Windows

GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.

Co-authored-by: teor <teor@riseup.net>
2021-11-02 18:46:57 +00:00
Janito Vaqueiro Ferreira Filho 6905c79fd6
Use `broadcast::Receiver::recv` instead of `next` (#2933)
On newer versions of Tokio the `Receiver` doesn't implement `Stream`.
2021-10-29 21:28:54 -03:00
Janito Vaqueiro Ferreira Filho 2a1d4281c5
Manually pin `Sleep` futures (#2914)
* Wrap `Sleep` timer in a `Pin<Box<_>>`

The `Sleep` type doesn't implement `Unpin` in newer versions of Tokio.

* Wrap `Sleep` type in a `Pin<Box<_>>`

In newer Tokio versions the `Sleep` type doesn't implement `Unpin`, so
it needs to be manually pinned.
2021-10-22 16:06:03 -03:00
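
The pinning change described in the commit above can be illustrated with a minimal sketch that uses only the standard library. `FakeSleep`, `new_timer`, `assert_unpin`, and `poll_until_ready` are hypothetical names invented for this example; `FakeSleep` merely stands in for a timer future that is `!Unpin`, and nothing here is the actual `tower-batch` code.

```rust
use std::cell::Cell;
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a timer future that does not implement `Unpin`.
struct FakeSleep {
    polls_left: Cell<u32>,
    // Opts this type out of `Unpin`, like a timer with internal self-references.
    _pinned: PhantomPinned,
}

impl Future for FakeSleep {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let left = self.polls_left.get();
        if left == 0 {
            Poll::Ready(())
        } else {
            self.polls_left.set(left - 1);
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Boxing and pinning gives a handle that is `Unpin` even though `FakeSleep` is not,
/// so it can be stored in a struct field and moved around freely.
fn new_timer(polls: u32) -> Pin<Box<FakeSleep>> {
    Box::pin(FakeSleep {
        polls_left: Cell::new(polls),
        _pinned: PhantomPinned,
    })
}

/// Compiles only if `T` is `Unpin`.
fn assert_unpin<T: Unpin>(_: &T) {}

/// A waker that does nothing, so the sketch can poll without a runtime.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls the pinned timer until it completes and returns the number of polls.
fn poll_until_ready(timer: &mut Pin<Box<FakeSleep>>) -> u32 {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 1;
    // `as_mut` reborrows the pinned box as `Pin<&mut FakeSleep>` for each poll.
    while timer.as_mut().poll(&mut cx).is_pending() {
        polls += 1;
    }
    polls
}
```

The heap allocation is the price of this approach: the future stays at a fixed address inside the `Box`, while the `Pin<Box<_>>` handle itself remains movable.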
teor 2f0f379a9e
Standardise clippy lints and require docs (#2238)
* Standardise lints across Zebra crates, and add missing docs

The only remaining module with missing docs is `zebra_test::command`

* Todo -> TODO

* Clarify what a transcript ErrorChecker does

Also change `Error` -> `BoxError`

* TransError -> ExpectedTranscriptError

* Output Descriptions -> Output descriptions
2021-06-04 08:48:40 +10:00
teor 306fa88214 Document the correctness of Poll::Pending wakeups 2021-03-27 08:55:49 -04:00
teor 829a6f11c5 Document the behaviour of the `select!` macro 2021-03-27 08:55:49 -04:00
Jane Lusby fc4b8c1e70 add basic test for batch waker behaviour 2021-03-17 10:44:18 +10:00
Jane Lusby c10ea1d82b split pair constructor off of Batch::new 2021-03-17 10:44:18 +10:00
teor 873127aac1 Replace smart quotes with ascii quotes
Some tools don't deal well with unicode text. And we're not using it
consistently in Zebra anyway.
2021-03-15 03:18:10 -04:00
teor 895bb43ead Clippy: Fix inconsistent struct member orders lint 2021-03-01 23:31:18 -05:00
teor 1ef836abb9 Add a missing Sync bound 2021-02-17 09:03:09 -05:00
teor 090afb9d4c Ignore clippy lints on copied code 2021-02-17 09:03:09 -05:00
teor 47084ea85e Wake waiting tower-batch tasks on drop
When other tower-batch tasks drop, wake any tasks that are waiting for
a semaphore permit. Otherwise, tower-batch can hang.

We currently pin tower in our workspace to:
d4d1c67 hedge: use auto-resizing histograms (tower-rs/tower#484)

Copy tower/src/semaphore.rs from that commit, to pick up
tower-rs/tower#480.
2021-02-17 09:03:09 -05:00
Jane Lusby 0ac259430a Implement Async Batch verification API for groth16
This PR is the first step in getting a groth16 proving system fully
integrated with the rest of zebra. This PR implements the initial async
API, but none of the actual batching logic necessary for our eventual
verifier design.

Once the batch verification API from bellman has been implemented we
will need to swap out the "Batch" type defined in this crate with the
new `batch::Verifier` defined in bellman.
2021-02-05 14:52:48 -05:00
Alfredo Garcia bfb3de7a8a
Use max_items as bound in tower-batch (#1691) 2021-02-05 12:42:38 +10:00
Alfredo Garcia e455c3fa8a
add comment about error handling (#1692) 2021-02-05 12:41:27 +10:00
Henry de Valence add94c1c45 deps: move to tokio 0.3, tower 0.4
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware.  This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method.  See

ddc64e8d4d

for more context on the Tower changes.  To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
2020-11-20 10:08:16 -08:00
Henry de Valence 0586da7167 Revert #500 (generic errors in tower-batch).
Unfortunately, since the Batch wrapper was changed to have a generic error
type, when wrapping it in another Service, nothing constrains the error type,
so we have to specify it explicitly to avoid an inference hole.  This is pretty
unergonomic -- from the compiler error message it's very unintuitive that the
right fix is to change `Batch::new` to `Batch::<_, _, SomeError>::new`.

The options are:

1. roll back the changes that make the error type generic, so that the error
   type is a concrete type;

2. keep the error type generic but hardcode the error in the default
   constructor and add an additional code path that allows overriding the
   error.

However, there's a further issue with generic errors: the error type must be
Clone.  This problem comes from the fact that there can be multiple Batch
handles that have to share access to errors generated by the inner Batch
worker, so there's not a way to work around this.  However, almost all error
types aren't Clone, so there are fairly few error types that we would be
swapping in.

This suggests that in case (2) we would be maintaining extra code to allow
generic errors, but with restrictive enough generic bounds to make it
impractical to use generic error types.  For this reason I think that (1) is a
better option.
2020-07-22 14:29:55 -04:00
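
The trade-off discussed in the commit above can be sketched with the standard library alone. `GenericBatch`, `ConcreteBatch`, `SharedError`, `build_both`, and `share_error` are hypothetical names invented for this illustration, not the real `tower-batch` types; the sketch only assumes that a wrapper with a generic error parameter leaves that parameter unconstrained at the call site, and that errors shared between handles must be `Clone`.

```rust
use std::error::Error;
use std::fmt;
use std::marker::PhantomData;
use std::sync::Arc;

/// A boxed error type, used here as the concrete error.
type BoxError = Box<dyn Error + Send + Sync + 'static>;

/// Hypothetical cloneable error: the boxed error is shared through an `Arc`.
#[derive(Clone, Debug)]
struct SharedError(Arc<BoxError>);

impl fmt::Display for SharedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for SharedError {}

/// Hypothetical wrapper with a generic error type `E` (the reverted design).
struct GenericBatch<S, E> {
    inner: S,
    _error: PhantomData<E>,
}

impl<S, E: Clone> GenericBatch<S, E> {
    fn new(inner: S) -> Self {
        GenericBatch { inner, _error: PhantomData }
    }
}

/// Hypothetical wrapper with a concrete error type (option 1 in the commit).
struct ConcreteBatch<S> {
    inner: S,
}

impl<S> ConcreteBatch<S> {
    fn new(inner: S) -> Self {
        ConcreteBatch { inner }
    }
}

fn build_both() -> (u32, u32) {
    // `GenericBatch::new(7)` on its own fails with "type annotations needed",
    // because nothing at this call site constrains `E`.
    let generic = GenericBatch::<_, SharedError>::new(7);
    // The concrete wrapper needs no annotation.
    let concrete = ConcreteBatch::new(7);
    (generic.inner, concrete.inner)
}

/// Gives every handle its own clone of one worker error.
fn share_error(message: &str, handles: usize) -> Vec<SharedError> {
    let boxed: BoxError = message.into();
    let shared = SharedError(Arc::new(boxed));
    (0..handles).map(|_| shared.clone()).collect()
}
```

Wrapping the boxed error in an `Arc` is one way to satisfy the `Clone` requirement without asking every caller's error type to be `Clone`.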
Dimitris Apostolou ba81d7d4c0 Fix typos 2020-07-07 11:13:49 -07:00
Jane Lusby 9d4ad933aa cleanup 2020-06-19 01:48:56 -04:00
Jane Lusby b67ead665a cleaning 2020-06-19 01:48:56 -04:00
Jane Lusby b727beb778 use better generic idents 2020-06-19 01:48:56 -04:00
Jane Lusby 61489dbf5d fix debug impl 2020-06-19 01:48:56 -04:00
Jane Lusby faf36f5c04 require clone bound inner error type rather than Arcifying it ourselves 2020-06-19 01:48:56 -04:00
Jane Lusby 63ae085945 make return error type for Batch generic 2020-06-19 01:48:56 -04:00
Jane Lusby fa6b098056
cleanup clippy warnings (#495)
Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-16 17:51:50 -07:00
Henry de Valence b299fb7162 Remove deprecated pin_project items 2020-06-16 14:35:42 -07:00
Henry de Valence 8f2ee22708 Add documentation. 2020-06-16 14:35:42 -07:00
Henry de Valence ee26e786f7 tower-batch: initial implementation of batching logic.
The name "Buffer" is changed to "Batch" everywhere, and the worker task is rewritten.

Instead of having Worker implement Future directly, we have a consuming async run() function.
2020-06-16 14:35:42 -07:00
Henry de Valence dcd3f7bb2d tower-batch: copy tower-buffer source code.
There's a lot of functional overlap between the batch design and tower-buffer's
existing internals, so we'll just vendor its source code and modify it.
If/when we upstream it, we can deduplicate common components.
2020-06-16 14:35:42 -07:00