Commit Graph

1565 Commits

Author SHA1 Message Date
Deirdre Connolly 8cac287aa2 Tidy TransactionError 2020-10-28 21:44:13 -04:00
Deirdre Connolly b2df84fc59 Dedupe VerifyTransactionError into TransactionError 2020-10-28 21:44:13 -04:00
Deirdre Connolly 1d646e6a27 Make Clippy happy 2020-10-28 21:44:13 -04:00
Deirdre Connolly 1ce2eea35f Add coinbase shielded descriptions check 2020-10-28 21:44:13 -04:00
Deirdre Connolly 1653aca570 Add shielded_balances_match check 2020-10-28 21:44:13 -04:00
Deirdre Connolly 612148fbdd consensus: add transaction::check module 2020-10-28 21:44:13 -04:00
teor 3748623d92 Remove a redundant block header test vector 2020-10-28 21:24:28 -04:00
teor 9667ee650f Run deserialize_blockheader on every test vector 2020-10-28 21:24:28 -04:00
teor 456842aa86 Run the equihash tests on every block test vector 2020-10-28 21:24:28 -04:00
Alfredo Garcia bcb027ebc5 change canopy.pdf to stable protocol.pdf 2020-10-28 11:34:53 -04:00
teor 92c623eddf Log each genesis download
This change helps us diagnose sync hangs.
2020-10-28 11:31:04 -04:00
teor 656bd24ba7 Hedge every syncer block download request
Remove the minimum data points from the syncer hedge configuragtion.
When there are no data points, hedge sends the second request
immediately.

Where there are less than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.

This change should improve genesis and post-restart sync latency.
2020-10-28 11:31:04 -04:00
teor b4ce442cea
State RFC: Use escaped concatenation operator
And use BE32 consistently
2020-10-27 20:50:29 +10:00
teor adbdb3c76e
Note that `blocks_by_height` is also a bijection 2020-10-27 20:13:49 +10:00
teor 7ea92283a7
Fix a State RFC rendering issue 2020-10-27 20:07:43 +10:00
teor 0d47b80e68
Fix a comment typo 2020-10-27 19:31:45 +10:00
teor 08910e0378
State RFC: fix block height contextual validation 2020-10-27 19:29:32 +10:00
teor ea510b7d41
Run a block sync in CI with 2 large checkpoints (#1193)
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests

And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
2020-10-27 19:25:29 +10:00
dependabot[bot] 83c844abb5 build(deps): bump futures from 0.3.6 to 0.3.7
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.6 to 0.3.7.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.6...0.3.7)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-27 02:28:48 -04:00
Deirdre Connolly 9490fc6a3a Force Clippy nightly lints to all warn, never error 2020-10-27 00:09:39 -04:00
Jane Lusby 971765ab30
Handle duplicate blocks in zebra-state (#1198)
## Motivation

The zebra-state service needs to be able to handle duplicate blocks.

## Solution

This implements changes already outlined by [The State
RFC](https://zebra.zfnd.org/dev/rfcs/0005-state-updates.html). We check for
successfully committed blocks first, since interacting with the queued blocks
struct at this point just complicates the implimentation. If the block has not
already been committed we then check if the block has already been queued, if
not we handle the block normally (normally here being the bit we already had
implemented).

## Documentation Changes

- [x] Update the state RFC to match the ways this fix departs from the design
	- the main thing is that I switched the order of checking for duplicates
- [x] ~~Add newly added functions to the state rfc~~ Decided not to do this because they're minor getters that don't influence the rest of the design and aren't exposed as part of the API
- [x] Document newly added functions inline

## Testing

## Related Issues

- fixes https://github.com/ZcashFoundation/zebra/issues/1182
- tracking issue https://github.com/ZcashFoundation/zebra/issues/1049

Co-authored-by: teor <teor@riseup.net>
2020-10-26 13:54:19 -07:00
Henry de Valence 4c960c4e6d zebrad: treat duplicate downloads as an error
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
2020-10-26 12:05:35 -07:00
Henry de Valence 4127d086ea zebrad: clarify hedge layering motivation
Co-authored-by: teor <teor@riseup.net>
2020-10-26 12:05:35 -07:00
Henry de Valence 6f8f8a56d4 state: perform sled reads synchronously
We already use an actor model for the state service, so we get an
ordered sequence of state queries by message-passing.  Instead of
performing reads in the futures we return, this commit performs them
synchronously.  This means that all sled access is done from the same
task, which

(1) might reduce contention
(2) allows us to avoid using sled transactions when writing to the
state.

Co-authored-by: Jane Lusby <jane@zfnd.org>


Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2020-10-26 12:05:35 -07:00
Henry de Valence 253bab042e sync: add a concurrency limit for block downloads 2020-10-26 12:05:35 -07:00
Henry de Valence 0a405c737d zebrad: check state in obtaintips, not extendtips.
The original sync algorithm split the sync process into two phases, one
that obtained prospective chain tips, and another that attempted to
extend those chain tips as far as possible until encountering an error
(at which point the prospective state is discarded and the process
restarts).

Because a previous implementation of this algorithm didn't properly
enforce linkage between segments of the chain while extending tips,
sometimes it would get confused and fail to discard responses that did
not extend a tip.  To mitigate this, a check against the state was
added.  However, this check can cause stalls while checkpointing,
because when a checkpoint is reached we may suddenly need to commit
thousands of blocks to the state.  Because the sync algorithm now has a
a `CheckedTip` structure that ensures that a new segment of hashes
actually extends an existing one, we don't need to check against the
state while extending a tip, because we don't get confused while
interpreting responses.

This change results in significantly smoother progress on mainnet.
2020-10-26 12:05:35 -07:00
Henry de Valence 65e0c22fbe state: don't pre-buffer the service
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
2020-10-26 12:05:35 -07:00
Henry de Valence ce2ac3336f zebrad: add debug message before state check
This reveals that there may be contention in access to the state, as
this takes a long time.
2020-10-26 12:05:35 -07:00
Henry de Valence a1a3e4db5a consensus: simplify block verify tracing output
The previous debug output printed a message that the chain verifier had
recieved a block.  But this provides no additional information compared
to printing no message in chain::Verifier and a message in whichever
verifier the block was sent to, since the resulting spans indicate where
the block was dispatched.

This commit also removes the "unexpected high block" detection; this was
an artefact of the original sync algorithm failing to handle block
advertisements, but we don't have that problem any more, so we can
simplify the code by eliminating that logic.
2020-10-26 12:05:35 -07:00
Henry de Valence 91469faf3c zebrad: eliminate duplicate span in sync 2020-10-26 12:05:35 -07:00
Henry de Valence b5a43f4516 zebrad: remove implementation details from docs
The timeout behavior in zebra-network is an implementation detail, not a
feature of the public API.  So it shouldn't be mentioned in the doc
comments -- if we want timeout behavior, we have to layer it ourselves.
2020-10-26 12:05:35 -07:00
Henry de Valence 1d7309afe2 zebrad: correctly handle duplicates in DownloadSet
Using the cancel_handles, we can deduplicate requests.  This is
important to do, because otherwise when we insert the second cancel
handle, we'd drop the first one, cancelling an existing task for no
reason.
2020-10-26 12:05:35 -07:00
Henry de Valence 56fe4f4379 zebrad: unify sync restart logic
This lets us keep the main loop simple and just write `continue 'sync;`
to keep going.
2020-10-26 12:05:35 -07:00
Henry de Valence 12d25159c6 zebrad: use hedged requests in sync
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
2020-10-26 12:05:35 -07:00
Henry de Valence 5f229d1475 zebrad: use Downloads in sync
Try to use the better cancellation logic to revert to previous sync
algorithm.  As designed, the sync algorithm is supposed to proceed by
downloading state prospectively and handle errors by flushing the
pipeline and starting over.  This hasn't worked well, because we didn't
previously cancel tasks properly.  Now that we can, try to use something
in the spirit of the original sync algorithm.
2020-10-26 12:05:35 -07:00
Henry de Valence b90581a3d7 zebrad: create a Downloads Stream for syncing.
This makes two changes relative to the existing download code:

1.  It uses a oneshot to attempt to cancel the download task after it
    has started;

2.  It encapsulates the download creation and cancellation logic into a
    Downloads struct.
2020-10-26 12:05:35 -07:00
Henry de Valence 8e709bfa88 network: don't fail on unsolicited messages
These messages might be unsolicited, or they might be a response to a
request we already canceled.  So don't fail the whole connection, just
drop the message and move on.
2020-10-26 12:05:35 -07:00
Henry de Valence 13daefa729 network: handle request cancellation in Connection
We handle request cancellation in two places: before we transition into
the AwaitingResponse state, and while we are in AwaitingResponse.  We
need both places, or else if we started processing a request, we
wouldn't process the cancellation until the timeout elapsed.

The first is a check that the oneshot is not already canceled.

For the second, we wait on a cancellation, either from a timeout or from
the tx channel closing.
2020-10-26 12:05:35 -07:00
Henry de Valence b636660d6a zebrad: rename sync::Error alias to BoxError. 2020-10-26 12:05:35 -07:00
teor a141c336ab Actually fix whitespace 2020-10-26 13:49:48 -04:00
teor bbe4aa47ea Fix whitespace for rustfmt 2020-10-26 13:49:48 -04:00
teor 2fa3d8a8f4 Add a comment explaining why block metrics follow validation 2020-10-26 13:49:48 -04:00
teor f5a53d9dae Update block metrics after async transaction verification 2020-10-26 13:49:48 -04:00
teor fb079c2ca1
Replace BlockHeaderHash with block::Hash 2020-10-26 22:27:57 +10:00
teor a9102e8d6d
Fix State RFC rendering ambiguities 2020-10-26 22:02:45 +10:00
teor 0935b3305a
Fix more state RFC function heading sizes 2020-10-26 21:14:14 +10:00
teor 7bf2fdd6d7
Fix a header level in the state RFC 2020-10-26 21:11:26 +10:00
teor 60322c3d48 Test that the checkpoint list gap is correct
If we change the gap, but don't rebuild the lists, `zebrad` hangs with
weird errors.
2020-10-26 20:59:40 +10:00
teor f9dc481934 Rebuild the checkpoint lists with smaller checkpoints 2020-10-26 20:59:40 +10:00
teor 20dfd04463 Reduce maximum checkpoint size in the Zebra code
The new limits are 400 blocks and 32 MB.
2020-10-26 20:59:40 +10:00