* add legacy chain check and tests
* improve has_network_upgrade check
* add docs to legacy_chain_check()
* change arbitrary module structure
* change the panic message
* move legacy chain acceptance into existing tests
* use a reduced_branch_id_strategy()
* add docs to strategy function
* add an argument to sync_until() to check for the legacy chain
* Standardise lints across Zebra crates, and add missing docs
The only remaining module with missing docs is `zebra_test::command`
* Todo -> TODO
* Clarify what a transcript ErrorChecker does
Also change `Error` -> `BoxError`
* TransError -> ExpectedTranscriptError
* Output Descriptions -> Output descriptions
* Make sure the Canopy activation block is a finalized checkpoint block
This enables ZIP-221 chain history from Canopy activation onwards.
* Clarify that the mandatory checkpoint test includes Canopy activation
The test was correct, but the docs and assertion message did not include activation.
* Document that the mandatory checkpoint includes Canopy activation
Co-authored-by: teor <teor@riseup.net>
* Restore SummaryDebug on arbitrary chains
And also add it to some more proptest vectors.
* Reduce most arbitrary vectors from 10 to 4
This makes debugging easier
* Make SummaryDebug generic over collections and exact size iterators
* Document DisplayToDebug
* add nullifier methods to orchard
* store orchard nullifiers
* bump database version
* update `IntoDisk`
* support V5 in `UpdateWith`
* add a test for finalized state
* Use the latest network upgrade in state proptests
* Set the tip height and previous hash for arbitrary genesis blocks
And cleanup the ledger strategy interface.
* Generate partial chains with correct previous block hashes
* Provide the network value from the PreparedChain strategy
* Clarify the finalized state assertion that checks the genesis block
* Make arbitrary block chains pass some genesis checks
Use the genesis previous block hash for
- the first arbitrary block in each chain, and
- individual arbitrary blocks.
This setting can be adjusted by individual proptests as needed.
* Speedup proptests for Chain struct in zebra-state
* Apply changes requested by teor2345
* Fix type for DEFAULT_PARTIAL_CHAIN_PROPTEST_CASES
* More costs for PROPTEST_CASES
* start refactoring transaction v4 for transaction v5
- move ShieldedData to sapling
- add AnchorVariant
- rename shielded_data to sapling_shielded_data in V4
- move value_balance into ShieldedData
- update prop tests for new structure
* add AnchorVariant to Spend
- make anchor types available from sapling crate
- update serialize
* change shielded_balances_match() arguments
* change variable name anchor to shared_anchor in ShieldedData
* fix empty value balance serialization
* use AnchorV in shielded spends
* Rename anchor to per_spend_anchor
* Use nullifiers function directly in non-finalized state
* Use self.value_balance instead of passing it as an argument
* Add missing fields to ShieldedData PartialEq
* Derive Copy for tag types
* Add doc comments for ShieldedData refactor
* Implement a per-spend anchor compatibility iterator
Co-authored-by: teor <teor@riseup.net>
* add transaction V5 stub
* add v5_strategy
* deduplicate version group ids
* Update comment for V5 transactions
* Add V5 transactions to non_finalized_state
Currently these are all `unimplemented!(...)`
* Fix struct matches
* Apply trivial panic message changes
* add zcash_deserialize for V5
* make all tx versions explicit in sprout and sapling nullifier functions
* match exhaustively in sprout and sapling nullifier functions
* fix matches in zebra-consensus
* fix NU5 strategy
* We're still deciding if v5 transactions support Sprout
Co-authored-by: teor <teor@riseup.net>
Use `ServiceExt::oneshot` to perform state requests.
Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
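A minimal sketch of the `ServiceExt::oneshot` pattern, using a stand-in `tower` service rather than the real zebra-state service; the request and response types here are illustrative only.
```rust
use tower::{service_fn, ServiceExt};

#[tokio::main]
async fn main() {
    // Stand-in for a state service: answers a request string with a number.
    let state = service_fn(|_req: &'static str| async { Ok::<u32, std::convert::Infallible>(7) });

    // `oneshot` waits for `poll_ready` and then calls the service exactly once,
    // so the caller never manages readiness by hand. (`call_all` does the same
    // internally for a whole stream of requests.)
    let depth = state.oneshot("Depth").await.unwrap();
    assert_eq!(depth, 7);
}
```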
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflicts
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests
* Add a full set of stderr acceptance test functions
This change makes the stdout and stderr acceptance test interfaces
identical.
* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path
* Use Display for state path logs
Avoids weird escaping on Windows when using Debug
* Add Windows conflict error messages
* Turn PORT_IN_USE_ERROR into a regex
And add another alternative Windows-specific port error
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
The clippy unknown lints attribute was deprecated in
nightly in rust-lang/rust#80524. The old lint name now produces a
warning.
Since we're using `allow(unknown_lints)` to suppress warnings, we need to
add the canonical name, so we can continue to build without warnings on
nightly.
But we also need to keep the old name, so we can continue to build
without warnings on stable.
And therefore, we also need to disable the "removed lints" warning,
otherwise we'll get warnings about the old name on nightly.
We'll need to keep this transitional clippy config until rustc 1.51 is
stable.
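As a hedged sketch, the transitional configuration could look like the following crate-level attributes; the exact lint set and where Zebra places them may differ.
```rust
// Keep the old clippy lint name so stable builds stay warning-free...
#![allow(clippy::unknown_clippy_lints)]
// ...allow the canonical rustc name so nightly builds stay warning-free...
#![allow(unknown_lints)]
// ...and silence nightly's warning about the old, now-renamed clippy name.
#![allow(renamed_and_removed_lints)]

fn main() {}
```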
## Motivation
This PR is motivated by the regression identified in https://github.com/ZcashFoundation/zebra/issues/1349. That issue notes that metrics stopped working for most of the crates other than `zebrad`.
## Solution
This PR resolves the regression by deduplicating the `metrics` crate dependency. During a recent change we upgraded the metrics version in `zebrad` and a couple of our other crates, but we never updated the dependencies in `zebra-state`, `zebra-consensus`, or `zebra-network`. This caused the metrics macros to retrieve the current metrics exporter through the wrong function: we would install the metrics exporter in `0.13`, but then look it up through the `0.12` crate, which contains a different, unset instance of the metrics exporter static variable. This causes the metrics macros to return `None` for the current exporter, after which they silently give up.
## Related Issues
closes https://github.com/ZcashFoundation/zebra/issues/1349
## Follow Up Work
I noticed we have quite a few duplicate dependencies in our tree. We might be able to save some compilation time by auditing those and deduplicating them as much as possible.
- https://github.com/ZcashFoundation/zebra/issues/1582
Co-authored-by: teor <teor@riseup.net>
Zcashd will blindly request more block headers as long as it got 160
block headers in response to a previous query, EVEN IF THOSE HEADERS ARE
ALREADY KNOWN. To dodge this behavior, we return slightly fewer headers than the maximum.
0ccc885371/src/main.cpp (L6274-L6280)
Without this change, communication between a partially-synced `zebrad`
and fully-synced `zcashd` looked like this:
1. `zebrad` connects to `zcashd`, which sends an initial `getheaders`
request;
2. `zebrad` correctly computes the intersection of the provided block
locator with the node's current chain and returns 160 following
headers;
3. `zcashd` does not check whether it already has those headers and
assumes that any provided headers are new and re-validates them;
4. `zcashd` assumes that because `zebrad` responded with 160 headers,
the `zebrad` node is ahead of it, and requests the next 160 headers.
5. Because block locators are sparse, the intersection between the
`zcashd` and `zebrad` chains is likely well behind the `zebrad` tip,
so this process continues for thousands of blocks.
To avoid this problem, we return slightly fewer than the protocol
maximum (158 rather than 160, to guard against off-by-one errors in
zcashd). This does not interfere with use of the returned headers by
peers that check the headers, but does prevent `zcashd` from trying to
download thousands of block headers it already has.
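A hedged sketch of the cap as constants; the names are assumptions, only the values (160 and 158) come from the description above.
```rust
/// The protocol maximum number of headers in a `headers` response.
const MAX_FIND_BLOCK_HEADERS_RESULTS: usize = 160;

/// What we actually return: two fewer than the maximum, so `zcashd` never sees a
/// "full" response and stops requesting more. (The margin of two, rather than one,
/// guards against off-by-one errors in `zcashd`.)
const MAX_FIND_BLOCK_HEADERS_RESULTS_FOR_ZEBRA: usize = MAX_FIND_BLOCK_HEADERS_RESULTS - 2;

fn main() {
    assert_eq!(MAX_FIND_BLOCK_HEADERS_RESULTS_FOR_ZEBRA, 158);
}
```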
This problem does not occur in the `zcashd<->zcashd` case only because
`zcashd` does not respond to `getheaders` messages while it is syncing.
However, implementing this behavior in Zebra would be more complicated,
because we don't have a distinct "initial block sync" state (we do
poll-based syncing continuously) and we don't have shared global
variables to modify to set that state.
Relevant links (thanks @str4d):
- The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- https://github.com/bitcoin/bitcoin/issues/6861
- https://github.com/bitcoin/bitcoin/issues/6755
- https://github.com/bitcoin/bitcoin/pull/8306#issuecomment-614916454
We modeled a Bitcoin `headers` message as being a list of block headers.
However, the actual data structure is slightly different: it's a list of (block
header, transaction count) pairs. This caused zcashd to reject our headers
messages.
To fix this, introduce a new `CountedHeader` struct with a `block::Header` and
transaction count `usize`, then thread it through the inbound service and the
state.
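A hedged sketch of the pair structure, with a stand-in header type; Zebra's real `CountedHeader` lives in zebra-chain and its field names may differ.
```rust
/// Stand-in for `zebra_chain::block::Header`.
type Header = Vec<u8>;

/// One entry in a Bitcoin/Zcash `headers` message: a block header followed by a
/// transaction count (which is always zero on the wire in `headers` responses).
struct CountedHeader {
    header: Header,
    transaction_count: usize,
}

fn main() {
    let entry = CountedHeader { header: Header::new(), transaction_count: 0 };
    assert_eq!(entry.header.len() + entry.transaction_count, 0);
}
```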
I tested this locally by running Zebra with these changes and inspecting a
trace-level log of the span of a peer connection that requested a nontrivial
headers packet from us, and verified that it did not reject our message.
If the limit is less than the ideal, try to increase it to the ideal.
If that doesn't work, try to increase the limit as high as possible.
If the limit is still less than the minimum, panic.
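A minimal sketch of that policy, assuming a hypothetical `set_limit` callback that returns the limit the operating system actually granted; Zebra's real code talks to the OS instead.
```rust
fn raise_limit(current: u64, ideal: u64, minimum: u64, set_limit: &mut dyn FnMut(u64) -> u64) -> u64 {
    let mut limit = current;
    if limit < ideal {
        // Try the ideal limit first...
        limit = set_limit(ideal);
        if limit < ideal {
            // ...and if that doesn't work, raise it as high as the OS allows.
            limit = set_limit(u64::MAX);
        }
    }
    // If the limit is still below the minimum, we can't run safely.
    assert!(limit >= minimum, "limit {} is below the required minimum {}", limit, minimum);
    limit
}

fn main() {
    // Example: an OS that clamps every request to 4096.
    let mut clamp = |requested: u64| requested.min(4096);
    assert_eq!(raise_limit(1024, 8192, 2048, &mut clamp), 4096);
}
```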
The `CoinbaseData` parses the block height separately from the rest of the
free-form coinbase data. However, it had two bugs:
1. It did not require that the height was canonically encoded;
2. Its canonical encoding was incorrect relative to the BIP34-inherited encoding.
This meant that we computed some transaction hashes incorrectly: when we re-serialized the coinbase transaction, we serialized it canonically, but using the incorrect definition of canonical (bug 2). And we didn't notice that the wrong definition was being used, because we accepted what we thought were non-canonically encoded heights (bug 1).
The relevant rules are here: 877212414a/src/script/script.h (L307-L346)
This commit changes the encoding to reject non-canonically encoded heights, and
to match the correct encoding rules. We check that at least one
non-canonically encoded height is correctly rejected using a new test vector.
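As a hedged illustration of the canonical-encoding rule referenced above (the script-number minimality rule from the linked `script.h`), the check is roughly the following; this is not Zebra's actual parser.
```rust
/// Returns true if `bytes` is a minimally-encoded script number: a trailing
/// 0x00 or 0x80 byte is only allowed when the previous byte needs its high bit
/// for the value rather than the sign.
fn is_minimally_encoded(bytes: &[u8]) -> bool {
    match bytes.last() {
        // Zero is encoded as an empty byte vector.
        None => true,
        // The last byte carries only the sign bit: it is redundant unless the
        // byte before it already uses its own high bit.
        Some(&last) if last & 0x7f == 0 => bytes.len() > 1 && bytes[bytes.len() - 2] & 0x80 != 0,
        Some(_) => true,
    }
}

fn main() {
    assert!(is_minimally_encoded(&[0xca, 0x00])); // 202 needs the extra sign byte
    assert!(!is_minimally_encoded(&[0x01, 0x00])); // 1 with a redundant trailing zero
}
```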
The database format increments because we saved a bunch of wrongly encoded blocks.
This discrepancy was originally noticed by @teor2345, who pointed out that a
previous version of the block 202 test vector (now preserved as "bad block
202") did not match the block from zcashd.
As a side effect of computing Merkle roots, we build a list of
transaction hashes. Instead of discarding these, add them to
PreparedBlock and FinalizedBlock so that they can be reused rather than
recomputed.
This commit adds Merkle root validation to:
1. the block verifier;
2. the checkpoint verifier.
In the first case, Bitcoin Merkle tree malleability has no effect,
because only a single Merkle tree in each malleability set is valid (the
others have duplicate transactions).
In the second case, we need to check that the Merkle tree does not contain any
duplicate transactions.
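A hedged sketch of the duplicate check for the checkpoint case; the function name and hash type are illustrative.
```rust
use std::collections::HashSet;

/// A Merkle root is only trustworthy if the transaction list contains no
/// duplicate hashes, which rules out the other members of its malleability set.
fn transaction_hashes_are_unique(transaction_hashes: &[[u8; 32]]) -> bool {
    let unique: HashSet<&[u8; 32]> = transaction_hashes.iter().collect();
    unique.len() == transaction_hashes.len()
}

fn main() {
    let (a, b) = ([1u8; 32], [2u8; 32]);
    assert!(transaction_hashes_are_unique(&[a, b]));
    assert!(!transaction_hashes_are_unique(&[a, b, a]));
}
```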
Closes #1385. Closes #906.
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions
Add a `max_len` argument to support `FindHeaders` requests.
Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.
* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Some checks use the same blocks, so we take a copy of the block borrows
before using them. That way, we don't have to manage the position of the
iterator between checks.
Temporary fix so that Zebra's default logs support a typical workflow:
1. Developer or user runs Zebra with the default config
2. They send the logs to a terminal
3. When they see a bug, they copy-paste the last few log lines into a
bug report
This is the same change that was merged in #1373 and reverted in #1375.
We'll create a consistent logging design for Zebra in ticket #1381.
* Make debug_stop_at_height and ephemeral work together
* if `debug_stop_at_height` and `ephemeral` are set, delete the database
files after reaching the stop height
* drop or flush the database before `debug_stop_at_height` exits Zebra
This commit changes the state system and database format to track the
provenance of UTXOs, in addition to the outputs themselves.
Specifically, it tracks the following additional metadata:
- the height at which the UTXO was created;
- whether or not the UTXO was created by a coinbase transaction (see the sketch below).
This metadata will allow us to:
- check the coinbase maturity consensus rule;
- check the coinbase inputs => no transparent outputs rule;
- implement lookup of transactions by utxo (using the height to find the
block and then scanning the block) for a future RPC mechanism.
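A hedged sketch of the tracked UTXO data, using stand-in types; Zebra's actual structs live in zebra-chain and zebra-state, and their names may differ.
```rust
/// Stand-in for a transparent output (value and lock script).
type Output = Vec<u8>;
/// Stand-in for `block::Height`.
type Height = u32;

/// An unspent transparent output, plus the provenance metadata listed above.
struct Utxo {
    /// The output itself.
    output: Output,
    /// The height at which the output was created.
    height: Height,
    /// Whether the output was created by a coinbase transaction.
    from_coinbase: bool,
}

fn main() {
    let utxo = Utxo { output: Output::new(), height: 419_200, from_coinbase: false };
    // The height and coinbase flag are exactly what the maturity and
    // coinbase-spend rules need to look at.
    assert!(!utxo.from_coinbase && utxo.height > 0 && utxo.output.is_empty());
}
```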
Closes #1342
This provides useful and not too noisy output at INFO level. We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
This change introduces two new types:
- `PreparedBlock`, representing a block which has undergone semantic
validation and has been prepared for contextual validation;
- `FinalizedBlock`, representing a block which is ready to be finalized
immediately;
and changes the `Request::CommitBlock` and `Request::CommitFinalizedBlock` variants to use these types instead of their previous fields.
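A hedged sketch of the two payload types, with stand-in types; the real structs carry additional precomputed data (for example transaction hashes and new UTXOs).
```rust
use std::sync::Arc;

/// Stand-ins for `zebra_chain::block` types.
type Block = Vec<u8>;
type Hash = [u8; 32];
type Height = u32;

/// A block that has passed semantic validation and is ready for contextual validation.
struct PreparedBlock {
    block: Arc<Block>,
    hash: Hash,
    height: Height,
}

/// A block that is ready to be committed to the finalized state immediately.
struct FinalizedBlock {
    block: Arc<Block>,
    hash: Hash,
    height: Height,
}

fn main() {
    let block = Arc::new(Block::new());
    let prepared = PreparedBlock { block: block.clone(), hash: [0; 32], height: 0 };
    let finalized = FinalizedBlock { block, hash: prepared.hash, height: prepared.height };
    assert_eq!(prepared.height, finalized.height);
}
```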
This change solves the problem of passing data between semantic
validation and contextual validation, and cleans up the state code by
allowing it to pass around a bundle of data. Previously, the state code
just passed around an `Arc<Block>`, which forced it to needlessly
recompute block hashes and other data, and was incompatible with the
already-known but not-yet-implemented data transfer requirements, namely
passing in the Sprout and Sapling anchors computed during contextual
validation.
This commit propagates the `PreparedBlock` and `FinalizedBlock` types
through the state code but only uses their data opportunistically, e.g.,
changing .hash() computations to use the precomputed hash. In the
future, these structures can be extended to pass data through the
verification pipeline for reuse as appropriate. For instance, these
changes allow the sprout and sapling anchors to be propagated through
the state.
The behavior of a request for a UTXO from a previous block depends on
whether that block has already been submitted to the state, or not:
* if it has, the state should be able to find it and answer immediately.
* if it has not, the state should see it in a later request.
However, the previous code only checked committed blocks, not queued
blocks, so if the block containing the UTXO had already arrived but had
not been committed, it would never be scanned.
This patch fixes the problem but is a bad solution, duplicating
computation between the block verifier and the state. A better fix
follows in the next commit.
Make tracing messages more concise by omitting information already
contained in a parent span and by shortening messages. This makes them
easier to read.
Previously, this function was instrumented with a span containing the parent hash that was passed into the function. But it doesn't make
sense to consider the work done by the function as happening in the
context of the supplied parent hash (as distinct from the context of the
hash of the newly arrived block, which is already contained in an outer
span), so this adds noise without conveying extra context.
Instead, use events that occur within the context of the existing spans.
Here the span is added to the body of the `Service::call`
implementation, not to the futures it returns, because the state service
does all of the work synchronously in `call` rather than in the futures
it returns.
The service is skipped as a span field. We could either include or
exclude the request itself. It would be useful, but the request body
can be very large. Instead, we make two spans, one at info level and
one at trace level, and filter that way.
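A hedged sketch of the two-span approach; the span and field names are illustrative, not Zebra's exact ones.
```rust
use tracing::{info_span, trace_span};

fn call(request_summary: &str, request_body: &str) {
    // A terse span at info level: always useful, omits the possibly large body.
    let info = info_span!("state_call", req = %request_summary);
    // A verbose span at trace level: carries the full request, and is filtered
    // out unless the operator opts in to trace-level output.
    let detail = trace_span!("state_call_detail", req = %request_body);
    let _info_guard = info.enter();
    let _detail_guard = detail.enter();
    // ... the state service does its work synchronously here, inside `call` ...
}

fn main() {
    tracing_subscriber::fmt().with_max_level(tracing::Level::INFO).init();
    call("CommitBlock", "CommitBlock { block: ... }");
}
```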
Some systems have a very small /dev/shm, for example, see:
https://github.com/docker-library/postgres/issues/416
So we should just use the temporary directory on all operating systems.
Also:
* use TempDir to generate the temporary path
* delete the code that we copied from sled
* prefix the temporary path with the state version and network
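A hedged sketch of the temporary path scheme, using the `tempfile` crate; the prefix format is an assumption.
```rust
use tempfile::{Builder, TempDir};

/// Create an ephemeral state directory in the OS temporary directory (not /dev/shm),
/// prefixed with the state version and network so leftovers are easy to identify.
fn ephemeral_state_dir(state_version: u32, network: &str) -> std::io::Result<TempDir> {
    Builder::new()
        .prefix(&format!("zebra-state-v{}-{}-", state_version, network))
        .tempdir()
}

fn main() -> std::io::Result<()> {
    let dir = ephemeral_state_dir(7, "mainnet")?;
    println!("ephemeral state at {}", dir.path().display());
    Ok(())
}
```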
## Motivation
Prior to this PR we've been using `sled` as our database for storing persistent chain data on disk between boots. We picked sled over rocksdb to minimize our C++ dependencies, despite it being a less mature codebase. The theory was that if it worked well enough, we'd prefer to have a pure Rust codebase, but if we ever ran into problems we knew we could easily swap it out for rocksdb.
Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, its write performance was poor, causing us to become bottlenecked on sled instead of the network.
## Solution
This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain.
The code in this pull request has:
- [x] Documentation Comments
- [x] Unit Tests and Property Tests
## Review
@hdevalence
This change explicitly documents cancellation contracts for our Tower services,
and tries to correct a bug in the implementation of the CheckpointVerifier,
which duplicates information from the state service but did not ensure that it
would be kept in sync.
This change has two benefits:
* reduces conflicts with the sled refactor and any replacement
* allows the function to be called independently for testing
`check_contextual_validity` mistakenly used the new block's hash to try
to get the parent block from the state. This caused a panic, because the
new block isn't in the state yet.
Use `StateService::chain` to get the parent block, because we'll be
using `chain` for difficulty adjustment contextual verification anyway.
* Add internal iterator API for accessing relevant chain blocks
* get blocks from all chains in non_finalized state
* Impl FusedIterator for service::Iter
* impl ExactSizeIterator for service::Iter
* let size_hint find heights in side chains
Co-authored-by: teor <teor@riseup.net>
* Add transcript test for requests while state is empty
* Add happy path test for each query once the state is populated
* let populate logic handle out of order blocks
* Add a maximum queued height metric to the finalized state
And rename all the finalized state metrics to contain "finalized".
* Use i32 and -1 instead of Option<Height>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Prior to this PR we realized that the RFC had been drafted with the assumption that chains would be ordered from best to worst in `NonFinalizedState`. This assumption was incorrect, since `BTreeSet` only ever orders values in ascending order. This discrepancy was noticed and fixed in the code, but there were still some inconsistencies that needed to be cleaned up.
This PR updates all the incorrect or confusing comments about chain ordering in the RFC and code.
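A small illustration of the ordering fact behind those fixes (not Zebra code): `BTreeSet` always iterates in ascending order, so the best chain is the *last* element, not the first.
```rust
use std::collections::BTreeSet;

fn main() {
    // Stand-in "chains", ordered by cumulative work.
    let chains: BTreeSet<u32> = [3, 1, 2].into_iter().collect();
    assert_eq!(chains.iter().next(), Some(&1)); // worst chain first
    assert_eq!(chains.iter().next_back(), Some(&3)); // best chain last
}
```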
Prior to this PR, `memory_state` defined and implemented functionality for three different types: `Chain`, `NonFinalizedState`, and `QueuedBlocks`. Each of these components will need a fair number of unit tests, and as it's currently organized it would be difficult to organize the tests or figure out at a glance which tests cover which components.
This PR changes the organization of `memory_state` so that each component it exports is defined in its own module. In follow-up PRs each module will get its own test module, which will focus exclusively on unit tests for the item defined therein.
- [Tracking Issue](https://github.com/ZcashFoundation/zebra/issues/1250)
* make service use both finalized and non-finalized state
* Document new functions
* add documentation to sled fns
* cleanup tip fn now that errors are gone
* rename height unwrap fn
## Motivation
While working on the block locator fix PR together with Henry we noticed that we'd accidentally serialized entire transactions in `tx_by_hash`, instead of serializing just the height of the block and the index of the transaction within the block, as described by the original RFC.
## Solution
We've fixed it by adding a `TransactionLocation` new type, which handles the sled format traits. We've removed the sled format impls for `Transaction` to prevent inserting the wrong data in the future. Finally we've bumped the database format to reflect the change in the format on the disk and its incompatibility with previous versions.
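A hedged sketch of the new type; Zebra's actual `TransactionLocation` may use different field types, and its sled serialization impls are not shown.
```rust
/// Where to find a transaction: the height of the block that contains it, and
/// the transaction's index within that block. This is what `tx_by_hash` stores
/// instead of the whole serialized transaction.
struct TransactionLocation {
    height: u32,
    index: u32,
}

fn main() {
    let location = TransactionLocation { height: 500_000, index: 3 };
    assert_eq!((location.height, location.index), (500_000, 3));
}
```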
Closes #1026
Because of the way that sled uses this parameter, the actual in-memory
size may be much larger. Dialing this down should help avoid high
memory usage.
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests
And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
## Motivation
The zebra-state service needs to be able to handle duplicate blocks.
## Solution
This implements changes already outlined by [The State
RFC](https://zebra.zfnd.org/dev/rfcs/0005-state-updates.html). We check for
successfully committed blocks first, since interacting with the queued blocks struct at that point would just complicate the implementation. If the block has not already been committed, we then check whether it has already been queued; if not, we handle the block normally (that is, with the handling we already had implemented).
## Documentation Changes
- [x] Update the state RFC to match the ways this fix departs from the design
- the main thing is that I switched the order of checking for duplicates
- [x] ~~Add newly added functions to the state rfc~~ Decided not to do this because they're minor getters that don't influence the rest of the design and aren't exposed as part of the API
- [x] Document newly added functions inline
## Related Issues
- fixes https://github.com/ZcashFoundation/zebra/issues/1182
- tracking issue https://github.com/ZcashFoundation/zebra/issues/1049
Co-authored-by: teor <teor@riseup.net>
We already use an actor model for the state service, so we get an
ordered sequence of state queries by message-passing. Instead of
performing reads in the futures we return, this commit performs them
synchronously. This means that all sled access is done from the same
task, which
(1) might reduce contention, and (2) allows us to avoid using sled transactions when writing to the state.
Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
This commit begins integrating `zcash_script` with the rest of the system, so that scripts can be verified while syncing the block chain. It adds support for looking up UTXOs from the state service, and implements the first parts of the `script::Verifier`, which looks up the necessary UTXOs in the state and then calls `zcash_script` to verify the script itself.
Co-authored-by: teor <teor@riseup.net>
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names
* Begin work on RFC5 implementation
* I think this is necessary
* holy shit supertrait implemented via subtrait
* implement most of the chain functions
* change to slightly better name
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* resolve review comment
* split utxo into two sets
* update the Chain definition
* just a little more
* update comment
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* apply changes from code review
* remove allow attribute in zebra-state/lib.rs
* Update zebra-state/src/memory_state.rs
Co-authored-by: teor <teor@riseup.net>
* merge ChainSet type into MemoryState
* rename state impl types
* Add error messages to asserts
* checkpoint so I can split off arbitrary changes into a PR
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* run rustfmt
* try to fix github actions syntax
* differentiate name
* prove that github action tests zebra-chain build without features
* revert change from last commit now that test is running
* remove accidentally introduced newline
* checkpoint
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* apply change from chain impl PR
* setup config for proptests
* Update zebra-chain/src/block/arbitrary.rs
Co-authored-by: teor <teor@riseup.net>
* run rustfmt
Co-authored-by: teor <teor@riseup.net>
* Begin work on RFC5 implementation
* I think this is necessary
* holy shit supertrait implemented via subtrait
* implement most of the chain functions
* change to slightly better name
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* resolve review comment
* split utxo into two sets
* update the Chain definition
* just a little more
* update comment
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* apply changes from code review
* remove allow attribute in zebra-state/lib.rs
* Update zebra-state/src/memory_state.rs
Co-authored-by: teor <teor@riseup.net>
* merge ChainSet type into MemoryState
* rename state impl types
* Add error messages to asserts
* add module doc comment
* update RFC for utxos
* add missing header
Co-authored-by: teor <teor@riseup.net>
Using a Buffer with size 1 is a footgun because it allows only one
sender to call poll_ready at a time. This is usually undesirable
because it means that a task or service that calls poll_ready but only
makes a service call later (potentially much later) will block all other
callers.
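A hedged sketch of the difference the bound makes, using a stand-in service; the bound of 4 is purely illustrative.
```rust
use tower::{buffer::Buffer, service_fn, ServiceExt};

#[tokio::main]
async fn main() {
    // Stand-in service.
    let svc = service_fn(|n: u32| async move { Ok::<u32, std::convert::Infallible>(n + 1) });

    // A bound of 1 lets only one caller hold a `poll_ready` reservation at a time;
    // a larger bound lets several tasks reserve capacity concurrently, so a caller
    // that reserves early but calls late doesn't block everyone else.
    let buffered = Buffer::new(svc, 4);

    assert_eq!(buffered.oneshot(41).await.unwrap(), 42);
}
```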
The previous code filled in block height 0 for a missing coinbase height
in `SledState::commit_finalized`, since the genesis block is the only
block without a coinbase height (because of a mistake when it was
created).
However, @teor2345 noticed that this is incorrect, because we already
parse the genesis block specially and fill in its coinbase height
correctly. So instead, we can .expect it to be present, because we can
assume that all finalized blocks are valid.
The new `StateService` type wraps a `SledState` and a `MemoryState`.
This will allow the sled-related code and the in-memory code to be kept
separate, with the top-level `StateService` making method calls to one
or the other, as appropriate.
This commit removes the existing Service impl for the SledService. This
saves time in refactoring, and the code needs to be rewritten anyway, so there's no loss in deleting it now.
* Remove in-memory state service
* make the config compatible with toml again
* checkpoint commit to see how much I still have to revert
* back to the starting point...
* remove unused dependency
* reorganize error handling a bit
* need to make a new color-eyre release now
* reorder again because I have problems
* remove unnecessary helpers
* revert changes to config loading
* add back missing space
* Switch to released color-eyre version
* add back missing newline again...
* improve error message on unix when terminated by signal
* add context to last few asserts in acceptance tests
* instrument some of the helpers
* remove accidental extra space
* try to make this compile on windows
* reorg platform specific code
* hide on_disk module and fix broken link
Fixes a race condition between the height and hash tree updates, when
they are executed simultaneously with GetDepth. `get_tip` uses the height
tree, but `get` uses the hash tree.
This is the first in a sequence of changes that change the block:: items
to not include Block as a prefix in their name, in accordance with the
Rust API guidelines.
* checkpoint: reject older of duplicate verification requests.
If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled. Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.
* sync: add a timeout layer to block requests.
Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.
* sync: restart syncing on error
Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.
* sync: additional debug info
* sync: handle lookahead limit correctly.
Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped. This meant that completed tasks could be left until the limit was
exceeded again. Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.
* network: add debug instrumentation to retry policy
* sync: instrument the spawned task
* sync: streamline ObtainTips/ExtendTips logic & tracing
This change does three things:
1. It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method. This means that when debugging we only have one
deduplication algorithm to focus on.
2. It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.
3. It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip"). The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.
* sync: hack to work around zcashd behavior
* sync: localize debug statement in extend_tips
* sync: change algorithm to define tips as pairs of hashes.
This is different enough from the existing description that its comments no
longer apply, so I removed them. A further chunk of work is to change the sync
RFC to document this algorithm.
* sync: reduce block timeout
* state: add resource limits for sled
Closes#888
* sync: add a restart timeout constant
* sync: de-pub constants
Instead of creating a block locator all the way back to the genesis
block, only ask for blocks within the reorg limit (99 blocks).
Use the reorg limit as the final locator. (Or if the chain is less
than 99 blocks, use the genesis block.)
Fixes some instances of #818 at very small block heights.
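A hedged sketch of the locator-height selection; the constant name and function shape are assumptions, only the 99-block reorg limit comes from the description above.
```rust
/// The reorg limit: no chain fork can be longer than this.
const MAX_BLOCK_REORG_HEIGHT: u32 = 99;

/// Heights to put in a block locator for a chain with the given tip height: the
/// tip, then exponentially spaced heights, ending at the reorg limit (or the
/// genesis block for chains shorter than the limit), included exactly once.
fn block_locator_heights(tip: u32) -> Vec<u32> {
    let min = tip.saturating_sub(MAX_BLOCK_REORG_HEIGHT);
    let mut heights = Vec::new();
    let mut height = tip;
    let mut step = 1;
    while height > min {
        heights.push(height);
        height = height.saturating_sub(step);
        step *= 2;
    }
    heights.push(min);
    heights
}

fn main() {
    assert_eq!(block_locator_heights(10), vec![10, 9, 7, 3, 0]);
    assert_eq!(block_locator_heights(0), vec![0]);
}
```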
The state service was providing block locators starting at the parent of
the current tip. Instead, include the current tip in the block locator.
Also handle an edge case where we could include the genesis block twice,
if the current tip height was a power of two.
Fixes an instance of #818 where we re-download the current tip.
* Load tracing filter only from config and simplify logic.
* Configure the state storage in the config, not an environment variable.
This also changes the config so that the path is always set rather than being optional, because Zebra always needs a place to store its state.
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case
Co-authored-by: Jane Lusby <jane@zfnd.org>
Only hash block headers in the lowest-level block index code.
This design has a few benefits:
- failures are obvious, because the hash is not available,
- get_tip() returns a smaller object,
- we avoid re-hashing block headers multiple times.
These efficiency changes may be needed to support chain reorganisations,
multiple tips, and heavy query loads.
Move block header hashing from zebra-consensus to zebra-state.
Handle zebra-state AddBlock errors in zebra-consensus BlockVerifier.
Add unit tests for BlockVerifier state error handling.
Part of #428.
* rename zebra-storage to zebra-state
* Setup initial skeleton for zebra-state
* add test
* Apply suggestions from code review
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
* move shared test vectors to a common crate
Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>