* Split mempool storage errors into tip-based and chain-based
* Expire tip rejections every time we get a new block
FailedVerification and SpendConflict rejections only apply to the current tip.
The next tip can provide missing inputs, or evict conflicting transactions.
* Enforce MAX_EVICTION_MEMORY_ENTRIES for mempool reject lists
* Implement a task that gossips verified block hashes
* Log an info message for block broadcasts
* Simplify the gossip task
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Re-use the old tip change if there is no new tip change
Also improve the comments.
* Add an assertion message
* Rename task join handles and futures in start method
* Add a dedicated BlockGossipError type
This type helps distinguish between syncer and state errors.
* Test that committed blocks are gossiped to peers
Also do a minor type cleanup on the existing test code,
replacing `Option<Vec<_>>` with `Vec<_>`.
* Formatting
* Remove excess newlines
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Clear the initial gossiped blocks during test setup
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Remove unused mempool storage errors
Preparation for ticket #2819.
Removing these errors means that we don't have to decide
which type of transaction ID match we want for them.
* Remove unused mempool errors, and deduplicate storage errors
* rustfmt
* Add a mempool transaction removal method for mined IDs
And use this method to remove expired transactions,
because all transactions with the same mined ID expire at the same height.
* Remove mined transaction IDs from the mempool
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Rename type parameter to be more explicit
Replace the single letter with a proper name.
* Remove imports for `Request` and `Response`
The type names will conflict with the ones for the mempool service.
* Attach `Mempool` service to the `Crawler`
Add a field to the `Crawler` type to store a way to access the `Mempool`
service.
* Forward crawled transactions to downloader
The crawled transactions are now sent to the transaction downloader and
verifier, to be included in the mempool.
* Derive `Eq` and `PartialEq` for `mempool::Request`
Make it simpler to use the `MockService::expect_request` method.
* Test if crawled transactions are downloaded
Create some dummy crawled transactions, and let the crawler discover
them. Then check if they are forwarded to the mempool to be downloaded
and verified.
* Don't send empty transaction ID list to downloader
Ignore response from peers that don't provide any crawled transactions.
* Log errors when forwarding crawled transaction IDs
Calling the Mempool service should not fail, so if an error happens it
should be visible. However, errors when downloading individual
transactions can happen from time to time, so there's no need for them
to be very visible.
* Document existing `mempool::Crawler` test
Provide some depth as to what the test expect from the crawler's
behavior.
* Refactor to create `setup_crawler` helper function
Make it easier to reuse the common test setup code.
* Simplify code to expect requests
Now that `zebra_network::Request` implement `Eq`, the call can be
simplified into `expect_request`.
* Refactor to create `respond_with_transaction_ids`
A helper function that checks for a network crawl request and responds
with the given list of crawled transaction IDs.
* Refactor to create `crawler_iterator` helper
A function to intercept and respond to the fanned-out requests sent
during a single crawl iteration.
* Refactor to create `respond_to_queue_request`
Reduce the repeated code necessary to intercept and reply to a request
for queuing transactions to be downloaded.
* Add `respond_to_queue_request_with_error` helper
Intercepts a mempool request to queue transactions to be downloaded, and
responds with an error, simulating an internal problem in the mempool
service implementation.
* Derive `Arbitrary` for `NetworkUpgrade`
This is required for deriving `Arbitrary` for some error types.
* Derive `Arbitrary` for `TransactionError`
Allow random transaction errors to be generated for property tests.
* Derive `Arbitrary` for `MempoolError`
Allow random Mempool errors to be generated for property tests.
* Test if errors don't stop the mempool crawler
The crawler should be robust enough to continue operating even if the
mempool service fails to download transactions or even fails to handle
requests to enqueue transactions.
* Reduce the log level for download errors
They should happen regularly, so there's no need to have them with a
high visibility level.
Co-authored-by: teor <teor@riseup.net>
* Stop crawler if service stops
If `Mempool::poll_ready` returns an error, it's because the mempool
service has stopped and can't handle any requests, so the crawler should
stop as well.
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Move mempool tests into `tests::vector` sub-module
Make it consistent with other test modules and prepare for adding
property tests.
* Reorder imports
Make it consistent with the general guidelines followed on other
modules.
* Export `ChainTipBlock` and `ChainTipSender`
Allow these types to be used in other crates for testing purposes.
* Derive `Arbitrary` for `ChainTipBlock`
Make it easy to generate random `ChainTipBlock`s for usage in property
tests.
* Refactor to move test methods into `tests` module
Reduce the repeated test configuration attributes and make it easier to
see what is test specific and what is part of the general
implementation.
* Add a `Mempool::dummy_call` test helper method
Performs a dummy call just so that `poll_ready` gets called.
* Use `dummy_call` in existing tests
Replace the custom dummy requests with the helper method.
* Test if the mempool is cleared on chain reset
A chain reset should force the mempool storage to be cleared so that
transaction verification can restart using the new chain tip.
* Test if mempool is cleared on syncer restart
If the block synchronizer falls behind and then starts catching up
again, the mempool should be disabled and therefore the storage should
be cleared.
* Send mined transaction IDs to the download and verify task for cancellation
* Pass a HashSet of transaction hashes to be cancelled
* Add mempool_cancel_mined() test
* Fix starvation in test
* Fix typo in comment
* mempool - support transaction expiration
* use `LatestChainTip` instead of state call
* clippy
* remove spawn task
* remove non needed async from function
* remove return value
* add a `expiry_height_mut()` method to `Transaction` for testing purposes
* fix `remove_expired_transactions()`
* add a `mempool_transaction_expiration()` test
* tidy cleanup to `expiry_height()`
* improve docs
* fix the build
* try fix macos build
* extend tests
* add doc to function
* clippy
* fix build
* start tests at block two
* Cancel download and verify tasks when the mempool is deactivated
* Refactor enable/disable logic to use a state enum
* Add helper test functions to enable/disable the mempool
* Add documentation about errors on service calls
* Improvements from review
* Improve documentation
* Fix bug in test
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Add `Transaction::spent_outpoints` getter method
Returns an iterator over the UTXO `OutPoint`s spent by the transaction.
* Add `mempool::Error::Conflict` variant
An error representing that a transaction was rejected because it
conflicts with another transaction that's already in the mempool.
* Reject conflicting mempool transactions
Reject including a transaction in the mempool if it spends outputs
already spent by, or reveals nullifiers already revealed by another
transaction in the mempool.
* Fix typo in documentation
Remove the `r` that was incorrectly added.
Co-authored-by: teor <teor@riseup.net>
* Specify that the conflict is a spend conflict
Make the situation clearer, because there are other types of conflict.
Co-authored-by: teor <teor@riseup.net>
* Clarify that the outpoints are from inputs
Because otherwise it could lead to confusion because it could also mean
the outputs of the transaction represented as `OutPoint` references.
Co-authored-by: teor <teor@riseup.net>
* Create `storage::tests::vectors` module
Refactor to follow the convention used for other tests.
* Add an `AtLeastOne::first_mut` method
A getter to allow changing the first element.
* Add an `AtLeastOne::push` method
Allow appending elements to the collection.
* Derive `Arbitrary` for `FieldNotPresent`
This is just to make the code that generates arbitrary anchors a bit
simpler.
* Test if conflicting transactions are rejected
Generate two transactions (either V4 or V5) and insert a conflicting
spend, which can be either a transparent UTXO, or a nullifier for one of
the shielded pools. Check that any attempt to insert both transactions
causes one to be accepted and the other to be rejected.
* Delete a TODO comment that we decided not to do
Co-authored-by: teor <teor@riseup.net>
* Add tests for mempool Request::Queue
* Update test to work after refactoring
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Send `Response::Nil` instead of sending empty `Message`s
This matches `zcashd`'s behaviour more closely.
In most cases, the network layer filters these out already.
But this change makes the the inbound service code clearer.
* revert changes made to `AdvertiseTransactionIds` and `PushTransaction`
* remove newline
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Update the expiry TODO
* Clear the mempool at a chain tip reset
* Clear the mempool by using a sync method (#2777)
* Clear the mempool by using a sync method
* Update docs
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* Refactor last_tip_change()
* Apply suggestions from code review
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Fix brackets
* Use best_tip_block instead of manual borrowing
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Remove return of redundant vector length
An attempt to improve readability a bit by not returning a tuple with a
value that can be obtained from a single return type.
* Refactor `unmined_transactions_in_blocks`
Use a more functional style to try to make it a bit clearer.
* Use ranges in `unmined_transactions_in_blocks`
Allow a finer control over the block range to extract the transactions
from.
* Refactor mempool test code to improve clarity
It was previously not clear that only the first genesis transaction was
being used. The remaining transactions in the genesis block were
discarded and then readded later.
* Replace `oneshot` with `call`
Remove a redundant internal `ready_and` call.
* Return an `impl Iterator` instead of a `Vec<_>`
Remove unnecessary deserializations and heap allocations.
* Refactor `mempool_storage_basic_for_network` test
Make the separation between the transactions expected to be in the
mempool and those expected to be rejected clearer.
* Replace `Iterator` with `DoubleEndedIterator`
Allow getting the last transaction more easily.
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Use `MockService` in inbound test
Refactor the `mempool_requsets_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.
* Use `MockService` in the basic mempool test
Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.
* Remove the `mock_peer_set` helper function
It is not used anymore, since the usages were replaced with
`MockService`s.
* add tests for mempool inbound requests
* Use MockService for transaction verifier
* Refactor creation of mock `peer_set`
Use the same style as the mock transaction verifier.
* Derive `Eq` for `zebra_network::Request`
Make it easy to use the `MockService::expect_request` method.
* Return mocked peer set service from `setup`
Allow it to be used to respond to requests.
* Add bindings for the transaction used for testing
Allow them to be moved into futures later.
* Respond to transaction download request
Make sure that the test transaction appears to the mempool as if it had
been downloaded by the peer set service.
* Assert that no unexpected requests were received
Check that the mempool doesn't send unexpected requests to the peer set
service.
* add tests for mempool inbound requests
* Use MockService for transaction verifier
* add missing `expect_no_requests` to `mempool_advertise_transaction_ids` test
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Use `MockService` in inbound test
Refactor the `mempool_requsets_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.
* Use `MockService` in the basic mempool test
Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.
* Remove the `mock_peer_set` helper function
It is not used anymore, since the usages were replaced with
`MockService`s.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Allow deliberate instances of the new nightly clippy::derivable_impls lint
We want our config defaults to be explicit.
Not so sure about the application defaults, but they also contain a config.
* Also allow unknown lint names
Stable doesn't know about this lint, but nightly does.
* Implement initial service mocking helpers
Adds a [`MockService`] type, which can be configured and built for usage
in unit tests or proptests. The mocked service can then be used to
intercept requests and respond indivdiually to them.
* Use `MockService in the `mempool::Crawler` test
Refactor it to remove the helper mock function, and use the new
`MockService` helper type.
* Use `MockService` in `CandidateSet` test vectors
Refactor to remove the manual mocking of the peer set service.
* Panic if a response is not sent by `MockService`
Change the current semantics to require all `MockService` usages to
respond to every intercepted request.
A `must_use` attribute was added to the `ResponseSender` so that the
compiler can warn when this doesn't happen.
* Allow generic error types in `MockService`
Replace the hard-coded `BoxError` as the `Service`'s error type with a
generic type parameter. This allows mocking services in locations that
require specific error types.
* Add a `ResponseSender::request` getter
Allow inspecting the request again before responding, and using
information from the request in the response.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Pass sync_status to mempool
* Update zebrad/src/components/mempool.rs
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Remove enabled flag for now; will be handled in #2723
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Check if tx already exists in mempool or state before downloading
* Reorder checks
* Add rejected test; refactor into separate function
* Wrap mempool in buffered service
* Rename RejectedTransactionsById -> RejectedTransactionsIds
* Add RejectedTransactionIds response; fix request name
* Organize imports
* add a test for Storage::rejected_transactions
* add test for mempool `Request::RejectedTransactionIds`
* change buffer size to 1 in the test
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Using `&mut self` as the receiver in the method signatures allows Rust
to infer that the type is properly `Sync`, and therefore `Send`. This
allows removing the `Mutex` work-around.
* Decide if Zebra is at the chain tip
* Avoid division by zero
* Try increasing EVENT_TIMEOUT
* Increase MAX_TEST_EXECUTION
* Implement basic tests
* Resolve Clippy's erorrs
* change doc comments to normal
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* reply to `Request::MempoolTransactionIds`
* remove boilerplate
* get storage from mempool with a method
* change panic message
* try fix for mac
* use normal init instead of init_tests for state service
* newline
* rustfmt
* fix test build
* Rename ChainTipReceiver to CurrentChainTip
`fastmod ChainTipReceiver CurrentChainTip zebra*`
* Update chain tip documentation and variable names
* Basic chain tip change implementation, without resets
Also includes the following name changes:
```
fastmod CurrentChainTip LatestChainTip zebra*
fastmod chain_tip_receiver latest_chain_tip zebra*
```
* Clarify the difference between `LatestChainTip` and `ChainTipChange`
* Store a `SyncStatus` handle in the `Crawler`
The helper type will make it easier to determine if the crawler is
enabled or not.
* Pause crawler if mempool is disabled
Implement waiting until the mempool becomes enabled, so that the crawler
does not run while the mempool is disabled.
If the `MempoolStatus` helper is unable to determine if the mempool is
enabled, stop the crawler task entirely.
* Update test to consider when crawler is paused
Change the mempool crawler test so that it's a proptest that tests
different chain sync. lengths. This leads to different scenarios with
the crawler pausing and resuming.
Co-authored-by: teor <teor@riseup.net>
* Create a `SyncStatus` helper type
Keeps track if the synchronizer is close to the chain tip or not.
* Refactor `ChainSync` ctor. to return `SyncStatus`
Change the constructor API so that it returns a higher level construct.
* Test if `SyncStatus` waits for the chain tip
Test if waiting for the chain tip to be reached correctly finishes when
the chain tip is reached. This is done by sending recent sync lengths to
the `SyncStatus` instance, and checking that every time a separate
`SyncStatus` instance determines it has reached the tip the original
instance wakes up.
* Add a temporary attribute to allow dead code
The code added isn't used yet, so we'll add a temporary waiver until
another PR is merged to use them.
This avoids peer set contention when most peers are busy.
Also exit the task if the peer service returns a readiness error,
because that means it's permanently unusable.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* First pass at a Mempool Service, incl. a storage layer underneath
* Fixed up Mempool service and storage
* allow dead code where needed
* clippy
* typo
* only drain if the mempool is full
* add a basic storage test
* remove space
* fix test for when MEMPOOL_SIZE change
* group some imports
* add a basic mempool service test
* add clippy suggestions
* remove not needed allow dead code
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: teor <teor@riseup.net>
* Rename BestTipHeight so it can be generalised to ChainTipSender
`fastmod BestTipHeight ChainTipSender zebra*`
For senders:
`fastmod best_tip_height chain_tip_sender zebra*`
For receivers:
`fastmod best_tip_height chain_tip_receiver zebra*`
* Rename best_tip_height module to chain_tip
* Wrap the chain tip watch channel in a ChainTipReceiver type
* Create a ChainTip trait to avoid tricky crate dependencies
And add convenience impls for optional and empty chain tips.
* Use the ChainTip trait in zebra-network
* Replace `Option<ChainTip>` with `NoChainTip`
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Return a transaction verifier from `zebra_consensus::init`
This verifier is temporarily created separately from the block verifier's
transaction verifier.
* Return the same transaction verifier used by the block verifier
* Clarify that the mempool verifier is the transaction verifier
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Create initial `mempool::Crawler` type
The mempool crawler is responsible for periodically asking peers for
transactions to insert into the local mempool. This initial
implementation will periodically ask for transactions, but won't do
anything with them yet.
Also, the crawler is currently configured to be always enabled, but this
should be fixed to avoid crawling while Zebra is still syncing the
chain.
* Add a timeout to peer responses
Prevent the crawler from getting stuck if there's communication with a
peer that takes too long to respond.
* Run the mempool crawler in Zebra
Spawn a task for the crawler when Zebra starts.
* Test if the crawler is sending requests
Create a mock for the `PeerSet` service to intercept requests and verify
that the transaction requests are sent periodically.
* Use `full` Tokio features when testing
Make it simpler to select the features for test builds.
Co-authored-by: teor <teor@riseup.net>
* Link to the issue for crawler activation
Make it easy to navigate from the `TODO` comment to the current project
planning.
Co-authored-by: teor <teor@riseup.net>
* Link to the issue for downloading transactions
Make it easy to navigate from the `TODO` comment to the current project
planning.
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Minimal recent sync lengths implementation
Also includes metrics and logging, to make diagnosing bugs easier.
* Add logging to check what happens when Zebra reaches the chain tip
* Add tests for recent sync lengths
- initially empty
- pruned to correct length
- newest entries go first
* Drop a redundant `/` from a Cargo.lock URL
This seems to be a nightly or beta Rust change,
but hopefully stable just accepts it.
* Use metrics histograms to avoid overwriting values
* Add detailed syncer monitoring dashboard
* Increase the recent sync length to 4
This length makes it easier to distinguish between temporary and
sustained errors/syncs.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Rename internal network requests for wide transaction IDs
fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*
* Update network transaction request/response comments
* Rename a transaction hash method for wide transaction IDs
fastmod transaction_hashes transaction_ids zebra-network
* Add UnminedTxId methods and conversions for InventoryHash
* Map WtxIds to unmined transaction network messages
Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.
* Enable WtxId mempool inventory tracking for peers
* Further clarify transaction IDs
* Use Witnessed rather than Wide for transaction IDs
And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.
* Rename a missed binding
* Remove an incorrectly named binding
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Simplify state service initialization in test
Use the test helper function to remove redundant code.
* Create `BestTipHeight` helper type
This type abstracts away the calculation of the best tip height based on
the finalized block height and the best non-finalized chain's tip.
* Add `best_tip_height` field to `StateService`
The receiver endpoint is currently ignored.
* Return receiver endpoint from service constructor
Make it available so that the best tip height can be watched.
* Update finalized height after finalizing blocks
After blocks from the queue are finalized and committed to disk, update
the finalized block height.
* Update best non-finalized height after validation
Update the value of the best non-finalized chain tip block height after
a new block is committed to the non-finalized state.
* Update finalized height after loading from disk
When `FinalizedState` is first created, it loads the state from
persistent storage, and the finalized tip height is updated. Therefore,
the `best_tip_height` must be notified of the initial value.
* Update the finalized height on checkpoint commit
When a checkpointed block is commited, it bypasses the non-finalized
state, so there's an extra place where the finalized height has to be
updated.
* Add `best_tip_height` to `Handshake` service
It can be configured using the `Builder::with_best_tip_height`. It's
currently not used, but it will be used to determine if a connection to
a remote peer should be rejected or not based on that peer's protocol
version.
* Require best tip height to init. `zebra_network`
Without it the handshake service can't properly enforce the minimum
network protocol version from peers. Zebrad obtains the best tip height
endpoint from `zebra_state`, and the test vectors simply use a dummy
endpoint that's fixed at the genesis height.
* Pass `best_tip_height` to proto. ver. negotiation
The protocol version negotiation code will reject connections to peers
if they are using an old protocol version. An old version is determined
based on the current known best chain tip height.
* Handle an optional height in `Version`
Fallback to the genesis height in `None` is specified.
* Reject connections to peers on old proto. versions
Avoid connecting to peers that are on protocol versions that don't
recognize a network update.
* Document why peers on old versions are rejected
Describe why it's a security issue above the check.
* Test if `BestTipHeight` starts with `None`
Check if initially there is no best tip height.
* Test if best tip height is max. of latest values
After applying a list of random updates where each one either sets the
finalized height or the non-finalized height, check that the best tip
height is the maximum of the most recently set finalized height and the
most recently set non-finalized height.
* Add `queue_and_commit_finalized` method
A small refactor to make testing easier. The handling of requests for
committing non-finalized and finalized blocks is now more consistent.
* Add `assert_block_can_be_validated` helper
Refactor to move into a separate method some assertions that are done
before a block is validated. This is to allow moving these assertions
more easily to simplify testing.
* Remove redundant PoW block assertion
It's also checked in
`zebra_state::service::check::block_is_contextually_valid`, and it was
getting in the way of tests that received a gossiped block before
finalizing enough blocks.
* Create a test strategy for test vector chain
Splits a chain loaded from the test vectors in two parts, containing the
blocks to finalize and the blocks to keep in the non-finalized state.
* Test committing blocks update best tip height
Create a mock blockchain state, with a chain of finalized blocks and a
chain of non-finalized blocks. Commit all the blocks appropriately, and
verify that the best tip height is updated.
Co-authored-by: teor <teor@riseup.net>
* Always send our local listener with the latest time
Previously, whenever there was an inbound request for peers, we would
clone the address book and update it with the local listener.
This had two impacts:
- the listener could conflict with an existing entry,
rather than unconditionally replacing it, and
- the listener was briefly included in the address book metrics.
As a side-effect, this change also makes sanitization slightly faster,
because it avoids some useless peer filtering and sorting.
* Skip listeners that are not valid for outbound connections
* Filter sanitized addresses Zebra based on address state
This fix correctly prevents Zebra gossiping client addresses to peers,
but still keeps the client in the address book to avoid reconnections.
* Add a full set of DateTime32 and Duration32 calculation methods
* Refactor sanitize to use the new DateTime32/Duration32 methods
* Security: Use canonical SocketAddrs to avoid duplicate connections
If we allow multiple variants for each peer address, we can make multiple
connections to that peer.
Also make sure sanitized MetaAddrs are valid for outbound connections.
* Test that address books contain the local listener address
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* implement and test a rate limit in `request_genesis()`
* add `request_genesis_is_rate_limited` test to sync
* add ensure_timeouts constraint for GENESIS_TIMEOUT_RETRY
* Suppress expected warning logs in zebrad tests
Co-authored-by: teor <teor@riseup.net>
* Standardise lints across Zebra crates, and add missing docs
The only remaining module with missing docs is `zebra_test::command`
* Todo -> TODO
* Clarify what a transcript ErrorChecker does
Also change `Error` -> `BoxError`
* TransError -> ExpectedTranscriptError
* Output Descriptions -> Output descriptions
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
* Use the git version + new commit count + hash for the app version
This helps diagnose bugs in versions of Zebra built from git branches,
rather than git version tags.
* Fill in assert
* Also log semver string
* Fix syntax
* Handle vergen using the cargo package version or raw git tag
* s/Semver/SemVer/
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Enable builds where:
* there is no google cloud git commit env var, and
* there is no `.git` directory.
By making all `vergen` env vars optional, and skipping any env vars that
don't exist.
* build(deps): bump vergen from 3.2.0 to 5.1.1
* fix hardcoded version for Tracing struct
* add additional metadata
* remove extra allocations for metadata
* Remove zebrad code version from release checklist
The zebrad code automatically uses the crate version now.
* Sort panic metadata into rough categories
Co-authored-by: teor <teor@riseup.net>
Zebra's latest alpha checkpoints on Canopy activation, continues our work on NU5, and fixes a security issue.
Some notable changes include:
## Added
- Log address book metrics when PeerSet or CandidateSet don't have many peers (#1906)
- Document test coverage workflow (#1919)
- Add a final job to CI, so we can easily require all the CI jobs to pass (#1927)
## Changed
- Zebra has moved its mandatory checkpoint from Sapling to Canopy (#1898, #1926)
- This is a breaking change for users that depend on the exact height of the mandatory checkpoint.
## Fixed
- tower-batch: wake waiting workers on close to avoid hangs (#1908)
- Assert that pre-Canopy blocks use checkpointing (#1909)
- Fix CI disk space usage by disabling incremental compilation in coverage builds (#1923)
## Security
- Stop relying on unchecked length fields when preallocating vectors (#1925)
Log a "Trying..." message before each listener opens, to see if the
delay is inside Zebra, or in the test harness or OS.
Also report the configured and actual ports where possible, for better
diagnostics.
This change encodes a bunch of invariants in the type system,
and adds explicit failure states for:
* a closed oneshot,
* bugs in the initialization code.
Use `ServiceExt::oneshot` to perform state requests.
Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflics
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests
* Add a full set of stderr acceptance test functions
This change makes the stdout and stderr acceptance test interfaces
identical.
* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path
* Use Display for state path logs
Avoids weird escaping on Windows when using Debug
* Add Windows conflict error messages
* Turn PORT_IN_USE_ERROR into a regex
And add another alternative Windows-specific port error
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
* Bump versions where appropriate
Tested with cargo install --locked --path etc
* Remove fixed panics from 'Known Issues'
* Change to alpha release series in the README
Co-authored-by: teor <teor@riseup.net>
This timeout stops the sync service hanging when it is missing required
blocks, but the lookahead queue is full of dependent verify tasks, so the
missing blocks never get downloaded.
* Add the configured network to error reports
* Log the configured network at error level
* Create the global span immediately after activating tracing
And leak the span guard, so the span is always active.
* Include panic metadata in the report and URL
* Use `Main` and `Test` in the global span
`net=Mainnet` is a bit redundant
* Rewrite GetData handling to match the zcashd implementation
`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)
This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.
Also change Zebra's GetData responses to peer request so they ignore
missing blocks.
* Stop hanging on incomplete transaction or block responses
Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
that were successfully received
2. if none of the expected blocks or transactions were received, return
an error, and close the connection
In our README, we tell users to ignore these errors, so we should also
disable the issue URL.
Also include the hash in the error. (We don't want the span active for
all messages, we just want the hash in the error.)
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.
To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.
Co-authored-by: teor <teor@riseup.net>
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions
Add a `max_len` argument to support `FindHeaders` requests.
Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.
* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
This provides useful and not too noisy output at INFO level. We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
The metrics code becomes much simpler because the current version of the
metrics crate builds its own single-threaded runtime on a dedicated worker
thread, so no dependency on the main Zebra Tokio runtime is required.
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware. This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method. See
ddc64e8d4d
for more context on the Tower changes. To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
This helps prevent overloading the network with too many concurrent
block requests. On a fast network, we're likely to still have enough
room to saturate our bandwidth. In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads. If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s. In practice, this
minimum speed will likely be much lower.
This reverts commit 656bd24ba7.
The Hedge middleware keeps a pair of histograms, writing into one in the
current time interval and reading from the previous time interval's
data. This means that the reverted change resulted in doubling all
block downloads until after at least the second measurement interval
(which means that the time measurements are also incorrect, as they're
operating under double the network load...)
Sets the default value to the previous lookahead limit. My testing on
mainnet suggested that the newly lower value (changed when the
checkpoint frequency was decreased) is low enough to cause stalls, even
when using hedged requests.
Remove the minimum data points from the syncer hedge configuragtion.
When there are no data points, hedge sends the second request
immediately.
Where there are less than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.
This change should improve genesis and post-restart sync latency.
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
The original sync algorithm split the sync process into two phases, one
that obtained prospective chain tips, and another that attempted to
extend those chain tips as far as possible until encountering an error
(at which point the prospective state is discarded and the process
restarts).
Because a previous implementation of this algorithm didn't properly
enforce linkage between segments of the chain while extending tips,
sometimes it would get confused and fail to discard responses that did
not extend a tip. To mitigate this, a check against the state was
added. However, this check can cause stalls while checkpointing,
because when a checkpoint is reached we may suddenly need to commit
thousands of blocks to the state. Because the sync algorithm now has a
a `CheckedTip` structure that ensures that a new segment of hashes
actually extends an existing one, we don't need to check against the
state while extending a tip, because we don't get confused while
interpreting responses.
This change results in significantly smoother progress on mainnet.
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
The timeout behavior in zebra-network is an implementation detail, not a
feature of the public API. So it shouldn't be mentioned in the doc
comments -- if we want timeout behavior, we have to layer it ourselves.
Using the cancel_handles, we can deduplicate requests. This is
important to do, because otherwise when we insert the second cancel
handle, we'd drop the first one, cancelling an existing task for no
reason.
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
Try to use the better cancellation logic to revert to previous sync
algorithm. As designed, the sync algorithm is supposed to proceed by
downloading state prospectively and handle errors by flushing the
pipeline and starting over. This hasn't worked well, because we didn't
previously cancel tasks properly. Now that we can, try to use something
in the spirit of the original sync algorithm.
This makes two changes relative to the existing download code:
1. It uses a oneshot to attempt to cancel the download task after it
has started;
2. It encapsulates the download creation and cancellation logic into a
Downloads struct.
* Reverse displayed endianness of transaction and block hashes
* fix zebra-checkpoints utility for new hash order
* Stop using "zebrad revhex" in zebrad-hash-lookup
* Rebuild checkpoint lists in new hash order
This change also adds additional checkpoints to the end of each list.
* Replace TransactionHash with transaction::Hash
This change should have been made in #905, but we missed Debug impls
and some docs.
Co-authored-by: Ramana Venkata <vramana@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
This reduces the API surface to the minimum required for functionality,
and cleans up module documentation. The stub mempool module is deleted
entirely, since it will need to be redone later anyways.
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names
The Inbound service only needs the network setup for some requests, but
it can service other requests without it. Making it return
Poll::Pending until the network setup finishes means that initial
network connections may view the Inbound service as overloaded and
attempt to load-shed.
The original version of this commit ran into
https://github.com/rust-lang/rust/issues/64552
again. Thanks to @yaahc for suggesting a workaround (using futures combinators
to avoid writing an async block).
Remove the seed command entirely, and make the behavior it provided
(responding to `Request::Peers`) part of the ordinary functioning of the
start command.
The new `Inbound` service should be expanded to handle all request
types.