This matches the settings for `sync_large_checkpoints_mainnet`.
Also reduce the number of blocks synced to reduce network load.
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Increment the crates that have new commits since the last version
* Increment the crates that depend on crates that have changed
* Increment the version of `zebra-script`
* Use the `zebrad` version in the `zebra-network` user agent string
* Use the `v1.0.0-alpha.19` git tag in `README.md`
* Copy the draft changelog into `CHANGELOG.md`
* Delete bumps
* Update CHANGELOG.md
Co-authored-by: teor <teor@riseup.net>
* Add newly merged PRs
Co-authored-by: teor <teor@riseup.net>
* Create a `NowOrLater` helper type
A replacement for `FutureExt::now_or_never` that ensures that the task
is scheduled for waking up later when the inner future is ready.
* Use `NowOrLater` to fix possible delay bug
Previous usage of `now_or_never` meant that the underlying task wasn't
being scheduled to awake when the `Downloads` stream produced a new
item. Using `NowOrLater` instead fixes that issue.
* Increase the restart test timeout to 10 seconds
It shouldn't take this long.
But maybe the CI VMs are under a lot of load?
* Add extensive logging to diagnose CI state reload failures
* Ignore AlreadyInChain error in the syncer
* Split Cancelled errors; add them to should_restart_sync exceptions
* Also filter 'block is already comitted'; try to detect a wrong downcast
* Rename tx downloader & verifier metrics
* Add version to mempool metrics
* Add new metrics
* Make sure mempool gauges are zeroed when instances are dropped
* Updated mempool grafana dashboard
* Removed transaction verification dashboard; moved to mempool
* Update mempool dashboard
* Add reason to error labels in mempool dashboard
* Rename some metrics per review
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Guarantee unique IDs in mempool service responses
* Guarantee unique IDs in crawler task mempool Queue requests
Also update the tests to use unique IDs.
* Add a CheckForVerifiedTransactions mempool request
Also document the mempool request and response variants.
* Spawn a QueueChecker task to check for newly verified transactions
This task makes sure that transactions reliably propagate,
rather than relying on peer requests or responses to trigger propagation.
* Update the start command documentation
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Impl Drop, Default and take() for ActiveState
* Refactor Mempool::poll_ready to check disabled and reset first
Also remove some levels of nesting.
* Use the same code for dropping and resetting the mempool
* Document where the tasks are dropped when switching states
* Log mempool resets at info level
And add heights to mempool enable/disable/reset logs
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Guarantee unique IDs in mempool service responses
* Guarantee unique IDs in crawler task mempool Queue requests
Also update the tests to use unique IDs.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Create a new VerifiedUnminedTx containing the miner fee
* Use VerifiedUnminedTx in mempool verification responses
And do a bunch of other cleanups.
* Use VerifiedUnminedTx in mempool download and verifier
* Use VerifiedUnminedTx in mempool storage and verified set
* Impl Display for VerifiedUnminedTx, and some convenience methods
* Use VerifiedUnminedTx in existing tests
* add some additional checks to the acceptance mempool test
* add an additional mempool test
* do proposed fixes to `sync_until`
* Ignore "can't kill an exited process" errors
Co-authored-by: teor <teor@riseup.net>
* Get the transaction fee from utxos
* Return the transaction fee from the verifier
* Avoid calculating the fee for coinbase transactions
Coinbase transactions don't have fees. In case of a coinbase transaction, the
verifier returns a zero fee.
* Update the result obtained by `Downloads`
* Update some comments
* Add a mempool debug_enable_at_height config
* Rename a field in the mempool crawler
* Propagate syncer channel errors through the crawler
We don't want to ignore these errors, because they might indicate a shutdown.
(Or a bug that we should fix.)
* Use debug_enable_at_height in the mempool crawler
* Log when the mempool is activated or deactivated
* Deny unknown fields and apply defaults for all configs
* Move Duration last, as required for TOML tables
* Add a basic mempool acceptance test
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* do not advertise rejected transactions
* do not broadcast transaction that are expired
* change dummy var name
* simplify code, performance
* clippy
* add some test coverage
* clippy
Co-authored-by: teor <teor@riseup.net>
* Split mempool config into its own module
Also:
- expand config docs
- clean up mempool imports
* Pass the mempool config to the mempool
* Create the transaction sender channel inside the mempool 1/2
This simplifies all the code that calls the mempool.
Also:
- update the mempool enabled state before returning the new mempool
- add some test module doc comments
* Refactor a setup function out of the mempool unit tests 2/2
Also:
- update the setup function to handle the latest mempool changes
* Clarify a comment
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Limit the size of rejection lists when there is a spend conflict
Previously, `insert` would return early with an error,
and skip limiting the rejection list sizes.
* Use prop_assert macros in proptests, rather than assert
* Add `HashSet`s to help spend conflict detection
Keep track of the spent transparent outpoints and the revealed
nullifiers.
Clippy complained that the `ActiveState` had variants with large size
differences, but that was expected, so I disabled that lint on that
`enum`.
* Clear the `HashSet`s when clearing the mempool
Clear them so that they remain consistent with the set of verified
transactions.
* Use `HashSet`s to check for spend conflicts
Store new outputs into its respective `HashSet`, and abort if a
duplicate output is found.
* Remove inserted outputs when aborting
Restore the `HashSet` to its previous state.
* Remove tracked outputs when removing a transaction
Keep the mempool storage in a consistent state when a transaction is
removed.
* Remove tracked outputs when evicting from mempool
Ensure eviction also keeps the tracked outputs consistent with the
verified transactions.
* Refactor to create a `VerifiedSet` helper type
Move the code to handle the output caches into the new type. Also move
the eviction code to make things a little simpler.
* Refactor to have a single `remove` method
Centralize the code that handles the removal of a transaction to avoid
mistakes.
* Move mempool size limiting back to `Storage`
Because the evicted transactions must be added to the rejected list.
* Remove leftover `dbg!` statement
Leftover from some temporary testing code.
Co-authored-by: teor <teor@riseup.net>
* Remove unnecessary `TODO`
It is more speculation than planning, so it doesn't add much value.
Co-authored-by: teor <teor@riseup.net>
* Fix typo in documentation
The verb should match the subject "transactions" which is plural.
Co-authored-by: teor <teor@riseup.net>
* Add a comment to warn about correctness
There's a subtle but important detail in the implementation that should
be made more visible to avoid mistakes in the future.
Co-authored-by: teor <teor@riseup.net>
* Remove outdated comment
Left-over from the attempt to move the eviction into the `VerifiedSet`.
* Improve comment explaining lint removal
Rewrite the comment explaining why the Clippy lint was ignored.
* Check for spend conflicts in `VerifiedSet`
Refactor to avoid API misuse.
* Test rejected transaction rollback
Using two transactions, perform the same test adding a conflict to both
of them to check if the second inserted transaction is properly
rejected. Then remove any conflicts from the second transaction and add
it again. That should work, because if it doesn't it means that when the
second transaction was rejected it left things it shouldn't in the
cache.
* Test removal of multiple transactions
When removing multiple transactions from the mempool storage, all of the
ones requested should be removed and any other transaction should be
still be there afterwards.
* Increase mempool size to 4, so that spend conflict tests work
If the mempool size is smaller than 4,
these tests don't fail on a trivial removal bug.
Because we need a minimum number of transactions in the mempool
to trigger the bug.
Also commit a proptest seed that fails on a trivial removal bug.
(This seed fails if we remove indexes in order,
because every index past the first removes the wrong transaction.)
* Summarise transaction data in proptest error output
* Summarise spend conflict field data in proptest error output
* Summarise multiple removal field data in proptest error output
And replace the very large proptest debug output with the new summary.
Co-authored-by: teor <teor@riseup.net>
* bradcast transactions to peers after they get inserted into mempool
* remove network argument from mempool init
* remove dbg left
* remove return value in mempool enable call
* rename channel sender and receiver vars
* change unwrap() to expect()
* change the channel to a hashset
* fix build
* fix tests
* rustfmt
* fix tiny space issue inside macro
Co-authored-by: teor <teor@riseup.net>
* check errors/panics in transaction gossip tests
* fix build of newly added tests
* Stop dropping the inbound service and mempool in a test
Keeping the mempool around avoids a transaction broadcast task error,
so we can test that there are no other errors in the task.
* Tweak variable names and add comments
* Avoid unexpected drops by returning a mempool guard in tests
* Use BoxError to simplify service types in tests
* Make all returned service types consistent in tests
We want to be able to change the setup without changing the tests.
Co-authored-by: teor <teor@riseup.net>
* Split mempool storage errors into tip-based and chain-based
* Expire tip rejections every time we get a new block
FailedVerification and SpendConflict rejections only apply to the current tip.
The next tip can provide missing inputs, or evict conflicting transactions.
* Enforce MAX_EVICTION_MEMORY_ENTRIES for mempool reject lists
* Implement a task that gossips verified block hashes
* Log an info message for block broadcasts
* Simplify the gossip task
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Re-use the old tip change if there is no new tip change
Also improve the comments.
* Add an assertion message
* Rename task join handles and futures in start method
* Add a dedicated BlockGossipError type
This type helps distinguish between syncer and state errors.
* Test that committed blocks are gossiped to peers
Also do a minor type cleanup on the existing test code,
replacing `Option<Vec<_>>` with `Vec<_>`.
* Formatting
* Remove excess newlines
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Clear the initial gossiped blocks during test setup
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Remove unused mempool storage errors
Preparation for ticket #2819.
Removing these errors means that we don't have to decide
which type of transaction ID match we want for them.
* Remove unused mempool errors, and deduplicate storage errors
* rustfmt
* Add a mempool transaction removal method for mined IDs
And use this method to remove expired transactions,
because all transactions with the same mined ID expire at the same height.
* Remove mined transaction IDs from the mempool
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Update versions for zebra v1.0.0-alpha.18 release
* WIP: Initial PR list
* Remove uninteresting version bumps from CHANGELOG
* Categorise and group PRs in CHANGELOG, removing uninteresting PRs
* Further refine and categorise changelog entries
* Fix tag url
* Final changes to CHANGELOG
* Add a changelog description
* Spacing
* Clarify and fix changelog PR descriptions
* Add PRs that are about to be merged
* More slight clarifications
* Spacing
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Rename type parameter to be more explicit
Replace the single letter with a proper name.
* Remove imports for `Request` and `Response`
The type names will conflict with the ones for the mempool service.
* Attach `Mempool` service to the `Crawler`
Add a field to the `Crawler` type to store a way to access the `Mempool`
service.
* Forward crawled transactions to downloader
The crawled transactions are now sent to the transaction downloader and
verifier, to be included in the mempool.
* Derive `Eq` and `PartialEq` for `mempool::Request`
Make it simpler to use the `MockService::expect_request` method.
* Test if crawled transactions are downloaded
Create some dummy crawled transactions, and let the crawler discover
them. Then check if they are forwarded to the mempool to be downloaded
and verified.
* Don't send empty transaction ID list to downloader
Ignore response from peers that don't provide any crawled transactions.
* Log errors when forwarding crawled transaction IDs
Calling the Mempool service should not fail, so if an error happens it
should be visible. However, errors when downloading individual
transactions can happen from time to time, so there's no need for them
to be very visible.
* Document existing `mempool::Crawler` test
Provide some depth as to what the test expect from the crawler's
behavior.
* Refactor to create `setup_crawler` helper function
Make it easier to reuse the common test setup code.
* Simplify code to expect requests
Now that `zebra_network::Request` implement `Eq`, the call can be
simplified into `expect_request`.
* Refactor to create `respond_with_transaction_ids`
A helper function that checks for a network crawl request and responds
with the given list of crawled transaction IDs.
* Refactor to create `crawler_iterator` helper
A function to intercept and respond to the fanned-out requests sent
during a single crawl iteration.
* Refactor to create `respond_to_queue_request`
Reduce the repeated code necessary to intercept and reply to a request
for queuing transactions to be downloaded.
* Add `respond_to_queue_request_with_error` helper
Intercepts a mempool request to queue transactions to be downloaded, and
responds with an error, simulating an internal problem in the mempool
service implementation.
* Derive `Arbitrary` for `NetworkUpgrade`
This is required for deriving `Arbitrary` for some error types.
* Derive `Arbitrary` for `TransactionError`
Allow random transaction errors to be generated for property tests.
* Derive `Arbitrary` for `MempoolError`
Allow random Mempool errors to be generated for property tests.
* Test if errors don't stop the mempool crawler
The crawler should be robust enough to continue operating even if the
mempool service fails to download transactions or even fails to handle
requests to enqueue transactions.
* Reduce the log level for download errors
They should happen regularly, so there's no need to have them with a
high visibility level.
Co-authored-by: teor <teor@riseup.net>
* Stop crawler if service stops
If `Mempool::poll_ready` returns an error, it's because the mempool
service has stopped and can't handle any requests, so the crawler should
stop as well.
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Move mempool tests into `tests::vector` sub-module
Make it consistent with other test modules and prepare for adding
property tests.
* Reorder imports
Make it consistent with the general guidelines followed on other
modules.
* Export `ChainTipBlock` and `ChainTipSender`
Allow these types to be used in other crates for testing purposes.
* Derive `Arbitrary` for `ChainTipBlock`
Make it easy to generate random `ChainTipBlock`s for usage in property
tests.
* Refactor to move test methods into `tests` module
Reduce the repeated test configuration attributes and make it easier to
see what is test specific and what is part of the general
implementation.
* Add a `Mempool::dummy_call` test helper method
Performs a dummy call just so that `poll_ready` gets called.
* Use `dummy_call` in existing tests
Replace the custom dummy requests with the helper method.
* Test if the mempool is cleared on chain reset
A chain reset should force the mempool storage to be cleared so that
transaction verification can restart using the new chain tip.
* Test if mempool is cleared on syncer restart
If the block synchronizer falls behind and then starts catching up
again, the mempool should be disabled and therefore the storage should
be cleared.
* Send mined transaction IDs to the download and verify task for cancellation
* Pass a HashSet of transaction hashes to be cancelled
* Add mempool_cancel_mined() test
* Fix starvation in test
* Fix typo in comment
* mempool - support transaction expiration
* use `LatestChainTip` instead of state call
* clippy
* remove spawn task
* remove non needed async from function
* remove return value
* add a `expiry_height_mut()` method to `Transaction` for testing purposes
* fix `remove_expired_transactions()`
* add a `mempool_transaction_expiration()` test
* tidy cleanup to `expiry_height()`
* improve docs
* fix the build
* try fix macos build
* extend tests
* add doc to function
* clippy
* fix build
* start tests at block two
* Cancel download and verify tasks when the mempool is deactivated
* Refactor enable/disable logic to use a state enum
* Add helper test functions to enable/disable the mempool
* Add documentation about errors on service calls
* Improvements from review
* Improve documentation
* Fix bug in test
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Add `Transaction::spent_outpoints` getter method
Returns an iterator over the UTXO `OutPoint`s spent by the transaction.
* Add `mempool::Error::Conflict` variant
An error representing that a transaction was rejected because it
conflicts with another transaction that's already in the mempool.
* Reject conflicting mempool transactions
Reject including a transaction in the mempool if it spends outputs
already spent by, or reveals nullifiers already revealed by another
transaction in the mempool.
* Fix typo in documentation
Remove the `r` that was incorrectly added.
Co-authored-by: teor <teor@riseup.net>
* Specify that the conflict is a spend conflict
Make the situation clearer, because there are other types of conflict.
Co-authored-by: teor <teor@riseup.net>
* Clarify that the outpoints are from inputs
Because otherwise it could lead to confusion because it could also mean
the outputs of the transaction represented as `OutPoint` references.
Co-authored-by: teor <teor@riseup.net>
* Create `storage::tests::vectors` module
Refactor to follow the convention used for other tests.
* Add an `AtLeastOne::first_mut` method
A getter to allow changing the first element.
* Add an `AtLeastOne::push` method
Allow appending elements to the collection.
* Derive `Arbitrary` for `FieldNotPresent`
This is just to make the code that generates arbitrary anchors a bit
simpler.
* Test if conflicting transactions are rejected
Generate two transactions (either V4 or V5) and insert a conflicting
spend, which can be either a transparent UTXO, or a nullifier for one of
the shielded pools. Check that any attempt to insert both transactions
causes one to be accepted and the other to be rejected.
* Delete a TODO comment that we decided not to do
Co-authored-by: teor <teor@riseup.net>
* Add tests for mempool Request::Queue
* Update test to work after refactoring
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Send `Response::Nil` instead of sending empty `Message`s
This matches `zcashd`'s behaviour more closely.
In most cases, the network layer filters these out already.
But this change makes the the inbound service code clearer.
* revert changes made to `AdvertiseTransactionIds` and `PushTransaction`
* remove newline
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* Update the expiry TODO
* Clear the mempool at a chain tip reset
* Clear the mempool by using a sync method (#2777)
* Clear the mempool by using a sync method
* Update docs
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* Refactor last_tip_change()
* Apply suggestions from code review
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Fix brackets
* Use best_tip_block instead of manual borrowing
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Remove return of redundant vector length
An attempt to improve readability a bit by not returning a tuple with a
value that can be obtained from a single return type.
* Refactor `unmined_transactions_in_blocks`
Use a more functional style to try to make it a bit clearer.
* Use ranges in `unmined_transactions_in_blocks`
Allow a finer control over the block range to extract the transactions
from.
* Refactor mempool test code to improve clarity
It was previously not clear that only the first genesis transaction was
being used. The remaining transactions in the genesis block were
discarded and then readded later.
* Replace `oneshot` with `call`
Remove a redundant internal `ready_and` call.
* Return an `impl Iterator` instead of a `Vec<_>`
Remove unnecessary deserializations and heap allocations.
* Refactor `mempool_storage_basic_for_network` test
Make the separation between the transactions expected to be in the
mempool and those expected to be rejected clearer.
* Replace `Iterator` with `DoubleEndedIterator`
Allow getting the last transaction more easily.
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Use `MockService` in inbound test
Refactor the `mempool_requsets_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.
* Use `MockService` in the basic mempool test
Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.
* Remove the `mock_peer_set` helper function
It is not used anymore, since the usages were replaced with
`MockService`s.
* add tests for mempool inbound requests
* Use MockService for transaction verifier
* Refactor creation of mock `peer_set`
Use the same style as the mock transaction verifier.
* Derive `Eq` for `zebra_network::Request`
Make it easy to use the `MockService::expect_request` method.
* Return mocked peer set service from `setup`
Allow it to be used to respond to requests.
* Add bindings for the transaction used for testing
Allow them to be moved into futures later.
* Respond to transaction download request
Make sure that the test transaction appears to the mempool as if it had
been downloaded by the peer set service.
* Assert that no unexpected requests were received
Check that the mempool doesn't send unexpected requests to the peer set
service.
* add tests for mempool inbound requests
* Use MockService for transaction verifier
* add missing `expect_no_requests` to `mempool_advertise_transaction_ids` test
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Use `MockService` in inbound test
Refactor the `mempool_requsets_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.
* Use `MockService` in the basic mempool test
Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.
* Remove the `mock_peer_set` helper function
It is not used anymore, since the usages were replaced with
`MockService`s.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Allow deliberate instances of the new nightly clippy::derivable_impls lint
We want our config defaults to be explicit.
Not so sure about the application defaults, but they also contain a config.
* Also allow unknown lint names
Stable doesn't know about this lint, but nightly does.
* Implement initial service mocking helpers
Adds a [`MockService`] type, which can be configured and built for usage
in unit tests or proptests. The mocked service can then be used to
intercept requests and respond indivdiually to them.
* Use `MockService in the `mempool::Crawler` test
Refactor it to remove the helper mock function, and use the new
`MockService` helper type.
* Use `MockService` in `CandidateSet` test vectors
Refactor to remove the manual mocking of the peer set service.
* Panic if a response is not sent by `MockService`
Change the current semantics to require all `MockService` usages to
respond to every intercepted request.
A `must_use` attribute was added to the `ResponseSender` so that the
compiler can warn when this doesn't happen.
* Allow generic error types in `MockService`
Replace the hard-coded `BoxError` as the `Service`'s error type with a
generic type parameter. This allows mocking services in locations that
require specific error types.
* Add a `ResponseSender::request` getter
Allow inspecting the request again before responding, and using
information from the request in the response.
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Pass sync_status to mempool
* Update zebrad/src/components/mempool.rs
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Remove enabled flag for now; will be handled in #2723
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Check if tx already exists in mempool or state before downloading
* Reorder checks
* Add rejected test; refactor into separate function
* Wrap mempool in buffered service
* Rename RejectedTransactionsById -> RejectedTransactionsIds
* Add RejectedTransactionIds response; fix request name
* Organize imports
* add a test for Storage::rejected_transactions
* add test for mempool `Request::RejectedTransactionIds`
* change buffer size to 1 in the test
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Using `&mut self` as the receiver in the method signatures allows Rust
to infer that the type is properly `Sync`, and therefore `Send`. This
allows removing the `Mutex` work-around.
* Decide if Zebra is at the chain tip
* Avoid division by zero
* Try increasing EVENT_TIMEOUT
* Increase MAX_TEST_EXECUTION
* Implement basic tests
* Resolve Clippy's erorrs
* change doc comments to normal
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
* reply to `Request::MempoolTransactionIds`
* remove boilerplate
* get storage from mempool with a method
* change panic message
* try fix for mac
* use normal init instead of init_tests for state service
* newline
* rustfmt
* fix test build
* Rename ChainTipReceiver to CurrentChainTip
`fastmod ChainTipReceiver CurrentChainTip zebra*`
* Update chain tip documentation and variable names
* Basic chain tip change implementation, without resets
Also includes the following name changes:
```
fastmod CurrentChainTip LatestChainTip zebra*
fastmod chain_tip_receiver latest_chain_tip zebra*
```
* Clarify the difference between `LatestChainTip` and `ChainTipChange`
* Store a `SyncStatus` handle in the `Crawler`
The helper type will make it easier to determine if the crawler is
enabled or not.
* Pause crawler if mempool is disabled
Implement waiting until the mempool becomes enabled, so that the crawler
does not run while the mempool is disabled.
If the `MempoolStatus` helper is unable to determine if the mempool is
enabled, stop the crawler task entirely.
* Update test to consider when crawler is paused
Change the mempool crawler test so that it's a proptest that tests
different chain sync. lengths. This leads to different scenarios with
the crawler pausing and resuming.
Co-authored-by: teor <teor@riseup.net>
* Create a `SyncStatus` helper type
Keeps track if the synchronizer is close to the chain tip or not.
* Refactor `ChainSync` ctor. to return `SyncStatus`
Change the constructor API so that it returns a higher level construct.
* Test if `SyncStatus` waits for the chain tip
Test if waiting for the chain tip to be reached correctly finishes when
the chain tip is reached. This is done by sending recent sync lengths to
the `SyncStatus` instance, and checking that every time a separate
`SyncStatus` instance determines it has reached the tip the original
instance wakes up.
* Add a temporary attribute to allow dead code
The code added isn't used yet, so we'll add a temporary waiver until
another PR is merged to use them.
This avoids peer set contention when most peers are busy.
Also exit the task if the peer service returns a readiness error,
because that means it's permanently unusable.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* First pass at a Mempool Service, incl. a storage layer underneath
* Fixed up Mempool service and storage
* allow dead code where needed
* clippy
* typo
* only drain if the mempool is full
* add a basic storage test
* remove space
* fix test for when MEMPOOL_SIZE change
* group some imports
* add a basic mempool service test
* add clippy suggestions
* remove not needed allow dead code
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: teor <teor@riseup.net>
* Rename BestTipHeight so it can be generalised to ChainTipSender
`fastmod BestTipHeight ChainTipSender zebra*`
For senders:
`fastmod best_tip_height chain_tip_sender zebra*`
For receivers:
`fastmod best_tip_height chain_tip_receiver zebra*`
* Rename best_tip_height module to chain_tip
* Wrap the chain tip watch channel in a ChainTipReceiver type
* Create a ChainTip trait to avoid tricky crate dependencies
And add convenience impls for optional and empty chain tips.
* Use the ChainTip trait in zebra-network
* Replace `Option<ChainTip>` with `NoChainTip`
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Return a transaction verifier from `zebra_consensus::init`
This verifier is temporarily created separately from the block verifier's
transaction verifier.
* Return the same transaction verifier used by the block verifier
* Clarify that the mempool verifier is the transaction verifier
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Create initial `mempool::Crawler` type
The mempool crawler is responsible for periodically asking peers for
transactions to insert into the local mempool. This initial
implementation will periodically ask for transactions, but won't do
anything with them yet.
Also, the crawler is currently configured to be always enabled, but this
should be fixed to avoid crawling while Zebra is still syncing the
chain.
* Add a timeout to peer responses
Prevent the crawler from getting stuck if there's communication with a
peer that takes too long to respond.
* Run the mempool crawler in Zebra
Spawn a task for the crawler when Zebra starts.
* Test if the crawler is sending requests
Create a mock for the `PeerSet` service to intercept requests and verify
that the transaction requests are sent periodically.
* Use `full` Tokio features when testing
Make it simpler to select the features for test builds.
Co-authored-by: teor <teor@riseup.net>
* Link to the issue for crawler activation
Make it easy to navigate from the `TODO` comment to the current project
planning.
Co-authored-by: teor <teor@riseup.net>
* Link to the issue for downloading transactions
Make it easy to navigate from the `TODO` comment to the current project
planning.
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Minimal recent sync lengths implementation
Also includes metrics and logging, to make diagnosing bugs easier.
* Add logging to check what happens when Zebra reaches the chain tip
* Add tests for recent sync lengths
- initially empty
- pruned to correct length
- newest entries go first
* Drop a redundant `/` from a Cargo.lock URL
This seems to be a nightly or beta Rust change,
but hopefully stable just accepts it.
* Use metrics histograms to avoid overwriting values
* Add detailed syncer monitoring dashboard
* Increase the recent sync length to 4
This length makes it easier to distinguish between temporary and
sustained errors/syncs.
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Rename internal network requests for wide transaction IDs
fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*
* Update network transaction request/response comments
* Rename a transaction hash method for wide transaction IDs
fastmod transaction_hashes transaction_ids zebra-network
* Add UnminedTxId methods and conversions for InventoryHash
* Map WtxIds to unmined transaction network messages
Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.
* Enable WtxId mempool inventory tracking for peers
* Further clarify transaction IDs
* Use Witnessed rather than Wide for transaction IDs
And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.
* Rename a missed binding
* Remove an incorrectly named binding
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Simplify state service initialization in test
Use the test helper function to remove redundant code.
* Create `BestTipHeight` helper type
This type abstracts away the calculation of the best tip height based on
the finalized block height and the best non-finalized chain's tip.
* Add `best_tip_height` field to `StateService`
The receiver endpoint is currently ignored.
* Return receiver endpoint from service constructor
Make it available so that the best tip height can be watched.
* Update finalized height after finalizing blocks
After blocks from the queue are finalized and committed to disk, update
the finalized block height.
* Update best non-finalized height after validation
Update the value of the best non-finalized chain tip block height after
a new block is committed to the non-finalized state.
* Update finalized height after loading from disk
When `FinalizedState` is first created, it loads the state from
persistent storage, and the finalized tip height is updated. Therefore,
the `best_tip_height` must be notified of the initial value.
* Update the finalized height on checkpoint commit
When a checkpointed block is commited, it bypasses the non-finalized
state, so there's an extra place where the finalized height has to be
updated.
* Add `best_tip_height` to `Handshake` service
It can be configured using the `Builder::with_best_tip_height`. It's
currently not used, but it will be used to determine if a connection to
a remote peer should be rejected or not based on that peer's protocol
version.
* Require best tip height to init. `zebra_network`
Without it the handshake service can't properly enforce the minimum
network protocol version from peers. Zebrad obtains the best tip height
endpoint from `zebra_state`, and the test vectors simply use a dummy
endpoint that's fixed at the genesis height.
* Pass `best_tip_height` to proto. ver. negotiation
The protocol version negotiation code will reject connections to peers
if they are using an old protocol version. An old version is determined
based on the current known best chain tip height.
* Handle an optional height in `Version`
Fallback to the genesis height in `None` is specified.
* Reject connections to peers on old proto. versions
Avoid connecting to peers that are on protocol versions that don't
recognize a network update.
* Document why peers on old versions are rejected
Describe why it's a security issue above the check.
* Test if `BestTipHeight` starts with `None`
Check if initially there is no best tip height.
* Test if best tip height is max. of latest values
After applying a list of random updates where each one either sets the
finalized height or the non-finalized height, check that the best tip
height is the maximum of the most recently set finalized height and the
most recently set non-finalized height.
* Add `queue_and_commit_finalized` method
A small refactor to make testing easier. The handling of requests for
committing non-finalized and finalized blocks is now more consistent.
* Add `assert_block_can_be_validated` helper
Refactor to move into a separate method some assertions that are done
before a block is validated. This is to allow moving these assertions
more easily to simplify testing.
* Remove redundant PoW block assertion
It's also checked in
`zebra_state::service::check::block_is_contextually_valid`, and it was
getting in the way of tests that received a gossiped block before
finalizing enough blocks.
* Create a test strategy for test vector chain
Splits a chain loaded from the test vectors in two parts, containing the
blocks to finalize and the blocks to keep in the non-finalized state.
* Test committing blocks update best tip height
Create a mock blockchain state, with a chain of finalized blocks and a
chain of non-finalized blocks. Commit all the blocks appropriately, and
verify that the best tip height is updated.
Co-authored-by: teor <teor@riseup.net>
* Use the block verifier and non-finalized state in the cached state tests
This substantially increases test coverage.
Previously, the cached state tests were configured with
`checkpoint_sync = true`, which only uses the checkpoint
verifier and the finalized state.
* Log the source of blocks in commit_finalized_direct
This lets us check that we're actually testing the non-finalized state
and block verifier in the cached state tests.
It also improves diagnostics for state errors.
* Fail cached state tests if they're using incorrect heights or configs
This makes sure that the cached state tests actually test the transition
from checkpoint to block verification, and the non-finalized state.
* Update versions for zebra v1.0.0-alpha.12 release
* Update Cargo.lock
* Update release checklist with latest version changes to help keep track for future releases
* Remove reference to the fact that tower-fallback was not updated
* add legacy chain check and tests
* improve has_network_upgrade check
* add docs to legacy_chain_check()
* change arbitrary module structure
* change the panic message
* move legacy chain acceptance into existing tests
* use a reduced_branch_id_strategy()
* add docs to strategy function
* add argument to check for legacy chain into sync_until()
* Disable IPv6 tests when $ZEBRA_SKIP_IPV6_TESTS is set
This allows users to disable IPv6 tests in environments where IPv6 is not
configured.
* Add network test env var constants
* Replace env strings with constants
fastmod '"ZEBRA_SKIP_NETWORK_TESTS"' zebra_test::net::ZEBRA_SKIP_NETWORK_TESTS
fastmod '"ZEBRA_SKIP_IPV6_TESTS"' zebra_test::net::ZEBRA_SKIP_IPV6_TESTS
* Add functions to skip network tests
* Replace test network env var checks with test function
fastmod --fixed-strings 'env::var_os(zebra_test::net::ZEBRA_SKIP_NETWORK_TESTS).is_some()' 'zebra_test::net::zebra_skip_network_tests()'
fastmod --fixed-strings 'env::var_os(zebra_test::net::ZEBRA_SKIP_IPV6_TESTS).is_some()' 'zebra_test::net::zebra_skip_ipv6_tests()'
* Remove redundant logging and use statements
* Gossip dynamically allocated listener ports to peers
Previously, Zebra would either gossip port `0`, which is invalid, or skip
gossiping its own dynamically allocated listener port.
* Improve "no configured peers" warning
And downgrade from error to warning, because inbound-only nodes are a
valid use case.
* Move random_known_port to zebra-test
* Add tests for dynamic local listener ports and the AddressBook
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Always send our local listener with the latest time
Previously, whenever there was an inbound request for peers, we would
clone the address book and update it with the local listener.
This had two impacts:
- the listener could conflict with an existing entry,
rather than unconditionally replacing it, and
- the listener was briefly included in the address book metrics.
As a side-effect, this change also makes sanitization slightly faster,
because it avoids some useless peer filtering and sorting.
* Skip listeners that are not valid for outbound connections
* Filter sanitized addresses Zebra based on address state
This fix correctly prevents Zebra gossiping client addresses to peers,
but still keeps the client in the address book to avoid reconnections.
* Add a full set of DateTime32 and Duration32 calculation methods
* Refactor sanitize to use the new DateTime32/Duration32 methods
* Security: Use canonical SocketAddrs to avoid duplicate connections
If we allow multiple variants for each peer address, we can make multiple
connections to that peer.
Also make sure sanitized MetaAddrs are valid for outbound connections.
* Test that address books contain the local listener address
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
- Add a custom semver match for `zebrad` versions
- Prefer "line contains string" matches, so tests ignore minor changes
- Escape regex meta-characters when a literal string match is intended
- Rename test functions so they are more precise
- Rewrite match internals to remove duplicate code and enable custom matches
- Document match functions
* implement and test a rate limit in `request_genesis()`
* add `request_genesis_is_rate_limited` test to sync
* add ensure_timeouts constraint for GENESIS_TIMEOUT_RETRY
* Suppress expected warning logs in zebrad tests
Co-authored-by: teor <teor@riseup.net>
* Standardise lints across Zebra crates, and add missing docs
The only remaining module with missing docs is `zebra_test::command`
* Todo -> TODO
* Clarify what a transcript ErrorChecker does
Also change `Error` -> `BoxError`
* TransError -> ExpectedTranscriptError
* Output Descriptions -> Output descriptions
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
* Use the git version + new commit count + hash for the app version
This helps diagnose bugs in versions of Zebra built from git branches,
rather than git version tags.
* Fill in assert
* Also log semver string
* Fix syntax
* Handle vergen using the cargo package version or raw git tag
* s/Semver/SemVer/
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Enable builds where:
* there is no google cloud git commit env var, and
* there is no `.git` directory.
By making all `vergen` env vars optional, and skipping any env vars that
don't exist.
* build(deps): bump vergen from 3.2.0 to 5.1.1
* fix hardcoded version for Tracing struct
* add additional metadata
* remove extra allocations for metadata
* Remove zebrad code version from release checklist
The zebrad code automatically uses the crate version now.
* Sort panic metadata into rough categories
Co-authored-by: teor <teor@riseup.net>
Zebra's latest alpha checkpoints on Canopy activation, continues our work on NU5, and fixes a security issue.
Some notable changes include:
## Added
- Log address book metrics when PeerSet or CandidateSet don't have many peers (#1906)
- Document test coverage workflow (#1919)
- Add a final job to CI, so we can easily require all the CI jobs to pass (#1927)
## Changed
- Zebra has moved its mandatory checkpoint from Sapling to Canopy (#1898, #1926)
- This is a breaking change for users that depend on the exact height of the mandatory checkpoint.
## Fixed
- tower-batch: wake waiting workers on close to avoid hangs (#1908)
- Assert that pre-Canopy blocks use checkpointing (#1909)
- Fix CI disk space usage by disabling incremental compilation in coverage builds (#1923)
## Security
- Stop relying on unchecked length fields when preallocating vectors (#1925)
`node2.is_running()` can return `true` on Windows, even though `node2`
has logged a panic. This cleanup code only runs if `node2` fails to panic
and exit as expected. So it's ok for us to skip it.
See #1781 for details.
On Windows, if a process is killed after it is dead, it returns `true`
for `was_killed`. Instead, check if the process is running before killing
it.
Also make the section where processes are running as short as possible,
and include context for both processes in every error.
Log a "Trying..." message before each listener opens, to see if the
delay is inside Zebra, or in the test harness or OS.
Also report the configured and actual ports where possible, for better
diagnostics.
* Increase the conflict acceptance test launch delay
Also rename the tests - the listener is for the Zcash protocol,
but the state, metrics, and tracing are Zebra-specific.
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
This change encodes a bunch of invariants in the type system,
and adds explicit failure states for:
* a closed oneshot,
* bugs in the initialization code.
Use `ServiceExt::oneshot` to perform state requests.
Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflics
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests
* Add a full set of stderr acceptance test functions
This change makes the stdout and stderr acceptance test interfaces
identical.
* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path
* Use Display for state path logs
Avoids weird escaping on Windows when using Debug
* Add Windows conflict error messages
* Turn PORT_IN_USE_ERROR into a regex
And add another alternative Windows-specific port error
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
* Bump versions where appropriate
Tested with cargo install --locked --path etc
* Remove fixed panics from 'Known Issues'
* Change to alpha release series in the README
Co-authored-by: teor <teor@riseup.net>
The clippy unknown lints attribute was deprecated in
nightly in rust-lang/rust#80524. The old lint name now produces a
warning.
Since we're using `allow(unknown_lints)` to suppress warnings, we need to
add the canonical name, so we can continue to build without warnings on
nightly.
But we also need to keep the old name, so we can continue to build
without warnings on stable.
And therefore, we also need to disable the "removed lints" warning,
otherwise we'll get warnings about the old name on nightly.
We'll need to keep this transitional clippy config until rustc 1.51 is
stable.
* Stop failing acceptance tests if their directories already exist
* Add an immutable config writing helper
and use it in the cached sapling acceptance tests.
Also:
* consistently create missing config and state directories
* refactor the common config writing code into a separate function
* only ignore NotFound errors in replace_config
* enforce config immutability using the type system
This timeout stops the sync service hanging when it is missing required
blocks, but the lookahead queue is full of dependent verify tasks, so the
missing blocks never get downloaded.
Check misconfigured ephemeral doesn't create a state dir
Add extra misconfigured `zebrad` ephemeral mode checks:
* doesn't create a state directory
* doesn't create unexpected files or directories in the working directory
Check ephemeral doesn't delete an existing state directory
Refactor all the ephemeral configs and checks into a single test
function.
Also:
* cleanup acceptance tests using utility functions
* make some checks consistent between tests
* make error messages consistent
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* Add the configured network to error reports
* Log the configured network at error level
* Create the global span immediately after activating tracing
And leak the span guard, so the span is always active.
* Include panic metadata in the report and URL
* Use `Main` and `Test` in the global span
`net=Mainnet` is a bit redundant
When `cargo run` is run in the workspace directory, it can see two
executables:
- `zebrad`
- `zebra_checkpoints`
Adding `default-run = "zebrad"` to `zebrad/Cargo.toml` makes the
workspace run `zebrad` by default. (Even though it's redundant for the
`zebrad` crate itself.)
* Rewrite GetData handling to match the zcashd implementation
`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)
This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.
Also change Zebra's GetData responses to peer request so they ignore
missing blocks.
* Stop hanging on incomplete transaction or block responses
Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
that were successfully received
2. if none of the expected blocks or transactions were received, return
an error, and close the connection
In our README, we tell users to ignore these errors, so we should also
disable the issue URL.
Also include the hash in the error. (We don't want the span active for
all messages, we just want the hash in the error.)
This change avoids errors when tests are cancelled and re-run within a
short period of time, for example, using `cargo watch`.
It introduces a slight risk of port conflicts between the endpoint tests,
and with (ephemeral) ports used by other services. The risk of conflicts
across 2 tests is very low, and tests should be run in an isolated
environment on busy servers.
vergen's implementation of REBUILD_ON_HEAD_CHANGE assumes that the .git
directory is in the crate root, but Zebra uses a workspace.
Temporary fix for rustyhorde/vergen#21.
Because the new version of the prometheus exporter launches its own
single-threaded runtime on a dedicated worker thread, there's no need
for the tokio and hyper versions it uses internally to align with the
versions used in other crates. So we don't need to use our fork with
tokio 0.3, and can just use the published alpha. Advancing to a later
alpha may fix the missing-metrics issues.
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.
To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.
Co-authored-by: teor <teor@riseup.net>
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions
Add a `max_len` argument to support `FindHeaders` requests.
Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.
* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
This provides useful and not too noisy output at INFO level. We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
The metrics code becomes much simpler because the current version of the
metrics crate builds its own single-threaded runtime on a dedicated worker
thread, so no dependency on the main Zebra Tokio runtime is required.
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware. This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method. See
ddc64e8d4d
for more context on the Tower changes. To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
## Motivation
Prior to this PR we've been using `sled` as our database for storing persistent chain data on the disk between boots. We picked sled over rocksdb to minimize our c++ dependencies despite it being a less mature codebase. The theory was if it worked well enough we'd prefer to have a pure rust codebase, but if we ever ran into problems we knew we could easily swap it out with rocksdb.
Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, the performance for writes was pretty poor, causing us to become bottle-necked on sled instead of the network.
## Solution
This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain.
The code in this pull request has:
- [x] Documentation Comments
- [x] Unit Tests and Property Tests
## Review
@hdevalence
This helps prevent overloading the network with too many concurrent
block requests. On a fast network, we're likely to still have enough
room to saturate our bandwidth. In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads. If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s. In practice, this
minimum speed will likely be much lower.
This reverts commit 656bd24ba7.
The Hedge middleware keeps a pair of histograms, writing into one in the
current time interval and reading from the previous time interval's
data. This means that the reverted change resulted in doubling all
block downloads until after at least the second measurement interval
(which means that the time measurements are also incorrect, as they're
operating under double the network load...)
* Use the default memory limit in the acceptance tests
PR #1233 changed the default `memory_cache_bytes`, but left the
acceptance tests with their old value.
Sets the default value to the previous lookahead limit. My testing on
mainnet suggested that the newly lower value (changed when the
checkpoint frequency was decreased) is low enough to cause stalls, even
when using hedged requests.
Remove the minimum data points from the syncer hedge configuragtion.
When there are no data points, hedge sends the second request
immediately.
Where there are less than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.
This change should improve genesis and post-restart sync latency.
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests
And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
The original sync algorithm split the sync process into two phases, one
that obtained prospective chain tips, and another that attempted to
extend those chain tips as far as possible until encountering an error
(at which point the prospective state is discarded and the process
restarts).
Because a previous implementation of this algorithm didn't properly
enforce linkage between segments of the chain while extending tips,
sometimes it would get confused and fail to discard responses that did
not extend a tip. To mitigate this, a check against the state was
added. However, this check can cause stalls while checkpointing,
because when a checkpoint is reached we may suddenly need to commit
thousands of blocks to the state. Because the sync algorithm now has a
a `CheckedTip` structure that ensures that a new segment of hashes
actually extends an existing one, we don't need to check against the
state while extending a tip, because we don't get confused while
interpreting responses.
This change results in significantly smoother progress on mainnet.
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
The timeout behavior in zebra-network is an implementation detail, not a
feature of the public API. So it shouldn't be mentioned in the doc
comments -- if we want timeout behavior, we have to layer it ourselves.
Using the cancel_handles, we can deduplicate requests. This is
important to do, because otherwise when we insert the second cancel
handle, we'd drop the first one, cancelling an existing task for no
reason.
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
Try to use the better cancellation logic to revert to previous sync
algorithm. As designed, the sync algorithm is supposed to proceed by
downloading state prospectively and handle errors by flushing the
pipeline and starting over. This hasn't worked well, because we didn't
previously cancel tasks properly. Now that we can, try to use something
in the spirit of the original sync algorithm.
This makes two changes relative to the existing download code:
1. It uses a oneshot to attempt to cancel the download task after it
has started;
2. It encapsulates the download creation and cancellation logic into a
Downloads struct.
* Reverse displayed endianness of transaction and block hashes
* fix zebra-checkpoints utility for new hash order
* Stop using "zebrad revhex" in zebrad-hash-lookup
* Rebuild checkpoint lists in new hash order
This change also adds additional checkpoints to the end of each list.
* Replace TransactionHash with transaction::Hash
This change should have been made in #905, but we missed Debug impls
and some docs.
Co-authored-by: Ramana Venkata <vramana@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
This reduces the API surface to the minimum required for functionality,
and cleans up module documentation. The stub mempool module is deleted
entirely, since it will need to be redone later anyways.
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names
The Inbound service only needs the network setup for some requests, but
it can service other requests without it. Making it return
Poll::Pending until the network setup finishes means that initial
network connections may view the Inbound service as overloaded and
attempt to load-shed.