This helps prevent overloading the network with too many concurrent
block requests. On a fast network, we're likely to still have enough
room to saturate our bandwidth. In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads. If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s. In practice, this
minimum speed will likely be much lower.
This reverts commit 656bd24ba7.
The Hedge middleware keeps a pair of histograms, writing into one in the
current time interval and reading from the previous time interval's
data. This means that the reverted change resulted in doubling all
block downloads until after at least the second measurement interval
(which means that the time measurements are also incorrect, as they're
operating under double the network load...)
* Use the default memory limit in the acceptance tests
PR #1233 changed the default `memory_cache_bytes`, but left the
acceptance tests with their old value.
Sets the default value to the previous lookahead limit. My testing on
mainnet suggested that the newly lower value (changed when the
checkpoint frequency was decreased) is low enough to cause stalls, even
when using hedged requests.
Remove the minimum data points from the syncer hedge configuragtion.
When there are no data points, hedge sends the second request
immediately.
Where there are less than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.
This change should improve genesis and post-restart sync latency.
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests
And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
The original sync algorithm split the sync process into two phases, one
that obtained prospective chain tips, and another that attempted to
extend those chain tips as far as possible until encountering an error
(at which point the prospective state is discarded and the process
restarts).
Because a previous implementation of this algorithm didn't properly
enforce linkage between segments of the chain while extending tips,
sometimes it would get confused and fail to discard responses that did
not extend a tip. To mitigate this, a check against the state was
added. However, this check can cause stalls while checkpointing,
because when a checkpoint is reached we may suddenly need to commit
thousands of blocks to the state. Because the sync algorithm now has a
a `CheckedTip` structure that ensures that a new segment of hashes
actually extends an existing one, we don't need to check against the
state while extending a tip, because we don't get confused while
interpreting responses.
This change results in significantly smoother progress on mainnet.
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
The timeout behavior in zebra-network is an implementation detail, not a
feature of the public API. So it shouldn't be mentioned in the doc
comments -- if we want timeout behavior, we have to layer it ourselves.
Using the cancel_handles, we can deduplicate requests. This is
important to do, because otherwise when we insert the second cancel
handle, we'd drop the first one, cancelling an existing task for no
reason.
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
Try to use the better cancellation logic to revert to previous sync
algorithm. As designed, the sync algorithm is supposed to proceed by
downloading state prospectively and handle errors by flushing the
pipeline and starting over. This hasn't worked well, because we didn't
previously cancel tasks properly. Now that we can, try to use something
in the spirit of the original sync algorithm.
This makes two changes relative to the existing download code:
1. It uses a oneshot to attempt to cancel the download task after it
has started;
2. It encapsulates the download creation and cancellation logic into a
Downloads struct.
* Reverse displayed endianness of transaction and block hashes
* fix zebra-checkpoints utility for new hash order
* Stop using "zebrad revhex" in zebrad-hash-lookup
* Rebuild checkpoint lists in new hash order
This change also adds additional checkpoints to the end of each list.
* Replace TransactionHash with transaction::Hash
This change should have been made in #905, but we missed Debug impls
and some docs.
Co-authored-by: Ramana Venkata <vramana@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
This reduces the API surface to the minimum required for functionality,
and cleans up module documentation. The stub mempool module is deleted
entirely, since it will need to be redone later anyways.
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names
The Inbound service only needs the network setup for some requests, but
it can service other requests without it. Making it return
Poll::Pending until the network setup finishes means that initial
network connections may view the Inbound service as overloaded and
attempt to load-shed.
The original version of this commit ran into
https://github.com/rust-lang/rust/issues/64552
again. Thanks to @yaahc for suggesting a workaround (using futures combinators
to avoid writing an async block).
Remove the seed command entirely, and make the behavior it provided
(responding to `Request::Peers`) part of the ordinary functioning of the
start command.
The new `Inbound` service should be expanded to handle all request
types.
* add test for first checkpoint sync
Prior this this change we've not had any tests that verify our sync /
network logic is well behaved. This PR cleans up the test helper code to
make error reports more consistent and uses this cleaned up API to
implement a checkpoint sync test which runs zebrad until it reads the
first checkpoint event from stdout.
Co-authored-by: teor <teor@riseup.net>
* move include out of unix cfg
Co-authored-by: teor <teor@riseup.net>
* increase the EWMA default and decay
* increase the block download retries
* increase the request and block download timeouts
* increase the sync timeout
* add metrics and tracking endpoint tests
* test endpoints more
* add change filter test for tracing
* add await to post
* separate metrics and tracing tests
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Ignore sync errors when the block is already verified
If we get an error for a block that is already in our state, we don't
need to restart the sync. It was probably a duplicate download.
Also:
Process any ready tasks before reset, so the logs and metrics are
up to date. (But ignore the errors, because we're about to reset.)
Improve sync logging and metrics during the download and verify task.
* Remove duplicate hashes in logs
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* Log the sync hash span at warn level
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* Remove in-memory state service
* make the config compatible with toml again
* checkpoint commit to see how much I still have to revert
* back to the starting point...
* remove unused dependency
* reorganize error handling a bit
* need to make a new color-eyre release now
* reorder again because I have problems
* remove unnecessary helpers
* revert changes to config loading
* add back missing space
* Switch to released color-eyre version
* add back missing newline again...
* improve error message on unix when terminated by signal
* add context to last few asserts in acceptance tests
* instrument some of the helpers
* remove accidental extra space
* try to make this compile on windows
* reorg platform specific code
* hide on_disk module and fix broken link
* add CheckpointList::new_up_to(limit: NetworkUpgrade)
* if checkpoint_sync is false, limit checkpoints to Sapling
* update tests for CheckpointList and chain::init
This is the first in a sequence of changes that change the block:: items
to not include Block as a prefix in their name, in accordance with the
Rust API guidelines.
* add valid generated config test
* change to pathbuf
* use -c to make sure we are using the generated file
* add and use a ZebraTestDir type
* change approach to generate tempdir in top of each test
* pass tempdir to test_cmd and set current dir to it
* add and use a `generated_config_path` variable in tests
* checkpoint: reject older of duplicate verification requests.
If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled. Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.
* sync: add a timeout layer to block requests.
Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.
* sync: restart syncing on error
Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.
* sync: additional debug info
* sync: handle lookahead limit correctly.
Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped. This meant that completed tasks could be left until the limit was
exceeded again. Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.
* network: add debug instrumentation to retry policy
* sync: instrument the spawned task
* sync: streamline ObtainTips/ExtendTips logic & tracing
This change does three things:
1. It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method. This means that when debugging we only have one
deduplication algorithm to focus on.
2. It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.
3. It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip"). The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.
* sync: hack to work around zcashd behavior
* sync: localize debug statement in extend_tips
* sync: change algorithm to define tips as pairs of hashes.
This is different enough from the existing description that its comments no
longer apply, so I removed them. A further chunk of work is to change the sync
RFC to document this algorithm.
* sync: reduce block timeout
* state: add resource limits for sled
Closes#888
* sync: add a restart timeout constant
* sync: de-pub constants
* network: move gossiped peer selection logic into address book.
* network: return BoxService from init.
* zebrad: add note on why we truncate thegossiped peer list
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* Remove unused .rustfmt.toml
Many of these options are never actually loaded by our CI because of a channel
mismatch, where they're not applied on stable but only on nightly (see the logs
from a rustfmt job). This means that we can get different settings when
running `cargo fmt` on the nightly and stable channels, which was causing a CI
failure on this PR. Reverting back to the default rustfmt settings avoids this
problem and keeps us in line with upstream rustfmt. There's no loss to us
since we were using the defaults anyways.
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* fix: Handle known ObtainTips correctly
enumerate never returns a value beyond the end of the vector.
* fix: Ignore known tips in ExtendTips
Some peers send us known tips when we try to extend.
* fix: Ignore known hashes when downloading
Despite all our other checks, we still end up downloading some hashes
multiple times.
* fix: Increase the number of retries
The old sync code relied on duplicate block fetches to make progress,
but the last few commits have removed some of those duplicates.
Instead, just retry the fetches that fail.
* fix: Tweak comments
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* fix: Cleanup the state_contains interface in Sync
* Fix brackets
Oops
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* make sure no info is printed in non server tests
* check exact full output for validity instead of log msgs
* add end of output character to version regex
* use coercions, use equality operator
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
The old sync code relied on duplicate block fetches to make progress,
but the last few commits have removed some of those duplicates.
Instead, just retry the fetches that fail.
We have a counter for pending "download and verify" futures. But these
futures are spawned, so they can complete in any order. They can also
complete before we receive their results.
* Split tracing component code into modules.
* Repatriate Tracing and simplify config handling.
We upstreamed our Tracing component, expecting not to have to exert fine
control over the tracing settings. But this turned out not to be the case, and
now that we want to do other things (flamegraphs, journalctl, opentelemetry,
etc), we end up with really awkward code (as in the current flamegraph
handling).
This also makes use of the changes to `init()` to load the config early to pass
configuration data into the components, which avoids the need for the
refactoring in #775.
Finally, we restore support for the `-v` flag when the filter is unset. Closes#831.
* Disable tracing and metrics endpoints by default.
Closes#660.
* Switch back to upstream Abscissa.
* Integrate flamegraph support into the new Tracing component.
* Pass -v in acceptance tests to get info-level output.
* Clean up acceptance test code.
* Setup tracing-flame for use profiling zebrad
* start work on conditional flamegraph generation
* review time!
* update comments
* Update Cargo.toml
* disable default features for inferno
* reorganize
* missing one trait
* Apply suggestions from code review
* graceful shutdown!
* remove special case handling on ctrlc for cleanup
* rename signal fn to better represent its responsibility
* remove unused global hook for flushing flamegraph
* move tracing logic to the right file
* just copy linkerd's signal handling logic
* update book
* make zebrad app drop on shutdown normally
* Update zebrad/src/components/tokio.rs
Co-authored-by: teor <teor@riseup.net>
* Update zebrad/src/application.rs
Co-authored-by: teor <teor@riseup.net>
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* cleanup a little
* ooh yea there's an API for that
* setup env-filter for backup subscriber
* document env filter
* document return codes
* forgot to save
* Update book/src/applications/zebrad.md
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: teor <teor@riseup.net>
* Load tracing filter only from config and simplify logic.
* Configure the state storage in the config, not an environment variable.
This also changes the config so that the path is always set rather than being
optional, because Zebra always needs a place to store its config.
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case
Co-authored-by: Jane Lusby <jane@zfnd.org>
We get the injected TokioComponent dependency before the config is
loaded, so we can't use it to open the endpoints.
And we can't define after_config, because we use derive(Component).
So we work around these issues by opening the endpoints manually,
from the application's after_config.
* make zebra-checkpoints
* fix LOOKAHEAD_LIMIT scope
* add a default cli path
* change doc usage text
* add tracing
* move MAX_CHECKPOINT_HEIGHT_GAP to zebra-consensus
* do byte_reverse_hex in a map
This seems like a better design on principle but also appears to give a much
nicer sawtooth pattern of queued blocks in the checkpointer and a much smoother
pattern of block requests.
We had a brief discussion on discord and it seemed like we had consensus on the
following versioning policy:
* zebrad: match major version to NU version, so we will start by releasing
zebrad 3.0.0;
* zebra-* libraries: start by matching zebrad's version, then increment major
versions of each library as we need to make breaking changes (potentially
faster than the zebrad version, always respecting semver but making no
guarantees about the longevity of major releases).
This commit sets all of the crate versions to 3.0.0-alpha.0 -- the -alpha.0
marks it as a prerelease not subject to perfect adherence to compatibility
guarantees.
Closes#697.
per https://github.com/ZcashFoundation/zebra/issues/697#issuecomment-662742971
The response to a getblocks message is an inv message with the hashes of the
following blocks. However, inv messages are also sent unsolicited to gossip new
blocks across the network. Normally, this wouldn't be a problem, because for
every other request we filter only for the messages that are relevant to us.
But because the response to a getblocks message is an inv, the network layer
doesn't (and can't) distinguish between the response inv and the unsolicited
inv.
But there is a mitigation we can do. In our sync algorithm we have two phases:
(1) "ObtainTips" to get a set of tips to chase down, (2) repeatedly call
"ExtendTips" to extend those as far as possible. The unsolicited inv messages
have length 1, but when extending tips we expect to get more than one hash. So
we could reject responses in ExtendTips that have length 1 in order to ignore
these messages. This way we automatically ignore gossip messages during initial
block sync (while we're extending a tip) but we don't ignore length-1 responses
while trying to obtain tips (while querying the network for new tips).
Closes#617.
Closes#698.
The remaining work on the syncer is alluded to in a new comment:
1. Correctly constructing a block locator object
2. Detecting when we've stopped making progress syncing and restarting obtain_tips.