Commit Graph

316 Commits

Author SHA1 Message Date
Henry de Valence e8c16b172f zebrad: pass TracingSection to Tracing component 2020-11-30 15:25:50 -08:00
Alfredo Garcia 4544463059
Inbound `FindBlocks` and `FindHeaders` (#1347)
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions

Add a `max_len` argument to support `FindHeaders` requests.

Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.

* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:30:37 +10:00
Henry de Valence fa02b266ca clippy 2020-11-25 10:55:44 -08:00
Henry de Valence de8415dcb1 tidy spans 2020-11-25 10:55:44 -08:00
Henry de Valence 05837797b1 tidy imports 2020-11-25 10:55:44 -08:00
Henry de Valence 77bf327b07 fix errors (2) 2020-11-25 10:55:44 -08:00
Henry de Valence 527f4d39ed fix errors 2020-11-25 10:55:44 -08:00
Henry de Valence e645e3bf0c remove async 2020-11-25 10:55:44 -08:00
Henry de Valence 6569977549 test compile change 2020-11-25 10:55:44 -08:00
Alfredo Garcia 486e55104a create Downloads for Inbound 2020-11-25 10:55:44 -08:00
Deirdre Connolly 6a0a6f6d37 allow(dead_code) not allow(clippy::dead_code) 2020-11-24 11:04:30 -05:00
Deirdre Connolly 4a67e0e7bb Enable stateful/long sync tests by features, mount rocksdb-based state at Sapling activation for sync_past_sapling_mainnet test 2020-11-24 11:04:30 -05:00
Deirdre Connolly d813603bac Remove defunct memory_cache_bytes from test config 2020-11-24 11:04:30 -05:00
Jane Lusby c2a57d7e49 slight comment tweek 2020-11-24 11:04:30 -05:00
Jane Lusby 99c5acc94f rename test fn 2020-11-24 11:04:30 -05:00
Jane Lusby 602d8c4898 document tests 2020-11-24 11:04:30 -05:00
Jane Lusby 17fdbe941b fix stdout issue with test framework for cached data tests 2020-11-24 11:04:30 -05:00
Jane Lusby 0f51891359 revert unnecessary change in sync_until 2020-11-24 11:04:30 -05:00
Jane Lusby 4bfe747f34 update acceptance tests 2020-11-24 11:04:30 -05:00
Jane Lusby d093b4e528 Add network integration test for quick post sapling sync testing 2020-11-24 11:04:30 -05:00
dependabot[bot] a4af90c2b0 build(deps): bump color-eyre from 0.5.7 to 0.5.8
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.7 to 0.5.8.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Changelog](https://github.com/yaahc/color-eyre/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.7...v0.5.8)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-24 09:59:22 -05:00
Henry de Valence 2a4a89c002 state,zebrad: tidy span levels for good INFO output
This provides useful and not too noisy output at INFO level.  We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
2020-11-23 14:16:39 +10:00
Henry de Valence f0810b028d state,consensus,sync: shorten span lengths
These changes help reduce the size of the resulting spans, making the
output more compact.  Together they save about 30-40 characters.
2020-11-23 14:16:39 +10:00
teor d4da9609ee Update the max_concurrent_block_requests docs
In #1298, we decreased `max_concurrent_block_requests`,
but forgot to update the docs.
2020-11-20 10:08:57 -08:00
Henry de Valence ba3c19142c deps: update hyper, metrics to tokio 0.3
The metrics code becomes much simpler because the current version of the
metrics crate builds its own single-threaded runtime on a dedicated worker
thread, so no dependency on the main Zebra Tokio runtime is required.
2020-11-20 10:08:16 -08:00
Henry de Valence add94c1c45 deps: move to tokio 0.3, tower 0.4
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware.  This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method.  See

ddc64e8d4d

for more context on the Tower changes.  To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
2020-11-20 10:08:16 -08:00
Jane Lusby 4c9bb87df2
zebra-state: replace sled with rocksdb (#1325)
## Motivation

Prior to this PR we've been using `sled` as our database for storing persistent chain data on the disk between boots. We picked sled over rocksdb to minimize our c++ dependencies despite it being a less mature codebase. The theory was if it worked well enough we'd prefer to have a pure rust codebase, but if we ever ran into problems we knew we could easily swap it out with rocksdb.

Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, the performance for writes was pretty poor, causing us to become bottle-necked on sled instead of the network.

## Solution

This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain.

The code in this pull request has:
  - [x] Documentation Comments
  - [x] Unit Tests and Property Tests

## Review

@hdevalence
2020-11-18 18:05:06 -08:00
Henry de Valence 4953f21670 fixup! zebrad: hack to skip alreadyverified errors 2020-11-18 03:09:06 -05:00
Henry de Valence d2fc01755b zebrad: more reasonable concurrent block limit
This helps prevent overloading the network with too many concurrent
block requests.  On a fast network, we're likely to still have enough
room to saturate our bandwidth.  In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads.  If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s.  In practice, this
minimum speed will likely be much lower.
2020-11-17 14:56:27 -08:00
Henry de Valence aa7538ab15 zebrad: hack to skip alreadyverified errors 2020-11-17 14:56:27 -08:00
Henry de Valence e55392b61e zebrad: explicitly select the threaded scheduler. 2020-11-17 14:56:27 -08:00
Henry de Valence 6de824bd99 zebrad: remove block verification timeout
Because we set the lookahead limit to be at least twice the size of a checkpoint, we don't have a risk of timeouts.
2020-11-17 14:56:27 -08:00
Henry de Valence e9c847bbd7 zebrad: avoid a borrow in the ChainSync future 2020-11-17 14:56:27 -08:00
Henry de Valence b632a24436 zebrad: add diagnostics on cancelled download tasks 2020-11-17 14:56:27 -08:00
Henry de Valence ec411574ee zebrad: improve sync diagnostics 2020-11-17 14:56:27 -08:00
teor 54cb9277ef Allow some new clippy nightly lints 2020-11-17 10:07:37 +10:00
dependabot[bot] 8c5f6d0177 build(deps): bump once_cell from 1.5.1 to 1.5.2
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.5.1 to 1.5.2.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.5.1...v1.5.2)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-13 14:48:11 -05:00
Jane Lusby 7c0275ac0b
reorganize stop check (#1288)
* reorganize stop check
* remove unused enum
* move out and make it unique
Co-authored-by: teor <teor@riseup.net>
2020-11-13 11:37:52 +10:00
Henry de Valence e0c92167bc Revert "Hedge every syncer block download request"
This reverts commit 656bd24ba7.

The Hedge middleware keeps a pair of histograms, writing into one in the
current time interval and reading from the previous time interval's
data.  This means that the reverted change resulted in doubling all
block downloads until after at least the second measurement interval
(which means that the time measurements are also incorrect, as they're
operating under double the network load...)
2020-11-12 16:45:47 -05:00
Alfredo Garcia 128643d81e
Call `zebra_test::init` where needed. (#1227)
* Add missing `zebra_test::init()` to zebra-chain
* Add missing `zebra_test::init()` to zebra-consensus
* Add missing `zebra_test::init()` to zebra-network
* Add missing `zebra_test::init()` to zebra-state
* Add missing `zebra_test::init()` to zebra-test
* Add missing `zebra_test::init()` to zebrad
2020-11-10 10:29:25 +10:00
teor efef2a2bd7
Reduce acceptance test sled memory usage (#1236)
* Use the default memory limit in the acceptance tests

PR #1233 changed the default `memory_cache_bytes`, but left the
acceptance tests with their old value.
2020-11-10 07:42:30 +10:00
dependabot[bot] a58299a0f0 build(deps): bump color-eyre from 0.5.6 to 0.5.7
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.6 to 0.5.7.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Changelog](https://github.com/yaahc/color-eyre/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.6...v0.5.7)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-09 08:40:55 -05:00
dependabot[bot] 1e3cf6dc5c build(deps): bump tracing-subscriber from 0.2.14 to 0.2.15
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.14 to 0.2.15.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.14...tracing-subscriber-0.2.15)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-04 20:37:40 -05:00
dependabot[bot] 785fc30481 build(deps): bump hyper from 0.13.8 to 0.13.9
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.8 to 0.13.9.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.8...v0.13.9)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-04 20:07:18 -05:00
Henry de Valence 0ad648fb6a zebrad: make lookahead limit configurable.
Sets the default value to the previous lookahead limit.  My testing on
mainnet suggested that the newly lower value (changed when the
checkpoint frequency was decreased) is low enough to cause stalls, even
when using hedged requests.
2020-11-01 10:47:46 -08:00
teor 92c623eddf Log each genesis download
This change helps us diagnose sync hangs.
2020-10-28 11:31:04 -04:00
teor 656bd24ba7 Hedge every syncer block download request
Remove the minimum data points from the syncer hedge configuragtion.
When there are no data points, hedge sends the second request
immediately.

Where there are less than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.

This change should improve genesis and post-restart sync latency.
2020-10-28 11:31:04 -04:00
teor ea510b7d41
Run a block sync in CI with 2 large checkpoints (#1193)
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests

And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
2020-10-27 19:25:29 +10:00
Henry de Valence 4c960c4e6d zebrad: treat duplicate downloads as an error
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
2020-10-26 12:05:35 -07:00
Henry de Valence 4127d086ea zebrad: clarify hedge layering motivation
Co-authored-by: teor <teor@riseup.net>
2020-10-26 12:05:35 -07:00