Commit Graph

486 Commits

Author SHA1 Message Date
Henry de Valence e55392b61e zebrad: explicitly select the threaded scheduler. 2020-11-17 14:56:27 -08:00
Henry de Valence 6de824bd99 zebrad: remove block verification timeout
Because we set the lookahead limit to be at least twice the size of a checkpoint, we don't have a risk of timeouts.
2020-11-17 14:56:27 -08:00
Henry de Valence e9c847bbd7 zebrad: avoid a borrow in the ChainSync future 2020-11-17 14:56:27 -08:00
Henry de Valence b632a24436 zebrad: add diagnostics on cancelled download tasks 2020-11-17 14:56:27 -08:00
Henry de Valence ec411574ee zebrad: improve sync diagnostics 2020-11-17 14:56:27 -08:00
teor 54cb9277ef Allow some new clippy nightly lints 2020-11-17 10:07:37 +10:00
dependabot[bot] 8c5f6d0177 build(deps): bump once_cell from 1.5.1 to 1.5.2
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.5.1 to 1.5.2.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.5.1...v1.5.2)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-13 14:48:11 -05:00
Jane Lusby 7c0275ac0b
reorganize stop check (#1288)
* reorganize stop check
* remove unused enum
* move out and make it unique
Co-authored-by: teor <teor@riseup.net>
2020-11-13 11:37:52 +10:00
Henry de Valence e0c92167bc Revert "Hedge every syncer block download request"
This reverts commit 656bd24ba7.

The Hedge middleware keeps a pair of histograms, writing into one in the
current time interval and reading from the previous time interval's
data.  This means that the reverted change resulted in doubling all
block downloads until after at least the second measurement interval
(which means that the time measurements are also incorrect, as they're
operating under double the network load...)
2020-11-12 16:45:47 -05:00
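The interaction described in this revert can be pictured with a toy model. This is a minimal sketch and not the Hedge middleware's actual implementation: `RotatingSamples`, `record`, `rotate`, and `hedge_delay` are hypothetical names, and plain vectors stand in for the histograms. It only assumes what the two commit messages state: samples are written to one side and read from the other, and with no minimum number of data points an empty read side means the second request is sent immediately.

```rust
use std::time::Duration;

/// Toy stand-in for a pair of latency histograms (hypothetical type).
/// New samples go into `current`; decisions only read `previous`.
struct RotatingSamples {
    current: Vec<Duration>,
    previous: Vec<Duration>,
}

impl RotatingSamples {
    fn new() -> Self {
        RotatingSamples {
            current: Vec::new(),
            previous: Vec::new(),
        }
    }

    /// Record one observed download latency in the write side.
    fn record(&mut self, latency: Duration) {
        self.current.push(latency);
    }

    /// End of a measurement interval: the written data becomes the read side.
    fn rotate(&mut self) {
        self.previous = std::mem::take(&mut self.current);
    }

    /// Delay before the second (hedge) request when no minimum number of
    /// data points is required: an empty read side means "send immediately".
    fn hedge_delay(&self) -> Duration {
        self.previous
            .iter()
            .copied()
            .max()
            .unwrap_or(Duration::ZERO)
    }
}
```

In this model every request issued before the first `rotate` is hedged with zero delay, however many samples have been recorded, which is the doubling of block downloads that the revert message describes.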
Alfredo Garcia 128643d81e
Call `zebra_test::init` where needed. (#1227)
* Add missing `zebra_test::init()` to zebra-chain
* Add missing `zebra_test::init()` to zebra-consensus
* Add missing `zebra_test::init()` to zebra-network
* Add missing `zebra_test::init()` to zebra-state
* Add missing `zebra_test::init()` to zebra-test
* Add missing `zebra_test::init()` to zebrad
2020-11-10 10:29:25 +10:00
teor efef2a2bd7
Reduce acceptance test sled memory usage (#1236)
* Use the default memory limit in the acceptance tests

PR #1233 changed the default `memory_cache_bytes`, but left the
acceptance tests with their old value.
2020-11-10 07:42:30 +10:00
dependabot[bot] a58299a0f0 build(deps): bump color-eyre from 0.5.6 to 0.5.7
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.6 to 0.5.7.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Changelog](https://github.com/yaahc/color-eyre/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.6...v0.5.7)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-09 08:40:55 -05:00
dependabot[bot] 1e3cf6dc5c build(deps): bump tracing-subscriber from 0.2.14 to 0.2.15
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.14 to 0.2.15.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.14...tracing-subscriber-0.2.15)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-04 20:37:40 -05:00
dependabot[bot] 785fc30481 build(deps): bump hyper from 0.13.8 to 0.13.9
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.8 to 0.13.9.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.8...v0.13.9)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-04 20:07:18 -05:00
Henry de Valence 0ad648fb6a zebrad: make lookahead limit configurable.
Sets the default value to the previous lookahead limit.  My testing on
mainnet suggested that the newly lower value (changed when the
checkpoint frequency was decreased) is low enough to cause stalls, even
when using hedged requests.
2020-11-01 10:47:46 -08:00
teor 92c623eddf Log each genesis download
This change helps us diagnose sync hangs.
2020-10-28 11:31:04 -04:00
teor 656bd24ba7 Hedge every syncer block download request
Remove the minimum data points from the syncer hedge configuration.
When there are no data points, hedge sends the second request
immediately.

When there are fewer than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.

This change should improve genesis and post-restart sync latency.
2020-10-28 11:31:04 -04:00
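The "(20)" in this message follows from the formula if the latency percentile is 0.95. That value is inferred from the arithmetic and is not stated in the commit. A small sketch of the calculation, with a hypothetical helper name:

```rust
/// Number of samples needed before one of them can sit above the given
/// latency percentile: 1 / (1 - percentile), rounded to the nearest integer
/// to absorb floating-point error.
fn min_data_points(latency_percentile: f64) -> u64 {
    assert!((0.0..1.0).contains(&latency_percentile));
    (1.0 / (1.0 - latency_percentile)).round() as u64
}
```

With fewer samples than this, no sample can meaningfully represent the tail above the percentile, which may be why the message falls back to the highest recent download time in that case.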
teor ea510b7d41
Run a block sync in CI with 2 large checkpoints (#1193)
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests

And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
2020-10-27 19:25:29 +10:00
Henry de Valence 4c960c4e6d zebrad: treat duplicate downloads as an error
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
2020-10-26 12:05:35 -07:00
Henry de Valence 4127d086ea zebrad: clarify hedge layering motivation
Co-authored-by: teor <teor@riseup.net>
2020-10-26 12:05:35 -07:00
Henry de Valence 253bab042e sync: add a concurrency limit for block downloads 2020-10-26 12:05:35 -07:00
Henry de Valence 0a405c737d zebrad: check state in obtaintips, not extendtips.
The original sync algorithm split the sync process into two phases, one
that obtained prospective chain tips, and another that attempted to
extend those chain tips as far as possible until encountering an error
(at which point the prospective state is discarded and the process
restarts).

Because a previous implementation of this algorithm didn't properly
enforce linkage between segments of the chain while extending tips,
sometimes it would get confused and fail to discard responses that did
not extend a tip.  To mitigate this, a check against the state was
added.  However, this check can cause stalls while checkpointing,
because when a checkpoint is reached we may suddenly need to commit
thousands of blocks to the state.  Because the sync algorithm now has a
`CheckedTip` structure that ensures that a new segment of hashes
actually extends an existing one, we don't need to check against the
state while extending a tip, because we don't get confused while
interpreting responses.

This change results in significantly smoother progress on mainnet.
2020-10-26 12:05:35 -07:00
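The linkage check that makes the state lookup unnecessary can be sketched as follows. Only the name `CheckedTip` comes from the commit message. The field names, the `Hash` alias, and `extend_tip` are assumptions made for illustration, following the "tips as pairs of hashes" idea: a tip remembers the hash it expects to see next, and a response is accepted only if it starts with that hash.

```rust
/// Hypothetical stand-in for a block hash.
type Hash = u64;

/// A tip stored as a pair of hashes: the last hash we will download,
/// and the hash a peer must send first when extending this tip.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CheckedTip {
    tip: Hash,
    expected_next: Hash,
}

/// Accept `response` only if it links to `tip`. On success, return the
/// hashes to download and the new tip; otherwise discard the response.
fn extend_tip(tip: CheckedTip, response: &[Hash]) -> Option<(Vec<Hash>, CheckedTip)> {
    let n = response.len();
    // At least two hashes are needed to form the next pair.
    if n < 2 || response[0] != tip.expected_next {
        return None;
    }
    let new_tip = CheckedTip {
        tip: response[n - 2],
        expected_next: response[n - 1],
    };
    // The final hash is kept only as the next linkage check.
    Some((response[..n - 1].to_vec(), new_tip))
}
```

Because the check uses only data already held by the syncer, extending a tip in this sketch never has to wait on the state service.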
Henry de Valence 65e0c22fbe state: don't pre-buffer the service
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
2020-10-26 12:05:35 -07:00
Henry de Valence ce2ac3336f zebrad: add debug message before state check
This reveals that there may be contention in access to the state, as
this takes a long time.
2020-10-26 12:05:35 -07:00
Henry de Valence 91469faf3c zebrad: eliminate duplicate span in sync 2020-10-26 12:05:35 -07:00
Henry de Valence b5a43f4516 zebrad: remove implementation details from docs
The timeout behavior in zebra-network is an implementation detail, not a
feature of the public API.  So it shouldn't be mentioned in the doc
comments -- if we want timeout behavior, we have to layer it ourselves.
2020-10-26 12:05:35 -07:00
Henry de Valence 1d7309afe2 zebrad: correctly handle duplicates in DownloadSet
Using the cancel_handles, we can deduplicate requests.  This is
important to do, because otherwise when we insert the second cancel
handle, we'd drop the first one, cancelling an existing task for no
reason.
2020-10-26 12:05:35 -07:00
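The hazard described here can be reproduced with standard-library types. This is a sketch under stated assumptions: `std::sync::mpsc` channels stand in for the oneshot cancel handles, `Hash` is a placeholder type, and the `queue` method is hypothetical. Only the names `DownloadSet` and `cancel_handles` come from the commit message.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for a block hash.
type Hash = u64;

struct DownloadSet {
    /// One cancel handle per in-flight download. Dropping a handle
    /// disconnects the task's receiver, which the task treats as a cancel.
    cancel_handles: HashMap<Hash, Sender<()>>,
}

impl DownloadSet {
    fn new() -> Self {
        DownloadSet {
            cancel_handles: HashMap::new(),
        }
    }

    /// Returns the receiver a new download task would watch, or `None`
    /// if the hash is already queued.
    fn queue(&mut self, hash: Hash) -> Option<Receiver<()>> {
        if self.cancel_handles.contains_key(&hash) {
            // Inserting again would replace, and therefore drop, the
            // existing handle, cancelling a healthy task.
            return None;
        }
        let (cancel_tx, cancel_rx) = channel();
        self.cancel_handles.insert(hash, cancel_tx);
        Some(cancel_rx)
    }
}
```

Checking the map before inserting keeps the first task's handle alive, so a repeated request for the same hash is simply ignored instead of cancelling work in progress.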
Henry de Valence 56fe4f4379 zebrad: unify sync restart logic
This lets us keep the main loop simple and just write `continue 'sync;`
to keep going.
2020-10-26 12:05:35 -07:00
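For readers unfamiliar with the construct, `continue 'sync;` jumps back to the top of a loop labeled `'sync`, even from inside a nested loop. The following is an illustrative sketch only: `count_restarts` and its input shape are hypothetical and do not reflect the real main loop.

```rust
/// Each inner vector is one hypothetical sync attempt, made of step results.
/// Returns how many times the sync was restarted before an attempt
/// completed (or the attempts ran out).
fn count_restarts(attempts: &[Vec<Result<(), &str>>]) -> usize {
    let mut restarts = 0;
    let mut remaining = attempts.iter();

    'sync: loop {
        let steps = match remaining.next() {
            Some(steps) => steps,
            None => return restarts,
        };
        for step in steps {
            if step.is_err() {
                // Any failure abandons this attempt and starts over.
                restarts += 1;
                continue 'sync;
            }
        }
        return restarts;
    }
}
```

With a single restart path, every error site inside the loop body needs only one line, which is the simplification the commit message points to.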
Henry de Valence 12d25159c6 zebrad: use hedged requests in sync
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
2020-10-26 12:05:35 -07:00
Henry de Valence 5f229d1475 zebrad: use Downloads in sync
Try to use the better cancellation logic to revert to previous sync
algorithm.  As designed, the sync algorithm is supposed to proceed by
downloading state prospectively and handle errors by flushing the
pipeline and starting over.  This hasn't worked well, because we didn't
previously cancel tasks properly.  Now that we can, try to use something
in the spirit of the original sync algorithm.
2020-10-26 12:05:35 -07:00
Henry de Valence b90581a3d7 zebrad: create a Downloads Stream for syncing.
This makes two changes relative to the existing download code:

1.  It uses a oneshot to attempt to cancel the download task after it
    has started;

2.  It encapsulates the download creation and cancellation logic into a
    Downloads struct.
2020-10-26 12:05:35 -07:00
Henry de Valence b636660d6a zebrad: rename sync::Error alias to BoxError. 2020-10-26 12:05:35 -07:00
dependabot[bot] ff51c2e0c0 build(deps): bump tracing-subscriber from 0.2.13 to 0.2.14
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.13 to 0.2.14.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.13...tracing-subscriber-0.2.14)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-23 15:02:02 -04:00
Henry de Valence cab96aa1a8
zebrad: clarify config help text (#1194) 2020-10-22 15:03:01 +10:00
Alfredo Garcia 21ad6ffc47
Reverse displayed endianness of transaction and block hashes (#1171)
* Reverse displayed endianness of transaction and block hashes
* fix zebra-checkpoints utility for new hash order
* Stop using "zebrad revhex" in zebrad-hash-lookup
* Rebuild checkpoint lists in new hash order
This change also adds additional checkpoints to the end of each list.

* Replace TransactionHash with transaction::Hash
This change should have been made in #905, but we missed Debug impls
and some docs.

Co-authored-by: Ramana Venkata <vramana@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
2020-10-22 07:54:02 +10:00
teor e52a1c07a3 Ignore longer sync tests by default 2020-10-21 21:08:04 +10:00
teor 0d121833af Add sync tests that download 2000 blocks 2020-10-21 21:08:04 +10:00
teor 6fe3cc56dd Refactor sync test to be more flexible
And add documentation
2020-10-21 00:58:08 -04:00
teor 1d35c5a0b9 Enable the zebrad sync tests by default
If your test environment does not have DNS or network access, set the
ZEBRA_SKIP_NETWORK_TESTS environment variable to disable these tests.
2020-10-21 00:58:08 -04:00
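A check of this kind might look like the sketch below. The variable name `ZEBRA_SKIP_NETWORK_TESTS` is from the commit message. Both function names are hypothetical, and treating any value (including an empty one) as "skip" is an assumption.

```rust
use std::ffi::OsStr;

/// Hypothetical helper: network tests are skipped whenever the variable
/// is present, whatever its value.
fn skip_network_tests(value: Option<&OsStr>) -> bool {
    value.is_some()
}

/// Reads the real environment and applies the rule above.
fn network_tests_disabled() -> bool {
    skip_network_tests(std::env::var_os("ZEBRA_SKIP_NETWORK_TESTS").as_deref())
}
```

A network-dependent test could call `network_tests_disabled()` at the top and return early when it is true. Splitting the decision from the environment lookup keeps the rule testable without mutating process state.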
Henry de Valence eb43893de0 consensus: minimize API, clean docs
This reduces the API surface to the minimum required for functionality,
and cleans up module documentation.  The stub mempool module is deleted
entirely, since it will need to be redone later anyways.
2020-10-20 11:16:22 -04:00
teor d9fbba8a55 Skip the sync tests when ZEBRA_SKIP_NETWORK_TESTS is set 2020-10-16 15:21:01 -04:00
teor 04ce907dbf Remove duplicate code in zebra_test::command 2020-10-15 19:54:00 -04:00
teor 32bbc19c6b Fix a timeout bug in zebra_test::command
And add tests for the command functionality.

Also document some remaining bugs (see #1140).
2020-10-15 19:54:00 -04:00
teor 92f0c934cf Add a sync acceptance test for the Testnet 2020-10-15 19:54:00 -04:00
Alfredo Garcia 2d3c3bcc23 add systemd service file 2020-10-14 15:33:00 -04:00
Alfredo Garcia c0a14ecc8c
move genesis parameters to zebra-chain (#1151) 2020-10-12 14:08:23 -07:00
dependabot[bot] 76e7e3d714 build(deps): bump tracing-subscriber from 0.2.12 to 0.2.13
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.12 to 0.2.13.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.12...tracing-subscriber-0.2.13)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-08 15:09:32 -04:00
Jane Lusby 855f9b5bcb
Implement MVP of NonFinalizedState and integrate it with the state service (#1101)
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names
2020-10-08 13:07:32 +10:00
dependabot[bot] 23a62a2d87 build(deps): bump inferno from 0.10.0 to 0.10.1
Bumps [inferno](https://github.com/jonhoo/inferno) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/jonhoo/inferno/releases)
- [Changelog](https://github.com/jonhoo/inferno/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jonhoo/inferno/compare/v0.10.0...v0.10.1)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-06 05:31:01 -04:00
dependabot[bot] d769f62a73 build(deps): bump color-eyre from 0.5.5 to 0.5.6
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Changelog](https://github.com/yaahc/color-eyre/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.5...v0.5.6)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-05 11:26:23 -04:00
Jane Lusby 40e22808c7
disable reporting url for timeout errors (#1087)
* disable reporting url for timeout errors

* revert newline removal

* switch to released color-eyre version
2020-09-21 16:15:09 -07:00
Henry de Valence fe61090a64 zebrad: make Inbound Poll::Ready before setup.
The Inbound service only needs the network setup for some requests, but
it can service other requests without it.  Making it return
Poll::Pending until the network setup finishes means that initial
network connections may view the Inbound service as overloaded and
attempt to load-shed.
2020-09-21 09:26:39 -07:00
dependabot[bot] 85241a49d6 build(deps): bump hyper from 0.13.7 to 0.13.8
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.7 to 0.13.8.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.7...v0.13.8)

Signed-off-by: dependabot[bot] <support@github.com>
2020-09-21 11:58:31 -04:00
Henry de Valence 9c021025a7 network: fill in remaining request/response pairs 2020-09-20 10:21:18 -07:00
Henry de Valence 4b35fea492 zebrad: document Inbound, ChainSync responsibilities 2020-09-18 18:34:25 -07:00
Henry de Valence 65877cb4b1 zebrad: make Inbound propagate backpressure 2020-09-18 18:34:25 -07:00
Henry de Valence 55f46967b2 zebrad: serve blocks from Inbound service
The original version of this commit ran into

https://github.com/rust-lang/rust/issues/64552

again.  Thanks to @yaahc for suggesting a workaround (using futures combinators
to avoid writing an async block).
2020-09-18 18:34:25 -07:00
Henry de Valence 170f588ffb network: document load-shedding behavior
This was part of the original design and is described in the Connection
internals, but we never documented it externally.
2020-09-18 18:34:25 -07:00
Henry de Valence 1d0ebf89c6 zebrad: move seed command into inbound component
Remove the seed command entirely, and make the behavior it provided
(responding to `Request::Peers`) part of the ordinary functioning of the
start command.

The new `Inbound` service should be expanded to handle all request
types.
2020-09-18 18:34:25 -07:00
Henry de Valence 1d3892e1dc network: rename alias to BoxError
This is shorter and consistent with Tower (which is why we use it in the
first place).
2020-09-18 18:34:25 -07:00
Jane Lusby ca648ff27c
Enable issue-url feature in color-eyre (#1072)
* Enable issue-url feature in color-eyre

* get version automatically

* and the url!
2020-09-17 15:09:18 -07:00
dependabot[bot] ba32d27f6e
build(deps): bump tracing-subscriber from 0.2.11 to 0.2.12 (#1059)
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.11 to 0.2.12.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.11...tracing-subscriber-0.2.12)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-09-14 13:49:07 -07:00
Jane Lusby a7b418bfe5
Add test for first checkpoint verification (#1018)
* add test for first checkpoint sync

Prior to this change we've not had any tests that verify our sync /
network logic is well behaved. This PR cleans up the test helper code to
make error reports more consistent and uses this cleaned up API to
implement a checkpoint sync test which runs zebrad until it reads the
first checkpoint event from stdout.

Co-authored-by: teor <teor@riseup.net>

* move include out of unix cfg

Co-authored-by: teor <teor@riseup.net>
2020-09-11 13:39:39 -07:00
Henry de Valence 3133214e4f zebrad: use new state API 2020-09-11 13:37:49 -07:00
teor b1e1291f45 Log inbound peer requests at debug
Logging at info was a bit too verbose.

Also add a short log message.
2020-09-10 09:46:53 -07:00
Henry de Valence 24de90c900 zebrad: tidy sync imports 2020-09-10 09:45:52 -07:00
Henry de Valence 9b6e66c1b9 zebrad: rename Syncer to ChainSync
This name clarifies what is being synced and avoids an agent-noun
construction.
2020-09-10 09:45:52 -07:00
Henry de Valence 0bc79686b8 zebrad: move sync into components module.
Part of #1030.
2020-09-10 09:45:52 -07:00
teor adafe1d189 Restart sync after the first failed ObtainTips
The ObtainTips retry was redundant. The timeout wasn't much shorter, but
it made the code and sync logic more complicated.
2020-09-09 15:35:09 -07:00
teor 2a68ef5acb Update the peerset buffer size and sync timeout
Also add a bunch of comments and documentation for network-constrained
nodes, and for testnet.
2020-09-08 12:44:33 -07:00
teor b062a682b0 Refactor "waiting for pending blocks" log 2020-09-08 12:44:33 -07:00
teor e6e859dce2 Tweak sync timeouts
* increase the EWMA default and decay
* increase the block download retries
* increase the request and block download timeouts
* increase the sync timeout
2020-09-08 12:44:33 -07:00
teor ce12d4dadc Add timeouts for tip responses and block verify tasks 2020-09-08 12:44:33 -07:00
teor 379ce5c1b8 Retry obtain and extend tips on failure 2020-09-08 12:44:33 -07:00
Alfredo Garcia ca1a451895
Add test for metrics and tracing endpoints (#1000)
* add metrics and tracing endpoint tests

* test endpoints more

* add change filter test for tracing

* add await to post

* separate metrics and tracing tests

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2020-09-07 17:05:23 -07:00
Alfredo Garcia 454e75e7c0
Rename old references to BlockHeaderHash and BlockHeight (#1002)
* rename some references

* Apply suggestions from code review

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: teor <teor@riseup.net>

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: teor <teor@riseup.net>
2020-09-04 15:40:48 -07:00
teor 48497d4857
Ignore sync errors when the block is already verified (#980)
* Ignore sync errors when the block is already verified

If we get an error for a block that is already in our state, we don't
need to restart the sync. It was probably a duplicate download.

Also:

Process any ready tasks before reset, so the logs and metrics are
up to date. (But ignore the errors, because we're about to reset.)

Improve sync logging and metrics during the download and verify task.

* Remove duplicate hashes in logs
Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* Log the sync hash span at warn level
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-09-04 08:13:00 +10:00
teor 437549d8e9
Always drop the final hash in peer responses (#991)
To work around a zcashd bug that squashes responses together.
2020-09-04 08:09:34 +10:00
teor c770daa51f
If the first ExtendTips hash is bad, discard it and re-check (#992) 2020-09-04 08:08:19 +10:00
Alfredo Garcia 5485f4429a
Add config path to acceptance tests (#946)
* add and apply config mode to get_child

* remove option to read config from current directory

* remove argument from get_child
2020-09-03 13:13:23 -07:00
Jane Lusby ffdec0cb23
Remove in-memory state service (#974)
* Remove in-memory state service

* make the config compatible with toml again

* checkpoint commit to see how much I still have to revert

* back to the starting point...

* remove unused dependency

* reorganize error handling a bit

* need to make a new color-eyre release now

* reorder again because I have problems

* remove unnecessary helpers

* revert changes to config loading

* add back missing space

* Switch to released color-eyre version

* add back missing newline again...

* improve error message on unix when terminated by signal

* add context to last few asserts in acceptance tests

* instrument some of the helpers

* remove accidental extra space

* try to make this compile on windows

* reorg platform specific code

* hide on_disk module and fix broken link
2020-09-01 12:39:04 -07:00
teor 3fdfcb3179 fix: remove old tips that are behind new tips
This change makes sync less reliant on the exact order of ObtainTips and
ExtendTips responses.
2020-09-01 11:42:48 -04:00
teor a6d6e65940 fix: fix the flamegraph module comment 2020-09-01 11:40:18 -04:00
Ramana Venkata 448250f901
Deduplicate test config defaults (#971)
Fixes #967
2020-08-31 12:43:43 -07:00
Ramana Venkata ad0001f7f7
zebra-state: Add support for temporary sled databases (#939)
* Test config with persistent sled database
* Test ephemeral config
* Add misconfigured ephemeral test
2020-08-31 18:32:55 +10:00
teor fa04072298
Make the checkpoint limit test more readable (#941)
* fix: Pass zebra_consensus::Config in a test

* fix: Remove a redundant import
2020-08-24 11:34:10 -07:00
teor 78201b456d feature: Implement checkpoint_sync for checkpoint verification
* add CheckpointList::new_up_to(limit: NetworkUpgrade)
* if checkpoint_sync is false, limit checkpoints to Sapling
* update tests for CheckpointList and chain::init
2020-08-24 15:34:46 +10:00
teor 06f4a59664 feature: Add a checkpoint_sync config option
(The option doesn't do anything yet.)
2020-08-24 15:34:46 +10:00
Ramana Venkata 991c70723a zebrad: Create zebrad.toml in acceptance tests from ZebradConfig
Fixes #929
2020-08-23 21:24:19 -04:00
teor 3400b72699 fix: Make the start acceptance tests stricter 2020-08-21 07:22:53 +10:00
teor 02e6027c57 refactor: Remove duplicate acceptance test code 2020-08-21 07:22:53 +10:00
teor 1e0e4914a0 fix: Improve an acceptance test failure message
If the tests conflict with a local zebrad, zcashd, or other tests, they
need to be run with a custom config, or in an isolated environment.
2020-08-21 07:22:53 +10:00
teor b8e8d4f548 fix: Remove some deeply-nested instrument spans
Closes #923.
2020-08-20 14:52:39 -04:00
Alfredo Garcia d349f2bbc2
Refactor acceptance serialized_tests (#920)
* add network listening address to default config
2020-08-20 07:48:22 +10:00
Henry de Valence 103b663c40 chain: rename BlockHeight to block::Height 2020-08-17 11:46:34 -07:00
Henry de Valence 61dea90e2f chain: rename BlockHeaderHash to block::Hash
This is the first in a sequence of changes that change the block:: items
to not include Block as a prefix in their name, in accordance with the
Rust API guidelines.
2020-08-17 11:46:34 -07:00
Henry de Valence 948b067808 chain: move Network, NetworkUpgrade to parameters
Also, avoid using star-imports of the enum variants, which pollutes the
namespace.
2020-08-17 11:46:34 -07:00
Henry de Valence 0d1f56ad2f chain: remove utils module
A catch-all utils module can really easily slip into being a place to stash
miscellaneous functions that don't really belong anywhere in particular.
2020-08-17 11:46:34 -07:00
Deirdre Connolly 27ed2288b5 Remove redundant clones for PathBufs 2020-08-14 20:15:24 -04:00
Alfredo Garcia e73f976194
Valid generated config acceptance test (#859)
* add valid generated config test

* change to pathbuf

* use -c to make sure we are using the generated file

* add and use a ZebraTestDir type

* change approach to generate tempdir in top of each test

* pass tempdir to test_cmd and set current dir to it

* add and use a `generated_config_path` variable in tests
2020-08-13 13:31:13 -07:00
Henry de Valence a79ce97957
Fix sync algorithm. (#887)
* checkpoint: reject older of duplicate verification requests.

If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled.  Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.

* sync: add a timeout layer to block requests.

Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.

* sync: restart syncing on error

Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.

* sync: additional debug info

* sync: handle lookahead limit correctly.

Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped.  This meant that completed tasks could be left until the limit was
exceeded again.  Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.

* network: add debug instrumentation to retry policy

* sync: instrument the spawned task

* sync: streamline ObtainTips/ExtendTips logic & tracing

This change does three things:

1.  It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method.  This means that when debugging we only have one
deduplication algorithm to focus on.

2.  It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.

3.  It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip").  The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.

* sync: hack to work around zcashd behavior

* sync: localize debug statement in extend_tips

* sync: change algorithm to define tips as pairs of hashes.

This is different enough from the existing description that its comments no
longer apply, so I removed them.  A further chunk of work is to change the sync
RFC to document this algorithm.

* sync: reduce block timeout

* state: add resource limits for sled

Closes #888

* sync: add a restart timeout constant

* sync: de-pub constants
2020-08-12 16:48:01 -07:00
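The corrected lookahead handling from this commit ("extract all completed results, and use the number of pending tasks to decide") can be sketched as below. All names are hypothetical, and the exact comparison against the limit is an assumption. The point is only the order of operations: drain everything that has finished first, then compare what is still pending against the limit.

```rust
/// What the sync loop should do next (hypothetical type).
#[derive(Debug, PartialEq)]
enum Next {
    ExtendTips,
    WaitForBlocks,
}

/// Hypothetical stand-in for a spawned download-and-verify task.
struct Task {
    done: bool,
}

/// Remove every completed task, then decide based on what is still pending.
/// Returns the number of completed tasks that were drained.
fn drain_and_decide(tasks: &mut Vec<Task>, lookahead_limit: usize) -> (usize, Next) {
    let before = tasks.len();
    // Drain all finished tasks, not just enough to get under the limit.
    tasks.retain(|task| !task.done);
    let completed = before - tasks.len();

    let next = if tasks.len() < lookahead_limit {
        Next::ExtendTips
    } else {
        Next::WaitForBlocks
    };
    (completed, next)
}
```

Draining first means a completed result is never left sitting in the set until the limit is exceeded again, which is the behavior the commit message identifies as the bug.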
Henry de Valence 299afe13df
zebra-network tweaks. (#877)
* network: move gossiped peer selection logic into address book.

* network: return BoxService from init.

* zebrad: add note on why we truncate the gossiped peer list

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* Remove unused .rustfmt.toml

Many of these options are never actually loaded by our CI because of a channel
mismatch, where they're not applied on stable but only on nightly (see the logs
from a rustfmt job).  This means that we can get different settings when
running `cargo fmt` on the nightly and stable channels, which was causing a CI
failure on this PR.  Reverting back to the default rustfmt settings avoids this
problem and keeps us in line with upstream rustfmt.  There's no loss to us
since we were using the defaults anyways.

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-11 13:07:44 -07:00
dependabot[bot] 945b019739
build(deps): bump tracing-subscriber from 0.2.10 to 0.2.11 (#873)
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.10 to 0.2.11.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.10...tracing-subscriber-0.2.11)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-08-11 10:30:50 -07:00
teor 2550c44d48
Make sync ignore known hashes (#853)
* fix: Handle known ObtainTips correctly

enumerate never returns a value beyond the end of the vector.

* fix: Ignore known tips in ExtendTips

Some peers send us known tips when we try to extend.

* fix: Ignore known hashes when downloading

Despite all our other checks, we still end up downloading some hashes
multiple times.

* fix: Increase the number of retries

The old sync code relied on duplicate block fetches to make progress,
but the last few commits have removed some of those duplicates.

Instead, just retry the fetches that fail.

* fix: Tweak comments

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* fix: Cleanup the state_contains interface in Sync

* Fix brackets

Oops

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-10 16:17:50 -07:00
Alfredo Garcia c9093e4d59
Make more checks in non server acceptance tests (#860)
* make sure no info is printed in non server tests

* check exact full output for validity instead of log msgs

* add end of output character to version regex

* use coercions, use equality operator

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-10 12:50:48 -07:00
Alfredo Garcia 9c387521bd
Print endpoint addresses at startup (#867)
* print tracing and metrics endpoints in startup

* print network address in startup
2020-08-10 12:47:26 -07:00
teor e95358dbe3 fix: Increase the number of retries
The old sync code relied on duplicate block fetches to make progress,
but the last few commits have removed some of those duplicates.

Instead, just retry the fetches that fail.
2020-08-10 18:58:21 +10:00
teor faac50697c feature: Add a verified blocks metrics counter
We have a counter for pending "download and verify" futures. But these
futures are spawned, so they can complete in any order. They can also
complete before we receive their results.
2020-08-10 15:12:08 +10:00
teor 6aeefcee8b fix: Improve sync diagnostics 2020-08-10 15:12:08 +10:00
Henry de Valence 6d1a4b2218
Load config after initializing the Terminal (#848) 2020-08-06 17:22:40 -07:00
Alfredo Garcia c52481c041 fix logs 2020-08-07 09:21:57 +10:00
Jane Lusby 3e9c6f054b
fix log level default for server commands (#840)
* fix log level default for server commands

* remove dbg
2020-08-06 11:23:00 -07:00
Henry de Valence a77328ad7c
Refactor tracing components (#834)
* Split tracing component code into modules.

* Repatriate Tracing and simplify config handling.

We upstreamed our Tracing component, expecting not to have to exert fine
control over the tracing settings.  But this turned out not to be the case, and
now that we want to do other things (flamegraphs, journalctl, opentelemetry,
etc), we end up with really awkward code (as in the current flamegraph
handling).

This also makes use of the changes to `init()` to load the config early to pass
configuration data into the components, which avoids the need for the
refactoring in #775.

Finally, we restore support for the `-v` flag when the filter is unset.  Closes #831.

* Disable tracing and metrics endpoints by default.

Closes #660.

* Switch back to upstream Abscissa.

* Integrate flamegraph support into the new Tracing component.

* Pass -v in acceptance tests to get info-level output.

* Clean up acceptance test code.
2020-08-06 10:29:31 -07:00
Jane Lusby 867dd0b475
Setup tracing-flame for use profiling zebrad (#436)
* Setup tracing-flame for use profiling zebrad

* start work on conditional flamegraph generation

* review time!

* update comments

* Update Cargo.toml

* disable default features for inferno

* reorganize

* missing one trait

* Apply suggestions from code review

* graceful shutdown!

* remove special case handling on ctrlc for cleanup

* rename signal fn to better represent its responsibility

* remove unused global hook for flushing flamegraph

* move tracing logic to the right file

* just copy linkerd's signal handling logic

* update book

* make zebrad app drop on shutdown normally

* Update zebrad/src/components/tokio.rs

Co-authored-by: teor <teor@riseup.net>

* Update zebrad/src/application.rs

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* cleanup a little

* ooh yea there's an API for that

* setup env-filter for backup subscriber

* document env filter

* document return codes

* forgot to save

* Update book/src/applications/zebrad.md

Co-authored-by: teor <teor@riseup.net>
2020-08-05 16:35:56 -07:00
Henry de Valence 4a03d76a41
Remove environment variables in favor of documented config options. (#827)
* Load tracing filter only from config and simplify logic.

* Configure the state storage in the config, not an environment variable.

This also changes the config so that the path is always set rather than being
optional, because Zebra always needs a place to store its config.
2020-08-05 11:48:08 -07:00
Henry de Valence 82da4a5326 Remove connect command. 2020-08-04 23:34:45 -07:00
Alfredo Garcia e037466e26
Acceptance tests - check kill signal (#814)
* check kill signal exit code

* change names and add docs

* change exit_status() to was_killed()

* change assert calls
2020-08-04 13:38:39 -07:00
dependabot[bot] 8e268150a7 build(deps): bump tracing-subscriber from 0.2.9 to 0.2.10
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.9 to 0.2.10.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.9...tracing-subscriber-0.2.10)

Signed-off-by: dependabot[bot] <support@github.com>
2020-08-03 21:11:50 -04:00
Alfredo Garcia 5f23970377 move env variable creation to test_cmd 2020-08-03 15:50:48 -04:00
Alfredo Garcia 2dacd0a62b change default state path 2020-08-03 15:50:48 -04:00
Alfredo Garcia f2d7bb3177
Command execution tests (#690)
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-08-01 16:15:26 +10:00
Alfredo Garcia 617f1d80ef move docs to zebra book 2020-07-29 19:44:21 -07:00
Alfredo Garcia 6297a7cd19 document zebrad environment variables 2020-07-29 19:44:21 -07:00
teor 050c46388f fix: Open the endpoints after the config is loaded
We get the injected TokioComponent dependency before the config is
loaded, so we can't use it to open the endpoints.

And we can't define after_config, because we use derive(Component).

So we work around these issues by opening the endpoints manually,
from the application's after_config.
2020-07-29 16:03:52 +10:00
teor e7437cc551 feature: Get endpoint addresses from config 2020-07-29 16:03:52 +10:00
teor 11090dbf91 feature: Separate Mainnet and Testnet state 2020-07-29 01:45:19 -04:00
Alfredo Garcia 5b3c6e4c6c
Port bash checkpoint scripts to zebra-checkpoints single rust binary (#740)
* make zebra-checkpoints
* fix LOOKAHEAD_LIMIT scope
* add a default cli path
* change doc usage text
* add tracing
* move MAX_CHECKPOINT_HEIGHT_GAP to zebra-consensus
* do byte_reverse_hex in a map
2020-07-25 17:53:00 +10:00
Henry de Valence b59cfc49b7 sync: create requests sequentially to respect backpressure.
This seems like a better design on principle but also appears to give a much
nicer sawtooth pattern of queued blocks in the checkpointer and a much smoother
pattern of block requests.
2020-07-24 18:36:00 -04:00
Henry de Valence 4aa00ad216 Align crate versions and user-agent with NU numbers.
We had a brief discussion on discord and it seemed like we had consensus on the
following versioning policy:

* zebrad: match major version to NU version, so we will start by releasing
  zebrad 3.0.0;

* zebra-* libraries: start by matching zebrad's version, then increment major
  versions of each library as we need to make breaking changes (potentially
  faster than the zebrad version, always respecting semver but making no
  guarantees about the longevity of major releases).

This commit sets all of the crate versions to 3.0.0-alpha.0 -- the -alpha.0
marks it as a prerelease not subject to perfect adherence to compatibility
guarantees.
2020-07-24 11:46:37 -07:00
dependabot[bot] f7c59c99b5 build(deps): bump tracing-subscriber from 0.2.8 to 0.2.9
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.8 to 0.2.9.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.8...tracing-subscriber-0.2.9)

Signed-off-by: dependabot[bot] <support@github.com>
2020-07-24 14:31:44 -04:00
teor 2acfcf3a90
Make the CheckpointVerifier handle partial restarts (#736)
Also put generic bounds on the BlockVerifier struct,
so we get better compilation errors.
2020-07-24 11:47:48 +10:00
teor 77a1fefa1e
Download genesis (#731)
* feature: Add more CheckpointVerifier tracing

* fix: Download the genesis block
2020-07-23 10:56:52 -07:00
Jane Lusby c1a1493159
use dirs crate for default location of state and config (#714)
* use dirs crate for default location of state and config
* panic if a path isn't specified for zebra-state
2020-07-23 21:12:20 +10:00
teor c95c825707 fix: Lookup the genesis hash based on the network 2020-07-23 03:46:24 -04:00
Henry de Valence 4a98b8fa0d Add basic metrics to the syncer. 2020-07-22 21:59:00 -07:00
Henry de Valence c2c2a28e8b Improve tracing output in chain verifier 2020-07-22 21:59:00 -07:00
Jane Lusby 7d4e717182
Add block locator request to state layer (#712)
* Add block locator request to state layer

* pass genesis in request

* Update zebrad/src/commands/start/sync.rs

* fix errors
2020-07-22 18:01:31 -07:00
Henry de Valence 49aa41544d sync: try to ignore spurious inv messages.
Closes #697.

per  https://github.com/ZcashFoundation/zebra/issues/697#issuecomment-662742971

The response to a getblocks message is an inv message with the hashes of the
following blocks. However, inv messages are also sent unsolicited to gossip new
blocks across the network. Normally, this wouldn't be a problem, because for
every other request we filter only for the messages that are relevant to us.
But because the response to a getblocks message is an inv, the network layer
doesn't (and can't) distinguish between the response inv and the unsolicited
inv.

But there is a mitigation we can do. In our sync algorithm we have two phases:
(1) "ObtainTips" to get a set of tips to chase down, (2) repeatedly call
"ExtendTips" to extend those as far as possible. The unsolicited inv messages
have length 1, but when extending tips we expect to get more than one hash. So
we could reject responses in ExtendTips that have length 1 in order to ignore
these messages. This way we automatically ignore gossip messages during initial
block sync (while we're extending a tip) but we don't ignore length-1 responses
while trying to obtain tips (while querying the network for new tips).
2020-07-22 17:55:52 -07:00
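The mitigation described in 49aa41544d can be sketched roughly as follows. This is a minimal illustration, not the syncer's actual code: `BlockHash`, `SyncPhase`, and `filter_inv_response` are hypothetical names introduced here, and the only rule taken from the commit message is that a length-1 response is ignored while extending tips but accepted while obtaining tips.

```rust
/// Hypothetical stand-in for a block hash; the real type lives elsewhere in Zebra.
type BlockHash = [u8; 32];

/// Which phase of the sync algorithm issued the request (names are assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SyncPhase {
    ObtainTips,
    ExtendTips,
}

/// Returns the hashes to act on, or `None` if the response looks like
/// unsolicited gossip and should be ignored.
fn filter_inv_response(phase: SyncPhase, hashes: Vec<BlockHash>) -> Option<Vec<BlockHash>> {
    match phase {
        // While extending a tip we expect more than one hash, so a single
        // hash is treated as a gossiped block announcement and dropped.
        SyncPhase::ExtendTips if hashes.len() == 1 => None,
        // ObtainTips keeps length-1 responses, and longer responses are
        // always kept.
        _ => Some(hashes),
    }
}
```

The check is a heuristic: it cannot tell a genuine one-hash reply from gossip, which is why it is applied only in the phase where short replies are unexpected.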
teor 9b97ebbd61 feature: Choose checkpoints based on the config 2020-07-23 10:26:25 +10:00
teor 3d721a96a5 feature: Add the state config to the config file 2020-07-23 10:26:25 +10:00
teor 89ac2793d6 feature: Use ChainVerifier in the sync service 2020-07-23 10:26:25 +10:00
Jane Lusby a722cf33f7 enable new tracing instrumentation in tokio 2020-07-22 14:39:54 -04:00
Henry de Valence 928b0beb5d sync: unindent fetch task 2020-07-21 20:16:23 -07:00
Henry de Valence b722818e02 sync: remove redundant tracing specifier
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-07-21 20:16:23 -07:00
Henry de Valence 1047d2f690 sync: add backpressure to syncer
Closes #617.
Closes #698.

The remaining work on the syncer is alluded to in a new comment:

1. Correctly constructing a block locator object
2. Detecting when we've stopped making progress syncing and restarting obtain_tips.
2020-07-21 20:16:23 -07:00
Alfredo Garcia db2eb80b3e
Create consensus utils and move byte_reverse_hex function to it (#705)
* move byte_reverse_hex function
2020-07-22 12:29:14 +10:00
teor e5bb96715f fix: Reduce sync error logs to info or warn
Network issues are very common.
2020-07-21 10:13:03 -07:00
teor a0dbe85acd fix: Rewrite the config usage comment 2020-07-21 12:58:55 -04:00
dependabot[bot] 2208a6a22d build(deps): bump tracing-subscriber from 0.2.7 to 0.2.8
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.7 to 0.2.8.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.7...tracing-subscriber-0.2.8)

Signed-off-by: dependabot[bot] <support@github.com>
2020-07-21 12:01:40 -04:00
Alfredo Garcia fe2a468417
add favicon to generated docs (#681) 2020-07-17 16:45:29 -07:00
teor 71de6de701 fix: Only enable tokio components for servers
Only enable the tokio and tracing components for server commands.
2020-07-17 10:12:51 +10:00
teor 49a3a7d6d1 fix: Only launch network endpoints for server commands
Fixes #669.
2020-07-16 10:40:03 -07:00
teor 851afad01f
fix: Resist CheckpointVerifier memory DoS attacks (#635)
* fix: Resist CheckpointVerifier memory DoS attacks

Allow a maximum of 2 queued blocks at each height, as a tradeoff between
efficient bad block rejection, and memory usage.

Closes #628.

* fix: Make max queued blocks at height equal to fanout

* fix: Just allocate all the capacity upfront

* fix: Use with_capacity(1) and reserve_exact(1)
2020-07-15 13:27:10 -07:00
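As a rough illustration of the per-height cap described in 851afad01f, the sketch below rejects a block once a height already holds the maximum number of queued blocks. `HeightQueue`, `try_queue`, and the constant name are hypothetical; the cap of 2 comes from the first bullet of the commit message (a later bullet ties it to the fanout), and the use of `with_capacity(1)` and `reserve_exact(1)` only mirrors the wording of the last bullet.

```rust
use std::collections::BTreeMap;

/// Assumed cap, taken from the "maximum of 2 queued blocks at each height" wording.
const MAX_QUEUED_BLOCKS_PER_HEIGHT: usize = 2;

/// Hypothetical queue keyed by block height; `B` stands in for a queued block.
struct HeightQueue<B> {
    queued: BTreeMap<u32, Vec<B>>,
}

impl<B> HeightQueue<B> {
    fn new() -> Self {
        HeightQueue { queued: BTreeMap::new() }
    }

    /// Queues `block` at `height`, handing the block back to the caller
    /// when that height is already full.
    fn try_queue(&mut self, height: u32, block: B) -> Result<(), B> {
        // Start small: most heights only ever see one block.
        let slot = self.queued.entry(height).or_insert_with(|| Vec::with_capacity(1));
        if slot.len() >= MAX_QUEUED_BLOCKS_PER_HEIGHT {
            return Err(block);
        }
        // Grow by exactly one element so memory use tracks the cap closely.
        slot.reserve_exact(1);
        slot.push(block);
        Ok(())
    }
}
```

Bounding each height bounds total memory at roughly the cap times the number of heights the verifier is willing to look ahead.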
teor 78459afe97 fix: Stop revhex on EOF 2020-07-15 19:19:02 +10:00
teor 12b9fa8ae2
Let zebrad revhex read from stdin (#648)
* Log at warn level for commands that use stdout
* Let zebrad revhex read from stdin

Most unix tools support reading from stdin, so they can be used in
pipelines.

Part of #564.
2020-07-15 16:16:07 +10:00
dependabot[bot] c3fcac8a5c Bump hyper from 0.13.6 to 0.13.7
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.6 to 0.13.7.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.6...v0.13.7)

Signed-off-by: dependabot[bot] <support@github.com>
2020-07-14 10:40:53 -07:00
teor e1bb95603c
Put zebrad/Cargo.toml in a nicer order (#650) 2020-07-14 10:17:05 -07:00
teor 8b5ec155f0
Consensus refactor (#629)
* Flatten consensus::verify::* to consensus::*
* Move consensus::*::tests into their own files
* Move CheckpointList into its own file
* Move Progress and Target into a types module

QueuedBlock and QueuedBlockList can stay in checkpoint.rs, because
they are tightly coupled to CheckpointVerifier.
2020-07-10 16:51:01 +10:00
Henry de Valence ff4e722cd7 sync: touch up tracing output. 2020-07-09 11:15:06 -07:00
Dimitris Apostolou ba81d7d4c0 Fix typos 2020-07-07 11:13:49 -07:00
dependabot[bot] a1d02c2606 Bump tracing-subscriber from 0.2.6 to 0.2.7
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.6 to 0.2.7.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.6...tracing-subscriber-0.2.7)

Signed-off-by: dependabot[bot] <support@github.com>
2020-07-02 06:47:04 -04:00
Jane Lusby 51f6ce86ff
Implement retry policy for syncer (#551) 2020-07-01 13:35:01 -07:00
Jane Lusby 7245d91fe9
fix block downloading to be parallelized and committed via the verifier (#540) 2020-06-30 09:42:09 -07:00
Henry de Valence 21bf913b48 Revert "correctly trim and download tips (#531)"
This reverts commit e102bd5e34.
2020-06-24 12:24:37 -07:00
Jane Lusby e102bd5e34
correctly trim and download tips (#531)
* also download tips and filter tips

* dispatch all block downloads together

* tweak to match Henry's changes

* switch to more intuitive match

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-24 15:19:34 -04:00
Alfredo Garcia 67718898c5
add usage help to generated config (#527) 2020-06-23 11:56:00 -07:00
Henry de Valence a453edd91c Put type definitions back at the bottom of the file. 2020-06-23 10:16:27 -07:00
Henry de Valence 18eb212d8e Set the new tips to be the last, not first, hash. 2020-06-23 10:16:27 -07:00
Henry de Valence 70241d3cad Fix broken git dependencies.
Pinning hashes means these won't break again in the future; they can always be updated.
2020-06-22 20:23:02 -07:00
Jane Lusby 1c42b66a4f
Implement sync component for start subcommand (#506) 2020-06-22 19:24:53 -07:00
Jane Lusby 246e7cd2a9
Start testing out new version of `eyre` and `color-eyre` in zebra (#526)
* port to new version of eyre without generics

* correctly setup color_eyre hooks

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-22 15:36:23 -07:00
dependabot[bot] f301de41fa Bump tracing-subscriber from 0.2.5 to 0.2.6
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.5 to 0.2.6.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber/-0.2.5...tracing-subscriber-0.2.6)

Signed-off-by: dependabot[bot] <support@github.com>
2020-06-22 12:01:47 -04:00
Deirdre Connolly 05316dee21 Listen on 0.0.0.0, not 127.0.0.1
Turns out when your node faces the internet directly, it has to listen
to those addresses directly.
2020-06-19 03:46:09 -04:00
Henry de Valence 6cc1627a5d zebrad: apply serde(default) to config sections
Each subsection has to have `serde(default)` to get the behaviour we want
(delete all fields except the ones that have been changed); otherwise, we can
delete only entire sections.
2020-06-18 17:43:36 -04:00
Henry de Valence 4b8f07ebb2 zebrad: Add reference to config docs. 2020-06-18 17:43:36 -04:00
Alfredo Garcia b8f174ee3a change config module to generate 2020-06-18 12:44:02 -07:00
Jane Lusby 7f8a336b69 switch to on_disk state service for start cmd 2020-06-17 23:30:50 -07:00
Jane Lusby df18ac72c5 fix sharedpeererror to propagate tracing context 2020-06-17 14:38:26 -07:00
Jane Lusby 06fd3b2503 be more explicit with pattern in drain_requests 2020-06-16 12:04:45 -07:00
Jane Lusby b0ecd019b6 apply comments from code review 2020-06-16 12:04:45 -07:00
Jane Lusby d09c339dc5 little more cleaning 2020-06-16 12:04:45 -07:00
Jane Lusby 528fd2b5b1 add an outline of the structure of the node 2020-06-16 12:04:45 -07:00
Jane Lusby fc96a41b18 copy connect command into start command 2020-06-16 12:04:45 -07:00
Jane Lusby df656a8bf0
Reorganize `connect` subcommand for readability (#450) 2020-06-12 09:20:58 -07:00
Jane Lusby 431f194c0f
propagate errors out of zebra_network::init (#435)
Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service.

This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.
2020-06-09 12:24:28 -07:00
Deirdre Connolly 42cc55b0bb Remove testing tokio task
That fires 'GetPeers' requests at our running 'zebra seed'.
2020-06-08 19:26:23 -04:00
Deirdre Connolly 43b77b080e Fix 'dos' feature for seed command, and Buffer the seed service 2020-06-08 19:26:23 -04:00
Deirdre Connolly 8f5e7c268b Request::Peers not GetPeers 2020-06-08 19:26:23 -04:00
Jane Lusby 9bcda0f9c7 Wrap Blocks in Arc throughout codebase 2020-06-05 00:36:55 -04:00
Jane Lusby 18b4dbc16c
fix tracing configuration issues (#432) 2020-06-04 19:34:06 -07:00
dependabot-preview[bot] 07b7c711fb Bump color-eyre from 0.3.2 to 0.3.4
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.3.2 to 0.3.4.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Commits](https://github.com/yaahc/color-eyre/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-02 22:55:02 -04:00
Jane Lusby e9af80b875
Add initial version of zebra-state (#414)
* rename zebra-storage to zebra-state

* Setup initial skeleton for zebra-state

* add test

* Apply suggestions from code review

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* move shared test vectors to a common crate

Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
2020-06-02 16:16:17 -07:00
dependabot-preview[bot] 8e7d91b4a3 Bump eyre from 0.4.2 to 0.4.3
Bumps [eyre](https://github.com/yaahc/eyre) from 0.4.2 to 0.4.3.
- [Release notes](https://github.com/yaahc/eyre/releases)
- [Commits](https://github.com/yaahc/eyre/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-02 18:48:50 -04:00
dependabot-preview[bot] a62b93d47c Bump hyper from 0.13.5 to 0.13.6
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.5 to 0.13.6.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.5...v0.13.6)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-30 00:29:05 -04:00
Jane Lusby da72c5a86a
switch from abscissa::Context to color-eyre (#409)
Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-05-28 23:01:24 -04:00
Jane Lusby 8c178c3ee4
fix panic in seed subcommand (#401)
Co-authored-by: Jane Lusby <jane@zfnd.org>

Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs. 

First, zebrad was not configured to treat panics as non-recoverable and instead used the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application: we do not need to maximize uptime or minimize the extent of an outage should one of our tasks / services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections.

The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.
2020-05-27 17:40:12 -07:00
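The second fix in 8c178c3ee4 (storing peers in a `HashSet` instead of a `Vec`) can be illustrated with the standard library alone. This is a sketch under assumptions: `dedup_initial_peers` is a hypothetical helper, and parsing the configured strings into `SocketAddr` values is an illustrative choice rather than what the seed command necessarily does.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical helper: collects configured peer addresses into a set, so a
/// duplicate entry in the config cannot produce two connections to one peer.
/// Entries that fail to parse are skipped in this sketch.
fn dedup_initial_peers(configured: &[&str]) -> HashSet<SocketAddr> {
    configured
        .iter()
        .filter_map(|s| s.parse::<SocketAddr>().ok())
        .collect()
}
```

Because uniqueness is a property of the container, any code that later iterates over the set can rely on seeing each address once, without a separate deduplication pass.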
Jane Lusby b6b35364f3 cleanup warnings throughout codebase 2020-05-27 15:42:29 -04:00
dependabot-preview[bot] 46cb7c02f2 Bump once_cell from 1.3.1 to 1.4.0
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.3.1 to 1.4.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.3.1...v1.4.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-13 22:28:18 -04:00
dependabot-preview[bot] 83ee4c2ca3 Bump hyper from 0.13.4 to 0.13.5
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.4 to 0.13.5.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.4...v0.13.5)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-04-18 08:43:40 +00:00
dependabot-preview[bot] 6867449ff4 Bump hyper from 0.13.3 to 0.13.4
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.3 to 0.13.4.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.3...v0.13.4)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-20 21:05:17 -04:00
Henry de Valence dd8ba287bf Correct block version parsing. 2020-03-18 21:34:02 -04:00
dependabot-preview[bot] cf0348b4f5 Bump hyper from 0.13.2 to 0.13.3
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.2 to 0.13.3.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.2...v0.13.3)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-03 20:16:27 -05:00
Henry de Valence 81500dfe11 Add Zebra logotype. 2020-02-26 21:25:35 -08:00
Henry de Valence cd6deea7e1 Clarify that it's the ZF discord and that it's engineering-focused. 2020-02-26 21:25:35 -08:00
Henry de Valence ff3efd504c Add Zebra logo to all workspace crates.
Also add html_root_url attributes.
2020-02-26 21:25:35 -08:00
Henry de Valence f98cda40f9 Remove unused import. 2020-02-21 06:48:25 -05:00
Henry de Valence 9c357eaf1e Use retries for FindBlocks requests. 2020-02-21 06:48:25 -05:00
Henry de Valence b951f13f06 Add a `revhex` utility command to reverse endianness.
This makes it easier to translate block hashes output by our debug logs into
the format used by other tools.
2020-02-21 06:48:25 -05:00
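For readers unfamiliar with what reversing the endianness of a hex string means in b951f13f06, here is a minimal sketch. The function name `byte_reverse_hex` appears in later commit messages in this log, but the signature and error handling below are assumptions made for illustration.

```rust
/// Reverses the byte order of a hex string, for example "0a1b2c" becomes
/// "2c1b0a". Returns `None` for odd-length or non-hex input.
fn byte_reverse_hex(input: &str) -> Option<String> {
    let s = input.trim();
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Each byte is two hex characters, so reverse the pairs rather than
    // the individual characters.
    let pairs: Vec<&str> = (0..s.len()).step_by(2).map(|i| &s[i..i + 2]).collect();
    Some(pairs.into_iter().rev().collect())
}
```

Reversing characters instead of pairs would turn `0a` into `a0` and change the value of every byte, which is the usual mistake when doing this conversion by hand.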
Henry de Valence afa2c2347f fmt 2020-02-21 06:48:25 -05:00
Henry de Valence 8bff6ada6c Prevent a crash serializing configs. 2020-02-14 20:14:05 -05:00
Henry de Valence 75d3d44fb3 Metrics MVP: add two metrics and export them to Prometheus.
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2020-02-14 20:14:05 -05:00
Henry de Valence b443d7a4be Fix clippy lints. 2020-02-14 18:23:41 -05:00
Henry de Valence 7ba007f23d Exercise network functionality by downloading lots of blocks.
(Don't check any information about them, just blindly download).
2020-02-14 18:23:41 -05:00
Henry de Valence 7049f9d891 Add a FindBlocks request to get initial block hashes.
Bitcoin does this either with `getblocks` (returns up to 500 following block
hashes) or `getheaders` (returns up to 2000 following block headers, not
just hashes).  However, Bitcoin headers are much smaller than Zcash
headers, which contain a giant Equihash solution block, and many Zcash
blocks don't have many transactions in them, so the block header is
often similarly sized to the block itself.  Because we're
aiming to have a highly parallel network layer, it seems better to use
`getblocks` to implement `FindBlocks` (which is necessarily sequential)
and parallelize the processing of the block downloads.
2020-02-14 18:23:41 -05:00
Henry de Valence 3c9b5612f3 Update zebrad docs and README. 2020-02-12 12:50:55 -08:00
Henry de Valence 29f901add3 Rename Response::Ok to Response::Nil.
This is a better name because it signals "no data in response" rather
than "Ok", which is semantically mixed with `Ok/Err` of `Result`.
2020-02-10 09:03:56 -08:00
Henry de Valence 2c0f48b587 Refactor connection logic and try a block request.
Attempting to implement requests for block data revealed a problem with
the previous connection logic.  Block data is requested by sending a
`getdata` message with hashes of the requested blocks; the peer responds
with a sequence of `block` messages with the blocks themselves.

However, this wasn't possible to handle with the previous connection
logic, which could only convert a single Bitcoin message into a
Response.  Instead, we factor out the message handling logic into a
Handler, which can statefully accumulate arbitrary data into a Response
and signal completion.  This is still pretty ugly but it does work.

As a side effect, the HeartbeatNonceMismatch error is removed; because
the Handler now tries to process messages until it comes to a Response,
it just ignores mismatched nonces (and will eventually time out).

The previous Mempool and Transaction requests were removed but could be
re-added in a different form later.  Also, the `Get` prefixes are
removed from `Request` to tidy the name.
2020-02-10 09:03:56 -08:00
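The Handler idea in 2c0f48b587 (statefully accumulating several messages into one response and signalling completion) might look roughly like the sketch below. Every name here (`Message`, `BlockHandler`, `process`) is hypothetical and the types are heavily simplified; it only shows the shape of the technique, not zebra-network's actual connection code.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a block hash.
type Hash = u64;

/// Heavily simplified message type: a `block` message or anything else.
#[derive(Debug, PartialEq)]
enum Message {
    Block { hash: Hash },
    Other,
}

/// Accumulates `block` messages until every requested hash has arrived.
struct BlockHandler {
    pending: HashSet<Hash>,
    received: Vec<Hash>,
}

impl BlockHandler {
    fn new(requested: &[Hash]) -> Self {
        BlockHandler {
            pending: requested.iter().copied().collect(),
            received: Vec::new(),
        }
    }

    /// Feeds one message to the handler. Returns `Some(response)` once the
    /// request is complete; unrelated messages are ignored.
    fn process(&mut self, msg: Message) -> Option<Vec<Hash>> {
        if let Message::Block { hash } = msg {
            // Only count blocks that were actually requested.
            if self.pending.remove(&hash) {
                self.received.push(hash);
            }
        }
        if self.pending.is_empty() {
            Some(std::mem::take(&mut self.received))
        } else {
            None
        }
    }
}
```

A handler of this shape never fails on an unexpected message; as the commit message notes for mismatched nonces, the request would instead have to end through a timeout, which this sketch leaves out.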
Henry de Valence 9273f83761 Remove tracing-subscriber.
We don't need to handle the subscriber directly since we upstreamed this
functionality into Abscissa.
2020-02-05 14:05:46 -08:00
Henry de Valence f04f4f0b98 Apply clippy fixes 2020-02-05 12:42:32 -08:00
dependabot-preview[bot] dd24dbece3 Bump hyper from 0.13.1 to 0.13.2
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.1 to 0.13.2.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.1...v0.13.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-01-29 17:25:34 -05:00
Deirdre Connolly 08012f058a cargo fmt 2020-01-28 03:48:23 -05:00
dependabot-preview[bot] dbe7427f58 Bump once_cell from 1.3.0 to 1.3.1
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.3.0 to 1.3.1.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.3.0...v1.3.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-01-18 23:26:25 -05:00
Henry de Valence e492cf067e Disable version string test. 2020-01-15 12:06:31 -08:00
Henry de Valence 4fcb550aa6 Fix a deadlock in TokioComponent.
The components are accessed by a lock on application state.  When some command
calls block_on to enter an async context, it obtained a write lock on the
entire application state.  This meant that if the application state were
accessed later in an async context, a deadlock would occur.  Instead the
TokioComponent holds an Option<Runtime> now, so that before calling block_on,
the caller can .take() the runtime and release the lock.  Since we only ever
enter an async context once, it's not a problem that the component is then
missing its runtime, as once we are inside of a task we can access the runtime.
2020-01-15 12:06:31 -08:00
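The deadlock fix in 4fcb550aa6 relies on taking the runtime out of an `Option` so the lock on application state is released before entering the async context. The sketch below shows that pattern with the standard library only: `Runtime` here is a hypothetical stand-in (its `block_on` just runs a closure), and `AppState` and `run` are illustrative names, not Abscissa or zebrad APIs.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for an async runtime.
struct Runtime;

impl Runtime {
    /// Runs `f` to completion, standing in for a real `block_on`.
    fn block_on<T>(self, f: impl FnOnce() -> T) -> T {
        f()
    }
}

struct TokioComponent {
    rt: Option<Runtime>,
}

struct AppState {
    tokio: TokioComponent,
    name: String,
}

fn run(state: &RwLock<AppState>) -> String {
    // The write guard is a temporary, so it is dropped at the end of this
    // statement; only the owned Runtime escapes.
    let rt = state
        .write()
        .unwrap()
        .tokio
        .rt
        .take()
        .expect("runtime already taken");
    // No lock is held here, so code inside block_on can lock the state again.
    rt.block_on(|| state.read().unwrap().name.clone())
}
```

Had `run` kept the write guard alive across `block_on`, the `read()` inside the closure would block forever on the same thread, which is the deadlock the commit describes.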
Henry de Valence ab3db201ee Change TracingEndpoint to forward to the Abscissa Tracing component. 2020-01-15 12:06:31 -08:00
Tony Arcieri 45eb81a204 Upgrade to Abscissa v0.5 2020-01-15 12:06:31 -08:00
Henry de Valence d3e954cd4a Remove vestigial tower git dep 2020-01-15 14:23:01 -05:00
dependabot-preview[bot] 3c77c6f685 Bump hyper from 0.13.0 to 0.13.1
Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.0 to 0.13.1.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.13.0...v0.13.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-12-17 17:11:28 -05:00
Henry de Valence 2965187b91 Upgrade tokio, futures, hyper to released versions. 2019-12-13 17:42:15 -05:00
Deirdre Connolly 189d89a7fc Handle 'mempool' messages as 'GetMempool' requests
With a 'Transactions' response that gets turned into an 'Inv(Vec<InventoryHash::Tx>)' message.

We don't yet handle a response from our peer for a 'mempool', which will have to be
a more generic 'Inv' type because we might receive transaction hashes we don't know about yet.

Pertains to #26
2019-11-18 15:55:25 -05:00
Henry de Valence 6d79352fb6 Remove nightly toolchain pin since async/await is stable. 2019-11-16 00:11:14 -05:00
Henry de Valence 3b02b40758 Simplify tracing output. 2019-11-13 18:43:18 -05:00
Henry de Valence ec4f6bd9ea Allow using the connect stub to test address messages. 2019-11-13 18:43:18 -05:00
Henry de Valence 9a0bffecb8 Sanitize outbound address responses.
This aims to prevent a remote peer from inspecting timings of all messages
received by this node.
2019-11-13 18:43:18 -05:00
Deirdre Connolly e5aa02bbd4 Remove special wait, unneeded for seed
Co-Authored-By: Henry de Valence <hdevalence@hdevalence.ca>
2019-11-12 22:39:47 -05:00
Deirdre Connolly bdba52936e Unwrap address_book in call(), which relies on poll_ready giving a positive response first, otherwise panic
Co-Authored-By: Henry de Valence <hdevalence@hdevalence.ca>
2019-11-12 22:39:47 -05:00
Deirdre Connolly fb19febe26 Remove config override, not needed 2019-11-12 22:39:47 -05:00
Deirdre Connolly 4923e0d783 Update tracing invocation to be better manipulated
Co-Authored-By: Henry de Valence <hdevalence@hdevalence.ca>
2019-11-12 22:39:47 -05:00
Deirdre Connolly 73d777fe65 Update `Ok(None)` case logging. 2019-11-12 22:39:47 -05:00
Deirdre Connolly 0f20ff59c7 Clean up SeedService.poll_ready with a 'ref mut'
Co-Authored-By: Henry de Valence <hdevalence@hdevalence.ca>
2019-11-12 22:39:47 -05:00
Deirdre Connolly 9d8e32d05f Update `seed` subcommand description
Co-Authored-By: Henry de Valence <hdevalence@hdevalence.ca>
2019-11-12 22:39:47 -05:00
Deirdre Connolly fe2a1ec1ea Remove autogenerated Abscissa doc comments 2019-11-12 22:39:47 -05:00
Deirdre Connolly a2292d94a0 Clean up some logging and comments on seed service 2019-11-12 22:39:47 -05:00
Deirdre Connolly d6ab549fd5 Yay, SeedService makes a remote 'connect' happy 2019-11-12 22:39:47 -05:00
Deirdre Connolly 4d3ab201e6 seed command seems to be functional
Moved SeedService out of the command closure. The command currently spawns
a tokio task to DOS the seed service with `Request::GetPeers` every
second.

Pertains to #54
2019-11-12 22:39:47 -05:00
Deirdre Connolly fee75b5da8 Add SeedService
This may need some cleaning up, but this is the first iteration to appease the compiler.
2019-11-12 22:39:47 -05:00
Deirdre Connolly 0ac1b663fe Keep sets of initial peers as Strings in config file 2019-11-12 22:39:47 -05:00
Deirdre Connolly b5bbef5c47 Default init seed nodes based on network choice
And more fleshed out but incomplete
2019-11-12 22:39:47 -05:00
Henry de Valence f588f5d368 Remove connect loop 2019-10-26 19:54:17 -04:00
Henry de Valence 027bdc8465 Rework initial crawler logic.
This splits out the connection handling code into a try_connect closure, which
could be refactored into a Service of its own.

On creation, when we are likely to have very few peers, launch many concurrent
connections to the first few candidates in the initial candidate set, before
continuing to grow the peer set according to demand signals.
2019-10-22 19:06:08 -07:00