Commit Graph

707 Commits

Author SHA1 Message Date
Marek 451448ef99
Remove unused error variants (#2941)
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-22 20:25:01 +00:00
Marek 4f7a977565
Test multiple chain resets (#2897)
* Try simulating a chain growth

* Adjust the transaction expiry height

The mempool evicts expired transactions. When working with mocked data,
appending a new block typically clears the mempool because transactions become
expired. For this reason, the expiry height of each transaction is adjusted so
that it is greater than the new chain tip's height.

* Refactor the code so that it works with `VerifiedUnminedTx`

* Fix a typo

* Fix clippy warnings

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-10-22 02:54:08 +00:00
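The expiry adjustment described in the commit above can be illustrated with a minimal sketch. Everything here is hypothetical: `SketchTx`, `evict_expired` and `keep_alive` are illustration-only names, and the rule "expired once the tip height reaches the expiry height" is an assumption made for the example, not a statement of Zebra's exact consensus check.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a mempool transaction.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SketchTx {
    pub id: u32,
    /// Assumed semantics: expired once the tip height reaches this value.
    pub expiry_height: u32,
}

/// Removes expired transactions from the mock mempool and returns their IDs, sorted.
pub fn evict_expired(mempool: &mut HashMap<u32, SketchTx>, tip_height: u32) -> Vec<u32> {
    let mut expired: Vec<u32> = mempool
        .values()
        .filter(|tx| tx.expiry_height <= tip_height)
        .map(|tx| tx.id)
        .collect();
    expired.sort_unstable();
    for id in &expired {
        mempool.remove(id);
    }
    expired
}

/// Test helper: raise the expiry height above the new tip so the
/// transaction survives when a mocked block is appended.
pub fn keep_alive(tx: &mut SketchTx, new_tip_height: u32) {
    if tx.expiry_height <= new_tip_height {
        tx.expiry_height = new_tip_height + 1;
    }
}
```

A test that grows the chain would call `keep_alive` on each mocked transaction before inserting it, so that only deliberately expired transactions are evicted.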
teor 67327ac462
Downgrade some less interesting info-level logs to debug (#2938)
There are a lot of these messages when Zebra starts up.
They might be slowing down CI and causing timeouts.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-10-22 02:11:09 +00:00
Janito Vaqueiro Ferreira Filho 595d75d5fb
Fix synchronization delay issue (#2921)
* Create a `NowOrLater` helper type

A replacement for `FutureExt::now_or_never` that ensures that the task
is scheduled for waking up later when the inner future is ready.

* Use `NowOrLater` to fix possible delay bug

Previous usage of `now_or_never` meant that the underlying task wasn't
being scheduled to awake when the `Downloads` stream produced a new
item. Using `NowOrLater` instead fixes that issue.
2021-10-21 10:34:12 +10:00
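A rough sketch of the idea behind `NowOrLater`, using only the standard library. The shape below (a wrapper future that resolves to an `Option`) is an assumption based on the commit description, not the actual Zebra type. The key point is that the inner future is polled with the caller's `Context`, so if it is pending, the caller's waker is registered and the task is woken when the inner future becomes ready.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// Hypothetical sketch of a `now_or_never` replacement.
/// Resolves immediately to `Some(output)` if the inner future is ready,
/// or to `None` if it is still pending.
pub struct NowOrLater<F>(pub F);

impl<F: Future + Unpin> Future for NowOrLater<F> {
    type Output = Option<F::Output>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Polling with the caller's context (rather than a no-op waker)
        // lets the inner future schedule the calling task for a later wake-up.
        match Pin::new(&mut self.0).poll(cx) {
            Poll::Ready(output) => Poll::Ready(Some(output)),
            Poll::Pending => Poll::Ready(None),
        }
    }
}
```

Inside a `poll` method or an async block, awaiting `NowOrLater(&mut some_future)` gives the "check without blocking" behaviour while keeping the wake-up registration that a no-op waker would lose.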
teor e277975d85
Try flushing streams before exiting Zebra (#2911)
2021-10-20 13:57:09 +00:00
Conrado Gouvea 84f2c07fbc
Ignore AlreadyInChain error in the syncer (#2890)
* Ignore AlreadyInChain error in the syncer

* Split Cancelled errors; add them to should_restart_sync exceptions

* Also filter 'block is already committed'; try to detect a wrong downcast
2021-10-20 11:07:19 +10:00
Conrado Gouvea a5d1467624
Additional mempool metrics (#2878)
* Rename tx downloader & verifier metrics

* Add version to mempool metrics

* Add new metrics

* Make sure mempool gauges are zeroed when instances are dropped

* Updated mempool grafana dashboard

* Removed transaction verification dashboard; moved to mempool

* Update mempool dashboard

* Add reason to error labels in mempool dashboard

* Rename some metrics per review

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-19 17:10:25 +00:00
Conrado Gouvea 128b8be95d
Improve mempool::downloads documentation (#2879)
* Improve mempool::downloads documentation

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-18 20:02:40 +00:00
teor 303c8cf5ef
Add a queue checker task, to make sure mempool transactions propagate (#2888)
* Guarantee unique IDs in mempool service responses

* Guarantee unique IDs in crawler task mempool Queue requests

Also update the tests to use unique IDs.

* Add a CheckForVerifiedTransactions mempool request

Also document the mempool request and response variants.

* Spawn a QueueChecker task to check for newly verified transactions

This task makes sure that transactions reliably propagate,
rather than relying on peer requests or responses to trigger propagation.

* Update the start command documentation

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-18 19:23:21 +00:00
teor 42ce79aad9
Cancel pending download tasks when the mempool is disabled (#2886)
* Impl Drop, Default and take() for ActiveState

* Refactor Mempool::poll_ready to check disabled and reset first

Also remove some levels of nesting.

* Use the same code for dropping and resetting the mempool

* Document where the tasks are dropped when switching states

* Log mempool resets at info level

And add heights to mempool enable/disable/reset logs

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-18 17:39:56 +00:00
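The `Drop`, `Default` and `take()` combination from the commit above can be sketched as follows. This is an illustration under assumptions: the `Task` type with a shared `cancelled` flag is hypothetical and stands in for real download task handles, and the variant fields are invented for the example.

```rust
use std::mem;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a pending download task.
pub struct Task {
    pub cancelled: Arc<AtomicBool>,
}

/// Sketch of a mempool state enum; `Disabled` is the default.
#[derive(Default)]
pub enum ActiveState {
    #[default]
    Disabled,
    Enabled { tasks: Vec<Task> },
}

impl ActiveState {
    /// Replaces `self` with the default (`Disabled`) and returns the old state.
    pub fn take(&mut self) -> ActiveState {
        mem::take(self)
    }

    pub fn is_enabled(&self) -> bool {
        matches!(self, ActiveState::Enabled { .. })
    }
}

impl Drop for ActiveState {
    fn drop(&mut self) {
        // Dropping an enabled state cancels every pending task,
        // so disabling and resetting can share one code path.
        if let ActiveState::Enabled { tasks } = self {
            for task in tasks.drain(..) {
                task.cancelled.store(true, Ordering::SeqCst);
            }
        }
    }
}
```

With this shape, disabling the mempool is `drop(state.take())`, and a reset is the same call followed by installing a fresh `Enabled` state.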
teor 40c907dd09
Remove duplicate IDs in mempool requests and responses (#2887)
* Guarantee unique IDs in mempool service responses

* Guarantee unique IDs in crawler task mempool Queue requests

Also update the tests to use unique IDs.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-18 15:31:11 +00:00
Alfredo Garcia fb02ce5925
Add docs to storage and mempool gossip mods (#2884)
* add some docs to storage and mempool gossip mods

* fix grammar

Co-authored-by: Conrado Gouvea <conradoplg@gmail.com>
2021-10-18 14:48:40 +00:00
teor 2d129414e0
Store the transaction fee in the mempool storage (#2885)
* Create a new VerifiedUnminedTx containing the miner fee

* Use VerifiedUnminedTx in mempool verification responses

And do a bunch of other cleanups.

* Use VerifiedUnminedTx in mempool download and verifier

* Use VerifiedUnminedTx in mempool storage and verified set

* Impl Display for VerifiedUnminedTx, and some convenience methods

* Use VerifiedUnminedTx in existing tests
2021-10-18 11:24:37 +10:00
Deirdre Connolly 4648f8fc70
Make some mempool functions associated with the `mempool::Storage` type (#2872)
* Make some mempool functions associated with the Mempool type

* Move some functions to methods on mempool::Storage
2021-10-15 15:03:13 -03:00
Marek 002c533ea8
Return transaction fee (#2876)
* Get the transaction fee from utxos

* Return the transaction fee from the verifier

* Avoid calculating the fee for coinbase transactions

Coinbase transactions don't have fees. In case of a coinbase transaction, the
verifier returns a zero fee.

* Update the result obtained by `Downloads`
2021-10-15 07:15:10 +10:00
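As a simplified sketch of the fee rule described above: the fee is the value of the spent inputs minus the value of the outputs, and a coinbase transaction reports a zero fee. The `FeeTx` type and `miner_fee` function are hypothetical, and the sketch only models transparent values; it ignores shielded value balances entirely.

```rust
/// Hypothetical, transparent-only stand-in for a transaction.
pub struct FeeTx {
    pub is_coinbase: bool,
    /// Values of the UTXOs spent by the inputs, in zatoshis.
    pub input_values: Vec<u64>,
    /// Values of the newly created outputs, in zatoshis.
    pub output_values: Vec<u64>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FeeError {
    Overflow,
    OutputsExceedInputs,
}

/// Sums values with overflow checking.
fn total(values: &[u64]) -> Result<u64, FeeError> {
    values
        .iter()
        .try_fold(0u64, |sum, value| sum.checked_add(*value))
        .ok_or(FeeError::Overflow)
}

/// Returns the miner fee, or zero for coinbase transactions.
pub fn miner_fee(tx: &FeeTx) -> Result<u64, FeeError> {
    if tx.is_coinbase {
        // Coinbase transactions have no spent UTXOs, so no fee is calculated.
        return Ok(0);
    }
    let inputs = total(&tx.input_values)?;
    let outputs = total(&tx.output_values)?;
    inputs.checked_sub(outputs).ok_or(FeeError::OutputsExceedInputs)
}
```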
teor 70ec51d770
Zero the mempool metrics when the mempool is disabled (#2875)
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-14 14:41:07 +00:00
teor a21dd26a5e
Insert new mempool transactions, then check for rejections (#2874)
Previously, we checked some rejections before inserting,
so we could accept some new transactions that should be rejected.
2021-10-14 10:19:21 -03:00
teor b64ed62777
Add a debug config that enables the mempool (#2862)
* Update some comments

* Add a mempool debug_enable_at_height config

* Rename a field in the mempool crawler

* Propagate syncer channel errors through the crawler

We don't want to ignore these errors, because they might indicate a shutdown.
(Or a bug that we should fix.)

* Use debug_enable_at_height in the mempool crawler

* Log when the mempool is activated or deactivated

* Deny unknown fields and apply defaults for all configs

* Move Duration last, as required for TOML tables

* Add a basic mempool acceptance test

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-13 15:04:49 +00:00
Alfredo Garcia 8120e8abac
Avoid broadcasting mempool rejected or expired transactions to peers (#2858)
* do not advertise rejected transactions

* do not broadcast transactions that are expired

* change dummy var name

* simplify code, performance

* clippy

* add some test coverage

* clippy

Co-authored-by: teor <teor@riseup.net>
2021-10-13 00:50:35 +00:00
Deirdre Connolly a393720560
Revert "Compute serialized size on the fly" (#2865)
This reverts commit 5cf5641b9b.
2021-10-12 23:01:24 +00:00
Conrado Gouvea 5cf5641b9b Compute serialized size on the fly 2021-10-12 16:53:59 -04:00
Conrado Gouvea 6734a01aa7 Add zcash.mempool.size.transactions and zcash.mempool.size.bytes metrics 2021-10-12 16:53:59 -04:00
teor b274ee4066
Pass the mempool config to the mempool (#2861)
* Split mempool config into its own module

Also:
- expand config docs
- clean up mempool imports

* Pass the mempool config to the mempool

* Create the transaction sender channel inside the mempool 1/2

This simplifies all the code that calls the mempool.

Also:
- update the mempool enabled state before returning the new mempool
- add some test module doc comments

* Refactor a setup function out of the mempool unit tests 2/2

Also:
- update the setup function to handle the latest mempool changes

* Clarify a comment

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-12 17:31:54 +00:00
teor 2f0926a8e4
Stop ignoring the mempool conflicting transaction reject list size limit (#2855)
* Limit the size of rejection lists when there is a spend conflict

Previously, `insert` would return early with an error,
and skip limiting the rejection list sizes.

* Use prop_assert macros in proptests, rather than assert
2021-10-12 10:35:50 +10:00
Conrado Gouvea 31e7a21721
Add expired transactions to the mempool rejected list (#2852)
* Add expired transactions to the mempool rejected list

* Apply suggestions from code review

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Refactor contains_rejected() by just calling rejection_error()

* Improve rejection_error() documentation

* Improve rejected_transaction_count() documentation

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: teor <teor@riseup.net>
2021-10-11 21:23:43 +00:00
Janito Vaqueiro Ferreira Filho 9e78a8af40
Refactor mempool spend conflict checks to increase performance (#2826)
* Add `HashSet`s to help spend conflict detection

Keep track of the spent transparent outpoints and the revealed
nullifiers.

Clippy complained that the `ActiveState` had variants with large size
differences, but that was expected, so I disabled that lint on that
`enum`.

* Clear the `HashSet`s when clearing the mempool

Clear them so that they remain consistent with the set of verified
transactions.

* Use `HashSet`s to check for spend conflicts

Store each new output in its respective `HashSet`, and abort if a
duplicate output is found.

* Remove inserted outputs when aborting

Restore the `HashSet` to its previous state.

* Remove tracked outputs when removing a transaction

Keep the mempool storage in a consistent state when a transaction is
removed.

* Remove tracked outputs when evicting from mempool

Ensure eviction also keeps the tracked outputs consistent with the
verified transactions.

* Refactor to create a `VerifiedSet` helper type

Move the code to handle the output caches into the new type. Also move
the eviction code to make things a little simpler.

* Refactor to have a single `remove` method

Centralize the code that handles the removal of a transaction to avoid
mistakes.

* Move mempool size limiting back to `Storage`

Because the evicted transactions must be added to the rejected list.

* Remove leftover `dbg!` statement

Leftover from some temporary testing code.

Co-authored-by: teor <teor@riseup.net>

* Remove unnecessary `TODO`

It is more speculation than planning, so it doesn't add much value.

Co-authored-by: teor <teor@riseup.net>

* Fix typo in documentation

The verb should match the subject "transactions" which is plural.

Co-authored-by: teor <teor@riseup.net>

* Add a comment to warn about correctness

There's a subtle but important detail in the implementation that should
be made more visible to avoid mistakes in the future.

Co-authored-by: teor <teor@riseup.net>

* Remove outdated comment

Left-over from the attempt to move the eviction into the `VerifiedSet`.

* Improve comment explaining lint removal

Rewrite the comment explaining why the Clippy lint was ignored.

* Check for spend conflicts in `VerifiedSet`

Refactor to avoid API misuse.

* Test rejected transaction rollback

Using two transactions, perform the same test adding a conflict to both
of them to check if the second inserted transaction is properly
rejected. Then remove any conflicts from the second transaction and add
it again. That should work, because if it doesn't it means that when the
second transaction was rejected it left things it shouldn't in the
cache.

* Test removal of multiple transactions

When removing multiple transactions from the mempool storage, all of the
ones requested should be removed and any other transaction should
still be there afterwards.

* Increase mempool size to 4, so that spend conflict tests work

If the mempool size is smaller than 4,
these tests don't fail on a trivial removal bug,
because we need a minimum number of transactions in the mempool
to trigger the bug.

Also commit a proptest seed that fails on a trivial removal bug.
(This seed fails if we remove indexes in order,
because every index past the first removes the wrong transaction.)

* Summarise transaction data in proptest error output

* Summarise spend conflict field data in proptest error output

* Summarise multiple removal field data in proptest error output

And replace the very large proptest debug output with the new summary.

Co-authored-by: teor <teor@riseup.net>
2021-10-10 23:54:46 +00:00
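The `HashSet`-based conflict check with rollback described in this commit can be sketched as below. This is a simplified illustration, not the actual `VerifiedSet`: it tracks only transparent outpoints (no nullifiers), and the `OutPoint` tuple, the `u32` transaction IDs and the method names are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical outpoint: (transaction hash stand-in, output index).
pub type OutPoint = (u64, u32);

#[derive(Debug, PartialEq, Eq)]
pub struct SpendConflict;

#[derive(Default)]
pub struct VerifiedSet {
    transactions: Vec<(u32, Vec<OutPoint>)>,
    spent_outpoints: HashSet<OutPoint>,
}

impl VerifiedSet {
    /// Inserts a transaction, or rejects it if any spend is already tracked.
    pub fn insert(&mut self, id: u32, spends: Vec<OutPoint>) -> Result<(), SpendConflict> {
        let mut inserted = Vec::new();
        for outpoint in &spends {
            if self.spent_outpoints.insert(*outpoint) {
                inserted.push(*outpoint);
            } else {
                // Roll back, so a rejected transaction leaves nothing in the cache.
                for outpoint in &inserted {
                    self.spent_outpoints.remove(outpoint);
                }
                return Err(SpendConflict);
            }
        }
        self.transactions.push((id, spends));
        Ok(())
    }

    /// Removes a transaction and the outpoints tracked for it.
    pub fn remove(&mut self, id: u32) -> bool {
        match self.transactions.iter().position(|(tx_id, _)| *tx_id == id) {
            Some(position) => {
                let (_, spends) = self.transactions.remove(position);
                for outpoint in &spends {
                    self.spent_outpoints.remove(outpoint);
                }
                true
            }
            None => false,
        }
    }

    /// Number of outpoints currently tracked as spent.
    pub fn tracked_outpoints(&self) -> usize {
        self.spent_outpoints.len()
    }
}
```

Keeping insertion, rollback and removal inside one type is what makes the single `remove` method and the rollback test from the commit straightforward: the cache can only change together with the transaction list.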
Alfredo Garcia 724967d488
Send `AdvertiseTransactionIds` to peers (#2823)
* broadcast transactions to peers after they get inserted into mempool

* remove network argument from mempool init

* remove dbg left

* remove return value in mempool enable call

* rename channel sender and receiver vars

* change unwrap() to expect()

* change the channel to a hashset

* fix build

* fix tests

* rustfmt

* fix tiny space issue inside macro

Co-authored-by: teor <teor@riseup.net>

* check errors/panics in transaction gossip tests

* fix build of newly added tests

* Stop dropping the inbound service and mempool in a test

Keeping the mempool around avoids a transaction broadcast task error,
so we can test that there are no other errors in the task.

* Tweak variable names and add comments

* Avoid unexpected drops by returning a mempool guard in tests

* Use BoxError to simplify service types in tests

* Make all returned service types consistent in tests

We want to be able to change the setup without changing the tests.

Co-authored-by: teor <teor@riseup.net>
2021-10-08 08:59:46 -03:00
Alfredo Garcia 0683e0b40b
Add a mempool config section (#2845)
* add mempool config section

* change to Duration

* fix typo in var name

* document consensus rules

* tweak var name
2021-10-07 22:47:37 +00:00
Conrado Gouvea dd1f0a6dcc
Add transactions that failed verification to the mempool rejected list (#2821)
* Add transactions that failed verification to the mempool rejected list

* Add tests

* Work with recent changes
2021-10-07 21:34:01 +00:00
teor 664d4384d4
Un-reject mempool transactions if the rejection depends on the current tip (#2844)
* Split mempool storage errors into tip-based and chain-based

* Expire tip rejections every time we get a new block

FailedVerification and SpendConflict rejections only apply to the current tip.
The next tip can provide missing inputs, or evict conflicting transactions.

* Enforce MAX_EVICTION_MEMORY_ENTRIES for mempool reject lists
2021-10-07 16:58:42 -03:00
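A minimal sketch of the split between tip-based and chain-based rejections. The type and method names are hypothetical, and two details are assumptions: the limit is passed in as a plain parameter rather than using the real value of `MAX_EVICTION_MEMORY_ENTRIES`, and a full list is simply cleared before the next insert.

```rust
use std::collections::HashSet;

/// Hypothetical reject lists keyed by a `u32` transaction ID.
pub struct RejectLists {
    /// Rejections that only hold for the current tip
    /// (for example failed verification or a spend conflict).
    tip_rejected: HashSet<u32>,
    /// Rejections that hold for the rest of the chain (for example expiry).
    chain_rejected: HashSet<u32>,
    max_entries: usize,
}

impl RejectLists {
    pub fn new(max_entries: usize) -> Self {
        Self {
            tip_rejected: HashSet::new(),
            chain_rejected: HashSet::new(),
            max_entries,
        }
    }

    /// Inserts into a list, clearing it first if it is already full.
    fn insert_limited(list: &mut HashSet<u32>, max_entries: usize, id: u32) {
        if max_entries == 0 {
            return;
        }
        if list.len() >= max_entries {
            list.clear();
        }
        list.insert(id);
    }

    pub fn reject_for_tip(&mut self, id: u32) {
        Self::insert_limited(&mut self.tip_rejected, self.max_entries, id);
    }

    pub fn reject_for_chain(&mut self, id: u32) {
        Self::insert_limited(&mut self.chain_rejected, self.max_entries, id);
    }

    /// A new block may supply missing inputs or evict conflicting
    /// transactions, so tip-based rejections are forgotten.
    pub fn on_new_tip(&mut self) {
        self.tip_rejected.clear();
    }

    pub fn contains(&self, id: u32) -> bool {
        self.tip_rejected.contains(&id) || self.chain_rejected.contains(&id)
    }
}
```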
teor b6a60c6c17 Fix names that are exact or effects depending on the list 2021-10-07 13:43:53 -04:00
teor e470ed00e6 Return rejected transaction ID iterators from mempool storage
Also rename methods for consistency.
2021-10-07 13:43:53 -04:00
teor 3357c58c41 Return transaction iterators from mempool storage
Also rename transaction methods for consistency.
2021-10-07 13:43:53 -04:00
teor 964c819c80
Split storage errors into same effects and exact matches (#2833)
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-07 15:04:23 +00:00
teor 04d2cfb3d0
Gossip recently verified block hashes to peers (#2729)
* Implement a task that gossips verified block hashes

* Log an info message for block broadcasts

* Simplify the gossip task

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Re-use the old tip change if there is no new tip change

Also improve the comments.

* Add an assertion message

* Rename task join handles and futures in start method

* Add a dedicated BlockGossipError type

This type helps distinguish between syncer and state errors.

* Test that committed blocks are gossiped to peers

Also do a minor type cleanup on the existing test code,
replacing `Option<Vec<_>>` with `Vec<_>`.

* Formatting

* Remove excess newlines

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

* Clear the initial gossiped blocks during test setup

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-07 07:46:37 -03:00
teor a3a4773047
Remove unused mempool errors (#2831)
* Remove unused mempool storage errors

Preparation for ticket #2819.

Removing these errors means that we don't have to decide
which type of transaction ID match we want for them.

* Remove unused mempool errors, and deduplicate storage errors

* rustfmt
2021-10-07 11:20:38 +10:00
teor d1ce8e3e6d
Remove transactions in newly committed blocks from the mempool (#2827)
* Add a mempool transaction removal method for mined IDs

And use this method to remove expired transactions,
because all transactions with the same mined ID expire at the same height.

* Remove mined transaction IDs from the mempool

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-07 09:45:14 +10:00
Alfredo Garcia f1718f5c92
Add `zcash_serialized_size()` to `ZcashSerialize` trait (#2824)
* add a zcash_serialized_size()

* add a size field to `UnminedTx`

* refactor zcash_serialized_size() to avoid allocating RAM

* improve performance

Co-authored-by: teor <teor@riseup.net>

* clippy

Co-authored-by: teor <teor@riseup.net>
2021-10-06 22:40:11 +00:00
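One common way to compute a serialized size without allocating is to serialize into a writer that only counts bytes. The sketch below shows that technique under assumptions: `ByteCounter`, `SketchSerialize` and `Memo` are hypothetical names for illustration, and this is not claimed to be how `ZcashSerialize` is actually implemented.

```rust
use std::io::{self, Write};

/// A writer that discards the data and only counts the bytes written.
#[derive(Default)]
pub struct ByteCounter(pub usize);

impl Write for ByteCounter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0 += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical serialization trait with a provided size method.
pub trait SketchSerialize {
    fn serialize<W: Write>(&self, writer: W) -> io::Result<()>;

    /// Computes the size by serializing into a `ByteCounter`,
    /// so no buffer is allocated for the serialized bytes.
    fn serialized_size(&self) -> usize {
        let mut counter = ByteCounter::default();
        self.serialize(&mut counter)
            .expect("writing to a byte counter never fails");
        counter.0
    }
}

/// Hypothetical type used to demonstrate the trait.
pub struct Memo {
    pub version: u32,
    pub payload: Vec<u8>,
}

impl SketchSerialize for Memo {
    fn serialize<W: Write>(&self, mut writer: W) -> io::Result<()> {
        writer.write_all(&self.version.to_le_bytes())?;
        writer.write_all(&self.payload)
    }
}
```

Because the size comes from the same `serialize` code path, it cannot drift out of sync with the real encoding.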
Janito Vaqueiro Ferreira Filho 5d9893cf31
Send crawled transaction IDs to downloader (#2801)
* Rename type parameter to be more explicit

Replace the single letter with a proper name.

* Remove imports for `Request` and `Response`

The type names will conflict with the ones for the mempool service.

* Attach `Mempool` service to the `Crawler`

Add a field to the `Crawler` type to store a way to access the `Mempool`
service.

* Forward crawled transactions to downloader

The crawled transactions are now sent to the transaction downloader and
verifier, to be included in the mempool.

* Derive `Eq` and `PartialEq` for `mempool::Request`

Make it simpler to use the `MockService::expect_request` method.

* Test if crawled transactions are downloaded

Create some dummy crawled transactions, and let the crawler discover
them. Then check if they are forwarded to the mempool to be downloaded
and verified.

* Don't send empty transaction ID list to downloader

Ignore response from peers that don't provide any crawled transactions.

* Log errors when forwarding crawled transaction IDs

Calling the Mempool service should not fail, so if an error happens it
should be visible. However, errors when downloading individual
transactions can happen from time to time, so there's no need for them
to be very visible.

* Document existing `mempool::Crawler` test

Provide some depth as to what the test expects from the crawler's
behavior.

* Refactor to create `setup_crawler` helper function

Make it easier to reuse the common test setup code.

* Simplify code to expect requests

Now that `zebra_network::Request` implements `Eq`, the call can be
simplified into `expect_request`.

* Refactor to create `respond_with_transaction_ids`

A helper function that checks for a network crawl request and responds
with the given list of crawled transaction IDs.

* Refactor to create `crawler_iterator` helper

A function to intercept and respond to the fanned-out requests sent
during a single crawl iteration.

* Refactor to create `respond_to_queue_request`

Reduce the repeated code necessary to intercept and reply to a request
for queuing transactions to be downloaded.

* Add `respond_to_queue_request_with_error` helper

Intercepts a mempool request to queue transactions to be downloaded, and
responds with an error, simulating an internal problem in the mempool
service implementation.

* Derive `Arbitrary` for `NetworkUpgrade`

This is required for deriving `Arbitrary` for some error types.

* Derive `Arbitrary` for `TransactionError`

Allow random transaction errors to be generated for property tests.

* Derive `Arbitrary` for `MempoolError`

Allow random Mempool errors to be generated for property tests.

* Test if errors don't stop the mempool crawler

The crawler should be robust enough to continue operating even if the
mempool service fails to download transactions or even fails to handle
requests to enqueue transactions.

* Reduce the log level for download errors

They should happen regularly, so there's no need to have them with a
high visibility level.

Co-authored-by: teor <teor@riseup.net>

* Stop crawler if service stops

If `Mempool::poll_ready` returns an error, it's because the mempool
service has stopped and can't handle any requests, so the crawler should
stop as well.

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-05 10:55:42 +10:00
Alfredo Garcia 21a3e434bc
remove some dead code attributes (#2820)
2021-10-01 15:59:59 -03:00
Janito Vaqueiro Ferreira Filho 50a5728d0b
Test if the mempool storage is cleared (#2815)
* Move mempool tests into `tests::vector` sub-module

Make it consistent with other test modules and prepare for adding
property tests.

* Reorder imports

Make it consistent with the general guidelines followed on other
modules.

* Export `ChainTipBlock` and `ChainTipSender`

Allow these types to be used in other crates for testing purposes.

* Derive `Arbitrary` for `ChainTipBlock`

Make it easy to generate random `ChainTipBlock`s for usage in property
tests.

* Refactor to move test methods into `tests` module

Reduce the repeated test configuration attributes and make it easier to
see what is test specific and what is part of the general
implementation.

* Add a `Mempool::dummy_call` test helper method

Performs a dummy call just so that `poll_ready` gets called.

* Use `dummy_call` in existing tests

Replace the custom dummy requests with the helper method.

* Test if the mempool is cleared on chain reset

A chain reset should force the mempool storage to be cleared so that
transaction verification can restart using the new chain tip.

* Test if mempool is cleared on syncer restart

If the block synchronizer falls behind and then starts catching up
again, the mempool should be disabled and therefore the storage should
be cleared.
2021-10-01 14:44:25 +00:00
Conrado Gouvea 16a4110475 Cancel all mempool download and verify tasks when a network upgrade activates 2021-09-30 16:35:39 -04:00
Conrado Gouvea 18acec6849
Send mined transaction IDs to the download/verify task for cancellation (#2786)
* Send mined transaction IDs to the download and verify task for cancellation

* Pass a HashSet of transaction hashes to be cancelled

* Add mempool_cancel_mined() test

* Fix starvation in test

* Fix typo in comment
2021-09-30 12:09:08 +10:00
Alfredo Garcia 37595c4b32
Mempool support for transaction expiration (#2774)
* mempool - support transaction expiration

* use `LatestChainTip` instead of state call

* clippy

* remove spawn task

* remove unneeded async from function

* remove return value

* add an `expiry_height_mut()` method to `Transaction` for testing purposes

* fix `remove_expired_transactions()`

* add a `mempool_transaction_expiration()` test

* tidy cleanup to `expiry_height()`

* improve docs

* fix the build

* try to fix macos build

* extend tests

* add doc to function

* clippy

* fix build

* start tests at block two
2021-09-29 16:52:44 +00:00
Conrado Gouvea c6878d9b63
Cancel download and verify tasks when the mempool is deactivated (#2764)
* Cancel download and verify tasks when the mempool is deactivated

* Refactor enable/disable logic to use a state enum

* Add helper test functions to enable/disable the mempool

* Add documentation about errors on service calls

* Improvements from review

* Improve documentation

* Fix bug in test

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>
2021-09-29 09:06:40 +10:00
Janito Vaqueiro Ferreira Filho a0d45c38f3
Reject conflicting mempool transactions (#2765)
* Add `Transaction::spent_outpoints` getter method

Returns an iterator over the UTXO `OutPoint`s spent by the transaction.

* Add `mempool::Error::Conflict` variant

An error representing that a transaction was rejected because it
conflicts with another transaction that's already in the mempool.

* Reject conflicting mempool transactions

Reject including a transaction in the mempool if it spends outputs
already spent by, or reveals nullifiers already revealed by another
transaction in the mempool.

* Fix typo in documentation

Remove the `r` that was incorrectly added.

Co-authored-by: teor <teor@riseup.net>

* Specify that the conflict is a spend conflict

Make the situation clearer, because there are other types of conflict.

Co-authored-by: teor <teor@riseup.net>

* Clarify that the outpoints are from inputs

Otherwise it could lead to confusion, because it could also mean
the outputs of the transaction represented as `OutPoint` references.

Co-authored-by: teor <teor@riseup.net>

* Create `storage::tests::vectors` module

Refactor to follow the convention used for other tests.

* Add an `AtLeastOne::first_mut` method

A getter to allow changing the first element.

* Add an `AtLeastOne::push` method

Allow appending elements to the collection.

* Derive `Arbitrary` for `FieldNotPresent`

This is just to make the code that generates arbitrary anchors a bit
simpler.

* Test if conflicting transactions are rejected

Generate two transactions (either V4 or V5) and insert a conflicting
spend, which can be either a transparent UTXO, or a nullifier for one of
the shielded pools. Check that any attempt to insert both transactions
causes one to be accepted and the other to be rejected.

* Delete a TODO comment that we decided not to do

Co-authored-by: teor <teor@riseup.net>
2021-09-28 01:03:08 +00:00
Conrado Gouvea b42ab67a4b
Add tests for mempool Request::Queue (#2770)
* Add tests for mempool Request::Queue

* Update test to work after refactoring

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-23 21:13:52 +00:00
teor 07e8926fd5
Send `Response::Nil` instead of sending empty `Message`s (#2791)
* Send `Response::Nil` instead of sending empty `Message`s

This matches `zcashd`'s behaviour more closely.

In most cases, the network layer filters these out already.
But this change makes the inbound service code clearer.

* revert changes made to `AdvertiseTransactionIds` and `PushTransaction`

* remove newline

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-09-23 19:58:00 +00:00
Marek 30c9618207
Clear mempool at a network upgrade (#2773)
* Update the expiry TODO

* Clear the mempool at a chain tip reset

* Clear the mempool by using a sync method (#2777)

* Clear the mempool by using a sync method

* Update docs

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Refactor last_tip_change()

* Apply suggestions from code review

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Fix brackets

* Use best_tip_block instead of manual borrowing

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-09-23 19:09:44 +00:00
Janito Vaqueiro Ferreira Filho 11b77afec7
Refactor mempool tests (#2771)
* Remove return of redundant vector length

An attempt to improve readability a bit by not returning a tuple with a
value that can be obtained from a single return type.

* Refactor `unmined_transactions_in_blocks`

Use a more functional style to try to make it a bit clearer.

* Use ranges in `unmined_transactions_in_blocks`

Allow a finer control over the block range to extract the transactions
from.

* Refactor mempool test code to improve clarity

It was previously not clear that only the first genesis transaction was
being used. The remaining transactions in the genesis block were
discarded and then readded later.

* Replace `oneshot` with `call`

Remove a redundant internal `ready_and` call.

* Return an `impl Iterator` instead of a `Vec<_>`

Remove unnecessary deserializations and heap allocations.

* Refactor `mempool_storage_basic_for_network` test

Make the separation between the transactions expected to be in the
mempool and those expected to be rejected clearer.

* Replace `Iterator` with `DoubleEndedIterator`

Allow getting the last transaction more easily.

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-09-23 13:54:14 +00:00
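The range, `impl Iterator` and `DoubleEndedIterator` changes listed above can be pictured with a small sketch. The function below is hypothetical: blocks are modelled as plain `Vec<u32>` lists of transaction IDs, and the name `unmined_ids_in_blocks` is an illustration, not the real helper's signature.

```rust
use std::ops::RangeInclusive;

/// Lazily yields the transaction IDs from the blocks in `block_range`.
///
/// Returning `impl DoubleEndedIterator` avoids building a `Vec`,
/// and lets callers take the last item with `next_back()`.
pub fn unmined_ids_in_blocks(
    blocks: &[Vec<u32>],
    block_range: RangeInclusive<usize>,
) -> impl DoubleEndedIterator<Item = u32> + '_ {
    blocks
        .get(block_range)
        // An out-of-bounds range yields no transactions in this sketch.
        .unwrap_or(&[])
        .iter()
        .flatten()
        .copied()
}
```

A test can then ask for `unmined_ids_in_blocks(&blocks, 1..=10).next_back()` to get the last transaction without collecting the rest.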
Alfredo Garcia 56636c85fc
Add missing tests for mempool inbound requests (#2769)
* Use `MockService` in inbound test

Refactor the `mempool_requests_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.

* Use `MockService` in the basic mempool test

Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.

* Remove the `mock_peer_set` helper function

It is not used anymore, since the usages were replaced with
`MockService`s.

* add tests for mempool inbound requests

* Use MockService for transaction verifier

* Refactor creation of mock `peer_set`

Use the same style as the mock transaction verifier.

* Derive `Eq` for `zebra_network::Request`

Make it easy to use the `MockService::expect_request` method.

* Return mocked peer set service from `setup`

Allow it to be used to respond to requests.

* Add bindings for the transaction used for testing

Allow them to be moved into futures later.

* Respond to transaction download request

Make sure that the test transaction appears to the mempool as if it had
been downloaded by the peer set service.

* Assert that no unexpected requests were received

Check that the mempool doesn't send unexpected requests to the peer set
service.

* add missing `expect_no_requests` to `mempool_advertise_transaction_ids` test

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-09-23 10:17:06 -03:00
Janito Vaqueiro Ferreira Filho 1f1bf2ec4d
Replace `mock_peer_set` function with `MockService` (#2790)
* Use `MockService` in inbound test

Refactor the `mempool_requests_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.

* Use `MockService` in the basic mempool test

Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.

* Remove the `mock_peer_set` helper function

It is not used anymore, since the usages were replaced with
`MockService`s.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-09-22 15:44:13 +00:00
teor 776432978c
Allow deliberate instances of the clippy::derivable_impls lint (#2788)
* Allow deliberate instances of the new nightly clippy::derivable_impls lint

We want our config defaults to be explicit.

Not so sure about the application defaults, but they also contain a config.

* Also allow unknown lint names

Stable doesn't know about this lint, but nightly does.
2021-09-22 10:43:27 -03:00
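A short sketch of what allowing the lint on an explicit `Default` impl could look like. `SketchConfig` and its fields are hypothetical. The `unknown_lints` allow is there because, as the commit notes, a stable toolchain may not know `clippy::derivable_impls` yet.

```rust
/// Hypothetical config type with deliberately explicit defaults.
#[derive(Debug, PartialEq, Eq)]
pub struct SketchConfig {
    pub debug_enable: bool,
    pub queue_limit: usize,
}

// Keep the defaults spelled out, even though they could be derived.
// `unknown_lints` avoids warnings on toolchains without this clippy lint.
#[allow(unknown_lints)]
#[allow(clippy::derivable_impls)]
impl Default for SketchConfig {
    fn default() -> Self {
        Self {
            debug_enable: false,
            queue_limit: 0,
        }
    }
}
```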
Janito Vaqueiro Ferreira Filho b714b2b3b6
Create a helper `MockService` type to help with writing tests that use mock `tower::Service`s (#2748)
* Implement initial service mocking helpers

Adds a [`MockService`] type, which can be configured and built for usage
in unit tests or proptests. The mocked service can then be used to
intercept requests and respond individually to them.

* Use `MockService` in the `mempool::Crawler` test

Refactor it to remove the helper mock function, and use the new
`MockService` helper type.

* Use `MockService` in `CandidateSet` test vectors

Refactor to remove the manual mocking of the peer set service.

* Panic if a response is not sent by `MockService`

Change the current semantics to require all `MockService` usages to
respond to every intercepted request.

A `must_use` attribute was added to the `ResponseSender` so that the
compiler can warn when this doesn't happen.

* Allow generic error types in `MockService`

Replace the hard-coded `BoxError` as the `Service`'s error type with a
generic type parameter. This allows mocking services in locations that
require specific error types.

* Add a `ResponseSender::request` getter

Allow inspecting the request again before responding, and using
information from the request in the response.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-09-21 17:44:59 +00:00
Marek 061ad55144
Sneak chain_tip_change into mempool (#2785)
* Pass ChainTipChange to the mempool

* Fix nits
2021-09-21 17:06:52 +00:00
Conrado Gouvea 957e12e4ca
Pass sync_status to mempool (#2754)
* Pass sync_status to mempool

* Update zebrad/src/components/mempool.rs

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>

* Remove enabled flag for now; will be handled in #2723

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-15 22:13:29 +00:00
Deirdre Connolly ec74f7a821 Update mempool::Storage tests to not use Clone 2021-09-14 09:06:31 -04:00
Deirdre Connolly ccbbb36f7f Remove explicit return
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-09-14 09:06:31 -04:00
Deirdre Connolly c03d1c156b Add mempool::Storage::remove()
Resolves #2738
2021-09-14 09:06:31 -04:00
Deirdre Connolly eff4f8720c Allow removing a tx from mempool storage 2021-09-14 09:06:31 -04:00
Conrado Gouvea 34876113c2 Remove Clone from mempool Storage 2021-09-13 18:49:46 -04:00
Conrado Gouvea 8825a52bb8
Move transaction download and verify stream into the mempool service (#2741)
* Move transaction downloader and verifier into the mempool service

* add test for `Storage::contains_rejected()`

* Rename DownloadAndVerify->Queue; move should_download_or_verify() to previous impl

* GossipedTx -> Gossip

* Revamp error handling

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-13 16:28:07 -04:00
Conrado Gouvea f3ee76f202
Verify inbound PushTransactions (#2727)
* Verify inbound PushTransactions

* Add GossipedTx and refactor downloader to use it

* remove grafana changes

* remove TODOs

* Tidy the transaction fetching in mempool downloader

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2021-09-09 10:04:44 -03:00
Conrado Gouvea a2993e8df0
Skip download and verification if the transaction is already in the mempool or state (#2718)
* Check if tx already exists in mempool or state before downloading

* Reorder checks

* Add rejected test; refactor into separate function

* Wrap mempool in buffered service

* Rename RejectedTransactionsById -> RejectedTransactionsIds

* Add RejectedTransactionIds response; fix request name

* Organize imports

* add a test for Storage::rejected_transactions

* add test for mempool `Request::RejectedTransactionIds`

* change buffer size to 1 in the test

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-08 18:51:17 +00:00
Janito Vaqueiro Ferreira Filho be59dd2b93
Replace `Mutex` with usage of `&mut self` (#2742)
Using `&mut self` as the receiver in the method signatures allows Rust
to infer that the type is properly `Sync`, and therefore `Send`. This
allows removing the `Mutex` work-around.
2021-09-08 09:50:46 -03:00
Marek 1c4ac18df2
Decide if Zebra is at the chain tip (#2722)
* Decide if Zebra is at the chain tip

* Avoid division by zero

* Try increasing EVENT_TIMEOUT

* Increase MAX_TEST_EXECUTION

* Implement basic tests

* Resolve Clippy's errors

* change doc comments to normal

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-09-06 23:33:31 +00:00
Alfredo Garcia 65e308d2e1
Respond to inbound `TransactionsById` with mempool content (#2725)
* reply to inbound `TransactionsById` requests

* apply style/readability suggestions and fix typo

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-06 22:55:17 +00:00
Alfredo Garcia 9c220afdc8
Reply to `Request::MempoolTransactionIds` with mempool content (#2720)
* reply to `Request::MempoolTransactionIds`

* remove boilerplate

* get storage from mempool with a method

* change panic message

* try fix for mac

* use normal init instead of init_tests for state service

* newline

* rustfmt

* fix test build
2021-09-02 13:42:31 +00:00
Conrado Gouvea 1ccb2de7c7
Add transaction downloader and verifier (#2679)
* Add transaction downloader

* Changed mempool downloader to be like inbound

* Verifier working (logs result)

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Fix coinbase check for mempool, improve is_coinbase() docs

* Change other downloads.rs docs to reflect the mempool downloads.rs changes

* Change TIMEOUTs to downloads.rs; add docs

* Renamed is_coinbase() to has_valid_coinbase_transaction_inputs() and contains_coinbase_input() to has_any_coinbase_inputs(); reorder checks

* Validate network upgrade for V4 transactions; check before computing sighash (for V5 too)

* Add block_ prefix to downloads and verifier

* Update zebra-consensus/src/transaction.rs

Co-authored-by: teor <teor@riseup.net>

* Add consensus doc; add more Block prefixes

Co-authored-by: teor <teor@riseup.net>
2021-09-02 00:06:20 +00:00
teor b6fe816473
Add a `ChainTipChange` type to `await` chain tip changes (#2715)
* Rename ChainTipReceiver to CurrentChainTip

`fastmod ChainTipReceiver CurrentChainTip zebra*`

* Update chain tip documentation and variable names

* Basic chain tip change implementation, without resets

Also includes the following name changes:
```
fastmod CurrentChainTip LatestChainTip zebra*
fastmod chain_tip_receiver latest_chain_tip zebra*
```

* Clarify the difference between `LatestChainTip` and `ChainTipChange`
2021-09-01 22:31:16 +00:00
Janito Vaqueiro Ferreira Filho 8bff71e857
Only enable the mempool crawler after synchronization reaches the chain tip (#2667)
* Store a `SyncStatus` handle in the `Crawler`

The helper type will make it easier to determine if the crawler is
enabled or not.

* Pause crawler if mempool is disabled

Implement waiting until the mempool becomes enabled, so that the crawler
does not run while the mempool is disabled.

If the `MempoolStatus` helper is unable to determine if the mempool is
enabled, stop the crawler task entirely.

* Update test to consider when crawler is paused

Change the mempool crawler test so that it's a proptest that tests
different chain sync. lengths. This leads to different scenarios with
the crawler pausing and resuming.

Co-authored-by: teor <teor@riseup.net>
2021-08-31 10:42:25 +00:00
Janito Vaqueiro Ferreira Filho 83a2e30e33
Create a `SyncStatus` helper type (#2685)
* Create a `SyncStatus` helper type

Keeps track if the synchronizer is close to the chain tip or not.

* Refactor `ChainSync` ctor. to return `SyncStatus`

Change the constructor API so that it returns a higher level construct.

* Test if `SyncStatus` waits for the chain tip

Test if waiting for the chain tip to be reached correctly finishes when
the chain tip is reached. This is done by sending recent sync lengths to
the `SyncStatus` instance, and checking that every time a separate
`SyncStatus` instance determines it has reached the tip the original
instance wakes up.

* Add a temporary attribute to allow dead code

The code added isn't used yet, so we'll add a temporary waiver until
another PR is merged to use them.
2021-08-30 10:01:33 +10:00
teor 6a9a8dfc38
Stop concurrently obtaining peer set readiness (#2672)
This avoids peer set contention when most peers are busy.

Also exit the task if the peer service returns a readiness error,
because that means it's permanently unusable.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-27 15:49:41 +00:00
Deirdre Connolly 8d3f6dc026
Mempool component and storage (#2651)
* First pass at a Mempool Service, incl. a storage layer underneath

* Fixed up Mempool service and storage

* allow dead code where needed

* clippy

* typo

* only drain if the mempool is full

* add a basic storage test

* remove space

* fix test for when MEMPOOL_SIZE change

* group some imports

* add a basic mempool service test

* add clippy suggestions

* remove not needed allow dead code

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: teor <teor@riseup.net>
2021-08-27 14:36:17 +00:00
teor d2e14b22f9
Refactor BestTipHeight into a generic ChainTip sender and receiver (#2676)
* Rename BestTipHeight so it can be generalised to ChainTipSender

`fastmod BestTipHeight ChainTipSender zebra*`

For senders:
`fastmod best_tip_height chain_tip_sender zebra*`

For receivers:
`fastmod best_tip_height chain_tip_receiver zebra*`

* Rename best_tip_height module to chain_tip

* Wrap the chain tip watch channel in a ChainTipReceiver type

* Create a ChainTip trait to avoid tricky crate dependencies

And add convenience impls for optional and empty chain tips.

* Use the ChainTip trait in zebra-network

* Replace `Option<ChainTip>` with `NoChainTip`

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-27 11:34:33 +10:00
Marek 1c232ff5ea
Enumerate mempool errors (#2615)
* Enumerate mempool errors.

* Update code formatting

* Allow dead code

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

* Allow dead code

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

* Add a new error

* Update error formatting

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>

* Remove TransactionQueryError

* Derive Copy and Clone

Co-authored-by: teor <teor@riseup.net>

* Remove the Copy trait

Co-authored-by: teor <teor@riseup.net>

* Rename enum variants

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-08-25 18:39:27 +00:00
teor ace7aec933
Return a transaction verifier from `zebra_consensus::init` (#2665)
* Return a transaction verifier from `zebra_consensus::init`

This verifier is temporarily created separately from the block verifier's
transaction verifier.

* Return the same transaction verifier used by the block verifier

* Clarify that the mempool verifier is the transaction verifier

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-08-25 15:07:26 +00:00
Janito Vaqueiro Ferreira Filho 069f7716db
Create initial transaction crawler for the mempool (#2646)
* Create initial `mempool::Crawler` type

The mempool crawler is responsible for periodically asking peers for
transactions to insert into the local mempool. This initial
implementation will periodically ask for transactions, but won't do
anything with them yet.

Also, the crawler is currently configured to be always enabled, but this
should be fixed to avoid crawling while Zebra is still syncing the
chain.

* Add a timeout to peer responses

Prevent the crawler from getting stuck if there's communication with a
peer that takes too long to respond.

* Run the mempool crawler in Zebra

Spawn a task for the crawler when Zebra starts.

* Test if the crawler is sending requests

Create a mock for the `PeerSet` service to intercept requests and verify
that the transaction requests are sent periodically.

* Use `full` Tokio features when testing

Make it simpler to select the features for test builds.

Co-authored-by: teor <teor@riseup.net>

* Link to the issue for crawler activation

Make it easy to navigate from the `TODO` comment to the current project
planning.

Co-authored-by: teor <teor@riseup.net>

* Link to the issue for downloading transactions

Make it easy to navigate from the `TODO` comment to the current project
planning.

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2021-08-24 11:23:53 -03:00
teor 6f8f4d8987
Provide recent syncer response lengths as a watch channel (#2602)
* Minimal recent sync lengths implementation

Also includes metrics and logging, to make diagnosing bugs easier.

* Add logging to check what happens when Zebra reaches the chain tip

* Add tests for recent sync lengths

- initially empty
- pruned to correct length
- newest entries go first

* Drop a redundant `/` from a Cargo.lock URL

This seems to be a nightly or beta Rust change,
but hopefully stable just accepts it.

* Use metrics histograms to avoid overwriting values

* Add detailed syncer monitoring dashboard

* Increase the recent sync length to 4

This length makes it easier to distinguish between temporary and
sustained errors/syncs.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-19 23:16:16 +00:00
teor c608260256
Support witnessed transaction IDs in zebra-network requests and responses (#2638)
* Rename internal network requests for wide transaction IDs

fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*

* Update network transaction request/response comments

* Rename a transaction hash method for wide transaction IDs

fastmod transaction_hashes transaction_ids zebra-network

* Add UnminedTxId methods and conversions for InventoryHash

* Map WtxIds to unmined transaction network messages

Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.

* Enable WtxId mempool inventory tracking for peers

* Further clarify transaction IDs

* Use Witnessed rather than Wide for transaction IDs

And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.

* Rename a missed binding
* Remove an incorrectly named binding

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-18 22:55:24 +00:00
Janito Vaqueiro Ferreira Filho 4c4dbfe7cd
Reject connections from outdated peers (#2519)
* Simplify state service initialization in test

Use the test helper function to remove redundant code.

* Create `BestTipHeight` helper type

This type abstracts away the calculation of the best tip height based on
the finalized block height and the best non-finalized chain's tip.

* Add `best_tip_height` field to `StateService`

The receiver endpoint is currently ignored.

* Return receiver endpoint from service constructor

Make it available so that the best tip height can be watched.

* Update finalized height after finalizing blocks

After blocks from the queue are finalized and committed to disk, update
the finalized block height.

* Update best non-finalized height after validation

Update the value of the best non-finalized chain tip block height after
a new block is committed to the non-finalized state.

* Update finalized height after loading from disk

When `FinalizedState` is first created, it loads the state from
persistent storage, and the finalized tip height is updated. Therefore,
the `best_tip_height` must be notified of the initial value.

* Update the finalized height on checkpoint commit

When a checkpointed block is committed, it bypasses the non-finalized
state, so there's an extra place where the finalized height has to be
updated.

* Add `best_tip_height` to `Handshake` service

It can be configured using the `Builder::with_best_tip_height`. It's
currently not used, but it will be used to determine if a connection to
a remote peer should be rejected or not based on that peer's protocol
version.

* Require best tip height to init. `zebra_network`

Without it the handshake service can't properly enforce the minimum
network protocol version from peers. Zebrad obtains the best tip height
endpoint from `zebra_state`, and the test vectors simply use a dummy
endpoint that's fixed at the genesis height.

* Pass `best_tip_height` to proto. ver. negotiation

The protocol version negotiation code will reject connections to peers
if they are using an old protocol version. An old version is determined
based on the current known best chain tip height.

* Handle an optional height in `Version`

Fall back to the genesis height if `None` is specified.

* Reject connections to peers on old proto. versions

Avoid connecting to peers that are on protocol versions that don't
recognize a network update.

* Document why peers on old versions are rejected

Describe why it's a security issue above the check.

* Test if `BestTipHeight` starts with `None`

Check if initially there is no best tip height.

* Test if best tip height is max. of latest values

After applying a list of random updates where each one either sets the
finalized height or the non-finalized height, check that the best tip
height is the maximum of the most recently set finalized height and the
most recently set non-finalized height.

* Add `queue_and_commit_finalized` method

A small refactor to make testing easier. The handling of requests for
committing non-finalized and finalized blocks is now more consistent.

* Add `assert_block_can_be_validated` helper

Refactor to move into a separate method some assertions that are done
before a block is validated. This is to allow moving these assertions
more easily to simplify testing.

* Remove redundant PoW block assertion

It's also checked in
`zebra_state::service::check::block_is_contextually_valid`, and it was
getting in the way of tests that received a gossiped block before
finalizing enough blocks.

* Create a test strategy for test vector chain

Splits a chain loaded from the test vectors in two parts, containing the
blocks to finalize and the blocks to keep in the non-finalized state.

* Test committing blocks update best tip height

Create a mock blockchain state, with a chain of finalized blocks and a
chain of non-finalized blocks. Commit all the blocks appropriately, and
verify that the best tip height is updated.

Co-authored-by: teor <teor@riseup.net>
2021-08-08 23:52:52 +00:00
teor 1a57023eac
Security: Use canonical SocketAddrs to avoid duplicate peer connections, Feature: Send local listener to peers (#2276)
* Always send our local listener with the latest time

Previously, whenever there was an inbound request for peers, we would
clone the address book and update it with the local listener.

This had two impacts:
- the listener could conflict with an existing entry,
  rather than unconditionally replacing it, and
- the listener was briefly included in the address book metrics.

As a side-effect, this change also makes sanitization slightly faster,
because it avoids some useless peer filtering and sorting.

* Skip listeners that are not valid for outbound connections

* Filter Zebra's sanitized addresses based on address state

This fix correctly prevents Zebra gossiping client addresses to peers,
but still keeps the client in the address book to avoid reconnections.

* Add a full set of DateTime32 and Duration32 calculation methods

* Refactor sanitize to use the new DateTime32/Duration32 methods

* Security: Use canonical SocketAddrs to avoid duplicate connections

If we allow multiple variants for each peer address, we can make multiple
connections to that peer.

Also make sure sanitized MetaAddrs are valid for outbound connections.

* Test that address books contain the local listener address

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-06-22 02:16:59 +00:00
Alfredo Garcia 96a1b661f0
Rate limit initial genesis block download retries, Credit: Equilibrium (#2255)
* implement and test a rate limit in `request_genesis()`
* add `request_genesis_is_rate_limited` test to sync
* add ensure_timeouts constraint for GENESIS_TIMEOUT_RETRY
* Suppress expected warning logs in zebrad tests

Co-authored-by: teor <teor@riseup.net>
2021-06-09 23:39:51 +00:00
teor b18c32f30f
Add the database format to the panic metadata (#2249)
Seems like it might be useful as we add more stuff to the state.
2021-06-04 14:42:15 +10:00
teor 2f0f379a9e
Standardise clippy lints and require docs (#2238)
* Standardise lints across Zebra crates, and add missing docs

The only remaining module with missing docs is `zebra_test::command`

* Todo -> TODO

* Clarify what a transcript ErrorChecker does

Also change `Error` -> `BoxError`

* TransError -> ExpectedTranscriptError

* Output Descriptions -> Output descriptions
2021-06-04 08:48:40 +10:00
teor 52dcaa2544 Stop ignoring lightweight git tags in panic metadata
Unfortunately, Zebra's first alpha release is an annotated tag, but
GitHub defaults to lightweight tags. (At least for pre-releases.)
2021-05-20 09:00:56 +10:00
teor bcc59d11c3 Refactor metadata so git vars must be optional
We don't test non-git builds, but we can use the type system to make sure
they are always optional.
2021-05-20 09:00:56 +10:00
teor b6c5ef8041 Add VERGEN_CARGO_PROFILE to the panic env vars
Some panics should only happen on debug profiles.
2021-05-20 09:00:56 +10:00
teor 62f053de9e Enable cargo env vars when there is no .git
But still disable git env vars.

This change requires vergen 5.1.4 or later.
2021-05-20 09:00:56 +10:00
teor 92828bbb29 Reliability: send local listener address to peers
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
2021-05-18 14:02:19 +10:00
teor 74e155ff9f
Spelling: gossipped -> gossiped (#2119) 2021-05-07 13:01:11 +02:00
teor 7e2c3a2fc7 Clarify a duplicate log message 2021-04-21 23:59:29 -04:00
Kirill Fomichev 5b2f1cdfd5
Add journald support through tracing-journald (#2034)
* Add journald support through tracing-journald

* change journald to use_journald

* more fixes
2021-04-22 09:31:06 +10:00
teor 96b3c94dbc
Add the new commit count and git hash to the version (#2038)
* Use the git version + new commit count + hash for the app version

This helps diagnose bugs in versions of Zebra built from git branches,
rather than git version tags.

* Fill in assert

* Also log semver string

* Fix syntax

* Handle vergen using the cargo package version or raw git tag

* s/Semver/SemVer/

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2021-04-21 22:14:36 +00:00
teor 0203d1475a Refactor and document correctness for std::sync::Mutex<AddressBook> 2021-04-21 17:14:47 -04:00
teor 79c0c4ec57 Stop assuming there will always be a git commit
Enable builds where:
* there is no google cloud git commit env var, and
* there is no `.git` directory.

By making all `vergen` env vars optional, and skipping any env vars that
don't exist.
2021-04-20 13:48:31 -04:00
Kirill Fomichev 43e792b9a4
Update to vergen 5, add branch, commit time, and build target to the panic metadata, automatically update app version from crate version (#2029)
* build(deps): bump vergen from 3.2.0 to 5.1.1

* fix hardcoded version for Tracing struct

* add additional metadata

* remove extra allocations for metadata

* Remove zebrad code version from release checklist

The zebrad code automatically uses the crate version now.

* Sort panic metadata into rough categories

Co-authored-by: teor <teor@riseup.net>
2021-04-20 06:48:14 +10:00
teor 1e4e5924ca clippy: factor common code out of an if-else block 2021-04-14 23:16:45 -04:00
teor a417c7c8c7 Use meaningful names for select! variables 2021-04-13 23:56:16 -04:00
Alfredo Garcia 5ec05e91e1 update version strings for v1.0.0-alpha.6 2021-04-08 18:48:34 -04:00
teor 306fa88214 Document the correctness of Poll::Pending wakeups 2021-03-27 08:55:49 -04:00
teor 829a6f11c5 Document the behaviour of the `select!` macro 2021-03-27 08:55:49 -04:00
Deirdre Connolly ca1d2de87d
Bump versions for v1.0.0-alpha.5 (#1932)
Zebra's latest alpha checkpoints on Canopy activation, continues our work on NU5, and fixes a security issue.

Some notable changes include:

## Added
- Log address book metrics when PeerSet or CandidateSet don't have many peers (#1906)
- Document test coverage workflow (#1919)
- Add a final job to CI, so we can easily require all the CI jobs to pass (#1927)

## Changed
- Zebra has moved its mandatory checkpoint from Sapling to Canopy (#1898, #1926)
  - This is a breaking change for users that depend on the exact height of the mandatory checkpoint.

## Fixed
- tower-batch: wake waiting workers on close to avoid hangs (#1908)
- Assert that pre-Canopy blocks use checkpointing (#1909)
- Fix CI disk space usage by disabling incremental compilation in coverage builds (#1923)

## Security
- Stop relying on unchecked length fields when preallocating vectors (#1925)
2021-03-22 22:05:01 -04:00
Alfredo Garcia d49eaab68e
Bump versions for zebrad 1.0.0-alpha.4 (#1913)
* Bump versions for zebrad 1.0.0-alpha.4

* add Cargo.lock
2021-03-16 21:12:37 -03:00
Jack Grigg bae9a7ecd5 Expose binary data in metrics
This enables slicing and aggregating metrics based on zebrad version:
https://www.robustperception.io/exposing-the-software-version-to-prometheus
2021-03-17 09:38:07 +10:00
teor d494af1e90 Document how the syncer resists memory DoS 2021-03-11 06:24:46 -05:00
teor c6358b157c Reduce inbound concurrency to limit memory usage
Inbound malicious blocks can use a large amount of RAM when
deserialized. Limit inbound concurrency, so that the total amount
of RAM remains small.
2021-03-11 06:24:46 -05:00
teor 7558f74c78 Bump versions for zebrad 1.0.0-alpha.3 2021-02-23 10:39:13 -05:00
teor e61b5e50a2
Diagnostics for CI port conflict failures (#1766)
Log a "Trying..." message before each listener opens, to see if the
delay is inside Zebra, or in the test harness or OS.

Also report the configured and actual ports where possible, for better
diagnostics.
2021-02-18 12:15:09 -03:00
teor 972103d797 Fix tracing macro syntax 2021-02-17 11:09:22 -05:00
teor 253d1c02b3 Make sync logging a bit less verbose
And tweak some log content
2021-02-17 11:09:22 -05:00
teor cc7d5bd2ad
Update comments for the inbound service (#1740) 2021-02-16 06:14:40 +10:00
teor 372a432179
Update the call_all comment in Inbound (#1737) 2021-02-16 06:14:16 +10:00
teor 0b76352468
Document a state_contains bug (#1715)
* Document a state_contains bug in the syncer and Inbound
2021-02-10 09:05:14 +10:00
Deirdre Connolly 0c5daa8410 Bump versions for zebrad 1.0.0-alpha.2
Including tower-batch bump to 0.2.0, tower-fallback to 0.2.0, zebra-script to 1.0.0-alpha.3
2021-02-09 16:14:29 -05:00
teor dce11358d7
Log when the syncer awaits peer readiness (#1714) 2021-02-10 07:09:27 +10:00
Alfredo Garcia d7c40af2a8
Fix shutdown panics (#1637)
* add a shutdown flag in zebra_chain::shutdown
* fix network panic on shutdown
* fix checkpoint panic on shutdown
2021-02-03 19:03:28 +10:00
teor 6679a124e3 Require Inbound setup handlers to provide a result
Rather than having them default to `Ok(())`, which is incorrect
for some error handlers.
2021-02-03 08:32:10 +10:00
teor 09c8c89462 Make sure FailedInit never escapes Inbound::poll_ready 2021-02-03 08:32:10 +10:00
teor 134a5e78bd Consistently use `network_setup` for the Inbound Setup 2021-02-03 08:32:10 +10:00
teor 1c8362fe01 Remove unused imports 2021-02-03 08:32:10 +10:00
Jane Lusby 4cf331562c combine network setup into an exhaustive match 2021-02-03 08:32:10 +10:00
Jane Lusby 4d6ef89248 avoid using async blocks to avoid lifetime bug with generators 2021-02-03 08:32:10 +10:00
Jane Lusby 685a592399 Add clonable wrapper around TryRecvError 2021-02-03 08:32:10 +10:00
teor 6ffeb670ed Log the failed response in an unreachable panic 2021-02-03 08:32:10 +10:00
teor eac4fd181a Add a Setup enum to manage Inbound network setup internal state
This change encodes a bunch of invariants in the type system,
and adds explicit failure states for:
* a closed oneshot,
* bugs in the initialization code.
2021-02-03 08:32:10 +10:00
teor 32b032204a Consistently return Response::Nil during setup
And log an info-level message as a diagnostic, in case setup takes a
long time.
2021-02-03 08:32:10 +10:00
teor 94eb91305b Stop using ServiceExt::call_all due to buffer bugs
ServiceExt::call_all leaks Tower::Buffer reservations, so we can't use
it in Zebra.

Instead, use a loop in the returned future.

See #1593 for details.
2021-02-03 08:32:10 +10:00
teor 64bc45cd2e Fix state readiness hangs for Inbound
Use `ServiceExt::oneshot` to perform state requests.

Explain that `ServiceExt::call_all` calls `poll_ready` internally.
Document a state service invariant imposed by `ServiceExt::call_all`.
2021-02-03 08:32:10 +10:00
teor 4d1a2fd02e Make the Inbound invariant clearer 2021-02-03 08:32:10 +10:00
teor 2a25b9ee72 Remove services that are never `call`ed from Inbound
Uses the `ServiceExt::oneshot` design pattern from #1593.
2021-02-03 08:32:10 +10:00
Alfredo Garcia 4b34482264
Add hints to port conflict and lock file panics (#1535)
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflicts
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests

* Add a full set of stderr acceptance test functions

This change makes the stdout and stderr acceptance test interfaces
identical.

* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path

* Use Display for state path logs

Avoids weird escaping on Windows when using Debug

* Add Windows conflict error messages

* Turn PORT_IN_USE_ERROR into a regex

And add another alternative Windows-specific port error

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
2021-01-29 22:36:33 +10:00
teor 24f1b9bad1
Document the Inbound service in the start module (#1653) 2021-01-29 22:19:06 +10:00
teor 21b0360114 Limit concurrent inbound gossiped block requests
Uses the "load shed directly" design pattern from #1618.
2021-01-29 11:02:26 +10:00
teor 3d9888f736 Rewrite a sync comment 2021-01-29 11:02:26 +10:00
Deirdre Connolly 1b09538277
Bump versions for zebrad 1.0.0-alpha.1 (#1646)
* Bump versions where appropriate

Tested with cargo install --locked --path etc

* Remove fixed panics from 'Known Issues'

* Change to alpha release series in the README

Co-authored-by: teor <teor@riseup.net>
2021-01-27 20:31:39 -05:00
teor 391c53aa60 Move BoxError to zebrad's lib.rs
For consistency with other crates.
2021-01-27 12:14:27 -08:00
teor 9cdf41f5f4
Panic if the lookahead limit is misconfigured (#1589) 2021-01-14 14:06:30 +10:00
teor 92d95d4be5 Refactor inbound members into a consistent order
And add download comments
2021-01-13 20:46:25 -05:00
teor fb76eb2e6b Add download and verify timeouts to the inbound service 2021-01-13 20:46:25 -05:00
teor 973aec8ccc Refactor sync members into a consistent order
And add comments about correctness and usage.
2021-01-13 20:46:25 -05:00
teor c2893dce51 Warn when the user's configured lookahead limit is ignored 2021-01-13 20:46:25 -05:00
teor 3699bbdae6 Add some additional sync correctness constraints
And adjust the sync restart delay as a consequence.
2021-01-13 20:46:25 -05:00
teor cef0a492d8 Add a timeout to sync service block verification
This timeout stops the sync service hanging when it is missing required
blocks, but the lookahead queue is full of dependent verify tasks, so the
missing blocks never get downloaded.
2021-01-13 20:46:25 -05:00
teor c75cbdea79
Log configured network in every log message (#1568)
* Add the configured network to error reports
* Log the configured network at error level
* Create the global span immediately after activating tracing
And leak the span guard, so the span is always active.

* Include panic metadata in the report and URL
* Use `Main` and `Test` in the global span
`net=Mainnet` is a bit redundant
2021-01-12 07:46:56 +10:00
teor b1f14f47c6
Rewrite GetData handling to match the zcashd implementation (#1518)
* Rewrite GetData handling to match the zcashd implementation

`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)

This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.

Also change Zebra's GetData responses to peer requests so they ignore
missing blocks.

* Stop hanging on incomplete transaction or block responses

Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
   that were successfully received
2. if none of the expected blocks or transactions were received, return
   an error, and close the connection
2021-01-04 13:25:35 +10:00
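A minimal sketch of the two-step rule described in #1518 above, assuming received items can be modelled as plain integers. `RequestOutcome` and `end_request_early` are hypothetical names for illustration, not Zebra's actual types:

```rust
/// Hypothetical outcome of a request that was cut short by an unexpected
/// block, unexpected transaction, or NotFound message.
#[derive(Debug, PartialEq)]
enum RequestOutcome {
    /// Some expected items arrived first: return them as a partial response.
    Partial(Vec<u32>),
    /// No expected items arrived: return an error and close the connection.
    ErrorAndClose,
}

/// Ends the request instead of hanging, following steps 1 and 2 above.
fn end_request_early(received: Vec<u32>) -> RequestOutcome {
    if received.is_empty() {
        RequestOutcome::ErrorAndClose
    } else {
        RequestOutcome::Partial(received)
    }
}
```

In this sketch the caller never waits for the missing items; it decides solely from what has already been received.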
teor 69fcf64d6c
Disable issue URLs for "duplicate hash" errors (#1517)
In our README, we tell users to ignore these errors, so we should also
disable the issue URL.

Also include the hash in the error. (We don't want the span active for
all messages, we just want the hash in the error.)
2020-12-16 08:14:42 +10:00
Alfredo Garcia 41833340c1
downgrade remaining version strings to 1.0.0-alpha.0 (#1488) 2020-12-15 11:21:00 +10:00
Deirdre Connolly 2d1698a120 Comment out Sentry stacktraces for now
While panic = abort, Sentry collects the same one-line stack trace for all panics,
making it incorrectly dedupe different errors into one.
2020-12-12 13:26:52 -05:00
Deirdre Connolly cff28f7ac8 Use the commit sha as the sentry release 2020-12-09 13:06:18 -05:00
Jane Lusby 400213e2b3 integrate sentry with our existing panic reporting logic 2020-12-09 13:06:18 -05:00
Deirdre Connolly f1ec1d626d Tidy for now 2020-12-09 13:06:18 -05:00
Deirdre Connolly 44e1051dee Debug 2020-12-09 13:06:18 -05:00
Deirdre Connolly 8b268e3f71 Don't keep guard around 2020-12-09 13:06:18 -05:00
Deirdre Connolly 25f6fd25b3 Test catching panic 2020-12-09 13:06:18 -05:00
Deirdre Connolly 6a17549945 Try sentry-tracing integration 2020-12-09 13:06:18 -05:00
Deirdre Connolly c03a3a2606 Pull DSN from runtime env, enable Sentry debug mode with RUST_LOG=debug 2020-12-09 13:06:18 -05:00
Deirdre Connolly 27e42f4ed5 Set up Sentry error collection via a feature flag 2020-12-09 13:06:18 -05:00
Deirdre Connolly 47d78d4cf4 Try sentry::init() 2020-12-09 13:06:18 -05:00
teor 16ffb1dbbf
Disable issue URLs on all timeouts (#1470)
This change helps prevent spurious bug reports.
2020-12-08 07:47:01 +10:00
Jane Lusby ef7e91c3c7
disable color-eyre colors if not connected to a tty (#1443)
* disable color-eyre colors if not connected to a tty
* check if color is disabled
2020-12-04 11:05:25 +10:00
Jane Lusby 90f944709b
fix git commit logic to work on gcloud (#1442) 2020-12-03 15:18:55 +10:00
teor 0e42d8b6c1 Always enable color_eyre, even when color is disabled
We want to automatically disable colors upstream in color_eyre,
and add a config that allows users to always turn off color.
2020-12-02 10:25:44 -08:00
teor bed34168c1 Automatically disable abscissa colors and color_eyre when writing to a file 2020-12-02 10:25:44 -08:00
teor 97d1a81b7c Automatically disable colors when tracing to a file 2020-12-02 10:25:44 -08:00
Henry de Valence f0db75e712 cargo fmt 2020-12-01 19:16:41 -08:00
Jane Lusby a91d0f0bb6
Include short sha in log messages and error urls (#1410)
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.

To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.

Co-authored-by: teor <teor@riseup.net>
2020-12-01 12:13:20 -08:00
Jane Lusby fceef849cf remove unused mutability to defuse deadlock 2020-12-01 11:03:13 -05:00
Henry de Valence 1df9284444 zebrad: add a use_color option to the tracing config.
This is useful for creating searchable logs without having to filter color codes after the fact.
2020-11-30 15:25:50 -08:00
Henry de Valence e8c16b172f zebrad: pass TracingSection to Tracing component 2020-11-30 15:25:50 -08:00
Alfredo Garcia 4544463059
Inbound `FindBlocks` and `FindHeaders` (#1347)
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions

Add a `max_len` argument to support `FindHeaders` requests.

Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.

* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:30:37 +10:00
Henry de Valence fa02b266ca clippy 2020-11-25 10:55:44 -08:00
Henry de Valence de8415dcb1 tidy spans 2020-11-25 10:55:44 -08:00
Henry de Valence 05837797b1 tidy imports 2020-11-25 10:55:44 -08:00
Henry de Valence 77bf327b07 fix errors (2) 2020-11-25 10:55:44 -08:00
Henry de Valence 527f4d39ed fix errors 2020-11-25 10:55:44 -08:00
Henry de Valence e645e3bf0c remove async 2020-11-25 10:55:44 -08:00
Henry de Valence 6569977549 test compile change 2020-11-25 10:55:44 -08:00
Alfredo Garcia 486e55104a create Downloads for Inbound 2020-11-25 10:55:44 -08:00
Henry de Valence 2a4a89c002 state,zebrad: tidy span levels for good INFO output
This provides useful and not too noisy output at INFO level.  We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
2020-11-23 14:16:39 +10:00
Henry de Valence f0810b028d state,consensus,sync: shorten span lengths
These changes help reduce the size of the resulting spans, making the
output more compact.  Together they save about 30-40 characters.
2020-11-23 14:16:39 +10:00
teor d4da9609ee Update the max_concurrent_block_requests docs
In #1298, we decreased `max_concurrent_block_requests`,
but forgot to update the docs.
2020-11-20 10:08:57 -08:00
Henry de Valence ba3c19142c deps: update hyper, metrics to tokio 0.3
The metrics code becomes much simpler because the current version of the
metrics crate builds its own single-threaded runtime on a dedicated worker
thread, so no dependency on the main Zebra Tokio runtime is required.
2020-11-20 10:08:16 -08:00
Henry de Valence add94c1c45 deps: move to tokio 0.3, tower 0.4
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware.  This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method.  See

ddc64e8d4d

for more context on the Tower changes.  To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
2020-11-20 10:08:16 -08:00
Henry de Valence 4953f21670 fixup! zebrad: hack to skip alreadyverified errors 2020-11-18 03:09:06 -05:00
Henry de Valence d2fc01755b zebrad: more reasonable concurrent block limit
This helps prevent overloading the network with too many concurrent
block requests.  On a fast network, we're likely to still have enough
room to saturate our bandwidth.  In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads.  If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s.  In practice, this
minimum speed will likely be much lower.
2020-11-17 14:56:27 -08:00
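The worst-case estimate in d2fc01755b above can be reproduced with a few lines of arithmetic. This is only a sketch; the function name is hypothetical and the inputs (50 blocks, 2MB each, 20 seconds) are the figures quoted in the commit message:

```rust
/// Worst-case minimum download speed, in MB/s, needed to fetch
/// `concurrent_blocks` blocks of `max_block_mb` megabytes each
/// before `timeout_secs` elapses.
fn worst_case_min_speed_mb_per_s(
    concurrent_blocks: u64,
    max_block_mb: u64,
    timeout_secs: u64,
) -> f64 {
    // Total data queued if every concurrent download is a maximum-size block.
    let queued_mb = concurrent_blocks * max_block_mb;
    queued_mb as f64 / timeout_secs as f64
}
```

Calling `worst_case_min_speed_mb_per_s(50, 2, 20)` gives the 100MB / 20s = 5MB/s figure from the message.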
Henry de Valence aa7538ab15 zebrad: hack to skip alreadyverified errors 2020-11-17 14:56:27 -08:00
Henry de Valence e55392b61e zebrad: explicitly select the threaded scheduler. 2020-11-17 14:56:27 -08:00
Henry de Valence 6de824bd99 zebrad: remove block verification timeout
Because we set the lookahead limit to be at least twice the size of a checkpoint, we don't have a risk of timeouts.
2020-11-17 14:56:27 -08:00
Henry de Valence e9c847bbd7 zebrad: avoid a borrow in the ChainSync future 2020-11-17 14:56:27 -08:00
Henry de Valence b632a24436 zebrad: add diagnostics on cancelled download tasks 2020-11-17 14:56:27 -08:00
Henry de Valence ec411574ee zebrad: improve sync diagnostics 2020-11-17 14:56:27 -08:00
Henry de Valence e0c92167bc Revert "Hedge every syncer block download request"
This reverts commit 656bd24ba7.

The Hedge middleware keeps a pair of histograms, writing into one in the
current time interval and reading from the previous time interval's
data.  This means that the reverted change resulted in doubling all
block downloads until after at least the second measurement interval
(which means that the time measurements are also incorrect, as they're
operating under double the network load...)
2020-11-12 16:45:47 -05:00
Alfredo Garcia 128643d81e
Call `zebra_test::init` where needed. (#1227)
* Add missing `zebra_test::init()` to zebra-chain
* Add missing `zebra_test::init()` to zebra-consensus
* Add missing `zebra_test::init()` to zebra-network
* Add missing `zebra_test::init()` to zebra-state
* Add missing `zebra_test::init()` to zebra-test
* Add missing `zebra_test::init()` to zebrad
2020-11-10 10:29:25 +10:00
Henry de Valence 0ad648fb6a zebrad: make lookahead limit configurable.
Sets the default value to the previous lookahead limit.  My testing on
mainnet suggested that the newly lower value (changed when the
checkpoint frequency was decreased) is low enough to cause stalls, even
when using hedged requests.
2020-11-01 10:47:46 -08:00
teor 92c623eddf Log each genesis download
This change helps us diagnose sync hangs.
2020-10-28 11:31:04 -04:00
teor 656bd24ba7 Hedge every syncer block download request
Remove the minimum data points from the syncer hedge configuration.
When there are no data points, hedge sends the second request
immediately.

When there are fewer than 1/(1-latency_percentile) data points (20),
hedge delays the second request by the highest recent download time.

This change should improve genesis and post-restart sync latency.
2020-10-28 11:31:04 -04:00
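The "(20)" in 656bd24ba7 above follows from the formula 1/(1-latency_percentile). The sketch below assumes a latency percentile of 0.95, which is inferred from that figure rather than stated in the message; the function name is hypothetical:

```rust
/// Number of data points below which, per the message above, the second
/// request is delayed by the highest recent download time.
fn min_hedge_data_points(latency_percentile: f64) -> u64 {
    // Round to absorb floating-point error in `1.0 - latency_percentile`.
    (1.0 / (1.0 - latency_percentile)).round() as u64
}
```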
Henry de Valence 4c960c4e6d zebrad: treat duplicate downloads as an error
We should error if we notice that we're attempting to download the same
blocks multiple times, because that indicates that peers reported bad
information to us, or we got confused trying to interpret their
responses.
2020-10-26 12:05:35 -07:00
Henry de Valence 4127d086ea zebrad: clarify hedge layering motivation
Co-authored-by: teor <teor@riseup.net>
2020-10-26 12:05:35 -07:00
Henry de Valence 253bab042e sync: add a concurrency limit for block downloads 2020-10-26 12:05:35 -07:00
Henry de Valence 0a405c737d zebrad: check state in obtaintips, not extendtips.
The original sync algorithm split the sync process into two phases, one
that obtained prospective chain tips, and another that attempted to
extend those chain tips as far as possible until encountering an error
(at which point the prospective state is discarded and the process
restarts).

Because a previous implementation of this algorithm didn't properly
enforce linkage between segments of the chain while extending tips,
sometimes it would get confused and fail to discard responses that did
not extend a tip.  To mitigate this, a check against the state was
added.  However, this check can cause stalls while checkpointing,
because when a checkpoint is reached we may suddenly need to commit
thousands of blocks to the state.  Because the sync algorithm now has a
`CheckedTip` structure that ensures that a new segment of hashes
actually extends an existing one, we don't need to check against the
state while extending a tip, because we don't get confused while
interpreting responses.

This change results in significantly smoother progress on mainnet.
2020-10-26 12:05:35 -07:00
Henry de Valence 65e0c22fbe state: don't pre-buffer the service
There's no reason to return a pre-Buffer'd service (there's no need for
internal access to the state service, as in zebra-network), but wrapping
it internally removes control of the buffer size from the caller.
2020-10-26 12:05:35 -07:00
Henry de Valence ce2ac3336f zebrad: add debug message before state check
This reveals that there may be contention in access to the state, as
this takes a long time.
2020-10-26 12:05:35 -07:00
Henry de Valence 91469faf3c zebrad: eliminate duplicate span in sync 2020-10-26 12:05:35 -07:00
Henry de Valence b5a43f4516 zebrad: remove implementation details from docs
The timeout behavior in zebra-network is an implementation detail, not a
feature of the public API.  So it shouldn't be mentioned in the doc
comments -- if we want timeout behavior, we have to layer it ourselves.
2020-10-26 12:05:35 -07:00
Henry de Valence 1d7309afe2 zebrad: correctly handle duplicates in DownloadSet
Using the cancel_handles, we can deduplicate requests.  This is
important to do, because otherwise when we insert the second cancel
handle, we'd drop the first one, cancelling an existing task for no
reason.
2020-10-26 12:05:35 -07:00
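The deduplication point in 1d7309afe2 above can be illustrated with a small standard-library sketch. It assumes a map from block hash to cancel handle and uses `std::sync::mpsc` as a stand-in for a oneshot channel; `Downloads`, `queue`, `cancel_all`, and `is_cancelled` are hypothetical names, not Zebra's actual API:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender, TryRecvError};

/// Hypothetical stand-in for a block hash.
type Hash = [u8; 4];

/// Tracks one cancel handle per in-flight download.
struct Downloads {
    cancel_handles: HashMap<Hash, Sender<()>>,
}

impl Downloads {
    fn new() -> Self {
        Downloads { cancel_handles: HashMap::new() }
    }

    /// Queues a download, returning the receiver the task would watch for
    /// cancellation, or `None` if the hash is already in flight.
    fn queue(&mut self, hash: Hash) -> Option<Receiver<()>> {
        // Without this check, inserting a second handle would drop the
        // first one and cancel the existing task for no reason.
        if self.cancel_handles.contains_key(&hash) {
            return None;
        }
        let (cancel_tx, cancel_rx) = channel();
        self.cancel_handles.insert(hash, cancel_tx);
        Some(cancel_rx)
    }

    /// Cancels every in-flight download by dropping its handle.
    fn cancel_all(&mut self) {
        self.cancel_handles.clear();
    }
}

/// A task treats a dropped sender as a cancellation signal.
fn is_cancelled(cancel_rx: &Receiver<()>) -> bool {
    matches!(cancel_rx.try_recv(), Err(TryRecvError::Disconnected))
}
```

Dropping the sender is what signals cancellation here, which is why an unchecked second insert for the same hash would silently cancel the first task.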
Henry de Valence 56fe4f4379 zebrad: unify sync restart logic
This lets us keep the main loop simple and just write `continue 'sync;`
to keep going.
2020-10-26 12:05:35 -07:00
Henry de Valence 12d25159c6 zebrad: use hedged requests in sync
The hedge middleware implements hedged requests, as described in _The
Tail At Scale_. The idea is that we auto-tune our retry logic according
to the actual network conditions, pre-emptively retrying requests that
exceed some latency percentile. This would hopefully solve the problem
where our timeouts are too long on mainnet and too slow on testnet.
2020-10-26 12:05:35 -07:00
Henry de Valence 5f229d1475 zebrad: use Downloads in sync
Try to use the better cancellation logic to revert to previous sync
algorithm.  As designed, the sync algorithm is supposed to proceed by
downloading state prospectively and handle errors by flushing the
pipeline and starting over.  This hasn't worked well, because we didn't
previously cancel tasks properly.  Now that we can, try to use something
in the spirit of the original sync algorithm.
2020-10-26 12:05:35 -07:00
Henry de Valence b90581a3d7 zebrad: create a Downloads Stream for syncing.
This makes two changes relative to the existing download code:

1.  It uses a oneshot to attempt to cancel the download task after it
    has started;

2.  It encapsulates the download creation and cancellation logic into a
    Downloads struct.
2020-10-26 12:05:35 -07:00
Henry de Valence b636660d6a zebrad: rename sync::Error alias to BoxError. 2020-10-26 12:05:35 -07:00
Henry de Valence cab96aa1a8
zebrad: clarify config help text (#1194) 2020-10-22 15:03:01 +10:00
Alfredo Garcia 21ad6ffc47
Reverse displayed endianness of transaction and block hashes (#1171)
* Reverse displayed endianness of transaction and block hashes
* fix zebra-checkpoints utility for new hash order
* Stop using "zebrad revhex" in zebrad-hash-lookup
* Rebuild checkpoint lists in new hash order
This change also adds additional checkpoints to the end of each list.

* Replace TransactionHash with transaction::Hash
This change should have been made in #905, but we missed Debug impls
and some docs.

Co-authored-by: Ramana Venkata <vramana@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
2020-10-22 07:54:02 +10:00
Henry de Valence eb43893de0 consensus: minimize API, clean docs
This reduces the API surface to the minimum required for functionality,
and cleans up module documentation.  The stub mempool module is deleted
entirely, since it will need to be redone later anyways.
2020-10-20 11:16:22 -04:00
Alfredo Garcia c0a14ecc8c
move genesis parameters to zebra-chain (#1151) 2020-10-12 14:08:23 -07:00
Jane Lusby 855f9b5bcb
Implement MVP of NonFinalizedState and integrate it with the state service (#1101)
* implement most of the chain functions
* implement fork
* fix outpoint handling in Chain struct
* update expect for work
* split utxo into two sets
* update the Chain definition
* remove allow attribute in zebra-state/lib.rs
* merge ChainSet type into MemoryState
* Add error messages to asserts
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* try to fix github actions syntax
* add module doc comment
* update RFC for utxos
* add missing header
* working proptest for Chain
* propagate back results over channel
* Start updating RFC to match changes
* implement queued block pruning
* and now it syncs wooo!
* remove empty modules
* setup config for proptests
* re-enable missing_docs lint
* update RFC to match changes in impl
* add documentation
* use more explicit variable names
2020-10-08 13:07:32 +10:00
Jane Lusby 40e22808c7
disable reporting url for timeout errors (#1087)
* disable reporting url for timeout errors

* revert newline removal

* switch to released color-eyre version
2020-09-21 16:15:09 -07:00
Henry de Valence fe61090a64 zebrad: make Inbound Poll::Ready before setup.
The Inbound service only needs the network setup for some requests, but
it can service other requests without it.  Making it return
Poll::Pending until the network setup finishes means that initial
network connections may view the Inbound service as overloaded and
attempt to load-shed.
2020-09-21 09:26:39 -07:00
Henry de Valence 9c021025a7 network: fill in remaining request/response pairs 2020-09-20 10:21:18 -07:00
Henry de Valence 4b35fea492 zebrad: document Inbound, ChainSync responsibilities 2020-09-18 18:34:25 -07:00
Henry de Valence 65877cb4b1 zebrad: make Inbound propagate backpressure 2020-09-18 18:34:25 -07:00
Henry de Valence 55f46967b2 zebrad: serve blocks from Inbound service
The original version of this commit ran into

https://github.com/rust-lang/rust/issues/64552

again.  Thanks to @yaahc for suggesting a workaround (using futures combinators
to avoid writing an async block).
2020-09-18 18:34:25 -07:00
Henry de Valence 170f588ffb network: document load-shedding behavior
This was part of the original design and is described in the Connection
internals, but we never documented it externally.
2020-09-18 18:34:25 -07:00
Henry de Valence 1d0ebf89c6 zebrad: move seed command into inbound component
Remove the seed command entirely, and make the behavior it provided
(responding to `Request::Peers`) part of the ordinary functioning of the
start command.

The new `Inbound` service should be expanded to handle all request
types.
2020-09-18 18:34:25 -07:00
Henry de Valence 1d3892e1dc network: rename alias to BoxError
This is shorter and consistent with Tower (which is why we use it in the
first place).
2020-09-18 18:34:25 -07:00
Jane Lusby ca648ff27c
Enable issue-url feature in color-eyre (#1072)
* Enable issue-url feature in color-eyre

* get version automatically

* and the url!
2020-09-17 15:09:18 -07:00
Henry de Valence 3133214e4f zebrad: use new state API 2020-09-11 13:37:49 -07:00
teor b1e1291f45 Log inbound peer requests at debug
Logging at info was a bit too verbose.

Also add a short log message.
2020-09-10 09:46:53 -07:00
Henry de Valence 24de90c900 zebrad: tidy sync imports 2020-09-10 09:45:52 -07:00
Henry de Valence 9b6e66c1b9 zebrad: rename Syncer to ChainSync
This name clarifies what is being synced and avoids an agent-noun
construction.
2020-09-10 09:45:52 -07:00
Henry de Valence 0bc79686b8 zebrad: move sync into components module.
Part of #1030.
2020-09-10 09:45:52 -07:00
teor adafe1d189 Restart sync after the first failed ObtainTips
The ObtainTips retry was redundant. The timeout wasn't much shorter, but
it made the code and sync logic more complicated.
2020-09-09 15:35:09 -07:00
teor 2a68ef5acb Update the peerset buffer size and sync timeout
Also add a bunch of comments and documentation for network-constrained
nodes, and for testnet.
2020-09-08 12:44:33 -07:00
teor b062a682b0 Refactor "waiting for pending blocks" log 2020-09-08 12:44:33 -07:00
teor e6e859dce2 Tweak sync timeouts
* increase the EWMA default and decay
* increase the block download retries
* increase the request and block download timeouts
* increase the sync timeout
2020-09-08 12:44:33 -07:00
teor ce12d4dadc Add timeouts for tip responses and block verify tasks 2020-09-08 12:44:33 -07:00
teor 379ce5c1b8 Retry obtain and extend tips on failure 2020-09-08 12:44:33 -07:00
Alfredo Garcia 454e75e7c0
Rename old references to BlockHeaderHash and BlockHeight (#1002)
* rename some references

* Apply suggestions from code review

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: teor <teor@riseup.net>
2020-09-04 15:40:48 -07:00
teor 48497d4857
Ignore sync errors when the block is already verified (#980)
* Ignore sync errors when the block is already verified

If we get an error for a block that is already in our state, we don't
need to restart the sync. It was probably a duplicate download.

Also:

Process any ready tasks before reset, so the logs and metrics are
up to date. (But ignore the errors, because we're about to reset.)

Improve sync logging and metrics during the download and verify task.

* Remove duplicate hashes in logs
Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* Log the sync hash span at warn level
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-09-04 08:13:00 +10:00
teor 437549d8e9
Always drop the final hash in peer responses (#991)
To work around a zcashd bug that squashes responses together.
2020-09-04 08:09:34 +10:00
teor c770daa51f
If the first ExtendTips hash is bad, discard it and re-check (#992) 2020-09-04 08:08:19 +10:00
Alfredo Garcia 5485f4429a
Add config path to acceptance tests (#946)
* add and apply config mode to get_child

* remove option to read config from current directory

* remove argument from get_child
2020-09-03 13:13:23 -07:00
Jane Lusby ffdec0cb23
Remove in-memory state service (#974)
* Remove in-memory state service

* make the config compatible with toml again

* checkpoint commit to see how much I still have to revert

* back to the starting point...

* remove unused dependency

* reorganize error handling a bit

* need to make a new color-eyre release now

* reorder again because I have problems

* remove unnecessary helpers

* revert changes to config loading

* add back missing space

* Switch to released color-eyre version

* add back missing newline again...

* improve error message on unix when terminated by signal

* add context to last few asserts in acceptance tests

* instrument some of the helpers

* remove accidental extra space

* try to make this compile on windows

* reorg platform specific code

* hide on_disk module and fix broken link
2020-09-01 12:39:04 -07:00
teor 3fdfcb3179 fix: remove old tips that are behind new tips
This change makes sync less reliant on the exact order of ObtainTips and
ExtendTips responses.
2020-09-01 11:42:48 -04:00
teor a6d6e65940 fix: fix the flamegraph module comment 2020-09-01 11:40:18 -04:00
teor 78201b456d feature: Implement checkpoint_sync for checkpoint verification
* add CheckpointList::new_up_to(limit: NetworkUpgrade)
* if checkpoint_sync is false, limit checkpoints to Sapling
* update tests for CheckpointList and chain::init
2020-08-24 15:34:46 +10:00
teor 06f4a59664 feature: Add a checkpoint_sync config option
(The option doesn't do anything yet.)
2020-08-24 15:34:46 +10:00
teor b8e8d4f548 fix: Remove some deeply-nested instrument spans
Closes #923.
2020-08-20 14:52:39 -04:00
Henry de Valence 103b663c40 chain: rename BlockHeight to block::Height 2020-08-17 11:46:34 -07:00
Henry de Valence 61dea90e2f chain: rename BlockHeaderHash to block::Hash
This is the first in a sequence of changes that change the block:: items
to not include Block as a prefix in their name, in accordance with the
Rust API guidelines.
2020-08-17 11:46:34 -07:00
Henry de Valence 948b067808 chain: move Network, NetworkUpgrade to parameters
Also, avoid using star-imports of the enum variants, which pollutes the
namespace.
2020-08-17 11:46:34 -07:00
Henry de Valence 0d1f56ad2f chain: remove utils module
A catch-all utils module can really easily slip into being a place to stash
miscellaneous functions that don't really belong anywhere in particular.
2020-08-17 11:46:34 -07:00
Henry de Valence a79ce97957
Fix sync algorithm. (#887)
* checkpoint: reject older of duplicate verification requests.

If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled.  Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.

* sync: add a timeout layer to block requests.

Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.

* sync: restart syncing on error

Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we have another chance to get ourselves unstuck.

* sync: additional debug info

* sync: handle lookahead limit correctly.

Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped.  This meant that completed tasks could be left until the limit was
exceeded again.  Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.

* network: add debug instrumentation to retry policy

* sync: instrument the spawned task

* sync: streamline ObtainTips/ExtendTips logic & tracing

This change does three things:

1.  It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method.  This means that when debugging we only have one
deduplication algorithm to focus on.

2.  It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.

3.  It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip").  The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.

* sync: hack to work around zcashd behavior

* sync: localize debug statement in extend_tips

* sync: change algorithm to define tips as pairs of hashes.

This is different enough from the existing description that its comments no
longer apply, so I removed them.  A further chunk of work is to change the sync
RFC to document this algorithm.

* sync: reduce block timeout

* state: add resource limits for sled

Closes #888

* sync: add a restart timeout constant

* sync: de-pub constants
2020-08-12 16:48:01 -07:00
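The lookahead handling described in #887 above ("extract all completed results, and use the number of pending tasks to decide") can be sketched as follows. The types, the function name, and the exact comparison against the limit are assumptions for illustration, not the code from the commit:

```rust
/// Hypothetical download-and-verify task state; `Done` carries a block height.
#[derive(Debug, PartialEq)]
enum Task {
    Pending,
    Done(u32),
}

#[derive(Debug, PartialEq)]
enum NextStep {
    ExtendTips,
    WaitForBlocks,
}

/// Removes every completed task, then decides the next step from the
/// number of tasks that are still pending.
fn drain_completed(tasks: &mut Vec<Task>, lookahead_limit: usize) -> (Vec<u32>, NextStep) {
    let mut completed = Vec::new();
    // Extract all completed results, not just enough to dip under the limit.
    tasks.retain(|task| match task {
        Task::Done(height) => {
            completed.push(*height);
            false
        }
        Task::Pending => true,
    });
    // Assumed rule: extend the tips only while pending work is below the limit.
    let next = if tasks.len() < lookahead_limit {
        NextStep::ExtendTips
    } else {
        NextStep::WaitForBlocks
    };
    (completed, next)
}
```

Because completed tasks are always drained, none can linger until the limit is exceeded again, which is the failure mode the commit message describes.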
Henry de Valence 299afe13df
zebra-network tweaks. (#877)
* network: move gossiped peer selection logic into address book.

* network: return BoxService from init.

* zebrad: add note on why we truncate the gossiped peer list

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* Remove unused .rustfmt.toml

Many of these options are never actually loaded by our CI because of a channel
mismatch, where they're not applied on stable but only on nightly (see the logs
from a rustfmt job).  This means that we can get different settings when
running `cargo fmt` on the nightly and stable channels, which was causing a CI
failure on this PR.  Reverting back to the default rustfmt settings avoids this
problem and keeps us in line with upstream rustfmt.  There's no loss to us
since we were using the defaults anyways.

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-11 13:07:44 -07:00
teor 2550c44d48
Make sync ignore known hashes (#853)
* fix: Handle known ObtainTips correctly

enumerate never returns a value beyond the end of the vector.

* fix: Ignore known tips in ExtendTips

Some peers send us known tips when we try to extend.

* fix: Ignore known hashes when downloading

Despite all our other checks, we still end up downloading some hashes
multiple times.

* fix: Increase the number of retries

The old sync code relied on duplicate block fetches to make progress,
but the last few commits have removed some of those duplicates.

Instead, just retry the fetches that fail.

* fix: Tweak comments

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* fix: Cleanup the state_contains interface in Sync

* Fix brackets

Oops

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-10 16:17:50 -07:00
Alfredo Garcia 9c387521bd
Print endpoint addresses at startup (#867)
* print tracing and metrics endpoints in startup

* print network address in startup
2020-08-10 12:47:26 -07:00
teor e95358dbe3 fix: Increase the number of retries
The old sync code relied on duplicate block fetches to make progress,
but the last few commits have removed some of those duplicates.

Instead, just retry the fetches that fail.
2020-08-10 18:58:21 +10:00
teor faac50697c feature: Add a verified blocks metrics counter
We have a counter for pending "download and verify" futures. But these
futures are spawned, so they can complete in any order. They can also
complete before we receive their results.
2020-08-10 15:12:08 +10:00
teor 6aeefcee8b fix: Improve sync diagnostics 2020-08-10 15:12:08 +10:00
Henry de Valence 6d1a4b2218
Load config after initializing the Terminal (#848) 2020-08-06 17:22:40 -07:00
Alfredo Garcia c52481c041 fix logs 2020-08-07 09:21:57 +10:00
Jane Lusby 3e9c6f054b
fix log level default for server commands (#840)
* fix log level default for server commands

* remove dbg
2020-08-06 11:23:00 -07:00
Henry de Valence a77328ad7c
Refactor tracing components (#834)
* Split tracing component code into modules.

* Repatriate Tracing and simplify config handling.

We upstreamed our Tracing component, expecting not to have to exert fine
control over the tracing settings.  But this turned out not to be the case, and
now that we want to do other things (flamegraphs, journalctl, opentelemetry,
etc), we end up with really awkward code (as in the current flamegraph
handling).

This also makes use of the changes to `init()` to load the config early to pass
configuration data into the components, which avoids the need for the
refactoring in #775.

Finally, we restore support for the `-v` flag when the filter is unset.  Closes #831.

* Disable tracing and metrics endpoints by default.

Closes #660.

* Switch back to upstream Abscissa.

* Integrate flamegraph support into the new Tracing component.

* Pass -v in acceptance tests to get info-level output.

* Clean up acceptance test code.
2020-08-06 10:29:31 -07:00
Jane Lusby 867dd0b475
Setup tracing-flame for use profiling zebrad (#436)
* Setup tracing-flame for use profiling zebrad

* start work on conditional flamegraph generation

* review time!

* update comments

* Update Cargo.toml

* disable default features for inferno

* reorganize

* missing one trait

* Apply suggestions from code review

* graceful shutdown!

* remove special case handling on ctrlc for cleanup

* rename signal fn to better represent its responsibility

* remove unused global hook for flushing flamegraph

* move tracing logic to the right file

* just copy linkerd's signal handling logic

* update book

* make zebrad app drop on shutdown normally

* Update zebrad/src/components/tokio.rs

Co-authored-by: teor <teor@riseup.net>

* Update zebrad/src/application.rs

Co-authored-by: teor <teor@riseup.net>

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* cleanup a little

* ooh yea there's an API for that

* setup env-filter for backup subscriber

* document env filter

* document return codes

* forgot to save

* Update book/src/applications/zebrad.md

Co-authored-by: teor <teor@riseup.net>

Co-authored-by: teor <teor@riseup.net>
2020-08-05 16:35:56 -07:00
Henry de Valence 4a03d76a41
Remove environment variables in favor of documented config options. (#827)
* Load tracing filter only from config and simplify logic.

* Configure the state storage in the config, not an environment variable.

This also changes the config so that the path is always set rather than being
optional, because Zebra always needs a place to store its config.
2020-08-05 11:48:08 -07:00
Henry de Valence 82da4a5326 Remove connect command. 2020-08-04 23:34:45 -07:00
Alfredo Garcia f2d7bb3177
Command execution tests (#690)
* add zebrad acceptance tests
* add custom command test helpers that work with kill
* add and use info event for start and seed commands
* combine conflicting tests into one test case

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-08-01 16:15:26 +10:00
Alfredo Garcia 617f1d80ef move docs to zebra book 2020-07-29 19:44:21 -07:00
Alfredo Garcia 6297a7cd19 document zebrad environment variables 2020-07-29 19:44:21 -07:00
teor 050c46388f fix: Open the endpoints after the config is loaded
We get the injected TokioComponent dependency before the config is
loaded, so we can't use it to open the endpoints.

And we can't define after_config, because we use derive(Component).

So we work around these issues by opening the endpoints manually,
from the application's after_config.
2020-07-29 16:03:52 +10:00
teor e7437cc551 feature: Get endpoint addresses from config 2020-07-29 16:03:52 +10:00
teor 11090dbf91 feature: Separate Mainnet and Testnet state 2020-07-29 01:45:19 -04:00
Alfredo Garcia 5b3c6e4c6c
Port bash checkpoint scripts to zebra-checkpoints single rust binary (#740)
* make zebra-checkpoints
* fix LOOKAHEAD_LIMIT scope
* add a default cli path
* change doc usage text
* add tracing
* move MAX_CHECKPOINT_HEIGHT_GAP to zebra-consensus
* do byte_reverse_hex in a map
2020-07-25 17:53:00 +10:00
Henry de Valence b59cfc49b7 sync: create requests sequentially to respect backpressure.
This seems like a better design on principle but also appears to give a much
nicer sawtooth pattern of queued blocks in the checkpointer and a much smoother
pattern of block requests.
2020-07-24 18:36:00 -04:00
teor 2acfcf3a90
Make the CheckpointVerifier handle partial restarts (#736)
Also put generic bounds on the BlockVerifier struct,
so we get better compilation errors.
2020-07-24 11:47:48 +10:00
teor 77a1fefa1e
Download genesis (#731)
* feature: Add more CheckpointVerifier tracing

* fix: Download the genesis block
2020-07-23 10:56:52 -07:00
Jane Lusby c1a1493159
use dirs crate for default location of state and config (#714)
* use dirs crate for default location of state and config
* panic if a path isn't specified for zebra-state
2020-07-23 21:12:20 +10:00
teor c95c825707 fix: Lookup the genesis hash based on the network 2020-07-23 03:46:24 -04:00
Henry de Valence 4a98b8fa0d Add basic metrics to the syncer. 2020-07-22 21:59:00 -07:00
Henry de Valence c2c2a28e8b Improve tracing output in chain verifier 2020-07-22 21:59:00 -07:00
Jane Lusby 7d4e717182
Add block locator request to state layer (#712)
* Add block locator request to state layer

* pass genesis in request

* Update zebrad/src/commands/start/sync.rs

* fix errors
2020-07-22 18:01:31 -07:00
Henry de Valence 49aa41544d sync: try to ignore spurious inv messages.
Closes #697.

Per https://github.com/ZcashFoundation/zebra/issues/697#issuecomment-662742971

The response to a getblocks message is an inv message with the hashes of the
following blocks. However, inv messages are also sent unsolicited to gossip new
blocks across the network. Normally, this wouldn't be a problem, because for
every other request we filter only for the messages that are relevant to us.
But because the response to a getblocks message is an inv, the network layer
doesn't (and can't) distinguish between the response inv and the unsolicited
inv.

But there is a mitigation we can do. In our sync algorithm we have two phases:
(1) "ObtainTips" to get a set of tips to chase down, (2) repeatedly call
"ExtendTips" to extend those as far as possible. The unsolicited inv messages
have length 1, but when extending tips we expect to get more than one hash. So
we could reject responses in ExtendTips that have length 1 in order to ignore
these messages. This way we automatically ignore gossip messages during initial
block sync (while we're extending a tip) but we don't ignore length-1 responses
while trying to obtain tips (while querying the network for new tips).
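
The mitigation described above can be sketched as a small predicate. This is only an illustration: `SyncPhase`, `BlockHash`, and `accept_inv_response` are hypothetical names, not the syncer's actual types, and rejecting empty responses in ObtainTips is an added assumption.

```rust
/// Hypothetical stand-in for the two phases of the sync algorithm.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SyncPhase {
    ObtainTips,
    ExtendTips,
}

/// Hypothetical stand-in for a block hash.
type BlockHash = [u8; 32];

/// Decide whether an `inv` response should be used in the given phase.
fn accept_inv_response(phase: SyncPhase, hashes: &[BlockHash]) -> bool {
    match phase {
        // While obtaining tips, a single hash is a legitimate answer.
        // (Treating an empty response as useless is an assumption.)
        SyncPhase::ObtainTips => !hashes.is_empty(),
        // While extending tips, a length-1 response looks like unsolicited
        // block gossip, so it is ignored.
        SyncPhase::ExtendTips => hashes.len() > 1,
    }
}
```

With this rule, a one-hash response is accepted in `ObtainTips` but dropped in `ExtendTips`, which matches the behaviour the message describes.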
2020-07-22 17:55:52 -07:00
teor 9b97ebbd61 feature: Choose checkpoints based on the config 2020-07-23 10:26:25 +10:00
teor 3d721a96a5 feature: Add the state config to the config file 2020-07-23 10:26:25 +10:00
teor 89ac2793d6 feature: Use ChainVerifier in the sync service 2020-07-23 10:26:25 +10:00
Henry de Valence 928b0beb5d sync: unindent fetch task 2020-07-21 20:16:23 -07:00
Henry de Valence b722818e02 sync: remove redundant tracing specifier
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-07-21 20:16:23 -07:00
Henry de Valence 1047d2f690 sync: add backpressure to syncer
Closes #617.
Closes #698.

The remaining work on the syncer is alluded to in a new comment:

1. Correctly constructing a block locator object
2. Detecting when we've stopped making progress syncing and restarting obtain_tips.
2020-07-21 20:16:23 -07:00
Alfredo Garcia db2eb80b3e
Create consensus utils and move byte_reverse_hex function to it (#705)
* move byte_reverse_hex function
2020-07-22 12:29:14 +10:00
teor e5bb96715f fix: Reduce sync error logs to info or warn
Network issues are very common.
2020-07-21 10:13:03 -07:00
teor a0dbe85acd fix: Rewrite the config usage comment 2020-07-21 12:58:55 -04:00
Alfredo Garcia fe2a468417
add favicon to generated docs (#681) 2020-07-17 16:45:29 -07:00
teor 71de6de701 fix: Only enable tokio components for servers
Only enable the tokio and tracing components for server commands.
2020-07-17 10:12:51 +10:00
teor 49a3a7d6d1 fix: Only launch network endpoints for server commands
Fixes #669.
2020-07-16 10:40:03 -07:00
teor 851afad01f
fix: Resist CheckpointVerifier memory DoS attacks (#635)
* fix: Resist CheckpointVerifier memory DoS attacks

Allow a maximum of 2 queued blocks at each height, as a tradeoff between
efficient bad block rejection, and memory usage.

Closes #628.

* fix: Make max queued blocks at height equal to fanout

* fix: Just allocate all the capacity upfront

* fix: Use with_capacity(1) and reserve_exact(1)
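
A rough sketch of the per-height limit, assuming the queue is a map from height to a list of block hashes. `QueuedBlocks`, `try_queue`, and `MAX_QUEUED_BLOCKS_PER_HEIGHT` are hypothetical names; the limit of 2 is the value from the first bullet, which a later bullet ties to the fanout.

```rust
use std::collections::BTreeMap;

/// Assumed limit, taken from the first bullet of this commit message.
const MAX_QUEUED_BLOCKS_PER_HEIGHT: usize = 2;

/// Hypothetical queue of unverified block hashes, keyed by height.
#[derive(Default)]
struct QueuedBlocks {
    by_height: BTreeMap<u32, Vec<[u8; 32]>>,
}

impl QueuedBlocks {
    /// Queue a block, or return `false` if this height is already full.
    fn try_queue(&mut self, height: u32, hash: [u8; 32]) -> bool {
        // Most heights only ever see one block, so start with room for one.
        let blocks = self
            .by_height
            .entry(height)
            .or_insert_with(|| Vec::with_capacity(1));
        if blocks.len() >= MAX_QUEUED_BLOCKS_PER_HEIGHT {
            // Rejecting here bounds the memory a peer can make us hold.
            return false;
        }
        // Grow by exactly one slot rather than doubling the allocation.
        blocks.reserve_exact(1);
        blocks.push(hash);
        true
    }
}
```

The bound is per height, so a peer that floods one height with bad blocks cannot stop blocks at other heights from being queued.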
2020-07-15 13:27:10 -07:00
teor 78459afe97 fix: Stop revhex on EOF 2020-07-15 19:19:02 +10:00
teor 12b9fa8ae2
Let zebrad revhex read from stdin (#648)
* Log at warn level for commands that use stdout
* Let zebrad revhex read from stdin

Most unix tools support reading from stdin, so they can be used in
pipelines.

Part of #564.
2020-07-15 16:16:07 +10:00
teor 8b5ec155f0
Consensus refactor (#629)
* Flatten consensus::verify::* to consensus::*
* Move consensus::*::tests into their own files
* Move CheckpointList into its own file
* Move Progress and Target into a types module

QueuedBlock and QueuedBlockList can stay in checkpoint.rs, because
they are tightly coupled to CheckpointVerifier.
2020-07-10 16:51:01 +10:00
Henry de Valence ff4e722cd7 sync: touch up tracing output. 2020-07-09 11:15:06 -07:00
Dimitris Apostolou ba81d7d4c0 Fix typos 2020-07-07 11:13:49 -07:00
Jane Lusby 51f6ce86ff
Implement retry policy for syncer (#551) 2020-07-01 13:35:01 -07:00
Jane Lusby 7245d91fe9
fix block downloading to be parallelized and committed via the verifier (#540) 2020-06-30 09:42:09 -07:00
Henry de Valence 21bf913b48 Revert "correctly trim and download tips (#531)"
This reverts commit e102bd5e34.
2020-06-24 12:24:37 -07:00
Jane Lusby e102bd5e34
correctly trim and download tips (#531)
* also download tips and filter tips

* dispatch all block downloads together

* tweak to match Henry's changes

* switch to more intuitive match

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-24 15:19:34 -04:00
Alfredo Garcia 67718898c5
add usage help to generated config (#527) 2020-06-23 11:56:00 -07:00
Henry de Valence a453edd91c Put type definitions back at the bottom of the file. 2020-06-23 10:16:27 -07:00
Henry de Valence 18eb212d8e Set the new tips to be the last, not first, hash. 2020-06-23 10:16:27 -07:00
Jane Lusby 1c42b66a4f
Implement sync component for start subcommand (#506) 2020-06-22 19:24:53 -07:00
Jane Lusby 246e7cd2a9
Start testing out new version of `eyre` and `color-eyre` in zebra (#526)
* port to new version of eyre without generics

* correctly setup color_eyre hooks

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-22 15:36:23 -07:00
Deirdre Connolly 05316dee21 Listen on 0.0.0.0, not 127.0.0.1
Turns out when your node faces the internet directly, it has to listen
to those addresses directly.
2020-06-19 03:46:09 -04:00
Henry de Valence 6cc1627a5d zebrad: apply serde(default) to config sections
Each subsection has to have `serde(default)` to get the behaviour we want
(delete all fields except the ones that have been changed); otherwise, we can
delete only entire sections.
2020-06-18 17:43:36 -04:00
Henry de Valence 4b8f07ebb2 zebrad: Add reference to config docs. 2020-06-18 17:43:36 -04:00
Alfredo Garcia b8f174ee3a change config module to generate 2020-06-18 12:44:02 -07:00
Jane Lusby 7f8a336b69 switch to on_disk state service for start cmd 2020-06-17 23:30:50 -07:00
Jane Lusby df18ac72c5 fix sharedpeererror to propagate tracing context 2020-06-17 14:38:26 -07:00
Jane Lusby 06fd3b2503 be more explicit with pattern in drain_requests 2020-06-16 12:04:45 -07:00
Jane Lusby b0ecd019b6 apply comments from code review 2020-06-16 12:04:45 -07:00
Jane Lusby d09c339dc5 little more cleaning 2020-06-16 12:04:45 -07:00
Jane Lusby 528fd2b5b1 add an outline of the structure of the node 2020-06-16 12:04:45 -07:00
Jane Lusby fc96a41b18 copy connect command into start command 2020-06-16 12:04:45 -07:00
Jane Lusby df656a8bf0
Reorganize `connect` subcommand for readability (#450) 2020-06-12 09:20:58 -07:00
Jane Lusby 431f194c0f
propagate errors out of zebra_network::init (#435)
Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service.

This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.
2020-06-09 12:24:28 -07:00
Deirdre Connolly 42cc55b0bb Remove testing tokio task
That fires 'GetPeers' requests at our running 'zebra seed'.
2020-06-08 19:26:23 -04:00
Deirdre Connolly 43b77b080e Fix 'dos' feature for seed command, and Buffer the seed service 2020-06-08 19:26:23 -04:00
Deirdre Connolly 8f5e7c268b Request::Peers not GetPeers 2020-06-08 19:26:23 -04:00
Jane Lusby 9bcda0f9c7 Wrap Blocks in Arc throughout codebase 2020-06-05 00:36:55 -04:00
Jane Lusby 18b4dbc16c
fix tracing configuration issues (#432) 2020-06-04 19:34:06 -07:00
Jane Lusby e9af80b875
Add initial version of zebra-state (#414)
* rename zebra-storage to zebra-state

* Setup initial skeleton for zebra-state

* add test

* Apply suggestions from code review

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* move shared test vectors to a common crate

Co-authored-by: Jane Lusby <jane@zfnd.org>
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
2020-06-02 16:16:17 -07:00
Jane Lusby da72c5a86a
switch from abscissa::Context to color-eyre (#409)
Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-05-28 23:01:24 -04:00
Jane Lusby 8c178c3ee4
fix panic in seed subcommand (#401)
Co-authored-by: Jane Lusby <jane@zfnd.org>

Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs.

First, zebrad was not configured to treat panics as non-recoverable and instead used the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application: we do not need to maximize uptime or minimize the extent of an outage should one of our tasks or services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections.

The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.
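
As a minimal illustration of the second fix, collecting configured addresses into a `HashSet` drops duplicates before any connection is attempted. `unique_peers` is a hypothetical helper, not the function changed in this commit, and silently skipping unparseable entries is an assumption made to keep the sketch short.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Parse configured peer addresses, keeping each address at most once.
fn unique_peers(configured: &[&str]) -> HashSet<SocketAddr> {
    configured
        .iter()
        // Entries that fail to parse are skipped in this sketch.
        .filter_map(|s| s.parse::<SocketAddr>().ok())
        // Collecting into a set discards repeated addresses.
        .collect()
}
```

Passing the same address twice yields a set with one entry, so code that iterates the set never sees the duplicate that triggered the panic.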
2020-05-27 17:40:12 -07:00
Jane Lusby b6b35364f3 cleanup warnings throughout codebase 2020-05-27 15:42:29 -04:00
Henry de Valence dd8ba287bf Correct block version parsing. 2020-03-18 21:34:02 -04:00
Henry de Valence 81500dfe11 Add Zebra logotype. 2020-02-26 21:25:35 -08:00
Henry de Valence cd6deea7e1 Clarify that it's the ZF discord and that it's engineering-focused. 2020-02-26 21:25:35 -08:00
Henry de Valence ff3efd504c Add Zebra logo to all workspace crates.
Also add html_root_url attributes.
2020-02-26 21:25:35 -08:00
Henry de Valence f98cda40f9 Remove unused import. 2020-02-21 06:48:25 -05:00
Henry de Valence 9c357eaf1e Use retries for FindBlocks requests. 2020-02-21 06:48:25 -05:00
Henry de Valence b951f13f06 Add a `revhex` utility command to reverse endianness.
This makes it easier to translate block hashes output by our debug logs into
the format used by other tools.
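
A minimal sketch of the byte reversal such a command performs, assuming the input is a plain hex string. `reverse_hex` is a hypothetical function written for illustration, not the code behind `revhex`.

```rust
/// Reverse the byte order of a hex string, two hex digits at a time.
/// Returns `None` if the input has odd length or contains non-hex characters.
fn reverse_hex(input: &str) -> Option<String> {
    if input.len() % 2 != 0 || !input.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Each byte is two hex digits, so reverse the pairs, not the characters.
    let pairs: Vec<&str> = input
        .as_bytes()
        .chunks(2)
        .rev()
        .map(|pair| std::str::from_utf8(pair).expect("ASCII hex is valid UTF-8"))
        .collect();
    Some(pairs.concat())
}
```

Reversing pairs rather than single characters is the important detail: `0a1b2c` becomes `2c1b0a`, not `c2b1a0`.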
2020-02-21 06:48:25 -05:00
Henry de Valence afa2c2347f fmt 2020-02-21 06:48:25 -05:00
Henry de Valence 8bff6ada6c Prevent a crash serializing configs. 2020-02-14 20:14:05 -05:00
Henry de Valence 75d3d44fb3 Metrics MVP: add two metrics and export them to Prometheus.
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2020-02-14 20:14:05 -05:00
Henry de Valence b443d7a4be Fix clippy lints. 2020-02-14 18:23:41 -05:00
Henry de Valence 7ba007f23d Exercise network functionality by downloading lots of blocks.
(Don't check any information about them, just blindly download).
2020-02-14 18:23:41 -05:00
Henry de Valence 7049f9d891 Add a FindBlocks request to get initial block hashes.
Bitcoin does this either with `getblocks` (returns up to 500 following block
hashes) or `getheaders` (returns up to 2000 following block headers, not
just hashes).  However, Bitcoin headers are much smaller than Zcash
headers, which contain a giant Equihash solution block, and many Zcash
blocks don't have many transactions in them, so the block header is
often similarly sized to the block itself.  Because we're
aiming to have a highly parallel network layer, it seems better to use
`getblocks` to implement `FindBlocks` (which is necessarily sequential)
and parallelize the processing of the block downloads.
2020-02-14 18:23:41 -05:00
Henry de Valence 3c9b5612f3 Update zebrad docs and README. 2020-02-12 12:50:55 -08:00
Henry de Valence 29f901add3 Rename Response::Ok to Response::Nil.
This is a better name because it signals "no data in response" rather
than "Ok", which is semantically mixed with `Ok/Err` of `Result`.
2020-02-10 09:03:56 -08:00
Henry de Valence 2c0f48b587 Refactor connection logic and try a block request.
Attempting to implement requests for block data revealed a problem with
the previous connection logic.  Block data is requested by sending a
`getdata` message with hashes of the requested blocks; the peer responds
with a sequence of `block` messages with the blocks themselves.

However, this wasn't possible to handle with the previous connection
logic, which could only convert a single Bitcoin message into a
Response.  Instead, we factor out the message handling logic into a
Handler, which can statefully accumulate arbitrary data into a Response
and signal completion.  This is still pretty ugly but it does work.

As a side effect, the HeartbeatNonceMismatch error is removed; because
the Handler now tries to process messages until it comes to a Response,
it just ignores mismatched nonces (and will eventually time out).
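
The idea of a stateful handler can be sketched as follows. All names here (`Message`, `BlockHandler`, `process`) are hypothetical, blocks are reduced to integer ids, and the real Handler covers more request types than this.

```rust
use std::collections::HashSet;

/// Hypothetical, heavily simplified peer messages.
enum Message {
    Block(u32),
    Pong(u64),
}

/// Accumulates `block` messages until every requested block has arrived.
struct BlockHandler {
    pending: HashSet<u32>,
    received: Vec<u32>,
}

impl BlockHandler {
    fn new(requested: &[u32]) -> Self {
        Self {
            pending: requested.iter().copied().collect(),
            received: Vec::new(),
        }
    }

    /// Feed one message; returns the full response once it is complete.
    fn process(&mut self, msg: Message) -> Option<Vec<u32>> {
        match msg {
            // Keep only blocks that were actually requested.
            Message::Block(id) => {
                if self.pending.remove(&id) {
                    self.received.push(id);
                }
            }
            // Unrelated messages are ignored instead of raising an error.
            Message::Pong(_) => {}
        }
        if self.pending.is_empty() {
            Some(std::mem::take(&mut self.received))
        } else {
            None
        }
    }
}
```

The caller keeps feeding messages until `process` returns `Some`, or gives up on a timeout, which mirrors how mismatched nonces are now simply skipped.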

The previous Mempool and Transaction requests were removed but could be
re-added in a different form later.  Also, the `Get` prefixes are
removed from `Request` to tidy the name.
2020-02-10 09:03:56 -08:00
Henry de Valence f04f4f0b98 Apply clippy fixes 2020-02-05 12:42:32 -08:00
Deirdre Connolly 08012f058a cargo fmt 2020-01-28 03:48:23 -05:00
Henry de Valence 4fcb550aa6 Fix a deadlock in TokioComponent.
The components are accessed through a lock on application state.  When a command
called block_on to enter an async context, it obtained a write lock on the
entire application state.  This meant that if the application state were
accessed later in an async context, a deadlock would occur.  Instead the
TokioComponent holds an Option<Runtime> now, so that before calling block_on,
the caller can .take() the runtime and release the lock.  Since we only ever
enter an async context once, it's not a problem that the component is then
missing its runtime, as once we are inside of a task we can access the runtime.
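
The `.take()` pattern can be shown with the standard library alone. In this sketch `Runtime` is a stand-in type rather than tokio's runtime, and `AppState` and `run` are hypothetical names.

```rust
use std::sync::RwLock;

/// Stand-in for an async runtime (not tokio's `Runtime`).
struct Runtime;

impl Runtime {
    /// Runs the given work to completion, like a simplified `block_on`.
    fn block_on<T>(&self, work: impl FnOnce() -> T) -> T {
        work()
    }
}

/// Hypothetical application state guarded by a lock.
struct AppState {
    runtime: Option<Runtime>,
    network: &'static str,
}

fn run(state: &RwLock<AppState>) -> String {
    // Hold the write lock only long enough to move the runtime out.
    let runtime = state
        .write()
        .unwrap()
        .runtime
        .take()
        .expect("the runtime is taken only once");
    // The write guard was a temporary and is already dropped here, so the
    // work below can lock the state again without deadlocking.
    runtime.block_on(|| state.read().unwrap().network.to_string())
}
```

Had `run` kept the write guard alive across `block_on`, the `read()` inside the closure would be a second lock attempt while the write lock is still held, which is the situation the commit removes.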
2020-01-15 12:06:31 -08:00