921 Commits

Author SHA1 Message Date
teor 4ce6fbccc4
Fix new clippy lints in clippy nightly (#3176) 2021-12-09 14:19:14 +00:00
Janito Vaqueiro Ferreira Filho 0ad89f2f41
Disconnect from outdated peers on network upgrade (#3108)
* Replace usage of `discover::Change` with a tuple

Remove the assumption that a `Remove` variant would never be created
with type changes that allow the compiler to guarantee that assumption.

* Add a `version` field to the `Client` type

Keep track of the peer's reported protocol version.

* Create `LoadTrackedClient` type

A `peer::Client` type wrapper that implements `Load`. This helps with
the creation of a client service that has extra peer information to be
accessed without having to send requests.

* Use `LoadTrackedClient` in `initialize`

Ensure that `PeerSet` receives `LoadTrackedClient`s so that it will be
able to query the peer's protocol version later on.

* Require `LoadTrackedClient` in `PeerSet`

Replace the generic type with a concrete `LoadTrackedClient` so that we
can query its version.

* Create `MinimumPeerVersion` helper type

A type to track the current minimum protocol version for connected
peers based on the current block height.

* Use `MinimumPeerVersion` in handshakes

Keep the code to obtain the current minimum peer protocol version in a
central place.

* Add a `MinimumPeerVersion` instance to `PeerSet`

Prepare it to be able to disconnect from outdated peers based on the
current minimum supported peer protocol version.

* Disconnect from ready services for outdated peers

When the minimum peer protocol version is detected to have changed
(because of a network upgrade), remove all ready services of peers that
became outdated.

* Cancel added unready services of outdated peers

Only add an unready service if it's for a peer that has a supported
protocol version. Otherwise, add it but drop the cancel handle so that
the `UnreadyService` can execute and detect that it was cancelled.

* Avoid adding ready services for outdated peers

If a service becomes ready but it's for a connection to an outdated
peer, drop it.

* Improve comment inside `crawl_and_dial`

Describe an edge case that is also handled but was not explicit.

Co-authored-by: teor <teor@riseup.net>

* Test if calculated minimum peer version is correct

Given an arbitrary best chain tip height, check that the calculated
minimum peer protocol version is the expected value.

* Test if minimum version changes with chain tip

Apply an arbitrary list of chain tip height updates and check that for
each update the minimum peer version is calculated correctly.

* Test minimum peer version changed reports

Simulate a series of best chain tip height updates, and check for
minimum peer version updates at least once between them. Changes should
only be reported once.

* Create a `MockedClientHandle` helper type

Used to create and then track a mock `Client` instance.

* Add `MinimumPeerVersion::with_mock_chain_tip`

An extension method useful for tests, that contains some shared
boilerplate code.

* Bias arbitrary `Version`s to be in valid range

Give a 50% chance for an arbitrary `Version` to be in the range of
values previously used by the Zcash network.

* Create a `PeerVersions` helper type

Helps with the creation of mocked client services with arbitrary
protocol versions.

* Create a `PeerSetGuard` helper type

An auxiliary type to a `PeerSet` instance created for testing. It keeps
track of any dummy endpoints of channels created and passed to the
`PeerSet` instance.

* Create a `PeerSetBuilder` helper type

Helps to reduce the code when preparing a `PeerSet` test instance.

* Test if outdated peers are rejected by `PeerSet`

Simulate a set of discovered peers being sent to the `PeerSet`. Ensure
that only up-to-date peers are kept by the `PeerSet` and that outdated
peers are dropped.

* Create `BlockHeightPairAcrossNetworkUpgrades` type

A helper type that allows the creation of arbitrary block height pairs,
where one value is before and the other is at or after the activation
height of an arbitrary network upgrade.

* Test if peers are dropped as they become outdated

Simulate a network upgrade, and check that peers that become outdated
are dropped by the `PeerSet`.

* Remove dbg! macros

Co-authored-by: teor <teor@riseup.net>
2021-12-09 02:54:29 +00:00
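
The version check described in the commit above (#3108) can be sketched as follows. This is a minimal illustration, not Zebra's implementation: `Version`, `UPGRADES`, `MinimumVersionTracker`, and `drop_outdated` are hypothetical names, and the heights and version numbers are placeholders rather than real consensus parameters.

```rust
use std::collections::HashMap;

/// Hypothetical protocol version number, standing in for Zebra's `Version` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Version(u32);

/// Hypothetical (activation height, minimum version) pairs, sorted by height.
/// The numbers are placeholders, not real network upgrade parameters.
const UPGRADES: &[(u32, Version)] = &[
    (0, Version(100)),
    (1_000, Version(200)),
    (2_000, Version(300)),
];

/// Returns the minimum peer version required at `height`.
fn minimum_version_at(height: u32) -> Version {
    UPGRADES
        .iter()
        .rev()
        .find(|(activation, _)| height >= *activation)
        .map(|(_, version)| *version)
        .expect("the table has an entry for height 0")
}

/// Tracks the current minimum version and reports each change only once.
struct MinimumVersionTracker {
    current: Version,
}

impl MinimumVersionTracker {
    fn new(height: u32) -> Self {
        MinimumVersionTracker {
            current: minimum_version_at(height),
        }
    }

    /// Returns `Some(new_minimum)` only when the minimum changed at `height`.
    fn update(&mut self, height: u32) -> Option<Version> {
        let new = minimum_version_at(height);
        if new != self.current {
            self.current = new;
            Some(new)
        } else {
            None
        }
    }
}

/// Drops every peer whose reported version is below `minimum`.
fn drop_outdated(peers: &mut HashMap<String, Version>, minimum: Version) {
    peers.retain(|_, version| *version >= minimum);
}
```

A caller would feed each new chain tip height to `update` and call `drop_outdated` only when it returns `Some`, so the peer map is scanned once per network upgrade rather than once per block.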
teor c55753d5bd
Add debug-level Zebra network message tracing (#3170)
* Add debug-level Zebra network message tracing

* Delete redundant spaces

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-12-09 01:09:23 +00:00
Janito Vaqueiro Ferreira Filho 1f756fcc81
Add `zebra_test::init_async` helper function (#3169)
* Use a single-thread shared Tokio runtime

This allows it to pause the time and more closely resembles the
environment that's set by default for asynchronous tests.

* Add a `zebra_test::init_async` helper function

Calls `zebra_test::init` but also constructs a single-thread Tokio
runtime and returns it. This makes it simpler to initialize asynchronous
tests that can't use the `#[tokio::test]` attribute.

* Replace usages of `Runtime::new` in tests

Use the new `zebra_test::init_async()` helper function instead.

* Replace `runtime::Builder::new_current_thread()`

Use the new `zebra_test::init_async()` helper function instead.

* Replace `runtime::Builder::new_multi_thread()`

Use the new `zebra_test::init_async()` helper function instead. The test
with the change doesn't necessarily have to use a multi-thread runtime.
2021-12-09 00:18:17 +00:00
teor 332afc17d5
Security: Limit address book size to limit memory usage (#3162)
* Refactor the address response limit

* Limit the number of peers in the address book

* Allow changing the address book limit in tests

* Add tests for the address book length limit

* rustfmt
2021-12-06 16:09:10 -03:00
teor 4d608d3224
Stop doing thousands of time checks each time we connect to a peer (#3106)
* Stop checking the entire AddressBook for each connection attempt

* Stop redundant peer time checks within the address book

* Stop calling `Instant::now` 3 times for each address book update

* Only get the time once each time an address book method is called

* Update outdated comment

* Use an OrderedMap to efficiently store address book peers

* Add address book order tests
2021-12-03 15:09:43 -03:00
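
The "only get the time once" change in the commit above (#3106) can be illustrated with a small sketch. The `Peer` type and `count_ready` function are hypothetical stand-ins; the point is only that the clock is read once per call and the value is passed down to every per-peer check.

```rust
use std::time::{Duration, Instant};

/// Hypothetical address book entry; only the field needed for the sketch.
struct Peer {
    last_attempt: Option<Instant>,
}

impl Peer {
    /// Takes `now` as a parameter instead of calling `Instant::now()` itself.
    fn is_ready(&self, now: Instant, min_interval: Duration) -> bool {
        match self.last_attempt {
            None => true,
            Some(attempt) => now.saturating_duration_since(attempt) >= min_interval,
        }
    }
}

/// Counts peers that are ready for a new attempt, reading the clock once.
fn count_ready(peers: &[Peer], min_interval: Duration) -> usize {
    let now = Instant::now(); // single clock read for the whole scan
    peers
        .iter()
        .filter(|peer| peer.is_ready(now, min_interval))
        .count()
}
```

Besides saving clock reads, using one timestamp means every peer in a scan is judged against the same instant, which keeps the results consistent with each other.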
teor 022808d028
Release Zebra v1.0.0-beta.2 (#3132)
Zebra's latest beta continues implementing zero-knowledge proof and note commitment tree validation. In this release, we have finished implementing transaction header, transaction amount, and Zebra-specific NU5 validation. (NU5 mainnet validation is waiting on an `orchard` crate update, and some consensus parameter updates.)

We also fix a number of security issues that could pose a local denial of service risk, or make it easier for an attacker to make a node follow a false chain.

As of this release, Zebra will automatically download and cache the Sprout and Sapling Groth16 circuit parameters. The cache uses around 1 GB of disk space. These cached parameters are shared across all Zebra and `zcashd` instances run by the same user.

See CHANGELOG.md for the full list of changes in this release.
2021-12-03 06:54:14 +10:00
teor a92c431c03
Ignore NotFound errors in the syncer (#3131) 2021-12-02 11:28:20 -03:00
teor ab471b0db0
Revert "Stop returning NotFound errors, use the response instead" (#3124)
* Revert "Stop returning NotFound errors, use the response instead"

This reverts commit 45871f6915.

* Fix clippy warnings

* Downgrade a frequent log to debug level
2021-12-01 05:09:54 +00:00
teor c85ea18b43
Fix slow Zebra startup times, to reduce CI failures (#3104)
* Tweak a log message

* Only retry failed DNS once, then use the other DNS responses

* Limit broadcasts to half the peers

* Use a longer minimum interval for GetAddr requests

* Reduce the syncer and mempool crawler fanouts

* Stop resetting the mempool twice when it starts up

This spawns two crawlers, which send two fanouts,
so it can use up a lot of peers.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-30 21:04:32 +00:00
teor a358c410f5
Stop closing connections on unexpected messages, Credit: Equilibrium (#3120)
* Ignore unsupported messages from peers

* Ignore unknown message commands from peers

* Implement Display for Request, Response, Handler, connection::State

* Stop ignoring some completed `Response`s

* Stop returning NotFound errors, use the response instead

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-30 19:26:17 +00:00
teor f6abb15778
Security: Stop routing inventory requests by peer address (#3090)
* Rewrite PeerSet comments to split long sentences

* Replace peer set integer indexes with address-based indexes

Also improve documentation and logging.

* Security: Stop using peer addresses to choose inventory routing order

* Minor doc and code cleanups

* Stop re-using a drained HashSet

* Replace used `_cancel` with `cancel`

* Reword a comment

* Replace cloned with copied
2021-11-24 10:31:42 +10:00
teor b39f4ca5aa
Shut down channels and tasks on PeerSet Drop (#3078)
* Shut down channels and tasks on PeerSet Drop

* Document all the PeerSet fields

* Close the peer set background task handle on shutdown

* Receive background tasks during shutdown

Also, split receiving and polling background tasks into separate methods.
2021-11-22 22:29:34 -03:00
dependabot[bot] 1d14032b10
Bump tower from 0.4.10 to 0.4.11 (#3081)
Bumps [tower](https://github.com/tower-rs/tower) from 0.4.10 to 0.4.11.
- [Release notes](https://github.com/tower-rs/tower/releases)
- [Commits](https://github.com/tower-rs/tower/compare/tower-0.4.10...tower-0.4.11)

---
updated-dependencies:
- dependency-name: tower
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 06:26:39 +10:00
Pili Guerra 26b3a50e01
Updates for zebra v1.0.0-beta.1 release (#3073)
* Update versions for zebra v1.0.0-beta.1 release

* Adding original PR list for comparison and tracking as PRs merge

* First pass at categorising changes

* Merge and clarify description of related changes

* Remove or merge trivial changes

* Improve change descriptions

* Add new PRs merged

* CHANGELOG: Improve release summary

* CHANGELOG: categorise changes further

* README: Remove resolved issues and items

* Update CHANGELOG.md

Co-authored-by: teor <teor@riseup.net>

* CHANGELOG: Add new PRs merged

* CHANGELOG: Move change category

* CHANGELOG: Update release date ready for tagging

Co-authored-by: teor <teor@riseup.net>
2021-11-19 13:05:11 +01:00
teor 3fc049e2eb
Implement graceful shutdown for the peer set (#3071)
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-18 13:28:25 +00:00
teor c4118dcc2c
Check for panics in the address book updater task (#3064)
* Check for panics in the address book updater task

* Fix the return type and tests

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-11-18 12:34:51 +00:00
dependabot[bot] b33ffc9df8
Bump tokio from 1.13.0 to 1.14.0 (#3062)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.13.0 to 1.14.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-17 09:27:01 +10:00
teor 7457edcb86
Stop asking users to report peer errors, fix a common peer error (#3054)
* Stop treating inv with mixed item types as a connection error

* Remove unused connection errors

* Stop asking users to create bug reports for peer errors
2021-11-15 11:32:18 -03:00
Dimitris Apostolou afb8b3d477
Fix typos (#3055)
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-12 19:30:22 +00:00
teor d6f3b3dc9a
Parse received addrv2 messages (#3022)
* Revert "Remove commented-out code"

This reverts commit 9e69777925f103ee11e5940bba95b896c828839b.

* Implement deserialization for `addrv2` messages

* Limit addr and addrv2 messages to MAX_ADDRS_IN_MESSAGE

* Clarify address version comments

* Minor cleanups and fixes

* Add preallocation tests for AddrV2

* Add serialization tests for AddrV2

* Use prop_assert in AddrV2 proptests

* Use a generic utility method for deserializing IP addresses in `addrv2`

* Document the purpose of a conversion to MetaAddr

* Fix a comment typo, and clarify that comment

* Clarify the unsupported AddrV2 network ID error and enum variant names

```sh
fastmod AddrV2UnimplementedError UnsupportedAddrV2NetworkIdError zebra-network
fastmod Unimplemented Unsupported zebra-network
```

* Fix and clarify unsupported AddrV2 comments

* Replace `panic!` with `unreachable!`

* Clarify a comment about skipping a length check in a test

* Remove a redundant test

* Basic addr (v1) and addrv2 deserialization tests

* Test deserialized IPv4 and IPv6 values in addr messages

* Remove redundant io::Cursor

* Add comments with expected values of address test vectors
2021-11-12 00:25:23 +00:00
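
The generic IP address helper mentioned in the commit above (#3022) might look roughly like this. It is a sketch under assumptions: the names `AddrV2Error`, `fixed_bytes`, and `parse_ip` are hypothetical, the network IDs follow the BIP 155 numbering (0x01 for IPv4, 0x02 for IPv6), and the length prefix and other fields of a real `addrv2` entry are left out.

```rust
use std::convert::TryFrom;
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
enum AddrV2Error {
    UnsupportedNetworkId(u8),
    WrongLength { expected: usize, actual: usize },
}

/// Converts a byte slice into a fixed-size array, checking its length.
fn fixed_bytes<const N: usize>(bytes: &[u8]) -> Result<[u8; N], AddrV2Error> {
    <[u8; N]>::try_from(bytes).map_err(|_| AddrV2Error::WrongLength {
        expected: N,
        actual: bytes.len(),
    })
}

/// Parses the address bytes for a given network ID.
/// Only IPv4 and IPv6 are handled; other IDs are reported as unsupported.
fn parse_ip(network_id: u8, bytes: &[u8]) -> Result<IpAddr, AddrV2Error> {
    match network_id {
        0x01 => Ok(IpAddr::V4(Ipv4Addr::from(fixed_bytes::<4>(bytes)?))),
        0x02 => Ok(IpAddr::V6(Ipv6Addr::from(fixed_bytes::<16>(bytes)?))),
        other => Err(AddrV2Error::UnsupportedNetworkId(other)),
    }
}
```

Returning a distinct "unsupported" error, rather than a generic parse failure, lets the caller skip that entry without treating the whole message as malformed.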
Janito Vaqueiro Ferreira Filho 11b5a33651
Security: Avoid reconnecting to peers that are likely unreachable (#3030)
* Add a `Duration32::from_days` constructor

Make it simpler to construct a `Duration32` representing a certain
number of days.

* Add `MetaAddr::was_not_recently_seen` method

A helper method to check if a peer was never seen before or if it was
last seen a long time ago. This will be one of the conditions to
consider a peer as unreachable.

* Add `MetaAddr::is_probably_unreachable` method

A helper method to check if a peer should be considered unreachable. It
is considered unreachable if recent connection attempts have failed and
it was not recently seen.

If a peer is considered unreachable, Zebra shouldn't attempt to connect
to it again.

* Do not keep trying to connect to unreachable peer

A peer is probably unreachable if it was last seen a long time ago and
if its last connection attempt failed.

* Test `was_not_recently_seen`

Redo the calculation on arbitrary `MetaAddr`s.

* Test `is_probably_unreachable`

Redo the calculation on arbitrary `MetaAddr`s.

* Test if probably unreachable peers are ignored

Given an `AddressBook` with a list of arbitrary `MetaAddr`s, check that
none of the peers listed for a reconnection is probably unreachable.

* Rename unit test to improve clarity

Remove the double negative from the name.

Co-authored-by: teor <teor@riseup.net>

* Rename constant to `MAX_RECENT_PEER_AGE`

Make the purpose of the constant clearer.

Co-authored-by: teor <teor@riseup.net>

* Rename method to `last_seen_is_recent`

Remove the double negative from the name.

* Rename method to `is_probably_reachable`

Avoid having to negate the result of the method in a security-critical
filter.

* Move check into `is_ready_for_connection_attempt`

Make sure the check is used in any place that requires a peer that's
ready for a connection attempt.

* Improve test documentation

Describe the goal of the test better.

Co-authored-by: teor <teor@riseup.net>

* Improve `is_probably_reachable` documentation

List the conditions as bullet points.

Co-authored-by: teor <teor@riseup.net>

* Document what happens when peers have no last seen time

Co-authored-by: teor <teor@riseup.net>
2021-11-10 23:51:22 +00:00
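
The reachability rule from the commit above (#3030) can be sketched as below. The method names `last_seen_is_recent` and `is_probably_reachable` come from the commit message; everything else is an assumption for illustration: the `PeerRecord` and `LastAttempt` types, the use of plain seconds, the three-day value for the age limit, and the choice to treat a peer with no last seen time as not recently seen.

```rust
/// Hypothetical limit in seconds. The real value of `MAX_RECENT_PEER_AGE`
/// is not stated in the commit message.
const MAX_RECENT_PEER_AGE_SECS: u64 = 3 * 24 * 60 * 60;

/// Hypothetical outcome of the most recent connection attempt.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LastAttempt {
    NeverAttempted,
    Succeeded,
    Failed,
}

/// Hypothetical stand-in for a `MetaAddr` entry.
struct PeerRecord {
    /// Time in seconds when the peer was last seen, if ever.
    last_seen_secs: Option<u64>,
    last_attempt: LastAttempt,
}

impl PeerRecord {
    /// Assumption: a peer with no last seen time is not recently seen.
    fn last_seen_is_recent(&self, now_secs: u64) -> bool {
        match self.last_seen_secs {
            Some(seen) => now_secs.saturating_sub(seen) <= MAX_RECENT_PEER_AGE_SECS,
            None => false,
        }
    }

    /// A peer is probably unreachable only if its last attempt failed
    /// AND it was not recently seen; otherwise it is worth trying.
    fn is_probably_reachable(&self, now_secs: u64) -> bool {
        self.last_attempt != LastAttempt::Failed || self.last_seen_is_recent(now_secs)
    }
}
```

Phrasing the check positively, as the later commits in the list do, lets a filter read `peers.filter(|p| p.is_probably_reachable(now))` with no negation to get wrong.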
teor c0c00b3f0d
Simplify preallocate tests (#3032)
* Simplify preallocation tests using a test function

* Use prop_assert in proptests
2021-11-11 07:53:21 +10:00
teor 85b016756d
Refactor addr v1 serialization using a separate AddrV1 type (#3021)
* Implement addr v1 serialization using a separate AddrV1 type

* Remove commented-out code

* Split the address serialization code into modules

* Reorder v1 and in_version fields in serialization order

* Fix a missed search-and-replace

* Explain conversion to MetaAddr

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-10 06:47:50 +10:00
teor af2baa0a5e
Avoid listener address conflicts in network tests (#3031)
These conflicts can make some tests fail if they run in parallel.
Failures are more likely on machines with lots of cores.
2021-11-08 11:20:13 -03:00
teor 7d8240fac3
Fix verbose add_initial_peers logs (#3019)
And update some function docs.
2021-11-07 22:21:51 +00:00
teor 88cdf0f832
Move MetaAddr serialization into zebra_network::protocol::external (#3020)
Preparation for `addrv2` serialization.
2021-11-07 21:37:38 +00:00
Marek d03161c63f
Add unused seed peers to the AddressBook (#2974)
* Add unused seed peers to the AddressBook

* Document a new `await`

We added an extra await on the AddressBook thread mutex.

Co-authored-by: teor <teor@riseup.net>

* Fix a typo

* Refactor names

* Return early from `limit_initial_peers`

* Add `proptest`s regressions

* Return `MetaAddr` instead of `None`

* Test if `zebra_network::init()` deadlocks

* Remove unneeded regressions

* Rename `TimestampCollector` to `AddressBookUpdater` (#2992)

* Rename `TimestampCollector` to `AddressBookUpdater`

* Update comments

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Move `all_peers` instead of copying them

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Make `Duration` a const

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Use a timeout instead of measuring the elapsed time

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

* Copy `initial_peers` instead of moving them

* Refactor the position of `NewInitial` and `new_initial`

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-11-04 08:34:00 -03:00
Janito Vaqueiro Ferreira Filho 0960e4fb0b
Update to Tokio 1.13.0 (#2994)
* Update `tower` to version `0.4.9`

Update to latest version to add support for Tokio version 1.

* Replace usage of `ServiceExt::ready_and`

It was deprecated in favor of `ServiceExt::ready`.

* Update Tokio dependency to version `1.13.0`

This will break the build because the code isn't ready for the update,
but future commits will fix the issues.

* Replace import of `tokio::stream::StreamExt`

Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.

* Use `IntervalStream` in `zebra-network`

In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.

* Use `IntervalStream` in `inventory_registry`

In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.

* Use `BroadcastStream` in `inventory_registry`

In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used
instead. This also requires changing the error type that is used.

* Handle `Semaphore::acquire` error in `tower-batch`

Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.

* Handle `Semaphore::acquire` error in `zebrad` test

On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.

* Update some `zebra-network` dependencies

Use versions compatible with Tokio version 1.

* Upgrade Hyper to version 0.14

Use a version that supports Tokio version 1.

* Update `metrics` dependency to version 0.17

And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.

* Use `f64` as the histogram data type

`u64` isn't supported as the histogram data type in newer versions of
`metrics`.

* Update the initialization of the metrics component

Make it compatible with the new version of `metrics`.

* Simplify build version counter

Remove all constants and use the new `metrics::increment_counter!` macro.

* Change metrics output line to match on

The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.

* Update `sentry` to version 0.23.0

Use a version compatible with Tokio version 1.

* Remove usage of `TracingIntegration`

This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.

* Add sentry layer to tracing initialization

This seems like the replacement for `TracingIntegration`.

* Remove unnecessary conversion

Suggested by a Clippy lint.

* Update Cargo lock file

Apply all of the updates to dependencies.

* Ban duplicate tokio dependencies

Also ban git sources for tokio dependencies.

* Stop allowing sentry-tracing git repository in `deny.toml`

* Allow remaining duplicates after the tokio upgrade

* Use C: drive for CI build output on Windows

GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.

Co-authored-by: teor <teor@riseup.net>
2021-11-02 18:46:57 +00:00
teor 938376f11f
Disable an unreliable test on macOS (#2997)
We keep the test active on other platforms,
because it passes on them.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-11-02 14:11:39 +00:00
Janito Vaqueiro Ferreira Filho a9f1c189d9
Make `services` field in `MetaAddr` optional (#2976)
* Use `prop_assert` instead of `assert`

Otherwise the test input isn't minimized.

* Split long string into a multi-line string

And add some newlines to try to improve readability.

* Fix referenced issue number

They had a typo in their number.

* Make peer services optional

It is unknown for initial peers.

* Fix `preserve_initial_untrusted_values` test

Now that it's optional, the services field can be written to if it was
previously empty.

* Fix formatting of property tests

Run rustfmt on them.

* Restore `TODO` comment

Make it easy to find planned improvements in the code.

Co-authored-by: teor <teor@riseup.net>

* Comment on how ordering is affected

Make it clear that missing services causes the peer to be chosen last.

Co-authored-by: teor <teor@riseup.net>

* Don't expect `services` to be available

Avoid a panic by using the compiler to help enforce the handling of the
case correctly.

* Panic if received gossiped address has no services

All received gossiped addresses have services. The only addresses that
don't have services configured are the initial seed addresses.

Co-authored-by: teor <teor@riseup.net>
2021-11-02 02:45:35 +00:00
Conrado Gouvea e54917ae7c
V1.0.0-beta.0 (#2973)
* V1.0.0-beta.0

* Bump version in install.md
2021-10-29 20:21:26 +00:00
Alfredo Garcia 07610feef3
Reduce outgoing peers demand (#2969)
* reduce demand

* use `saturating_sub`
2021-10-29 16:29:52 +00:00
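
The `saturating_sub` bullet in the commit above (#2969) refers to a standard integer method. The commit message does not say how demand is computed, so the function below is only a hypothetical example of why a saturating subtraction is the safe choice when the number of active peers can exceed the target.

```rust
/// Hypothetical helper: how many more outbound peers are wanted.
/// `saturating_sub` clamps at zero instead of underflowing when
/// `active_peers` is larger than `target_peers`.
fn remaining_demand(target_peers: usize, active_peers: usize) -> usize {
    target_peers.saturating_sub(active_peers)
}
```

With a plain `-`, the second case in a call such as `remaining_demand(10, 25)` would panic in debug builds and wrap around to a huge number in release builds.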
Alfredo Garcia 3402c1d8a2
Add user agent metrics (#2957)
* add remote peer user agent metrics

* add user agent to obsolete peers
2021-10-28 19:23:09 +00:00
teor f26a60b801
Limit the number of inbound peer connections (#2961)
* Limit open inbound connections based on the config

* Log inbound connection errors at debug level

* Test inbound connection limits

* Use clone directly in function call argument lists

* Remove an outdated comment

* Update tests to use an unbounded channel rather than mem::forget

And rename some variables.

* Use a lower limit in a slow test and require that it is exceeded
2021-10-28 01:49:31 +00:00
Conrado Gouvea 8d01750459
Rate-limit initial seed peer connections (#2943)
* Rate-limit initial seed peer connections

* Revert "Rate-limit initial seed peer connections"

This reverts commit f779a1eb9e.

* Simplify logic

* Avoid cooperative async task starvation in the peer crawler and listener

If we don't yield in these loops, they can run for a long time before
tokio forces them to yield.

* Add test

* Check for task panics in initial peers test

* Remove duplicate code in rebase

Co-authored-by: teor <teor@riseup.net>
2021-10-27 23:46:43 +00:00
teor 3e03d48799
Limit the number of outbound peer connections (#2944)
* Limit the number of outbound connections in the crawler

* Make zebra-network channel bounds depend on config.peerset_initial_target_size

* Bias Zebra towards outbound connections

And turn connection limits into `Config` methods.

* Downgrade some connection logs to debug

* Remove verbose or outdated fields in tracing logs

* Clarify connection limits

Includes:
- `fastmod OUTBOUND_PEER_BIAS_FRACTION OUTBOUND_PEER_BIAS_DENOMINATOR zebra*`
- clarify connection limit documentation

* Clarify inventory channel capacity

* Add zebra_network::initialize tests with limited numbers of peers

* Avoid cooperative async task starvation in the peer crawler and listener

If we don't yield in these loops, they can run for a long time before
tokio forces them to yield.

* Test the crawler with small connection limits

And use the multi-threaded runtime to avoid long hangs.

* Stop using the multi-threaded executor in tests where it's not needed

* Avoid starvation for every connection

Adds yields after inbound successes and initial peer connections.

* Add a crawler peer connection success test

* Add outbound connection limit tests

* Improve outbound tests
2021-10-27 21:28:51 +00:00
teor c2734f5661
Simplify calling `add_initial_peers` (#2945)
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-25 20:16:35 +00:00
Janito Vaqueiro Ferreira Filho 2a1d4281c5
Manually pin `Sleep` futures (#2914)
* Wrap `Sleep` timer in a `Pin<Box<_>>`

The `Sleep` type doesn't implement `Unpin` in newer versions of Tokio.

* Wrap `Sleep` type in a `Pin<Box<_>>`

In newer Tokio versions the `Sleep` type doesn't implement `Unpin`, so
it needs to be manually pinned.
2021-10-22 16:06:03 -03:00
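
The pinning change in the commit above (#2914) can be shown without Tokio. In this sketch `FakeSleep` is a hypothetical stand-in for a timer future that is not `Unpin`, and `RateLimited` is a hypothetical owner that stores it as `Pin<Box<_>>` so it can still be polled through a plain `&mut self`.

```rust
use std::cell::Cell;
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for a timer future that is `!Unpin`.
/// It becomes ready after a fixed number of polls.
struct FakeSleep {
    polls_left: Cell<u32>,
    _not_unpin: PhantomPinned,
}

impl FakeSleep {
    fn new(polls: u32) -> Self {
        FakeSleep {
            polls_left: Cell::new(polls),
            _not_unpin: PhantomPinned,
        }
    }
}

impl Future for FakeSleep {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let left = self.polls_left.get();
        if left == 0 {
            Poll::Ready(())
        } else {
            self.polls_left.set(left - 1);
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Owner of the timer. Boxing and pinning the timer keeps this struct
/// `Unpin`, so no pin projection is needed to poll the field.
struct RateLimited {
    timer: Pin<Box<FakeSleep>>,
}

impl RateLimited {
    fn poll_timer(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        self.timer.as_mut().poll(cx)
    }
}

/// Waker that does nothing, used only to drive the sketch by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWake))
}
```

The cost of this approach is one heap allocation per timer; the alternative used in #2915 further down the log is to avoid storing the timer at all.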
teor 67327ac462
Downgrade some less interesting info-level logs to debug (#2938)
There are a lot of these messages when Zebra starts up.
They might be slowing down CI and causing timeouts.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-10-22 02:11:09 +00:00
teor 424edfa4d9
Improve documentation and types in the PeerSet (#2925)
* Replace some unit tuples with named unit structs

This helps distinguish generic channels and make them type-safe.

Also tidy imports and documentation in `peer_set::set`.

* Link to the tower balance crate from docs

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-22 01:26:04 +00:00
Alfredo Garcia ad5f5ff24a
Rate limit the amount of inbound connections (#2928)
* add sleep to `accept_inbound_connections()`

* Expand docs

* Expand comments again

Co-authored-by: teor <teor@riseup.net>
2021-10-22 00:35:34 +00:00
Alfredo Garcia 2de93bba8e
Limit the number of initial peers (#2913)
* limit the number of initial peers

* Move more code out of zebra_network::initialize

* Always limit the number of initial peers in the Config

This way, we can never get the unused peers out.

* Revert "Always limit the number of initial peers in the Config"

This reverts commit 81ede597c8.

Actually, this doesn't work, because we want those extra peers.

* Minor tweaks

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Co-authored-by: teor <teor@riseup.net>
2021-10-21 23:04:46 +00:00
teor 4cdd12e2c4
Track the number of active inbound and outbound peer connections (#2912)
* Count the number of active inbound and outbound peer connections

And reduce the count when each connection fails.

* Fix a comment typo

Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-21 21:36:42 +00:00
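
One common way to keep a count of active connections that also shrinks when a connection fails, as the commit above (#2912) describes, is a guard that decrements on drop. This is a sketch of that general technique, not necessarily how Zebra does it; `ConnectionCounter` and `ConnectionGuard` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Shared counter of active connections (hypothetical type).
#[derive(Clone, Default)]
struct ConnectionCounter {
    active: Arc<AtomicUsize>,
}

impl ConnectionCounter {
    /// Registers a new connection and returns a guard for it.
    fn track(&self) -> ConnectionGuard {
        self.active.fetch_add(1, Ordering::SeqCst);
        ConnectionGuard {
            active: Arc::clone(&self.active),
        }
    }

    /// Current number of live guards.
    fn active_count(&self) -> usize {
        self.active.load(Ordering::SeqCst)
    }
}

/// Held by each connection; the count drops when the guard is dropped,
/// whether the connection closed normally, failed, or its task panicked.
struct ConnectionGuard {
    active: Arc<AtomicUsize>,
}

impl Drop for ConnectionGuard {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Separate `ConnectionCounter` values for inbound and outbound connections would give the two totals that the later limit commits (#2944, #2961) compare against their configured maximums.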
Janito Vaqueiro Ferreira Filho 39ed7d70d3
Use single thread Tokio runtime for tests (#2916)
Newer versions of Tokio panic if `tokio::time::pause()` is called from a
multi-thread executor, and `#[tokio::test]` defaults to a single thread
runtime, so it makes sense to always use a single thread runtime in all
tests.
2021-10-21 16:22:12 +00:00
Janito Vaqueiro Ferreira Filho 192a45ccf1
Refactor rate limiting to not store `Sleep` type (#2915)
In newer Tokio versions the `Sleep` type doesn't implement `Unpin`, so
it's a little more complicated to use it. In this case it was easier to
refactor the code to not store the `Sleep` type instead of wrapping it
in a `Pin` type.
2021-10-21 11:47:04 +00:00
Marek d2a5af0ea5
V1.0.0 alpha.19 (#2907)
* Increment the crates that have new commits since the last version

* Increment the crates that depend on crates that have changed

* Increment the version of `zebra-script`

* Use the `zebrad` version in the `zebra-network` user agent string

* Use the `v1.0.0-alpha.19` git tag in `README.md`

* Copy the draft changelog into `CHANGELOG.md`

* Delete bumps

* Update CHANGELOG.md

Co-authored-by: teor <teor@riseup.net>

* Add newly merged PRs

Co-authored-by: teor <teor@riseup.net>
2021-10-21 12:33:35 +02:00
teor 4b8b65a627
Avoid spurious acceptance test failures by decreasing the peer crawler timeout (#2905)
* Improve logging for initial peer connections

* Decrease the initial peer crawl timeout to make tests more reliable

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-19 15:29:03 +00:00
teor c8ad19080a
Improve logging for initial peer connections (#2896)
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
2021-10-18 18:43:12 +00:00
teor b64ed62777
Add a debug config that enables the mempool (#2862)
* Update some comments

* Add a mempool debug_enable_at_height config

* Rename a field in the mempool crawler

* Propagate syncer channel errors through the crawler

We don't want to ignore these errors, because they might indicate a shutdown.
(Or a bug that we should fix.)

* Use debug_enable_at_height in the mempool crawler

* Log when the mempool is activated or deactivated

* Deny unknown fields and apply defaults for all configs

* Move Duration last, as required for TOML tables

* Add a basic mempool acceptance test

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-13 15:04:49 +00:00
Alfredo Garcia 4280ef5003
Give more information to the user in the wrong port init warning (#2853)
* Update initialize.rs

* grammar

Co-authored-by: teor <teor@riseup.net>
2021-10-12 01:13:13 +00:00
Alfredo Garcia dcf281efff
make `INITIAL_MIN_NETWORK_PROTOCOL_VERSION` support testnet and mainnet (#2851) 2021-10-08 14:57:04 -03:00
Alfredo Garcia f1718f5c92
Add `zcash_serialized_size()` to `ZcashSerialize` trait (#2824)
* add a zcash_serialized_size()

* add a size field to `UnminedTx`

* refactor `zcash_serialized_size()` to avoid allocating RAM

* improve performance

Co-authored-by: teor <teor@riseup.net>

* clippy

Co-authored-by: teor <teor@riseup.net>
2021-10-06 22:40:11 +00:00
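
Computing a serialized size without allocating, as the commit above (#2824) sets out to do, is often done with a writer that counts bytes and discards them. The sketch below shows that technique under assumptions: `CountingWriter`, `SketchSerialize`, and `Example` are hypothetical, and the real `ZcashSerialize` trait may differ in its signatures.

```rust
use std::io::{self, Write};

/// A writer that counts bytes and throws them away, so no buffer is needed.
struct CountingWriter {
    count: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.count += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical serialization trait with a default size method.
trait SketchSerialize {
    fn serialize<W: Write>(&self, writer: W) -> io::Result<()>;

    /// Runs the normal serializer against the counting writer.
    fn serialized_size(&self) -> usize {
        let mut counter = CountingWriter { count: 0 };
        self.serialize(&mut counter)
            .expect("the counting writer never returns an error");
        counter.count
    }
}

/// Hypothetical type used to exercise the trait.
struct Example {
    id: u32,
    payload: Vec<u8>,
}

impl SketchSerialize for Example {
    fn serialize<W: Write>(&self, mut writer: W) -> io::Result<()> {
        writer.write_all(&self.id.to_le_bytes())?;
        writer.write_all(&self.payload)
    }
}
```

Because the size is derived from the same `serialize` code path, it cannot drift out of sync with the actual encoding, at the cost of doing the serialization work twice when both the size and the bytes are needed.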
Pili Guerra a85e80a0ec
Update versions for zebra v1.0.0-alpha.18 release (#2828)
* Update versions for zebra v1.0.0-alpha.18 release

* WIP: Initial PR list

* Remove uninteresting version bumps from CHANGELOG

* Categorise and group PRs in CHANGELOG, removing uninteresting PRs

* Further refine and categorise changelog entries

* Fix tag url

* Final changes to CHANGELOG

* Add a changelog description

* Spacing

* Clarify and fix changelog PR descriptions

* Add PRs that are about to be merged

* More slight clarifications

* Spacing

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-10-05 23:22:31 -03:00
Janito Vaqueiro Ferreira Filho 339fefb6e6
Update shared NU5 dependencies, set the NU5 testnet activation network upgrade parameters (#2825)
* Check return value of zcash_script_new_precomputed_tx

* Set the NU5 testnet activation height to 1_590_000

* Apply suggestions from code review

Co-authored-by: teor <teor@riseup.net>

* Update Nu5 constants to new values

* Update ZIP-244 test vectors for new branch ID

* Squashed commit of the following:

commit bdb120a249
Author: Deirdre Connolly <durumcrustulum@gmail.com>
Date:   Tue Oct 5 11:54:01 2021 -0400

    Use pallas::Base::from_str_vartime() in sinsemilla tests

commit e99fa49258
Author: Deirdre Connolly <durumcrustulum@gmail.com>
Date:   Tue Oct 5 11:45:24 2021 -0400

    Compiles

commit a520018114
Author: Deirdre Connolly <durumcrustulum@gmail.com>
Date:   Tue Oct 5 10:15:17 2021 -0400

    Incomplete upgrade of deps

* Squashed commit of the following:

commit 8d1b76ec5626517817c3a4d9f3950acc90a359df
Author: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Date:   Tue Oct 5 04:02:26 2021 +0000

    Update `zcash_script` to support V5 transactions

    Use a newer version of `zcash_script` that has been updated to support
    V5 transactions.

commit 371233628ae61e0c25d6ba8f31d9dba42823becb
Author: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Date:   Tue Oct 5 03:06:20 2021 +0000

    Update Zcash dependencies

    Update some Zcash crates:

    - `halo2`
    - `incrementalmerkletree` (patch version)
    - `orchard` (patch version)
    - `zcash_history` (patch version)
    - `zcash_note_encryption` (patch version)
    - `zcash_primitives` (patch version)

    And also update the `group` dependency so that the code remains
    compatible.

commit de5cf1ec40c3fc08670fc971cdf3e65e13d9f4c7
Author: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Date:   Tue Oct 5 03:04:13 2021 +0000

    Update error message assertion

    Use the updated message for the expected error variant.

* Update `zcash_script` to support V5 transactions

Use a newer version of `zcash_script` that has been updated to support
V5 transactions.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: teor <teor@riseup.net>
2021-10-06 11:08:41 +10:00
teor e5f5ac9ce8
Fix or disable recent nightly clippy lints (#2817)
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-10-01 15:26:06 +00:00
teor 966f52a280
Fix join errors in initial seed peer versions dashboard (#2811)
* Add metrics gauges for the most recent peer network protocol version

This gauge lets us join the initial seeds to the network protocol versions,
even if the peer upgrades and reconnects with a different version.

* Ensure dashboard peer network versions are unique

Otherwise, prometheus returns an error,
and the dashboard shows no data.

* Make seeder labels more readable

- put labels to the right of the graph
- remove default ports

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-10-01 01:05:00 +00:00
teor 20b2e0549e
Add metrics for initial peer network protocol versions (#2804)
* Add tracing and metrics for seed peer DNS resolution

* Add a grafana dashboard for seed peers

Currently this just shows the initial peer count from each seed.

* Add tracing and metrics for peer network protocol versions

* Update peers dashboard with network protocol versions

* Show peer network protocol versions for each seeder in dashboard

* Add per-seed filter to dashboard

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-09-29 18:08:20 +00:00
Marek 952da4c794
Change current network protocol version for NU5 testnet (#2803)
* Set the CURRENT_NETWORK_PROTOCOL_VERSION to 170_014

* Adjust verify_v5_transaction()
2021-09-27 10:44:51 -03:00
Alfredo Garcia 56636c85fc
Add missing tests for mempool inbound requests (#2769)
* Use `MockService` in inbound test

Refactor the `mempool_requsets_for_transactions` test so that it uses a
`MockService` instead of the `mock_peer_set` function.

* Use `MockService` in the basic mempool test

Refactor the `mempool_service_basic` test so that it uses a
`MockService` instead of the `mock_peer_set` helper function.

* Remove the `mock_peer_set` helper function

It is not used anymore, since the usages were replaced with
`MockService`s.

* add tests for mempool inbound requests

* Use MockService for transaction verifier

* Refactor creation of mock `peer_set`

Use the same style as the mock transaction verifier.

* Derive `Eq` for `zebra_network::Request`

Make it easy to use the `MockService::expect_request` method.

* Return mocked peer set service from `setup`

Allow it to be used to respond to requests.

* Add bindings for the transaction used for testing

Allow them to be moved into futures later.

* Respond to transaction download request

Make sure that the test transaction appears to the mempool as if it had
been downloaded by the peer set service.

* Assert that no unexpected requests were received

Check that the mempool doesn't send unexpected requests to the peer set
service.

* add missing `expect_no_requests` to `mempool_advertise_transaction_ids` test

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-09-23 10:17:06 -03:00
Janito Vaqueiro Ferreira Filho b714b2b3b6
Create a helper `MockService` type to help with writing tests that use mock `tower::Service`s (#2748)
* Implement initial service mocking helpers

Adds a [`MockService`] type, which can be configured and built for usage
in unit tests or proptests. The mocked service can then be used to
intercept requests and respond individually to them.

* Use `MockService` in the `mempool::Crawler` test

Refactor it to remove the helper mock function, and use the new
`MockService` helper type.

* Use `MockService` in `CandidateSet` test vectors

Refactor to remove the manual mocking of the peer set service.

* Panic if a response is not sent by `MockService`

Change the current semantics to require all `MockService` usages to
respond to every intercepted request.

A `must_use` attribute was added to the `ResponseSender` so that the
compiler can warn when this doesn't happen.

* Allow generic error types in `MockService`

Replace the hard-coded `BoxError` as the `Service`'s error type with a
generic type parameter. This allows mocking services in locations that
require specific error types.

* Add a `ResponseSender::request` getter

Allow inspecting the request again before responding, and using
information from the request in the response.

Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
2021-09-21 17:44:59 +00:00
Conrado Gouvea 8971d62727
Update version strings for v1.0.0 alpha.17 release (#2750)
* Change versions for v1.0.0-alpha.17 release
2021-09-14 17:41:50 +00:00
teor b6fe816473
Add a `ChainTipChange` type to `await` chain tip changes (#2715)
* Rename ChainTipReceiver to CurrentChainTip

`fastmod ChainTipReceiver CurrentChainTip zebra*`

* Update chain tip documentation and variable names

* Basic chain tip change implementation, without resets

Also includes the following name changes:
```
fastmod CurrentChainTip LatestChainTip zebra*
fastmod chain_tip_receiver latest_chain_tip zebra*
```

* Clarify the difference between `LatestChainTip` and `ChainTipChange`
2021-09-01 22:31:16 +00:00
Alfredo Garcia 968f20d423
Update versions for zebra v1.0.0-alpha.16 release (#2670)
* bump crate versions

* update zebra-script

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2021-08-27 22:13:54 +00:00
teor d2e14b22f9
Refactor BestTipHeight into a generic ChainTip sender and receiver (#2676)
* Rename BestTipHeight so it can be generalised to ChainTipSender

`fastmod BestTipHeight ChainTipSender zebra*`

For senders:
`fastmod best_tip_height chain_tip_sender zebra*`

For receivers:
`fastmod best_tip_height chain_tip_receiver zebra*`

* Rename best_tip_height module to chain_tip

* Wrap the chain tip watch channel in a ChainTipReceiver type

* Create a ChainTip trait to avoid tricky crate dependencies

And add convenience impls for optional and empty chain tips.

* Use the ChainTip trait in zebra-network

* Replace `Option<ChainTip>` with `NoChainTip`

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-27 11:34:33 +10:00
teor 047576273c
Stop converting `Message::Inv(TxId+)` into `Request::TransactionsById` (#2660)
`Message::Inv(TxId+)` is a transaction advertisement,
so it should be converted into `Request::AdvertiseTransactionIds`.

This is a copy-paste mistake from the original zebra-network
implementation.
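
The corrected mapping can be sketched as follows. This is a simplified illustration, not the zebra-network code: the `Message` and `Request` enums here are hypothetical stand-ins with 32-byte IDs, and the `getdata` arm is included only as an assumed contrast case.

```rust
/// Simplified stand-ins for the network message and internal request types.
#[derive(Debug, PartialEq)]
enum Message {
    /// An `inv` message that advertises transaction IDs.
    InvTx(Vec<[u8; 32]>),
    /// A `getdata` message that asks for the listed transactions.
    GetDataTx(Vec<[u8; 32]>),
}

#[derive(Debug, PartialEq)]
enum Request {
    AdvertiseTransactionIds(Vec<[u8; 32]>),
    TransactionsById(Vec<[u8; 32]>),
}

/// Hypothetical conversion: an advertisement stays an advertisement,
/// and only an explicit data request becomes `TransactionsById`.
fn to_request(msg: Message) -> Request {
    match msg {
        Message::InvTx(ids) => Request::AdvertiseTransactionIds(ids),
        Message::GetDataTx(ids) => Request::TransactionsById(ids),
    }
}
```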
2021-08-24 21:40:21 +00:00
teor c608260256
Support witnessed transaction IDs in zebra-network requests and responses (#2638)
* Rename internal network requests for wide transaction IDs

fastmod TransactionsByHash TransactionsById zebra*
fastmod AdvertiseTransactions AdvertiseTransactionIds zebra*
fastmod MempoolTransactions MempoolTransactionIds zebra*
fastmod TransactionHashes TransactionIds zebra*

* Update network transaction request/response comments

* Rename a transaction hash method for wide transaction IDs

fastmod transaction_hashes transaction_ids zebra-network

* Add UnminedTxId methods and conversions for InventoryHash

* Map WtxIds to unmined transaction network messages

Also, use UnminedTxId and UnminedTx in:
* Zebra's internal request and response format, and
* external Zcash network protocol messages.

* Enable WtxId mempool inventory tracking for peers

* Further clarify transaction IDs

* Use Witnessed rather than Wide for transaction IDs

And rename narrow to legacy when it only applies to v1-v4 transactions.
Otherwise, rename it to mined ID.

* Rename a missed binding
* Remove an incorrectly named binding

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-08-18 22:55:24 +00:00
teor 6c86c8dd0d
Implement a WtxId struct, and use it in Zebra's external network protocol (#2618)
* Make the `AuthDigest` display order match transaction IDs

And derive `Hash`, just like transaction IDs.

Don't derive `serde` for now, because it's not needed.

* Move transaction::Hash test to tests module

* Add a simple AuthDigest display order test

* Add a WtxId type for wide transaction IDs

* Add conversions between transaction IDs and bytes

* Use the WtxId type in external network protocol messages
2021-08-16 21:26:08 +00:00
Pili Guerra 234953e620
Update versions for zebra v1.0.0-alpha.15 release (#2612) 2021-08-16 10:06:26 +00:00
Janito Vaqueiro Ferreira Filho 4c4dbfe7cd
Reject connections from outdated peers (#2519)
* Simplify state service initialization in test

Use the test helper function to remove redundant code.

* Create `BestTipHeight` helper type

This type abstracts away the calculation of the best tip height based on
the finalized block height and the best non-finalized chain's tip.

* Add `best_tip_height` field to `StateService`

The receiver endpoint is currently ignored.

* Return receiver endpoint from service constructor

Make it available so that the best tip height can be watched.

* Update finalized height after finalizing blocks

After blocks from the queue are finalized and committed to disk, update
the finalized block height.

* Update best non-finalized height after validation

Update the value of the best non-finalized chain tip block height after
a new block is committed to the non-finalized state.

* Update finalized height after loading from disk

When `FinalizedState` is first created, it loads the state from
persistent storage, and the finalized tip height is updated. Therefore,
the `best_tip_height` must be notified of the initial value.

* Update the finalized height on checkpoint commit

When a checkpointed block is committed, it bypasses the non-finalized
state, so there's an extra place where the finalized height has to be
updated.

* Add `best_tip_height` to `Handshake` service

It can be configured using the `Builder::with_best_tip_height`. It's
currently not used, but it will be used to determine if a connection to
a remote peer should be rejected or not based on that peer's protocol
version.

* Require best tip height to init. `zebra_network`

Without it the handshake service can't properly enforce the minimum
network protocol version from peers. Zebrad obtains the best tip height
endpoint from `zebra_state`, and the test vectors simply use a dummy
endpoint that's fixed at the genesis height.

* Pass `best_tip_height` to proto. ver. negotiation

The protocol version negotiation code will reject connections to peers
if they are using an old protocol version. An old version is determined
based on the current known best chain tip height.

* Handle an optional height in `Version`

Fall back to the genesis height if `None` is specified.

* Reject connections to peers on old proto. versions

Avoid connecting to peers that are on protocol versions that don't
recognize a network update.

* Document why peers on old versions are rejected

Describe why it's a security issue above the check.

* Test if `BestTipHeight` starts with `None`

Check if initially there is no best tip height.

* Test if best tip height is max. of latest values

After applying a list of random updates where each one either sets the
finalized height or the non-finalized height, check that the best tip
height is the maximum of the most recently set finalized height and the
most recently set non-finalized height.
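
The property this test checks can be sketched as below. This is a minimal illustration, not the `BestTipHeight` implementation: the `best_tip_height` function and the plain `u32` heights are assumptions made for the example.

```rust
/// Hypothetical helper: the best tip height is the larger of the most
/// recently set finalized and non-finalized heights, or `None` if neither
/// has been set yet.
fn best_tip_height(finalized: Option<u32>, non_finalized: Option<u32>) -> Option<u32> {
    // `Option<u32>` orders `None` below every `Some(_)`, so `max` returns
    // whichever height is set, and the larger one when both are set.
    finalized.max(non_finalized)
}
```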

* Add `queue_and_commit_finalized` method

A small refactor to make testing easier. The handling of requests for
committing non-finalized and finalized blocks is now more consistent.

* Add `assert_block_can_be_validated` helper

Refactor to move into a separate method some assertions that are done
before a block is validated. This is to allow moving these assertions
more easily to simplify testing.

* Remove redundant PoW block assertion

It's also checked in
`zebra_state::service::check::block_is_contextually_valid`, and it was
getting in the way of tests that received a gossiped block before
finalizing enough blocks.

* Create a test strategy for test vector chain

Splits a chain loaded from the test vectors in two parts, containing the
blocks to finalize and the blocks to keep in the non-finalized state.

* Test committing blocks update best tip height

Create a mock blockchain state, with a chain of finalized blocks and a
chain of non-finalized blocks. Commit all the blocks appropriately, and
verify that the best tip height is updated.

Co-authored-by: teor <teor@riseup.net>
2021-08-08 23:52:52 +00:00
Pili Guerra f59d552721
Update versions for zebra v1.0.0-alpha.14 release (#2537)
Co-authored-by: teor <teor@riseup.net>
2021-07-29 19:42:21 +00:00
Pili Guerra 4bfcc916de
Update versions for v1.0.0 alpha.13 release (#2488)
* Update versions for v1.0.0-alpha.13 release

* Update Cargo.lock

Co-authored-by: teor <teor@riseup.net>
2021-07-15 08:52:55 -03:00
Janito Vaqueiro Ferreira Filho 20eeddcaab
Parse `MSG_WTX` inventory type (part of ZIP-239) (#2446)
* Rename constant to `MIN_INVENTORY_HASH_SIZE`

Because the size is not constant anymore, since the `MSG_WTX` inventory
type is larger.

* Add `InventoryHash::smallest_types_strategy`

A method for a proptest strategy that generates the `InventoryHash`
variants that have the smallest serialized size.

* Update proptest to use only smallest inventories

In order to properly test the maximum allocation.

* Add intra-doc links in some method documentation

Make it easier to navigate from the documentation of the proptest
strategies to the variants they generate.

* Parse `MSG_WTX` inventory type

Avoid returning an error if a received `GetData` or `Inv` message
contains a `MSG_WTX` inventory type. This prevents Zebra from
disconnecting from peers that announce V5 transactions.
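
A rough sketch of the parsing change, assuming the usual inventory encoding of a 4-byte little-endian type code followed by the hash, and the ZIP-239 layout where `MSG_WTX` (type code 5) carries 64 bytes. The `Inventory` enum and `parse_inventory` function are hypothetical simplifications, not Zebra's `InventoryHash` code.

```rust
use std::convert::TryInto;

/// Simplified inventory item; the real type has more variants.
#[derive(Debug, PartialEq)]
enum Inventory {
    Tx([u8; 32]),
    Block([u8; 32]),
    /// `MSG_WTX`: transaction ID followed by the authorizing data digest.
    Wtx([u8; 64]),
}

/// Parses one inventory item, returning `None` for unknown type codes
/// or truncated input.
fn parse_inventory(bytes: &[u8]) -> Option<Inventory> {
    let code = u32::from_le_bytes(bytes.get(..4)?.try_into().ok()?);
    let body = &bytes[4..];
    match code {
        1 => Some(Inventory::Tx(body.get(..32)?.try_into().ok()?)),
        2 => Some(Inventory::Block(body.get(..32)?.try_into().ok()?)),
        // Without this arm, a `MSG_WTX` item would be a parse failure.
        5 => Some(Inventory::Wtx(body.get(..64)?.try_into().ok()?)),
        _ => None,
    }
}
```

Because the `Wtx` payload is twice the size of the others, a fixed inventory size no longer applies, which is why the constant above became a minimum.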

* Fix inventory hash size proptest

The serialized size now depends on what type of `InventoryHash` is being
tested.

* Implement serialization of `InventoryHash::Wtx`

For now it just copies the stored bytes, in order to allow the tests to
run correctly.

* Test if `MSG_WTX` inventory is parsed correctly

Create some mock input bytes representing a serialized `MSG_WTX`
inventory item, and check that it can be deserialized successfully.

* Generate arbitrary `InventoryHash::Wtx` for tests

Create a strategy that only generates `InventoryHash::Wtx` instances,
and also update the `Arbitrary` implementation for `InventoryHash` to
also generate `Wtx` variants.

* Test `InventoryHash` serialization roundtrip

Given an arbitrary `InventoryHash`, check that it does not change after
being serialized and deserialized.

Currently, `InventoryHash::Wtx` can't be serialized, so this test is
expected to panic for now, but it will fail once the serialization
code is implemented, and then the `should_panic` should be removed.

* Test deserialize `InventoryHash` from random bytes

Create a random input vector of bytes, and try to deserialize an
`InventoryHash` from it. This should either succeed or fail in an
expected way.

* Remove redundant attribute

The attribute is redundant because the `arbitrary` module already has
that attribute.

* Implement `Message::inv_strategy()`

A method to return a proptest strategy that creates `Message::Inv`
instances.

* Implement `Message::get_data_strategy()`

A method that returns a proptest strategy that creates
`Message::GetData` instances.

* Test encode/decode roundtrip of some `Message`s

Create a `Message` instance, encode it and then decode it using a
`Codec` instance and check that the result is the same as the initial
`Message`.

For now, this only tests `Message::Inv` and `Message::GetData`, because
these are the variants that are related to the scope of the current set
of changes to support parsing the `MSG_WTX` inventory type.

Even so, the test relies on being able to serialize an
`InventoryHash::Wtx`, which is currently not implemented. Therefore the
test was marked as `should_panic` until the serialization code is
implemented.
2021-07-07 11:06:11 +10:00
Pili Guerra 515dc4bf5c
Update versions for Zebra v1.0.0 alpha.12 release (#2415)
* Update versions for zebra v1.0.0-alpha.12 release

* Update Cargo.lock

* Update release checklist with latest version changes to help keep track for future releases

* Remove reference to the fact that tower-fallback was not updated
2021-07-01 08:59:32 +01:00
dependabot[bot] b59121b09e build(deps): bump indexmap from 1.6.2 to 1.7.0
Bumps [indexmap](https://github.com/bluss/indexmap) from 1.6.2 to 1.7.0.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Commits](https://github.com/bluss/indexmap/compare/1.6.2...1.7.0)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-06-30 13:36:35 -04:00
Janito Vaqueiro Ferreira Filho b68202c68a
Security: Zebra should stop gossiping unreachable addresses to other nodes, Action: re-deploy all nodes (#2392)
* Rename some methods and constants for clarity

Using the following commands:

```
fastmod '\bis_ready_for_attempt\b' is_ready_for_connection_attempt
  # One instance required a tweak, because of the ASCII diagram.
fastmod '\bwas_recently_live\b' has_connection_recently_responded
fastmod '\bwas_recently_attempted\b' was_connection_recently_attempted
fastmod '\bwas_recently_failed\b' has_connection_recently_failed
fastmod '\bLIVE_PEER_DURATION\b' MIN_PEER_RECONNECTION_DELAY
```

* Use `Instant::elapsed` for conciseness

Instead of `Instant::now().saturating_duration_since`. They're both
equivalent, and `elapsed` only panics if the `Instant` is somehow
synthetically generated.

* Allow `Duration32` to be created in other crates

Export the `Duration32` from the `zebra_chain::serialization` module.

* Add some new `Duration32` constructors

Create some helper `const` constructors to make it easy to create
constant durations. Add methods to create a `Duration32` from seconds,
minutes and hours.

* Avoid gossiping unreachable peers

When sanitizing the list of peers to gossip, remove those that we
haven't seen in more than three hours.
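
The filter can be sketched like this. It is an illustration under assumptions, not the `AddressBook` code: the `Peer` struct, the `sanitize` function, and the `MAX_GOSSIP_AGE` name are hypothetical, and only the three-hour cutoff comes from the text above.

```rust
use std::time::Duration;

/// Cutoff taken from the "three hours" above; the constant name is made up.
const MAX_GOSSIP_AGE: Duration = Duration::from_secs(3 * 60 * 60);

/// Hypothetical peer record: how long ago the peer was last seen, if ever.
struct Peer {
    addr: &'static str,
    last_seen_age: Option<Duration>,
}

/// Keeps only the peers that were seen within the cutoff.
/// Peers that were never seen are dropped as well.
fn sanitize(peers: Vec<Peer>) -> Vec<Peer> {
    peers
        .into_iter()
        .filter(|peer| matches!(peer.last_seen_age, Some(age) if age <= MAX_GOSSIP_AGE))
        .collect()
}
```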

* Test if unreachable addresses aren't gossiped

Create a property test with random addresses inserted into an
`AddressBook`, and verify that the sanitized list of addresses does not
contain any addresses considered unreachable.

* Test if new alternate address isn't gossipable

Create a new alternate peer, because that type of `MetaAddr` does not
have `last_response` or `untrusted_last_seen` times. Verify that the
peer is not considered gossipable.

* Test if local listener is gossipable

The `MetaAddr` representing the local peer's listening address should
always be considered gossipable.

* Test if gossiped peer recently seen is gossipable

Create a `MetaAddr` representing a gossiped peer that was reported to be
seen recently. Check that the peer is considered gossipable.

* Test peer reportedly last seen in the future

Create a `MetaAddr` representing a peer gossiped and reported to have
been last seen in a time that's in the future. Check that the peer is
considered gossipable, to check that the fallback calculation is working
as intended.

* Test gossiped peer reportedly seen long ago

Create a `MetaAddr` representing a gossiped peer that was reported to
last have been seen a long time ago. Check that the peer is not
considered gossipable.

* Test if just responded peer is gossipable

Create a `MetaAddr` representing a peer that has just responded and
check that it is considered gossipable.

* Test if recently responded peer is gossipable

Create a `MetaAddr` representing a peer that last responded within the
duration a peer is considered reachable. Verify that the peer is
considered gossipable.

* Test peer that responded long ago isn't gossipable

Create a `MetaAddr` representing a peer that last responded outside the
duration a peer is considered reachable. Verify that the peer is not
considered gossipable.
2021-06-29 05:12:27 +00:00
teor 9cb7ee4d0e
Release Blocker? Disable IPv6 tests when $ZEBRA_SKIP_IPV6_TESTS is set (#2405)
* Disable IPv6 tests when $ZEBRA_SKIP_IPV6_TESTS is set

This allows users to disable IPv6 tests in environments where IPv6 is not
configured.

* Add network test env var constants

* Replace env strings with constants

fastmod '"ZEBRA_SKIP_NETWORK_TESTS"' zebra_test::net::ZEBRA_SKIP_NETWORK_TESTS
fastmod '"ZEBRA_SKIP_IPV6_TESTS"' zebra_test::net::ZEBRA_SKIP_IPV6_TESTS

* Add functions to skip network tests

* Replace test network env var checks with test function

fastmod --fixed-strings 'env::var_os(zebra_test::net::ZEBRA_SKIP_NETWORK_TESTS).is_some()' 'zebra_test::net::zebra_skip_network_tests()'
fastmod --fixed-strings 'env::var_os(zebra_test::net::ZEBRA_SKIP_IPV6_TESTS).is_some()' 'zebra_test::net::zebra_skip_ipv6_tests()'

* Remove redundant logging and use statements
2021-06-29 11:20:32 +10:00
teor 7586699f86
Support a minimum protocol version during initial block download (#2395)
* Support a min protocol version during initial block download

But don't actually use the state height yet.

Also rename some functions and constants.

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-06-29 10:49:03 +10:00
teor 1b6688f139
README: update known issues and add inbound network ports (#2373)
* README: update known issues

* Add ticket numbers

* Add network ports to README

* Make heading a bit clearer

* Update zebra listener address docs

Explain how Zebra currently uses listener addresses,
after recent changes.
2021-06-23 08:10:21 -03:00
teor d18d118a20
Remove unicode in Zebra's user agent (#2376) 2021-06-23 08:45:25 +01:00
teor bcd5f2c50d
Gossip dynamic local listener ports to peers (#2277)
* Gossip dynamically allocated listener ports to peers

Previously, Zebra would either gossip port `0`, which is invalid, or skip
gossiping its own dynamically allocated listener port.

* Improve "no configured peers" warning

And downgrade from error to warning, because inbound-only nodes are a
valid use case.

* Move random_known_port to zebra-test

* Add tests for dynamic local listener ports and the AddressBook

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-06-23 07:59:06 +10:00
teor 1a57023eac
Security: Use canonical SocketAddrs to avoid duplicate peer connections, Feature: Send local listener to peers (#2276)
* Always send our local listener with the latest time

Previously, whenever there was an inbound request for peers, we would
clone the address book and update it with the local listener.

This had two impacts:
- the listener could conflict with an existing entry,
  rather than unconditionally replacing it, and
- the listener was briefly included in the address book metrics.

As a side-effect, this change also makes sanitization slightly faster,
because it avoids some useless peer filtering and sorting.

* Skip listeners that are not valid for outbound connections

* Filter the addresses Zebra sanitizes based on address state

This fix correctly prevents Zebra gossiping client addresses to peers,
but still keeps the client in the address book to avoid reconnections.

* Add a full set of DateTime32 and Duration32 calculation methods

* Refactor sanitize to use the new DateTime32/Duration32 methods

* Security: Use canonical SocketAddrs to avoid duplicate connections

If we allow multiple variants for each peer address, we can make multiple
connections to that peer.

Also make sure sanitized MetaAddrs are valid for outbound connections.
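
One way such a canonical form could look is sketched below. Treating IPv4-mapped IPv6 addresses as the variant to collapse is an assumption made for this example, and `canonical_socket_addr` is a hypothetical name; the commit message does not say which variants Zebra normalizes.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical helper: rewrites an IPv4-mapped IPv6 address
/// (`::ffff:a.b.c.d`) as the plain IPv4 address, so both spellings of the
/// same peer compare equal. Other addresses are returned unchanged.
fn canonical_socket_addr(addr: SocketAddr) -> SocketAddr {
    if let IpAddr::V6(v6) = addr.ip() {
        if let Some(v4) = v6.to_ipv4_mapped() {
            return SocketAddr::new(IpAddr::V4(v4), addr.port());
        }
    }
    addr
}
```

With a single form per peer, a set or map keyed by `SocketAddr` cannot hold two entries for the same endpoint.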

* Test that address books contain the local listener address

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-06-22 02:16:59 +00:00
teor 3bd52f89a5 Upgrade to pin_project 1.0.7 to resolve nightly warnings
Except for tower-fallback, which has code that is incompatible with
pin_project 1.0.
2021-06-21 15:52:39 -04:00
teor 4d22a0bae9
Security: Limit reconnection rate to individual peers (#2275)
* Security: Limit reconnection rate to individual peers

Reconnection Rate

Limit the reconnection rate to each individual peer by applying the
liveness cutoff to the attempt, responded, and failure time fields.
If any field is recent, the peer is skipped.

The new liveness cutoff skips any peers that have recently been attempted
or failed. (Previously, the liveness check was only applied if the peer
was in the `Responded` state, which could lead to repeated retries of
`Failed` peers, particularly in small address books.)
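
The skip rule described above can be sketched as follows. This is an illustration only: `PeerTimes` and `recently_used` are hypothetical names, and times are modelled as ages relative to now rather than as Zebra's own time types.

```rust
use std::time::Duration;

/// Hypothetical per-peer record: how long ago each event happened, if ever.
struct PeerTimes {
    last_attempt: Option<Duration>,
    last_response: Option<Duration>,
    last_failure: Option<Duration>,
}

/// Returns true if the peer should be skipped because any of its attempt,
/// response, or failure times falls within the liveness cutoff.
fn recently_used(times: &PeerTimes, cutoff: Duration) -> bool {
    [times.last_attempt, times.last_response, times.last_failure]
        .iter()
        .any(|age| matches!(age, Some(age) if *age <= cutoff))
}
```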

Reconnection Order

Zebra prefers more useful peer states, then the earliest attempted,
failed, and responded times, then the most recent gossiped last seen
times.

Before this change, Zebra took the most recent time in all the peer time
fields, and used that time for liveness and ordering. This led to
confusion between trusted and untrusted data, and success and failure
times.

Unlike the previous order, the new order:
- tries all peers in each state, before re-trying any peer in that state,
  and
- only checks the gossiped untrusted last seen time
  if all other times are equal.

* Preserve the later time if changes arrive out of order

* Update CandidateSet::next documentation

* Update CandidateSet state diagram

* Fix variant names in comments

* Explain why timestamps can be left out of MetaAddrChanges

* Add a simple test for the individual peer retry limit

* Only generate valid Arbitrary PeerServices values

* Add an individual peer retry limit AddressBook and CandidateSet test

* Stop deleting recently live addresses from the address book

If we delete recently live addresses from the address book, we can get a
new entry for them, and reconnect too rapidly.

* Rename functions to match similar tokio API

* Fix docs for service sorting

* Clarify a comment

* Cleanup a variable and comments

* Remove blank lines in the CandidateSet state diagram

* Add a multi-peer proptest that checks outbound attempt fairness

* Fix a comment typo

* Simplify time maths in MetaAddr

* Create a Duration32 type to simplify calculations and comparisons

* Rename variables for clarity

* Split a string constant into multiple lines

* Make constants match rustdoc order

Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
2021-06-18 09:30:44 -03:00
Pili Guerra 6396ac27d8
Update versions for zebra v1.0.0-alpha.11 release (#2334)
* Update versions for zebra v1.0.0-alpha.11 release

* Update Cargo.lock
2021-06-18 10:37:58 +01:00
teor 3932661a93
Qualify std::sync::Mutex in the unit tests (#2304)
Also add a missing zebra_test::init().
2021-06-15 10:01:56 -03:00
teor 3f7410d073
Security: stop gossiping failure and attempt times as last_seen times (#2273)
* Security: stop gossiping failure and attempt times as last_seen times

Previously, Zebra had a single time field for peer addresses, which was
updated every time a peer was attempted, sent a message, or failed.

This is a security issue, because the `last_seen` time should be
"the last time [a peer] connected to that node", so that
"nodes can use the time field to avoid relaying old 'addr' messages".
So Zebra was sending incorrect peer information to other nodes.

As part of this change, we split the `last_seen` time into the
following fields:
- untrusted_last_seen: gossiped from other peers
- last_response: time we got a response from a directly connected peer
- last_attempt: time we attempted to connect to a peer
- last_failure: time a connection with a peer failed
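
A minimal sketch of the split, assuming times stored as seconds since the Unix epoch. The `MetaAddrTimes` struct and `gossiped_last_seen` method are hypothetical, and preferring our own response time over the gossiped one is an assumption, not something the commit message states.

```rust
/// Hypothetical container for the four separate time fields.
struct MetaAddrTimes {
    untrusted_last_seen: Option<u32>,
    last_response: Option<u32>,
    last_attempt: Option<u32>,
    last_failure: Option<u32>,
}

impl MetaAddrTimes {
    /// The time to gossip to other nodes as `last_seen`.
    /// Attempt and failure times are deliberately never used here.
    fn gossiped_last_seen(&self) -> Option<u32> {
        self.last_response.or(self.untrusted_last_seen)
    }
}
```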

* Implement Arbitrary and strategies for MetaAddrChange

Also replace the MetaAddr Arbitrary impl with a derive.

* Write proptests for MetaAddr and MetaAddrChange

MetaAddr:
- the only times that get included in serialized MetaAddrs are
  the untrusted last seen and responded times

MetaAddrChange:
- the untrusted last seen time is never updated
- the services are only updated if there has been a handshake
2021-06-15 13:31:16 +10:00
teor 86f23f7960
Security: only apply the outbound connection rate-limit to actual connections (#2278)
* Only advance the outbound connection timer when it returns an address

Previously, we were advancing the timer even when we returned `None`.
This created large wait times when there were no eligible peers.

* Refactor to avoid overlapping sleep timers

* Add a maximum next peer delay test

Also refactor peer numbers into constants.

* Make the number of proptests overridable by the standard env var

Also cleanup the test constants.

* Test that skipping peer connections also skips their rate limits

* Allow an extra second after each sleep on loaded machines

macOS VMs seem to need this extra time to pass their tests.

* Restart test time bounds from the current time

This change avoids test failures due to cumulative errors.

Also use a single call to `Instant::now` for each test round.
And print the times when the tests fail.

* Stop generating invalid outbound peers in proptests

The candidate set proptests will fail if enough generated peers are
invalid for outbound connections.
2021-06-15 08:29:17 +10:00
teor 56ef08e385 Rewrite acceptance test matching
- Add a custom semver match for `zebrad` versions
- Prefer "line contains string" matches, so tests ignore minor changes
- Escape regex meta-characters when a literal string match is intended
- Rename test functions so they are more precise
- Rewrite match internals to remove duplicate code and enable custom matches
- Document match functions
2021-06-10 22:46:33 -04:00
Janito Vaqueiro Ferreira Filho a2d3078fcb
Replace usage of atomics with `tokio::sync::watch` (#2272)
Rust atomics have an API that's very easy to use incorrectly, leading to
hard to find bugs. For that reason, it's best to avoid it unless there's
a good reason not to.
2021-06-11 12:25:06 +10:00
Pili Guerra 9aafa79fa3
Update versions for zebra v1.0.0-alpha.10 release (#2245)
* Update versions for zebra v1.0.0-alpha.10 release

* Update Cargo.lock
2021-06-09 12:56:36 +02:00
Janito Vaqueiro Ferreira Filho e8d5f6978d
Rate limit `GetAddr` messages to any peer, Credit: Equilibrium (#2254)
* Rename field to `wait_next_handshake`

Make the name clearer about the field's purpose.

* Move `MIN_PEER_CONNECTION_INTERVAL` to `constants`

Move it to the `constants` module so that it is placed closer to other
constants for consistency and to make it easier to see any relationships
when changing them.

* Rate limit calls to `CandidateSet::update()`

This effectively rate limits requests asking for more peer addresses
sent to the same peer. A new `min_next_crawl` field was added to
`CandidateSet`, and `update` only sends requests for more peer addresses
if the call happens after the instant specified by that field. After
sending the requests, the field value is updated so that there is a
`MIN_PEER_GET_ADDR_INTERVAL` wait time until the next `update` call
sends requests again.

* Include `update_initial` in rate limiting

Move the rate limiting code from `update` to `update_timeout`, so that
both `update` and `update_initial` get rate limited.

* Test `CandidateSet::update` rate limiting

Create a `CandidateSet` that uses a mocked `PeerService`. The mocked
service always returns an empty list of peers, but it also checks that
the requests only happen after expected instants, determined by the
fanout amount and the rate limiting interval.

* Refactor to create a `mock_peer_service` helper

Move the code from the test to a utility function so that another test
will be able to use it as well.

* Check number of times service was called

Use an `AtomicUsize` shared between the service and the test body that
the service increments on every call. The test can then verify if the
service was called the number of times it expected.

* Test calling `update` after `update_initial`

The call to `update` should be skipped because the call to
`update_initial` should also be considered in the rate limiting.

* Mention that call to `update` may be skipped

Make it clearer that in this case the rate limiting causes calls to be
skipped, and not that there's an internal sleep that happens.

Also remove "to the same peers", because it's more general than that.

Co-authored-by: teor <teor@riseup.net>
2021-06-09 09:42:45 +10:00
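The rate limiting described in the commit above can be sketched as follows. This is a minimal illustration, not Zebra's code: the `CrawlLimiter` type, the `should_crawl` method, and the 30-second interval are assumptions made for the example. Only the `min_next_crawl` field name and the `MIN_PEER_GET_ADDR_INTERVAL` constant name come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical value: the commit message does not state the real interval.
const MIN_PEER_GET_ADDR_INTERVAL: Duration = Duration::from_secs(30);

/// Hypothetical stand-in for the rate-limiting state kept by `CandidateSet`.
struct CrawlLimiter {
    /// Earliest instant at which the next batch of address requests may be sent.
    min_next_crawl: Instant,
}

impl CrawlLimiter {
    /// Creates a limiter that allows a crawl immediately.
    fn new(now: Instant) -> Self {
        CrawlLimiter { min_next_crawl: now }
    }

    /// Returns `true` if requests may be sent at `now`, and schedules the
    /// next allowed crawl. Returns `false` if the call should be skipped.
    /// The caller is never put to sleep.
    fn should_crawl(&mut self, now: Instant) -> bool {
        if now < self.min_next_crawl {
            return false;
        }
        self.min_next_crawl = now + MIN_PEER_GET_ADDR_INTERVAL;
        true
    }
}
```

Taking `now` as a parameter keeps the sketch deterministic under test. A call that arrives too early is skipped rather than delayed, matching the "call to `update` may be skipped" note in the commit message.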
teor 8ebb415e7c Clippy: remove needless borrows 2021-06-07 18:33:58 -04:00
Janito Vaqueiro Ferreira Filho aaef94c2bf
Prevent burst of reconnection attempts (#2251)
* Rate-limit new outbound peer connections

Set the rate-limiting sleep timer to use a delay added to the maximum
between the next peer connection instant and now. This ensures that the
timer always sleeps at least the time used for the delay.

This change fixes rate-limiting new outbound peer connections, since
before there could be a burst of attempts until the deadline progressed
to the current instant.

Fixes #2216

* Create `MetaAddr::alternate_node_strategy` helper

Creates arbitrary `MetaAddr`s as if they were network nodes that sent
their listening address.

* Test outbound peer connection rate limiting

Tests if connections are rate limited to 10 per second, and also tests
that sleeping before continuing with the attempts still respects the rate
limit and does not result in a burst of reconnection attempts.
2021-06-07 14:13:46 +10:00
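The timer calculation described in the commit above ("a delay added to the maximum between the next peer connection instant and now") might look like the sketch below. The function name and signature are hypothetical. The commit message only says that the rate is 10 connections per second, which would correspond to a 100 ms delay.

```rust
use std::cmp::max;
use std::time::{Duration, Instant};

/// Hypothetical helper: computes the deadline for the next outbound
/// connection attempt.
///
/// Using `max(previous_deadline, now)` means a stale deadline (one that is
/// already in the past) cannot be used to "catch up" with a burst of
/// attempts: the next attempt is always at least `delay` after `now`.
fn next_connection_deadline(
    previous_deadline: Instant,
    now: Instant,
    delay: Duration,
) -> Instant {
    max(previous_deadline, now) + delay
}
```

If the deadline were advanced from `previous_deadline` alone, a long pause would leave it far in the past, and every attempt would fire immediately until the deadline caught up with the current time.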
teor 2f0f379a9e
Standardise clippy lints and require docs (#2238)
* Standardise lints across Zebra crates, and add missing docs

The only remaining module with missing docs is `zebra_test::command`

* Todo -> TODO

* Clarify what a transcript ErrorChecker does

Also change `Error` -> `BoxError`

* TransError -> ExpectedTranscriptError

* Output Descriptions -> Output descriptions
2021-06-04 08:48:40 +10:00
teor ce45198c17
Fix comment typo: overflow -> underflow 2021-06-01 16:44:45 +10:00
teor 34702f22b6 clippy: remove needless clone and collect 2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 83ac1519e9 Add proptest for future `last_seen` correction
Given a generated list of gossiped peers, ensure that after running the
`validate_addrs` function none of the resulting peers have a `last_seen`
time that's after the specified limit.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 63672a2633 Test underflow handling
If the calculation to apply the compensation offset overflows or
underflows, the reported times are too far apart, and could be sent
on purpose by a malicious peer, so all addresses from that peer should
be rejected.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho f263d85aa4 Test `last_seen` time being equal to the limit
If the most recent `last_seen` time reported by a peer is exactly the
limit, the offset doesn't need to be applied because no times are in the
future.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho f4a7026aa3 Test that offset is applied to all gossiped peers
Use some mock gossiped peers where some have `last_seen` times in the
past and some have times in the future. Check that all the peers have
an offset applied to them by the `validate_addrs` function.

This tests if the offset is applied to all peers that a malicious peer
gossiped to us.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 60f660e53f Test if validation doesn't offset past times
Use some mock gossiped peers that all have `last_seen` times in the
past and check that they don't have any changes to the `last_seen` times
applied by the `validate_addrs` function.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 3c9c920bbd Test if validation offsets times in the future
Use some mock gossiped peers that all have `last_seen` times in the
future and check that they all have a specific offset applied to them.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 82452621e0 Remove empty list of peers check
The `limit_last_seen_times` can now safely handle an empty list.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 966430d400 Update security note to be broader
Focus on what can go wrong, and not on the specific causes.

Co-authored-by: teor <teor@riseup.net>
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho f3419b7baf Handle overflow when applying offset
If an overflow occurs, the reported `last_seen` times are either very
wrong or malicious, so reject all addresses gossiped by that peer.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 5b8f33390c Add comment to describe purpose
Make it clear why all peers have the time offset applied to them.

Co-authored-by: teor <teor@riseup.net>
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 9eac43a8bb Apply offset to all times received from a peer
If any of the times gossiped by a peer are in the future, apply the
necessary offset to all the times gossiped by that peer. This ensures
that all gossiped peers from a malicious peer are moved further back in
the queue.

Co-authored-by: teor <teor@riseup.net>
2021-06-01 03:42:08 -03:00
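The offset behaviour described by this group of commits can be sketched roughly as below. The function name `limit_last_seen_times` appears in a later commit message in this log, but the signature, the use of plain `u32` seconds instead of `DateTime32`, and the `Option` return value are simplifying assumptions for illustration only.

```rust
/// Hypothetical, simplified version: times are seconds since the Unix epoch.
///
/// If any reported time is after `limit`, every time from that peer is moved
/// back by the same offset. Returns `None` if applying the offset underflows,
/// in which case all addresses from the peer should be rejected.
fn limit_last_seen_times(last_seen: &[u32], limit: u32) -> Option<Vec<u32>> {
    // Find the most recent reported time. An empty list needs no changes.
    let newest = match last_seen.iter().copied().max() {
        Some(newest) => newest,
        None => return Some(Vec::new()),
    };

    // No reported time is after the limit, so no offset is needed.
    if newest <= limit {
        return Some(last_seen.to_vec());
    }

    // Move every time back by the same offset, so the newest time lands
    // exactly on the limit. Any underflow rejects the whole list.
    let offset = newest - limit;
    last_seen
        .iter()
        .map(|time| time.checked_sub(offset))
        .collect()
}
```

For example, with a limit of 100, the times `[90, 130]` become `[60, 100]`, while `[10, 130]` is rejected because `10 - 30` underflows.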
Janito Vaqueiro Ferreira Filho fa35c9b4f1 Only apply offset to times in the future
Times in the past don't have any security implications, so there's no
point in trying to apply the offset to them as well.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 876d515dd6 Improve documentation
- Make the security impact clearer and in a separate section.
- Instead of listing an assumption as almost a side-note, describe it
  clearly inside a `Panics` section.

Co-authored-by: teor <teor@riseup.net>
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 54809a1b89 Don't trust reported peer `last_seen` times
Due to clock skew, the peers could end up at the front of the
reconnection queue or far at the back. The solution to this is to offset
the reported times by the difference between the most recent reported
sight (in the remote clock) and the current time (in the local clock).
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 29c51d5086 Implement `MetaAddr::set_last_seen` setter method
Will be used when limiting the reported last seen times for received
gossiped addresses.
2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho 14ecc79f01 Use `DateTime32` in `validate_addrs` 2021-06-01 03:42:08 -03:00
Janito Vaqueiro Ferreira Filho b891a96a6d Improve ergonomics by returning `impl Iterator`
Returning `impl IntoIterator` means that the caller will always be
forced to call `.into_iter()`, and returning `impl Iterator` still
allows them to call `.into_iter()` because it becomes the identity
function.
2021-06-01 03:42:08 -03:00
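A small illustration of the ergonomics point in the commit above, using a hypothetical `even_numbers` function rather than Zebra's `validate_addrs`:

```rust
/// Hypothetical function that returns `impl Iterator`.
///
/// Callers can use iterator adaptors directly on the return value. Calling
/// `.into_iter()` on it still compiles, because every `Iterator` implements
/// `IntoIterator` by returning itself.
fn even_numbers(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}
```

With this signature, both `even_numbers(7).collect::<Vec<_>>()` and `even_numbers(7).into_iter().collect::<Vec<_>>()` are accepted and produce the same items.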
teor ebe1c9f88e
Add a DateTime32 type for 32-bit serialized times (#2210)
* Add a DateTime32 type for 32-bit serialized times
* Use DateTime32 for MetaAddr.last_seen
* Create and use a `DateTime32::now` method
2021-05-31 12:52:34 +10:00
teor a6e272bf1c
Fix a typo: BIP11 -> BIP111 (#2223) 2021-05-28 14:50:43 +02:00
teor 5cdcc5255f Proptest `MetaAddr` sanitization and serialization together 2021-05-26 18:13:35 -04:00
teor 9f8b4f836e Test round-trip serialization for gossiped `MetaAddr`s 2021-05-26 18:13:35 -04:00
teor 81630d19f2 Add service sanitization to `MetaAddr::sanitize`
This makes sure that deserialization and generated `MetaAddr`s are consistent.
2021-05-26 18:13:35 -04:00
teor bf6fe175dd Stop deriving PartialEq for MetaAddr
This makes sure Ord and PartialEq are always consistent.
2021-05-26 18:13:35 -04:00
teor 078385ae00 Canonicalise arbitrary IP addresses in proptests
This makes round-trip serialization tests work.
2021-05-26 18:13:35 -04:00
teor c0114a2c5f Security: Stop panicking when serializing out-of-range times
Zebra assumes that deserialized times are always able to be serialized.

But this assumption is wrong because:
- sanitization can modify times
- gossiped `MetaAddr` validation can modify times
2021-05-26 18:13:35 -04:00
Pili Guerra e3d2ae0a8a
Update versions for zebra v1.0.0-alpha.9 release (#2196)
* Update versions for zebra v1.0.0-alpha.9 release

* Update Cargo.lock
2021-05-26 13:01:39 +02:00
teor f0549b2f7c
Derive Arbitrary impls for a bunch of chain and network types (#2179)
Enable proptests for internal and external network protocol messages,
using times with the correct protocol-specific ranges. (4 or 8 bytes.)
2021-05-24 11:10:07 -04:00
teor 57fb5c028c
Fix up some doc links (#2180) 2021-05-21 12:06:31 -03:00
teor 2685fc746e
Remove CandidateSet state and add last seen time limit to candidate_set::validate_addrs (#2177) 2021-05-21 02:21:13 +00:00
teor 752358d236
Fix some candidate set and meta addr doc links (#2174)
Suggested by jvff.
2021-05-21 11:40:14 +10:00
teor 40d06657b3 Update new_gossiped_meta_addr to the latest API 2021-05-21 06:51:34 +10:00
teor c7ea1395e7 Security: Fix CandidateSet timeout and fanout
* Refactor: Split CandidateSet::update into separate functions
* Security: Apply a timeout to the entire CandidateSet::update
* Security: Stop using very large fanout limits during initialization

Previously, Zebra used the number of resolved peer addresses.
So it was possible for all peers to fail, and for Zebra to hang on the
first update.

And Zebra could send a fanout for each initial peer, regardless
of whether their connection was successful.

Also:
- wait for at least one successful peer before trying an update
- warn if there are no successful initial peers
2021-05-21 06:51:34 +10:00
Deirdre Connolly bf72d6dbc0 Update zebra-network/src/peer/handshake.rs
Co-authored-by: teor <teor@riseup.net>
2021-05-18 14:02:19 +10:00
teor 92828bbb29 Reliability: send local listener address to peers
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
2021-05-18 14:02:19 +10:00
teor d2a8985dbc Reliability: Add inbound canonical addresses to the address book
Add canonical addresses from inbound connections to the address book,
so that Zebra can use them for reconnection attempts.

Use the newly added `NeverAttemptedAlternate` state for these addresses,
so we try gossiped addresses first, then canonical addresses. This avoids
duplicate connections to inbound peers.
2021-05-18 14:02:19 +10:00
teor 458c26f1e3 Limit initial candidate set fanout to the number of initial peers
If there is a small number of initial peers, and they are slow, the
initial candidate set update can appear to hang. To avoid this issue,
limit the initial candidate set fanout to the number of initial peers.

Once the initial peers have sent us more peer addresses, there is no need
to limit the fanouts for future updates.

Reported by Niklas Long of Equilibrium.
2021-05-18 07:54:03 +10:00
teor 679920f6b8 Stop trying to resolve empty initial peer lists
Instead, log an error and return immediately.
2021-05-18 07:54:03 +10:00
teor b600e82d6e
Security: Avoid silently corrupting invalid times during serialization (#2149)
* Security: panic if an internally generated time is out of range

If Zebra has a bug where it generates blocks, transactions, or meta
addresses with bad times, panic. This avoids sending bad data onto the
network.

(Previously, Zebra would truncate some of these times, silently
corrupting the underlying data.)

Make it clear that deserialization of these objects is infallible.
2021-05-17 16:53:10 -04:00
teor b0b8b2f61a
Add extra instrumentation for initialize and handshakes (#2122)
* Instrument the crawl task

When we created the crawl task, we forgot to instrument it with the
global span. This fix makes sure that the git and network span appears on
crawl logs.

* Instrument the connector

* Improve handshake instrumentation

Make some spans debug, so there are not too many spans.

* Add the address to initial peer connection errors
2021-05-17 16:49:16 -04:00
teor 7969459b19
Security: Move the Verack response after the version check (#2121)
We should do as many local checks as possible, before sending further
messages.
2021-05-17 16:39:44 -04:00
teor c40cbee42f Remove address book peers that have changed to clients
If an address book peer stops advertising the NODE_SERVICES bit, remove
it from the address book.
2021-05-14 23:45:42 +10:00
teor f541f85792 Send unspecified addresses and client services for isolated connections 2021-05-14 23:45:42 +10:00
teor 9160365d06 Fix a comment 2021-05-14 23:45:42 +10:00
teor a8a0d6450c Security: stop gossiping temporary inbound remote addresses to peers
- stop putting inbound addresses in the address book
- drop address book entries that can't be used for outbound connections
  - distinguish between temporary inbound and permanent outbound peer
    addresses
  - also create variants to handle proxy connections
    (but don't use them yet)
  - avoid tracking connection state for isolated connections
- document security constraints for the address book and peer set
2021-05-14 23:45:42 +10:00
teor fde8f1e4ca
Security: stop panicking on out-of-range version timestamps, Credit: Equilibrium (#2148)
* Security: stop panicking on out-of-range version timestamps

Instead, return a deserialization error, and close the connection.

This issue was reported by Equilibrium.
2021-05-14 17:13:11 +10:00
Pili Guerra 500dc2e511
Update version strings for Zebra v1.0.0-alpha.8 release (#2136)
* Update versions for zebra v1.0.0-alpha.8 release

* Update tower-batch and tower-fallback version strings

* Update Cargo.lock
2021-05-12 14:27:36 +02:00
teor 1f40498fcf
Clippy nightly: disable owned cmp, stop comparing bool using assert_eq (#2073)
* Disable clippy warnings about comparing a newly created struct

In Sapling, we compare canonical JubJub bytes with a supplied byte array.

Since we need to perform calculations to get it into canonical form, we
need to create a newly owned object.

* Clippy: use assert rather than assert_eq on a bool
2021-04-27 09:57:45 -03:00
Pili Guerra ea1446ee92
Update version strings for Zebra v1.0.0-alpha.7 release (#2056)
* Update version strings for Zebra v1.0.0-alpha.7 release
2021-04-23 12:56:25 +00:00
teor 7b13d5573a Make String Zcash serialization consistent with deserialization
After recent changes, serialization was `write_string`, but
deserialization was `zcash_deserialize`.
2021-04-21 23:58:48 -04:00
Kirill Fomichev afac2c2846
Use the default port for configured listen addresses with no port (#2043)
* Allow using a listen address in config without a port

* update comments

* remove not used alias

* use Network::default_port

* Move tests and use toml instead of json

* change error message

* Make match more readable

Co-authored-by: teor <teor@riseup.net>
2021-04-21 23:14:29 +00:00
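One way to accept listen addresses with or without a port is sketched below, using only the standard library. The `parse_listen_addr` function is hypothetical and is not taken from Zebra, which per the commit message uses `Network::default_port`. Here the default port is passed in as a plain parameter.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical helper: parses `input` as `ip:port`, falling back to a bare
/// IP address combined with `default_port`.
fn parse_listen_addr(input: &str, default_port: u16) -> Option<SocketAddr> {
    // First, try a full socket address such as `127.0.0.1:8233`.
    if let Ok(addr) = input.parse::<SocketAddr>() {
        return Some(addr);
    }

    // Otherwise, accept a bare IP address and add the default port.
    input
        .parse::<IpAddr>()
        .ok()
        .map(|ip| SocketAddr::new(ip, default_port))
}
```

For example, `parse_listen_addr("127.0.0.1", 8233)` yields an address with port 8233, while an explicit port in the input is kept unchanged.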
teor 0203d1475a Refactor and document correctness for std::sync::Mutex<AddressBook> 2021-04-21 17:14:47 -04:00
teor 905b90d6a1 Refactor and document correctness for std::sync::Mutex in ErrorSlot 2021-04-21 16:39:06 -04:00
teor 3f45735f3f Use futures::lock::Mutex for the nonce set 2021-04-21 01:39:49 -04:00
teor 2ed8bb00cf Clarify CandidateSet state diagram
We get inbound connections on the listener port,
but the important part is the inbound connection
itself.
2021-04-21 01:37:43 -04:00
teor ad272f2bee Make sure handshake version negotiation always has a timeout
As part of this change, refactor handshake version negotiation into its
own function.
2021-04-19 18:31:28 -04:00
teor 2cecd52a10 Fix comment typo 2021-04-19 10:11:22 -04:00
teor 8fb12f07a1 Fix outdated comment 2021-04-19 10:11:22 -04:00
teor eabadb8301 Make heartbeats wait for the connection queue to empty, with a timeout
Also clean up the heartbeat code, so each heartbeat request/response runs
in a future with a single timeout.
2021-04-19 10:11:22 -04:00
teor 0def12f825
Add split array serialization functions for Transaction::V5 (#2017)
* Add functions for serializing and deserializing split arrays

In Transaction::V5, Zcash splits some types into multiple arrays, with a
single prefix count before the first array.

Add utility functions for serializing and deserializing the subsequent
arrays, with a parameter for the original array's length.

* Use zcash_deserialize_bytes_external_count in zebra-network

* Move some preallocate proptests to their own file

And fix the test module structure so it is consistent with the rest of
zebra-chain.

* Add a convenience alias zcash_serialize_external_count

* Explain why u64::MAX items will never be reached
2021-04-16 08:23:00 +10:00
teor 381c20b6af Security: change the GetAddr fanout to 3
Zebra avoids having a majority of addresses from a single peer by asking
3 peers for new addresses.

Also update a bunch of security comments and related documentation.
2021-04-15 13:09:14 -04:00
teor 59aa04c9b9 Stop panicking when Zebra sends a reject without extra data
Also add round-trip unit tests for extra data and no extra data.
2021-04-15 12:20:33 -04:00
teor a417c7c8c7 Use meaningful names for select! variables 2021-04-13 23:56:16 -04:00
teor fb95de99a6 Refactor the dial result into a From impl 2021-04-13 18:52:49 -04:00
Alfredo Garcia 5ec05e91e1 update version strings for v1.0.0-alpha.6 2021-04-08 18:48:34 -04:00
teor 1626ec383a
Add InventoryHash and MetaAddr proptests (#1985)
* Make proptest dependencies consistent between chain and network

* Implement Arbitrary for InventoryHash and use it in tests

* Impl Arbitrary for MetaAddr and use it in tests

Also test some extreme times in MetaAddr sanitization.
2021-04-07 14:13:52 -03:00
teor 375c8d8700
Fix a deadlock between the crawler and dialer, and other hangs (#1950)
* Stop ignoring inbound message errors and handshake timeouts

To avoid hangs, Zebra needs to maintain the following invariants in the
handshake and heartbeat code:
- each handshake should run in a separate spawned task
  (not yet implemented)
- every message, error, timeout, and shutdown must update the peer address state
- every await that depends on the network must have a timeout

Once the Connection is created, it should handle timeouts.
But we need to handle timeouts during handshake setup.

* Avoid hangs by adding a timeout to the candidate set update

Also increase the fanout from 1 to 2, to increase address diversity.

But only return permanent errors from `CandidateSet::update`, because
the crawler task exits if `update` returns an error.

Also log Peers response errors in the CandidateSet.

* Use the select macro in the crawler to reduce hangs

The `select` function is biased towards its first argument, risking
starvation.

As a side-benefit, this change also makes the code a lot easier to read
and maintain.

* Split CrawlerAction::Demand into separate actions

This refactor makes the code a bit easier to read, at the cost of
sometimes blocking the crawler on `candidates.next()`.

That's ok, because `next` only has a short (< 100 ms) delay. And we're
just about to spawn a separate task for each handshake.

* Spawn a separate task for each handshake

This change avoids deadlocks by letting each handshake make progress
independently.

* Move the dial task into a separate function

This refactor improves readability.

* Fix buggy future::select function usage

And document the correctness of the new code.
2021-04-07 10:25:10 -03:00
teor de6d1c93f3
Clarify a comment 2021-04-07 18:56:38 +10:00
teor 64662a758d
Move the preallocate tests into their own files (#1977)
* Move the preallocate tests into their own files

And move the MetaAddr proptest into its own file.

Also do some minor formatting and cleanups.

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2021-04-07 12:32:27 +10:00
Preston Evans 0daaf582e2
Implement Trusted Vector Preallocation (#1920)
* Implement SafePreallocate. Resolves #1880

* Add proptests for SafePreallocate

* Apply suggestions from code review

Comments which did not include replacement code will be addressed in a follow-up commit.

Co-authored-by: teor <teor@riseup.net>

* Rename [Safe-> Trusted]Allocate. Add doc and tests

Add tests to show that the largest allowed vec under TrustedPreallocate
is small enough to fit in a Zcash block/message (depending on type).
Add doc comments to all TrustedPreallocate test cases.
Tighten bounds on max_trusted_alloc for some types.

Note - this commit does NOT include TrustedPreallocate
impls for JoinSplitData, String, and Script.
These impls will be added in a follow up commit

* Impl TrustedPreallocate for Joinsplit

* Impl ZcashDeserialize for Vec<u8>

* Arbitrary, TrustedPreallocate, Serialize, and tests for Spend<SharedAnchor>

Co-authored-by: teor <teor@riseup.net>
2021-04-06 09:49:42 +10:00
teor 83b88f5b7a
Merge pull request #1972 from ZcashFoundation/peer-set-demand-deadlock-doc
Document peer set deadlock resistance
2021-04-01 22:50:17 -04:00
teor 306fa88214 Document the correctness of Poll::Pending wakeups 2021-03-27 08:55:49 -04:00
teor b329892665 Add a comment about a zcashd inv message bug 2021-03-26 11:26:59 -04:00
teor 1a159dfcb6 Add more methods for creating MetaAddrs
This refactor lets us remove `MetaAddr::update_last_seen()`.
2021-03-26 07:23:49 +10:00
teor 6fe81d8992 Make MetaAddr.last_seen into a private field 2021-03-26 07:23:49 +10:00
teor eae59de1e8 use PeerAddrState::* 2021-03-26 07:23:49 +10:00
teor e9cdc224a2 Rewrite MetaAddr::sanitize so it's harder to misuse
`sanitize` could be misused in two ways:
* accidentally modifying the addresses in the address book itself
* forgetting to sanitize new fields added to `MetaAddr`

This change prevents accidental modification by taking `&self`, and
explicitly creates a new sanitized `MetaAddr` with all fields listed.
2021-03-26 07:23:49 +10:00
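The two safeguards described in the commit above (taking `&self`, and listing every field explicitly) can be illustrated with the sketch below. The struct fields and the rounding rule are assumptions made for the example and do not reflect Zebra's actual `MetaAddr` definition or sanitization rules.

```rust
/// Hypothetical, simplified address metadata.
#[derive(Clone, Debug, PartialEq)]
struct MetaAddr {
    addr: String,
    services: u64,
    last_seen: u32,
}

impl MetaAddr {
    /// Returns a sanitized copy, leaving the original untouched.
    ///
    /// Taking `&self` means the caller's copy (for example, the one stored in
    /// an address book) cannot be modified by accident.
    fn sanitize(&self) -> MetaAddr {
        // Hypothetical rule: round the time down to the start of the hour.
        let last_seen = self.last_seen - self.last_seen % 3600;

        // Every field is listed, with no `..self.clone()` shorthand, so
        // adding a new field to `MetaAddr` is a compile error until it is
        // handled here.
        MetaAddr {
            addr: self.addr.clone(),
            services: self.services,
            last_seen,
        }
    }
}
```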
Deirdre Connolly c5bad9fac2
Rename NU5 to Nu5 to appease newly stable clippy::upper-case-acronyms (#1945) 2021-03-26 07:22:50 +10:00
Deirdre Connolly 7efc700aca
Merge pull request #1713 from ZcashFoundation/use-groth16-batch-math
Use batch optimizations, load params in groth16::Verifier, verify Spend & Output descriptions in transaction verifier
2021-03-24 12:28:25 -04:00
Deirdre Connolly ca1d2de87d
Bump versions for v1.0.0-alpha.5 (#1932)
Zebra's latest alpha checkpoints on Canopy activation, continues our work on NU5, and fixes a security issue.

Some notable changes include:

## Added
- Log address book metrics when PeerSet or CandidateSet don't have many peers (#1906)
- Document test coverage workflow (#1919)
- Add a final job to CI, so we can easily require all the CI jobs to pass (#1927)

## Changed
- Zebra has moved its mandatory checkpoint from Sapling to Canopy (#1898, #1926)
  - This is a breaking change for users that depend on the exact height of the mandatory checkpoint.

## Fixed
- tower-batch: wake waiting workers on close to avoid hangs (#1908)
- Assert that pre-Canopy blocks use checkpointing (#1909)
- Fix CI disk space usage by disabling incremental compilation in coverage builds (#1923)

## Security
- Stop relying on unchecked length fields when preallocating vectors (#1925)
2021-03-22 22:05:01 -04:00
Alfredo Garcia c5b1d0deee move consts to start of the function 2021-03-22 11:54:31 -04:00
teor b623acc945 Add memory DoS prevention comments 2021-03-22 11:54:31 -04:00
teor 8e18c99cdc Avoid risky use of Read::take with untrusted lengths
Zebra already uses `Read::take` to enforce message, body, and block
maximum sizes.

So using `Read::take` on untrusted sizes can result in short reads,
without a corresponding `UnexpectedEof` error. (The old code was
correct, but copying it elsewhere would have been risky.)
2021-03-22 11:54:31 -04:00
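The short-read risk described in the commit above can be reproduced with the standard library alone. In this sketch, both function names are hypothetical, and the length check is just one possible way to surface the error. It is not necessarily how Zebra fixed the issue.

```rust
use std::io::{self, Read};

/// Reads up to `claimed_len` bytes using `Read::take`.
///
/// If `reader` is itself limited (for example, by an outer `take` that
/// enforces a maximum message size), this silently returns fewer bytes than
/// claimed, without any error.
fn read_with_take<R: Read>(reader: R, claimed_len: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.take(claimed_len).read_to_end(&mut buf)?;
    Ok(buf)
}

/// Same as `read_with_take`, but turns a short read into an
/// `UnexpectedEof` error by checking the final length.
fn read_exact_len<R: Read>(reader: R, claimed_len: u64) -> io::Result<Vec<u8>> {
    let buf = read_with_take(reader, claimed_len)?;
    if buf.len() as u64 != claimed_len {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "short read"));
    }
    Ok(buf)
}
```

Neither function preallocates based on `claimed_len`, so an oversized claimed length only costs as much memory as the bytes that are actually available.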
teor 609d70ae53 Stop untrusted preallocation during string deserialization
This is an easy memory denial of service attack.
2021-03-22 11:54:31 -04:00
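A common way to avoid this kind of memory denial of service is to cap the preallocation, as in the sketch below. The constant, its value, and both function names are hypothetical. The commit message does not say which limit Zebra uses.

```rust
/// Hypothetical cap on how many bytes are reserved up front.
const MAX_PREALLOCATION: usize = 1024;

/// Clamps an untrusted length field to the preallocation cap.
fn bounded_capacity(claimed_len: u64) -> usize {
    // Clamp in `u64` first, so the cast to `usize` cannot truncate.
    claimed_len.min(MAX_PREALLOCATION as u64) as usize
}

/// Allocates a buffer for a string whose length was claimed by a peer.
///
/// The vector can still grow past the cap as real bytes arrive, but a peer
/// cannot force a huge allocation just by sending a large length field.
fn allocate_for(claimed_len: u64) -> Vec<u8> {
    Vec::with_capacity(bounded_capacity(claimed_len))
}
```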
teor 4f923b90ea Log address book metrics when peers aren't responding 2021-03-17 10:47:04 +10:00
teor 5a30268d7a Log address metrics when the peer set has no ready peers 2021-03-17 10:47:04 +10:00
teor 6a342e93ca Refactor AddressBook metrics into their own struct
And provide an accessor function for address book metrics.
2021-03-17 10:47:04 +10:00
Alfredo Garcia d49eaab68e
Bump versions for zebrad 1.0.0-alpha.4 (#1913)
* Bump versions for zebrad 1.0.0-alpha.4

* add Cargo.lock
2021-03-16 21:12:37 -03:00
Jack Grigg 7a8cae9321 Tag message metrics by type 2021-03-17 09:38:07 +10:00
Jack Grigg e51f33a4b9 Use interoperable names for common metrics
These names match the equivalent metrics in zcashd, enabling common
metrics to be collected across both node types.
2021-03-17 09:38:07 +10:00
teor 8fabbce037
Document and log trailing message bytes (#1888)
* Rename a variable for consistency
* Log extra trailing message bytes at debug level
2021-03-15 08:25:27 +10:00
teor 976ec912db
Document that the listed address is also advertised to peers (#1891)
Documents a potential privacy leak, and a missing feature.
2021-03-15 08:25:07 +10:00
teor e50692bd51 CandidateSet: Add Listener Port Connections
Inbound connections on the Zcash protocol listener port
perform a handshake. If the handshake is successful, it
adds the peer to the AddressBook.
2021-03-09 23:05:18 -05:00
Jane Lusby 03aa6f671f
Implement outbound connection rate limiting - includes config rename with alias (#1855)
* Implement outbound connection rate limiting
* fix breaking change on config

Co-authored-by: teor <teor@riseup.net>
2021-03-10 01:36:05 +00:00
Jane Lusby e541746a50
Add initial support for NU5 to zebra (#1823)
* Add NU5 variant to NetworkUpgrade
* Add consensus branch ID for NU5
* Add network protocol versions for NU5
* Add NU5 to the protocol::version_consistent test
* Make unimplemented panic messages more specific
* Block target spacing doesn't change in NU5
* add comments for future updates for NU5

Co-authored-by: teor <teor@riseup.net>
2021-03-03 06:22:11 +10:00
teor 895bb43ead Clippy: Fix inconsistent struct member orders lint 2021-03-01 23:31:18 -05:00
teor 2587a4e272
Fix a peer DNS resolution edge case (#1796)
* Retry each peer DNS a few times individually

We retry each peer individually, as well as retrying if there are no
peers in the combined list.

DNS failures are correlated, so all peers can fail DNS, leaving Zebra
with a small list of custom-configured IP address peers.

Individual retries avoid this issue.

* Rename parse_peers to resolve_peers

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
2021-02-26 09:06:27 +10:00
teor 9c3f236075 Stop sending blocks and transactions on error 2021-02-25 08:44:57 -08:00
teor 78f162733d Revert "leverage return value for propagating errors"
This reverts commit e6cb20e13f.
2021-02-24 13:07:31 -08:00
teor 72e2e83828 Revert "introduce Transition enum"
This reverts commit 6906f87ead.
2021-02-24 13:07:31 -08:00
teor a5e89f4f2b Revert "accidental drop on mustusesender"
This reverts commit 5ec8d09e0d.
2021-02-24 13:07:31 -08:00
teor d60226a3cf Revert "rustfmt"
This reverts commit 9d9734ea81.
2021-02-24 13:07:31 -08:00
teor 359015b2be Revert "Only reject pending client requests when the peer has errored"
This reverts commit e06705ed81.
2021-02-24 13:07:31 -08:00
teor 663ed6c842 Revert "Remove remaining references to fail_with"
This reverts commit 5e4bf804aa.
2021-02-24 13:07:31 -08:00
teor 3c225550ee Revert "rename transitions from Exit to Close"
This reverts commit cfc4717b98.
2021-02-24 13:07:31 -08:00
teor 86dc66dfa9 Revert "deduplicate match arms in handle_client_request"
This reverts commit 2adee7b31a.
2021-02-24 13:07:31 -08:00
teor 292a4391e2 Revert "update comments throughout connection.rs"
This reverts commit 651d352ce1.
2021-02-24 13:07:31 -08:00
teor fc44a97925 Revert "remove unnecessary Option around request timeout"
This reverts commit c3724031df.
2021-02-24 13:07:31 -08:00
teor e06120cd36 Revert "ensure peer/client.rs comments are up to date"
This reverts commit 2266886a53.
2021-02-24 13:07:31 -08:00
teor 1a70d807b6 Revert "make sure peer/error.rs comments are up to date"
This reverts commit 6f205a1812.
2021-02-24 13:07:31 -08:00
teor 3b2077fcfd Revert "Apply suggestions from code review"
This reverts commit 736092abb8.
2021-02-24 13:07:31 -08:00
teor 7558f74c78 Bump versions for zebrad 1.0.0-alpha.3 2021-02-23 10:39:13 -05:00
dependabot[bot] b578d1ff2e build(deps): bump proptest-derive from 0.2.0 to 0.3.0
Bumps [proptest-derive](https://github.com/AltSysrq/proptest) from 0.2.0 to 0.3.0.
- [Release notes](https://github.com/AltSysrq/proptest/releases)
- [Changelog](https://github.com/AltSysrq/proptest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/AltSysrq/proptest/compare/proptest-derive-0.2.0...proptest-derive-0.3.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-02-22 01:33:54 -05:00
teor d4f2f27218
Add global span to spawned network tasks (#1761)
Closes #1575
2021-02-20 08:36:50 +10:00
ebfull b7fddbde94
Compute the expected body length to reduce heap allocations (#1773)
* Compute the expected body length to reduce heap allocations
2021-02-19 22:18:57 +00:00
Jane Lusby 736092abb8 Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
2021-02-19 14:11:35 -08:00
Jane Lusby 6f205a1812 make sure peer/error.rs comments are up to date 2021-02-19 14:11:35 -08:00
Jane Lusby 2266886a53 ensure peer/client.rs comments are up to date 2021-02-19 14:11:35 -08:00
Jane Lusby c3724031df remove unnecessary Option around request timeout 2021-02-19 14:11:35 -08:00
Jane Lusby 651d352ce1 update comments throughout connection.rs 2021-02-19 14:11:35 -08:00
Jane Lusby 2adee7b31a deduplicate match arms in handle_client_request 2021-02-19 14:11:35 -08:00
Jane Lusby cfc4717b98 rename transitions from Exit to Close 2021-02-19 14:11:35 -08:00
teor 5e4bf804aa Remove remaining references to fail_with 2021-02-19 14:11:35 -08:00
teor e06705ed81 Only reject pending client requests when the peer has errored
- Add an `ExitClient` transition, used when the internal client channel
  is closed or dropped, and there are no more pending requests
- Ignore pending requests after an `ExitClient` transition
- Reject pending requests when the peer has caused an error
  (the `Exit` and `ExitRequest` transitions)
- Remove `PeerError::ConnectionDropped`, because it is now handled by
  `ExitClient`. (Which is an internal error, not a peer error.)
2021-02-19 14:11:35 -08:00
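A minimal sketch of the rule described in the commit above: only peer-caused exits reject pending requests. The `Continue` variant and both method names are hypothetical, added for illustration; the real `Transition` enum in zebra-network may look different.

```rust
/// Simplified, hypothetical model of the transitions named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transition {
    /// Placeholder (assumption) for any transition that keeps the connection open.
    Continue,
    /// The internal client channel was closed or dropped, with no pending requests.
    ExitClient,
    /// The peer caused an error.
    Exit,
    /// The peer caused an error (request-related variant).
    ExitRequest,
}

impl Transition {
    /// Pending client requests are rejected only when the peer has errored.
    pub fn rejects_pending_requests(self) -> bool {
        matches!(self, Transition::Exit | Transition::ExitRequest)
    }

    /// Every transition except the placeholder ends the connection.
    pub fn closes_connection(self) -> bool {
        !matches!(self, Transition::Continue)
    }
}
```

Putting the decision in one method keeps the "internal error versus peer error" distinction out of the individual match arms.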
teor 9d9734ea81 rustfmt 2021-02-19 14:11:35 -08:00
Jane Lusby 5ec8d09e0d accidental drop on mustusesender 2021-02-19 14:11:35 -08:00
Jane Lusby 6906f87ead introduce Transition enum 2021-02-19 14:11:35 -08:00
Jane Lusby e6cb20e13f leverage return value for propagating errors 2021-02-19 14:11:35 -08:00
teor e61b5e50a2
Diagnostics for CI port conflict failures (#1766)
Log a "Trying..." message before each listener opens, to see if the
delay is inside Zebra, or in the test harness or OS.

Also report the configured and actual ports where possible, for better
diagnostics.
2021-02-18 12:15:09 -03:00
teor 5424e1d8ba
Fix candidate set address state handling (#1709)
Design:
- Add a `PeerAddrState` to each `MetaAddr`
- Use a single peer set for all peers, regardless of state
- Implement time-based liveness as an `AddressBook` method, rather than
  a `PeerAddrState` variant
- Delete `AddressBook.by_state`

Implementation:
- Simplify `AddressBook` changes using `update` and `take` modifier
  methods
- Simplify the `AddressBook` iterator implementation, replacing it with
  methods that are more obviously correct
- Consistently collect peer set metrics

Documentation:
- Expand and update the peer set documentation

We can optimise later, but for now we want simple code that is more
obviously correct.
2021-02-18 11:18:32 +10:00
teor 579bd4a368
Retry DNS resolution on failure (#1762)
Otherwise, a transient DNS failure makes the node hang.
2021-02-18 07:09:02 +10:00
teor 86169f6412
Update PeerSet metrics after every change (#1727) 2021-02-18 07:06:59 +10:00
teor 8d1c498234 Log initial peer connection failures
And standardise another log message
2021-02-17 09:21:53 -05:00
teor e85441c914 Add a correctness comment to justify the revert 2021-02-16 05:52:54 +10:00
teor a02a00a3f5 Revert "Stop using CallAllUnordered in peer_set::add_initial_peers (#1705)"
This reverts commit 241c7ad849.
2021-02-16 05:52:54 +10:00
teor e7176b86da Clarify the Response::Nil documentation 2021-02-11 09:45:42 -05:00
Deirdre Connolly 0c5daa8410 Bump versions for zebrad 1.0.0-alpha.2
Including tower-batch bump to 0.2.0, tower-fallback to 0.2.0, zebra-script to 1.0.0-alpha.3
2021-02-09 16:14:29 -05:00
Alfredo Garcia 241c7ad849
Stop using CallAllUnordered in peer_set::add_initial_peers (#1705)
* use ServiceExt::oneshot and FuturesUnordered

Co-authored-by: teor <teor@riseup.net>
2021-02-09 08:16:02 +10:00
teor 1e156a5d60 Document that connect_isolated only works on mainnet
Document that connect_isolated only works on mainnet.

See #1687.
2021-02-04 17:32:00 -05:00
Alfredo Garcia d7c40af2a8
Fix shutdown panics (#1637)
* add a shutdown flag in zebra_chain::shutdown
* fix network panic on shutdown
* fix checkpoint panic on shutdown
2021-02-03 19:03:28 +10:00
Alfredo Garcia 221512c733
Async DNS seeder lookups (#1662)
* replace to_socket_addrs
* refactor `resolve()` into `resolve_host()`
* use `resolve_host()` to resolve config peers
* add DNS_LOOKUP_TIMEOUT constant
* don't block the main thread in initialize
2021-02-03 12:20:26 +10:00
teor 983e94f9e4 Add a TODO for inbound error handling cleanup 2021-02-03 08:32:10 +10:00
Alfredo Garcia 4b34482264
Add hints to port conflict and lock file panics (#1535)
* add hint for port error
* add issue filter for port panic
* add lock file hint
* add metrics endpoint port conflict hint
* add hint for tracing endpoint port conflict
* add acceptance test for resource conflicts
* Split out common conflict test code into a function
* Add state, metrics, and tracing conflict tests

* Add a full set of stderr acceptance test functions

This change makes the stdout and stderr acceptance test interfaces
identical.

* move Zcash listener opening
* add todo about hint for disk full
* add constant for lock file
* match path in state cache
* don't match windows cache path

* Use Display for state path logs

Avoids weird escaping on Windows when using Debug

* Add Windows conflict error messages

* Turn PORT_IN_USE_ERROR into a regex

And add another alternative Windows-specific port error

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jane@zfnd.org>
2021-01-29 22:36:33 +10:00
Deirdre Connolly 1b09538277
Bump versions for zebrad 1.0.0-alpha.1 (#1646)
* Bump versions where appropriate

Tested with cargo install --locked --path etc

* Remove fixed panics from 'Known Issues'

* Change to alpha release series in the README

Co-authored-by: teor <teor@riseup.net>
2021-01-27 20:31:39 -05:00
teor b551d81f8d Explain why we stay connected on Inbound errors
We might be syncing using this peer, so it's ok to just ignore
any internal errors in their Inbound requests, and drop the
request.
2021-01-27 12:08:49 -08:00
teor 258789ed9b Use the rustc unknown lints attribute
The clippy unknown lints attribute was deprecated in
nightly in rust-lang/rust#80524. The old lint name now produces a
warning.

Since we're using `allow(unknown_lints)` to suppress warnings, we need to
add the canonical name, so we can continue to build without warnings on
nightly.

But we also need to keep the old name, so we can continue to build
without warnings on stable.

And therefore, we also need to disable the "removed lints" warning,
otherwise we'll get warnings about the old name on nightly.

We'll need to keep this transitional clippy config until rustc 1.51 is
stable.
2021-01-19 11:02:20 -05:00
teor 05fff8e6f7 Revert "Stop panicking when fail_with is called twice on a connection"
But keep the extra error information.
2021-01-18 00:23:36 -05:00
teor 4fe81da953 Improve logging for connection state errors 2021-01-18 00:23:36 -05:00
teor a6c1cd3c35 Stop panicking when fail_with is called twice on a connection
We can't rule out the connection state changing between the state checks
and any eventual failures, particularly in the presence of async code.

So we turn this panic into a warning.
2021-01-18 00:23:36 -05:00
teor 44c8fafc29 Stop processing the request after failing an overloaded connection
zebra-network's Connection expects that `fail_with` is only called once
per connection, but the overload handling code continues to process the
current request after an overload error, potentially leading to further
failures.

Closes #1599
2021-01-18 00:23:36 -05:00
teor 0f0fb93b5c Update some comments in zebra-network
Add ticket numbers, and update based on design decisions and new code.
2021-01-15 09:02:10 -05:00
teor 730910cd99 Upgrade to tokio 0.3.6 from crates.io
And remove the tokio git dependency patch
2021-01-12 15:37:27 -05:00
Jane Lusby 15698245e1
Deduplicate metrics dependencies (#1561)
## Motivation

This PR is motivated by the regression identified in https://github.com/ZcashFoundation/zebra/issues/1349. That PR notes that the metrics stopped working for most of the crates other than `zebrad`.

## Solution

This PR resolves the regression by deduplicating the `metrics` crate dependency. During a recent change we upgraded the metrics version in `zebrad` and a couple of our other crates, but we never updated the dependencies in `zebra-state`, `zebra-consensus`, or `zebra-network`. This caused the metrics macros to attempt to retrieve the current metrics exporter through the wrong function. We would install the metrics exporter in `0.13`, but then attempt to look it up through the `0.12` crate, which contains a different, unset instance of the metrics exporter static variable. As a result, the metrics macros get `None` for the current exporter, after which they silently give up.

## Related Issues

closes https://github.com/ZcashFoundation/zebra/issues/1349

## Follow Up Work

I noticed we have quite a few duplicate dependencies in our tree. We might be able to save some compilation time by auditing those and deduplicating them as much as possible.

- https://github.com/ZcashFoundation/zebra/issues/1582
Co-authored-by: teor <teor@riseup.net>
2021-01-12 12:28:56 +10:00
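The failure mode in the commit above can be illustrated with a small sketch: two copies of the same crate each carry their own static, so installing an exporter in one copy is invisible to the other. The module names, `set_recorder`, and `try_recorder` are hypothetical stand-ins, not the real `metrics` API.

```rust
/// Stand-in for the older copy of the crate (hypothetical name).
pub mod metrics_v0_12 {
    use std::sync::OnceLock;

    // Each copy of the crate has its own, independent static.
    static RECORDER: OnceLock<&'static str> = OnceLock::new();

    pub fn set_recorder(name: &'static str) -> bool {
        RECORDER.set(name).is_ok()
    }

    pub fn try_recorder() -> Option<&'static str> {
        RECORDER.get().copied()
    }
}

/// Stand-in for the newer copy of the crate (hypothetical name).
pub mod metrics_v0_13 {
    use std::sync::OnceLock;

    static RECORDER: OnceLock<&'static str> = OnceLock::new();

    pub fn set_recorder(name: &'static str) -> bool {
        RECORDER.set(name).is_ok()
    }

    pub fn try_recorder() -> Option<&'static str> {
        RECORDER.get().copied()
    }
}
```

Installing through `metrics_v0_13::set_recorder` and then looking up through `metrics_v0_12::try_recorder` returns `None`, which mirrors the silent failure the commit describes.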
dependabot[bot] 38ac869f57 build(deps): bump byteorder from 1.3.4 to 1.4.2
Bumps [byteorder](https://github.com/BurntSushi/byteorder) from 1.3.4 to 1.4.2.
- [Release notes](https://github.com/BurntSushi/byteorder/releases)
- [Changelog](https://github.com/BurntSushi/byteorder/blob/master/CHANGELOG.md)
- [Commits](https://github.com/BurntSushi/byteorder/compare/1.3.4...1.4.2)

Signed-off-by: dependabot[bot] <support@github.com>
2021-01-11 18:45:49 -05:00
teor b7d0a40ee1 Revert unused instrument macros
Reverts most of "Instrument some functions to try to locate the panic"
2021-01-06 13:07:23 -08:00
teor 6d3aa0002c Ensure received client request oneshots are used via the type system
The `peer::Client` translates `Request`s into `ClientRequest`s, which
it sends to a background task. If the send is `Ok(())`, it will assume
that it is safe to unconditionally poll the `Receiver` tied to the
`Sender` used to create the `ClientRequest`.

We enforce this invariant via the type system, by converting
`ClientRequest`s to `InProgressClientRequest`s when they are received by
the background task. These conversions are implemented by
`ClientRequestReceiver`.

Changes:
* Revert `ClientRequest` so it uses a `oneshot::Sender`
* Add `InProgressClientRequest`, which is the same as `ClientRequest`,
  but has a `MustUseOneshotSender`
* `impl From<ClientRequest> for InProgressClientRequest`

* Add a new `ClientRequestReceiver` type that wraps a
  `mpsc::Receiver<ClientRequest>`
* `impl Stream<InProgressClientRequest> for ClientRequestReceiver`,
  converting the successful result of `inner.poll_next_unpin` into an
  `InProgressClientRequest`

* Replace `client_rx: mpsc::Receiver<ClientRequest>` in `Connection`
  with the new `ClientRequestReceiver` type
* `impl From<mpsc::Receiver<ClientRequest>> for ClientRequestReceiver`
2021-01-06 13:07:23 -08:00
teor df1b0c8d58 Defer a timeout fix until later 2021-01-06 13:07:23 -08:00
teor d5cfd5ad5f Clarify the ClientRequest invariant
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2021-01-06 13:07:23 -08:00
teor f8ff2e9c0b Add more sends before dropping ClientRequests
This fix also changes heartbeat behaviour in the following ways:
* if the queue is full, the connection is closed. Previously, the sender
  would wait until the queue had emptied
* if the queue flush fails, Zebra panics, because it can't send an error
  on the ClientRequest sender, so the invariant is broken
2021-01-06 13:07:23 -08:00
teor 3e711ccc8a Instrument some functions to try to locate the panic 2021-01-06 13:07:23 -08:00
teor fa29fca917 Panic when must-use senders are dropped before use
Add a MustUseOneshotSender, which panics if its inner sender is unused.
Callers must call `send()` on the MustUseOneshotSender, or ensure that
the sender is canceled.

Replaces an unreliable panic in `Client::call()` with a reliable panic
when a must-use sender is dropped.
2021-01-06 13:07:23 -08:00
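A rough sketch of the must-use sender idea from the commit above, assuming a `std::sync::mpsc` channel as a stand-in for the oneshot channel. `MustUseSender` is a hypothetical name, and the sketch omits the "sender is canceled" escape hatch mentioned in the commit message.

```rust
use std::sync::mpsc;

/// Hypothetical wrapper that panics if it is dropped before `send` is called.
pub struct MustUseSender<T> {
    inner: Option<mpsc::Sender<T>>,
}

impl<T> MustUseSender<T> {
    pub fn new(tx: mpsc::Sender<T>) -> Self {
        Self { inner: Some(tx) }
    }

    /// Consumes the wrapper, so the sender can be used at most once.
    pub fn send(mut self, value: T) -> Result<(), mpsc::SendError<T>> {
        let tx = self
            .inner
            .take()
            .expect("inner sender is present until send is called");
        tx.send(value)
    }
}

impl<T> Drop for MustUseSender<T> {
    fn drop(&mut self) {
        // Avoid a double panic if the thread is already unwinding.
        if self.inner.is_some() && !std::thread::panicking() {
            panic!("must-use sender dropped without sending");
        }
    }
}
```

Because the check lives in `Drop`, the panic happens at the point where the sender is lost, rather than later when a receiver is polled.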
teor b03809ebe3 Add the invalid state to an unreachable panic message 2021-01-06 13:07:23 -08:00
teor 86136c7b5c Stop ignoring errors when the new state is AwaitingRequest
The previous code would send a Nil message on the Sender, even if the
result was actually an error.
2021-01-06 13:07:23 -08:00
teor da5084a10a Split the 3-level match using a temporary 2021-01-06 13:07:23 -08:00
teor fd23c46726 Remove a redundant fmt::Display bound 2021-01-06 13:07:23 -08:00
teor 3892894ffa Call ClientRequest.tx.send() even if there is an error
Previously, tx would be dropped before send if:
- the success case would have used tx to wait for further messages,
- but the response was actually an error.

Instead, send the error on `tx` and call `fail_with()` using the same
error.

To support this change, allow `fail_with()` to take a `PeerError` or a
`SharedPeerError`.
2021-01-06 13:07:23 -08:00
teor 28f3186182 Mark ClientRequest and State::AwaitingResponse as must_use 2021-01-06 13:07:23 -08:00
teor b1f14f47c6
Rewrite GetData handling to match the zcashd implementation (#1518)
* Rewrite GetData handling to match the zcashd implementation

`zcashd` silently ignores missing blocks, but sends found transactions
followed by a `NotFound` message:
e7b425298f/src/main.cpp (L5497)

This is significantly different to the behaviour expected by the old
Zebra connection state machine, which expected `NotFound` for blocks.

Also change Zebra's GetData responses to peer requests so they ignore
missing blocks.

* Stop hanging on incomplete transaction or block responses

Instead, if the peer sends an unexpected block, unexpected transaction,
or NotFound message:
1. end the request, and return a partial response containing any items
   that were successfully received
2. if none of the expected blocks or transactions were received, return
   an error, and close the connection
2021-01-04 13:25:35 +10:00
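The two-step rule in the commit above could be sketched as follows. The function and error names are hypothetical, and the sketch only models the decision, not the connection state machine.

```rust
/// Hypothetical error: the peer ended the request before sending any expected item.
#[derive(Debug, PartialEq, Eq)]
pub struct NothingReceived;

/// Called when an unexpected block, unexpected transaction, or NotFound arrives.
/// Returns the partial response if anything was received, otherwise an error
/// (after which the caller would close the connection).
pub fn finish_on_unexpected_message<T>(received: Vec<T>) -> Result<Vec<T>, NothingReceived> {
    if received.is_empty() {
        Err(NothingReceived)
    } else {
        Ok(received)
    }
}
```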
teor d482900e7f Remove a redundant pattern match
Identified by clippy's redundant_pattern_match lint.
2020-12-13 22:10:05 -05:00
teor 8e2f08221f
Add peer set tracing and unreachable panics (#1468)
Add some extra tracing and panics to double-check our
assumptions about the peer set state machine.
2020-12-14 11:00:39 +10:00
Henry de Valence 0842eb2dab
zebra: move to 1.x-based versioning. (#1476)
Previously we set the crate versions to 3.x, so that the major version was
aligned with the NU version.  But we want to be able to make API changes
independently of the NU schedule.
2020-12-08 08:53:07 +10:00
teor b4a50fd99f
Downgrade tokio to 0.3.4 to avoid a time wheel panic (#1453)
See tokio-rs/tokio#2789 for details. We were seeing this panic during
normal operation, not just at shutdown.
2020-12-04 13:52:37 +10:00
Henry de Valence b449fe93b2 network: correct data modeling for headers messages
We modeled a Bitcoin `headers` message as being a list of block headers.
However, the actual data structure is slightly different: it's a list of (block
header, transaction count) pairs.  This caused zcashd to reject our headers
messages.

To fix this, introduce a new `CountedHeader` struct with a `block::Header` and
transaction count `usize`, then thread it through the inbound service and the
state.

I tested this locally by running Zebra with these changes and inspecting a
trace-level log of the span of a peer connection that requested a nontrivial
headers packet from us, and verified that it did not reject our message.
2020-12-02 10:24:31 -08:00
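A minimal sketch of the corrected data model from the commit above: each entry pairs a header with a transaction count. The header is treated as opaque bytes here (an assumption made to keep the example self-contained), and `encode_headers` is a hypothetical helper rather than Zebra's serialization code. The count prefix uses the Bitcoin-style CompactSize encoding.

```rust
/// A block header together with its transaction count.
pub struct CountedHeader {
    /// Serialized header bytes (opaque in this sketch).
    pub header: Vec<u8>,
    pub transaction_count: usize,
}

/// Bitcoin-style CompactSize encoding.
pub fn write_compact_size(out: &mut Vec<u8>, n: u64) {
    match n {
        0..=0xfc => out.push(n as u8),
        0xfd..=0xffff => {
            out.push(0xfd);
            out.extend_from_slice(&(n as u16).to_le_bytes());
        }
        0x1_0000..=0xffff_ffff => {
            out.push(0xfe);
            out.extend_from_slice(&(n as u32).to_le_bytes());
        }
        _ => {
            out.push(0xff);
            out.extend_from_slice(&n.to_le_bytes());
        }
    }
}

/// Encodes a list of (header, transaction count) pairs, prefixed by the list length.
pub fn encode_headers(headers: &[CountedHeader]) -> Vec<u8> {
    let mut out = Vec::new();
    write_compact_size(&mut out, headers.len() as u64);
    for counted in headers {
        out.extend_from_slice(&counted.header);
        write_compact_size(&mut out, counted.transaction_count as u64);
    }
    out
}
```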
Henry de Valence bfbc737b6c network: don't cancel heartbeat requests
The cancellation implementation changes made to the connection state machine
mean that if a response oneshot is dropped, the connection will avoid
cancelling the request.  So the heartbeat task does have to wait on the response.
2020-12-02 02:18:13 -05:00
Henry de Valence 69ba5584f3 network: correct parsing of reject messages
Not all reject messages include a data field.  This change partially addresses
a problem that could lead to a depleted peer set:

1. We send a response to a `getheaders` message;
2. The remote peer `reject`s our `headers` message for some reason;
3. We fail to parse their `reject` message and close the connection;
4. Repeating this process, we have no more peers.

This commit fixes (3) but does not address (2).
2020-12-02 02:12:29 -05:00
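The parsing fix in the commit above amounts to treating the trailing data field as optional. A sketch under the assumption that the data field, when present, is exactly 32 bytes (a block or transaction hash); the function and error names are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum RejectDataError {
    /// The trailing bytes were neither empty nor a 32-byte hash.
    UnexpectedLength(usize),
}

/// Parses whatever follows the fixed `reject` fields.
/// An empty remainder is valid and means "no data field".
pub fn parse_reject_data(rest: &[u8]) -> Result<Option<[u8; 32]>, RejectDataError> {
    match rest.len() {
        0 => Ok(None),
        32 => {
            let mut data = [0u8; 32];
            data.copy_from_slice(rest);
            Ok(Some(data))
        }
        other => Err(RejectDataError::UnexpectedLength(other)),
    }
}
```

Returning `Ok(None)` for the empty case is what keeps a well-formed `reject` from being treated as a parse failure that closes the connection.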
teor 34518525a5 Improve peer set logging hints
Delete hints about configuring peers.
Delete hint for typical "no ready peers" behaviour.
2020-12-01 21:37:15 -08:00
Henry de Valence 00c4f4f0e6 network: record cause of handshake failure 2020-12-01 19:16:41 -08:00
Henry de Valence 5ccd1905fc network: avoid putting null bytes in trace output 2020-12-01 19:16:41 -08:00
Henry de Valence f93deb1cac network: fix missing {0} in PeerError::Serialization 2020-12-01 19:16:41 -08:00
Henry de Valence 18cf5e0249 network: use short Display for Message in spans
This makes the span data more compact (e.g., `msg_as_req{msg=block}`) and
restores the Debug impl for Message to show all of the data contained in the
message.  The full message is added as a single event at trace level in the
span to preserve the previous full-inspectability.
2020-12-01 19:16:41 -08:00
Jane Lusby a91d0f0bb6
Include short sha in log messages and error urls (#1410)
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.

To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs.

Co-authored-by: teor <teor@riseup.net>
2020-12-01 12:13:20 -08:00
teor 4d5ea4897c Log peer set ready and unready peers
* warn: if there are no peers at all
* info: if there are no ready peers
* trace: the number of ready and unready peers for every request

Log at most one warn or info log per minute, to avoid flooding the
terminal with log lines. Suppress warn and info logs for the first
minute, while the peer set is starting up.
2020-12-01 11:00:21 -05:00
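One way to get the "at most one log per minute, none in the first minute" behaviour from the commit above is a small limiter seeded with the startup time. This is a sketch, not the peer set's code; `LogLimiter` and its methods are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Hypothetical rate limiter for warn and info logs.
pub struct LogLimiter {
    interval: Duration,
    last: Instant,
}

impl LogLimiter {
    /// Seeding `last` with the startup time suppresses logs for the first interval.
    pub fn new(started_at: Instant, interval: Duration) -> Self {
        Self { interval, last: started_at }
    }

    /// Returns true at most once per interval.
    pub fn should_log(&mut self, now: Instant) -> bool {
        if now.saturating_duration_since(self.last) >= self.interval {
            self.last = now;
            true
        } else {
            false
        }
    }
}
```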
teor 92eb92d1dd
Disable the nightly clippy unnecessary_wraps lint (#1403)
It seems to be a bit broken - some of our functions return `Result` for
consistency with similar functions. But the lint picks them up anyway.
2020-12-01 12:20:57 +10:00
Alfredo Garcia 4544463059
Inbound `FindBlocks` and `FindHeaders` (#1347)
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions

Add a `max_len` argument to support `FindHeaders` requests.

Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.

* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:30:37 +10:00
Alfredo Garcia 7d42c63790 fix comment 2020-11-25 10:55:44 -08:00
teor 8d6ac8eece Placate clippy 2020-11-24 20:03:21 +10:00
Henry de Valence d90e709ce1 network: tidy peer set implementation
- rename functions more descriptively
- create a common `take_ready_service` function
- organize poll_ functions separately
2020-11-24 20:03:21 +10:00
Henry de Valence f36a4800b2 network: fix invariant violation in peer set
Closes #1183.

The peer set maintains a preselected ready service that it can use to
perform power-of-two-choices (p2c) routing of requests.  Ready services
are stored by key (socket address) in an `IndexMap`, and the preselected
service is represented by an `Option<usize>` indexing that map.  This
means that whenever the set of ready services changes (e.g., a service
is removed from the peer set, or a service is taken to be used to
process a request), the preselected index is invalidated.  The original
P2C-only implementation maintained this invariant but did not document
it.

The change to inventory-based routing introduced a bug by failing to
maintain this invariant and appropriately invalidate the preselected
index.  However, this was only noticeable approximately 1/N of the time
on the next request after an inventory-directed request, so the bug
occurred infrequently.  Luckily, the use of `.expect` caused the bug to
be an immediate panic, making it possible to identify by inspecting all
uses of the ready service map.
2020-11-24 20:03:21 +10:00
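The invariant described in the commit above can be sketched with a plain `Vec` standing in for the `IndexMap` (an assumption to avoid an external dependency): removing any ready service must clear the preselected index, because `swap_remove` moves another entry into the removed slot. All names here are hypothetical.

```rust
use std::net::SocketAddr;

/// Hypothetical, simplified model of the ready-service set.
pub struct ReadySet<S> {
    ready: Vec<(SocketAddr, S)>,
    /// Index into `ready`; valid only until the next removal.
    preselected: Option<usize>,
}

impl<S> ReadySet<S> {
    pub fn new() -> Self {
        Self { ready: Vec::new(), preselected: None }
    }

    /// Appending does not move existing entries, so the index stays valid.
    pub fn insert(&mut self, addr: SocketAddr, service: S) {
        self.ready.push((addr, service));
    }

    pub fn preselect(&mut self, index: usize) {
        if index < self.ready.len() {
            self.preselected = Some(index);
        }
    }

    /// Removes a service by key and invalidates the preselected index.
    pub fn take_ready_service(&mut self, addr: &SocketAddr) -> Option<S> {
        let index = self.ready.iter().position(|(a, _)| a == addr)?;
        self.preselected = None;
        Some(self.ready.swap_remove(index).1)
    }

    /// Takes the preselected service, if the index is still valid.
    pub fn take_preselected(&mut self) -> Option<(SocketAddr, S)> {
        let index = self.preselected.take()?;
        Some(self.ready.swap_remove(index))
    }
}
```

Routing every removal through one function, as the `take_ready_service` commit in this series does, makes the invalidation hard to forget.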
teor 6387dfe1d0 Fix individual crate compilation failures
Some Zebra crates don't compile individually due to missing features in
their dependencies. Add those features to each crate's dependency list.
2020-11-23 23:56:28 -08:00
Henry de Valence add94c1c45 deps: move to tokio 0.3, tower 0.4
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware.  This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method.  See

ddc64e8d4d

for more context on the Tower changes.  To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
2020-11-20 10:08:16 -08:00
Henry de Valence 06dd39df54
network: bump network version for Canopy (#1333)
Per https://zips.z.cash/zip-0251, nodes compatible with Canopy
activation on mainnet MUST advertise protocol version 170013 or later.

Once Canopy activates on testnet or mainnet, Canopy nodes SHOULD reject
new connections from pre-Canopy nodes, so this also increases the
minimum version.
2020-11-20 09:50:05 +10:00
Henry de Valence a3ab589d89 consensus,state: document cancellation contracts for services
This change explicitly documents cancellation contracts for our Tower services,
and tries to correct a bug in the implementation of the CheckpointVerifier,
which duplicates information from the state service but did not ensure that it
would be kept in sync.
2020-11-17 14:56:27 -08:00
teor ca4e792f47 Put messages in request/response order
And fix a comment typo
2020-11-17 07:52:53 +10:00
Alfredo Garcia 128643d81e
Call `zebra_test::init` where needed. (#1227)
* Add missing `zebra_test::init()` to zebra-chain
* Add missing `zebra_test::init()` to zebra-consensus
* Add missing `zebra_test::init()` to zebra-network
* Add missing `zebra_test::init()` to zebra-state
* Add missing `zebra_test::init()` to zebra-test
* Add missing `zebra_test::init()` to zebrad
2020-11-10 10:29:25 +10:00
Henry de Valence 8e709bfa88 network: don't fail on unsolicited messages
These messages might be unsolicited, or they might be a response to a
request we already canceled.  So don't fail the whole connection, just
drop the message and move on.
2020-10-26 12:05:35 -07:00
Henry de Valence 13daefa729 network: handle request cancellation in Connection
We handle request cancellation in two places: before we transition into
the AwaitingResponse state, and while we are in AwaitingResponse.  We
need both places, or else if we started processing a request, we
wouldn't process the cancellation until the timeout elapsed.

The first is a check that the oneshot is not already canceled.

For the second, we wait on a cancellation, either from a timeout or from
the tx channel closing.
2020-10-26 12:05:35 -07:00
teor 1e97691fc8 Fix some "needless lifetime" clippy lints
These lints seem to be new in clippy nightly.
2020-10-12 08:54:23 +10:00
Dimitris Apostolou 36279621f0 Fix typos 2020-10-06 12:16:41 +10:00
Henry de Valence 6dd7318d3b deps: use Tower 0.4 from git instead of 0.3.1.
This addresses at least three pain points:

- we were affected by bugs that were already fixed in git, but not in
  the released crate;
- we can use service combinators to transform requests and responses;
- we can use the hedge middleware.

The version in git is still marked as 0.3.1 but these changes will be
part of tower 0.4: https://github.com/tower-rs/tower/issues/431
2020-09-21 14:16:56 -07:00
Deirdre Connolly 33afeb37cb Add a comment about the short loop 2020-09-21 09:26:39 -07:00
Henry de Valence 6f3288814c network: avoid GetPeers timeout to accelerate init
The GetPeers requests sent while crawling the network are randomly
load-balanced over available peers.  But at the very beginning, they may
be both routed to the same peer, causing network initialization to be
delayed while the second one times out (since zcashd only ever responds
to the first addr message).

Only sending one GetPeers request per candidate set update means we
crawl the network a little more slowly, but avoids hanging on start.
2020-09-21 09:26:39 -07:00
Henry de Valence b72c249b96 network: add a metric+warning when shedding load 2020-09-21 09:26:39 -07:00
Henry de Valence 4df5632752 network: handle Message::NotFound as a response
This cleans up the response processing logic a little bit along the way,
but the overall division of responsibility should be better documented
in a future commit.
2020-09-20 10:21:18 -07:00
Henry de Valence 64905563d1 network: remove glob import in message-handling
This clarifies which parts are the handler state and which parts are the
incoming message.
2020-09-20 10:21:18 -07:00
Henry de Valence 9c021025a7 network: fill in remaining request/response pairs 2020-09-20 10:21:18 -07:00
Henry de Valence b289cb9164 network: clean up GetHeaders, GetBlocks modeling 2020-09-20 10:21:18 -07:00
Henry de Valence 3c993f33b1 network: add PeerError::WrongMessage
This lets us distinguish between cases where the message was unsupported
(e.g., BIP11 messages), and cases where the message was uninterpretable
in context (e.g., unsolicited messages).
2020-09-20 10:21:18 -07:00
Henry de Valence 430176dd0d network: clean up message-as-request translation 2020-09-20 10:21:18 -07:00
Henry de Valence 170f588ffb network: document load-shedding behavior
This was part of the original design and is described in the Connection
internals, but we never documented it externally.
2020-09-18 18:34:25 -07:00
Henry de Valence 1d3892e1dc network: rename alias to BoxError
This is shorter and consistent with Tower (which is why we use it in the
first place).
2020-09-18 18:34:25 -07:00
Henry de Valence 95f2463188 Try workaround for generator autotrait bug
> Added a test that the handshake's version message matches specified fields, but the test does not compile, because rustc doesn't believe that the Box<dyn std::error::Error + Send + Sync + 'static> is 'static, and therefore isn't a Box<dyn std::error::Error + Send + Sync + 'static>. This manifests as being unable to spawn the connect_isolated task. From digging through Tokio issues I believe that this is an instance of rust-lang/rust#64552 .

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-09-17 12:02:20 -07:00
Henry de Valence 81e8195f68 network: add connect_isolated distinguisher test
This is currently broken due to a rustc bug.
2020-09-17 12:02:20 -07:00
Henry de Valence b7472de43f network: add a zebra_network::connect_isolated() method.
The peer set provides an automatically managed connection pool, abstracting
away all the details of handling individual peer connections.  However, it's
also useful to be able to create completely isolated and
minimally-distinguishable connections to individual peers, in order to be able
to send specific messages over Tor, or to implement some custom network crawler
logic.
2020-09-17 12:02:20 -07:00
teor 66265dc11a Adjust the EWMA decay for the latest sync timeout 2020-09-09 15:35:09 -07:00
teor 1f7af0a779 Update the inv message processing comment
Cleanup after PR #1028.
2020-09-09 15:29:38 -07:00
teor 2a68ef5acb Update the peerset buffer size and sync timeout
Also add a bunch of comments and documentation for network-constrained
nodes, and for testnet.
2020-09-08 12:44:33 -07:00
teor e6e859dce2 Tweak sync timeouts
* increase the EWMA default and decay
* increase the block download retries
* increase the request and block download timeouts
* increase the sync timeout
2020-09-08 12:44:33 -07:00
Jane Lusby 1b17691dda improve logging 2020-09-08 12:37:34 -07:00
Jane Lusby 81a3ad3a0d filter inventory advertisements correctly 2020-09-08 12:37:34 -07:00
Henry de Valence 3f150eb16e
network: implement transaction request handling. (#1016)
This commit makes several related changes to the network code:

- adds a `TransactionsByHash(HashSet<transaction::Hash>)` request and
  `Transactions(Vec<Arc<Transaction>>)` response pair that allows
  fetching transactions from a remote peer;

- adds a `PushTransaction(Arc<Transaction>)` request that pushes an
  unsolicited transaction to a remote peer;

- adds an `AdvertiseTransactions(HashSet<transaction::Hash>)` request
  that advertises transactions by hash to a remote peer;

- adds an `AdvertiseBlock(block::Hash)` request that advertises a block
  by hash to a remote peer;

Then, it modifies the connection state machine so that outbound
requests to remote peers are handled properly:

- `TransactionsByHash` generates a `getdata` message and collects the
  results, like the existing `BlocksByHash` request.

- `PushTransaction` generates a `tx` message, and returns `Nil` immediately.

- `AdvertiseTransactions` and `AdvertiseBlock` generate an `inv`
  message, and return `Nil` immediately.

Next, it modifies the connection state machine so that messages
from remote peers generate requests to the inbound service:

- `getdata` messages generate `BlocksByHash` or `TransactionsByHash`
  requests, depending on the content of the message;

- `tx` messages generate `PushTransaction` requests;

- `inv` messages generate `AdvertiseBlock` or `AdvertiseTransactions`
  requests.

Finally, it refactors the request routing logic for the peer set to
handle advertisement messages, providing three routing methods:

- `route_p2c`, which uses p2c as normal (default);
- `route_inv`, which uses the inventory registry and falls back to p2c
  (used for `BlocksByHash` or `TransactionsByHash`);
- `route_all`, which broadcasts a request to all ready peers (used for
  `AdvertiseBlock` and `AdvertiseTransactions`).
2020-09-08 10:16:29 -07:00
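The routing table at the end of the commit above could be sketched as a single match. The request variants are stripped of their payloads, `Other` is a placeholder for every request not named in the commit message, and the `Route` enum and `route` function are hypothetical.

```rust
/// Request kinds named in the commit message, without payloads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Request {
    BlocksByHash,
    TransactionsByHash,
    AdvertiseBlock,
    AdvertiseTransactions,
    PushTransaction,
    /// Placeholder (assumption) for all other requests.
    Other,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    /// Power-of-two-choices load balancing (the default).
    P2c,
    /// Inventory registry first, falling back to p2c.
    Inv,
    /// Broadcast to all ready peers.
    All,
}

pub fn route(request: Request) -> Route {
    match request {
        Request::BlocksByHash | Request::TransactionsByHash => Route::Inv,
        Request::AdvertiseBlock | Request::AdvertiseTransactions => Route::All,
        Request::PushTransaction | Request::Other => Route::P2c,
    }
}
```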
Henry de Valence cad38415b2
network: fix bug in inventory advertisement handling (#1022)
* network: fix bug in inventory advertisement handling

The RFC https://zebra.zfnd.org/dev/rfcs/0003-inventory-tracking.html described
the use of a `broadcast` channel in place of an `mpsc` channel to get
ring-buffer behavior, keeping a bound on the size of the channel but dropping
old entries when the channel is full.

However, it didn't explicitly describe how this works (the `broadcast` channel
returns a `RecvError::Lagged(u64)` to inform receivers that they lost
messages), so the lag-handling wasn't implemented and I didn't notice in
review.

Instead, the ? operator bubbled the lag error all the way up from
`InventoryRegistry::poll_inventory` through `<PeerSet as Service>::poll_ready`
through various Tower wrappers to users of the peer set.  The error propagation
is bad enough, because it caused client errors that shouldn't have happened,
but there's a worse interaction.

The `Service` contract distinguishes between request errors (from
`Service::call`, scoped to the request) and service errors (from
`Service::poll_ready`, scoped to the service).  The `Service` contract
specifies that once a service returns an error from `poll_ready`, the service
can be assumed to be failed permanently.

I believe (but haven't tested or carefully worked through the details) that
this caused various tower middleware to report the entire peer set service as
permanently failed due to a transient inventory "error" (more of an indicator),
and I suspect that this is the cause of #1003, where all of the sync
component's requests end up failing because the peer set reported that it
failed permanently.  I am able to reproduce #1003 locally before this change
and unable to reproduce it locally after this change, though I have not tested
exhaustively.

* network: add metric for dropped inventory advertisements
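
As a rough illustration of both changes, here is a std-only model of a ring-buffer channel whose receiver treats lag as a counter to record rather than an error to propagate. `RingChannel`, its methods, and this `poll_inventory` signature are assumptions made for the sketch; only the `Lagged(u64)` shape is taken from the description above, and nothing here is the real channel API.

```rust
use std::collections::VecDeque;

/// Stand-in for what a ring-buffer channel reports to a slow receiver.
#[derive(Debug, PartialEq, Eq)]
enum RecvError {
    /// The receiver missed this many messages; later messages are still readable.
    Lagged(u64),
    /// The sender is gone and the buffer is empty.
    Closed,
}

/// Toy bounded ring buffer: when full, the oldest entry is dropped and counted.
/// Assumes `capacity` is at least 1.
struct RingChannel<T> {
    buf: VecDeque<T>,
    capacity: usize,
    dropped: u64,
    closed: bool,
}

impl<T> RingChannel<T> {
    fn new(capacity: usize) -> Self {
        Self { buf: VecDeque::with_capacity(capacity), capacity, dropped: 0, closed: false }
    }

    fn send(&mut self, item: T) {
        if self.buf.len() == self.capacity {
            self.buf.pop_front();
            self.dropped += 1;
        }
        self.buf.push_back(item);
    }

    fn close(&mut self) {
        self.closed = true;
    }

    /// Reports any lag first, then yields buffered items in order.
    fn try_recv(&mut self) -> Result<Option<T>, RecvError> {
        if self.dropped > 0 {
            let missed = self.dropped;
            self.dropped = 0;
            return Err(RecvError::Lagged(missed));
        }
        match self.buf.pop_front() {
            Some(item) => Ok(Some(item)),
            None if self.closed => Err(RecvError::Closed),
            None => Ok(None),
        }
    }
}

/// Drains the channel. Lag is added to a metric and skipped; only `Closed`
/// is returned to the caller as a failure.
fn poll_inventory(
    chan: &mut RingChannel<u32>,
    seen: &mut Vec<u32>,
    dropped_metric: &mut u64,
) -> Result<(), RecvError> {
    loop {
        match chan.try_recv() {
            Ok(Some(item)) => seen.push(item),
            Ok(None) => return Ok(()),
            // Using `?` here would turn a transient indicator into a service error.
            Err(RecvError::Lagged(missed)) => *dropped_metric += missed,
            Err(RecvError::Closed) => return Err(RecvError::Closed),
        }
    }
}
```

The point of the design is that a caller polling readiness only ever sees an error when the channel is truly finished, so a burst of advertisements cannot mark the whole service as failed.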

Co-authored-by: teor <teor@riseup.net>
2020-09-07 21:24:31 -07:00
Henry de Valence 9682d452ee network: add AddressBook::potentially_connected_peers(). 2020-09-07 11:13:15 -07:00
dependabot[bot] 142226ad57 build(deps): bump indexmap from 1.5.2 to 1.6.0
Bumps [indexmap](https://github.com/bluss/indexmap) from 1.5.2 to 1.6.0.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Commits](https://github.com/bluss/indexmap/compare/1.5.2...1.6.0)

Signed-off-by: dependabot[bot] <support@github.com>
2020-09-07 07:56:39 -04:00
Alfredo Garcia 454e75e7c0
Rename old references to BlockHeaderHash and BlockHeight (#1002)
* rename some references

* Apply suggestions from code review

Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: teor <teor@riseup.net>
2020-09-04 15:40:48 -07:00
teor b5c653ed93
Use ok_or for constants, rather than a redundant closure
* Use ok_or for constants in zebra-network
* Use ok_or for constants in zebra-consensus
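
For readers unfamiliar with the lint, a small sketch of the distinction follows. The constant and both functions are hypothetical examples, not code from either crate.

```rust
/// Hypothetical constant error value.
const MISSING_PORT: &str = "missing port";

fn port_eager(port: Option<u16>) -> Result<u16, &'static str> {
    // The error is a constant, so there is no work to defer:
    // `ok_or_else(|| MISSING_PORT)` would be a redundant closure.
    port.ok_or(MISSING_PORT)
}

fn port_lazy(port: Option<u16>) -> Result<u16, String> {
    // Building this error allocates, so a closure defers it to the `None` case.
    port.ok_or_else(|| format!("{}: no default for this network", MISSING_PORT))
}
```
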
2020-09-02 14:26:26 +10:00
Jane Lusby 88557ddd0a address more comments 2020-09-01 21:01:38 -04:00
Jane Lusby d933abeebf fix typo 2020-09-01 21:01:38 -04:00
Jane Lusby 96c8809348
Implement Inventory Tracking RFC (#963)
* Add .cargo to the gitignore file

* Implement Inventory Tracking RFC

* checkpoint

* wire together the inventory registry

* add comment documenting condition

* make inventory registry optional
2020-09-01 14:28:54 -07:00
Henry de Valence f91b91b6d8 network: clarify comment on Default for handshake::Builder
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-09-01 13:56:00 -07:00
Henry de Valence fddba7a336 network: remove handshake::Builder::with_addr
Use the listen_addr field already specified in the config.

Also, derive Clone for Handshake<S>.

Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-09-01 13:56:00 -07:00
Henry de Valence a5b6f39850 network: don't leak our exact time skew in handshakes. 2020-09-01 13:56:00 -07:00
Henry de Valence 1b5a824584 network: fix bug in BIP37 relay flag handling.
The relay flag in the version message is used in conjunction with BIP37 to
receive bloom-filtered transactions.  When it is set to false, transactions are
not relayed until a bloom filter is set.  Since we don't implement BIP37 (it's
not useful for shielded transactions), this means we'll never receive
transactions.
2020-09-01 13:56:00 -07:00
Henry de Valence 60a0b8c382 network: change Handshake::new to a Builder.
This allows more detailed control over the handshake parameters.
2020-09-01 13:56:00 -07:00
teor d7e32b68e5 fix: Split a clippy allow, so its comment is clearer 2020-09-01 11:40:18 -04:00
teor 5afa24588a fix: Remove unused dependencies 2020-08-20 14:49:17 -04:00
Henry de Valence ebdceb5197 chain: rename TransactionHash to transaction::Hash 2020-08-17 11:46:34 -07:00
Henry de Valence 2712c4b72a chain: rename BlockHeader to block::Header 2020-08-17 11:46:34 -07:00
Henry de Valence 103b663c40 chain: rename BlockHeight to block::Height 2020-08-17 11:46:34 -07:00
Henry de Valence 61dea90e2f chain: rename BlockHeaderHash to block::Hash
This is the first in a sequence of changes that change the block:: items
to not include Block as a prefix in their name, in accordance with the
Rust API guidelines.
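
A minimal sketch of the resulting naming style, where the module path carries the context instead of a type-name prefix. The field layouts and the `describe` helper are assumptions for illustration only.

```rust
mod block {
    /// Referred to as `block::Hash` rather than `BlockHeaderHash`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Hash(pub [u8; 32]);

    /// Referred to as `block::Height` rather than `BlockHeight`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Height(pub u32);
}

mod transaction {
    /// The same short name in another module does not collide.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Hash(pub [u8; 32]);
}

/// Call sites keep the module prefix, so the two hashes stay distinguishable.
fn describe(hash: block::Hash, height: block::Height, txid: transaction::Hash) -> String {
    format!(
        "block {:02x}.. at height {} contains tx {:02x}..",
        hash.0[0], height.0, txid.0[0]
    )
}
```
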
2020-08-17 11:46:34 -07:00
Henry de Valence 948b067808 chain: move Network, NetworkUpgrade to parameters
Also, avoid using star-imports of the enum variants, which pollutes the
namespace.
2020-08-17 11:46:34 -07:00
Henry de Valence dad6340cd3 chain: move BlockHeight into block 2020-08-17 11:46:34 -07:00
Henry de Valence b36fe8f937 chain: move sha256d to serialization module.
This extracts the SHA256d code from being split across two modules and puts it
in one module, under serialization.

The code is unchanged except for three deleted tests:

* `sha256d_flush` in `sha256d_writer` (not a meaningful test);
* `transactionhash_debug` (constructs an invalid transaction hash, and the
  behavior is tested in the next test);
* `decode_state_debug` (we do not need to test the Debug output of
  DecodeState);
2020-08-17 11:46:34 -07:00
Alfredo Garcia b41e33e066
Bytes read and bytes written metrics (#901)
* add bytes read and written metrics

* Apply suggestions from code review

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* store address as string

* Apply suggestions from code review

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* change addr to label

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* remove newline

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
2020-08-14 15:50:26 -07:00
Henry de Valence a79ce97957
Fix sync algorithm. (#887)
* checkpoint: reject older of duplicate verification requests.

If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled.  Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.
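
The "newer request wins" rule for duplicates can be sketched as follows. This is a simplified model, not the checkpoint verifier itself: `Queue`, `queue_block`, `finish`, and `VerifyError::Superseded` are hypothetical names, and `std::sync::mpsc` channels stand in for whatever reply mechanism the real code uses.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, PartialEq, Eq)]
enum VerifyError {
    /// A newer request for the same block replaced this one.
    Superseded,
}

type Reply = Result<(), VerifyError>;

/// Pending verification requests, at most one per block hash.
#[derive(Default)]
struct Queue {
    pending: HashMap<u64, Sender<Reply>>,
}

impl Queue {
    /// Queues a request, rejecting any older request for the same hash.
    fn queue_block(&mut self, hash: u64) -> Receiver<Reply> {
        let (tx, rx) = channel();
        if let Some(older) = self.pending.insert(hash, tx) {
            // The older caller has probably been canceled, so a failed send is ignored.
            let _ = older.send(Err(VerifyError::Superseded));
        }
        rx
    }

    /// Reports success to whichever request is currently queued for `hash`.
    fn finish(&mut self, hash: u64) {
        if let Some(tx) = self.pending.remove(&hash) {
            let _ = tx.send(Ok(()));
        }
    }
}
```
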

* sync: add a timeout layer to block requests.

Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.

* sync: restart syncing on error

Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.

* sync: additional debug info

* sync: handle lookahead limit correctly.

Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped.  This meant that completed tasks could be left until the limit was
exceeded again.  Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.
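
A sketch of the corrected order of operations: drain everything that has finished first, then compare the number of still-pending tasks with the limit. The `Task` and `Next` enums, the function name, and the strict `<` comparison are all assumptions for illustration.

```rust
/// Hypothetical state of one in-flight block download.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Task {
    Pending,
    Done(u64),
}

#[derive(Debug, PartialEq, Eq)]
enum Next {
    ExtendTips,
    WaitForBlocks,
}

/// Removes every finished task, then decides from what is still pending.
fn drain_and_decide(tasks: &mut Vec<Task>, lookahead_limit: usize) -> (Vec<u64>, Next) {
    let mut finished = Vec::new();
    // Extract all completed results, not just enough to get under the limit.
    tasks.retain(|task| match task {
        Task::Done(height) => {
            finished.push(*height);
            false
        }
        Task::Pending => true,
    });
    let next = if tasks.len() < lookahead_limit {
        Next::ExtendTips
    } else {
        Next::WaitForBlocks
    };
    (finished, next)
}
```
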

* network: add debug instrumentation to retry policy

* sync: instrument the spawned task

* sync: streamline ObtainTips/ExtendTips logic & tracing

This change does three things:

1.  It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method.  This means that when debugging we only have one
deduplication algorithm to focus on.

2.  It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.

3.  It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip").  The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.

* sync: hack to work around zcashd behavior

* sync: localize debug statement in extend_tips

* sync: change algorithm to define tips as pairs of hashes.

This is different enough from the existing description that its comments no
longer apply, so I removed them.  A further chunk of work is to change the sync
RFC to document this algorithm.

* sync: reduce block timeout

* state: add resource limits for sled

Closes #888

* sync: add a restart timeout constant

* sync: de-pub constants
2020-08-12 16:48:01 -07:00
teor 109666cc48
fix: Tweak the network listener log (#886) 2020-08-12 14:22:54 -07:00
Henry de Valence 299afe13df
zebra-network tweaks. (#877)
* network: move gossiped peer selection logic into address book.

* network: return BoxService from init.

* zebrad: add note on why we truncate the gossiped peer list

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* Remove unused .rustfmt.toml

Many of these options are never actually loaded by our CI because of a channel
mismatch, where they're not applied on stable but only on nightly (see the logs
from a rustfmt job).  This means that we can get different settings when
running `cargo fmt` on the nightly and stable channels, which was causing a CI
failure on this PR.  Reverting back to the default rustfmt settings avoids this
problem and keeps us in line with upstream rustfmt.  There's no loss to us
since we were using the defaults anyways.

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-11 13:07:44 -07:00
Alfredo Garcia 9c387521bd
Print endpoint addresses at startup (#867)
* print tracing and metrics endpoints in startup

* print network address in startup
2020-08-10 12:47:26 -07:00
teor ee6f0de14d refactor: Move NetworkUpgrade to zebra-chain 2020-08-10 18:54:42 +10:00
Henry de Valence 3d46ab746a
Clean up options in network config section. (#839)
Closes #536.

This removes:

- the user-agent (we can add a mechanism to specify extra BIP14 components later, if any users ask us for that feature);
- the EWMA parameters (these were put in the config just to avoid making a choice);
- the peer connection timeout (we can change the default value if anyone ever has a problem with it);
- the peer set request buffer size (setting this too low can make the application deadlock).

The new peer interval is left in.
2020-08-06 11:29:00 -07:00
teor c95d980bc2 doc: Explain current and minimum network protocol versions 2020-08-04 15:11:16 -04:00
teor 59eb23772d feature: Use the Canopy testnet network protocol version
Canopy will activate on testnet within the next 24 hours. To continue to
use testnet, we need to upgrade the Zebra network protocol version.
2020-08-04 12:13:58 +10:00
Henry de Valence ef0b200b82 restore Zebras to part of the name, not a comment 2020-07-29 18:46:47 -07:00
Jack Grigg d1e0e1abf5 fix: Broadcast a valid BIP 14 user agent
Closes ZcashFoundation/zebra#791.
2020-07-29 15:49:14 -04:00
teor 6be0f8ed2f fix: Warn if the listener port is for the wrong network
We'll fix the underlying defaults in #660, with the rest of the
listeners.
2020-07-29 16:03:52 +10:00