Commit Graph

286 Commits

Author SHA1 Message Date
Henry de Valence 948b067808 chain: move Network, NetworkUpgrade to parameters
Also, avoid using star-imports of the enum variants, which pollutes the
namespace.
2020-08-17 11:46:34 -07:00
Henry de Valence dad6340cd3 chain: move BlockHeight into block 2020-08-17 11:46:34 -07:00
Henry de Valence b36fe8f937 chain: move sha256d to serialization module.
This extracts the SHA256d code from being split across two modules and puts it
in one module, under serialization.

The code is unchanged except for three deleted tests:

* `sha256d_flush` in `sha256d_writer` (not a meaningful test);
* `transactionhash_debug` (constructs an invalid transaction hash, and the
  behavior is tested in the next test);
* `decode_state_debug` (we do not need to test the Debug output of
  DecodeState);
2020-08-17 11:46:34 -07:00
Alfredo Garcia b41e33e066
Bytes read and bytes written metrics (#901)
* add bytes read and written metrics

* Apply suggestions from code review

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* store address as string

* Apply suggestions from code review

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* change addr to label

Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>

* remove newline

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
2020-08-14 15:50:26 -07:00
Henry de Valence a79ce97957
Fix sync algorithm. (#887)
* checkpoint: reject older of duplicate verification requests.

If we get a duplicate block verification request, we should drop the older one
in favor of the newer one, because the older request is likely to have been
canceled.  Previously, this code would accept up to four duplicate verification
requests, then fail all subsequent ones.

* sync: add a timeout layer to block requests.

Note that if this timeout is too short, we'll bring down the peer set in a
retry storm.

* sync: restart syncing on error

Restart the syncing process when an error occurs, rather than ignoring it.
Restarting means we discard all tips and start over with a new block locator,
so we can have another chance to "unstuck" ourselves.

* sync: additional debug info

* sync: handle lookahead limit correctly.

Instead of extracting all the completed task results, the previous code pulled
results out until there were fewer tasks than the lookahead limit, then
stopped.  This meant that completed tasks could be left until the limit was
exceeded again.  Instead, extract all completed results, and use the number of
pending tasks to decide whether to extend the tip or wait for blocks to finish.

* network: add debug instrumentation to retry policy

* sync: instrument the spawned task

* sync: streamline ObtainTips/ExtendTips logic & tracing

This change does three things:

1.  It aligns the implementation of ObtainTips and ExtendTips so that they use
the same deduplication method.  This means that when debugging we only have one
deduplication algorithm to focus on.

2.  It streamlines the tracing output to not include information already
included in spans. Both obtain_tips and extend_tips have their own spans
attached to the events, so it's not necessary to add Scope: prefixes in
messages.

3.  It changes the messages to be focused on reporting the actual
events rather than the interpretation of the events (e.g., "got genesis hash in
response" rather than "peer could not extend tip").  The motivation for this
change is that when debugging, the interpretation of events is already known to
be incorrect, in the sense that the mental model of the code (no bug) does not
match its behavior (has bug), so presenting minimally-interpreted events forces
interpretation relative to the actual code.

* sync: hack to work around zcashd behavior

* sync: localize debug statement in extend_tips

* sync: change algorithm to define tips as pairs of hashes.

This is different enough from the existing description that its comments no
longer apply, so I removed them.  A further chunk of work is to change the sync
RFC to document this algorithm.

* sync: reduce block timeout

* state: add resource limits for sled

Closes #888

* sync: add a restart timeout constant

* sync: de-pub constants
2020-08-12 16:48:01 -07:00
teor 109666cc48
fix: Tweak the the network listener log (#886) 2020-08-12 14:22:54 -07:00
Henry de Valence 299afe13df
zebra-network tweaks. (#877)
* network: move gossiped peer selection logic into address book.

* network: return BoxService from init.

* zebrad: add note on why we truncate thegossiped peer list

Co-authored-by: Jane Lusby <jlusby42@gmail.com>

* Remove unused .rustfmt.toml

Many of these options are never actually loaded by our CI because of a channel
mismatch, where they're not applied on stable but only on nightly (see the logs
from a rustfmt job).  This means that we can get different settings when
running `cargo fmt` on the nightly and stable channels, which was causing a CI
failure on this PR.  Reverting back to the default rustfmt settings avoids this
problem and keeps us in line with upstream rustfmt.  There's no loss to us
since we were using the defaults anyways.

Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-08-11 13:07:44 -07:00
Alfredo Garcia 9c387521bd
Print endpoint addresses at startup (#867)
* print tracing and metrics endpoints in startup

* print network address in startup
2020-08-10 12:47:26 -07:00
teor ee6f0de14d refactor: Move NetworkUpgrade to zebra-chain 2020-08-10 18:54:42 +10:00
Henry de Valence 3d46ab746a
Clean up options in network config section. (#839)
Closes #536.

This removes:

- the user-agent (we can add a mechanism to specify extra BIP14 components later, if any users ask us for that feature);
- the EWMA parameters (these were put in the config just to avoid making a choice);
- the peer connection timeout (we can change the default value if anyone ever has a problem with it);
- the peer set request buffer size (setting this too low can make the application deadlock);

The new peer interval is left in.
2020-08-06 11:29:00 -07:00
teor c95d980bc2 doc: Explain current and minimum network protocol versions 2020-08-04 15:11:16 -04:00
teor 59eb23772d feature: Use the Canopy testnet network protocol version
Canopy will activate on testnet within the next 24 hours. To continue to
use testnet, we need to upgrade the Zebra network protocol version.
2020-08-04 12:13:58 +10:00
Henry de Valence ef0b200b82 restore Zebras to part of the name, not a comment 2020-07-29 18:46:47 -07:00
Jack Grigg d1e0e1abf5 fix: Broadcast a valid BIP 14 user agent
Closes ZcashFoundation/zebra#791.
2020-07-29 15:49:14 -04:00
teor 6be0f8ed2f fix: Warn if the listener port is for the wrong network
We'll fix the underlying defaults in #660, with the rest of the
listeners.
2020-07-29 16:03:52 +10:00
teor 536668f993 fix: allow(dead_code) on some protocol version functions 2020-07-28 22:10:20 -04:00
Henry de Valence 238dec51dd network: do not export Builder
This is used to construct the Codec, which is an internal type.  The
export was added in 4dc307f2.
2020-07-28 11:10:15 -07:00
teor 993532b604 feature: Add a "Genesis" network upgrade
We can use this network upgrade to implement different consensus rules
and chain context handling for genesis blocks.

Part of the chain state design in #682.
2020-07-27 14:03:14 -04:00
Henry de Valence 4aa00ad216 Align crate versions and user-agent with NU numbers.
We had a brief discussion on discord and it seemed like we had consensus on the
following versioning policy:

* zebrad: match major version to NU version, so we will start by releasing
  zebrad 3.0.0;

* zebra-* libraries: start by matching zebrad's version, then increment major
  versions of each library as we need to make breaking changes (potentially
  faster than the zebrad version, always respecting semver but making no
  guarantees about the longevity of major releases).

This commit sets all of the crate versions to 3.0.0-alpha.0 -- the -alpha.0
marks it as a prerelease not subject to perfect adherence to compatibility
guarantees.
2020-07-24 11:46:37 -07:00
teor da09965a5f feature: Get the current minimum protocol version 2020-07-23 15:52:18 +10:00
teor 85f113bc18 doc: Add a TODO to the network protocol 2020-07-23 15:52:18 +10:00
teor c9ee85c3b5 feature: Add network upgrade activation heights 2020-07-23 15:52:18 +10:00
Henry de Valence cc955a2bbe network: document Responses, add warning about unsolicited invs. 2020-07-22 17:55:52 -07:00
Jane Lusby a722cf33f7 enable new tracing instrumentation in tokio 2020-07-22 14:39:54 -04:00
Jane Lusby e105b4f6c5
properly document guarantee provided by zebra-network (#713)
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2020-07-22 14:38:00 -04:00
Henry de Valence 4a41c9254d network: avoid panic when shutting down cleanly.
When the connection sees the client_rx channel close it knows it will never get
any more requests, and it should terminate.  But instead of terminating, it
errored itself, and the method to error itself tries to pull all the
outstanding client requests from the channel in order to fail them before it
shuts down.  This results in reading from a closed channel, causing a panic.
Instead we return cleanly rather than failing (since we know there are no
outstanding requests, as the channel is closed).
2020-07-22 18:04:45 +10:00
Henry de Valence 0dc2d92ad8 network: ensure dropping a Client closes the connection.
This fixes a bug introduced when we added heartbeat support.  Recall that we
handle the Bitcoin connection state machine on a per-peer basis.  Each
connection has a task created from the `Connection` struct, and a `Client:
tower::Service` "frontend" that passes requests to it via a channel.  In the
`Connection` event loop, the connection checks whether the request channel has
been closed, indicating no further requests from the `Client`, in which case it
shuts itself down and cleans up resources.  This occurs when all of the senders
have been dropped.

However, this behavior broke when we introduced heartbeat support, because we
spawned an additional task to send heartbeat messages along the request
channel.  This meant that instead of having a single sender, dropped by the
`Client`, we have two senders, the `Client` and the "shadow client" task that
generates heartbeat messages.  This means that when the `Client` is dropped, we
still have a live sender and the connection is not closed.  To fix this, the
`Client` now uses a `oneshot` to shut down its corresponding heartbeat task.
This closes all senders.
2020-07-21 15:43:31 -07:00
teor b0cd920fad feature: Use the Heartwood protocol version in zebra-network 2020-07-21 10:46:07 -07:00
teor 1cb1f1c52e fix: Put the peer set config vars together 2020-07-21 12:20:48 -04:00
dependabot[bot] c8fe4b43d8 build(deps): bump indexmap from 1.4.0 to 1.5.0
Bumps [indexmap](https://github.com/bluss/indexmap) from 1.4.0 to 1.5.0.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Commits](https://github.com/bluss/indexmap/compare/1.4.0...1.5.0)

Signed-off-by: dependabot[bot] <support@github.com>
2020-07-21 12:19:01 -04:00
Alfredo Garcia fe2a468417
add favicon to generated docs (#681) 2020-07-17 16:45:29 -07:00
teor ab6d1f5ec8
fix: Use the default Zcash port in version messages (#661)
We don't provide our address yet, so the port should be ignored.

But let's use the correct port, to avoid carrying this bug forward into
working code.
2020-07-15 11:43:28 -07:00
Alfredo Garcia d8834b149a
Limit protocol messages size (#645)
* change body msg limit and test case
* accept body at the exact limit len
* test the edges of the limit value
2020-07-15 19:15:52 +10:00
Henry de Valence fcd2f43f39 network: add warning to connection handling code. 2020-07-09 11:15:06 -07:00
Henry de Valence 217c25ef07 network: propagate tracing Spans through peer connection 2020-07-09 11:15:06 -07:00
Dimitris Apostolou ba81d7d4c0 Fix typos 2020-07-07 11:13:49 -07:00
teor f999ec75e6 fix: Remove a non-standard unicode character in a comment 2020-07-01 16:03:14 -04:00
Deirdre Connolly 05316dee21 Listen on 0.0.0.0, not 127.0.0.1
Turns out when your node faces the internet directly, it has to listen
to those addresses directly.
2020-06-19 03:46:09 -04:00
Henry de Valence 6cc1627a5d zebrad: apply serde(default) to config sections
Each subsection has to have `serde(default)` to get the behaviour we want
(delete all fields except the ones that have been changed); otherwise, we can
delete only entire sections.
2020-06-18 17:43:36 -04:00
Jane Lusby df18ac72c5 fix sharedpeererror to propagate tracing context 2020-06-17 14:38:26 -07:00
Jane Lusby 685bdaf2df don't require absense of cancel handles
Prior to this change, we required that services that are canceled do not
have a cancel handle in the `cancel_handles` list, based on the
assumption that the handle must have been removed in the process of
canceling this service.

This doesn't holding up though, because it is currently possible for us
to have the same peer connect to us multiple times, the second connect
removes the cancel handle of the original connect and inserts it's own
cancel handle in its place. In this scenario, when the first service is
polled for readiness it will see that it has been canceled and go to
clean itself up, but when it asserts that it doesn't have a cancel
handle it will see the cancel handle of the second connect event, which
uses the same key as the first connect, and fail its debug assertion.

This change removes that debug assert on the assumption that it is okay
for a peer to connect multiple times consecutively, and that the correct
behavior in that case is to just cancel the first connection and
continue as normal.
2020-06-16 13:42:31 -07:00
Jane Lusby 4b9e4520ce
cleanup API for arc based error type (#469)
Co-authored-by: Jane Lusby <jane@zfnd.org>
2020-06-12 11:29:42 -07:00
George Tankersley d8b3db5679 Use new seeder address for yolo.money 2020-06-10 21:49:25 -04:00
George Tankersley 6606bcaa62 Update list of DNS seeders
This adds the Foundation's new seeders and removes Simon's defunct one.
2020-06-10 20:56:31 -04:00
Jane Lusby 431f194c0f
propagate errors out of zebra_network::init (#435)
Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service.

This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.
2020-06-09 12:24:28 -07:00
Jane Lusby 9f802cd8dd Wrap Transaction in Arc 2020-06-06 18:13:17 -04:00
Jane Lusby 9bcda0f9c7 Wrap Blocks in Arc throughout codebase 2020-06-05 00:36:55 -04:00
dependabot-preview[bot] 7a75522885 Bump indexmap from 1.3.2 to 1.4.0
Bumps [indexmap](https://github.com/bluss/indexmap) from 1.3.2 to 1.4.0.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Commits](https://github.com/bluss/indexmap/compare/1.3.2...1.4.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-01 15:38:00 -04:00
dependabot-preview[bot] 145d9a1835 Bump proptest from 0.9.6 to 0.10.0
Bumps [proptest](https://github.com/altsysrq/proptest) from 0.9.6 to 0.10.0.
- [Release notes](https://github.com/altsysrq/proptest/releases)
- [Changelog](https://github.com/AltSysrq/proptest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/altsysrq/proptest/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-29 15:06:40 -04:00
dependabot-preview[bot] e317b68b1d Bump proptest-derive from 0.1.2 to 0.2.0
Bumps [proptest-derive](https://github.com/AltSysrq/proptest) from 0.1.2 to 0.2.0.
- [Release notes](https://github.com/AltSysrq/proptest/releases)
- [Changelog](https://github.com/AltSysrq/proptest/blob/master/CHANGELOG.md)
- [Commits](https://github.com/AltSysrq/proptest/compare/proptest-derive-0.1.2...proptest-derive-0.2.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-28 23:00:29 -04:00