zebra

Commit Graph

Author	SHA1	Message	Date
teor	5cdcc5255f	Proptest `MetaAddr` sanitization and serialization together	2021-05-26 18:13:35 -04:00
teor	9f8b4f836e	Test round-trip serialization for gossiped `MetaAddr`s	2021-05-26 18:13:35 -04:00
teor	81630d19f2	Add service sanitization to `MetaAddr::sanitize` This makes sure that deserialization and generated `MetaAddr`s are consistent.	2021-05-26 18:13:35 -04:00
teor	bf6fe175dd	Stop deriving PartialEq for MetaAddr This makes sure Ord and PartialEq are always consistent.	2021-05-26 18:13:35 -04:00
teor	078385ae00	Canonicalise arbitrary IP addresses in proptests This makes round-trip serialization tests work.	2021-05-26 18:13:35 -04:00
teor	c0114a2c5f	Security: Stop panicking when serializing out-of-range times Zebra assumes that deserialized times are always able to be serialized. But this assumption is wrong because: - sanitization can modify times - gossiped `MetaAddr` validation can modify times	2021-05-26 18:13:35 -04:00
Pili Guerra	e3d2ae0a8a	Update versions for zebra v1.0.0-alpha.9 release (#2196 ) * Update versions for zebra v1.0.0-alpha.9 release * Update Cargo.lock	2021-05-26 13:01:39 +02:00
teor	f0549b2f7c	Derive Arbitrary impls for a bunch of chain and network types (#2179 ) Enable proptests for internal and external network protocol messages, using times with the correct protocol-specific ranges. (4 or 8 bytes.)	2021-05-24 11:10:07 -04:00
teor	57fb5c028c	Fix up some doc links (#2180 )	2021-05-21 12:06:31 -03:00
teor	2685fc746e	Remove CandidateSet state and add last seen time limit to candidate_set::validate_addrs (#2177 )	2021-05-21 02:21:13 +00:00
teor	752358d236	Fix some candidate set and meta addr doc links (#2174 ) Suggested by jvff.	2021-05-21 11:40:14 +10:00
teor	40d06657b3	Update new_gossiped_meta_addr to the latest API	2021-05-21 06:51:34 +10:00
teor	c7ea1395e7	Security: Fix CandidateSet timeout and fanout * Refactor: Split CandidateSet::update into separate functions * Security: Apply a timeout to the entire CandidateSet::update * Security: Stop using very large fanout limits during initialization Previously, Zebra used the number of resolved peer addresses. So it was possible for all peers to fail, and for Zebra to hang on the first update. And Zebra could send a fanout for each initial peer, regardless of whether their connection was successful. Also: - wait for at least one successful peer before trying an update - warn if there are no successful initial peers	2021-05-21 06:51:34 +10:00
Deirdre Connolly	bf72d6dbc0	Update zebra-network/src/peer/handshake.rs Co-authored-by: teor <teor@riseup.net>	2021-05-18 14:02:19 +10:00
teor	92828bbb29	Reliability: send local listener address to peers When peers ask for peer addresses, add our local listener address to the set of addresses, sanitize, then truncate. Sanitize shuffles addresses, so if there are lots of addresses in the address book, our address will only be sent to some peers.	2021-05-18 14:02:19 +10:00
teor	d2a8985dbc	Reliability: Add inbound canonical addresses to the address book Add canonical addresses from inbound connections to the address book, so that Zebra can use them for reconnection attempts. Use the newly added `NeverAttemptedAlternate` state for these addresses, so we try gossiped addresses first, then canonical addresses. This avoids duplicate connections to inbound peers.	2021-05-18 14:02:19 +10:00
teor	458c26f1e3	Limit initial candidate set fanout to the number of initial peers If there is a small number of initial peers, and they are slow, the initial candidate set update can appear to hang. To avoid this issue, limit the initial candidate set fanout to the number of initial peers. Once the initial peers have sent us more peer addresses, there is no need to limit the fanouts for future updates. Reported by Niklas Long of Equilibrium.	2021-05-18 07:54:03 +10:00
teor	679920f6b8	Stop trying to resolve empty initial peer lists Instead, log an error and return immediately.	2021-05-18 07:54:03 +10:00
teor	b600e82d6e	Security: Avoid silently corrupting invalid times during serialization (#2149 ) * Security: panic if an internally generated time is out of range If Zebra has a bug where it generates blocks, transactions, or meta addresses with bad times, panic. This avoids sending bad data onto the network. (Previously, Zebra would truncate some of these times, silently corrupting the underlying data.) Make it clear that deserialization of these objects is infallible.	2021-05-17 16:53:10 -04:00
teor	b0b8b2f61a	Add extra instrumentation for initialize and handshakes (#2122 ) * Instrument the crawl task When we created the crawl task, we forgot to instrument it with the global span. This fix makes sure that the git and network span appears on crawl logs. * Instrument the connector * Improve handshake instrumentation Make some spans debug, so there are not too many spans. * Add the address to initial peer connection errors	2021-05-17 16:49:16 -04:00
teor	7969459b19	Security: Move the Verack response after the version check (#2121 ) We should do as many local checks as possible, before sending further messages.	2021-05-17 16:39:44 -04:00
teor	c40cbee42f	Remove address book peers that have changed to clients If an address book peer stops advertising the NODE_SERVICES bit, remove it from the address book.	2021-05-14 23:45:42 +10:00
teor	f541f85792	Send unspecified addresses and client services for isolated connections	2021-05-14 23:45:42 +10:00
teor	9160365d06	Fix a comment	2021-05-14 23:45:42 +10:00
teor	a8a0d6450c	Security: stop gossiping temporary inbound remote addresses to peers - stop putting inbound addresses in the address book - drop address book entries that can't be used for outbound connections - distinguish between temporary inbound and permanent outbound peer addresses - also create variants to handle proxy connections (but don't use them yet) - avoid tracking connection state for isolated connections - document security constraints for the address book and peer set	2021-05-14 23:45:42 +10:00
teor	fde8f1e4ca	Security: stop panicking on out-of-range version timestamps, Credit: Equilibrium (#2148 ) * Security: stop panicking on out-of-range version timestamps Instead, return a deserialization error, and close the connection. This issue was reported by Equilibrium.	2021-05-14 17:13:11 +10:00
Pili Guerra	500dc2e511	Update version strings for Zebra v1.0.0-alpha.8 release (#2136 ) * Update versions for zebra v1.0.0-alpha.8 release * Update tower-batch and tower-fallback version strings * Update Cargo.lock	2021-05-12 14:27:36 +02:00
teor	1f40498fcf	Clippy nightly: disable owned cmp, stop comparing bool using assert_eq (#2073 ) * Disable clippy warnings about comparing a newly created struct In Sapling, we compare canonical JubJub bytes with a supplied byte array. Since we need to perform calculations to get it into canonical form, we need to create a newly owned object. * Clippy: use assert rather than assert_eq on a bool	2021-04-27 09:57:45 -03:00
Pili Guerra	ea1446ee92	Update version strings for Zebra v1.0.0-alpha.7 release (#2056 ) * Update version strings for Zebra v1.0.0-alpha.7 release	2021-04-23 12:56:25 +00:00
teor	7b13d5573a	Make String Zcash serialization consistent with deserialization After recent changes, serialization was `write_string`, but deserialization was `zcash_deserialize`.	2021-04-21 23:58:48 -04:00
Kirill Fomichev	afac2c2846	Use the default port for configured listen addresses with no port (#2043 ) * Allow use listen address in config without port * update comments * remove not used alias * use Network::default_port * Move tests and use toml instead json * change error message * Make match more readable Co-authored-by: teor <teor@riseup.net>	2021-04-21 23:14:29 +00:00
teor	0203d1475a	Refactor and document correctness for std::sync::Mutex<AddressBook>	2021-04-21 17:14:47 -04:00
teor	905b90d6a1	Refactor and document correctness for std::sync::Mutex in ErrorSlot	2021-04-21 16:39:06 -04:00
teor	3f45735f3f	Use futures::lock::Mutex for the nonce set	2021-04-21 01:39:49 -04:00
teor	2ed8bb00cf	Clarify CandidateSet state diagram We get inbound connections on the listener port, but the important part is the inbound connection itself.	2021-04-21 01:37:43 -04:00
teor	ad272f2bee	Make sure handshake version negotiation always has a timeout As part of this change, refactor handshake version negotiation into its own function.	2021-04-19 18:31:28 -04:00
teor	2cecd52a10	Fix comment typo	2021-04-19 10:11:22 -04:00
teor	8fb12f07a1	Fix outdated comment	2021-04-19 10:11:22 -04:00
teor	eabadb8301	Make heartbeats wait for the connection queue to empty, with a timeout Also cleanup the heartbeat code, so each heartbeat request/response runs in a future with a single timeout.	2021-04-19 10:11:22 -04:00
teor	0def12f825	Add split array serialization functions for Transaction::V5 (#2017 ) * Add functions for serializing and deserializing split arrays In Transaction::V5, Zcash splits some types into multiple arrays, with a single prefix count before the first array. Add utility functions for serializing and deserializing the subsequent arrays, with a parameter for the original array's length. * Use zcash_deserialize_bytes_external_count in zebra-network * Move some preallocate proptests to their own file And fix the test module structure so it is consistent with the rest of zebra-chain. * Add a convenience alias zcash_serialize_external_count * Explain why u64::MAX items will never be reached	2021-04-16 08:23:00 +10:00
teor	381c20b6af	Security: change the GetAddr fanout to 3 Zebra avoids having a majority of addresses from a single peer by asking 3 peers for new addresses. Also update a bunch of security comments and related documentation.	2021-04-15 13:09:14 -04:00
teor	59aa04c9b9	Stop panicking when Zebra sends a reject without extra data Also add round-trip unit tests for extra data and no extra data.	2021-04-15 12:20:33 -04:00
teor	a417c7c8c7	Use meaningful names for select! variables	2021-04-13 23:56:16 -04:00
teor	fb95de99a6	Refactor the dial result into a From impl	2021-04-13 18:52:49 -04:00
Alfredo Garcia	5ec05e91e1	update version strings for v1.0.0-alpha.6	2021-04-08 18:48:34 -04:00
teor	1626ec383a	Add InventoryHash and MetaAddr proptests (#1985 ) * Make proptest dependencies consistent between chain and network * Implement Arbitrary for InventoryHash and use it in tests * Impl Arbitrary for MetaAddr and use it in tests Also test some extreme times in MetaAddr sanitization.	2021-04-07 14:13:52 -03:00
teor	375c8d8700	Fix a deadlock between the crawler and dialer, and other hangs (#1950 ) * Stop ignoring inbound message errors and handshake timeouts To avoid hangs, Zebra needs to maintain the following invariants in the handshake and heartbeat code: - each handshake should run in a separate spawned task (not yet implemented) - every message, error, timeout, and shutdown must update the peer address state - every await that depends on the network must have a timeout Once the Connection is created, it should handle timeouts. But we need to handle timeouts during handshake setup. * Avoid hangs by adding a timeout to the candidate set update Also increase the fanout from 1 to 2, to increase address diversity. But only return permanent errors from `CandidateSet::update`, because the crawler task exits if `update` returns an error. Also log Peers response errors in the CandidateSet. * Use the select macro in the crawler to reduce hangs The `select` function is biased towards its first argument, risking starvation. As a side-benefit, this change also makes the code a lot easier to read and maintain. * Split CrawlerAction::Demand into separate actions This refactor makes the code a bit easier to read, at the cost of sometimes blocking the crawler on `candidates.next()`. That's ok, because `next` only has a short (< 100 ms) delay. And we're just about to spawn a separate task for each handshake. * Spawn a separate task for each handshake This change avoids deadlocks by letting each handshake make progress independently. * Move the dial task into a separate function This refactor improves readability. * Fix buggy future::select function usage And document the correctness of the new code.	2021-04-07 10:25:10 -03:00
teor	de6d1c93f3	Clarify a comment	2021-04-07 18:56:38 +10:00
teor	64662a758d	Move the preallocate tests into their own files (#1977 ) * Move the preallocate tests into their own files And move the MetaAddr proptest into its own file. Also do some minor formatting and cleanups. Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>	2021-04-07 12:32:27 +10:00
Preston Evans	0daaf582e2	Implement Trusted Vector Preallocation (#1920 ) * Implement SafePreallocate. Resolves #1880 * Add proptests for SafePreallocate * Apply suggestions from code review Comments which did not include replacement code will be addressed in a follow-up commit. Co-authored-by: teor <teor@riseup.net> * Rename [Safe-> Trusted]Allocate. Add doc and tests Add tests to show that the largest allowed vec under TrustedPreallocate is small enough to fit in a Zcash block/message (depending on type). Add doc comments to all TrustedPreallocate test cases. Tighten bounds on max_trusted_alloc for some types. Note - this commit does NOT include TrustedPreallocate impls for JoinSplitData, String, and Script. These impls will be added in a follow up commit * Implement SafePreallocate. Resolves #1880 * Add proptests for SafePreallocate * Apply suggestions from code review Comments which did not include replacement code will be addressed in a follow-up commit. Co-authored-by: teor <teor@riseup.net> * Rename [Safe-> Trusted]Allocate. Add doc and tests Add tests to show that the largest allowed vec under TrustedPreallocate is small enough to fit in a Zcash block/message (depending on type). Add doc comments to all TrustedPreallocate test cases. Tighten bounds on max_trusted_alloc for some types. Note - this commit does NOT include TrustedPreallocate impls for JoinSplitData, String, and Script. These impls will be added in a follow up commit * Impl TrustedPreallocate for Joinsplit * Impl ZcashDeserialize for Vec<u8> * Arbitrary, TrustedPreallocate, Serialize, and tests for Spend<SharedAnchor> Co-authored-by: teor <teor@riseup.net>	2021-04-06 09:49:42 +10:00
teor	83b88f5b7a	Merge pull request #1972 from ZcashFoundation/peer-set-demand-deadlock-doc Document peer set deadlock resistance	2021-04-01 22:50:17 -04:00
teor	306fa88214	Document the correctness of Poll::Pending wakeups	2021-03-27 08:55:49 -04:00
teor	b329892665	Add a comment about a zcashd inv message bug	2021-03-26 11:26:59 -04:00
teor	1a159dfcb6	Add more methods for creating MetaAddrs This refactor lets us remove `MetaAddr::update_last_seen()`.	2021-03-26 07:23:49 +10:00
teor	6fe81d8992	Make MetaAddr.last_seen into a private field	2021-03-26 07:23:49 +10:00
teor	eae59de1e8	use PeerAddrState::*	2021-03-26 07:23:49 +10:00
teor	e9cdc224a2	Rewrite MetaAddr::sanitize so it's harder to misuse `sanitize` could be misused in two ways: * accidentally modifying the addresses in the address book itself * forgetting to sanitize new fields added to `MetaAddr` This change prevents accidental modification by taking `&self`, and explicitly creates a new sanitized `MetaAddr` with all fields listed.	2021-03-26 07:23:49 +10:00
Deirdre Connolly	c5bad9fac2	Rename NU5 to Nu5 to appease newly stable clippy::upper-case-acronyms (#1945 )	2021-03-26 07:22:50 +10:00
Deirdre Connolly	7efc700aca	Merge pull request #1713 from ZcashFoundation/use-groth16-batch-math Use batch optimizations, load params in groth16::Verifier, verify Spend & Output descriptions in transaction verifier	2021-03-24 12:28:25 -04:00
Deirdre Connolly	ca1d2de87d	Bump versions for v1.0.0-alpha.5 (#1932 ) Zebra's latest alpha checkpoints on Canopy activation, continues our work on NU5, and fixes a security issue. Some notable changes include: ## Added - Log address book metrics when PeerSet or CandidateSet don't have many peers (#1906) - Document test coverage workflow (#1919) - Add a final job to CI, so we can easily require all the CI jobs to pass (#1927) ## Changed - Zebra has moved its mandatory checkpoint from Sapling to Canopy (#1898, #1926) - This is a breaking change for users that depend on the exact height of the mandatory checkpoint. ## Fixed - tower-batch: wake waiting workers on close to avoid hangs (#1908) - Assert that pre-Canopy blocks use checkpointing (#1909) - Fix CI disk space usage by disabling incremental compilation in coverage builds (#1923) ## Security - Stop relying on unchecked length fields when preallocating vectors (#1925)	2021-03-22 22:05:01 -04:00
Alfredo Garcia	c5b1d0deee	move consts to start of the function	2021-03-22 11:54:31 -04:00
teor	b623acc945	Add memory DoS prevention comments	2021-03-22 11:54:31 -04:00
teor	8e18c99cdc	Avoid risky use of Read::take with untrusted lengths Zebra already uses `Read::take` to enforce message, body, and block maximum sizes. So using `Read::take` on untrusted sizes can result in short reads, without a corresponding `UnexpectedEof` error. (The old code was correct, but copying it elsewhere would have been risky.)	2021-03-22 11:54:31 -04:00
teor	609d70ae53	Stop untrusted preallocation during string deserialization This is an easy memory denial of service attack.	2021-03-22 11:54:31 -04:00
teor	4f923b90ea	Log address book metrics when peers aren't responding	2021-03-17 10:47:04 +10:00
teor	5a30268d7a	Log address metrics when the peer set has no ready peers	2021-03-17 10:47:04 +10:00
teor	6a342e93ca	Refactor AddressBook metrics into their own struct And provide an accessor function for address book metrics.	2021-03-17 10:47:04 +10:00
Alfredo Garcia	d49eaab68e	Bump versions for zebrad 1.0.0-alpha.4 (#1913 ) * Bump versions for zebrad 1.0.0-alpha.4 * add Cargo.lock	2021-03-16 21:12:37 -03:00
Jack Grigg	7a8cae9321	Tag message metrics by type	2021-03-17 09:38:07 +10:00
Jack Grigg	e51f33a4b9	Use interoperable names for common metrics These names match the equivalent metrics in zcashd, enabling common metrics to be collected across both node types.	2021-03-17 09:38:07 +10:00
teor	8fabbce037	Document and log trailing message bytes (#1888 ) * Rename a variable for consistency * Log extra trailing message bytes at debug level	2021-03-15 08:25:27 +10:00
teor	976ec912db	Document that the listed address is also advertised to peers (#1891 ) Documents a potential privacy leak, and a missing feature.	2021-03-15 08:25:07 +10:00
teor	e50692bd51	CandidateSet: Add Listener Port Connections Inbound connections on the Zcash protocol listener port perform a handshake. If the handshake is successful, it adds the peer to the AddressBook.	2021-03-09 23:05:18 -05:00
Jane Lusby	03aa6f671f	Implement outbound connection rate limiting - includes config rename with alias (#1855 ) * Implement outbound connection rate limiting * fix breaking change on config Co-authored-by: teor <teor@riseup.net>	2021-03-10 01:36:05 +00:00
Jane Lusby	e541746a50	Add initial support for NU5 to zebra (#1823 ) * Add NU5 variant to NetworkUpgrade * Add consensus branch ID for NU5 * Add network protocol versions for NU5 * Add NU5 to the protocol::version_consistent test * Make unimplemented panic messages more specific * Block target spacing doesn't change in NU5 * add comments for future updates for NU5 Co-authored-by: teor <teor@riseup.net>	2021-03-03 06:22:11 +10:00
teor	895bb43ead	Clippy: Fix inconsistent struct member orders lint	2021-03-01 23:31:18 -05:00
teor	2587a4e272	Fix a peer DNS resolution edge case (#1796 ) * Retry each peer DNS a few times individually We retry each peer individually, as well as retrying if there are no peers in the combined list. DNS failures are correlated, so all peers can fail DNS, leaving Zebra with a small list of custom-configured IP address peers. Individual retries avoid this issue. * Rename parse_peers to resolve_peers Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>	2021-02-26 09:06:27 +10:00
teor	9c3f236075	Stop sending blocks and transactions on error	2021-02-25 08:44:57 -08:00
teor	78f162733d	Revert "leverage return value for propagating errors" This reverts commit `e6cb20e13f`.	2021-02-24 13:07:31 -08:00
teor	72e2e83828	Revert "introduce Transition enum" This reverts commit `6906f87ead`.	2021-02-24 13:07:31 -08:00
teor	a5e89f4f2b	Revert "accidental drop on mustusesender" This reverts commit `5ec8d09e0d`.	2021-02-24 13:07:31 -08:00
teor	d60226a3cf	Revert "rustfmt" This reverts commit `9d9734ea81`.	2021-02-24 13:07:31 -08:00
teor	359015b2be	Revert "Only reject pending client requests when the peer has errored" This reverts commit `e06705ed81`.	2021-02-24 13:07:31 -08:00
teor	663ed6c842	Revert "Remove remaining references to fail_with" This reverts commit `5e4bf804aa`.	2021-02-24 13:07:31 -08:00
teor	3c225550ee	Revert "rename transitions from Exit to Close" This reverts commit `cfc4717b98`.	2021-02-24 13:07:31 -08:00
teor	86dc66dfa9	Revert "deduplicate match arms in handle_client_request" This reverts commit `2adee7b31a`.	2021-02-24 13:07:31 -08:00
teor	292a4391e2	Revert "update comments throughout connection.rs" This reverts commit `651d352ce1`.	2021-02-24 13:07:31 -08:00
teor	fc44a97925	Revert "remove unnecessary Option around request timeout" This reverts commit `c3724031df`.	2021-02-24 13:07:31 -08:00
teor	e06120cd36	Revert "ensure peer/client.rs comments are up to date" This reverts commit `2266886a53`.	2021-02-24 13:07:31 -08:00
teor	1a70d807b6	Revert "make sure peer/error.s comments are up to date" This reverts commit `6f205a1812`.	2021-02-24 13:07:31 -08:00
teor	3b2077fcfd	Revert "Apply suggestions from code review" This reverts commit `736092abb8`.	2021-02-24 13:07:31 -08:00
teor	7558f74c78	Bump versions for zebrad 1.0.0-alpha.3	2021-02-23 10:39:13 -05:00
dependabot[bot]	b578d1ff2e	build(deps): bump proptest-derive from 0.2.0 to 0.3.0 Bumps [proptest-derive](https://github.com/AltSysrq/proptest) from 0.2.0 to 0.3.0. - [Release notes](https://github.com/AltSysrq/proptest/releases) - [Changelog](https://github.com/AltSysrq/proptest/blob/master/CHANGELOG.md) - [Commits](https://github.com/AltSysrq/proptest/compare/proptest-derive-0.2.0...proptest-derive-0.3.0) Signed-off-by: dependabot[bot] <support@github.com>	2021-02-22 01:33:54 -05:00
teor	d4f2f27218	Add global span to spawned network tasks (#1761 ) Closes #1575	2021-02-20 08:36:50 +10:00
ebfull	b7fddbde94	Compute the expected body length to reduce heap allocations (#1773 ) * Compute the expected body length to reduce heap allocations	2021-02-19 22:18:57 +00:00
Jane Lusby	736092abb8	Apply suggestions from code review Co-authored-by: teor <teor@riseup.net>	2021-02-19 14:11:35 -08:00
Jane Lusby	6f205a1812	make sure peer/error.s comments are up to date	2021-02-19 14:11:35 -08:00
Jane Lusby	2266886a53	ensure peer/client.rs comments are up to date	2021-02-19 14:11:35 -08:00
Jane Lusby	c3724031df	remove unnecessary Option around request timeout	2021-02-19 14:11:35 -08:00
Jane Lusby	651d352ce1	update comments throughout connection.rs	2021-02-19 14:11:35 -08:00
Jane Lusby	2adee7b31a	deduplicate match arms in handle_client_request	2021-02-19 14:11:35 -08:00
Jane Lusby	cfc4717b98	rename transitions from Exit to Close	2021-02-19 14:11:35 -08:00
teor	5e4bf804aa	Remove remaining references to fail_with	2021-02-19 14:11:35 -08:00
teor	e06705ed81	Only reject pending client requests when the peer has errored - Add an `ExitClient` transition, used when the internal client channel is closed or dropped, and there are no more pending requests - Ignore pending requests after an `ExitClient` transition - Reject pending requests when the peer has caused an error (the `Exit` and `ExitRequest` transitions) - Remove `PeerError::ConnectionDropped`, because it is now handled by `ExitClient`. (Which is an internal error, not a peer error.)	2021-02-19 14:11:35 -08:00
teor	9d9734ea81	rustfmt	2021-02-19 14:11:35 -08:00
Jane Lusby	5ec8d09e0d	accidental drop on mustusesender	2021-02-19 14:11:35 -08:00
Jane Lusby	6906f87ead	introduce Transition enum	2021-02-19 14:11:35 -08:00
Jane Lusby	e6cb20e13f	leverage return value for propagating errors	2021-02-19 14:11:35 -08:00
teor	e61b5e50a2	Diagnostics for CI port conflict failures (#1766 ) Log a "Trying..." message before each listener opens, to see if the delay is inside Zebra, or in the test harness or OS. Also report the configured and actual ports where possible, for better diagnostics.	2021-02-18 12:15:09 -03:00
teor	5424e1d8ba	Fix candidate set address state handling (#1709 ) Design: - Add a `PeerAddrState` to each `MetaAddr` - Use a single peer set for all peers, regardless of state - Implement time-based liveness as an `AddressBook` method, rather than a `PeerAddrState` variant - Delete `AddressBook.by_state` Implementation: - Simplify `AddressBook` changes using `update` and `take` modifier methods - Simplify the `AddressBook` iterator implementation, replacing it with methods that are more obviously correct - Consistently collect peer set metrics Documentation: - Expand and update the peer set documentation We can optimise later, but for now we want simple code that is more obviously correct.	2021-02-18 11:18:32 +10:00
teor	579bd4a368	Retry DNS resolution on failure (#1762 ) Otherwise, a transient DNS failure makes the node hang.	2021-02-18 07:09:02 +10:00
teor	86169f6412	Update PeerSet metrics after every change (#1727 )	2021-02-18 07:06:59 +10:00
teor	8d1c498234	Log initial peer connection failures And standardise another log message	2021-02-17 09:21:53 -05:00
teor	e85441c914	Add a correctness comment to justify the revert	2021-02-16 05:52:54 +10:00
teor	a02a00a3f5	Revert "Stop using CallAllUnordered in peer_set::add_initial_peers (#1705 )" This reverts commit `241c7ad849`.	2021-02-16 05:52:54 +10:00
teor	e7176b86da	Clarify the Response::Nil documentation	2021-02-11 09:45:42 -05:00
Deirdre Connolly	0c5daa8410	Bump versions for zebrad 1.0.0-alpha.2 Including tower-batch bump to 0.2.0, tower-fallback to 0.2.0, zebra-script to 1.0.0-alpha.3	2021-02-09 16:14:29 -05:00
Alfredo Garcia	241c7ad849	Stop using CallAllUnordered in peer_set::add_initial_peers (#1705 ) * use ServiceExt::oneshot and FuturesUnordered Co-authored-by: teor <teor@riseup.net>	2021-02-09 08:16:02 +10:00
teor	1e156a5d60	Document that connect_isolated only works on mainnet Document that connect_isolated only works on mainnet. See #1687.	2021-02-04 17:32:00 -05:00
Alfredo Garcia	d7c40af2a8	Fix shutdown panics (#1637 ) * add a shutdown flag in zebra_chain::shutdown * fix network panic on shutdown * fix checkpoint panic on shutdown	2021-02-03 19:03:28 +10:00
Alfredo Garcia	221512c733	Async DNS seeder lookups (#1662 ) * replace to_socket_addrs * refactor `resolve()` into `resolve_host()` * use `resolve_host()` to resolve config peers * add DNS_LOOKUP_TIMEOUT constant * don't block the main thread in initialize	2021-02-03 12:20:26 +10:00
teor	983e94f9e4	Add a TODO for inbound error handling cleanup	2021-02-03 08:32:10 +10:00
Alfredo Garcia	4b34482264	Add hints to port conflict and lock file panics (#1535 ) * add hint for port error * add issue filter for port panic * add lock file hint * add metrics endpoint port conflict hint * add hint for tracing endpoint port conflict * add acceptance test for resource conflicts * Split out common conflict test code into a function * Add state, metrics, and tracing conflict tests * Add a full set of stderr acceptance test functions This change makes the stdout and stderr acceptance test interfaces identical. * move Zcash listener opening * add todo about hint for disk full * add constant for lock file * match path in state cache * don't match windows cache path * Use Display for state path logs Avoids weird escaping on Windows when using Debug * Add Windows conflict error messages * Turn PORT_IN_USE_ERROR into a regex And add another alternative Windows-specific port error Co-authored-by: teor <teor@riseup.net> Co-authored-by: Jane Lusby <jane@zfnd.org>	2021-01-29 22:36:33 +10:00
Deirdre Connolly	1b09538277	Bump versions for zebrad 1.0.0-alpha.1 (#1646 ) * Bump versions where appropriate Tested with cargo install --locked --path etc * Remove fixed panics from 'Known Issues' * Change to alpha release series in the README Co-authored-by: teor <teor@riseup.net>	2021-01-27 20:31:39 -05:00
teor	b551d81f8d	Explain why we stay connected on Inbound errors We might be syncing using this peer, so it's ok to just ignore any internal errors in their Inbound requests, and drop the request.	2021-01-27 12:08:49 -08:00
teor	258789ed9b	Use the rustc unknown lints attribute The clippy unknown lints attribute was deprecated in nightly in rust-lang/rust#80524. The old lint name now produces a warning. Since we're using `allow(unknown_lints)` to suppress warnings, we need to add the canonical name, so we can continue to build without warnings on nightly. But we also need to keep the old name, so we can continue to build without warnings on stable. And therefore, we also need to disable the "removed lints" warning, otherwise we'll get warnings about the old name on nightly. We'll need to keep this transitional clippy config until rustc 1.51 is stable.	2021-01-19 11:02:20 -05:00
teor	05fff8e6f7	Revert "Stop panicking when fail_with is called twice on a connection" But keep the extra error information.	2021-01-18 00:23:36 -05:00
teor	4fe81da953	Improve logging for connection state errors	2021-01-18 00:23:36 -05:00
teor	a6c1cd3c35	Stop panicking when fail_with is called twice on a connection We can't rule out the connection state changing between the state checks and any eventual failures, particularly in the presence of async code. So we turn this panic into a warning.	2021-01-18 00:23:36 -05:00
teor	44c8fafc29	Stop processing the request after failing an overloaded connection zebra-network's Connection expects that `fail_with` is only called once per connection, but the overload handling code continues to process the current request after an overload error, potentially leading to further failures. Closes #1599	2021-01-18 00:23:36 -05:00
teor	0f0fb93b5c	Update some comments in zebra-network Add ticket numbers, and update based on design decisions and new code.	2021-01-15 09:02:10 -05:00
teor	730910cd99	Upgrade to tokio 0.3.6 from crates.io And remove the tokio git dependency patch	2021-01-12 15:37:27 -05:00
Jane Lusby	15698245e1	Deduplicate metrics dependencies (#1561 ) ## Motivation This PR is motivated by the regression identified in https://github.com/ZcashFoundation/zebra/issues/1349. That PR notes that the metrics stopped working for most of the crates other than `zebrad`. ## Solution This PR resolves the regression by deduplicating the `metrics` crate dependency. During a recent change we upgraded the metrics version in `zebrad` and a couple other of our crates, but we never updated the dependencies in `zebra-state`, `zebra-consensus`, or `zebra-network`. This caused the metrics macros to attempt to retrieve the current metrics exporter through the wrong function. We would install the metrics exporter in `0.13`, but then attempt to look it up through the `0.12` crate, which contains a different instance of the metrics exporter static variable which is unset. Doing this causes the metrics macros to return `None` for the current exporter after which they just silently give up. ## Related Issues closes https://github.com/ZcashFoundation/zebra/issues/1349 ## Follow Up Work I noticed we have quite a few duplicate dependencies in our tree. We might be able to save some compilation time by auditing those and deduplicating them as much as possible. - https://github.com/ZcashFoundation/zebra/issues/1582 Co-authored-by: teor <teor@riseup.net>	2021-01-12 12:28:56 +10:00
dependabot[bot]	38ac869f57	build(deps): bump byteorder from 1.3.4 to 1.4.2 Bumps [byteorder](https://github.com/BurntSushi/byteorder) from 1.3.4 to 1.4.2. - [Release notes](https://github.com/BurntSushi/byteorder/releases) - [Changelog](https://github.com/BurntSushi/byteorder/blob/master/CHANGELOG.md) - [Commits](https://github.com/BurntSushi/byteorder/compare/1.3.4...1.4.2) Signed-off-by: dependabot[bot] <support@github.com>	2021-01-11 18:45:49 -05:00
teor	b7d0a40ee1	Revert unused instrument macros Reverts most of "Instrument some functions to try to locate the panic"	2021-01-06 13:07:23 -08:00
teor	6d3aa0002c	Ensure received client request oneshots are used via the type system The `peer::Client` translates `Request`s into `ClientRequest`s, which it sends to a background task. If the send is `Ok(())`, it will assume that it is safe to unconditionally poll the `Receiver` tied to the `Sender` used to create the `ClientRequest`. We enforce this invariant via the type system, by converting `ClientRequest`s to `InProgressClientRequest`s when they are received by the background task. These conversions are implemented by `ClientRequestReceiver`. Changes: * Revert `ClientRequest` so it uses a `oneshot::Sender` * Add `InProgressClientRequest`, which is the same as `ClientRequest`, but has a `MustUseOneshotSender` * `impl From<ClientRequest> for InProgressClientRequest` * Add a new `ClientRequestReceiver` type that wraps a `mpsc::Receiver<ClientRequest>` * `impl Stream<InProgressClientRequest> for ClientRequestReceiver`, converting the successful result of `inner.poll_next_unpin` into an `InProgressClientRequest` * Replace `client_rx: mpsc::Receiver<ClientRequest>` in `Connection` with the new `ClientRequestReceiver` type * `impl From<mpsc::Receiver<ClientRequest>> for ClientRequestReceiver`	2021-01-06 13:07:23 -08:00
teor	df1b0c8d58	Defer a timeout fix until later	2021-01-06 13:07:23 -08:00
teor	d5cfd5ad5f	Clarify the ClientRequest invariant Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2021-01-06 13:07:23 -08:00
teor	f8ff2e9c0b	Add more sends before dropping ClientRequests This fix also changes heartbeat behaviour in the following ways: * if the queue is full, the connection is closed. Previously, the sender would wait until the queue had emptied * if the queue flush fails, Zebra panics, because it can't send an error on the ClientRequest sender, so the invariant is broken	2021-01-06 13:07:23 -08:00
teor	3e711ccc8a	Instrument some functions to try to locate the panic	2021-01-06 13:07:23 -08:00
teor	fa29fca917	Panic when must-use senders are dropped before use Add a MustUseOneshotSender, which panics if its inner sender is unused. Callers must call `send()` on the MustUseOneshotSender, or ensure that the sender is canceled. Replaces an unreliable panic in `Client::call()` with a reliable panic when a must-use sender is dropped.	2021-01-06 13:07:23 -08:00
teor	b03809ebe3	Add the invalid state to an unreachable panic message	2021-01-06 13:07:23 -08:00
teor	86136c7b5c	Stop ignoring errors when the new state is AwaitingRequest The previous code would send a Nil message on the Sender, even if the result was actually an error.	2021-01-06 13:07:23 -08:00
teor	da5084a10a	Split the 3-level match using a temporary	2021-01-06 13:07:23 -08:00
teor	fd23c46726	Remove a redundant fmt::Display bound	2021-01-06 13:07:23 -08:00
teor	3892894ffa	Call ClientRequest.tx.send() even if there is an error Previously, tx would be dropped before send if: - the success case would have used tx to wait for further messages, - but the response was actually an error. Instead, send the error on `tx` and call `fail_with()` using the same error. To support this change, allow `fail_with()` to take a `PeerError` or a `SharedPeerError`.	2021-01-06 13:07:23 -08:00
teor	28f3186182	Mark ClientRequest and State::AwaitingResponse as must_use	2021-01-06 13:07:23 -08:00
teor	b1f14f47c6	Rewrite GetData handling to match the zcashd implementation (#1518 ) * Rewrite GetData handling to match the zcashd implementation `zcashd` silently ignores missing blocks, but sends found transactions followed by a `NotFound` message: `e7b425298f/src/main.cpp (L5497)` This is significantly different to the behaviour expected by the old Zebra connection state machine, which expected `NotFound` for blocks. Also change Zebra's GetData responses to peer request so they ignore missing blocks. * Stop hanging on incomplete transaction or block responses Instead, if the peer sends an unexpected block, unexpected transaction, or NotFound message: 1. end the request, and return a partial response containing any items that were successfully received 2. if none of the expected blocks or transactions were received, return an error, and close the connection	2021-01-04 13:25:35 +10:00
teor	d482900e7f	Remove a redundant pattern match Identified by clippy's redundant_pattern_match lint.	2020-12-13 22:10:05 -05:00
teor	8e2f08221f	Add peer set tracing and unreachable panics (#1468 ) Add some extra tracing and panics to double-check our assumptions about the peer set state machine.	2020-12-14 11:00:39 +10:00
Henry de Valence	0842eb2dab	zebra: move to 1.x-based versioning. (#1476 ) Previously we set the crate versions to 3.x, so that the major version was aligned with the NU version. But we want to be able to make API changes independently of the NU schedule.	2020-12-08 08:53:07 +10:00
teor	b4a50fd99f	Downgrade tokio to 0.3.4 to avoid a time wheel panic (#1453 ) See tokio-rs/tokio#2789 for details. We were seeing this panic during normal operation, not just at shutdown.	2020-12-04 13:52:37 +10:00
Henry de Valence	b449fe93b2	network: correct data modeling for headers messages We modeled a Bitcoin `headers` message as being a list of block headers. However, the actual data structure is slightly different: it's a list of (block header, transaction count) pairs. This caused zcashd to reject our headers messages. To fix this, introduce a new `CountedHeader` struct with a `block::Header` and transaction count `usize`, then thread it through the inbound service and the state. I tested this locally by running Zebra with these changes and inspecting a trace-level log of the span of a peer connection that requested a nontrivial headers packet from us, and verified that it did not reject our message.	2020-12-02 10:24:31 -08:00
Henry de Valence	bfbc737b6c	network: don't cancel heartbeat requests The cancellation implementation changes made to the connection state machine mean that if a response oneshot is dropped, the connection will avoid cancelling the request. So the heartbeat task does have to wait on the response.	2020-12-02 02:18:13 -05:00
Henry de Valence	69ba5584f3	network: correct parsing of reject messages Not all reject messages include a data field. This change partially addresses a problem that could lead to a depleted peer set: 1. We send a response to a `getheaders` message; 2. The remote peer `reject`s our `headers` message for some reason; 3. We fail to parse their `reject` message and close the connection; 4. Repeating this process, we have no more peers. This commit fixes (3) but does not address (2).	2020-12-02 02:12:29 -05:00
teor	34518525a5	Improve peer set logging hints Delete hints about configuring peers. Delete hint for typical "no ready peers" behaviour.	2020-12-01 21:37:15 -08:00
Henry de Valence	00c4f4f0e6	network: record cause of handshake failure	2020-12-01 19:16:41 -08:00
Henry de Valence	5ccd1905fc	network: avoid putting null bytes in trace output	2020-12-01 19:16:41 -08:00
Henry de Valence	f93deb1cac	network: fix missing {0} in PeerError::Serialization	2020-12-01 19:16:41 -08:00
Henry de Valence	18cf5e0249	network: use short Display for Message in spans This makes the span data more compact (e.g., `msg_as_req{msg=block}`) and restores the Debug impl for Message to show all of the data contained in the message. The full message is added as a single event at trace level in the span to preserve the previous full-inspectability.	2020-12-01 19:16:41 -08:00
Jane Lusby	a91d0f0bb6	Include short sha in log messages and error urls (#1410 ) As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs. To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs. Co-authored-by: teor <teor@riseup.net>	2020-12-01 12:13:20 -08:00
teor	4d5ea4897c	Log peer set ready and unready peers * warn: if there are no peers at all * info: if there are no ready peers * trace: the number of ready and unready peers for every request Log at most one warn or info log per minute, to avoid flooding the terminal with log lines. Suppress warn and info logs for the first minute, while the peer set is starting up.	2020-12-01 11:00:21 -05:00
teor	92eb92d1dd	Disable the nightly clippy unnecessary_wraps lint (#1403 ) It seems to be a bit broken - some of our functions return `Result` for consistency with similar functions. But the lint picks them up anyway.	2020-12-01 12:20:57 +10:00
Alfredo Garcia	4544463059	Inbound `FindBlocks` and `FindHeaders` (#1347 ) * implement inbound `FindBlocks` * Handle inbound peer FindHeaders requests * handle request before having any chain tip * Split `find_chain_hashes` into smaller functions Add a `max_len` argument to support `FindHeaders` requests. Rewrite the hash collection code to use heights, so we can handle the `stop` hash and "no intersection" cases correctly. * Split state height functions into "any chain" and "best chain" * Rename the best chain block method to `best_block` * Move fmt utilities to zebra_chain::fmt * Summarise Debug for some Message variants Co-authored-by: teor <teor@riseup.net> Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-12-01 07:30:37 +10:00
Alfredo Garcia	7d42c63790	fix comment	2020-11-25 10:55:44 -08:00
teor	8d6ac8eece	Placate clippy	2020-11-24 20:03:21 +10:00
Henry de Valence	d90e709ce1	network: tidy peer set implementation - rename functions more descriptively - create a common `take_ready_service` function - organize poll_ functions separately	2020-11-24 20:03:21 +10:00
Henry de Valence	f36a4800b2	network: fix invariant violation in peer set Closes #1183. The peer set maintains a preselected ready service that it can use to perform power-of-two-choices (p2c) routing of requests. Ready services are stored by key (socket address) in an `IndexMap`, and the preselected service is represented by an `Option<usize>` indexing that map. This means that whenever the set of ready services changes (e.g., a service is removed from the peer set, or a service is taken to be used to process a request), the preselected index is invalidated. The original P2C-only implementation maintained this invariant but did not document it. The change to inventory-based routing introduced a bug by failing to maintain this invariant and appropriately invalidate the preselected index. However, this was only noticeable approximately 1/N of the time on the next request after an inventory-directed request, so the bug occurred infrequently. Luckily, the use of `.expect` caused the bug to be an immediate panic, making it possible to identify by inspecting all uses of the ready service map.	2020-11-24 20:03:21 +10:00
teor	6387dfe1d0	Fix individual crate compilation failures Some Zebra crates don't compile individually due to missing features in their dependencies. Add those features to each crate's dependency list.	2020-11-23 23:56:28 -08:00
Henry de Valence	add94c1c45	deps: move to tokio 0.3, tower 0.4 This change is mostly mechanical, with the exception of the changes to the `tower-batch` middleware. This middleware was adapted from `tower::buffer`, and the `tower::buffer` code was changed to implement its own bounded queue, because Tokio 0.3 removed the `mpsc::Sender::poll_send` method. See `ddc64e8d4d` for more context on the Tower changes. To match Tower as closely as possible in order to be able to upstream `tower-batch`, those changes are copied from `tower::Buffer` to `tower-batch`.	2020-11-20 10:08:16 -08:00
Henry de Valence	06dd39df54	network: bump network version for Canopy (#1333 ) Per https://zips.z.cash/zip-0251, nodes compatible with Canopy activation on mainnet MUST advertise protocol version 170013 or later. Once Canopy activates on testnet or mainnet, Canopy nodes SHOULD reject new connections from pre-Canopy nodes, so this also increases the minimum version.	2020-11-20 09:50:05 +10:00
Henry de Valence	a3ab589d89	consensus,state: document cancellation contracts for services This change explicitly documents cancellation contracts for our Tower services, and tries to correct a bug in the implementation of the CheckpointVerifier, which duplicates information from the state service but did not ensure that it would be kept in sync.	2020-11-17 14:56:27 -08:00
teor	ca4e792f47	Put messages in request/response order And fix a comment typo	2020-11-17 07:52:53 +10:00
Alfredo Garcia	128643d81e	Call `zebra_test::init` where needed. (#1227 ) * Add missing `zebra_test::init()` to zebra-chain * Add missing `zebra_test::init()` to zebra-consensus * Add missing `zebra_test::init()` to zebra-network * Add missing `zebra_test::init()` to zebra-state * Add missing `zebra_test::init()` to zebra-test * Add missing `zebra_test::init()` to zebrad	2020-11-10 10:29:25 +10:00
Henry de Valence	8e709bfa88	network: don't fail on unsolicited messages These messages might be unsolicited, or they might be a response to a request we already canceled. So don't fail the whole connection, just drop the message and move on.	2020-10-26 12:05:35 -07:00
Henry de Valence	13daefa729	network: handle request cancellation in Connection We handle request cancellation in two places: before we transition into the AwaitingResponse state, and while we are in AwaitingResponse. We need both places, or else if we started processing a request, we wouldn't process the cancellation until the timeout elapsed. The first is a check that the oneshot is not already canceled. For the second, we wait on a cancellation, either from a timeout or from the tx channel closing.	2020-10-26 12:05:35 -07:00
teor	1e97691fc8	Fix some "needless lifetime" clippy lints These lints seem to be new in clippy nightly.	2020-10-12 08:54:23 +10:00
Dimitris Apostolou	36279621f0	Fix typos	2020-10-06 12:16:41 +10:00
Henry de Valence	6dd7318d3b	deps: use Tower 0.4 from git instead of 0.3.1. This addresses at least three pain points: - we were affected by bugs that were already fixed in git, but not in the released crate; - we can use service combinators to transform requests and responses; - we can use the hedge middleware. The version in git is still marked as 0.3.1 but these changes will be part of tower 0.4: https://github.com/tower-rs/tower/issues/431	2020-09-21 14:16:56 -07:00
Deirdre Connolly	33afeb37cb	Add a comment about the short loop	2020-09-21 09:26:39 -07:00
Henry de Valence	6f3288814c	network: avoid GetPeers timeout to accelerate init The GetPeers requests sent while crawling the network are randomly load-balanced over available peers. But at the very beginning, they may be both routed to the same peer, causing network initialization to be delayed while the second one times out (since zcashd only ever responds to the first addr message). Only sending one GetPeers request per candidate set update means we crawl the network a little more slowly, but avoids hanging on start.	2020-09-21 09:26:39 -07:00
Henry de Valence	b72c249b96	network: add a metric+warning when shedding load	2020-09-21 09:26:39 -07:00
Henry de Valence	4df5632752	network: handle Message::NotFound as a response This cleans up the response processing logic a little bit along the way, but the overall division of responsibility should be better documented in a future commit.	2020-09-20 10:21:18 -07:00
Henry de Valence	64905563d1	network: remove glob import in message-handling This clarifies which parts are the handler state and which parts are the incoming message.	2020-09-20 10:21:18 -07:00
Henry de Valence	9c021025a7	network: fill in remaining request/response pairs	2020-09-20 10:21:18 -07:00
Henry de Valence	b289cb9164	network: clean up GetHeaders, GetBlocks modeling	2020-09-20 10:21:18 -07:00
Henry de Valence	3c993f33b1	network: add PeerError::WrongMessage This lets us distinguish between cases where the message was unsupported (e.g., BIP11 messages), and cases where the message was uninterpretable in context (e.g., unsolicited messages).	2020-09-20 10:21:18 -07:00
Henry de Valence	430176dd0d	network: clean up message-as-request translation	2020-09-20 10:21:18 -07:00
Henry de Valence	170f588ffb	network: document load-shedding behavior This was part of the original design and is described in the Connection internals, but we never documented it externally.	2020-09-18 18:34:25 -07:00
Henry de Valence	1d3892e1dc	network: rename alias to BoxError This is shorter and consistent with Tower (which is why we use it in the first place).	2020-09-18 18:34:25 -07:00
Henry de Valence	95f2463188	Try workaround for generator autotrait bug > Added a test that the handshake's version message matches specified fields, but the test does not compile, because rustc doesn't believe that the Box<dyn std::error::Error + Send + Sync + 'static> is 'static, and therefore isn't a Box<dyn std::error::Error + Send + Sync + 'static>. This manifests as being unable to spawn the connect_isolated task. From digging through Tokio issues I believe that this is an instance of rust-lang/rust#64552 . Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-09-17 12:02:20 -07:00
Henry de Valence	81e8195f68	network: add connect_isolated distinguisher test This is currently broken due to a rustc bug.	2020-09-17 12:02:20 -07:00
Henry de Valence	b7472de43f	network: add a zebra_network::connect_isolated() method. The peer set provides an automatically managed connection pool, abstracting away all the details of handling individual peer connections. However, it's also useful to be able to create completely isolated and minimally-distinguishable connections to individual peers, in order to be able to send specific messages over Tor, or to implement some custom network crawler logic.	2020-09-17 12:02:20 -07:00
teor	66265dc11a	Adjust the EWMA decay for the latest sync timeout	2020-09-09 15:35:09 -07:00
teor	1f7af0a779	Update the inv message processing comment Cleanup after PR #1028.	2020-09-09 15:29:38 -07:00
teor	2a68ef5acb	Update the peerset buffer size and sync timeout Also add a bunch of comments and documentation for network-constrained nodes, and for testnet.	2020-09-08 12:44:33 -07:00
teor	e6e859dce2	Tweak sync timeouts * increase the EWMA default and decay * increase the block download retries * increase the request and block download timeouts * increase the sync timeout	2020-09-08 12:44:33 -07:00
Jane Lusby	1b17691dda	improve logging	2020-09-08 12:37:34 -07:00
Jane Lusby	81a3ad3a0d	filter inventory advertisements correctly	2020-09-08 12:37:34 -07:00
Henry de Valence	3f150eb16e	network: implement transaction request handling. (#1016 ) This commit makes several related changes to the network code: - adds a `TransactionsByHash(HashSet<transaction::Hash>)` request and `Transactions(Vec<Arc<Transaction>>)` response pair that allows fetching transactions from a remote peer; - adds a `PushTransaction(Arc<Transaction>)` request that pushes an unsolicited transaction to a remote peer; - adds an `AdvertiseTransactions(HashSet<transaction::Hash>)` request that advertises transactions by hash to a remote peer; - adds an `AdvertiseBlock(block::Hash)` request that advertises a block by hash to a remote peer; Then, it modifies the connection state machine so that outbound requests to remote peers are handled properly: - `TransactionsByHash` generates a `getdata` message and collects the results, like the existing `BlocksByHash` request. - `PushTransaction` generates a `tx` message, and returns `Nil` immediately. - `AdvertiseTransactions` and `AdvertiseBlock` generate an `inv` message, and return `Nil` immediately. Next, it modifies the connection state machine so that messages from remote peers generate requests to the inbound service: - `getdata` messages generate `BlocksByHash` or `TransactionsByHash` requests, depending on the content of the message; - `tx` messages generate `PushTransaction` requests; - `inv` messages generate `AdvertiseBlock` or `AdvertiseTransactions` requests. Finally, it refactors the request routing logic for the peer set to handle advertisement messages, providing three routing methods: - `route_p2c`, which uses p2c as normal (default); - `route_inv`, which uses the inventory registry and falls back to p2c (used for `BlocksByHash` or `TransactionsByHash`); - `route_all`, which broadcasts a request to all ready peers (used for `AdvertiseBlock` and `AdvertiseTransactions`).	2020-09-08 10:16:29 -07:00
Henry de Valence	cad38415b2	network: fix bug in inventory advertisement handling (#1022) * network: fix bug in inventory advertisement handling The RFC https://zebra.zfnd.org/dev/rfcs/0003-inventory-tracking.html described the use of a `broadcast` channel in place of an `mpsc` channel to get ring-buffer behavior, keeping a bound on the size of the channel but dropping old entries when the channel is full. However, it didn't explicitly describe how this works (the `broadcast` channel returns a `RecvError::Lagged(u64)` to inform receivers that they lost messages), so the lag-handling wasn't implemented and I didn't notice in review. Instead, the ? operator bubbled the lag error all the way up from `InventoryRegistry::poll_inventory` through `<PeerSet as Service>::poll_ready` through various Tower wrappers to users of the peer set. The error propagation is bad enough, because it caused client errors that shouldn't have happened, but there's a worse interaction. The `Service` contract distinguishes between request errors (from `Service::call`, scoped to the request) and service errors (from `Service::poll_ready`, scoped to the service). The `Service` contract specifies that once a service returns an error from `poll_ready`, the service can be assumed to be failed permanently. I believe (but haven't tested or carefully worked through the details) that this caused various tower middleware to report the entire peer set service as permanently failed due to a transient inventory "error" (more of an indicator), and I suspect that this is the cause of #1003, where all of the sync component's requests end up failing because the peer set reported that it failed permanently. I am able to reproduce #1003 locally before this change and unable to reproduce it locally after this change, though I have not tested exhaustively. * network: add metric for dropped inventory advertisements Co-authored-by: teor <teor@riseup.net>	2020-09-07 21:24:31 -07:00
Henry de Valence	9682d452ee	network: add AddressBook::potentially_connected_peers().	2020-09-07 11:13:15 -07:00
dependabot[bot]	142226ad57	build(deps): bump indexmap from 1.5.2 to 1.6.0 Bumps [indexmap](https://github.com/bluss/indexmap) from 1.5.2 to 1.6.0. - [Release notes](https://github.com/bluss/indexmap/releases) - [Commits](https://github.com/bluss/indexmap/compare/1.5.2...1.6.0) Signed-off-by: dependabot[bot] <support@github.com>	2020-09-07 07:56:39 -04:00
Alfredo Garcia	454e75e7c0	Rename old references to BlockHeaderHash and BlockHeight (#1002) * rename some references * Apply suggestions from code review Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com> Co-authored-by: teor <teor@riseup.net>	2020-09-04 15:40:48 -07:00
teor	b5c653ed93	Use ok_or for constants, rather than a redundant closure * Use ok_or for constants in zebra-network * Use ok_or for constants in zebra-consensus	2020-09-02 14:26:26 +10:00
Jane Lusby	88557ddd0a	address more comments	2020-09-01 21:01:38 -04:00
Jane Lusby	d933abeebf	fix typo	2020-09-01 21:01:38 -04:00
Jane Lusby	96c8809348	Implement Inventory Tracking RFC (#963 ) * Add .cargo to the gitignore file * Implement Inventory Tracking RFC * checkpoint * wire together the inventory registry * add comment documenting condition * make inventory registry optional	2020-09-01 14:28:54 -07:00
Henry de Valence	f91b91b6d8	network: clarify comment on Default for handshake::Builder Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-09-01 13:56:00 -07:00
Henry de Valence	fddba7a336	network: remove handshake::Builder::with_addr Use the listen_addr field already specified in the config. Also, derive Clone for Handshake<S>. Co-authored-by: Jane Lusby <jane@zfnd.org>	2020-09-01 13:56:00 -07:00
Henry de Valence	a5b6f39850	network: don't leak our exact time skew in handshakes.	2020-09-01 13:56:00 -07:00
Henry de Valence	1b5a824584	network: fix bug in BIP37 relay flag handling. The relay flag in the version message is used in conjunction with BIP37 to receive bloom-filtered transactions. When it is set to false, transactions are not relayed until a bloom filter is set. Since we don't implement BIP37 (it's not useful for shielded transactions), this means we'll never receive transactions.	2020-09-01 13:56:00 -07:00
Henry de Valence	60a0b8c382	network: change Handshake::new to a Builder. This allows more detailed control over the handshake parameters.	2020-09-01 13:56:00 -07:00
teor	d7e32b68e5	fix: Split a clippy allow, so its comment is clearer	2020-09-01 11:40:18 -04:00
teor	5afa24588a	fix: Remove unused dependencies	2020-08-20 14:49:17 -04:00
Henry de Valence	ebdceb5197	chain: rename TransactionHash to transaction::Hash	2020-08-17 11:46:34 -07:00
Henry de Valence	2712c4b72a	chain: rename BlockHeader to block::Header	2020-08-17 11:46:34 -07:00
Henry de Valence	103b663c40	chain: rename BlockHeight to block::Height	2020-08-17 11:46:34 -07:00
Henry de Valence	61dea90e2f	chain: rename BlockHeaderHash to block::Hash This is the first in a sequence of changes that change the block:: items to not include Block as a prefix in their name, in accordance with the Rust API guidelines.	2020-08-17 11:46:34 -07:00
Henry de Valence	948b067808	chain: move Network, NetworkUpgrade to parameters Also, avoid using star-imports of the enum variants, which pollutes the namespace.	2020-08-17 11:46:34 -07:00
Henry de Valence	dad6340cd3	chain: move BlockHeight into block	2020-08-17 11:46:34 -07:00
Henry de Valence	b36fe8f937	chain: move sha256d to serialization module. This extracts the SHA256d code from being split across two modules and puts it in one module, under serialization. The code is unchanged except for three deleted tests: * `sha256d_flush` in `sha256d_writer` (not a meaningful test); * `transactionhash_debug` (constructs an invalid transaction hash, and the behavior is tested in the next test); * `decode_state_debug` (we do not need to test the Debug output of DecodeState);	2020-08-17 11:46:34 -07:00
Alfredo Garcia	b41e33e066	Bytes read and bytes written metrics (#901 ) * add bytes read and written metrics * Apply suggestions from code review Co-authored-by: Jane Lusby <jlusby42@gmail.com> * store address as string * Apply suggestions from code review Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca> * change addr to label Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca> * remove newline Co-authored-by: Jane Lusby <jlusby42@gmail.com> Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>	2020-08-14 15:50:26 -07:00
Henry de Valence	a79ce97957	Fix sync algorithm. (#887) * checkpoint: reject older of duplicate verification requests. If we get a duplicate block verification request, we should drop the older one in favor of the newer one, because the older request is likely to have been canceled. Previously, this code would accept up to four duplicate verification requests, then fail all subsequent ones. * sync: add a timeout layer to block requests. Note that if this timeout is too short, we'll bring down the peer set in a retry storm. * sync: restart syncing on error Restart the syncing process when an error occurs, rather than ignoring it. Restarting means we discard all tips and start over with a new block locator, so we can have another chance to "unstuck" ourselves. * sync: additional debug info * sync: handle lookahead limit correctly. Instead of extracting all the completed task results, the previous code pulled results out until there were fewer tasks than the lookahead limit, then stopped. This meant that completed tasks could be left until the limit was exceeded again. Instead, extract all completed results, and use the number of pending tasks to decide whether to extend the tip or wait for blocks to finish. * network: add debug instrumentation to retry policy * sync: instrument the spawned task * sync: streamline ObtainTips/ExtendTips logic & tracing This change does three things: 1. It aligns the implementation of ObtainTips and ExtendTips so that they use the same deduplication method. This means that when debugging we only have one deduplication algorithm to focus on. 2. It streamlines the tracing output to not include information already included in spans. Both obtain_tips and extend_tips have their own spans attached to the events, so it's not necessary to add Scope: prefixes in messages. 3. It changes the messages to be focused on reporting the actual events rather than the interpretation of the events (e.g., "got genesis hash in response" rather than "peer could not extend tip"). The motivation for this change is that when debugging, the interpretation of events is already known to be incorrect, in the sense that the mental model of the code (no bug) does not match its behavior (has bug), so presenting minimally-interpreted events forces interpretation relative to the actual code. * sync: hack to work around zcashd behavior * sync: localize debug statement in extend_tips * sync: change algorithm to define tips as pairs of hashes. This is different enough from the existing description that its comments no longer apply, so I removed them. A further chunk of work is to change the sync RFC to document this algorithm. * sync: reduce block timeout * state: add resource limits for sled Closes #888 * sync: add a restart timeout constant * sync: de-pub constants	2020-08-12 16:48:01 -07:00
teor	109666cc48	fix: Tweak the network listener log (#886)	2020-08-12 14:22:54 -07:00
Henry de Valence	299afe13df	zebra-network tweaks. (#877) * network: move gossiped peer selection logic into address book. * network: return BoxService from init. * zebrad: add note on why we truncate the gossiped peer list Co-authored-by: Jane Lusby <jlusby42@gmail.com> * Remove unused .rustfmt.toml Many of these options are never actually loaded by our CI because of a channel mismatch, where they're not applied on stable but only on nightly (see the logs from a rustfmt job). This means that we can get different settings when running `cargo fmt` on the nightly and stable channels, which was causing a CI failure on this PR. Reverting back to the default rustfmt settings avoids this problem and keeps us in line with upstream rustfmt. There's no loss to us since we were using the defaults anyways. Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-08-11 13:07:44 -07:00
Alfredo Garcia	9c387521bd	Print endpoint addresses at startup (#867 ) * print tracing and metrics endpoints in startup * print network address in startup	2020-08-10 12:47:26 -07:00
teor	ee6f0de14d	refactor: Move NetworkUpgrade to zebra-chain	2020-08-10 18:54:42 +10:00
Henry de Valence	3d46ab746a	Clean up options in network config section. (#839 ) Closes #536. This removes: - the user-agent (we can add a mechanism to specify extra BIP14 components later, if any users ask us for that feature); - the EWMA parameters (these were put in the config just to avoid making a choice); - the peer connection timeout (we can change the default value if anyone ever has a problem with it); - the peer set request buffer size (setting this too low can make the application deadlock); The new peer interval is left in.	2020-08-06 11:29:00 -07:00
teor	c95d980bc2	doc: Explain current and minimum network protocol versions	2020-08-04 15:11:16 -04:00
teor	59eb23772d	feature: Use the Canopy testnet network protocol version Canopy will activate on testnet within the next 24 hours. To continue to use testnet, we need to upgrade the Zebra network protocol version.	2020-08-04 12:13:58 +10:00
Henry de Valence	ef0b200b82	restore Zebras to part of the name, not a comment	2020-07-29 18:46:47 -07:00
Jack Grigg	d1e0e1abf5	fix: Broadcast a valid BIP 14 user agent Closes ZcashFoundation/zebra#791.	2020-07-29 15:49:14 -04:00
teor	6be0f8ed2f	fix: Warn if the listener port is for the wrong network We'll fix the underlying defaults in #660, with the rest of the listeners.	2020-07-29 16:03:52 +10:00
teor	536668f993	fix: allow(dead_code) on some protocol version functions	2020-07-28 22:10:20 -04:00
Henry de Valence	238dec51dd	network: do not export Builder This is used to construct the Codec, which is an internal type. The export was added in `4dc307f2`.	2020-07-28 11:10:15 -07:00
teor	993532b604	feature: Add a "Genesis" network upgrade We can use this network upgrade to implement different consensus rules and chain context handling for genesis blocks. Part of the chain state design in #682.	2020-07-27 14:03:14 -04:00
Henry de Valence	4aa00ad216	Align crate versions and user-agent with NU numbers. We had a brief discussion on discord and it seemed like we had consensus on the following versioning policy: * zebrad: match major version to NU version, so we will start by releasing zebrad 3.0.0; * zebra-* libraries: start by matching zebrad's version, then increment major versions of each library as we need to make breaking changes (potentially faster than the zebrad version, always respecting semver but making no guarantees about the longevity of major releases). This commit sets all of the crate versions to 3.0.0-alpha.0 -- the -alpha.0 marks it as a prerelease not subject to perfect adherence to compatibility guarantees.	2020-07-24 11:46:37 -07:00
teor	da09965a5f	feature: Get the current minimum protocol version	2020-07-23 15:52:18 +10:00
teor	85f113bc18	doc: Add a TODO to the network protocol	2020-07-23 15:52:18 +10:00
teor	c9ee85c3b5	feature: Add network upgrade activation heights	2020-07-23 15:52:18 +10:00
Henry de Valence	cc955a2bbe	network: document Responses, add warning about unsolicited invs.	2020-07-22 17:55:52 -07:00
Jane Lusby	a722cf33f7	enable new tracing instrumentation in tokio	2020-07-22 14:39:54 -04:00
Jane Lusby	e105b4f6c5	properly document guarantee provided by zebra-network (#713) Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>	2020-07-22 14:38:00 -04:00
Henry de Valence	4a41c9254d	network: avoid panic when shutting down cleanly. When the connection sees the client_rx channel close it knows it will never get any more requests, and it should terminate. But instead of terminating, it errored itself, and the method to error itself tries to pull all the outstanding client requests from the channel in order to fail them before it shuts down. This results in reading from a closed channel, causing a panic. Instead we return cleanly rather than failing (since we know there are no outstanding requests, as the channel is closed).	2020-07-22 18:04:45 +10:00
Henry de Valence	0dc2d92ad8	network: ensure dropping a Client closes the connection. This fixes a bug introduced when we added heartbeat support. Recall that we handle the Bitcoin connection state machine on a per-peer basis. Each connection has a task created from the `Connection` struct, and a `Client: tower::Service` "frontend" that passes requests to it via a channel. In the `Connection` event loop, the connection checks whether the request channel has been closed, indicating no further requests from the `Client`, in which case it shuts itself down and cleans up resources. This occurs when all of the senders have been dropped. However, this behavior broke when we introduced heartbeat support, because we spawned an additional task to send heartbeat messages along the request channel. This meant that instead of having a single sender, dropped by the `Client`, we have two senders, the `Client` and the "shadow client" task that generates heartbeat messages. This means that when the `Client` is dropped, we still have a live sender and the connection is not closed. To fix this, the `Client` now uses a `oneshot` to shut down its corresponding heartbeat task. This closes all senders.	2020-07-21 15:43:31 -07:00
teor	b0cd920fad	feature: Use the Heartwood protocol version in zebra-network	2020-07-21 10:46:07 -07:00
teor	1cb1f1c52e	fix: Put the peer set config vars together	2020-07-21 12:20:48 -04:00
dependabot[bot]	c8fe4b43d8	build(deps): bump indexmap from 1.4.0 to 1.5.0 Bumps [indexmap](https://github.com/bluss/indexmap) from 1.4.0 to 1.5.0. - [Release notes](https://github.com/bluss/indexmap/releases) - [Commits](https://github.com/bluss/indexmap/compare/1.4.0...1.5.0) Signed-off-by: dependabot[bot] <support@github.com>	2020-07-21 12:19:01 -04:00
Alfredo Garcia	fe2a468417	add favicon to generated docs (#681)	2020-07-17 16:45:29 -07:00
teor	ab6d1f5ec8	fix: Use the default Zcash port in version messages (#661) We don't provide our address yet, so the port should be ignored. But let's use the correct port, to avoid carrying this bug forward into working code.	2020-07-15 11:43:28 -07:00
Alfredo Garcia	d8834b149a	Limit protocol messages size (#645) * change body msg limit and test case * accept body at the exact limit len * test the edges of the limit value	2020-07-15 19:15:52 +10:00
Henry de Valence	fcd2f43f39	network: add warning to connection handling code.	2020-07-09 11:15:06 -07:00
Henry de Valence	217c25ef07	network: propagate tracing Spans through peer connection	2020-07-09 11:15:06 -07:00
Dimitris Apostolou	ba81d7d4c0	Fix typos	2020-07-07 11:13:49 -07:00
teor	f999ec75e6	fix: Remove a non-standard unicode character in a comment	2020-07-01 16:03:14 -04:00
Deirdre Connolly	05316dee21	Listen on 0.0.0.0, not 127.0.0.1 Turns out when your node faces the internet directly, it has to listen to those addresses directly.	2020-06-19 03:46:09 -04:00
Henry de Valence	6cc1627a5d	zebrad: apply serde(default) to config sections Each subsection has to have `serde(default)` to get the behaviour we want (delete all fields except the ones that have been changed); otherwise, we can delete only entire sections.	2020-06-18 17:43:36 -04:00
Jane Lusby	df18ac72c5	fix sharedpeererror to propagate tracing context	2020-06-17 14:38:26 -07:00
Jane Lusby	685bdaf2df	don't require absence of cancel handles Prior to this change, we required that services that are canceled do not have a cancel handle in the `cancel_handles` list, based on the assumption that the handle must have been removed in the process of canceling this service. This doesn't hold up though, because it is currently possible for us to have the same peer connect to us multiple times: the second connect removes the cancel handle of the original connect and inserts its own cancel handle in its place. In this scenario, when the first service is polled for readiness it will see that it has been canceled and go to clean itself up, but when it asserts that it doesn't have a cancel handle it will see the cancel handle of the second connect event, which uses the same key as the first connect, and fail its debug assertion. This change removes that debug assert on the assumption that it is okay for a peer to connect multiple times consecutively, and that the correct behavior in that case is to just cancel the first connection and continue as normal.	2020-06-16 13:42:31 -07:00
Jane Lusby	4b9e4520ce	cleanup API for arc based error type (#469) Co-authored-by: Jane Lusby <jane@zfnd.org>	2020-06-12 11:29:42 -07:00
George Tankersley	d8b3db5679	Use new seeder address for yolo.money	2020-06-10 21:49:25 -04:00
George Tankersley	6606bcaa62	Update list of DNS seeders This adds the Foundation's new seeders and removes Simon's defunct one.	2020-06-10 20:56:31 -04:00
Jane Lusby	431f194c0f	propagate errors out of zebra_network::init (#435) Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service. This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.	2020-06-09 12:24:28 -07:00
Jane Lusby	9f802cd8dd	Wrap Transaction in Arc	2020-06-06 18:13:17 -04:00
Jane Lusby	9bcda0f9c7	Wrap Blocks in Arc throughout codebase	2020-06-05 00:36:55 -04:00
dependabot-preview[bot]	7a75522885	Bump indexmap from 1.3.2 to 1.4.0 Bumps [indexmap](https://github.com/bluss/indexmap) from 1.3.2 to 1.4.0. - [Release notes](https://github.com/bluss/indexmap/releases) - [Commits](https://github.com/bluss/indexmap/compare/1.3.2...1.4.0) Signed-off-by: dependabot-preview[bot] <support@dependabot.com>	2020-06-01 15:38:00 -04:00
dependabot-preview[bot]	145d9a1835	Bump proptest from 0.9.6 to 0.10.0 Bumps [proptest](https://github.com/altsysrq/proptest) from 0.9.6 to 0.10.0. - [Release notes](https://github.com/altsysrq/proptest/releases) - [Changelog](https://github.com/AltSysrq/proptest/blob/master/CHANGELOG.md) - [Commits](https://github.com/altsysrq/proptest/commits) Signed-off-by: dependabot-preview[bot] <support@dependabot.com>	2020-05-29 15:06:40 -04:00
dependabot-preview[bot]	e317b68b1d	Bump proptest-derive from 0.1.2 to 0.2.0 Bumps [proptest-derive](https://github.com/AltSysrq/proptest) from 0.1.2 to 0.2.0. - [Release notes](https://github.com/AltSysrq/proptest/releases) - [Changelog](https://github.com/AltSysrq/proptest/blob/master/CHANGELOG.md) - [Commits](https://github.com/AltSysrq/proptest/compare/proptest-derive-0.1.2...proptest-derive-0.2.0) Signed-off-by: dependabot-preview[bot] <support@dependabot.com>	2020-05-28 23:00:29 -04:00
Jane Lusby	4a2d2a359c	add cargo fmt to ci (#403) * add cargo fmt to ci * rebase on main * switch to stable Co-authored-by: Jane Lusby <jane@zfnd.org>	2020-05-27 19:12:25 -07:00
Jane Lusby	8c178c3ee4	fix panic in seed subcommand (#401 ) Co-authored-by: Jane Lusby <jane@zfnd.org> Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs. First, zebrad was not configured to treat panics as non recoverable and instead defaulted to the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application, we do not need to maximize uptime or minimize the extent of an outage should one of our tasks / services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections. The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.	2020-05-27 17:40:12 -07:00
Jane Lusby	8276bed400	reinstate reject error variant	2020-05-27 15:42:29 -04:00
Jane Lusby	4dc307f2f3	fix last warnings	2020-05-27 15:42:29 -04:00
Jane Lusby	b6b35364f3	cleanup warnings throughout codebase	2020-05-27 15:42:29 -04:00
George Tankersley	df79fa75e0	Implement minimal version handshaking (#295) Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com> Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>	2020-04-13 18:33:15 -04:00
Deirdre Connolly	a5f4db7528	Move just the Network enum to -chain, keep everything else in -network	2020-03-12 22:02:17 -04:00
Deirdre Connolly	380d622b37	Fix imports	2020-03-12 22:02:17 -04:00
Deirdre Connolly	b68e1e2d55	Move Network, Magic, and magics to zebra-chain	2020-03-12 22:02:17 -04:00
Deirdre Connolly	8c0b00109f	Remove PeerError::DeadServer, unused, unneeded Resolves #251	2020-03-12 16:23:08 -04:00
Henry de Valence	ff3efd504c	Add Zebra logo to all workspace crates. Also add html_root_url attributes.	2020-02-26 21:25:35 -08:00
Henry de Valence	3ed75cb626	Tweak peer set metrics. - Add a total peers metric to prevent races between measurements of ready/unready peers (which can cause the sum to be wrong). - Add an outbound request counter.	2020-02-21 06:48:25 -05:00
Henry de Valence	94fe2c3b57	Increase the peerset request buffer size. tower-buffer uses tokio's mpsc channels, not the futures-rs mpsc channels. Unlike futures-rs mpsc channels, which have capacity n+m, where n is the buffer size and m is the number of senders, tokio channels always have buffer size n. This means that the buffer size is shared across all peer set handles. Thanks to @hawkw for sharing details of the Tokio internals!	2020-02-21 06:48:25 -05:00
Henry de Valence	5f07a25b05	Shorten the default new_peer_interval to 60s. This increases the frequency at which we crawl the network.	2020-02-21 06:48:25 -05:00
Henry de Valence	80e7ee6dae	Add metrics for inbound and outbound messages.	2020-02-21 06:48:25 -05:00
Henry de Valence	8c938af579	Spawn tasks for handshake futures. Previously, we relied on the owner of the handshake future to drive it to completion. This meant that there were cases where handshakes might never be completed, just because nothing was actively polling them.	2020-02-21 06:48:25 -05:00
Henry de Valence	43b2d35dda	Crawl for more peers when we exhaust candidates.	2020-02-21 06:48:25 -05:00
Henry de Valence	afa2c2347f	fmt	2020-02-21 06:48:25 -05:00
Henry de Valence	00edcae0c2	Add metrics for the crawler and candidate set.	2020-02-14 20:14:05 -05:00
Henry de Valence	75d3d44fb3	Metrics MVP: add two metrics and export them to Prometheus. Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>	2020-02-14 20:14:05 -05:00
Henry de Valence	8000f888fd	Connect to multiple peers concurrently. The previous outbound peer connection logic got requests to connect to new peers and processed them one at a time, making single connection attempts and retrying if the connection attempt failed. This was quite slow, because many connections fail, and we have to wait for timeouts. Instead, this logic connects to new peers concurrently (up to 50 at a time).	2020-02-14 18:23:41 -05:00
Henry de Valence	7049f9d891	Add a FindBlocks request to get initial block hashes. Bitcoin does this either with `getblocks` (returns up to 500 following block hashes) or `getheaders` (returns up to 2000 following block headers, not just hashes). However, Bitcoin headers are much smaller than Zcash headers, which contain a giant Equihash solution block, and many Zcash blocks don't have many transactions in them, so the block header is often similarly sized to the block itself. Because we're aiming to have a highly parallel network layer, it seems better to use `getblocks` to implement `FindBlocks` (which is necessarily sequential) and parallelize the processing of the block downloads.	2020-02-14 18:23:41 -05:00
Henry de Valence	47cafc630f	Remove version fields from GetBlocks, GetHeaders. These are instead set by the negotiated version.	2020-02-14 18:23:41 -05:00
Henry de Valence	abcc0a6773	Add basic retry policies to zebra-network. This should be removed when https://github.com/tower-rs/tower/pull/414 lands but is good enough for our purposes for now.	2020-02-11 15:23:19 -05:00
Henry de Valence	befdb46dc3	Clean some warnings in the Bitcoin codec. This doesn't clean the warnings about unused items in the builder, since those are unused for a reason (the implementation that should use them is missing).	2020-02-10 09:03:56 -08:00
Henry de Valence	2082672b3c	Remove Response::Error. Error handling is already handled by Result; we don't need an "inner" error variant duplicating the outer one.	2020-02-10 09:03:56 -08:00
Henry de Valence	29f901add3	Rename Response::Ok to Response::Nil. This is a better name because it signals "no data in response" rather than "Ok", which is semantically mixed with `Ok/Err` of `Result`.	2020-02-10 09:03:56 -08:00
Henry de Valence	5929e05e52	Remove `PushPeers` and ignore unsolicited `addr` messages. PushPeers is more complicated to thread into the rest of our architecture (we would need to establish a data path connecting our service handling inbound requests to the network layer's auto-crawler), and since we crawl the network automatically anyways, we don't actually need to accept them in order to get updated address information. The only possible problem with this approach is that zcashd refuses to answer multiple address requests from the same connection, ostensibly for fingerprinting prevention (although it's totally happy to give exactly the same information, as long as you hang up and reconnect first, lol). It's unclear how this will interact with our design -- on the one hand, it could mean that we don't get new addr information when we ask, but on the other hand, we may have enough churn in our connection pool that this isn't a problem anyways.	2020-02-10 09:03:56 -08:00
Henry de Valence	2c0f48b587	Refactor connection logic and try a block request. Attempting to implement requests for block data revealed a problem with the previous connection logic. Block data is requested by sending a `getdata` message with hashes of the requested blocks; the peer responds with a sequence of `block` messages with the blocks themselves. However, this wasn't possible to handle with the previous connection logic, which could only convert a single Bitcoin message into a Response. Instead, we factor out the message handling logic into a Handler, which can statefully accumulate arbitrary data into a Response and signal completion. This is still pretty ugly but it does work. As a side effect, the HeartbeatNonceMismatch error is removed; because the Handler now tries to process messages until it comes to a Response, it just ignores mismatched nonces (and will eventually time out). The previous Mempool and Transaction requests were removed but could be re-added in a different form later. Also, the `Get` prefixes are removed from `Request` to tidy the name.	2020-02-10 09:03:56 -08:00
Henry de Valence	972d16518f	Make ZcashSerialize infallible mod its Writer. Closes #158. As discussed on the issue, this makes it possible to safely serialize data into hashes, and encourages serializable data to make illegal states unrepresentable.	2020-02-05 19:48:43 -05:00
Henry de Valence	b0f61c4dd2	Remove outdated comment (we use tokio codecs now)	2020-02-05 19:44:35 -05:00

755 Commits