zebra

Commit Graph

Author	SHA1	Message	Date
teor	ac4ed57751	Cancel heartbeats that are waiting for a peer, rather than hanging Zebra (#3325 ) * If the crawler is delayed, delay future crawl intervals by the same amount * Cancel heartbeats that are waiting for network requests or responses	2022-01-12 19:15:07 +00:00
teor	7e63182cdc	Stop ignoring some peers when updating the address book (#3292 ) * Make sure MetaAddrChanges are correctly applied to an empty address book * Add all initial peers to the address book	2022-01-05 18:12:59 -05:00
teor	469fa6b917	Fix some address crawler timing issues (#3293) * Stop holding completed messages until the next inbound message * Add more info to network message block download debug logs * Simplify address metrics logs * Try handling inbound messages as responses, then try as a new request * Improve address book logging * Fix a race between the first heartbeat and getaddr requests * Temporarily reduce the getaddr fanout to 1 * Update metrics when exiting the Connection run loop * Downgrade some debug logs to trace	2022-01-04 18:43:30 -05:00
teor	d0e6de8040	Avoid deadlocks in the address book mutex (#3244 ) * Tweak crawler timings so peers are more likely to be available * Tweak min peer connection interval so we try all peers * Let other tasks run between fanouts, so we're more likely to choose different peers * Let other tasks run between retries, so we're more likely to choose different peers * Let other tasks run after peer crawler DemandDrop This makes it more likely that peers will become ready. * Spawn the address book updater on a blocking thread * Spawn CandidateSet address book operations on blocking threads * Replace the PeerSet address book with a metrics watch channel * Fix comment * Await spawned address book tasks * Run the address book update tasks concurrently (except for the mutex) * Explain an internal-only method better * Fix a typo Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com> Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>	2021-12-20 00:44:43 +00:00
teor	6cbd7dce43	Fix task handling bugs, so peers are more likely to be available (#3191 ) * Tweak crawler timings so peers are more likely to be available * Tweak min peer connection interval so we try all peers * Let other tasks run between fanouts, so we're more likely to choose different peers * Let other tasks run between retries, so we're more likely to choose different peers * Let other tasks run after peer crawler DemandDrop This makes it more likely that peers will become ready.	2021-12-20 09:02:31 +10:00
teor	ba42d59f12	Stop ignoring panics in inbound handshakes (#3192 )	2021-12-10 18:32:42 +00:00
Alfredo Garcia	f750535961	Spawn initial handshakes in separated task (#3189 ) * spawn connector * expand comment Co-authored-by: teor <teor@riseup.net> * fix error handling Co-authored-by: teor <teor@riseup.net>	2021-12-10 01:18:43 +00:00
teor	37808eaadb	Security: When there are no new peers, stop crawler using CPU and writing logs (#3177 ) * Stop useless crawler attempts when there are no peers and no crawl responses * Disable GitHub bug report URLs when the disk is full * Add help text for the `zebrad start` tracing filter option	2021-12-10 00:19:52 +00:00
Janito Vaqueiro Ferreira Filho	0ad89f2f41	Disconnect from outdated peers on network upgrade (#3108) * Replace usage of `discover::Change` with a tuple Remove the assumption that a `Remove` variant would never be created with type changes that allow the compiler to guarantee that assumption. * Add a `version` field to the `Client` type Keep track of the peer's reported protocol version. * Create `LoadTrackedClient` type A `peer::Client` type wrapper that implements `Load`. This helps with the creation of a client service that has extra peer information to be accessed without having to send requests. * Use `LoadTrackedClient` in `initialize` Ensure that `PeerSet` receives `LoadTrackedClient`s so that it will be able to query the peer's protocol version later on. * Require `LoadTrackedClient` in `PeerSet` Replace the generic type with a concrete `LoadTrackedClient` so that we can query its version. * Create `MinimumPeerVersion` helper type A type to track the current minimum protocol version for connected peers based on the current block height. * Use `MinimumPeerVersion` in handshakes Keep the code to obtain the current minimum peer protocol version in a central place. * Add a `MinimumPeerVersion` instance to `PeerSet` Prepare it to be able to disconnect from outdated peers based on the current minimum supported peer protocol version. * Disconnect from ready services for outdated peers When the minimum peer protocol version is detected to have changed (because of a network upgrade), remove all ready services of peers that became outdated. * Cancel added unready services of outdated peers Only add an unready service if it's for a peer that has a supported protocol version. Otherwise, add it but drop the cancel handle so that the `UnreadyService` can execute and detect that it was cancelled. * Avoid adding ready services for outdated peers If a service becomes ready but it's for a connection to an outdated peer, drop it. * Improve comment inside `crawl_and_dial` Describe an edge case that is also handled but was not explicit. Co-authored-by: teor <teor@riseup.net> * Test if calculated minimum peer version is correct Given an arbitrary best chain tip height, check that the calculated minimum peer protocol version is the expected value. * Test if minimum version changes with chain tip Apply an arbitrary list of chain tip height updates and check that for each update the minimum peer version is calculated correctly. * Test minimum peer version changed reports Simulate a series of best chain tip height updates, and check for minimum peer version updates at least once between them. Changes should only be reported once. * Create a `MockedClientHandle` helper type Used to create and then track a mock `Client` instance. * Add `MinimumPeerVersion::with_mock_chain_tip` An extension method useful for tests, that contains some shared boilerplate code. * Bias arbitrary `Version`s to be in valid range Give a 50% chance for an arbitrary `Version` to be in the range of previously used values on the Zcash network. * Create a `PeerVersions` helper type Helps with the creation of mocked client services with arbitrary protocol versions. * Create a `PeerSetGuard` helper type An auxiliary type to a `PeerSet` instance created for testing. It keeps track of any dummy endpoints of channels created and passed to the `PeerSet` instance. * Create a `PeerSetBuilder` helper type Helps to reduce the code when preparing a `PeerSet` test instance. * Test if outdated peers are rejected by `PeerSet` Simulate a set of discovered peers being sent to the `PeerSet`. Ensure that only up-to-date peers are kept by the `PeerSet` and that outdated peers are dropped. * Create `BlockHeightPairAcrossNetworkUpgrades` type A helper type that allows the creation of arbitrary block height pairs, where one value is before and the other is at or after the activation height of an arbitrary network upgrade. * Test if peers are dropped as they become outdated Simulate a network upgrade, and check that peers that become outdated are dropped by the `PeerSet`. * Remove dbg! macros Co-authored-by: teor <teor@riseup.net>	2021-12-09 02:54:29 +00:00
teor	c4118dcc2c	Check for panics in the address book updater task (#3064 ) * Check for panics in the address book updater task * Fix the return type and tests Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>	2021-11-18 12:34:51 +00:00
teor	7d8240fac3	Fix verbose add_initial_peers logs (#3019 ) And update some function docs.	2021-11-07 22:21:51 +00:00
Marek	d03161c63f	Add unused seed peers to the AddressBook (#2974 ) * Add unused seed peers to the AddressBook * Document a new `await` We added an extra await on the AddressBook thread mutex. Co-authored-by: teor <teor@riseup.net> * Fix a typo * Refactor names * Return early from `limit_initial_peers` * Add `proptest`s regressions * Return `MetaAddr` instead of `None` * Test if `zebra_network::init()` deadlocks * Remove unneeded regressions * Rename `TimestampCollector` to `AddressBookUpdater` (#2992) * Rename `TimestampCollector` to `AddressBookUpdater` * Update comments Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com> Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com> * Move `all_peers` instead of copying them Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com> * Make `Duration` a const Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com> * Use a timeout instead of measuring the elapsed time Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com> * Copy `initial_peers` instead of moving them * Refactor the position of `NewInitial` and `new_initial` Co-authored-by: teor <teor@riseup.net> Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>	2021-11-04 08:34:00 -03:00
Janito Vaqueiro Ferreira Filho	0960e4fb0b	Update to Tokio 1.13.0 (#2994) * Update `tower` to version `0.4.9` Update to latest version to add support for Tokio version 1. * Replace usage of `ServiceExt::ready_and` It was deprecated in favor of `ServiceExt::ready`. * Update Tokio dependency to version `1.13.0` This will break the build because the code isn't ready for the update, but future commits will fix the issues. * Replace import of `tokio::stream::StreamExt` Use `futures::stream::StreamExt` instead, because newer versions of Tokio don't have the `stream` feature. * Use `IntervalStream` in `zebra-network` In newer versions of Tokio `Interval` doesn't implement `Stream`, so the wrapper types from `tokio-stream` have to be used instead. * Use `IntervalStream` in `inventory_registry` In newer versions of Tokio the `Interval` type doesn't implement `Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used instead. * Use `BroadcastStream` in `inventory_registry` In newer versions of Tokio `broadcast::Receiver` doesn't implement `Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used instead. This also requires changing the error type that is used. * Handle `Semaphore::acquire` error in `tower-batch` Newer versions of Tokio can return an error if the semaphore is closed. This shouldn't happen in `tower-batch` because the semaphore is never closed. * Handle `Semaphore::acquire` error in `zebrad` test On newer versions of Tokio `Semaphore::acquire` can return an error if the semaphore is closed. This shouldn't happen in the test because the semaphore is never closed. * Update some `zebra-network` dependencies Use versions compatible with Tokio version 1. * Upgrade Hyper to version 0.14 Use a version that supports Tokio version 1. * Update `metrics` dependency to version 0.17 And also update the `metrics-exporter-prometheus` to version 0.6.1. These updates are to make sure Tokio 1 is supported. * Use `f64` as the histogram data type `u64` isn't supported as the histogram data type in newer versions of `metrics`. * Update the initialization of the metrics component Make it compatible with the new version of `metrics`. * Simplify build version counter Remove all constants and use the new `metrics::increment_counter!` macro. * Change metrics output line to match on The snapshot string isn't included in the newer version of `metrics-exporter-prometheus`. * Update `sentry` to version 0.23.0 Use a version compatible with Tokio version 1. * Remove usage of `TracingIntegration` This seems to not be available from `sentry-tracing` anymore, so it needs to be replaced. * Add sentry layer to tracing initialization This seems like the replacement for `TracingIntegration`. * Remove unnecessary conversion Suggested by a Clippy lint. * Update Cargo lock file Apply all of the updates to dependencies. * Ban duplicate tokio dependencies Also ban git sources for tokio dependencies. * Stop allowing sentry-tracing git repository in `deny.toml` * Allow remaining duplicates after the tokio upgrade * Use C: drive for CI build output on Windows GitHub Actions uses a Windows image with two disk drives, and the default D: drive is smaller than the C: drive. Zebra currently uses a lot of space to build, so it has to use the C: drive to avoid CI build failures because of insufficient space. Co-authored-by: teor <teor@riseup.net>	2021-11-02 18:46:57 +00:00
Alfredo Garcia	07610feef3	Reduce outgoing peers demand (#2969 ) * reduce demand * use `saturating_sub`	2021-10-29 16:29:52 +00:00
teor	f26a60b801	Limit the number of inbound peer connections (#2961 ) * Limit open inbound connections based on the config * Log inbound connection errors at debug level * Test inbound connection limits * Use clone directly in function call argument lists * Remove an outdated comment * Update tests to use an unbounded channel rather than mem::forget And rename some variables. * Use a lower limit in a slow test and require that it is exceeded	2021-10-28 01:49:31 +00:00
Conrado Gouvea	8d01750459	Rate-limit initial seed peer connections (#2943 ) * Rate-limit initial seed peer connections * Revert "Rate-limit initial seed peer connections" This reverts commit `f779a1eb9e`. * Simplify logic * Avoid cooperative async task starvation in the peer crawler and listener If we don't yield in these loops, they can run for a long time before tokio forces them to yield. * Add test * Check for task panics in initial peers test * Remove duplicate code in rebase Co-authored-by: teor <teor@riseup.net>	2021-10-27 23:46:43 +00:00
teor	3e03d48799	Limit the number of outbound peer connections (#2944 ) * Limit the number of outbound connections in the crawler * Make zebra-network channel bounds depend on config.peerset_initial_target_size * Bias Zebra towards outbound connections And turn connection limits into `Config` methods. * Downgrade some connection logs to debug * Remove verbose or outdated fields in tracing logs * Clarify connection limits Includes: - `fastmod OUTBOUND_PEER_BIAS_FRACTION OUTBOUND_PEER_BIAS_DENOMINATOR zebra` - clarify connection limit documentation Clarify inventory channel capacity * Add zebra_network::initialize tests with limited numbers of peers * Avoid cooperative async task starvation in the peer crawler and listener If we don't yield in these loops, they can run for a long time before tokio forces them to yield. * Test the crawler with small connection limits And use the multi-threaded runtime to avoid long hangs. * Stop using the multi-threaded executor in tests where it's not needed * Avoid starvation for every connection Adds yields after inbound successes and initial peer connections. * Add a crawler peer connection success test * Add outbound connection limit tests * Improve outbound tests	2021-10-27 21:28:51 +00:00
teor	c2734f5661	Simplify calling `add_initial_peers` (#2945 ) Co-authored-by: Conrado Gouvea <conrado@zfnd.org>	2021-10-25 20:16:35 +00:00
teor	67327ac462	Downgrade some less interesting info-level logs to debug (#2938 ) There are a lot of these messages when Zebra starts up. They might be slowing down CI and causing timeouts. Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>	2021-10-22 02:11:09 +00:00
teor	424edfa4d9	Improve documentation and types in the PeerSet (#2925 ) * Replace some unit tuples with named unit structs This helps distinguish generic channels and make them type-safe. Also tidy imports and documentation in `peer_set::set`. * Link to the tower balance crate from docs Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com> Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>	2021-10-22 01:26:04 +00:00
Alfredo Garcia	ad5f5ff24a	Rate limit the amount of inbound connections (#2928 ) * add sleep to `accept_inbound_connections()` * Expand docs * Expand comments again Co-authored-by: teor <teor@riseup.net>	2021-10-22 00:35:34 +00:00
Alfredo Garcia	2de93bba8e	Limit the number of initial peers (#2913 ) * limit the number of initial peers * Move more code out of zebra_network::initialize * Always limit the number of initial peers in the Config This way, we can never get the unused peers out. * Revert "Always limit the number of initial peers in the Config" This reverts commit `81ede597c8`. Actually, this doesn't work, because we want those extra peers. * Minor tweaks Co-authored-by: Deirdre Connolly <deirdre@zfnd.org> Co-authored-by: teor <teor@riseup.net>	2021-10-21 23:04:46 +00:00
teor	4cdd12e2c4	Track the number of active inbound and outbound peer connections (#2912 ) * Count the number of active inbound and outbound peer connections And reduce the count when each connection fails. * Fix a comment typo Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com> Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>	2021-10-21 21:36:42 +00:00
teor	c8ad19080a	Improve logging for initial peer connections (#2896 ) Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>	2021-10-18 18:43:12 +00:00
Alfredo Garcia	4280ef5003	Give more information to the user in the wrong port init warning (#2853 ) * Update initialize.rs * grammar Co-authored-by: teor <teor@riseup.net> Co-authored-by: teor <teor@riseup.net>	2021-10-12 01:13:13 +00:00
teor	b6fe816473	Add a `ChainTipChange` type to `await` chain tip changes (#2715 ) * Rename ChainTipReceiver to CurrentChainTip `fastmod ChainTipReceiver CurrentChainTip zebra` Update chain tip documentation and variable names * Basic chain tip change implementation, without resets Also includes the following name changes: ``` fastmod CurrentChainTip LatestChainTip zebra* fastmod chain_tip_receiver latest_chain_tip zebra* ``` * Clarify the difference between `LatestChainTip` and `ChainTipChange`	2021-09-01 22:31:16 +00:00
teor	d2e14b22f9	Refactor BestTipHeight into a generic ChainTip sender and receiver (#2676 ) * Rename BestTipHeight so it can be generalised to ChainTipSender `fastmod BestTipHeight ChainTipSender zebra` For senders: `fastmod best_tip_height chain_tip_sender zebra` For receivers: `fastmod best_tip_height chain_tip_receiver zebra` Rename best_tip_height module to chain_tip * Wrap the chain tip watch channel in a ChainTipReceiver type * Create a ChainTip trait to avoid tricky crate dependencies And add convenience impls for optional and empty chain tips. * Use the ChainTip trait in zebra-network * Replace `Option<ChainTip>` with `NoChainTip` Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com> Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>	2021-08-27 11:34:33 +10:00
Janito Vaqueiro Ferreira Filho	4c4dbfe7cd	Reject connections from outdated peers (#2519) * Simplify state service initialization in test Use the test helper function to remove redundant code. * Create `BestTipHeight` helper type This type abstracts away the calculation of the best tip height based on the finalized block height and the best non-finalized chain's tip. * Add `best_tip_height` field to `StateService` The receiver endpoint is currently ignored. * Return receiver endpoint from service constructor Make it available so that the best tip height can be watched. * Update finalized height after finalizing blocks After blocks from the queue are finalized and committed to disk, update the finalized block height. * Update best non-finalized height after validation Update the value of the best non-finalized chain tip block height after a new block is committed to the non-finalized state. * Update finalized height after loading from disk When `FinalizedState` is first created, it loads the state from persistent storage, and the finalized tip height is updated. Therefore, the `best_tip_height` must be notified of the initial value. * Update the finalized height on checkpoint commit When a checkpointed block is committed, it bypasses the non-finalized state, so there's an extra place where the finalized height has to be updated. * Add `best_tip_height` to `Handshake` service It can be configured using the `Builder::with_best_tip_height`. It's currently not used, but it will be used to determine if a connection to a remote peer should be rejected or not based on that peer's protocol version. * Require best tip height to init. `zebra_network` Without it the handshake service can't properly enforce the minimum network protocol version from peers. Zebrad obtains the best tip height endpoint from `zebra_state`, and the test vectors simply use a dummy endpoint that's fixed at the genesis height. * Pass `best_tip_height` to proto. ver. negotiation The protocol version negotiation code will reject connections to peers if they are using an old protocol version. An old version is determined based on the current known best chain tip height. * Handle an optional height in `Version` Fall back to the genesis height if `None` is specified. * Reject connections to peers on old proto. versions Avoid connecting to peers that are on protocol versions that don't recognize a network update. * Document why peers on old versions are rejected Describe why it's a security issue above the check. * Test if `BestTipHeight` starts with `None` Check if initially there is no best tip height. * Test if best tip height is max. of latest values After applying a list of random updates where each one either sets the finalized height or the non-finalized height, check that the best tip height is the maximum of the most recently set finalized height and the most recently set non-finalized height. * Add `queue_and_commit_finalized` method A small refactor to make testing easier. The handling of requests for committing non-finalized and finalized blocks is now more consistent. * Add `assert_block_can_be_validated` helper Refactor to move into a separate method some assertions that are done before a block is validated. This is to allow moving these assertions more easily to simplify testing. * Remove redundant PoW block assertion It's also checked in `zebra_state::service::check::block_is_contextually_valid`, and it was getting in the way of tests that received a gossiped block before finalizing enough blocks. * Create a test strategy for test vector chain Splits a chain loaded from the test vectors in two parts, containing the blocks to finalize and the blocks to keep in the non-finalized state. * Test committing blocks update best tip height Create a mock blockchain state, with a chain of finalized blocks and a chain of non-finalized blocks. Commit all the blocks appropriately, and verify that the best tip height is updated. Co-authored-by: teor <teor@riseup.net>	2021-08-08 23:52:52 +00:00
teor	bcd5f2c50d	Gossip dynamic local listener ports to peers (#2277 ) * Gossip dynamically allocated listener ports to peers Previously, Zebra would either gossip port `0`, which is invalid, or skip gossiping its own dynamically allocated listener port. * Improve "no configured peers" warning And downgrade from error to warning, because inbound-only nodes are a valid use case. * Move random_known_port to zebra-test * Add tests for dynamic local listener ports and the AddressBook Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>	2021-06-23 07:59:06 +10:00
teor	1a57023eac	Security: Use canonical SocketAddrs to avoid duplicate peer connections, Feature: Send local listener to peers (#2276 ) * Always send our local listener with the latest time Previously, whenever there was an inbound request for peers, we would clone the address book and update it with the local listener. This had two impacts: - the listener could conflict with an existing entry, rather than unconditionally replacing it, and - the listener was briefly included in the address book metrics. As a side-effect, this change also makes sanitization slightly faster, because it avoids some useless peer filtering and sorting. * Skip listeners that are not valid for outbound connections * Filter sanitized addresses Zebra based on address state This fix correctly prevents Zebra gossiping client addresses to peers, but still keeps the client in the address book to avoid reconnections. * Add a full set of DateTime32 and Duration32 calculation methods * Refactor sanitize to use the new DateTime32/Duration32 methods * Security: Use canonical SocketAddrs to avoid duplicate connections If we allow multiple variants for each peer address, we can make multiple connections to that peer. Also make sure sanitized MetaAddrs are valid for outbound connections. * Test that address books contain the local listener address Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>	2021-06-22 02:16:59 +00:00
teor	92828bbb29	Reliability: send local listener address to peers When peers ask for peer addresses, add our local listener address to the set of addresses, sanitize, then truncate. Sanitize shuffles addresses, so if there are lots of addresses in the address book, our address will only be sent to some peers.	2021-05-18 14:02:19 +10:00
teor	458c26f1e3	Limit initial candidate set fanout to the number of initial peers If there is a small number of initial peers, and they are slow, the initial candidate set update can appear to hang. To avoid this issue, limit the initial candidate set fanout to the number of initial peers. Once the initial peers have sent us more peer addresses, there is no need to limit the fanouts for future updates. Reported by Niklas Long of Equilibrium.	2021-05-18 07:54:03 +10:00
teor	b0b8b2f61a	Add extra instrumentation for initialize and handshakes (#2122 ) * Instrument the crawl task When we created the crawl task, we forgot to instrument it with the global span. This fix makes sure that the git and network span appears on crawl logs. * Instrument the connector * Improve handshake instrumentation Make some spans debug, so there are not too many spans. * Add the address to initial peer connection errors	2021-05-17 16:49:16 -04:00
teor	a8a0d6450c	Security: stop gossiping temporary inbound remote addresses to peers - stop putting inbound addresses in the address book - drop address book entries that can't be used for outbound connections - distinguish between temporary inbound and permanent outbound peer addresses - also create variants to handle proxy connections (but don't use them yet) - avoid tracking connection state for isolated connections - document security constraints for the address book and peer set	2021-05-14 23:45:42 +10:00
Kirill Fomichev	afac2c2846	Use the default port for configured listen addresses with no port (#2043) * Allow using a listen address in the config without a port * update comments * remove unused alias * use Network::default_port * Move tests and use toml instead of json * change error message * Make match more readable Co-authored-by: teor <teor@riseup.net>	2021-04-21 23:14:29 +00:00
teor	0203d1475a	Refactor and document correctness for std::sync::Mutex<AddressBook>	2021-04-21 17:14:47 -04:00
teor	a417c7c8c7	Use meaningful names for select! variables	2021-04-13 23:56:16 -04:00
teor	fb95de99a6	Refactor the dial result into a From impl	2021-04-13 18:52:49 -04:00
teor	375c8d8700	Fix a deadlock between the crawler and dialer, and other hangs (#1950 ) * Stop ignoring inbound message errors and handshake timeouts To avoid hangs, Zebra needs to maintain the following invariants in the handshake and heartbeat code: - each handshake should run in a separate spawned task (not yet implemented) - every message, error, timeout, and shutdown must update the peer address state - every await that depends on the network must have a timeout Once the Connection is created, it should handle timeouts. But we need to handle timeouts during handshake setup. * Avoid hangs by adding a timeout to the candidate set update Also increase the fanout from 1 to 2, to increase address diversity. But only return permanent errors from `CandidateSet::update`, because the crawler task exits if `update` returns an error. Also log Peers response errors in the CandidateSet. * Use the select macro in the crawler to reduce hangs The `select` function is biased towards its first argument, risking starvation. As a side-benefit, this change also makes the code a lot easier to read and maintain. * Split CrawlerAction::Demand into separate actions This refactor makes the code a bit easier to read, at the cost of sometimes blocking the crawler on `candidates.next()`. That's ok, because `next` only has a short (< 100 ms) delay. And we're just about to spawn a separate task for each handshake. * Spawn a separate task for each handshake This change avoids deadlocks by letting each handshake make progress independently. * Move the dial task into a separate function This refactor improves readability. * Fix buggy future::select function usage And document the correctness of the new code.	2021-04-07 10:25:10 -03:00
teor	1a159dfcb6	Add more methods for creating MetaAddrs This refactor lets us remove `MetaAddr::update_last_seen()`.	2021-03-26 07:23:49 +10:00
teor	5a30268d7a	Log address metrics when the peer set has no ready peers	2021-03-17 10:47:04 +10:00
Jane Lusby	03aa6f671f	Implement outbound connection rate limiting - includes config rename with alias (#1855 ) * Implement outbound connection rate limiting * fix breaking change on config Co-authored-by: teor <teor@riseup.net>	2021-03-10 01:36:05 +00:00
teor	d4f2f27218	Add global span to spawned network tasks (#1761 ) Closes #1575	2021-02-20 08:36:50 +10:00
teor	e61b5e50a2	Diagnostics for CI port conflict failures (#1766 ) Log a "Trying..." message before each listener opens, to see if the delay is inside Zebra, or in the test harness or OS. Also report the configured and actual ports where possible, for better diagnostics.	2021-02-18 12:15:09 -03:00
teor	8d1c498234	Log initial peer connection failures And standardise another log message	2021-02-17 09:21:53 -05:00
teor	e85441c914	Add a correctness comment to justify the revert	2021-02-16 05:52:54 +10:00
teor	a02a00a3f5	Revert "Stop using CallAllUnordered in peer_set::add_initial_peers (#1705 )" This reverts commit `241c7ad849`.	2021-02-16 05:52:54 +10:00
Alfredo Garcia	241c7ad849	Stop using CallAllUnordered in peer_set::add_initial_peers (#1705 ) * use ServiceExt::oneshot and FuturesUnordered Co-authored-by: teor <teor@riseup.net>	2021-02-09 08:16:02 +10:00
Alfredo Garcia	221512c733	Async DNS seeder lookups (#1662 ) * replace to_socket_addrs * refactor `resolve()` into `resolve_host()` * use `resolve_host()` to resolve config peers * add DNS_LOOKUP_TIMEOUT constant * don't block the main thread in initialize	2021-02-03 12:20:26 +10:00
Alfredo Garcia	4b34482264	Add hints to port conflict and lock file panics (#1535) * add hint for port error * add issue filter for port panic * add lock file hint * add metrics endpoint port conflict hint * add hint for tracing endpoint port conflict * add acceptance test for resource conflicts * Split out common conflict test code into a function * Add state, metrics, and tracing conflict tests * Add a full set of stderr acceptance test functions This change makes the stdout and stderr acceptance test interfaces identical. * move Zcash listener opening * add todo about hint for disk full * add constant for lock file * match path in state cache * don't match windows cache path * Use Display for state path logs Avoids weird escaping on Windows when using Debug * Add Windows conflict error messages * Turn PORT_IN_USE_ERROR into a regex And add another alternative Windows-specific port error Co-authored-by: teor <teor@riseup.net> Co-authored-by: Jane Lusby <jane@zfnd.org>	2021-01-29 22:36:33 +10:00


83 Commits