zebra/zebra-network/src/policies.rs

use std::pin::Pin;

use futures::{Future, FutureExt};
use tower::retry::Policy;

/// A very basic retry policy with a limited number of retry attempts.
///
/// TODO: Remove this when <https://github.com/tower-rs/tower/pull/414> lands.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct RetryLimit {
    /// The number of retries left before this policy stops retrying.
    remaining_tries: usize,
}

impl RetryLimit {
    /// Create a policy with the given number of retry attempts.
    pub fn new(retry_attempts: usize) -> Self {
        RetryLimit {
            remaining_tries: retry_attempts,
        }
    }
}

impl<Req: Clone + std::fmt::Debug, Res, E: std::fmt::Debug> Policy<Req, Res, E> for RetryLimit {
    type Future = Pin<Box<dyn Future<Output = Self> + Send + Sync + 'static>>;

    fn retry(&self, req: &Req, result: Result<&Res, &E>) -> Option<Self::Future> {
        if let Err(e) = result {
            if self.remaining_tries > 0 {
                tracing::debug!(?req, ?e, remaining_tries = self.remaining_tries, "retrying");

                let remaining_tries = self.remaining_tries - 1;
                let retry_outcome = RetryLimit { remaining_tries };

                Some(
                    // Let other tasks run, so we're more likely to choose a different peer,
                    // and so that any notfound inv entries win the race to the PeerSet.
                    //
                    // # Security
                    //
                    // We want to choose different peers for retries, so we have a better chance of getting each block.
                    // This is implemented by the connection state machine sending synthetic `notfound`s to the
                    // `InventoryRegistry`, as well as forwarding actual `notfound`s from peers.
                    Box::pin(tokio::task::yield_now().map(move |()| retry_outcome)),
                )
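            } else {
                // The retry budget is exhausted, so the error is returned to the caller.
                None
            }
        } else {
            // Successful responses are never retried.
            None
        }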
    }

    fn clone_request(&self, req: &Req) -> Option<Req> {
        Some(req.clone())
    }
}
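
The retry budget above can be modelled without `tower` or an async runtime. The following is a minimal, std-only sketch, not part of the original file: it assumes the caller behaves like `tower::retry::Retry`, making one initial attempt and then one further attempt per remaining try while the operation keeps failing. The name `attempts_with_budget` and its parameters are hypothetical and exist only for this illustration; the yield to other tasks between retries is deliberately left out.

```rust
/// Hypothetical synchronous model of `RetryLimit`'s budget.
///
/// Returns the number of attempts made and the final result.
fn attempts_with_budget<T, E>(
    retry_attempts: usize,
    mut operation: impl FnMut() -> Result<T, E>,
) -> (usize, Result<T, E>) {
    let mut remaining_tries = retry_attempts;
    let mut attempts = 0;

    loop {
        attempts += 1;

        match operation() {
            // Successes are never retried, mirroring the `Ok` case in `retry`.
            Ok(value) => return (attempts, Ok(value)),
            // Errors are retried only while the budget is non-zero,
            // and each retry uses up one remaining try.
            Err(_) if remaining_tries > 0 => remaining_tries -= 1,
            // The budget is exhausted, so the last error is returned.
            Err(error) => return (attempts, Err(error)),
        }
    }
}
```

Under these assumptions, a budget of `n` allows at most `n + 1` attempts in total: `attempts_with_budget(2, ...)` with an operation that always fails makes three attempts, while an operation that succeeds immediately makes one.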