- Feature Name: async_rust_constraints
- Start Date: 2021-03-30
- Design PR: [ZcashFoundation/zebra#1965](https://github.com/ZcashFoundation/zebra/pull/1965)
- Zebra Issue: [ZcashFoundation/zebra#1593](https://github.com/ZcashFoundation/zebra/issues/1593)

# Summary
[summary]: #summary

Zebra programmers need to carefully write async code so it doesn't deadlock or hang.
This is particularly important for `poll`, `select`, `Buffer`, `Batch`, and `Mutex`.

Zebra executes concurrent tasks using [async Rust](https://rust-lang.github.io/async-book/),
with the [tokio](https://docs.rs/tokio/) executor.

At a higher level, Zebra also uses [`tower::Service`s](https://docs.rs/tower/0.4.1/tower/trait.Service.html),
[`tower::Buffer`s](https://docs.rs/tower/0.4.1/tower/buffer/struct.Buffer.html),
and our own [`tower-batch`](https://github.com/ZcashFoundation/zebra/tree/main/tower-batch)
implementation.

# Motivation
[motivation]: #motivation

Like all concurrent codebases, Zebra needs to obey certain constraints to avoid
hangs. Unfortunately, Rust's tooling in these areas is still developing. So
Zebra developers need to manually check these constraints during design,
development, reviews, and testing.

# Definitions
[definitions]: #definitions

- `hang`: a Zebra component stops making progress.
- `constraint`: a rule that Zebra must follow to prevent `hang`s.
- `CORRECTNESS comment`: the documentation for a `constraint` in Zebra's code.
- `task`: an async task can execute code independently of other tasks, using
  cooperative multitasking.
- `contention`: slower execution because multiple tasks are waiting to
  acquire a lock, buffer/batch slot, or readiness.
- `missed wakeup`: a task `hang`s because it is never scheduled for wakeup.
- `lock`: exclusive access to a shared resource. Locks stop other code from
  running until they are released. For example, a mutex, buffer slot,
  or service readiness.
- `critical section`: code that is executed while holding a lock.
- `deadlock`: a `hang` that stops an async task executing code, because it
  is waiting for a lock, slot, or task readiness.
  For example: a task is waiting for a service to be ready, but the
  service readiness depends on that task making progress.
- `starvation` or `livelock`: a `hang` that executes code, but doesn't do
  anything useful. For example: a loop never terminates.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

If you are designing, developing, or testing concurrent Zebra code, follow the
patterns in these examples to avoid hangs.

If you are reviewing concurrent Zebra designs or code, make sure that:
- it is clear how the design or code avoids hangs
- the design or code follows the patterns in these examples (as much as possible)
- the concurrency constraints and risks are documented

The [Reference](#reference-level-explanation) section contains in-depth
background information about Rust async concurrency in Zebra.

Here are some examples of concurrent designs and documentation in Zebra:

## Registering Wakeups Before Returning Poll::Pending
[wakeups-poll-pending]: #wakeups-poll-pending

To avoid missed wakeups, futures must schedule a wakeup before they return
`Poll::Pending`. For more details, see the [`Poll::Pending` and Wakeups](#poll-pending-and-wakeups)
section.

Zebra's [unready_service.rs](https://github.com/ZcashFoundation/zebra/blob/de6d1c93f3e4f9f4fd849176bea6b39ffc5b260f/zebra-network/src/peer_set/unready_service.rs#L43) uses the `ready!` macro to correctly handle
`Poll::Pending` from the inner service.

You can see some similar constraints in
[pull request #1954](https://github.com/ZcashFoundation/zebra/pull/1954).

<!-- copied from commit de6d1c93f3e4f9f4fd849176bea6b39ffc5b260f on 2021-04-07 -->
```rust
// CORRECTNESS
//
// The current task must be scheduled for wakeup every time we return
// `Poll::Pending`.
//
// `ready!` returns `Poll::Pending` when the service is unready, and
// the inner `poll_ready` schedules this task for wakeup.
//
// `cancel.poll` also schedules this task for wakeup if it is canceled.
let res = ready!(this
    .service
    .as_mut()
    .expect("poll after ready")
    .poll_ready(cx));
```

## Futures-Aware Mutexes
[futures-aware-mutexes]: #futures-aware-mutexes

To avoid hangs or slowdowns, use futures-aware types. For more details, see the
[Futures-Aware Types](#futures-aware-types) section.

Zebra's [`Handshake`](https://github.com/ZcashFoundation/zebra/blob/a63c2e8c40fa847a86d00c754fb10a4729ba34e5/zebra-network/src/peer/handshake.rs#L204)
won't block other tasks on its thread, because it uses `futures::lock::Mutex`:

<!-- copied from commit a63c2e8c40fa847a86d00c754fb10a4729ba34e5 on 2021-04-30 -->
```rust
pub async fn negotiate_version(
    peer_conn: &mut Framed<TcpStream, Codec>,
    addr: &SocketAddr,
    config: Config,
    nonces: Arc<futures::lock::Mutex<HashSet<Nonce>>>,
    user_agent: String,
    our_services: PeerServices,
    relay: bool,
) -> Result<(Version, PeerServices), HandshakeError> {
    // Create a random nonce for this connection
    let local_nonce = Nonce::default();

    // # Correctness
    //
    // It is ok to wait for the lock here, because handshakes have a short
    // timeout, and the async mutex will be released when the task times
    // out.
    nonces.lock().await.insert(local_nonce);

    ...
}
```

Zebra's [`Inbound` service](https://github.com/ZcashFoundation/zebra/blob/0203d1475a95e90eb6fd7c4101caa26aeddece5b/zebrad/src/components/inbound.rs#L238)
can't use an async-aware mutex for its `AddressBook`, because the mutex is shared
with non-async code. It only holds the mutex to clone the address book, reducing
the amount of time that other tasks on its thread are blocked:

<!-- copied from commit 0203d1475a95e90eb6fd7c4101caa26aeddece5b on 2021-04-30 -->
```rust
// # Correctness
//
// Briefly hold the address book threaded mutex while
// cloning the address book. Then sanitize after releasing
// the lock.
let peers = address_book.lock().unwrap().clone();
let mut peers = peers.sanitized();
```

## Avoiding Deadlocks when Acquiring Buffer or Service Readiness
[readiness-deadlock-avoidance]: #readiness-deadlock-avoidance

To avoid deadlocks, readiness and locks must be acquired in a consistent order.
For more details, see the [Acquiring Buffer Slots, Mutexes, or Readiness](#acquiring-buffer-slots-mutexes-readiness)
section.

Zebra's [`ChainVerifier`](https://github.com/ZcashFoundation/zebra/blob/3af57ece7ae5d43cfbcb6a9215433705aad70b80/zebra-consensus/src/chain.rs#L73)
avoids deadlocks, contention, and errors by:
- calling `poll_ready` before each `call`
- acquiring buffer slots for the earlier verifier first (based on blockchain order)
- ensuring that buffers are large enough for concurrent tasks

<!-- This fix was in https://github.com/ZcashFoundation/zebra/pull/1735 , but it's
a partial revert, so the PR is a bit confusing. -->

<!-- edited from commit 306fa882148382299c8c31768d5360c0fa23c4d0 on 2021-04-07 -->
```rust
// We acquire checkpoint readiness before block readiness, to avoid an unlikely
// hang during the checkpoint to block verifier transition. If the checkpoint and
// block verifiers are contending for the same buffer/batch, we want the checkpoint
// verifier to win, so that checkpoint verification completes, and block verification
// can start. (Buffers and batches have multiple slots, so this contention is unlikely.)
//
// The chain verifier holds one slot in each verifier, for each concurrent task.
// Therefore, any shared buffers or batches polled by these verifiers should double
// their bounds. (For example, the state service buffer.)
ready!(self
    .checkpoint
    .poll_ready(cx)
    .map_err(VerifyChainError::Checkpoint))?;
ready!(self.block.poll_ready(cx).map_err(VerifyChainError::Block))?;
Poll::Ready(Ok(()))
```

## Critical Section Compiler Errors
[critical-section-compiler-errors]: #critical-section-compiler-errors

To avoid deadlocks or slowdowns, critical sections should be as short as
possible, and they should not depend on any other tasks.
For more details, see the [Acquiring Buffer Slots, Mutexes, or Readiness](#acquiring-buffer-slots-mutexes-readiness)
section.

Zebra's [`CandidateSet`](https://github.com/ZcashFoundation/zebra/blob/0203d1475a95e90eb6fd7c4101caa26aeddece5b/zebra-network/src/peer_set/candidate_set.rs#L257)
must release a `std::sync::Mutex` lock before awaiting a `tokio::time::Sleep`
future. This ensures that the threaded mutex lock isn't held over the `await`
point.

If the lock isn't dropped, compilation fails, because the mutex lock can't be
sent between threads.

<!-- edited from commit 0203d1475a95e90eb6fd7c4101caa26aeddece5b on 2021-04-30 -->
```rust
// # Correctness
//
// In this critical section, we hold the address mutex, blocking the
// current thread, and all async tasks scheduled on that thread.
//
// To avoid deadlocks, the critical section:
// - must not acquire any other locks
// - must not await any futures
//
// To avoid hangs, any computation in the critical section should
// be kept to a minimum.
let reconnect = {
    let mut guard = self.address_book.lock().unwrap();
    ...
    let reconnect = guard.reconnection_peers().next()?;

    let reconnect = MetaAddr::new_reconnect(&reconnect.addr, &reconnect.services);
    guard.update(reconnect);
    reconnect
};

// SECURITY: rate-limit new candidate connections
sleep.await;
```

## Sharing Progress between Multiple Futures
[progress-multiple-futures]: #progress-multiple-futures

To avoid starvation and deadlocks, tasks that depend on multiple futures
should make progress on all of those futures. This is particularly
important for tasks that depend on their own outputs. For more details,
see the [Unbiased Selection](#unbiased-selection) section.

Zebra's [peer crawler task](https://github.com/ZcashFoundation/zebra/blob/375c8d8700764534871f02d2d44f847526179dab/zebra-network/src/peer_set/initialize.rs#L326)
avoids starvation and deadlocks by:
- sharing progress between any ready futures using the `select!` macro
- spawning independent tasks to avoid hangs (see [Acquiring Buffer Slots, Mutexes, or Readiness](#acquiring-buffer-slots-mutexes-readiness))
- using timeouts to avoid hangs

You can see a range of hang fixes in [pull request #1950](https://github.com/ZcashFoundation/zebra/pull/1950).

<!-- edited from commit 375c8d8700764534871f02d2d44f847526179dab on 2021-04-08 -->
```rust
// CORRECTNESS
//
// To avoid hangs and starvation, the crawler must:
// - spawn a separate task for each handshake, so they can make progress
//   independently (and avoid deadlocking each other)
// - use the `select!` macro for all actions, because the `select` function
//   is biased towards the first ready future

loop {
    let crawler_action = tokio::select! {
        a = handshakes.next() => a,
        a = crawl_timer.next() => a,
        _ = demand_rx.next() => {
            if let Some(candidate) = candidates.next().await {
                // candidates.next has a short delay, and briefly holds the address
                // book lock, so it shouldn't hang
                DemandHandshake { candidate }
            } else {
                DemandCrawl
            }
        }
    };

    match crawler_action {
        DemandHandshake { candidate } => {
            // spawn each handshake into an independent task, so it can make
            // progress independently of the crawls
            let hs_join = tokio::spawn(dial(candidate, connector.clone()));
            handshakes.push(Box::pin(hs_join));
        }
        DemandCrawl => {
            // update has timeouts, and briefly holds the address book
            // lock, so it shouldn't hang
            candidates.update().await?;
        }
        // handle handshake responses and the crawl timer
    }
}
```

## Prioritising Cancellation Futures
[prioritising-cancellation-futures]: #prioritising-cancellation-futures

To avoid starvation, cancellation futures must take priority over other futures,
if multiple futures are ready. For more details, see the [Biased Selection](#biased-selection)
section.

Zebra's [connection.rs](https://github.com/ZcashFoundation/zebra/blob/375c8d8700764534871f02d2d44f847526179dab/zebra-network/src/peer/connection.rs#L423)
avoids hangs by prioritising the cancel and timer futures over the peer
receiver future. Under heavy load, the peer receiver future could always
be ready with a new message, starving the cancel or timer futures.

You can see a range of hang fixes in [pull request #1950](https://github.com/ZcashFoundation/zebra/pull/1950).

<!-- copied from commit 375c8d8700764534871f02d2d44f847526179dab on 2021-04-08 -->
```rust
// CORRECTNESS
//
// Currently, select prefers the first future if multiple
// futures are ready.
//
// If multiple futures are ready, we want the cancellation
// to take priority, then the timeout, then peer responses.
let cancel = future::select(tx.cancellation(), timer_ref);
match future::select(cancel, peer_rx.next()) {
    ...
}
```

## Atomic Shutdown Flag
[atomic-shutdown-flag]: #atomic-shutdown-flag

As of April 2021, Zebra implements some shutdown checks using an atomic `bool`.

Zebra's [shutdown.rs](https://github.com/ZcashFoundation/zebra/blob/24bf952e982bde28eb384b211659159d46150f63/zebra-chain/src/shutdown.rs)
avoids data races and missed updates by using the strongest memory
ordering ([`SeqCst`](https://doc.rust-lang.org/nomicon/atomics.html#sequentially-consistent)).

We plan to replace this raw atomic code with a channel, see [#1678](https://github.com/ZcashFoundation/zebra/issues/1678).

<!-- edited from commit 24bf952e982bde28eb384b211659159d46150f63 on 2021-04-22 -->
```rust
/// A flag to indicate if Zebra is shutting down.
///
/// Initialized to `false` at startup.
pub static IS_SHUTTING_DOWN: AtomicBool = AtomicBool::new(false);

/// Returns true if the application is shutting down.
pub fn is_shutting_down() -> bool {
    // ## Correctness:
    //
    // Since we're shutting down, and this is a one-time operation,
    // performance is not important. So we use the strongest memory
    // ordering.
    // https://doc.rust-lang.org/nomicon/atomics.html#sequentially-consistent
    IS_SHUTTING_DOWN.load(Ordering::SeqCst)
}

/// Sets the Zebra shutdown flag to `true`.
pub fn set_shutting_down() {
    IS_SHUTTING_DOWN.store(true, Ordering::SeqCst);
}
```

## Integration Testing Async Code
[integration-testing]: #integration-testing

Sometimes, it is difficult to unit test async code, because it has complex
dependencies. For more details, see the [Testing Async Code](#testing-async-code)
section.

[`zebrad`'s acceptance tests](https://github.com/ZcashFoundation/zebra/blob/5bf0a2954e9df3fad53ad57f6b3a673d9df47b9a/zebrad/tests/acceptance.rs#L699)
run short Zebra syncs on the Zcash mainnet or testnet. These acceptance tests
make sure that `zebrad` can:
- sync blocks using its async block download and verification pipeline
- cancel a sync
- reload disk state after a restart

These tests were introduced in [pull request #1193](https://github.com/ZcashFoundation/zebra/pull/1193).

<!-- edited from commit 5bf0a2954e9df3fad53ad57f6b3a673d9df47b9a on 2021-04-07 -->
```rust
/// Test if `zebrad` can sync some larger checkpoints on mainnet.
#[test]
fn sync_large_checkpoints_mainnet() -> Result<()> {
    let reuse_tempdir = sync_until(
        LARGE_CHECKPOINT_TEST_HEIGHT,
        Mainnet,
        STOP_AT_HEIGHT_REGEX,
        LARGE_CHECKPOINT_TIMEOUT,
        None,
    )?;

    // if stopping corrupts the rocksdb database, zebrad might hang or crash here
    // if stopping does not write the rocksdb database to disk, Zebra will
    // sync, rather than stopping immediately at the configured height
    sync_until(
        (LARGE_CHECKPOINT_TEST_HEIGHT - 1).unwrap(),
        Mainnet,
        "previous state height is greater than the stop height",
        STOP_ON_LOAD_TIMEOUT,
        Some(reuse_tempdir),
    )?;

    Ok(())
}
```

## Instrumenting Async Functions
[instrumenting-async-functions]: #instrumenting-async-functions

Sometimes, it is difficult to debug async code, because there are many tasks
running concurrently. For more details, see the [Monitoring Async Code](#monitoring-async-code)
section.

Zebra runs instrumentation on some of its async functions using `tracing`.
Here's an instrumentation example from Zebra's [sync block downloader](https://github.com/ZcashFoundation/zebra/blob/306fa882148382299c8c31768d5360c0fa23c4d0/zebrad/src/components/sync/downloads.rs#L128):

<!-- there is no original PR for this code, it has been changed a lot -->

<!-- copied from commit 306fa882148382299c8c31768d5360c0fa23c4d0 on 2021-04-08 -->
```rust
/// Queue a block for download and verification.
///
/// This method waits for the network to become ready, and returns an error
/// only if the network service fails. It returns immediately after queuing
/// the request.
#[instrument(level = "debug", skip(self), fields(%hash))]
pub async fn download_and_verify(&mut self, hash: block::Hash) -> Result<(), Report> {
    ...
}
```

## Tracing and Metrics in Async Functions
[tracing-metrics-async-functions]: #tracing-metrics-async-functions

Sometimes, it is difficult to monitor async code, because there are many tasks
running concurrently. For more details, see the [Monitoring Async Code](#monitoring-async-code)
section.

Zebra's [client requests](https://github.com/ZcashFoundation/zebra/blob/375c8d8700764534871f02d2d44f847526179dab/zebra-network/src/peer/connection.rs#L585)
are monitored via:
- trace and debug logs using the `tracing` crate
- related work spans using the `tracing` crate
- counters using the `metrics` crate

<!-- there is no original PR for this code, it has been changed a lot -->

<!-- copied from commit 375c8d8700764534871f02d2d44f847526179dab on 2021-04-08 -->
```rust
/// Handle an incoming client request, possibly generating outgoing messages to the
/// remote peer.
///
/// NOTE: the caller should use .instrument(msg.span) to instrument the function.
async fn handle_client_request(&mut self, req: InProgressClientRequest) {
    trace!(?req.request);

    let InProgressClientRequest { request, tx, span } = req;

    if tx.is_canceled() {
        metrics::counter!("peer.canceled", 1);
        tracing::debug!("ignoring canceled request");
        return;
    }
    ...
}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The reference section contains in-depth information about concurrency in Zebra:
- [`Poll::Pending` and Wakeups](#poll-pending-and-wakeups)
- [Futures-Aware Types](#futures-aware-types)
- [Acquiring Buffer Slots, Mutexes, or Readiness](#acquiring-buffer-slots-mutexes-readiness)
- [Buffer and Batch](#buffer-and-batch)
  - [Buffered Services](#buffered-services)
    - [Choosing Buffer Bounds](#choosing-buffer-bounds)
- [Awaiting Multiple Futures](#awaiting-multiple-futures)
  - [Unbiased Selection](#unbiased-selection)
  - [Biased Selection](#biased-selection)
- [Using Atomics](#using-atomics)
- [Testing Async Code](#testing-async-code)
- [Monitoring Async Code](#monitoring-async-code)

Most Zebra designs or code changes will only touch on one or two of these areas.

## `Poll::Pending` and Wakeups
[poll-pending-and-wakeups]: #poll-pending-and-wakeups

When returning `Poll::Pending`, `poll` functions must ensure that the task will be woken up when it is ready to make progress.

In most cases, the `poll` function calls another `poll` function that schedules the task for wakeup.

Any code that generates a new `Poll::Pending` should either have:
* a `CORRECTNESS` comment explaining how the task is scheduled for wakeup, or
* a wakeup implementation, with tests to ensure that the wakeup functions as expected.

Note: `poll` functions often have a qualifier, like `poll_ready` or `poll_next`.
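
As a minimal sketch of the second option, here is a hypothetical future that is completed by an external event source. The `EventFuture` type and its fields are illustrative, not Zebra code:

```rust
use std::{
    future::Future,
    pin::Pin,
    task::{Context, Poll, Waker},
};

/// A hypothetical future that is completed by an external event source.
struct EventFuture {
    /// Set to `Some` when the event fires.
    result: Option<u32>,
    /// The waker for the task that last polled this future.
    waker: Option<Waker>,
}

impl Future for EventFuture {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if let Some(result) = self.result.take() {
            return Poll::Ready(result);
        }

        // CORRECTNESS
        //
        // Store the current task's waker before returning `Poll::Pending`,
        // so the event source can wake this task via `waker.wake()`.
        // Without this step, the task could miss its wakeup and hang.
        self.waker = Some(cx.waker().clone());
        Poll::Pending
    }
}
```

In real code, `result` and `waker` would be shared with the event source, for example behind a lock.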

## Futures-Aware Types
[futures-aware-types]: #futures-aware-types

Use futures-aware types, rather than types which will block the current thread.

For example:
- Use `futures::lock::Mutex` rather than `std::sync::Mutex`
- Use `tokio::time::{sleep, timeout}` rather than `std::thread::sleep`

Always qualify ambiguous names like `Mutex` and `sleep`, so that it is obvious
when a call will block.
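
For example, here is a sketch of the difference at the call site, with hypothetical function names:

```rust
use std::time::Duration;

/// Futures-aware: yields to the executor, so other async tasks on
/// this thread keep making progress during the wait.
async fn wait_before_retry() {
    tokio::time::sleep(Duration::from_secs(1)).await;
}

/// Blocking: stops the current thread, and every async task
/// scheduled on it, for the whole second.
fn block_before_retry() {
    std::thread::sleep(Duration::from_secs(1));
}
```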

If you are unable to use futures-aware types:
- block the thread for as short a time as possible
- document the correctness of each blocking call
- consider re-designing the code to use `tower::Services`, or other futures-aware types

## Acquiring Buffer Slots, Mutexes, or Readiness
[acquiring-buffer-slots-mutexes-readiness]: #acquiring-buffer-slots-mutexes-readiness

Ideally, buffer slots, mutexes, or readiness should be:
- acquired with one lock per critical section, and
- held for as short a time as possible.

If multiple locks are required for a critical section, acquire them in the same
order any time those locks are used. If tasks acquire multiple locks in different
orders, they can deadlock, each holding a lock that the other needs.
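
A sketch of consistent lock ordering, using hypothetical shared state:

```rust
use futures::lock::Mutex;
use std::sync::Arc;

/// Hypothetical shared state: every task acquires the locks in the
/// order `config`, then `peers`.
async fn add_peer(config: Arc<Mutex<usize>>, peers: Arc<Mutex<Vec<String>>>) {
    // CORRECTNESS
    //
    // Acquire `config` before `peers`. If another task acquired `peers`
    // first, then waited for `config`, the two tasks could deadlock.
    let max_peers = *config.lock().await;
    let mut peers = peers.lock().await;

    if peers.len() < max_peers {
        peers.push("192.0.2.1:8233".to_string());
    }
}
```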

If a buffer, mutex, future, or service has complex readiness dependencies,
schedule those dependencies as separate tasks using `tokio::spawn`. Otherwise,
it might deadlock due to a dependency loop within a single executor task.

In all of these cases:
- make critical sections as short as possible, and
- do not depend on other tasks or locks inside the critical section.

### Acquiring Service Readiness

Note: do not call `poll_ready` on multiple services, then match against the
results. Use the `ready!` macro instead, to acquire service readiness in a
consistent order.
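
A sketch of that pattern, using a hypothetical wrapper around two inner services:

```rust
use futures::ready;
use std::task::{Context, Poll};
use tower::{BoxError, Service};

/// A hypothetical service that forwards requests to two inner services.
struct Pair<A, B> {
    first: A,
    second: B,
}

impl<A, B, Request> Pair<A, B>
where
    A: Service<Request, Error = BoxError>,
    B: Service<Request, Error = BoxError>,
{
    /// Acquire readiness from both inner services.
    fn poll_ready_both(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), BoxError>> {
        // CORRECTNESS
        //
        // `ready!` returns early on `Poll::Pending`, so every caller
        // acquires readiness in the same order: `first`, then `second`.
        ready!(self.first.poll_ready(cx))?;
        ready!(self.second.poll_ready(cx))?;
        Poll::Ready(Ok(()))
    }
}
```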

## Buffer and Batch
[buffer-and-batch]: #buffer-and-batch

The constraints imposed by the `tower::Buffer` and `tower::Batch` implementations are:
1. `poll_ready` must be called **at least once** for each `call`
2. Once we've reserved a buffer slot, we always get `Poll::Ready` from a buffer,
   regardless of the current readiness of the buffer or its underlying service
3. The `Buffer`/`Batch` capacity limits the number of concurrently waiting tasks.
   Once this limit is reached, further tasks will block, awaiting a free reservation.
4. Some tasks can depend on other tasks before they resolve. (For example: block
   validation.) If there are task dependencies, **the `Buffer`/`Batch` capacity must
   be larger than the maximum number of concurrently waiting tasks**, or Zebra could
   deadlock (hang).
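
A sketch of the calling pattern that constraints 1 and 3 imply, using `tower::ServiceExt` and a hypothetical request type:

```rust
use tower::{Service, ServiceExt};

/// A hypothetical request type.
struct Request;

/// Make a request through a buffered service.
async fn call_buffered<S>(svc: &mut S, request: Request) -> Result<S::Response, S::Error>
where
    S: Service<Request>,
{
    // Constraint 1: acquire a buffer slot with `poll_ready`
    // (via `ServiceExt::ready`) before each `call`.
    //
    // Constraint 3: if the buffer is full, this await blocks the task
    // until another task's reservation is released.
    svc.ready().await?.call(request).await
}
```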

We also avoid hangs because:
- the timeouts on network messages, block downloads, and block verification will restart verification if it hangs
- `Buffer` and `Batch` release their reservations when the response future is returned by the buffered/batched service, even if the returned future hangs
- in general, we should move as much work into futures as possible, unless the design requires sequential `call`s
- larger `Buffer`/`Batch` bounds reduce contention and the risk of task dependency deadlocks

### Buffered Services
[buffered-services]: #buffered-services

A service should be wrapped in a `Buffer` if:
* it is a complex service,
* it has multiple callers, or
* it has a single caller that calls it multiple times concurrently.

Services might also have other reasons for using a `Buffer`. These reasons should be documented.
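
A sketch of sharing a buffered service between callers. The `service_fn` stand-in and the handle names are illustrative:

```rust
use tower::{buffer::Buffer, service_fn};

#[tokio::main]
async fn main() {
    // A stand-in for a complex Zebra service.
    let state_service =
        service_fn(|req: String| async move { Ok::<String, tower::BoxError>(req) });

    // Wrap the service in a `Buffer` before sharing it.
    // `Buffer::new` spawns a worker task, so it needs a tokio runtime.
    let shared_state = Buffer::new(state_service, 5);

    // Each caller gets a cheap clone of the same buffer handle.
    let sync_handle = shared_state.clone();
    let inbound_handle = shared_state.clone();
}
```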

#### Choosing Buffer Bounds
[choosing-buffer-bounds]: #choosing-buffer-bounds

Zebra's `Buffer` bounds should be set to the maximum number of concurrent requests, plus 1:

> it's advisable to set bound to be at least the maximum number of concurrent requests the `Buffer` will see

https://docs.rs/tower/0.4.3/tower/buffer/struct.Buffer.html#method.new

The extra slot protects us from future changes that add an extra caller, or extra concurrency.

As a general rule, Zebra `Buffer`s should all have at least 5 slots, because most Zebra services can
be called concurrently by:
* the sync service,
* the inbound service, and
* multiple concurrent `zebra-client` blockchain scanning tasks.

Services might also have other reasons for a larger bound. These reasons should be documented.

We should limit `Buffer` lengths for services whose requests or responses contain `Block`s (or other
large data items, such as `Transaction` vectors). A long `Buffer` full of `Block`s can significantly
increase memory usage.

For example, parsing a malicious 2 MB block can take up to 12 MB of RAM. So a 5 slot buffer can use
60 MB of RAM.

Long `Buffer`s can also increase request latency. Latency isn't a concern for Zebra's core use case
as node software, but it might be an issue if wallets, exchanges, or block explorers want to use Zebra.

## Awaiting Multiple Futures
[awaiting-multiple-futures]: #awaiting-multiple-futures

When awaiting multiple futures, Zebra can use biased or unbiased selection.

Typically, we prefer unbiased selection, so that if multiple futures are ready,
they each have a chance of completing. But if one of the futures needs to take
priority (for example, cancellation), you might want to use biased selection.

### Unbiased Selection
[unbiased-selection]: #unbiased-selection

The [`futures::select!`](https://docs.rs/futures/0.3.13/futures/macro.select.html) and
[`tokio::select!`](https://docs.rs/tokio/0.3.6/tokio/macro.select.html) macros select
ready arguments at random.

Also consider the `FuturesUnordered` stream for unbiased selection of a large number
of futures. However, these macros and the stream require mapping all arguments to the
same type.

Consider mapping the returned type to a custom enum with module-specific names.
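
A sketch of unbiased selection with a module-specific action enum. The channel and timer names are illustrative:

```rust
use tokio::{sync::mpsc, time::Interval};

/// Module-specific names for each selection arm.
enum Action {
    HandleRequest(u32),
    TimerElapsed,
}

async fn next_action(requests: &mut mpsc::Receiver<u32>, timer: &mut Interval) -> Action {
    // If both futures are ready, `tokio::select!` picks a branch at
    // random, so neither arm can starve the other.
    tokio::select! {
        Some(request) = requests.recv() => Action::HandleRequest(request),
        _ = timer.tick() => Action::TimerElapsed,
    }
}
```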

### Biased Selection
[biased-selection]: #biased-selection

The [`futures::select`](https://docs.rs/futures/0.3.13/futures/future/fn.select.html)
function is biased towards its first argument. If the first argument is always ready,
the second argument will never be returned. (This behavior is not documented or
guaranteed.) This bias can cause starvation or hangs. Consider edge cases where queues
are full, or there are a lot of messages. If in doubt:
- put shutdown or cancel oneshots first, then timers, then other futures
- use the `select!` macro to ensure fairness

Select's bias can be useful to ensure that cancel oneshots and timers are always
executed first. Consider the `select_biased!` macro and `FuturesOrdered` stream
for guaranteed ordered selection of futures. (However, this macro and stream require
mapping all arguments to the same type.)

The `futures::select` `Either` return type is complex, particularly when nested. This
makes code hard to read and maintain. Map the `Either` to a custom enum.
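
A sketch of biased selection that checks cancellation first. The `Event` names and the timeout are illustrative:

```rust
use futures::{channel::oneshot, select_biased, FutureExt};
use std::time::Duration;

enum Event {
    Cancelled,
    TimedOut,
    Message(u32),
}

async fn next_event(cancel: oneshot::Receiver<()>, message: oneshot::Receiver<u32>) -> Event {
    // `fuse` lets the select macros poll these futures safely.
    let mut cancel = cancel.fuse();
    let mut message = message.fuse();
    let mut timeout = Box::pin(tokio::time::sleep(Duration::from_secs(10)).fuse());

    // `select_biased!` polls its arms in order, so the cancel handle is
    // checked first, then the timeout, then peer messages. Under heavy
    // load, a constantly-ready message future can't starve cancellation.
    select_biased! {
        _ = cancel => Event::Cancelled,
        _ = timeout => Event::TimedOut,
        msg = message => Event::Message(msg.expect("sender dropped")),
    }
}
```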

## Using Atomics
[using-atomics]: #using-atomics

If you're considering using atomics, prefer:
1. a safe, tested abstraction
2. using the strongest memory ordering ([`SeqCst`](https://doc.rust-lang.org/nomicon/atomics.html#sequentially-consistent))
3. using a weaker memory ordering, with:
   - a correctness comment,
   - multithreaded tests with a concurrency permutation harness like [loom](https://github.com/tokio-rs/loom), and
   - benchmarks to prove that the low-level code is faster.

In Zebra, we try to use safe abstractions, and write obviously correct code. It takes
a lot of effort to write, test, and maintain low-level code. Almost all of our
performance-critical code is in cryptographic libraries. And our biggest performance
gains from those libraries come from async batch cryptography.

## Testing Async Code
[testing-async-code]: #testing-async-code

Zebra's existing acceptance and integration tests will catch most hangs and deadlocks.

Some tests are only run after merging to `main`. If a recently merged PR fails on `main`,
we revert the PR, and fix the failure.

Some concurrency bugs only happen intermittently. Zebra developers should run regular
full syncs to ensure that their code doesn't cause intermittent hangs. This is
particularly important for code that modifies Zebra's highly concurrent crates:
- `zebrad`
- `zebra-network`
- `zebra-state`
- `zebra-consensus`
- `tower-batch`
- `tower-fallback`

## Monitoring Async Code
[monitoring-async-code]: #monitoring-async-code

Zebra uses the following crates for monitoring and diagnostics:
- [tracing](https://tracing.rs/tracing/): tracing events, spans, logging
- [tracing-futures](https://docs.rs/tracing-futures/): future and async function instrumentation
- [metrics](https://docs.rs/metrics-core/) with a [prometheus exporter](https://docs.rs/metrics-exporter-prometheus/)

These introspection tools are also useful during testing:
- `tracing` logs individual events
- spans track related work through the download and verification pipeline
- `metrics` monitors overall progress and error rates
- labels split counters or gauges into different categories (for example, by peer address)

# Drawbacks
[drawbacks]: #drawbacks

Implementing and reviewing these constraints creates extra work for developers.
But concurrency bugs slow down every developer, and impact users. And diagnosing
those bugs can take a lot of developer effort.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

Can we catch these bugs using automated tests?

How can we diagnose these kinds of issues faster and more reliably?
- [TurboWish](https://blog.pnkfx.org/blog/2021/04/26/road-to-turbowish-part-1-goals/)
  looks really promising for task, channel, and future introspection. But as of April
  2021, it's still in early development.