This makes the span data more compact (e.g., `msg_as_req{msg=block}`) and
restores the Debug impl for Message to show all of the data contained in the
message. The full message is added as a single event at trace level in the
span, to preserve the previous full inspectability.
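As a rough illustration, here is a minimal Rust sketch of the pattern, using
the `tracing` crate; the `Message` enum and its `command` helper are
hypothetical stand-ins for the real network message type:

```rust
use tracing::{debug_span, trace};

// Hypothetical stand-in for the real network `Message` type.
#[derive(Debug)]
enum Message {
    Block(Vec<u8>),
    Ping(u64),
}

impl Message {
    /// Short name used in the span field, e.g. `msg_as_req{msg=block}`.
    fn command(&self) -> &'static str {
        match self {
            Message::Block(_) => "block",
            Message::Ping(_) => "ping",
        }
    }
}

fn handle_message(msg: &Message) {
    // Compact span: only the short message name appears on every log line.
    let span = debug_span!("msg_as_req", msg = msg.command());
    let _entered = span.enter();

    // A single trace-level event preserves full inspectability.
    trace!(?msg, "full message");
}
```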
As we approach our alpha release, we've decided to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly which version of the software users are running, and in particular, how easy it may be for users to accidentally discard this information when reporting bugs.
To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information is then re-exported as a span early in the application startup process, so that all logs and error messages include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s GitHub issue URL auto-generation feature, which should make sure that the sha is easily available in the bug reports we receive, even in the absence of logs.
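A minimal sketch of this wiring, assuming a hand-rolled build script (the
actual build may obtain the sha differently, e.g. via a crate like `vergen`)
and `color-eyre`'s `issue-url` feature:

```rust
// build.rs: capture the git sha at compile time and expose it to the
// crate as an environment variable.
use std::process::Command;

fn main() {
    let sha = Command::new("git")
        .args(["rev-parse", "HEAD"])
        .output()
        .ok()
        .map(|out| String::from_utf8_lossy(&out.stdout).trim().to_string())
        .unwrap_or_else(|| "unknown".to_string());
    println!("cargo:rustc-env=GIT_SHA={}", sha);
}
```

```rust
// main.rs: re-export the sha as the very first span, and attach it to
// color-eyre's auto-generated issue URLs (illustrative URL and keys).
fn main() -> color_eyre::eyre::Result<()> {
    color_eyre::config::HookBuilder::default()
        .issue_url("https://github.com/ZcashFoundation/zebra/issues/new")
        .add_issue_metadata("git-sha", env!("GIT_SHA"))
        .install()?;

    let span = tracing::info_span!("zebrad", git_sha = env!("GIT_SHA"));
    let _entered = span.enter();
    // ... application startup ...
    Ok(())
}
```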
Co-authored-by: teor <teor@riseup.net>
* warn: if there are no peers at all
* info: if there are no ready peers
* trace: the number of ready and unready peers for every request
Log at most one warn or info message per minute, to avoid flooding the
terminal with log lines. Suppress warn and info logs for the first
minute, while the peer set is starting up.
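A hedged sketch of this logging policy; the `PeerSetLogger` type and its
fields are illustrative, not the actual peer set implementation:

```rust
use std::time::{Duration, Instant};
use tracing::{info, trace, warn};

const LOG_INTERVAL: Duration = Duration::from_secs(60);

/// Illustrative rate-limited logger for peer set status.
struct PeerSetLogger {
    started: Instant,
    last_warn_or_info: Option<Instant>,
}

impl PeerSetLogger {
    fn new() -> Self {
        Self {
            started: Instant::now(),
            last_warn_or_info: None,
        }
    }

    fn log_peer_counts(&mut self, ready: usize, unready: usize) {
        // Trace the ready and unready counts for every request.
        trace!(ready, unready, "peer set sizes");

        let now = Instant::now();
        // Suppress warn and info logs while the peer set is starting up...
        if now.duration_since(self.started) < LOG_INTERVAL {
            return;
        }
        // ...and log at most one warn or info message per minute.
        if let Some(last) = self.last_warn_or_info {
            if now.duration_since(last) < LOG_INTERVAL {
                return;
            }
        }

        if ready == 0 && unready == 0 {
            warn!("no peers in the peer set");
            self.last_warn_or_info = Some(now);
        } else if ready == 0 {
            info!(unready, "no ready peers in the peer set");
            self.last_warn_or_info = Some(now);
        }
    }
}
```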
The `CoinbaseData` type parses the block height separately from the rest of
the free-form coinbase data. However, it had two bugs:
1. It did not require that the height was canonically encoded;
2. Its canonical encoding was incorrect relative to the BIP34-inherited encoding.
This meant that we computed some transaction hashes incorrectly: when we
re-serialized the coinbase transaction, we canonically encoded its height
using the incorrect definition of canonical (bug 2). And we didn't notice
that the wrong definition of canonical encoding was being used, because we
accepted heights that we thought were non-canonically encoded (bug 1).
The relevant rules are here: 877212414a/src/script/script.h (L307-L346)
This commit changes the encoding to reject non-canonically encoded heights, and
to match the correct encoding rules. We check that at least one
non-canonically encoded height is correctly rejected using a new test vector.
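For illustration, a simplified sketch of the minimal-encoding rule, following
the `IsMinimallyEncoded` logic in the script.h lines cited above (the
small-height opcode special cases and the real Zebra types are omitted):

```rust
/// Returns true if `bytes` is a minimally encoded script number.
fn is_minimally_encoded(bytes: &[u8]) -> bool {
    match bytes.last() {
        // The empty vector is the minimal encoding of zero.
        None => true,
        Some(&last) => {
            // A most significant byte with no value bits (0x00, or 0x80
            // for negative numbers) is only allowed when the byte below
            // it would otherwise spill into the sign bit.
            if last & 0x7f == 0 {
                bytes.len() >= 2 && bytes[bytes.len() - 2] & 0x80 != 0
            } else {
                true
            }
        }
    }
}

/// Parse a coinbase height, rejecting non-canonical encodings.
fn parse_coinbase_height(push: &[u8]) -> Result<u32, &'static str> {
    if push.len() > 4 || !is_minimally_encoded(push) {
        return Err("non-canonical coinbase height encoding");
    }
    // Heights are non-negative, so a set sign bit is invalid.
    if push.last().map_or(false, |&b| b & 0x80 != 0) {
        return Err("negative coinbase height");
    }
    // Script numbers are little-endian.
    Ok(push
        .iter()
        .enumerate()
        .fold(0u32, |acc, (i, &b)| acc | (u32::from(b) << (8 * i))))
}
```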
The database format increments because we saved a bunch of wrongly encoded blocks.
This discrepancy was originally noticed by @teor2345, who pointed out that a
previous version of the block 202 test vector (now preserved as "bad block
202") did not match the block from zcashd.
Change the Merkle root validation logic to also check that a block does not
contain duplicate transactions. This check is redundant with later
double-spend checks, but it is a useful defense-in-depth measure.
As a side effect of computing Merkle roots, we build a list of
transaction hashes. Instead of discarding these, add them to
PreparedBlock and FinalizedBlock so that they can be reused rather than
recomputed.
This commit adds Merkle root validation to:
1. the block verifier;
2. the checkpoint verifier.
In the first case, Bitcoin Merkle tree malleability has no effect,
because only a single Merkle tree in each malleability set is valid (the
others have duplicate transactions).
In the second case, we need to check that the Merkle tree does not contain any
duplicate transactions.
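A sketch of the combined computation, assuming the `sha2` crate
(Bitcoin-style Merkle trees hash with double-SHA256 and repeat the last hash
on odd levels; the real Zebra types differ):

```rust
use sha2::{Digest, Sha256};
use std::collections::HashSet;

type Hash = [u8; 32];

fn double_sha256(data: &[u8]) -> Hash {
    Sha256::digest(Sha256::digest(data)).into()
}

/// Compute the Merkle root, reusing the per-transaction hashes to also
/// reject blocks that contain duplicate transactions.
fn merkle_root_with_duplicate_check(
    transactions: &[Vec<u8>],
) -> Result<Hash, &'static str> {
    let mut hashes: Vec<Hash> =
        transactions.iter().map(|tx| double_sha256(tx)).collect();

    // Duplicate transactions always produce duplicate hashes, so this
    // check rejects every invalid member of a malleability set.
    let unique: HashSet<&Hash> = hashes.iter().collect();
    if unique.len() != hashes.len() {
        return Err("block contains duplicate transactions");
    }

    while hashes.len() > 1 {
        if hashes.len() % 2 == 1 {
            // Odd level: the last hash is paired with itself.
            hashes.push(*hashes.last().unwrap());
        }
        hashes = hashes
            .chunks(2)
            .map(|pair| {
                let mut buf = [0u8; 64];
                buf[..32].copy_from_slice(&pair[0]);
                buf[32..].copy_from_slice(&pair[1]);
                double_sha256(&buf)
            })
            .collect();
    }

    hashes.pop().ok_or("empty block")
}
```

The per-transaction `hashes` vector is exactly what this change stores in
PreparedBlock and FinalizedBlock for reuse.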
Closes #1385. Closes #906.
UTXO requests during transaction input verification can time out because:
1. The block that creates the UTXO is queued for download or verify, but
it hasn't been committed yet. The creating block might spend UTXOs
that come from other recent blocks, so UTXO verification can depend on
a (non-contiguous) sequence of block verifications.
In this case, Zebra should wait for additional block download and
verify tasks to complete.
2. The block that creates the UTXO isn't queued for download. This can
happen because the block is a gossiped block that's much higher than the
current tip, or because a peer sent the syncer a bad list of block
hashes.
In this case, Zebra should discard the timed out block, and restart
the sync.
We need to choose a timeout that balances these two cases, so we time
out after 180 seconds.
Assuming Zebra can download at least 1 MB per second, 180 seconds is
enough time to download a few hundred blocks. So Zebra should be able to
download and verify the next block before the UTXOs that it creates time
out. (Since Zebra has already verified all the blocks before the next
block, its UTXO requests should return immediately.)
Even if some peers time out downloads, a block can only be pending
download for 80 seconds (4 retries * 20 second timeout) before the
download fails. So the UTXO timeout doesn't need to be much larger than
this overall download timeout - because the download timeout will happen
first on slow networks.
Alternatively, if the download for the creating block was never queued,
Zebra should time out as soon as possible, so it can restart the sync
and download the creating block.
As a side-effect, a lower UTXO timeout also makes it slightly easier to
debug UTXO issues, because unsatisfiable queries fail faster.
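Restating the timeout budget above as a sketch (the constant names are
illustrative):

```rust
use std::time::Duration;

/// Retries before a block download fails (illustrative name).
const BLOCK_DOWNLOAD_RETRY_LIMIT: u32 = 4;
/// Timeout for a single block download attempt (illustrative name).
const BLOCK_DOWNLOAD_TIMEOUT: Duration = Duration::from_secs(20);
/// UTXO requests time out after 180 seconds.
const UTXO_LOOKUP_TIMEOUT: Duration = Duration::from_secs(180);

fn main() {
    // A block can be pending download for at most 4 * 20 = 80 seconds,
    // so on slow networks the download timeout always fires first.
    let max_pending = BLOCK_DOWNLOAD_TIMEOUT * BLOCK_DOWNLOAD_RETRY_LIMIT;
    assert!(max_pending < UTXO_LOOKUP_TIMEOUT);

    // Assuming at least 1 MB/s, the UTXO timeout allows ~180 MB of
    // blocks to download and verify before a UTXO request fails.
    let download_budget_mb = UTXO_LOOKUP_TIMEOUT.as_secs();
    println!("download budget: ~{} MB", download_budget_mb);
}
```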
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions
Add a `max_len` argument to support `FindHeaders` requests.
Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly (see the sketch after
this list).
* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants
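For illustration, a hedged sketch of the height-based collection: the
function name, arguments, and the `Vec`-indexed chain are hypothetical
simplifications of the real state service code.

```rust
/// Collect hashes above the intersection, stopping at `stop` (inclusive)
/// or after `max_len` hashes. `chain[height]` is the best chain's block
/// hash at that height.
fn collect_chain_hashes(
    chain: &[[u8; 32]],
    intersection_height: Option<usize>,
    stop: Option<[u8; 32]>,
    max_len: usize,
) -> Vec<[u8; 32]> {
    // With no intersection, start from the genesis block.
    let start = intersection_height.map_or(0, |height| height + 1);

    let mut hashes = Vec::new();
    for height in start..chain.len() {
        if hashes.len() >= max_len {
            break;
        }
        let hash = chain[height];
        hashes.push(hash);
        // The `stop` hash is included, then collection ends.
        if Some(hash) == stop {
            break;
        }
    }
    hashes
}
```

The same function can serve `FindBlocks` and `FindHeaders` by passing their
different `max_len` limits.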
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Some checks use the same blocks, so we take a copy of the block borrows
before using them. That way, we don't have to manage the position of the
iterator between checks.
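In Rust terms, this just clones the iterator of borrows, which copies the
iterator's position rather than the blocks themselves; a tiny sketch:

```rust
// Each check consumes its own clone, so no check disturbs the
// iterator position seen by the others.
fn run_checks(blocks: &[String]) {
    let relevant_chain = blocks.iter();

    let for_check_a = relevant_chain.clone();
    let for_check_b = relevant_chain.clone();

    // Cloning a slice iterator is cheap: it copies two pointers.
    assert_eq!(for_check_a.count(), blocks.len());
    assert_eq!(for_check_b.count(), blocks.len());
}
```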
The state service API says explicitly that AwaitUTXO requests should be coupled
with a timeout layer. I didn't add this when I was testing and fixing the UTXO
lookup code (#1348, #1358), because causing zebrad to hang on a failed
dependency was useful for identifying cases where the code wasn't behaving
correctly (and then inspecting the execution traces).
As a side effect, this ticket resolves most of the hangs in #1389, because
far-future gossiped blocks will have their UTXO lookups time out, though we
may wish to do other work as part of debugging the combined sync+gossip logic.
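A minimal sketch of that coupling, using tower's `timeout` layer; the
stand-in service and request type are illustrative, and the actual zebrad
wiring may differ:

```rust
use std::time::Duration;
use tower::ServiceBuilder;

fn main() {
    // Illustrative stand-in for the state service.
    let state_service =
        tower::service_fn(|_req: ()| async { Ok::<_, tower::BoxError>(()) });

    // Couple AwaitUTXO requests with a timeout layer, as the state
    // service API requires: unsatisfiable queries now fail after 180
    // seconds instead of hanging zebrad indefinitely.
    let _state_service = ServiceBuilder::new()
        .timeout(Duration::from_secs(180))
        .service(state_service);
}
```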
* Difficulty Contextual RFC: Introduction
Add a header, summary, and motivation
* Difficulty RFC: Add draft definitions
And update the state RFC definitions to match.
* Difficulty RFC: Add relevant chain
* Difficulty RFC: draft guide-level explanation
Outline the core calculations and checks.
* Difficulty RFC: Revised based on spec fixes
Update the design based on the spec bugs in #1276, #1277, and
zcash/zips#416.
These changes make the difficulty filter into a context-free check,
so we remove it from this contextual validation RFC.
* Difficulty RFC: Explain how Zebra's calculations can match the spec
* Difficulty RFC: write most of the reference section
Includes most of the implementation, modules for each function, and
draft notes for some of the remaining parts of the RFC.
* Difficulty RFC: Add an AdjustedDifficulty struct
* Difficulty RFC: Summarise module structure in the one place
* Difficulty RFC: Create implementation notes subsections
* Difficulty RFC: add consensus-critical order of operations
* Difficulty RFC: Use the ValidateContextError type
* Difficulty RFC: make the median_time arg mut owned
We have to clone the data to pass a fixed-length array to a function,
so we might as well sort that array in place to find the median, and
avoid another copy (see the sketch below).
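A sketch of the resulting signature, assuming `chrono` timestamps and an
11-block median span (the names are illustrative):

```rust
use chrono::{DateTime, Utc};

/// Taking the fixed-length array by value lets us sort it in place to
/// find the median, avoiding a second copy of the caller's data.
fn median_time(mut median_block_span: [DateTime<Utc>; 11]) -> DateTime<Utc> {
    median_block_span.sort();
    median_block_span[median_block_span.len() / 2]
}
```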