Commit Graph

209 Commits

Author SHA1 Message Date
teor 207ded6889 Add error context for contextual validation 2020-12-04 10:44:36 +10:00
teor 23e07a94cf Implement the block header time consensus rules 2020-12-04 10:44:36 +10:00
teor 0bac2dafcc Split out a separate `median_time_past` function 2020-12-04 10:44:36 +10:00
teor ab486d336f Update the contextual difficulty module doc 2020-12-04 10:44:36 +10:00
dependabot[bot] 8c052cc39a build(deps): bump color-eyre from 0.5.9 to 0.5.10
Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.9 to 0.5.10.
- [Release notes](https://github.com/yaahc/color-eyre/releases)
- [Changelog](https://github.com/yaahc/color-eyre/blob/v0.5.10/CHANGELOG.md)
- [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.9...v0.5.10)

Signed-off-by: dependabot[bot] <support@github.com>
2020-12-03 10:55:16 -05:00
Henry de Valence c04cc39a03 state: dodge a bug in zcashd
Zcashd will blindly request more block headers as long as it got 160
block headers in response to a previous query, EVEN IF THOSE HEADERS ARE
ALREADY KNOWN.  To dodge this behavior, return slightly fewer than the
maximum, to get it to go away.

0ccc885371/src/main.cpp (L6274-L6280)

Without this change, communication between a partially-synced `zebrad`
and fully-synced `zcashd` looked like this:

1.  `zebrad` connects to `zcashd`, which sends an initial `getheaders`
    request;

2.  `zebrad` correctly computes the intersection of the provided block
    locator with the node's current chain and returns 160 following
    headers;

3.  `zcashd` does not check whether it already has those headers and
    assumes that any provided headers are new and re-validates them;

4.  `zcashd` assumes that because `zebrad` responded with 160 headers,
    the `zebrad` node is ahead of it, and requests the next 160 headers;

5.  Because block locators are sparse, the intersection between the
    `zcashd` and `zebrad` chains is likely well behind the `zebrad` tip,
    so this process continues for thousands of blocks.

To avoid this problem, we return slightly fewer than the protocol
maximum (158 rather than 160, to guard against off-by-one errors in
zcashd).  This does not interfere with use of the returned headers by
peers that check the headers, but does prevent `zcashd` from trying to
download thousands of block headers it already has.
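
A minimal sketch of the capping idea described above, not the actual Zebra code. The constant names, the function name, and the slice-based chain are hypothetical; only the values 160 and 158 come from the message.

```rust
/// Protocol maximum number of headers in one response (value from the commit message).
const MAX_HEADERS_PER_MESSAGE: usize = 160;

/// Hypothetical name: the reduced cap (158), leaving a margin of two
/// to guard against off-by-one errors in the peer.
const MAX_FIND_HEADERS_RESULTS: usize = MAX_HEADERS_PER_MESSAGE - 2;

/// Returns the headers that follow `intersection` (an index into `chain`),
/// truncated to the reduced cap so the peer never sees a "full" response.
fn headers_after_intersection<H: Clone>(chain: &[H], intersection: usize) -> Vec<H> {
    chain
        .iter()
        .skip(intersection.saturating_add(1))
        .take(MAX_FIND_HEADERS_RESULTS)
        .cloned()
        .collect()
}
```

A peer that treats a response of exactly 160 headers as "more are available" would stop asking after a response of 158, while a peer that validates the headers can use them as usual.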

This problem does not occur in the `zcashd<->zcashd` case only because
`zcashd` does not respond to `getheaders` messages while it is syncing.
However, implementing this behavior in Zebra would be more complicated,
because we don't have a distinct "initial block sync" state (we do
poll-based syncing continuously) and we don't have shared global
variables to modify to set that state.

Relevant links (thanks @str4d):

- The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- https://github.com/bitcoin/bitcoin/issues/6861
- https://github.com/bitcoin/bitcoin/issues/6755
- https://github.com/bitcoin/bitcoin/pull/8306#issuecomment-614916454
2020-12-02 19:44:24 -05:00
Jane Lusby d7bef1c155 bump color-eyre version to avoid a panic when printing spantraces (#1438) 2020-12-02 14:16:18 -08:00
Henry de Valence b449fe93b2 network: correct data modeling for headers messages
We modeled a Bitcoin `headers` message as being a list of block headers.
However, the actual data structure is slightly different: it's a list of (block
header, transaction count) pairs.  This caused zcashd to reject our headers
messages.

To fix this, introduce a new `CountedHeader` struct with a `block::Header` and
transaction count `usize`, then thread it through the inbound service and the
state.
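
A rough sketch of the data shape described above. `Header` and `Block` here are simplified stand-ins, the field names are assumptions, and whether the count sent on the wire is the real transaction count is not stated in this message; the sketch simply pairs each header with its block's count.

```rust
/// Stand-in for `block::Header`; the real type has many more fields.
#[derive(Clone, Debug, PartialEq)]
struct Header {
    version: u32,
}

/// Stand-in for a block: a header plus opaque transactions.
struct Block {
    header: Header,
    transactions: Vec<Vec<u8>>,
}

/// A header paired with a transaction count, as in a `headers` message entry.
/// Field names are hypothetical.
#[derive(Clone, Debug, PartialEq)]
struct CountedHeader {
    header: Header,
    transaction_count: usize,
}

/// Builds the list of (header, transaction count) pairs for a run of blocks.
fn counted_headers(blocks: &[Block]) -> Vec<CountedHeader> {
    blocks
        .iter()
        .map(|b| CountedHeader {
            header: b.header.clone(),
            transaction_count: b.transactions.len(),
        })
        .collect()
}
```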

I tested this locally by running Zebra with these changes and inspecting a
trace-level log of the span of a peer connection that requested a nontrivial
headers packet from us, and verified that it did not reject our message.
2020-12-02 10:24:31 -08:00
teor cee0e86190 Increase the open file limit on unix platforms
If the limit is less than the ideal, try to increase it to the ideal.
If that doesn't work, try to increase the limit as high as possible.
If the limit is still less than the minimum, panic.
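
The three rules above can be sketched as a pure function. This is an illustration only: the function name and parameters are hypothetical, "as high as possible" is assumed to mean the hard limit, and `try_set` stands in for whatever system call actually changes the limit.

```rust
/// Applies the three rules in order and returns the resulting limit.
/// `try_set(n)` is a stand-in that attempts to set the soft limit to `n`
/// and reports whether it succeeded.
fn raise_open_file_limit(
    current: u64,
    hard_max: u64,
    min: u64,
    ideal: u64,
    mut try_set: impl FnMut(u64) -> bool,
) -> u64 {
    let mut limit = current;
    if limit < ideal {
        if try_set(ideal) {
            // Rule 1: the ideal limit was accepted.
            limit = ideal;
        } else if hard_max > limit && try_set(hard_max) {
            // Rule 2: fall back to the highest value we are allowed (assumed: the hard limit).
            limit = hard_max;
        }
    }
    // Rule 3: give up loudly if we still cannot reach the minimum.
    assert!(
        limit >= min,
        "open file limit {} is below the minimum {}",
        limit,
        min
    );
    limit
}
```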
2020-12-02 15:32:36 +10:00
teor 44f2326672 Move the RocksDB column family list into finalized_state
The list was previously split between config and finalized_state.
2020-12-02 15:32:36 +10:00
teor 92eb92d1dd Disable the nightly clippy unnecessary_wraps lint (#1403)
It seems to be a bit broken - some of our functions return `Result` for
consistency with similar functions. But the lint picks them up anyway.
2020-12-01 12:20:57 +10:00
Henry de Valence 4fa119dd1f chain: fix consensus-critical coinbase encoding bug
The `CoinbaseData` parses the block height separately from the rest of the
free-form coinbase data.  However, it had two bugs:

1. It did not require that the height was canonically encoded;
2. Its canonical encoding was incorrect relative to the BIP34-inherited encoding.

This meant that we computed some transaction hashes incorrectly, because when
we re-serialized the coinbase transaction, we would canonically serialize the
coinbase transaction (using the incorrect definition of canonical, bug 2).  And
we didn't notice that the wrong definition of canonical encoding was being used
because we accepted what we thought were non-canonically encoded heights.

The relevant rules are here: 877212414a/src/script/script.h (L307-L346)

This commit changes the encoding to reject non-canonically encoded heights, and
to match the correct encoding rules.  We check that at least one
non-canonically encoded height is correctly rejected using a new test vector.
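
A sketch of what "canonical" could look like, based on my understanding of how Bitcoin Script pushes small integers (OP_0, OP_1 to OP_16, then a minimal little-endian push with a sign byte). The linked `script.h` is authoritative; the function names here are hypothetical and this is not the Zebra parser.

```rust
/// Encodes a height the way a script number push is assumed to work.
fn encode_coinbase_height(height: u32) -> Vec<u8> {
    match height {
        0 => vec![0x00],                     // OP_0
        1..=16 => vec![0x50 + height as u8], // OP_1 ..= OP_16
        _ => {
            // Minimal little-endian magnitude bytes.
            let mut bytes = Vec::new();
            let mut n = height;
            while n > 0 {
                bytes.push((n & 0xff) as u8);
                n >>= 8;
            }
            // The top bit is a sign bit, so pad with 0x00 if it is set.
            if bytes.last().map_or(false, |b| b & 0x80 != 0) {
                bytes.push(0x00);
            }
            let mut out = vec![bytes.len() as u8]; // push opcode = byte count
            out.extend(bytes);
            out
        }
    }
}

/// Decodes a height prefix and returns it with the remaining coinbase data.
/// Returns `None` unless the prefix is exactly the canonical encoding.
fn decode_coinbase_height(data: &[u8]) -> Option<(u32, &[u8])> {
    let first = *data.first()?;
    let (height, used) = match first {
        0x00 => (0u32, 1usize),
        0x51..=0x60 => ((first - 0x50) as u32, 1),
        0x01..=0x05 => {
            let len = first as usize;
            let bytes = data.get(1..1 + len)?;
            let mut n: u64 = 0;
            for (i, b) in bytes.iter().enumerate() {
                n |= (*b as u64) << (8 * i);
            }
            if n > u32::MAX as u64 {
                return None;
            }
            (n as u32, 1 + len)
        }
        _ => return None,
    };
    // Reject anything that does not round-trip to the same bytes.
    if encode_coinbase_height(height)[..] == data[..used] {
        Some((height, &data[used..]))
    } else {
        None
    }
}
```

Checking the round trip on decode is one simple way to reject non-canonical encodings, since a parser that accepts them would re-serialize the transaction to different bytes and so compute a different hash.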

The database format increments because we saved a bunch of wrongly encoded blocks.

This discrepancy was originally noticed by @teor2345, who pointed out that a
previous version of the block 202 test vector (now preserved as "bad block
202") did not match the block from zcashd.
2020-12-01 10:14:44 +10:00
Henry de Valence 7c08c0c315 consensus: check Merkle roots
As a side effect of computing Merkle roots, we build a list of
transaction hashes.  Instead of discarding these, add them to
PreparedBlock and FinalizedBlock so that they can be reused rather than
recomputed.

This commit adds Merkle root validation to:

1. the block verifier;
2. the checkpoint verifier.

In the first case, Bitcoin Merkle tree malleability has no effect,
because only a single Merkle tree in each malleability set is valid (the
others have duplicate transactions).

In the second case, we need to check that the Merkle tree does not contain any
duplicate transactions.
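
To illustrate the malleability being discussed, here is a toy sketch. It assumes the Bitcoin-style rule of duplicating the last node on odd-length layers, and it uses a non-cryptographic stand-in hash over `u64` values instead of the real double SHA-256 over transaction hashes. All names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Stand-in for the real node hash. NOT cryptographic; illustration only.
fn toy_hash_pair(left: u64, right: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (left, right).hash(&mut hasher);
    hasher.finish()
}

/// Bitcoin-style Merkle root: odd-length layers duplicate their last node.
fn merkle_root(tx_hashes: &[u64]) -> Option<u64> {
    if tx_hashes.is_empty() {
        return None;
    }
    let mut layer = tx_hashes.to_vec();
    while layer.len() > 1 {
        if layer.len() % 2 == 1 {
            let last = *layer.last().unwrap();
            layer.push(last);
        }
        layer = layer
            .chunks(2)
            .map(|pair| toy_hash_pair(pair[0], pair[1]))
            .collect();
    }
    Some(layer[0])
}

/// The extra check needed when nothing else rules out duplicates.
fn has_duplicate_transactions(tx_hashes: &[u64]) -> bool {
    let mut seen = HashSet::new();
    tx_hashes.iter().any(|hash| !seen.insert(*hash))
}
```

With this construction the lists `[1, 2, 3]` and `[1, 2, 3, 3]` produce the same root, so a root comparison alone cannot tell them apart; the duplicate check can.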

Closes #1385
Closes #906
2020-12-01 10:14:44 +10:00
Alfredo Garcia 4544463059 Inbound `FindBlocks` and `FindHeaders` (#1347)
* implement inbound `FindBlocks`
* Handle inbound peer FindHeaders requests
* handle request before having any chain tip
* Split `find_chain_hashes` into smaller functions

Add a `max_len` argument to support `FindHeaders` requests.

Rewrite the hash collection code to use heights, so we can handle the
`stop` hash and "no intersection" cases correctly.

* Split state height functions into "any chain" and "best chain"
* Rename the best chain block method to `best_block`
* Move fmt utilities to zebra_chain::fmt
* Summarise Debug for some Message variants

Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:30:37 +10:00
teor d007c76488 Turn a chain length check into an assert 2020-12-01 07:27:30 +10:00
teor d1ba1146d4 Add intra-doc links 2020-12-01 07:27:30 +10:00
teor 1e4ce74c93 Turn the relevant chain into a Vec before using it
Some checks use the same blocks, so we take a copy of the block borrows
before using them. That way, we don't have to manage the position of the
iterator between checks.
2020-12-01 07:27:30 +10:00
teor 712dd9ddf3 Make a module `pub(crate)` rather than `pub` 2020-12-01 07:27:30 +10:00
teor ec6ef93b7b Simplify an ExpandedDifficulty division 2020-12-01 07:27:30 +10:00
teor d64c2976e3 Rewrite iterator processing using unzip
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:27:30 +10:00
teor 91476535d3 Doc comment formatting
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
2020-12-01 07:27:30 +10:00
teor 678e6ad090 Implement difficulty_threshold_is_valid 2020-12-01 07:27:30 +10:00
teor 750f096a99 Implement testnet minimum difficulty 2020-12-01 07:27:30 +10:00
teor bb9c4918bf Implement threshold_bits 2020-12-01 07:27:30 +10:00
teor f0a49d64bf Split out a median_timespan function 2020-12-01 07:27:30 +10:00
teor 054d6f0525 Implement median_timespan_bounded 2020-12-01 07:27:30 +10:00
teor 75519b0ae9 Implement averaging_window_timespan 2020-12-01 07:27:30 +10:00
teor bcabf75fe9 Replace integer lengths with named constants 2020-12-01 07:27:30 +10:00
teor e07b0bc8da Implement median_time
And enough stubs to run it on real data.
2020-12-01 07:27:30 +10:00
teor 741c44cd55 Implement mean_target_difficulty
And enough stub code to actually run it on the context.
2020-12-01 07:27:30 +10:00
teor 939c2b97a6 Implement AdjustedDifficulty creation
Also:
* call the difficulty check from `block_is_contextually_valid`
* add a stub `difficulty_threshold_is_valid` function
2020-12-01 07:27:30 +10:00
teor fa03b83351 Update some contextual validation comments and error messages 2020-12-01 07:27:30 +10:00
teor 1bf5ff07fb Fix a state config comment 2020-11-30 15:57:46 -05:00
teor 176923a771 Add an info-level log when UTXO requests are pruned (#1396)
And a debug-level log when no requests are pruned.

I'm seeing some hangs during the initial sync; these logs might help
identify the cause.
2020-11-26 17:26:10 +10:00
Deirdre Connolly e11e8e1373 s/TRASPARENT/TRANSPARENT/g 2020-11-25 17:22:26 -05:00
teor 31eb0a5126 Avoid verbose default logs
Temporary fix so that Zebra's default logs support a typical workflow:
1. Developer or user runs Zebra with the default config
2. They send the logs to a terminal
3. When they see a bug, they copy-paste the last few log lines into a
   bug report

This is the same change that was merged in #1373 and reverted in #1375.
We'll create a consistent logging design for Zebra in ticket #1381.
2020-11-25 10:55:15 -08:00
teor b1bbb13978 Make debug_stop_at_height and ephemeral work together (#1339)
* Make debug_stop_at_height and ephemeral work together

* if `debug_stop_at_height` and `ephemeral` are set, delete the database
  files after reaching the stop height
* drop or flush the database before `debug_stop_at_height` exits Zebra
2020-11-25 15:04:18 +10:00
Deirdre Connolly 2a21c86b91 I before E except after C (or uh, not-english) 2020-11-24 22:23:57 -05:00
Henry de Valence 2e0ed94b22 Revert "Downgrade a per-block log to debug level"
This reverts commit 15d26e3c47.
2020-11-24 14:39:45 -05:00
teor 15d26e3c47 Downgrade a per-block log to debug level 2020-11-24 10:56:57 -05:00
Henry de Valence 040e50b183 state: service::utxo -> service::pending_utxos 2020-11-23 22:18:43 -08:00
Henry de Valence 342eb166ff state: track UTXO provenance
This commit changes the state system and database format to track the
provenance of UTXOs, in addition to the outputs themselves.
Specifically, it tracks the following additional metadata:

- the height at which the UTXO was created;
- whether the UTXO was created by a coinbase transaction.

This metadata will allow us to:

- check the coinbase maturity consensus rule;
- check the coinbase inputs => no transparent outputs rule;
- implement lookup of transactions by utxo (using the height to find the
  block and then scanning the block) for a future RPC mechanism.
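
A minimal sketch of how the extra metadata could support a coinbase maturity check. The struct layout and names are hypothetical, and the maturity value of 100 blocks and the exact boundary are assumptions, not taken from this commit.

```rust
/// Hypothetical UTXO record: the output (reduced to a value here)
/// plus the provenance metadata described above.
struct Utxo {
    value: u64,
    /// Height of the block that created this output.
    height: u32,
    /// Whether the creating transaction was a coinbase transaction.
    from_coinbase: bool,
}

/// Assumed maturity window, in blocks.
const COINBASE_MATURITY: u32 = 100;

/// Returns true if `utxo` may be spent by a transaction at `spend_height`.
/// Non-coinbase outputs are unrestricted in this sketch.
fn is_mature_at(utxo: &Utxo, spend_height: u32) -> bool {
    !utxo.from_coinbase || spend_height >= utxo.height.saturating_add(COINBASE_MATURITY)
}
```

Without the stored height and coinbase flag, this check would need a lookup of the creating transaction for every spend.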

Closes #1342
2020-11-23 22:18:43 -08:00
teor 00c52d28cd Appease rustfmt 2020-11-23 14:16:39 +10:00
teor acf6096103 Appease clippy stable 2020-11-23 14:16:39 +10:00
Henry de Valence 2a4a89c002 state,zebrad: tidy span levels for good INFO output
This provides useful and not too noisy output at INFO level.  We do an
info-level message on every block commit instead of trying to do one
message every N blocks, because this is useful both for initial block
sync as well as continuous state updates on new blocks.
2020-11-23 14:16:39 +10:00
Henry de Valence e0817d1747 state: introduce PreparedBlock, FinalizedBlock
This change introduces two new types:

- `PreparedBlock`, representing a block which has undergone semantic
  validation and has been prepared for contextual validation;
- `FinalizedBlock`, representing a block which is ready to be finalized
  immediately;

and changes the `Request::CommitBlock` and `Request::CommitFinalizedBlock`
variants to use these types instead of their previous fields.

This change solves the problem of passing data between semantic
validation and contextual validation, and cleans up the state code by
allowing it to pass around a bundle of data.  Previously, the state code
just passed around an `Arc<Block>`, which forced it to needlessly
recompute block hashes and other data, and was incompatible with the
already-known but not-yet-implemented data transfer requirements, namely
passing in the Sprout and Sapling anchors computed during contextual
validation.

This commit propagates the `PreparedBlock` and `FinalizedBlock` types
through the state code but only uses their data opportunistically, e.g.,
changing .hash() computations to use the precomputed hash.  In the
future, these structures can be extended to pass data through the
verification pipeline for reuse as appropriate.  For instance, these
changes allow the sprout and sapling anchors to be propagated through
the state.
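
A loose sketch of the bundling idea, not the real types. Only the type names `PreparedBlock` and `FinalizedBlock` and the use of `Arc<Block>` come from the message; the fields, the `Block` stand-in, the constructor, the conversion, and the toy hash are all assumptions for illustration.

```rust
use std::sync::Arc;

/// Stand-in block: a height plus opaque contents.
struct Block {
    height: u32,
    payload: Vec<u8>,
}

/// Stand-in for the real block hash (FNV-1a, NOT cryptographic).
fn toy_block_hash(block: &Block) -> u64 {
    let height_bytes = block.height.to_le_bytes();
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in height_bytes.iter().chain(block.payload.iter()) {
        hash ^= *byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// A block that has passed semantic validation, with data computed once.
struct PreparedBlock {
    block: Arc<Block>,
    hash: u64,
    height: u32,
}

/// A block ready to be finalized immediately.
struct FinalizedBlock {
    block: Arc<Block>,
    hash: u64,
    height: u32,
}

impl PreparedBlock {
    /// Computes the hash once, so later stages can reuse it.
    fn new(block: Arc<Block>) -> Self {
        let hash = toy_block_hash(&block);
        let height = block.height;
        PreparedBlock { block, hash, height }
    }
}

impl From<PreparedBlock> for FinalizedBlock {
    /// Carries the precomputed data forward instead of recomputing it.
    fn from(prepared: PreparedBlock) -> Self {
        FinalizedBlock {
            block: prepared.block,
            hash: prepared.hash,
            height: prepared.height,
        }
    }
}
```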
2020-11-23 14:16:39 +10:00
Henry de Valence 3f78476693 state: check queued blocks for known UTXOs
The behavior of a request for a UTXO from a previous block depends on
whether that block has already been submitted to the state, or not:

* if it has, the state should be able to find it and answer immediately.
* if it has not, the state should see it in a later request.

However, the previous code only checked committed blocks, not queued
blocks, so if the block containing the UTXO had already arrived but had
not been committed, it would never be scanned.
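
The lookup order described above can be sketched roughly as follows. The types, field names, and the map-based storage are hypothetical stand-ins, not the state service's actual structures.

```rust
use std::collections::HashMap;

/// Hypothetical outpoint: (transaction id stand-in, output index).
type OutPoint = (u64, u32);

#[derive(Debug, PartialEq)]
enum UtxoLookup {
    /// The output was found; the payload is a stand-in for the output itself.
    Found(u64),
    /// Not seen yet; the request waits for a later block.
    Pending,
}

struct State {
    /// Outputs created by committed blocks.
    committed: HashMap<OutPoint, u64>,
    /// Outputs created by blocks that have arrived but are not yet committed.
    queued: Vec<HashMap<OutPoint, u64>>,
}

impl State {
    /// Checks committed blocks first, then queued blocks, before waiting.
    fn lookup(&self, outpoint: &OutPoint) -> UtxoLookup {
        if let Some(value) = self.committed.get(outpoint) {
            return UtxoLookup::Found(*value);
        }
        for block_outputs in &self.queued {
            if let Some(value) = block_outputs.get(outpoint) {
                return UtxoLookup::Found(*value);
            }
        }
        UtxoLookup::Pending
    }
}
```

Skipping the loop over `queued` reproduces the bug: an output in an already-arrived block would be reported as pending and never resolved.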

This patch fixes the problem but is a bad solution, duplicating
computation between the block verifier and the state.  A better fix
follows in the next commit.
2020-11-23 14:16:39 +10:00
Henry de Valence 719a48ad9e state: shorten tracing messages
Make tracing messages more concise by omitting information already
contained in a parent span and by shortening messages.  This makes them
easier to read.
2020-11-23 14:16:39 +10:00
Henry de Valence 3192a5008d state: add additional traces to block commit logic 2020-11-23 14:16:39 +10:00
Henry de Valence 36cd76d590 state: tidy process_queued tracing
Previously, this function was instrumented with a span containing the
parent hash that was the entry to the function.  But it doesn't make
sense to consider the work done by the function as happening in the
context of the supplied parent hash (as distinct from the context of the
hash of the newly arrived block, which is already contained in an outer
span), so this adds noise without conveying extra context.

Instead, use events that occur within the context of the existing spans.
2020-11-23 14:16:39 +10:00