The previous outbound peer connection logic got requests to connect to new
peers and processed them one at a time, making single connection attempts
and retrying if the connection attempt failed. This was quite slow, because
many connections fail, and we have to wait for timeouts. Instead, this logic
connects to new peers concurrently (up to 50 at a time).
Because we represent each transaction version as a different variant of the
Transaction enum, we end up in a situation where fields that are common to
different transaction versions are awkward to access, requiring a match
statement with identical match arms.
To fix this, this commit adds the following convenience methods:
* `Transaction::inputs() -> impl Iterator<Item=&TransparentInput>`;
* `Transaction::outputs() -> impl Iterator<Item=&TransparentOutput>`;
* `Transaction::lock_time() -> LockTime`;
* `Transaction::expiry_height() -> Option<ExpiryHeight>`;
The last returns an `Option` because the field is only present in V3 and V4
transactions.
There are some remaining fields that do not get common accessors, because it
probably doesn't make sense to access independently of knowing the transaction
version: `joinsplit_data`, `shielded_data`, `value_balance`.
Bitcoin does this either with `getblocks` (returns up to 500 following block
hashes) or `getheaders` (returns up to 2000 following block headers, not
just hashes). However, Bitcoin headers are much smaller than Zcash
headers, which contain a giant Equihash solution block, and many Zcash
blocks don't have many transactions in them, so the block header is
often similarly sized to the block itself. Because we're
aiming to have a highly parallel network layer, it seems better to use
`getblocks` to implement `FindBlocks` (which is necessarily sequential)
and parallelize the processing of the block downloads.
BIP34, which is included in Zcash, encodes the block height into each
block by adding it into the unused BitcoinScript field of the block's
coinbase transaction. However, this is done just by requiring that the
script pushes the block height onto the stack when it executes, and
there are multiple different ways to push data onto the stack in
BitcoinScript. Also, the genesis block does not include the block
height, by accident.
Because we want to *parse* transactions into an algebraic data type that
encodes their structural properties, rather than allow possibly-invalid
data to float through the internals of our node, we want to extract the
block height upfront and store it separately from the rest of the
coinbase data, which is inert. So the serialization code now contains
just enough logic to parse BitcoinScript-encoded block heights, and
special-case the encoding of the genesis block.
Elsewhere in the source code, the `LockTime` struct requires that we
must use block heights less than 500,000,000 (above which the number is
interpreted as a unix timestamp, not a height). To unify invariants, we
ensure that the parsing logic works with block heights up to
500,000,000, even though these are unlikely to ever be used for Zcash.
Coinbase inputs are handled differently from other inputs and have
different consensus rules, so they should be represented differently in
the source code. This lets us discard extraneous details (for instance,
it's not necessary to maintain the all-zero hash) and specialize logic.
This doesn't clean the warnings about unused items in the builder, since
those are unused for a reason (the implementation that should use them
is missing).
PushPeers is more complicated to thread into the rest of our
architecture (we would need to establish a data path connecting our
service handling inbound requests to the network layer's auto-crawler),
and since we crawl the network automatically anyways, we don't actually
need to accept them in order to get updated address information.
The only possible problem with this approach is that zcashd refuses to
answer multiple address requests from the same connection, ostensibly
for fingerprinting prevention (although it's totally happy to give
exactly the same information, as long as you hang up and reconnect
first, lol). It's unclear how this will interact with our design -- on
the one hand, it could mean that we don't get new addr information when
we ask, but on the other hand, we may have enough churn in our
connection pool that this isn't a problem anyways.
Attempting to implement requests for block data revealed a problem with
the previous connection logic. Block data is requested by sending a
`getdata` message with hashes of the requested blocks; the peer responds
with a sequence of `block` messages with the blocks themselves.
However, this wasn't possible to handle with the previous connection
logic, which could only convert a single Bitcoin message into a
Response. Instead, we factor out the message handling logic into a
Handler, which can statefully accumulate arbitrary data into a Response
and signal completion. This is still pretty ugly but it does work.
As a side effect, the HeartbeatNonceMismatch error is removed; because
the Handler now tries to process messages until it comes to a Response,
it just ignores mismatched nonces (and will eventually time out).
The previous Mempool and Transaction requests were removed but could be
re-added in a different form later. Also, the `Get` prefixes are
removed from `Request` to tidy the name.
Closes#158.
As discussed on the issue, this makes it possible to safely serialize
data into hashes, and encourages serializable data to make illegal
states unrepresentable.