1722 Commits

Author SHA1 Message Date
Henry de Valence 96ee32e5d2 consensus: fix bug in coinbase joinsplit/spend check
This function caused spurious "WrongVersion" errors, because the match
pattern in the first arm was non-exhaustive, but the fallthrough match
arm was present and assumed it would only be reached if the version was
incorrect.

This commit cleans up the implementation, splits out the error variants,
and renames the check to be more precise.

To avoid this kind of bug in the future, two guidelines are useful:

1. Avoid fallthrough cases that circumvent non-exhaustive match checks;
2. Avoid nested conditionals, preferring a "straight-line" sequence of
   match arm => result pairs rather than nested matches or matches with
   conditionals inside.
2020-11-21 14:09:15 -05:00
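
The two guidelines in the commit message above (96ee32e5d2) can be illustrated with a minimal sketch. The `Tx` and `CheckError` types and the function name below are hypothetical stand-ins, not the actual Zebra definitions. The point is only the shape of the match: every arm maps a pattern directly to a result, and there is no `_` arm, so the compiler rejects the code if a variant is left unhandled.

```rust
/// Hypothetical error variants, split out so each failure is distinguishable.
#[derive(Debug, PartialEq)]
enum CheckError {
    CoinbaseHasJoinSplit,
    CoinbaseHasSpend,
    WrongVersion,
}

/// Hypothetical, heavily simplified transaction type (an assumption for this sketch).
enum Tx {
    V3,
    V4 {
        is_coinbase: bool,
        has_joinsplit: bool,
        has_spend: bool,
    },
}

/// Straight-line sequence of `pattern => result` pairs.
/// There is no wildcard fallthrough arm and no conditional nested inside an arm.
fn coinbase_has_no_joinsplit_or_spend(tx: &Tx) -> Result<(), CheckError> {
    match tx {
        Tx::V4 { is_coinbase: true, has_joinsplit: true, .. } => {
            Err(CheckError::CoinbaseHasJoinSplit)
        }
        Tx::V4 { is_coinbase: true, has_spend: true, .. } => Err(CheckError::CoinbaseHasSpend),
        // Every other V4 transaction passes this particular check.
        Tx::V4 { .. } => Ok(()),
        // The version error is reached only by naming the other version explicitly.
        Tx::V3 => Err(CheckError::WrongVersion),
    }
}
```

If a new variant were added to `Tx` in this sketch, the match would fail to compile until it was handled, instead of silently reporting `WrongVersion`.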
Henry de Valence b116cfcd76 consensus: add debug event on wrong version check
Adding this check reveals that the WrongVersion errors aren't coming
from the correct WrongVersion check.
2020-11-21 14:09:15 -05:00
Henry de Valence 25fd52be51 chain: tidy Debug for Amount
This avoids printing a bunch of PhantomData.
2020-11-21 14:09:15 -05:00
Henry de Valence b5515123eb chain: add custom Debug for CoinbaseData
The derived Debug impl just shows u8s as numbers, which isn't what we
want.  There are basically two reasonable options here:

1. Hex-encoded bytes
2. Escaped ASCII

I picked (2) because a lot of coinbase data has ASCII text in it.
2020-11-21 14:09:15 -05:00
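
A minimal sketch of option (2) from the commit above (b5515123eb), assuming `CoinbaseData` is a thin wrapper around a byte vector. The struct here is a hypothetical stand-in and the exact output format is an assumption, not the format the commit produces. It uses the standard library's `std::ascii::escape_default`, which leaves printable ASCII alone and turns other bytes into `\xNN` escapes.

```rust
use std::fmt;

/// Hypothetical stand-in for the real type: arbitrary bytes from a coinbase input.
struct CoinbaseData(Vec<u8>);

impl fmt::Debug for CoinbaseData {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Escape each byte: printable ASCII stays readable, everything else
        // becomes a `\xNN` (or `\n`, `\t`, ...) escape sequence.
        let escaped: String = self
            .0
            .iter()
            .flat_map(|&byte| std::ascii::escape_default(byte))
            .map(char::from)
            .collect();
        write!(f, "CoinbaseData(\"{}\")", escaped)
    }
}
```

With this impl, formatting `CoinbaseData(b"Zebra\x00".to_vec())` with `{:?}` shows the text `Zebra` followed by an escaped zero byte, rather than a list of decimal `u8` values.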
Henry de Valence d1ee7f263a consensus: add debug span to TransactionVerifier 2020-11-21 14:09:15 -05:00
Henry de Valence 2e4f4d8e87 consensus: fix span handling in BlockVerifier
The BlockVerifier constructed a tracing span and manually entered it
inside of an async block.  Manually entering spans inside async blocks
can cause problems where the span might not be entered and exited
correctly as the resulting future is polled.  Instead, using
`.instrument` creates a wrapper future that handles the bookkeeping.

I changed the span name and contents to be consistent with the spans in
the checkpoint verifier.
2020-11-21 14:09:15 -05:00
Deirdre Connolly 0b6a61c9e8 gcloud build is still the only required check for PR merge, run tests in release profile 2020-11-21 05:40:25 -05:00
Deirdre Connolly 902c6f6b29 Remove test attributes and allow(dead_code) for command timeout tests that exercise currently broken properties 2020-11-21 05:40:25 -05:00
Deirdre Connolly 558661a531 Remove test attributes and allow(dead_code) for test code that tests currently unimplemented functionality 2020-11-21 05:40:25 -05:00
Deirdre Connolly 036abd50ac Back to stable for test image 2020-11-21 05:40:25 -05:00
Deirdre Connolly 52296b96c7 Bump test job timeout to 45 minutes because Windows debug builds are taking a while 2020-11-21 05:40:25 -05:00
Deirdre Connolly 706c42de3e Filter broken command tests while including ignored otherwise 2020-11-21 05:40:25 -05:00
Henry de Valence 7dfea510d5 state: remove state_trace span
This turns out not to give much additional information when stacked with
child spans.
2020-11-20 15:28:46 -08:00
Henry de Valence bbd7a62b20 state: add service request count metrics
These are all one metric, with the type as an attribute, so that we can
display total requests, filter by a particular type, etc.
2020-11-20 17:38:21 -05:00
Henry de Valence 3bfe63e38f state: add span to state service
Here the span is added to the body of the `Service::call`
implementation, not to the futures it returns, because the state service
does all of the work synchronously in `call` rather than in the futures
it returns.

The service is skipped as a span field.  We could either include or
exclude the request itself.  It would be useful, but the request body
can be very large.  Instead, we make two spans, one at info level and
one at trace level, and filter that way.
2020-11-20 17:38:21 -05:00
Henry de Valence 04acc9da6c consensus: instrument script verification 2020-11-20 17:38:21 -05:00
teor d4da9609ee Update the max_concurrent_block_requests docs
In #1298, we decreased `max_concurrent_block_requests`,
but forgot to update the docs.
2020-11-20 10:08:57 -08:00
Henry de Valence faa9cbcade deps: bump tower to pick up auto-resize in Hedge
Picks up https://github.com/tower-rs/tower/pull/484
2020-11-20 10:08:16 -08:00
Henry de Valence ba3c19142c deps: update hyper, metrics to tokio 0.3
The metrics code becomes much simpler because the current version of the
metrics crate builds its own single-threaded runtime on a dedicated worker
thread, so no dependency on the main Zebra Tokio runtime is required.
2020-11-20 10:08:16 -08:00
Henry de Valence add94c1c45 deps: move to tokio 0.3, tower 0.4
This change is mostly mechanical, with the exception of the changes to the
`tower-batch` middleware.  This middleware was adapted from `tower::buffer`,
and the `tower::buffer` code was changed to implement its own bounded queue,
because Tokio 0.3 removed the `mpsc::Sender::poll_send` method.  See

ddc64e8d4d

for more context on the Tower changes.  To match Tower as closely as possible
in order to be able to upstream `tower-batch`, those changes are copied from
`tower::Buffer` to `tower-batch`.
2020-11-20 10:08:16 -08:00
teor ec00ee4cf0
Stop using /dev/shm on Linux (#1338)
Some systems have a very small /dev/shm, for example, see:
https://github.com/docker-library/postgres/issues/416

So we should just use the temporary directory on all operating systems.

Also:
* use TempDir to generate the temporary path
* delete the code that we copied from sled
* prefix the temporary path with the state version and network
2020-11-20 13:01:19 +10:00
Deirdre Connolly af5f3c1395 Bump down cores, running into default quotas 2020-11-19 19:47:38 -05:00
Deirdre Connolly 2b9819a190 Remove defunct memory_cache_bytes
It left with sled
2020-11-19 19:47:38 -05:00
Deirdre Connolly f6dc92a256 Correctly grep for instance group & region 2020-11-19 18:55:19 -05:00
Henry de Valence 06dd39df54
network: bump network version for Canopy (#1333)
Per https://zips.z.cash/zip-0251, nodes compatible with Canopy
activation on mainnet MUST advertise protocol version 170013 or later.

Once Canopy activates on testnet or mainnet, Canopy nodes SHOULD reject
new connections from pre-Canopy nodes, so this also increases the
minimum version.
2020-11-20 09:50:05 +10:00
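
The version rule in the commit above (06dd39df54) amounts to a single comparison. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes the minimum accepted peer version is raised to the same value the node advertises, which the commit message does not state explicitly.

```rust
/// Protocol version that Canopy-compatible nodes must advertise on mainnet,
/// per ZIP 251 as quoted in the commit message.
const CANOPY_PROTOCOL_VERSION: u32 = 170_013;

/// Hypothetical helper: decide whether to keep a new connection, assuming the
/// minimum accepted version equals `CANOPY_PROTOCOL_VERSION`.
fn accepts_peer(advertised_version: u32) -> bool {
    advertised_version >= CANOPY_PROTOCOL_VERSION
}
```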
Deirdre Connolly fb66c7ecdf Supply --image-project, return to N2 not N2D 2020-11-19 18:10:04 -05:00
Deirdre Connolly 623949bbaa Remove vestigial 'needs' 2020-11-19 16:50:08 -05:00
Deirdre Connolly e325775bf3 Specify region not just zone 2020-11-19 16:50:08 -05:00
Deirdre Connolly a317cc11c6 Install clang to build rocksdb dep 2020-11-19 16:20:36 -05:00
Deirdre Connolly 53d63d0514 Build this branch 2020-11-19 16:20:36 -05:00
Deirdre Connolly 938b6d6fdd Make the full test suite command explicit 2020-11-19 16:20:36 -05:00
Deirdre Connolly 44970af929 Split up big test job into its own workflow 2020-11-19 16:20:36 -05:00
Deirdre Connolly 2445d23dd8 Shell form CMD 2020-11-19 16:20:36 -05:00
Deirdre Connolly 1c49e57eba Escape single quotes passed as CMD args to cargo 2020-11-19 16:20:36 -05:00
Deirdre Connolly a23de13af9 Break up Dockerfile into (additional) test and build images 2020-11-19 16:20:36 -05:00
Jane Lusby 4c9bb87df2
zebra-state: replace sled with rocksdb (#1325)
## Motivation

Prior to this PR we've been using `sled` as our database for storing persistent chain data on the disk between boots. We picked sled over rocksdb to minimize our C++ dependencies, despite it being a less mature codebase. The theory was that if it worked well enough we'd prefer to have a pure Rust codebase, but if we ever ran into problems we knew we could easily swap it out with rocksdb.

Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, the performance for writes was pretty poor, causing us to become bottlenecked on sled instead of the network.

## Solution

This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain.

The code in this pull request has:
  - [x] Documentation Comments
  - [x] Unit Tests and Property Tests

## Review

@hdevalence
2020-11-18 18:05:06 -08:00
Jane Lusby 65a605520f remove references to sled from service.rs 2020-11-18 15:09:43 -05:00
Jane Lusby 5a6a9fd51e remove some references to sled in serialization definition module 2020-11-18 15:09:43 -05:00
Jane Lusby a122a547be reorganize modules for consistency 2020-11-18 15:09:43 -05:00
Henry de Valence 4953f21670 fixup! zebrad: hack to skip alreadyverified errors 2020-11-18 03:09:06 -05:00
dependabot[bot] 3edc1f7db4 build(deps): bump codecov/codecov-action from v1.0.14 to v1.0.15
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from v1.0.14 to v1.0.15.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Commits](https://github.com/codecov/codecov-action/compare/v1.0.14...239febf655bba88b16ff5dea1d3135ea8663a1f9)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-18 03:07:14 -05:00
Henry de Valence 608b3953af deps: cargo update 2020-11-17 14:56:27 -08:00
Henry de Valence d2fc01755b zebrad: more reasonable concurrent block limit
This helps prevent overloading the network with too many concurrent
block requests.  On a fast network, we're likely to still have enough
room to saturate our bandwidth.  In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads.  If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s.  In practice, this
minimum speed will likely be much lower.
2020-11-17 14:56:27 -08:00
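
The worst-case arithmetic in the commit above (d2fc01755b) can be written out as a short sketch. The constant and function names are hypothetical, and "2MB" is taken to mean 2,000,000 bytes for simplicity.

```rust
/// Figures quoted in the commit message; names are illustrative only.
const MAX_BLOCK_BYTES: u64 = 2_000_000; // "2MB blocks", read as decimal megabytes
const CONCURRENT_BLOCKS: u64 = 50;
const DOWNLOAD_WINDOW_SECS: u64 = 20;

/// Total bytes queued if every concurrent download is a maximum-size block.
const fn worst_case_queued_bytes(block_bytes: u64, concurrent: u64) -> u64 {
    block_bytes * concurrent
}

/// Minimum sustained speed needed to finish the whole queue within the window.
const fn min_bytes_per_sec(queued_bytes: u64, window_secs: u64) -> u64 {
    queued_bytes / window_secs
}
```

Plugging in the constants gives 50 × 2 MB = 100 MB queued, and 100 MB / 20 s = 5 MB/s, matching the figures in the message.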
Henry de Valence aa7538ab15 zebrad: hack to skip alreadyverified errors 2020-11-17 14:56:27 -08:00
Henry de Valence e55392b61e zebrad: explicitly select the threaded scheduler. 2020-11-17 14:56:27 -08:00
Henry de Valence 6de824bd99 zebrad: remove block verification timeout
Because we set the lookahead limit to be at least twice the size of a checkpoint, we don't have a risk of timeouts.
2020-11-17 14:56:27 -08:00
Henry de Valence e9c847bbd7 zebrad: avoid a borrow in the ChainSync future 2020-11-17 14:56:27 -08:00
Henry de Valence b632a24436 zebrad: add diagnostics on cancelled download tasks 2020-11-17 14:56:27 -08:00
Henry de Valence ec411574ee zebrad: improve sync diagnostics 2020-11-17 14:56:27 -08:00
Henry de Valence e0b2af7123 state: add sled tree precommit metrics on tracked objects 2020-11-17 14:56:27 -08:00