Commit Graph

1702 Commits

Author SHA1 Message Date
teor ec00ee4cf0
Stop using /dev/shm on Linux (#1338)
Some systems have a very small /dev/shm, for example, see:
https://github.com/docker-library/postgres/issues/416

So we should just use the temporary directory on all operating systems.

Also:
* use TempDir to generate the temporary path
* delete the code that we copied from sled
* prefix the temporary path with the state version and network
2020-11-20 13:01:19 +10:00
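As an illustration of the last bullet in the commit above, here is a minimal sketch of building a temporary-path prefix from the state version and network. The function names and the exact prefix format are assumptions for illustration and are not taken from the Zebra source. The sketch uses only the standard library; it does not create a unique directory, which a temporary-directory helper such as `TempDir` would normally do by appending a random suffix.

```rust
use std::path::PathBuf;

/// Hypothetical helper: build a directory-name prefix that records the state
/// version and network, so stale temporary states are easy to tell apart.
/// The "zebra-state-v{version}-{network}-" format is an assumption.
fn ephemeral_prefix(state_version: u32, network: &str) -> String {
    format!("zebra-state-v{}-{}-", state_version, network.to_lowercase())
}

/// Hypothetical helper: the parent directory for ephemeral state.
/// This is the operating system's temporary directory on every platform,
/// rather than a Linux-specific location such as /dev/shm.
fn ephemeral_parent_dir() -> PathBuf {
    std::env::temp_dir()
}
```

A caller would pass the prefix to whatever creates the unique directory inside `ephemeral_parent_dir()`.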
Deirdre Connolly af5f3c1395 Bump down cores, running into default quotas 2020-11-19 19:47:38 -05:00
Deirdre Connolly 2b9819a190 Remove defunct memory_cache_bytes
It left with sled
2020-11-19 19:47:38 -05:00
Deirdre Connolly f6dc92a256 Correctly grep for instance group & region 2020-11-19 18:55:19 -05:00
Henry de Valence 06dd39df54
network: bump network version for Canopy (#1333)
Per https://zips.z.cash/zip-0251, nodes compatible with Canopy
activation on mainnet MUST advertise protocol version 170013 or later.

Once Canopy activates on testnet or mainnet, Canopy nodes SHOULD reject
new connections from pre-Canopy nodes, so this also increases the
minimum version.
2020-11-20 09:50:05 +10:00
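The version rule described in the commit above can be sketched as a simple comparison. Only the value 170013 comes from the commit message; the constant and function names are hypothetical, and the minimum version the commit actually sets is not stated, so it is left as a parameter here.

```rust
/// Protocol version that Canopy-compatible mainnet nodes must advertise,
/// per the commit message's reading of ZIP-251.
const CANOPY_MAINNET_PROTOCOL_VERSION: u32 = 170_013;

/// Hypothetical helper: accept a peer only if its advertised version is at
/// least the configured minimum. `min_version` is an assumed parameter.
fn accepts_peer(peer_version: u32, min_version: u32) -> bool {
    peer_version >= min_version
}
```

With the minimum set to `CANOPY_MAINNET_PROTOCOL_VERSION`, a peer advertising an older version would be rejected, while one advertising 170013 or later would be accepted.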
Deirdre Connolly fb66c7ecdf Supply --image-project, return to N2 not N2D 2020-11-19 18:10:04 -05:00
Deirdre Connolly 623949bbaa Remove vestigial 'needs' 2020-11-19 16:50:08 -05:00
Deirdre Connolly e325775bf3 Specify region not just zone 2020-11-19 16:50:08 -05:00
Deirdre Connolly a317cc11c6 Install clang to build rocksdb dep 2020-11-19 16:20:36 -05:00
Deirdre Connolly 53d63d0514 Build this branch 2020-11-19 16:20:36 -05:00
Deirdre Connolly 938b6d6fdd Make the full test suite command explicit 2020-11-19 16:20:36 -05:00
Deirdre Connolly 44970af929 Split up big test job into its own workflow 2020-11-19 16:20:36 -05:00
Deirdre Connolly 2445d23dd8 Shell form CMD 2020-11-19 16:20:36 -05:00
Deirdre Connolly 1c49e57eba Escape single quotes passed as CMD args to cargo 2020-11-19 16:20:36 -05:00
Deirdre Connolly a23de13af9 Break up Dockerfile into (additional) test and build images 2020-11-19 16:20:36 -05:00
Jane Lusby 4c9bb87df2
zebra-state: replace sled with rocksdb (#1325)
## Motivation

Prior to this PR we've been using `sled` as our database for storing persistent chain data on disk between boots. We picked sled over rocksdb to minimize our C++ dependencies, despite it being a less mature codebase. The theory was that if it worked well enough we'd prefer to have a pure Rust codebase, but if we ever ran into problems we knew we could easily swap it out for rocksdb.

Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, write performance was pretty poor, causing us to become bottlenecked on sled instead of the network.

## Solution

This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change, writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain.

The code in this pull request has:
  - [x] Documentation Comments
  - [x] Unit Tests and Property Tests

## Review

@hdevalence
2020-11-18 18:05:06 -08:00
Jane Lusby 65a605520f remove references to sled from service.rs 2020-11-18 15:09:43 -05:00
Jane Lusby 5a6a9fd51e remove some references to sled in serialization definition module 2020-11-18 15:09:43 -05:00
Jane Lusby a122a547be reorganize modules for consistency 2020-11-18 15:09:43 -05:00
Henry de Valence 4953f21670 fixup! zebrad: hack to skip alreadyverified errors 2020-11-18 03:09:06 -05:00
dependabot[bot] 3edc1f7db4 build(deps): bump codecov/codecov-action from v1.0.14 to v1.0.15
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from v1.0.14 to v1.0.15.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Commits](https://github.com/codecov/codecov-action/compare/v1.0.14...239febf655bba88b16ff5dea1d3135ea8663a1f9)

Signed-off-by: dependabot[bot] <support@github.com>
2020-11-18 03:07:14 -05:00
Henry de Valence 608b3953af deps: cargo update 2020-11-17 14:56:27 -08:00
Henry de Valence d2fc01755b zebrad: more reasonable concurrent block limit
This helps prevent overloading the network with too many concurrent
block requests.  On a fast network, we're likely to still have enough
room to saturate our bandwidth.  In the worst case, with 2MB blocks,
downloading 50 blocks concurrently is 100MB of queued downloads.  If we
need to download this in 20 seconds to avoid peer connection timeouts,
the implied worst-case minimum speed is 5MB/s.  In practice, this
minimum speed will likely be much lower.
2020-11-17 14:56:27 -08:00
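The worst-case arithmetic in the commit above can be checked with a short sketch. The function names are hypothetical, and the sketch assumes decimal megabytes (1 MB = 1,000,000 bytes); the block size, concurrency limit, and timeout are the figures quoted in the commit message.

```rust
/// Worst-case number of bytes queued when `concurrent_blocks` downloads of
/// maximum-size blocks are in flight at once.
fn worst_case_queued_bytes(max_block_bytes: u64, concurrent_blocks: u64) -> u64 {
    max_block_bytes * concurrent_blocks
}

/// Minimum throughput, in bytes per second (rounded up), needed to drain
/// `queued_bytes` before `timeout_secs` elapses.
/// Returns `None` for a zero timeout, where no finite speed is enough.
fn min_bytes_per_second(queued_bytes: u64, timeout_secs: u64) -> Option<u64> {
    if timeout_secs == 0 {
        return None;
    }
    // Ceiling division, so the result is never an underestimate.
    Some((queued_bytes + timeout_secs - 1) / timeout_secs)
}
```

With 2,000,000-byte blocks and 50 concurrent downloads, the queue is 100,000,000 bytes; draining it in 20 seconds needs 5,000,000 bytes per second, matching the 100MB and 5MB/s figures in the message.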
Henry de Valence aa7538ab15 zebrad: hack to skip alreadyverified errors 2020-11-17 14:56:27 -08:00
Henry de Valence e55392b61e zebrad: explicitly select the threaded scheduler. 2020-11-17 14:56:27 -08:00
Henry de Valence 6de824bd99 zebrad: remove block verification timeout
Because we set the lookahead limit to be at least twice the size of a checkpoint, we don't have a risk of timeouts.
2020-11-17 14:56:27 -08:00
Henry de Valence e9c847bbd7 zebrad: avoid a borrow in the ChainSync future 2020-11-17 14:56:27 -08:00
Henry de Valence b632a24436 zebrad: add diagnostics on cancelled download tasks 2020-11-17 14:56:27 -08:00
Henry de Valence ec411574ee zebrad: improve sync diagnostics 2020-11-17 14:56:27 -08:00
Henry de Valence e0b2af7123 state: add sled tree precommit metrics on tracked objects 2020-11-17 14:56:27 -08:00
Henry de Valence aa8d95bd23 consensus: improve checkpoint request replacement diagnostics 2020-11-17 14:56:27 -08:00
Henry de Valence a3ab589d89 consensus,state: document cancellation contracts for services
This change explicitly documents cancellation contracts for our Tower services,
and tries to correct a bug in the implementation of the CheckpointVerifier,
which duplicates information from the state service but did not ensure that it
would be kept in sync.
2020-11-17 14:56:27 -08:00
Henry de Valence d5d17a9a71 consensus: remove incorrect comment
The ZcashDeserialize implementation for Block doesn't check that blocks
have a coinbase height.
2020-11-17 14:56:27 -08:00
teor 2f53ff44f7 Move chain order assertions to commit_finalized_direct
And remove a duplicate assert in the contextual verification function.
2020-11-17 13:16:31 +10:00
Deirdre Connolly 40b012acef Add mdbook stuff to path using environment files/variables instead of workflow commands
Fixes #1309
2020-11-16 21:18:19 -05:00
teor d7d15984eb Move all contextual validation code into its own function
This change has two benefits:
* reduces conflicts with the sled refactor and any replacement
* allows the function to be called independently for testing
2020-11-17 11:46:57 +10:00
Alfredo Garcia c8e6f5843f
Update RFC template (#1278)
* update rfc template
* change pull to issues
2020-11-17 11:10:21 +10:00
teor cfe779db69 Add an info-level span to check_contextual_validity 2020-11-17 10:07:37 +10:00
teor d80a0c7402 Stop panicking during contextual validation
`check_contextual_validity` mistakenly used the new block's hash to try
to get the parent block from the state. This caused a panic, because the
new block isn't in the state yet.

Use `StateService::chain` to get the parent block, because we'll be
using `chain` for difficulty adjustment contextual verification anyway.
2020-11-17 10:07:37 +10:00
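The bug described in the commit above can be illustrated with a simplified sketch. The `Block` type, its fields, and the `HashMap`-backed state are assumptions made for illustration; the actual fix uses `StateService::chain`, which is not reproduced here. The point is only that the parent must be looked up by the new block's previous-block hash, because the new block's own hash is not in the state yet.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified block: hashes are plain integers here.
#[derive(Debug)]
struct Block {
    hash: u64,
    previous_block_hash: u64,
    height: u32,
}

/// Look up the parent of `new_block` in the state.
/// Using `new_block.hash` as the key would always miss, since the new block
/// has not been committed yet; the parent is keyed by `previous_block_hash`.
fn parent_of<'a>(state: &'a HashMap<u64, Block>, new_block: &Block) -> Option<&'a Block> {
    state.get(&new_block.previous_block_hash)
}
```

Returning an `Option` instead of unwrapping the lookup also lets the caller report a missing parent as an error rather than panicking.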
teor 54cb9277ef Allow some new clippy nightly lints 2020-11-17 10:07:37 +10:00
Jane Lusby a6bd77e98a
Add check to ensure heights in state service are sequential (#1290)
* Add check to ensure heights in state service are sequential

Co-authored-by: teor <teor@riseup.net>
2020-11-17 09:53:33 +10:00
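A sequential-height check of the kind named in the commit above might look like the following sketch. The function name is hypothetical, and treating height 0 as the only valid height for an empty state is an assumption; the commit does not show how the real check is written.

```rust
/// Hypothetical check: does `new_height` directly follow the current tip?
/// `tip_height` is `None` when the state is empty, in which case only the
/// genesis height (assumed to be 0) is accepted.
fn is_next_height(tip_height: Option<u32>, new_height: u32) -> bool {
    match tip_height {
        None => new_height == 0,
        // `checked_sub` avoids underflow when `new_height` is 0.
        Some(tip) => new_height.checked_sub(1) == Some(tip),
    }
}
```

A caller could assert on this result before committing a block, so that gaps or repeated heights are caught immediately.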
Jane Lusby 4c2b44be93
Add tests for QueuedBlocks (#1268)
* Add unit test for QueuedBlocks
* Add test for pruned blocks
2020-11-17 09:31:22 +10:00
teor 2253ab3c00 Improve state request docs
Document best and any chain requests
Explain that the block locator is sparse
2020-11-17 07:52:53 +10:00
teor ca4e792f47 Put messages in request/response order
And fix a comment typo
2020-11-17 07:52:53 +10:00
Jane Lusby 57637560b9
Add internal iterator API for accessing relevant chain blocks (#1271)
* Add internal iterator API for accessing relevant chain blocks
* get blocks from all chains in non_finalized state
* Impl FusedIterator for service::Iter
* impl ExactSizeIterator for service::Iter
* let size_hint find heights in side chains

Co-authored-by: teor <teor@riseup.net>
2020-11-16 12:22:53 +10:00
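To show what implementing `FusedIterator` and `ExactSizeIterator` involves, here is a minimal sketch of an iterator that walks from a tip block back to genesis. The `AncestorIter` type and its slice-backed chain are assumptions for illustration; the real `service::Iter` also covers side chains in the non-finalized state, which this single-chain sketch does not.

```rust
use std::iter::FusedIterator;

/// Hypothetical iterator over a single chain stored in height order,
/// yielding blocks from a starting index back down to index 0.
struct AncestorIter<'a> {
    chain: &'a [&'a str],
    /// Index of the next block to yield, or `None` once exhausted.
    next_index: Option<usize>,
}

impl<'a> AncestorIter<'a> {
    /// Start at `tip_index`; an out-of-range index gives an empty iterator.
    fn new(chain: &'a [&'a str], tip_index: usize) -> Self {
        let next_index = if tip_index < chain.len() { Some(tip_index) } else { None };
        Self { chain, next_index }
    }
}

impl<'a> Iterator for AncestorIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        let index = self.next_index?;
        // Stepping below index 0 leaves `None`, which ends the iteration.
        self.next_index = index.checked_sub(1);
        Some(self.chain[index])
    }

    /// Exact bounds: required for a correct `ExactSizeIterator`.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.next_index.map_or(0, |index| index + 1);
        (remaining, Some(remaining))
    }
}

// `size_hint` is exact, so `len()` is available.
impl ExactSizeIterator for AncestorIter<'_> {}
// Once `next_index` is `None` it stays `None`, so the iterator is fused.
impl FusedIterator for AncestorIter<'_> {}
```

Both marker traits have no required methods here; what they need is that `size_hint` is exact and that `next` keeps returning `None` after the first `None`.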
Deirdre Connolly 2b8d696221
Better naming workflows (#1301)
* Better workflow names

* We run zebrad
2020-11-15 01:17:46 -05:00
Deirdre Connolly bb99a5aa2a Linewrap 2020-11-14 23:19:31 -05:00
Deirdre Connolly 87d749ee4f I love it when capitalization matters, contrary to the docs 2020-11-14 22:31:38 -05:00
Deirdre Connolly 477eac7f19 Properly use Dockerfile ARG values 2020-11-14 22:31:38 -05:00
Deirdre Connolly 7023465d91 Pass in workflow inputs for network and checkpoint_sync (with defaults) all the way down 2020-11-14 22:31:38 -05:00