Update state RFC for incremental trees, value pools, and RocksDB (#2456)

* Update state RFC for double-spends and other recent designs

* Update the value pool column family name

* Mark incremental note commitment trees as tentative

* Change history tree type

* Apply suggestions from code review
teor 2021-07-12 23:33:02 +10:00 committed by GitHub
parent f7026d728f
commit ccf93cf5c6
1 changed files with 62 additions and 12 deletions


@ -600,23 +600,60 @@ order on byte strings is the numeric ordering).
We use the following rocksdb column families:

Before this change:

| Tree                 | Keys                  | Values                              |
|----------------------|-----------------------|-------------------------------------|
| `hash_by_height`     | `BE32(height)`        | `block::Hash`                       |
| `height_by_hash`     | `block::Hash`         | `BE32(height)`                      |
| `block_by_height`    | `BE32(height)`        | `Block`                             |
| `tx_by_hash`         | `transaction::Hash`   | `(BE32(height) \|\| BE32(tx_index))`|
| `utxo_by_outpoint`   | `OutPoint`            | `TransparentOutput`                 |
| `sprout_nullifiers`  | `sprout::Nullifier`   | `()`                                |
| `sapling_nullifiers` | `sapling::Nullifier`  | `()`                                |
| `orchard_nullifiers` | `orchard::Nullifier`  | `()`                                |
| `sprout_anchors`     | `sprout::tree::Root`  | `()`                                |
| `sapling_anchors`    | `sapling::tree::Root` | `()`                                |

After this change:

| Column Family          | Keys                  | Values                               | Updates |
|------------------------|-----------------------|--------------------------------------|---------|
| `hash_by_height`       | `BE32(height)`        | `block::Hash`                        | Never   |
| `height_by_hash`       | `block::Hash`         | `BE32(height)`                       | Never   |
| `block_by_height`      | `BE32(height)`        | `Block`                              | Never   |
| `tx_by_hash`           | `transaction::Hash`   | `(BE32(height) \|\| BE32(tx_index))` | Never   |
| `utxo_by_outpoint`     | `OutPoint`            | `transparent::Output`                | Delete  |
| `sprout_nullifiers`    | `sprout::Nullifier`   | `()`                                 | Never   |
| `sapling_nullifiers`   | `sapling::Nullifier`  | `()`                                 | Never   |
| `orchard_nullifiers`   | `orchard::Nullifier`  | `()`                                 | Never   |
| `sprout_anchors`       | `sprout::tree::Root`  | `()`                                 | Never   |
| `sprout_incremental`   | `BE32(height)` *?*    | `sprout::tree::NoteCommitmentTree`   | Delete  |
| `sapling_anchors`      | `sapling::tree::Root` | `()`                                 | Never   |
| `sapling_incremental`  | `BE32(height)` *?*    | `sapling::tree::NoteCommitmentTree`  | Delete  |
| `orchard_anchors`      | `orchard::tree::Root` | `()`                                 | Never   |
| `orchard_incremental`  | `BE32(height)` *?*    | `orchard::tree::NoteCommitmentTree`  | Delete  |
| `history_incremental`  | `BE32(height)`        | `zcash_history::Entry`               | Delete  |
| `tip_chain_value_pool` | `BE32(height)`        | `ValueBalance<NonNegative>`          | Delete  |

Zcash structures are encoded using `ZcashSerialize`/`ZcashDeserialize`.
Other structures are encoded using `IntoDisk`/`FromDisk`.
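
The `BE32(height)` keys above depend on big-endian encoding: RocksDB's default comparator orders keys byte-wise, so big-endian heights sort in numeric order. A minimal sketch of that encoding follows. The helper names `be32`, `tx_location`, and `order_is_preserved` are hypothetical, chosen for illustration; they are not the RFC's `IntoDisk` implementations.

```rust
/// Hypothetical helper: `BE32(value)`, a `u32` as 4 big-endian bytes.
fn be32(value: u32) -> [u8; 4] {
    value.to_be_bytes()
}

/// Hypothetical helper: `BE32(height) || BE32(tx_index)`,
/// the value format listed for `tx_by_hash`.
fn tx_location(height: u32, tx_index: u32) -> [u8; 8] {
    let mut bytes = [0u8; 8];
    bytes[..4].copy_from_slice(&be32(height));
    bytes[4..].copy_from_slice(&be32(tx_index));
    bytes
}

/// Returns true if the byte-wise (lexicographic) order of the encoded keys
/// agrees with the numeric order of the original values.
fn order_is_preserved(a: u32, b: u32) -> bool {
    a.cmp(&b) == be32(a).cmp(&be32(b))
}
```

A little-endian encoding would not have this property: `255u32.to_le_bytes()` sorts after `256u32.to_le_bytes()` byte-wise, which would break height-ordered iteration.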
**Note:** We do not store the cumulative work for the finalized chain, because the finalized work is equal for all non-finalized chains. So the additional non-finalized work can be used to calculate the relative chain order, and choose the best chain.
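
The reasoning in this note can be sketched as follows: adding the same finalized work to every chain does not change how the chains compare, so comparing only the non-finalized work gives the same order. The type `NonFinalizedWork` and both functions are hypothetical names used for illustration, and tie-breaking between chains with equal work is not handled here.

```rust
/// Hypothetical type: the total work of the non-finalized blocks in one chain.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct NonFinalizedWork(u128);

/// Returns the index of the chain with the most non-finalized work,
/// or `None` if there are no chains.
fn best_chain(chains: &[NonFinalizedWork]) -> Option<usize> {
    chains
        .iter()
        .enumerate()
        .max_by_key(|(_, work)| **work)
        .map(|(index, _)| index)
}

/// Checks that adding the shared finalized work to both chains
/// leaves their relative order unchanged.
fn same_order_with_finalized_work(
    finalized: u128,
    a: NonFinalizedWork,
    b: NonFinalizedWork,
) -> bool {
    (finalized + a.0).cmp(&(finalized + b.0)) == a.cmp(&b)
}
```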
### Implementing consensus rules using rocksdb
[rocksdb-consensus-rules]: #rocksdb-consensus-rules
Each column family handles updates differently, based on its specific consensus rules:
- Never: Keys are never deleted, values are never updated. The value for each key is inserted once.
- Delete: Keys can be deleted, but values are never updated. The value for each key is inserted once.
  - TODO: should we prevent re-inserts of keys that have been deleted?
- Update: Keys are never deleted, but values can be updated.
Currently, there are no column families that both delete and update keys.
RocksDB ignores duplicate puts and deletes, preserving the latest values.
If rejecting duplicate puts or deletes is consensus-critical,
check [`db.get_cf(cf, key)?`](https://docs.rs/rocksdb/0.16.0/rocksdb/struct.DBWithThreadMode.html#method.get_cf)
before putting or deleting any values in a batch.
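
The check-before-write pattern described above might look like the sketch below. To keep it self-contained, a `BTreeMap` stands in for a column family; that stand-in, the `WriteError` type, and the function names are all assumptions for illustration. A real implementation would read through `db.get_cf(cf, key)?` rather than an in-memory map.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for one column family (an assumption for this sketch;
/// the real check would use `db.get_cf(cf, key)?`).
type ColumnFamily = BTreeMap<Vec<u8>, Vec<u8>>;

/// Hypothetical error type for rejected writes.
#[derive(Debug, PartialEq, Eq)]
enum WriteError {
    DuplicateInsert,
    MissingKey,
}

/// Inserts `value` only if `key` is absent, instead of silently overwriting.
fn checked_insert(cf: &mut ColumnFamily, key: &[u8], value: &[u8]) -> Result<(), WriteError> {
    if cf.contains_key(key) {
        return Err(WriteError::DuplicateInsert);
    }
    cf.insert(key.to_vec(), value.to_vec());
    Ok(())
}

/// Deletes `key` only if it is present, instead of silently ignoring the delete.
fn checked_delete(cf: &mut ColumnFamily, key: &[u8]) -> Result<(), WriteError> {
    match cf.remove(key) {
        Some(_) => Ok(()),
        None => Err(WriteError::MissingKey),
    }
}
```

For example, inserting the same nullifier key twice returns `Err(WriteError::DuplicateInsert)` on the second call, where a plain put would have succeeded silently.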
Currently, these restrictions should be enforced by code review:
- multiple `zs_insert`s are only allowed on Update column families, and
- [`delete_cf`](https://docs.rs/rocksdb/0.16.0/rocksdb/struct.WriteBatch.html#method.delete_cf)
is only allowed on Delete column families.
In future, we could enforce these restrictions by:
- creating traits for Never, Delete, and Update
- doing different checks in `zs_insert` depending on the trait
- wrapping `delete_cf` in a trait, and only implementing that trait for types that use Delete column families.
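
One possible shape for this trait-based enforcement is sketched below. It is a variant of the idea, not a design from the RFC: the marker types, the `InsertPolicy` and `AllowsDelete` traits, and `TypedColumnFamily` are hypothetical, and a `BTreeMap` stands in for the database. The sketch also leaves the re-insert TODO open, because a deleted key can be inserted again.

```rust
use std::collections::BTreeMap;
use std::marker::PhantomData;

/// Marker types for the three update policies (hypothetical names).
struct Never;
struct Delete;
struct Update;

/// Decides whether `zs_insert` may overwrite an existing value.
trait InsertPolicy {
    const ALLOW_OVERWRITE: bool;
}
impl InsertPolicy for Never {
    const ALLOW_OVERWRITE: bool = false;
}
impl InsertPolicy for Delete {
    const ALLOW_OVERWRITE: bool = false;
}
impl InsertPolicy for Update {
    const ALLOW_OVERWRITE: bool = true;
}

/// Only implemented for policies whose keys may be deleted.
trait AllowsDelete {}
impl AllowsDelete for Delete {}

/// In-memory stand-in for a column family, tagged with its policy `P`.
struct TypedColumnFamily<P> {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
    policy: PhantomData<P>,
}

impl<P: InsertPolicy> TypedColumnFamily<P> {
    fn new() -> Self {
        Self { entries: BTreeMap::new(), policy: PhantomData }
    }

    /// Policy-aware insert: rejects duplicate keys unless the policy is `Update`.
    fn zs_insert(&mut self, key: &[u8], value: &[u8]) -> Result<(), &'static str> {
        if !P::ALLOW_OVERWRITE && self.entries.contains_key(key) {
            return Err("duplicate insert");
        }
        self.entries.insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.entries.get(key).map(|value| value.as_slice())
    }
}

impl<P: AllowsDelete> TypedColumnFamily<P> {
    /// Only available on `Delete` column families; returns true if the key existed.
    fn delete(&mut self, key: &[u8]) -> bool {
        self.entries.remove(key).is_some()
    }
}
```

With this shape, calling `delete` on a `TypedColumnFamily<Never>` fails to compile, so the restriction no longer depends on code review alone.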
As of June 2021, the Rust `rocksdb` crate [ignores the delete callback](https://docs.rs/rocksdb/0.16.0/src/rocksdb/merge_operator.rs.html#83-94),
and merge operators are unreliable (or have undocumented behaviour).
So they should not be used for consensus-critical checks.
### Notes on rocksdb column families
- The `hash_by_height` and `height_by_hash` column families provide a bijection between
@ -641,6 +678,19 @@ Zcash structures are encoded using `ZcashSerialize`/`ZcashDeserialize`.
block. This would more traditionally be a `(hash, index)` pair, but because
we store blocks by height, storing the height saves one level of indirection.
- Each incremental tree consists of nodes for a small number of peaks.
Peaks are written once, then deleted when they are no longer required.
New incremental tree nodes can be added each time the finalized tip changes,
and unused nodes can be deleted.
We only keep the nodes needed for the incremental tree for the finalized tip.
TODO: update this description based on the incremental merkle tree code
- The history tree indexes its peaks using blocks since the last network upgrade.
But we map those peak indexes to heights, to make testing and debugging easier.
- The value pools are only stored for the finalized tip.
We index them by height to make testing and debugging easier.
## Committing finalized blocks
If the parent block is not committed, add the block to an internal queue for