zebra

About

Zebra is the Zcash Foundation's independent, consensus-compatible implementation of the Zcash protocol, currently under development. Please join us on Discord if you'd like to find out more or get involved!

Alpha Releases

We publish a new Zebra alpha release every few weeks.

The goals of the alpha release series are to:

participate in the Zcash network,
replicate the Zcash chain state,
implement the Zcash proof of work consensus rules, and
sync on Mainnet under excellent network conditions.

Currently, Zebra does not validate all the Zcash consensus rules. It may be unreliable on Testnet, and under less-than-perfect network conditions. See our current features and roadmap for details.

Getting Started

Building zebrad requires Rust, libclang, and a C++ compiler.

Detailed Build and Run Instructions

Install cargo and rustc.
- Using rustup installs the stable Rust toolchain, which zebrad targets.
Install Zebra's build dependencies:
- libclang: the libclang, libclang-dev, llvm, or llvm-dev packages, depending on your package manager
- clang or another C++ compiler: g++, Xcode, or MSVC
Run cargo install --locked --git https://github.com/ZcashFoundation/zebra --tag v1.0.0-alpha.18 zebrad
Run zebrad start

You are welcome to test zebrad, but keep in mind that a lot of key functionality is still missing.

Build Troubleshooting

If you're having trouble with:

dependencies:
- install both libclang and clang - they are usually different packages
- use cargo install without --locked to build with the latest versions of each dependency
libclang: check out the clang-sys documentation
g++ or MSVC++: try using clang or Xcode instead
rustc: use rustc 1.48 or later
- Zebra does not have a minimum supported Rust version (MSRV) policy yet

System Requirements

We usually build zebrad on systems with:

2+ CPU cores
7+ GB RAM
14+ GB of disk space

On many-core machines (for example, 32 cores) the build is very fast; on 2-core machines it is noticeably slower.

We continuously test that our builds and tests pass on:

Windows Server 2019
macOS Big Sur 11.0
Ubuntu 18.04 / the latest LTS
Debian Buster

We usually run zebrad on systems with:

4+ CPU cores
16+ GB RAM
50+ GB of available disk space for finalized state
100+ Mbps network connections

zebrad might build and run fine on smaller and slower systems - we haven't tested its exact limits yet.

Network Ports and Data Usage

By default, Zebra uses the following inbound TCP listener ports:

8233 on Mainnet
18233 on Testnet
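
The default ports above can be expressed as a small lookup. This is a minimal sketch: the `Network` enum, `default_port`, and `default_listen_addr` are hypothetical names chosen for illustration, and are not claimed to match Zebra's own configuration types.

```rust
use std::net::{Ipv4Addr, SocketAddr};

/// The two Zcash networks listed above (hypothetical type for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Network {
    Mainnet,
    Testnet,
}

impl Network {
    /// Default inbound TCP listener port for this network.
    pub fn default_port(self) -> u16 {
        match self {
            Network::Mainnet => 8233,
            Network::Testnet => 18233,
        }
    }
}

/// Builds a listener address on all IPv4 interfaces, using the default port.
pub fn default_listen_addr(network: Network) -> SocketAddr {
    SocketAddr::from((Ipv4Addr::UNSPECIFIED, network.default_port()))
}
```

For example, `default_listen_addr(Network::Testnet)` yields the address `0.0.0.0:18233`.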

If Zebra is configured with a specific listen_addr, it will advertise this address to other nodes for inbound connections.

Zebra makes outbound connections to peers on any port, but zcashd prefers peers on the default ports, so that it can't be used for DDoS attacks on other networks.

zebrad's typical network usage is:

initial sync: 30 GB download
ongoing updates: 10-50 MB upload and download per day, depending on peer requests

The major constraint we've found on zebrad performance is the network weather, especially the ability to make good connections to other Zcash network peers.

Current Features

Network:

synchronize the chain from peers
download gossiped blocks from peers
answer inbound peer requests for hashes, headers, and blocks

State:

persist block, transaction, UTXO, and nullifier indexes
handle chain reorganizations

Proof of Work:

validate equihash, block difficulty threshold, and difficulty adjustment
validate transaction merkle roots
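
As a rough illustration of a difficulty threshold check, the sketch below assumes the usual proof-of-work rule that a block hash, read as a 256-bit integer, must not exceed a target. The little-endian byte layout and the function names are assumptions made for this example; they are not Zebra's actual types or API.

```rust
use std::cmp::Ordering;

/// Compares two 256-bit integers stored as little-endian byte arrays.
fn cmp_le_256(a: &[u8; 32], b: &[u8; 32]) -> Ordering {
    // In little-endian order the most significant byte comes last,
    // so compare lexicographically starting from the end of each array.
    a.iter().rev().cmp(b.iter().rev())
}

/// Returns true if `hash` is less than or equal to `target`.
pub fn meets_difficulty_threshold(hash: &[u8; 32], target: &[u8; 32]) -> bool {
    cmp_le_256(hash, target) != Ordering::Greater
}
```

A hash equal to the target passes this check; only a strictly greater hash is rejected.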

Validating proof of work increases the cost of creating a consensus split between zebrad and zcashd.

This release also implements some other Zcash consensus rules, to check that Zebra's validation architecture supports future work on a full validating node:

block and transaction structure
checkpoint-based verification up to and including Canopy activation
transaction validation (incomplete)
transaction cryptography (incomplete)
transaction scripts (incomplete)
batch verification (incomplete)

Dependencies

Zebra primarily depends on pure Rust crates, along with some Rust/C++ crates.

Known Issues

There are a few bugs in Zebra that we're still working on fixing:

In rare cases, Zebra panics on shutdown #1678
- For examples, see #2055 and #2209
- These panics can be ignored, unless they happen frequently
Interrupt handler does not work when a blocking task is running #1351
- Zebra should eventually exit once the task finishes; alternatively, you can forcibly terminate the process.
Duplicate block errors #1372
- These errors can be ignored, unless they happen frequently

Zebra's state commits changes using database transactions. If you forcibly terminate it, or it panics, any incomplete changes will be rolled back the next time it starts.
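
The all-or-nothing behaviour described above can be pictured with a toy in-memory store. This is only an analogy under stated assumptions: `ToyState` and `commit_batch` are hypothetical, and a real database transaction works on disk rather than by cloning a map.

```rust
use std::collections::BTreeMap;

/// A toy store mapping block heights to block hashes (illustration only).
#[derive(Default)]
pub struct ToyState {
    committed: BTreeMap<u32, String>,
}

impl ToyState {
    /// Applies every write in the batch, or none of them.
    pub fn commit_batch(&mut self, writes: &[(u32, &str)]) -> Result<(), String> {
        // Stage the changes on a copy, so a failure leaves `committed` untouched.
        let mut staged = self.committed.clone();
        for (height, hash) in writes {
            if staged.insert(*height, (*hash).to_string()).is_some() {
                return Err(format!("height {} is already committed", height));
            }
        }
        // Only a fully successful batch replaces the committed data.
        self.committed = staged;
        Ok(())
    }

    /// Number of committed entries.
    pub fn committed_len(&self) -> usize {
        self.committed.len()
    }
}
```

If a batch fails partway through, the entries it staged before the failure are discarded along with the rest of the batch.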

Future Work

In 2021, we intend to finish validation, add RPC support, and add wallet integration. This phased approach allows us to test Zebra's independent implementation of the consensus rules, before asking users to entrust it with their funds.

Features:

full consensus rule validation
transaction mempool
wallet functionality
RPC functionality

Performance and Reliability:

reliable syncing on Testnet
reliable syncing under poor network conditions
batch verification
performance tuning

Documentation

The Zebra website contains user documentation, such as how to run or configure Zebra, set up metrics integrations, etc., as well as developer documentation, such as design documents. We also render API documentation for the external API of our crates, as well as internal documentation for private APIs.

Architecture

Unlike zcashd, which originated as a Bitcoin Core fork and inherited its monolithic architecture, Zebra has a modular, library-first design, with the intent that each component can be independently reused outside of the zebrad full node. For instance, the zebra-network crate containing the network stack can also be used to implement anonymous transaction relay, network crawlers, or other functionality, without requiring a full node.

At a high level, the full node functionality required by zebrad is factored into several components:

zebra-chain, providing definitions of core data structures for Zcash, such as blocks, transactions, addresses, etc., and related functionality. It also contains the implementation of the consensus-critical serialization formats used in Zcash. The data structures in zebra-chain are defined to enforce structural validity by making invalid states unrepresentable. For instance, the Transaction enum has variants for each transaction version, and it's impossible to construct a transaction with, e.g., spend or output descriptions but no binding signature, or, e.g., a version 2 (Sprout) transaction with Sapling proofs. Currently, zebra-chain is oriented towards verifying transactions, but will be extended to support creating them in the future.
zebra-network, providing an asynchronous, multithreaded implementation of the Zcash network protocol inherited from Bitcoin. In contrast to zcashd, each peer connection has a separate state machine, and the crate translates the external network protocol into a stateless, request/response-oriented protocol for internal use. The crate provides two interfaces:
- an auto-managed connection pool that load-balances local node requests over available peers, and sends peer requests to a local inbound service, and
- a connect_isolated method that produces a peer connection completely isolated from all other node state. This can be used, for instance, to safely relay data over Tor, without revealing distinguishing information.
zebra-script provides script validation. Currently, this is implemented by linking to the C++ script verification code from zcashd, but in the future we may implement a pure-Rust script implementation.
zebra-consensus performs semantic validation of blocks and transactions: all consensus rules that can be checked independently of the chain state, such as verification of signatures, proofs, and scripts. Internally, the library uses tower-batch to perform automatic, transparent batch processing of contemporaneous verification requests.
zebra-state is responsible for storing, updating, and querying the chain state. The state service is responsible for contextual verification: all consensus rules that check whether a new block is a valid extension of an existing chain, such as updating the nullifier set or checking that transaction inputs remain unspent.
zebrad contains the full node, which connects these components together and implements logic to handle inbound requests from peers and the chain sync process.
zebra-rpc and zebra-client will eventually contain the RPC and wallet functionality, but as mentioned above, our goal is to implement replication of chain state first before asking users to entrust Zebra with their funds.
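
To illustrate what "making invalid states unrepresentable" means for zebra-chain, here is a heavily simplified sketch. The field layout and placeholder types are assumptions for this example and do not reproduce the real `Transaction` definition.

```rust
// Placeholder types standing in for the real cryptographic data.
#[derive(Debug, Clone, PartialEq)]
pub struct JoinSplit;
#[derive(Debug, Clone, PartialEq)]
pub struct Spend;
#[derive(Debug, Clone, PartialEq)]
pub struct Output;
#[derive(Debug, Clone, PartialEq)]
pub struct BindingSignature;

/// Sapling descriptions always travel together with a binding signature,
/// so "descriptions without a signature" cannot be constructed.
#[derive(Debug, Clone, PartialEq)]
pub struct ShieldedData {
    pub spends: Vec<Spend>,
    pub outputs: Vec<Output>,
    pub binding_sig: BindingSignature,
}

/// One variant per transaction version (only two shown here).
#[derive(Debug, Clone, PartialEq)]
pub enum Transaction {
    /// Version 2 (Sprout): there is no field that could hold Sapling data.
    V2 { joinsplits: Vec<JoinSplit> },
    /// Version 4 (Sapling): shielded data is either absent or complete.
    V4 {
        joinsplits: Vec<JoinSplit>,
        shielded_data: Option<ShieldedData>,
    },
}

impl Transaction {
    /// Returns the binding signature, if this transaction has Sapling data.
    pub fn binding_sig(&self) -> Option<&BindingSignature> {
        match self {
            Transaction::V2 { .. } => None,
            Transaction::V4 { shielded_data, .. } => {
                shielded_data.as_ref().map(|data| &data.binding_sig)
            }
        }
    }
}
```

Because the compiler rejects any value that does not fit one of these shapes, checks such as "a version 2 transaction has no Sapling proofs" do not need to be written as runtime validation.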

All of these components can be reused as independent libraries, and all communication between stateful components is handled by an internal asynchronous RPC abstraction ("microservices in one process").
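
The request/response style of communication between components can be sketched as follows. The `Service` trait here is a simplified, synchronous, hypothetical stand-in for the asynchronous abstraction Zebra uses, and the nullifier requests are invented for illustration, loosely based on the contextual checks described for zebra-state.

```rust
use std::collections::HashSet;

/// A simplified, synchronous request/response interface (hypothetical).
pub trait Service<Request> {
    type Response;
    type Error;
    fn call(&mut self, request: Request) -> Result<Self::Response, Self::Error>;
}

/// Requests a caller can send to the toy state service.
pub enum StateRequest {
    /// Record a newly revealed nullifier.
    RevealNullifier([u8; 32]),
    /// Ask whether a nullifier has already been revealed.
    ContainsNullifier([u8; 32]),
}

#[derive(Debug, PartialEq)]
pub enum StateResponse {
    Revealed,
    Contains(bool),
}

/// Returned when a nullifier is revealed a second time.
#[derive(Debug, PartialEq)]
pub struct DuplicateNullifier;

/// Toy state service that owns the nullifier set.
#[derive(Default)]
pub struct StateService {
    nullifiers: HashSet<[u8; 32]>,
}

impl Service<StateRequest> for StateService {
    type Response = StateResponse;
    type Error = DuplicateNullifier;

    fn call(&mut self, request: StateRequest) -> Result<StateResponse, DuplicateNullifier> {
        match request {
            StateRequest::RevealNullifier(nullifier) => {
                // `insert` returns false if the value was already present.
                if self.nullifiers.insert(nullifier) {
                    Ok(StateResponse::Revealed)
                } else {
                    Err(DuplicateNullifier)
                }
            }
            StateRequest::ContainsNullifier(nullifier) => {
                Ok(StateResponse::Contains(self.nullifiers.contains(&nullifier)))
            }
        }
    }
}
```

Callers only see the request and response types, so the component that owns the state can change its internals without affecting them.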

Security

Zebra has a responsible disclosure policy, which we encourage security researchers to follow.

License

Zebra is distributed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT.