zebra

Commit Graph

Author	SHA1	Message	Date
teor	fb76eb2e6b	Add download and verify timeouts to the inbound service	2021-01-13 20:46:25 -05:00
teor	973aec8ccc	Refactor sync members into a consistent order And add comments about correctness and usage.	2021-01-13 20:46:25 -05:00
teor	c2893dce51	Warn when the user's configured lookahead limit is ignored	2021-01-13 20:46:25 -05:00
teor	3699bbdae6	Add some additional sync correctness constraints And adjust the sync restart delay as a consequence.	2021-01-13 20:46:25 -05:00
teor	cef0a492d8	Add a timeout to sync service block verification This timeout stops the sync service hanging when it is missing required blocks, but the lookahead queue is full of dependent verify tasks, so the missing blocks never get downloaded.	2021-01-13 20:46:25 -05:00
teor	730910cd99	Upgrade to tokio 0.3.6 from crates.io And remove the tokio git dependency patch	2021-01-12 15:37:27 -05:00
dependabot[bot]	e3e9990315	build(deps): bump inferno from 0.10.2 to 0.10.3 Bumps [inferno](https://github.com/jonhoo/inferno) from 0.10.2 to 0.10.3. - [Release notes](https://github.com/jonhoo/inferno/releases) - [Changelog](https://github.com/jonhoo/inferno/blob/master/CHANGELOG.md) - [Commits](https://github.com/jonhoo/inferno/compare/v0.10.2...v0.10.3) Signed-off-by: dependabot[bot] <support@github.com>	2021-01-12 00:53:49 -05:00
teor	caca450904	zebrad acceptance test cleanup (#1560) Check misconfigured ephemeral doesn't create a state dir Add extra misconfigured `zebrad` ephemeral mode checks: * doesn't create a state directory * doesn't create unexpected files or directories in the working directory Check ephemeral doesn't delete an existing state directory Refactor all the ephemeral configs and checks into a single test function. Also: * cleanup acceptance tests using utility functions * make some checks consistent between tests * make error messages consistent Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2021-01-12 12:26:08 +10:00
teor	c75cbdea79	Log configured network in every log message (#1568) * Add the configured network to error reports * Log the configured network at error level * Create the global span immediately after activating tracing And leak the span guard, so the span is always active. * Include panic metadata in the report and URL * Use `Main` and `Test` in the global span `net=Mainnet` is a bit redundant	2021-01-12 07:46:56 +10:00
teor	40c427cf7c	Make `cargo run` use `zebrad` by default When `cargo run` is run in the workspace directory, it can see two executables: - `zebrad` - `zebra_checkpoints` Adding `default-run = "zebrad"` to `zebrad/Cargo.toml` makes the workspace run `zebrad` by default. (Even though it's redundant for the `zebrad` crate itself.)	2021-01-08 15:14:08 -08:00
teor	b1f14f47c6	Rewrite GetData handling to match the zcashd implementation (#1518) * Rewrite GetData handling to match the zcashd implementation `zcashd` silently ignores missing blocks, but sends found transactions followed by a `NotFound` message: `e7b425298f/src/main.cpp (L5497)` This is significantly different to the behaviour expected by the old Zebra connection state machine, which expected `NotFound` for blocks. Also change Zebra's GetData responses to peer request so they ignore missing blocks. * Stop hanging on incomplete transaction or block responses Instead, if the peer sends an unexpected block, unexpected transaction, or NotFound message: 1. end the request, and return a partial response containing any items that were successfully received 2. if none of the expected blocks or transactions were received, return an error, and close the connection	2021-01-04 13:25:35 +10:00
Deirdre Connolly	f5b3412a50	Get PathBuf even if /zebrad-cache exists for create_cached_database_height()	2020-12-30 21:32:55 -05:00
teor	6cd7f255d4	Cleanup acceptance tests using utility functions (#1537) And add a missing network check.	2020-12-20 09:06:18 +10:00
teor	3355be4c41	Improve the random port function docs (#1536) Also: * rename the function * create an alternative function for the common case.	2020-12-18 11:32:33 +10:00
teor	69fcf64d6c	Disable issue URLs for "duplicate hash" errors (#1517) In our README, we tell users to ignore these errors, so we should also disable the issue URL. Also include the hash in the error. (We don't want the span active for all messages, we just want the hash in the error.)	2020-12-16 08:14:42 +10:00
teor	008577561c	Use a sleep future in the async acceptance tests And wait slightly longer for `zebrad` to launch. These fixes should reduce the failure rate of the acceptance tests on busy machines.	2020-12-16 08:09:48 +10:00
teor	1b6bf7f105	Use random ports in the acceptance tests This change avoids errors when tests are cancelled and re-run within a short period of time, for example, using `cargo watch`. It introduces a slight risk of port conflicts between the endpoint tests, and with (ephemeral) ports used by other services. The risk of conflicts across 2 tests is very low, and tests should be run in an isolated environment on busy servers.	2020-12-16 08:09:48 +10:00
Alfredo Garcia	41833340c1	downgrade remaining version strings to 1.0.0-alpha.0 (#1488)	2020-12-15 11:21:00 +10:00
Deirdre Connolly	2d1698a120	Comment out Sentry stacktraces for now While panic = abort, Sentry collects the same one-line stack trace for all panics, making it incorrectly dedupe different errors into one.	2020-12-12 13:26:52 -05:00
Deirdre Connolly	cff28f7ac8	Use the commit sha as the sentry release	2020-12-09 13:06:18 -05:00
Jane Lusby	400213e2b3	integrate sentry with our existing panic reporting logic	2020-12-09 13:06:18 -05:00
Deirdre Connolly	f1ec1d626d	Tidy for now	2020-12-09 13:06:18 -05:00
Deirdre Connolly	44e1051dee	Debug	2020-12-09 13:06:18 -05:00
Deirdre Connolly	8b268e3f71	Don't keep guard around	2020-12-09 13:06:18 -05:00
Deirdre Connolly	25f6fd25b3	Test catching panic	2020-12-09 13:06:18 -05:00
Deirdre Connolly	6a17549945	Try sentry-tracing integration	2020-12-09 13:06:18 -05:00
Deirdre Connolly	c03a3a2606	Pull DSN from runtime env, enable Sentry debug mode with RUST_LOG=debug	2020-12-09 13:06:18 -05:00
Deirdre Connolly	27e42f4ed5	Set up Sentry error collection via a feature flag	2020-12-09 13:06:18 -05:00
Deirdre Connolly	47d78d4cf4	Try sentry::init()	2020-12-09 13:06:18 -05:00
teor	16ffb1dbbf	Disable issue URLs on all timeouts (#1470) This change helps prevent spurious bug reports.	2020-12-08 07:47:01 +10:00
teor	531a33f03b	Update the zebrad commit whenever any Zebra crate changes (#1455) vergen's implementation of REBUILD_ON_HEAD_CHANGE assumes that the .git directory is in the crate root, but Zebra uses a workspace. Temporary fix for rustyhorde/vergen#21.	2020-12-05 07:23:05 +10:00
teor	b4a50fd99f	Downgrade tokio to 0.3.4 to avoid a time wheel panic (#1453) See tokio-rs/tokio#2789 for details. We were seeing this panic during normal operation, not just at shutdown.	2020-12-04 13:52:37 +10:00
Jane Lusby	ef7e91c3c7	disable color-eyre colors if not connected to a tty (#1443) * disable color-eyre colors if not connected to a tty * check if color is disabled	2020-12-04 11:05:25 +10:00
dependabot[bot]	8c052cc39a	build(deps): bump color-eyre from 0.5.9 to 0.5.10 Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.9 to 0.5.10. - [Release notes](https://github.com/yaahc/color-eyre/releases) - [Changelog](https://github.com/yaahc/color-eyre/blob/v0.5.10/CHANGELOG.md) - [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.9...v0.5.10) Signed-off-by: dependabot[bot] <support@github.com>	2020-12-03 10:55:16 -05:00
Jane Lusby	90f944709b	fix git commit logic to work on gcloud (#1442)	2020-12-03 15:18:55 +10:00
teor	c0bbac89b3	`cargo fmt --all`	2020-12-02 19:45:27 -05:00
teor	e4525d8ee2	Improve metrics acceptance test failure messages	2020-12-02 19:45:27 -05:00
Deirdre Connolly	9221dddb7b	Revert "zebrad: remove git pin on metrics dependency" It broke our metrics endpoint. This reverts commit `dc77163524`.	2020-12-02 17:28:49 -05:00
Jane Lusby	d7bef1c155	bump color-eyre version to avoid a panic when printing spantraces (#1438)	2020-12-02 14:16:18 -08:00
teor	0e42d8b6c1	Always enable color_eyre, even when color is disabled We want to automatically disable colors upstream in color_eyre, and add a config that allows users to always turn off color.	2020-12-02 10:25:44 -08:00
teor	bed34168c1	Automatically disable abscissa colors and color_eyre when writing to a file	2020-12-02 10:25:44 -08:00
teor	97d1a81b7c	Automatically disable colors when tracing to a file	2020-12-02 10:25:44 -08:00
Henry de Valence	dc77163524	zebrad: remove git pin on metrics dependency Because the new version of the prometheus exporter launches its own single-threaded runtime on a dedicated worker thread, there's no need for the tokio and hyper versions it uses internally to align with the versions used in other crates. So we don't need to use our fork with tokio 0.3, and can just use the published alpha. Advancing to a later alpha may fix the missing-metrics issues.	2020-12-02 10:23:59 -08:00
Henry de Valence	f0db75e712	cargo fmt	2020-12-01 19:16:41 -08:00
Jane Lusby	a91d0f0bb6	Include short sha in log messages and error urls (#1410) As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs. To defend against this, we've decided to include the exact git sha for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the sha as their very first span. We've also added this sha as issue metadata for `color-eyre`'s github issue url auto generation feature, which should make sure that the sha is easily available in bug reports we receive, even in the absence of logs. Co-authored-by: teor <teor@riseup.net>	2020-12-01 12:13:20 -08:00
Jane Lusby	de34c47cc2	enable tracing acceptance test	2020-12-01 11:03:13 -05:00
Jane Lusby	fceef849cf	remove unused mutability to defuse deadlock	2020-12-01 11:03:13 -05:00
dependabot[bot]	61d0f02c57	build(deps): bump inferno from 0.10.1 to 0.10.2 Bumps [inferno](https://github.com/jonhoo/inferno) from 0.10.1 to 0.10.2. - [Release notes](https://github.com/jonhoo/inferno/releases) - [Changelog](https://github.com/jonhoo/inferno/blob/master/CHANGELOG.md) - [Commits](https://github.com/jonhoo/inferno/compare/v0.10.1...v0.10.2) Signed-off-by: dependabot[bot] <support@github.com>	2020-12-01 10:35:14 -05:00
teor	92eb92d1dd	Disable the nightly clippy unnecessary_wraps lint (#1403) It seems to be a bit broken - some of our functions return `Result` for consistency with similar functions. But the lint picks them up anyway.	2020-12-01 12:20:57 +10:00
Henry de Valence	1df9284444	zebrad: add a use_color option to the tracing config. This is useful for creating searchable logs without having to filter color codes after the fact.	2020-11-30 15:25:50 -08:00
Henry de Valence	e8c16b172f	zebrad: pass TracingSection to Tracing component	2020-11-30 15:25:50 -08:00
Alfredo Garcia	4544463059	Inbound `FindBlocks` and `FindHeaders` (#1347) * implement inbound `FindBlocks` * Handle inbound peer FindHeaders requests * handle request before having any chain tip * Split `find_chain_hashes` into smaller functions Add a `max_len` argument to support `FindHeaders` requests. Rewrite the hash collection code to use heights, so we can handle the `stop` hash and "no intersection" cases correctly. * Split state height functions into "any chain" and "best chain" * Rename the best chain block method to `best_block` * Move fmt utilities to zebra_chain::fmt * Summarise Debug for some Message variants Co-authored-by: teor <teor@riseup.net> Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-12-01 07:30:37 +10:00
Henry de Valence	fa02b266ca	clippy	2020-11-25 10:55:44 -08:00
Henry de Valence	de8415dcb1	tidy spans	2020-11-25 10:55:44 -08:00
Henry de Valence	05837797b1	tidy imports	2020-11-25 10:55:44 -08:00
Henry de Valence	77bf327b07	fix errors (2)	2020-11-25 10:55:44 -08:00
Henry de Valence	527f4d39ed	fix errors	2020-11-25 10:55:44 -08:00
Henry de Valence	e645e3bf0c	remove async	2020-11-25 10:55:44 -08:00
Henry de Valence	6569977549	test compile change	2020-11-25 10:55:44 -08:00
Alfredo Garcia	486e55104a	create Downloads for Inbound	2020-11-25 10:55:44 -08:00
Deirdre Connolly	6a0a6f6d37	allow(dead_code) not allow(clippy::dead_code)	2020-11-24 11:04:30 -05:00
Deirdre Connolly	4a67e0e7bb	Enable stateful/long sync tests by features, mount rocksdb-based state at Sapling activation for sync_past_sapling_mainnet test	2020-11-24 11:04:30 -05:00
Deirdre Connolly	d813603bac	Remove defunct memory_cache_bytes from test config	2020-11-24 11:04:30 -05:00
Jane Lusby	c2a57d7e49	slight comment tweak	2020-11-24 11:04:30 -05:00
Jane Lusby	99c5acc94f	rename test fn	2020-11-24 11:04:30 -05:00
Jane Lusby	602d8c4898	document tests	2020-11-24 11:04:30 -05:00
Jane Lusby	17fdbe941b	fix stdout issue with test framework for cached data tests	2020-11-24 11:04:30 -05:00
Jane Lusby	0f51891359	revert unnecessary change in sync_until	2020-11-24 11:04:30 -05:00
Jane Lusby	4bfe747f34	update acceptance tests	2020-11-24 11:04:30 -05:00
Jane Lusby	d093b4e528	Add network integration test for quick post sapling sync testing	2020-11-24 11:04:30 -05:00
dependabot[bot]	a4af90c2b0	build(deps): bump color-eyre from 0.5.7 to 0.5.8 Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.7 to 0.5.8. - [Release notes](https://github.com/yaahc/color-eyre/releases) - [Changelog](https://github.com/yaahc/color-eyre/blob/master/CHANGELOG.md) - [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.7...v0.5.8) Signed-off-by: dependabot[bot] <support@github.com>	2020-11-24 09:59:22 -05:00
Henry de Valence	2a4a89c002	state,zebrad: tidy span levels for good INFO output This provides useful and not too noisy output at INFO level. We do an info-level message on every block commit instead of trying to do one message every N blocks, because this is useful both for initial block sync as well as continuous state updates on new blocks.	2020-11-23 14:16:39 +10:00
Henry de Valence	f0810b028d	state,consensus,sync: shorten span lengths These changes help reduce the size of the resulting spans, making the output more compact. Together they save about 30-40 characters.	2020-11-23 14:16:39 +10:00
teor	d4da9609ee	Update the max_concurrent_block_requests docs In #1298, we decreased `max_concurrent_block_requests`, but forgot to update the docs.	2020-11-20 10:08:57 -08:00
Henry de Valence	ba3c19142c	deps: update hyper, metrics to tokio 0.3 The metrics code becomes much simpler because the current version of the metrics crate builds its own single-threaded runtime on a dedicated worker thread, so no dependency on the main Zebra Tokio runtime is required.	2020-11-20 10:08:16 -08:00
Henry de Valence	add94c1c45	deps: move to tokio 0.3, tower 0.4 This change is mostly mechanical, with the exception of the changes to the `tower-batch` middleware. This middleware was adapted from `tower::buffer`, and the `tower::buffer` code was changed to implement its own bounded queue, because Tokio 0.3 removed the `mpsc::Sender::poll_send` method. See `ddc64e8d4d` for more context on the Tower changes. To match Tower as closely as possible in order to be able to upstream `tower-batch`, those changes are copied from `tower::Buffer` to `tower-batch`.	2020-11-20 10:08:16 -08:00
Jane Lusby	4c9bb87df2	zebra-state: replace sled with rocksdb (#1325) ## Motivation Prior to this PR we've been using `sled` as our database for storing persistent chain data on the disk between boots. We picked sled over rocksdb to minimize our c++ dependencies despite it being a less mature codebase. The theory was if it worked well enough we'd prefer to have a pure rust codebase, but if we ever ran into problems we knew we could easily swap it out with rocksdb. Well, we ran into problems. Sled's memory usage was particularly high, and it seemed to be leaking memory. On top of all that, the performance for writes was pretty poor, causing us to become bottle-necked on sled instead of the network. ## Solution This PR replaces `sled` with `rocksdb`. We've seen a 10x improvement in memory usage out of the box, no more leaking, and much better write performance. With this change writing chain data to disk is no longer a limiting factor in how quickly we can sync the chain. The code in this pull request has: - [x] Documentation Comments - [x] Unit Tests and Property Tests ## Review @hdevalence	2020-11-18 18:05:06 -08:00
Henry de Valence	4953f21670	fixup! zebrad: hack to skip alreadyverified errors	2020-11-18 03:09:06 -05:00
Henry de Valence	d2fc01755b	zebrad: more reasonable concurrent block limit This helps prevent overloading the network with too many concurrent block requests. On a fast network, we're likely to still have enough room to saturate our bandwidth. In the worst case, with 2MB blocks, downloading 50 blocks concurrently is 100MB of queued downloads. If we need to download this in 20 seconds to avoid peer connection timeouts, the implied worst-case minimum speed is 5MB/s. In practice, this minimum speed will likely be much lower.	2020-11-17 14:56:27 -08:00
Henry de Valence	aa7538ab15	zebrad: hack to skip alreadyverified errors	2020-11-17 14:56:27 -08:00
Henry de Valence	e55392b61e	zebrad: explicitly select the threaded scheduler.	2020-11-17 14:56:27 -08:00
Henry de Valence	6de824bd99	zebrad: remove block verification timeout Because we set the lookahead limit to be at least twice the size of a checkpoint, we don't have a risk of timeouts.	2020-11-17 14:56:27 -08:00
Henry de Valence	e9c847bbd7	zebrad: avoid a borrow in the ChainSync future	2020-11-17 14:56:27 -08:00
Henry de Valence	b632a24436	zebrad: add diagnostics on cancelled download tasks	2020-11-17 14:56:27 -08:00
Henry de Valence	ec411574ee	zebrad: improve sync diagnostics	2020-11-17 14:56:27 -08:00
teor	54cb9277ef	Allow some new clippy nightly lints	2020-11-17 10:07:37 +10:00
dependabot[bot]	8c5f6d0177	build(deps): bump once_cell from 1.5.1 to 1.5.2 Bumps [once_cell](https://github.com/matklad/once_cell) from 1.5.1 to 1.5.2. - [Release notes](https://github.com/matklad/once_cell/releases) - [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md) - [Commits](https://github.com/matklad/once_cell/compare/v1.5.1...v1.5.2) Signed-off-by: dependabot[bot] <support@github.com>	2020-11-13 14:48:11 -05:00
Jane Lusby	7c0275ac0b	reorganize stop check (#1288) * reorganize stop check * remove unused enum * move out and make it unique Co-authored-by: teor <teor@riseup.net>	2020-11-13 11:37:52 +10:00
Henry de Valence	e0c92167bc	Revert "Hedge every syncer block download request" This reverts commit `656bd24ba7`. The Hedge middleware keeps a pair of histograms, writing into one in the current time interval and reading from the previous time interval's data. This means that the reverted change resulted in doubling all block downloads until after at least the second measurement interval (which means that the time measurements are also incorrect, as they're operating under double the network load...)	2020-11-12 16:45:47 -05:00
Alfredo Garcia	128643d81e	Call `zebra_test::init` where needed. (#1227) * Add missing `zebra_test::init()` to zebra-chain * Add missing `zebra_test::init()` to zebra-consensus * Add missing `zebra_test::init()` to zebra-network * Add missing `zebra_test::init()` to zebra-state * Add missing `zebra_test::init()` to zebra-test * Add missing `zebra_test::init()` to zebrad	2020-11-10 10:29:25 +10:00
teor	efef2a2bd7	Reduce acceptance test sled memory usage (#1236) * Use the default memory limit in the acceptance tests PR #1233 changed the default `memory_cache_bytes`, but left the acceptance tests with their old value.	2020-11-10 07:42:30 +10:00
dependabot[bot]	a58299a0f0	build(deps): bump color-eyre from 0.5.6 to 0.5.7 Bumps [color-eyre](https://github.com/yaahc/color-eyre) from 0.5.6 to 0.5.7. - [Release notes](https://github.com/yaahc/color-eyre/releases) - [Changelog](https://github.com/yaahc/color-eyre/blob/master/CHANGELOG.md) - [Commits](https://github.com/yaahc/color-eyre/compare/v0.5.6...v0.5.7) Signed-off-by: dependabot[bot] <support@github.com>	2020-11-09 08:40:55 -05:00
dependabot[bot]	1e3cf6dc5c	build(deps): bump tracing-subscriber from 0.2.14 to 0.2.15 Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.14 to 0.2.15. - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.14...tracing-subscriber-0.2.15) Signed-off-by: dependabot[bot] <support@github.com>	2020-11-04 20:37:40 -05:00
dependabot[bot]	785fc30481	build(deps): bump hyper from 0.13.8 to 0.13.9 Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.8 to 0.13.9. - [Release notes](https://github.com/hyperium/hyper/releases) - [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md) - [Commits](https://github.com/hyperium/hyper/compare/v0.13.8...v0.13.9) Signed-off-by: dependabot[bot] <support@github.com>	2020-11-04 20:07:18 -05:00
Henry de Valence	0ad648fb6a	zebrad: make lookahead limit configurable. Sets the default value to the previous lookahead limit. My testing on mainnet suggested that the newly lower value (changed when the checkpoint frequency was decreased) is low enough to cause stalls, even when using hedged requests.	2020-11-01 10:47:46 -08:00
teor	92c623eddf	Log each genesis download This change helps us diagnose sync hangs.	2020-10-28 11:31:04 -04:00
teor	656bd24ba7	Hedge every syncer block download request Remove the minimum data points from the syncer hedge configuration. When there are no data points, hedge sends the second request immediately. Where there are less than 1/(1-latency_percentile) data points (20), hedge delays the second request by the highest recent download time. This change should improve genesis and post-restart sync latency.	2020-10-28 11:31:04 -04:00
teor	ea510b7d41	Run a block sync in CI with 2 large checkpoints (#1193) * Run large checkpoint sync tests in CI * Improve test child output match error context * Add a debug_stop_at_height config * Use stop at height in acceptance tests And add some restart acceptance tests, to make sure the stop at height feature works correctly.	2020-10-27 19:25:29 +10:00
Henry de Valence	4c960c4e6d	zebrad: treat duplicate downloads as an error We should error if we notice that we're attempting to download the same blocks multiple times, because that indicates that peers reported bad information to us, or we got confused trying to interpret their responses.	2020-10-26 12:05:35 -07:00
Henry de Valence	4127d086ea	zebrad: clarify hedge layering motivation Co-authored-by: teor <teor@riseup.net>	2020-10-26 12:05:35 -07:00


416 Commits
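
The worst-case estimate quoted in commit d2fc01755b above (50 concurrent downloads of 2MB blocks, to be completed within 20 seconds) can be checked with a few lines of Rust. This is only a sketch of the arithmetic in that commit message: the function names are hypothetical and are not taken from the Zebra codebase.

```rust
/// Worst-case size of the download queue, in megabytes.
/// (Hypothetical helper, for illustration only.)
fn queued_download_mb(concurrent_blocks: u64, max_block_mb: u64) -> u64 {
    concurrent_blocks * max_block_mb
}

/// Minimum speed, in MB/s, needed to drain the queue before the timeout.
/// (Hypothetical helper, for illustration only.)
fn min_speed_mb_per_s(queued_mb: u64, timeout_secs: u64) -> f64 {
    queued_mb as f64 / timeout_secs as f64
}

fn main() {
    // Figures from the commit message: 50 blocks, 2MB each, 20 seconds.
    let queued = queued_download_mb(50, 2);
    let speed = min_speed_mb_per_s(queued, 20);
    println!("queued: {} MB, minimum speed: {} MB/s", queued, speed);
}
```

With these inputs the queue comes to 100 MB and the implied minimum speed to 5 MB/s, which agrees with the figures in the commit message.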