zebra

Commit Graph

Author	SHA1	Message	Date
Henry de Valence	fe61090a64	zebrad: make Inbound Poll::Ready before setup. The Inbound service only needs the network setup for some requests, but it can service other requests without it. Making it return Poll::Pending until the network setup finishes means that initial network connections may view the Inbound service as overloaded and attempt to load-shed.	2020-09-21 09:26:39 -07:00
dependabot[bot]	85241a49d6	build(deps): bump hyper from 0.13.7 to 0.13.8 Bumps [hyper](https://github.com/hyperium/hyper) from 0.13.7 to 0.13.8. - [Release notes](https://github.com/hyperium/hyper/releases) - [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md) - [Commits](https://github.com/hyperium/hyper/compare/v0.13.7...v0.13.8) Signed-off-by: dependabot[bot] <support@github.com>	2020-09-21 11:58:31 -04:00
Henry de Valence	9c021025a7	network: fill in remaining request/response pairs	2020-09-20 10:21:18 -07:00
Henry de Valence	4b35fea492	zebrad: document Inbound, ChainSync responsibilities	2020-09-18 18:34:25 -07:00
Henry de Valence	65877cb4b1	zebrad: make Inbound propagate backpressure	2020-09-18 18:34:25 -07:00
Henry de Valence	55f46967b2	zebrad: serve blocks from Inbound service The original version of this commit ran into https://github.com/rust-lang/rust/issues/64552 again. Thanks to @yaahc for suggesting a workaround (using futures combinators to avoid writing an async block).	2020-09-18 18:34:25 -07:00
Henry de Valence	170f588ffb	network: document load-shedding behavior This was part of the original design and is described in the Connection internals, but we never documented it externally.	2020-09-18 18:34:25 -07:00
Henry de Valence	1d0ebf89c6	zebrad: move seed command into inbound component Remove the seed command entirely, and make the behavior it provided (responding to `Request::Peers`) part of the ordinary functioning of the start command. The new `Inbound` service should be expanded to handle all request types.	2020-09-18 18:34:25 -07:00
Henry de Valence	1d3892e1dc	network: rename alias to BoxError This is shorter and consistent with Tower (which is why we use it in the first place).	2020-09-18 18:34:25 -07:00
Jane Lusby	ca648ff27c	Enable issue-url feature in color-eyre (#1072) * Enable issue-url feature in color-eyre * get version automatically * and the url!	2020-09-17 15:09:18 -07:00
dependabot[bot]	ba32d27f6e	build(deps): bump tracing-subscriber from 0.2.11 to 0.2.12 (#1059) Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.2.11 to 0.2.12. - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.2.11...tracing-subscriber-0.2.12) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-09-14 13:49:07 -07:00
Jane Lusby	a7b418bfe5	Add test for first checkpoint verification (#1018) * add test for first checkpoint sync Prior to this change we've not had any tests that verify our sync / network logic is well behaved. This PR cleans up the test helper code to make error reports more consistent and uses this cleaned up API to implement a checkpoint sync test which runs zebrad until it reads the first checkpoint event from stdout. Co-authored-by: teor <teor@riseup.net> * move include out of unix cfg Co-authored-by: teor <teor@riseup.net>	2020-09-11 13:39:39 -07:00
Henry de Valence	3133214e4f	zebrad: use new state API	2020-09-11 13:37:49 -07:00
teor	b1e1291f45	Log inbound peer requests at debug Logging at info was a bit too verbose. Also add a short log message.	2020-09-10 09:46:53 -07:00
Henry de Valence	24de90c900	zebrad: tidy sync imports	2020-09-10 09:45:52 -07:00
Henry de Valence	9b6e66c1b9	zebrad: rename Syncer to ChainSync This name clarifies what is being synced and avoids an agent-noun construction.	2020-09-10 09:45:52 -07:00
Henry de Valence	0bc79686b8	zebrad: move sync into components module. Part of #1030.	2020-09-10 09:45:52 -07:00
teor	adafe1d189	Restart sync after the first failed ObtainTips The ObtainTips retry was redundant. The timeout wasn't much shorter, but it made the code and sync logic more complicated.	2020-09-09 15:35:09 -07:00
teor	2a68ef5acb	Update the peerset buffer size and sync timeout Also add a bunch of comments and documentation for network-constrained nodes, and for testnet.	2020-09-08 12:44:33 -07:00
teor	b062a682b0	Refactor "waiting for pending blocks" log	2020-09-08 12:44:33 -07:00
teor	e6e859dce2	Tweak sync timeouts * increase the EWMA default and decay * increase the block download retries * increase the request and block download timeouts * increase the sync timeout	2020-09-08 12:44:33 -07:00
teor	ce12d4dadc	Add timeouts for tip responses and block verify tasks	2020-09-08 12:44:33 -07:00
teor	379ce5c1b8	Retry obtain and extend tips on failure	2020-09-08 12:44:33 -07:00
Alfredo Garcia	ca1a451895	Add test for metrics and tracing endpoints (#1000) * add metrics and tracing endpoint tests * test endpoints more * add change filter test for tracing * add await to post * separate metrics and tracing tests * Apply suggestions from code review Co-authored-by: teor <teor@riseup.net> Co-authored-by: teor <teor@riseup.net>	2020-09-07 17:05:23 -07:00
Alfredo Garcia	454e75e7c0	Rename old references to BlockHeaderHash and BlockHeight (#1002) * rename some references * Apply suggestions from code review Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com> Co-authored-by: teor <teor@riseup.net> Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com> Co-authored-by: teor <teor@riseup.net>	2020-09-04 15:40:48 -07:00
teor	48497d4857	Ignore sync errors when the block is already verified (#980) * Ignore sync errors when the block is already verified If we get an error for a block that is already in our state, we don't need to restart the sync. It was probably a duplicate download. Also: Process any ready tasks before reset, so the logs and metrics are up to date. (But ignore the errors, because we're about to reset.) Improve sync logging and metrics during the download and verify task. * Remove duplicate hashes in logs Co-authored-by: Jane Lusby <jlusby42@gmail.com> * Log the sync hash span at warn level Co-authored-by: Jane Lusby <jlusby42@gmail.com>	2020-09-04 08:13:00 +10:00
teor	437549d8e9	Always drop the final hash in peer responses (#991) To work around a zcashd bug that squashes responses together.	2020-09-04 08:09:34 +10:00
teor	c770daa51f	If the first ExtendTips hash is bad, discard it and re-check (#992)	2020-09-04 08:08:19 +10:00
Alfredo Garcia	5485f4429a	Add config path to acceptance tests (#946) * add and apply config mode to get_child * remove option to read config from current directory * remove argument from get_child	2020-09-03 13:13:23 -07:00
Jane Lusby	ffdec0cb23	Remove in-memory state service (#974) * Remove in-memory state service * make the config compatible with toml again * checkpoint commit to see how much I still have to revert * back to the starting point... * remove unused dependency * reorganize error handling a bit * need to make a new color-eyre release now * reorder again because I have problems * remove unnecessary helpers * revert changes to config loading * add back missing space * Switch to released color-eyre version * add back missing newline again... * improve error message on unix when terminated by signal * add context to last few asserts in acceptance tests * instrument some of the helpers * remove accidental extra space * try to make this compile on windows * reorg platform specific code * hide on_disk module and fix broken link	2020-09-01 12:39:04 -07:00
teor	3fdfcb3179	fix: remove old tips that are behind new tips This change makes sync less reliant on the exact order of ObtainTips and ExtendTips responses.	2020-09-01 11:42:48 -04:00
teor	a6d6e65940	fix: fix the flamegraph module comment	2020-09-01 11:40:18 -04:00
Ramana Venkata	448250f901	Deduplicate test config defaults (#971) Fixes #967	2020-08-31 12:43:43 -07:00
Ramana Venkata	ad0001f7f7	zebra-state: Add support for temporary sled databases (#939) * Test config with persistent sled database * Test ephemeral config * Add misconfigured ephemeral test	2020-08-31 18:32:55 +10:00
teor	fa04072298	Make the checkpoint limit test more readable (#941) * fix: Pass zebra_consensus::Config in a test * fix: Remove a redundant import	2020-08-24 11:34:10 -07:00
teor	78201b456d	feature: Implement checkpoint_sync for checkpoint verification * add CheckpointList::new_up_to(limit: NetworkUpgrade) * if checkpoint_sync is false, limit checkpoints to Sapling * update tests for CheckpointList and chain::init	2020-08-24 15:34:46 +10:00
teor	06f4a59664	feature: Add a checkpoint_sync config option (The option doesn't do anything yet.)	2020-08-24 15:34:46 +10:00
Ramana Venkata	991c70723a	zebrad: Create zebrad.toml in acceptance tests from ZebradConfig Fixes #929	2020-08-23 21:24:19 -04:00
teor	3400b72699	fix: Make the start acceptance tests stricter	2020-08-21 07:22:53 +10:00
teor	02e6027c57	refactor: Remove duplicate acceptance test code	2020-08-21 07:22:53 +10:00
teor	1e0e4914a0	fix: Improve an acceptance test failure message If the tests conflict with a local zebrad, zcashd, or other tests, they need to be run with a custom config, or in an isolated environment.	2020-08-21 07:22:53 +10:00
teor	b8e8d4f548	fix: Remove some deeply-nested instrument spans Closes #923.	2020-08-20 14:52:39 -04:00
Alfredo Garcia	d349f2bbc2	Refactor acceptance serialized_tests (#920) * add network listening address to default config	2020-08-20 07:48:22 +10:00
Henry de Valence	103b663c40	chain: rename BlockHeight to block::Height	2020-08-17 11:46:34 -07:00
Henry de Valence	61dea90e2f	chain: rename BlockHeaderHash to block::Hash This is the first in a sequence of changes that change the block:: items to not include Block as a prefix in their name, in accordance with the Rust API guidelines.	2020-08-17 11:46:34 -07:00
Henry de Valence	948b067808	chain: move Network, NetworkUpgrade to parameters Also, avoid using star-imports of the enum variants, which pollutes the namespace.	2020-08-17 11:46:34 -07:00
Henry de Valence	0d1f56ad2f	chain: remove utils module A catch-all utils module can really easily slip into being a place to stash miscellaneous functions that don't really belong anywhere in particular.	2020-08-17 11:46:34 -07:00
Deirdre Connolly	27ed2288b5	Remove redundant clones for PathBufs	2020-08-14 20:15:24 -04:00
Alfredo Garcia	e73f976194	Valid generated config acceptance test (#859) * add valid generated config test * change to pathbuf * use -c to make sure we are using the generated file * add and use a ZebraTestDir type * change approach to generate tempdir in top of each test * pass tempdir to test_cmd and set current dir to it * add and use a `generated_config_path` variable in tests	2020-08-13 13:31:13 -07:00
Henry de Valence	a79ce97957	Fix sync algorithm. (#887) * checkpoint: reject older of duplicate verification requests. If we get a duplicate block verification request, we should drop the older one in favor of the newer one, because the older request is likely to have been canceled. Previously, this code would accept up to four duplicate verification requests, then fail all subsequent ones. * sync: add a timeout layer to block requests. Note that if this timeout is too short, we'll bring down the peer set in a retry storm. * sync: restart syncing on error Restart the syncing process when an error occurs, rather than ignoring it. Restarting means we discard all tips and start over with a new block locator, so we can have another chance to "unstuck" ourselves. * sync: additional debug info * sync: handle lookahead limit correctly. Instead of extracting all the completed task results, the previous code pulled results out until there were fewer tasks than the lookahead limit, then stopped. This meant that completed tasks could be left until the limit was exceeded again. Instead, extract all completed results, and use the number of pending tasks to decide whether to extend the tip or wait for blocks to finish. * network: add debug instrumentation to retry policy * sync: instrument the spawned task * sync: streamline ObtainTips/ExtendTips logic & tracing This change does three things: 1. It aligns the implementation of ObtainTips and ExtendTips so that they use the same deduplication method. This means that when debugging we only have one deduplication algorithm to focus on. 2. It streamlines the tracing output to not include information already included in spans. Both obtain_tips and extend_tips have their own spans attached to the events, so it's not necessary to add Scope: prefixes in messages. 3. It changes the messages to be focused on reporting the actual events rather than the interpretation of the events (e.g., "got genesis hash in response" rather than "peer could not extend tip"). The motivation for this change is that when debugging, the interpretation of events is already known to be incorrect, in the sense that the mental model of the code (no bug) does not match its behavior (has bug), so presenting minimally-interpreted events forces interpretation relative to the actual code. * sync: hack to work around zcashd behavior * sync: localize debug statement in extend_tips * sync: change algorithm to define tips as pairs of hashes. This is different enough from the existing description that its comments no longer apply, so I removed them. A further chunk of work is to change the sync RFC to document this algorithm. * sync: reduce block timeout * state: add resource limits for sled Closes #888 * sync: add a restart timeout constant * sync: de-pub constants	2020-08-12 16:48:01 -07:00


235 Commits