* Generate snapshot after reaching agreement in wen_restart.
* Fix a bad merge and carry new_root_slot through HeaviestFork.
* Replace real snapshot service with fake one to avoid circular dependency.
* Remove circular dependency.
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Brooks <brooks@prumo.org>
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Brooks <brooks@prumo.org>
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Brooks <brooks@prumo.org>
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Brooks <brooks@prumo.org>
* Add extra newline.
* Fix constant name.
* Do not use &Arc<...>.
* Check return values in tests.
* Check more return values.
* Remove unnecessary rehash and comment on why new_root_bank is always present.
* Split trigger_eah_calculation_if_needed into separate function.
* Find base slot for incremental snapshot correctly, generate full snapshot
if base is not available.
* Switch to new send_eah_request_if_needed() interface.
* Always write full snapshot into our own directory.
* No need to specify snapshot under the snapshot dir.
* The normal_flow test doesn't need fake snapshot service if it doesn't
need to trigger EAH request.
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Brooks <brooks@prumo.org>
* Add a test for generate_snapshot.
* Small fixes.
* Return error if the slot we picked is lower than any of the snapshot slots.
Always write incremental snapshot, which is faster.
* Add more tests for generate_snapshot.
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Brooks <brooks@prumo.org>
* Update comments about set_root().
* Write directly into incremental snapshot dir and purge bank snapshots.
* Remove the loop, we should have the snapshot when the method exits.
* Change comments.
* Small fixes.
* Fix a bad merge.
* Remove unnecessary loop and error.
---------
Co-authored-by: Brooks <brooks@prumo.org>
* Send and Aggregate RestartHeaviestFork.
* total_active_stake in my_heaviest_fork should always be the sum of the
stake of all the validators which sent me HeaviestFork.
* A few name changes and other small fixes.
* Move active_peers update to after stakes_map is updated.
* Only send out RestartHeaviestFork and write snapshots every 30 minutes.
* Proceed if 5% of the nodes disagree and log the disagreement if the
(slot, hash) chosen by us is not the majority choice.
* Make linter happy.
* Make linter happy.
* Add successful case.
* Add a few constants and methods.
* Account for 5% non_conforming when calculating exit threshold.
* Adding a few more logs.
* Fix tests to use 75% when aggregating HeaviestFork and a few bugs.
* Reuse adjusted_threshold_percent.
* Find the bank hash of the heaviest fork, replay if necessary.
* Make it more explicit how heaviest fork slot is selected.
* Use process_single_slot instead of process_blockstore_from_root, the latter
may re-insert banks already frozen.
* Put BlockstoreProcessError into the error message.
* Check that all existing blocks link to correct parent before replay.
* Use the default number of threads instead.
* Check whether block is full and other small fixes.
* Fix root_bank and move comments to function level.
* Remove the extra parent link check.
* Pass the final result of LastVotedForkSlots aggregation to next
stage and find the heaviest fork we will Gossip to others.
* Change comments.
* Small fixes to address PR comments.
* Move correctness proof to SIMD.
* Fix a broken merge.
* Use blockstore to check parent slot of any block in FindHeaviestFork.
* Change error message.
* Add special message when first slot in the list doesn't link to root.
* Switch to blockstore.is_full() check because replay thread isn't active.
* Use make_chaining_slot_entries and add first_parent to the method.
Small style fixes.
* Switch to blockstore.is_full() check because replay thread isn't active.
* Push and aggregate RestartLastVotedForkSlots.
* Fix API and lint errors.
* Reduce clutter.
* Put my own LastVotedForkSlots into the aggregate.
* Write LastVotedForkSlots aggregate progress into local file.
* Fix typo and name constants.
* Fix flaky test.
* Clarify the comments.
* - Use constant for wait_for_supermajority
- Avoid waiting after first shred when repair is in wen_restart
* Fix delay_after_first_shred and remove loop in wen_restart.
* Read wen_restart slots inside the loop instead.
* Discard turbine shreds while in wen_restart in window insert rather than
shred_fetch_stage.
* Use the new Gossip API.
* Rename slots_to_repair_for_wen_restart and a few others.
* Rename a few more and list all states.
* Pipe exit down to aggregate loop so we can exit early.
* Fix import of RestartLastVotedForkSlots.
* Use the new method to generate test bank.
* Make linter happy.
* Use new bank constructor for tests.
* Fix a bad merge.
* - add new const for wen_restart
- fix the test to cover more cases
- add generate_repairs_for_slot_not_throttled_by_tick and
generate_repairs_for_slot_throttled_by_tick to make it readable
* Add initialize and put the main logic into a loop.
* Change aggregate interface and other fixes.
* Add failure tests and tests for state transition.
* Add more tests and add ability to recover from written records in
last_voted_fork_slots_aggregate.
* Various name changes.
* We don't really care what type of error is returned.
* Wait on expected progress message in proto file instead of sleep.
* Code reorganization and cleanup.
* Make linter happy.
* Add WenRestartError.
* Split WenRestartErrors into separate errors per state.
* Revert "Split WenRestartErrors into separate errors per state."
This reverts commit 4c920cb8f8d492707560441912351cca779129f6.
* Use individual functions when testing for failures.
* Move initialization errors into initialize().
* Use anyhow instead of thiserror to generate backtrace for error.
* Add missing Cargo.lock.
* Add error log when last_vote is missing in the tower storage.
* Change error log info.
* Change test to match exact error.
* Fix wen_restart proto compilation:
- should recompile when proto changes
- no need for customization
* There is only one proto file, no need for loop.
* Add wen_restart module:
- Implement reading LastVotedForkSlots from blockstore.
- Add proto file to record the intermediate results.
- Also link wen_restart into validator.
- Move recreation of tower outside replay_stage so we can get last_vote.
* Update lock file.
* Fix linter errors.
* Fix dependency order.
* Update wen_restart explanation and small fixes.
* Generate tower outside tvu.
* Update validator/src/cli.rs
Co-authored-by: Tyera <teulberg@gmail.com>
* Update wen-restart/protos/wen_restart.proto
Co-authored-by: Tyera <teulberg@gmail.com>
* Update wen-restart/build.rs
Co-authored-by: Tyera <teulberg@gmail.com>
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Tyera <teulberg@gmail.com>
* Rename proto directory.
* Rename InitRecord to MyLastVotedForkSlots, add imports.
* Update wen-restart/Cargo.toml
Co-authored-by: Tyera <teulberg@gmail.com>
* Update wen-restart/src/wen_restart.rs
Co-authored-by: Tyera <teulberg@gmail.com>
* Move prost-build dependency to project toml.
* No need to continue if the distance between slot and last_vote is
already larger than MAX_SLOTS_ON_VOTED_FORKS.
* Use 16k slots instead of 81k slots, a few more wording changes.
* Use AncestorIterator which does the same thing.
* Update Cargo.lock
* Update Cargo.lock
---------
Co-authored-by: Tyera <teulberg@gmail.com>