Filtering out storages for incremental snapshots will be needed by the
background services for incremental snapshot support, but there is not a
Bank at that point. Since the filtering doesn't apply only to Bank, and
more to snapshots, move the functionality into snapshot_utils.
While reviewing PR #18565, as issue was brought up to refactor some code
around verifying the bank after rebuilding from snapshots. A new
top-level function has been added to get the latest snapshot archives
and load the bank then verify. Additionally, new tests have been
written and existing tests have been updated to use this new function.
Fixes#18973
While resolving the issue, it became clear there was some additional
low-hanging fruit this change enabled. Specifically, the functions
`bank_to_xxx_snapshot_archive()` now return their respective
`SnapshotArchiveInfo`. And on the flip side,
`bank_from_snapshot_archives()` now takes `SnapshotArchiveInfo`s instead
of separate paths and archive formats. This bundling simplifies bank
rebuilding.
When making a snapshot archive, we used to shell out and call `tar -S`
for sparse file support. The tar crate supports sparse files, so no
need to do this anymore.
Fixes#10860
I introduced a bug where old snapshot archives were incorrectly purged.
Instead of purged to oldest, I was purged the newest...
The fix is to add a `reverse()` in the purge logic, and I've added a
test to catch this bug in the future.
Fixes#19057
This PR solves #18815. Note that I had to make the snapshot prefix
constants inside `snapshot_utils.rs` public at the crate level in order
to make this work. I'm not sure whether or not introducing this
dependency is entirely good, either way the `snapshot_utils.rs` file
needs a lot of rework so things will move around, I believe this does
the work in the meantime. Any feedback will be greatly appreciated.
This commit builds on PR #18504 by adding a test to core/tests/snapshot.rs for Incremental Snapshots. The test adds banks to bank forks in a loop and takes both full snapshots and incremental snapshots at intervals, and validates they are rebuild-able.
For background info about Incremental Snapshots, see #17088.
Fixes#18829 and #18972
This commit adds high-level functions for creating and loading-from
incremental snapshots, plus all low-level functions required to perform
those tasks. This commit **does not** add taking incremental snapshots
as part of a running validator, nor starting up a node with an
incremental snapshot; just laying ground work.
Additionally, `snapshot_utils` and `serde_snapshot` have been
refactored to use a common code paths for the different snapshots.
Also of note, some renaming has happened:
1. Snapshots are now either `full_` or `incremental_` throughout the
codebase. If not specified, the code applies to both.
2. Bank snapshots now are called "bank snapshots"
(before they were called "slot snapshots", "bank snapshots", or
just "snapshots"). The one exception is within `Bank`, where they
are still just "snapshots", because they are already "bank
snapshots".
3. Snapshot archives now have `_archive` in the code. This
should clear up an ambiguity between bank snapshots and snapshot
archives.
1. Added both options for measuring space usage using total accounts usage and for individual store shrink ratio using an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions added
4. The default implementations is kept at 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true
Fixes#17544
Refactor a few functions that are on the load-from-snapshot path, to facilitate
adding in incremental snapshots more easily.
Additionally, add some tests and doc comments.
* purge_old_snapshot_archives is changed to take an extra argument 'maximum_snapshots_to_retain' to control the max number of latest snapshot archives to retain. Note the oldest snapshot is always retained as before and is not subjected to this new options.
* The validator and ledger-tool executables are modified with a CLI argument --maximum-snapshots-to-retain. And the options are propagated down the call chains. Their corresponding shell scripts were changed accordingly.
* SnapshotConfig is modified to have an extra field for the maximum_snapshots_to_retain
* Unit tests are developed to cover purge_old_snapshot_archives
Account data hashing used to use different ways of hashing on different
clusters. That is no longer the case, but the old code still existed.
This commit removes that old, now used code.
**NOTE** The golden hash values in bank.rs needed to be updated. Since
the original code that selected the hash algorithm used `if >` instead
of `if >=`, this meant that the genesis block's hash _always_ used the
old hashing method, which is no longer valid.
Validated by running `cargo test` successfully.
* lazy calculate account hash
* push to bg thread
* remove deadlock
* logs
* format
* some cleanup on aisle 9
* format, fix up some metrics
* fix test, remove legacy function only there for tests
* cleanup
* remove unused store_hasher
* Switch to crossbeam
* clippy
* format
* use iter()
* rework from feedback
* hash_slot -> slot
* hash(cluster_type)
Co-authored-by: Carl Lin <carl@solana.com>