* Add run parent directory for accounts files
* fix test test_concurrent_snapshot_packaging
* review comments. renamed the path setup function
* Addressed most of the review comments
* remove explict type def for map result
* handle create_accounts_run_and_snapshot_dirs error with expect
* update with more review comments
* minor fixes from review comments
* simplify account_filename option assignment
* handle error from create_accounts_run_and_snapshot_dirs
* use then instead of then_some for lazy evaluation
* Clean up files in the old account_path before trasitioning to the new run path
* try_exist and accounts_dir removing extra
* sync rmdir, is_dir check
* handle the account_path not deletable case
* Add run parent directory for accounts files
* fix test test_concurrent_snapshot_packaging
* review comments. renamed the path setup function
* Addressed most of the review comments
* remove explict type def for map result
* handle create_accounts_run_and_snapshot_dirs error with expect
* update with more review comments
* minor fixes from review comments
* simplify account_filename option assignment
* handle error from create_accounts_run_and_snapshot_dirs
* use then instead of then_some for lazy evaluation
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
Now that CRDS supports incremental snapshot hashes,
SnapshotPackagerService needs to push 'em!
This commit does two main things:
1. SnapshotPackagerService now knows about incremental snapshot hashes,
and will push SnapshotPackage::IncrementalSnapshot hashes to CRDS.
2. At startup, when loading from a full + incremental snapshot, the
hashes need to be passed all the way to SnapshotPackagerService so it
can push these starting hashes to CRDS. Those values have been piped
through.
Fixes#20441 and #20423
AccountsBackgroundService now knows about incremental snapshots. It is
now also in charge of deciding if an AccountsPackage is destined to be a
SnapshotPackage or not (or just used by AccountsHashVerifier).
!!! New behavior changes !!!
Taking snapshots (both bank and archive) **MUST** succeed.
This is required because of how the last full snapshot slot is
calculated, which is used by AccountsBackgroundService when calling
`clean_accounts()`.
File system calls are now unwrapped and will result in a crash. As Trent told me:
>Well I think if a snapshot fails due to some IO error, it's very likely that the operator is going to have to intervene before it works. We should exit error in this case, otherwise the validator might happily spin for several more hours, never successfully writing a complete snapshot, before something else brings it down. This would leave the validator's last local snapshot many more slots behind than it would be had we exited outright and potentially force the operator to abandon ledger continuity in favor of a quick catchup
Other errors will set the `exit` flag to `true`, and the node will gracefully shutdown.
Fixes#19167Fixes#19168
#### Problem
Snapshot names are overloaded, and there are multiple terms that mean the same thing. This is confusing. Here's a list of ones in the codebase that I've found:
```
- snapshot_dir
- snapshots_dir
- snapshot_path
- snapshot_output_dir
- snapshot_package_output_path
- snapshot_archives_dir
```
#### Summary of Changes
For all the ones that are about the directory where snapshot archives are stored, ensure they are `snapshot_archives_dir`. For the ones about the (bank) snapshots directory, set to `bank_snapshots_dir`.
Co-authored-by: Michael Vines <mvines@gmail.com>
Renaming these types to better communicate their usages, which will
further diverge as incremental snapshot support is added.
With the new names, AccountsPacakge now refers to the type between
AccountsBackgroundProcess and AccountsHashVerifier, and SnapshotPackage
refers to the type between AccountsHashVerifier and
SnapshotPackagerService.
While reviewing PR #18565, as issue was brought up to refactor some code
around verifying the bank after rebuilding from snapshots. A new
top-level function has been added to get the latest snapshot archives
and load the bank then verify. Additionally, new tests have been
written and existing tests have been updated to use this new function.
Fixes#18973
While resolving the issue, it became clear there was some additional
low-hanging fruit this change enabled. Specifically, the functions
`bank_to_xxx_snapshot_archive()` now return their respective
`SnapshotArchiveInfo`. And on the flip side,
`bank_from_snapshot_archives()` now takes `SnapshotArchiveInfo`s instead
of separate paths and archive formats. This bundling simplifies bank
rebuilding.
This commit adds high-level functions for creating and loading-from
incremental snapshots, plus all low-level functions required to perform
those tasks. This commit **does not** add taking incremental snapshots
as part of a running validator, nor starting up a node with an
incremental snapshot; just laying ground work.
Additionally, `snapshot_utils` and `serde_snapshot` have been
refactored to use a common code paths for the different snapshots.
Also of note, some renaming has happened:
1. Snapshots are now either `full_` or `incremental_` throughout the
codebase. If not specified, the code applies to both.
2. Bank snapshots now are called "bank snapshots"
(before they were called "slot snapshots", "bank snapshots", or
just "snapshots"). The one exception is within `Bank`, where they
are still just "snapshots", because they are already "bank
snapshots".
3. Snapshot archives now have `_archive` in the code. This
should clear up an ambiguity between bank snapshots and snapshot
archives.
* Move gossip modules to solana-gossip
* Update Protocol abi digest due to move
* Move gossip benches and hook up CI
* Remove unneeded Result entries
* Single use statements
* purge_old_snapshot_archives is changed to take an extra argument 'maximum_snapshots_to_retain' to control the max number of latest snapshot archives to retain. Note the oldest snapshot is always retained as before and is not subjected to this new options.
* The validator and ledger-tool executables are modified with a CLI argument --maximum-snapshots-to-retain. And the options are propagated down the call chains. Their corresponding shell scripts were changed accordingly.
* SnapshotConfig is modified to have an extra field for the maximum_snapshots_to_retain
* Unit tests are developed to cover purge_old_snapshot_archives