The existing logic would clean account paths only when we had secondary
access to the blockstore. However, the access mode shouldn't dictate
when we clean accounts data; we can and should clean account data from
previous runs in all instances given that we always start over from a
snapshot.
Introduce a struct to store all of the relevant slot/root information,
and then output it all in one go at the end as either human-readable
text or JSON. There are some slight changes to the human-readable format
for the case of an empty ledger.
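
A minimal sketch of the "collect first, print once" approach described above. The struct name, field names, and output wording are hypothetical and are not taken from ledger-tool; JSON is hand-formatted here only to keep the example dependency-free.

```rust
use std::fmt;

/// Hypothetical summary of the slot/root bounds found in a ledger.
/// `None` models an empty ledger (or a ledger without roots).
#[derive(Debug, Default, PartialEq)]
struct SlotBounds {
    first_slot: Option<u64>,
    last_slot: Option<u64>,
    first_root: Option<u64>,
    last_root: Option<u64>,
}

impl SlotBounds {
    /// Machine-readable output; missing values become JSON `null`.
    fn to_json(&self) -> String {
        fn field(value: Option<u64>) -> String {
            value.map_or_else(|| "null".to_string(), |slot| slot.to_string())
        }
        format!(
            "{{\"first_slot\":{},\"last_slot\":{},\"first_root\":{},\"last_root\":{}}}",
            field(self.first_slot),
            field(self.last_slot),
            field(self.first_root),
            field(self.last_root)
        )
    }
}

/// Human-readable output, with a dedicated message for an empty ledger.
impl fmt::Display for SlotBounds {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match (self.first_slot, self.last_slot) {
            (Some(first), Some(last)) => {
                writeln!(f, "Ledger has data for slots {} to {}", first, last)?;
                match (self.first_root, self.last_root) {
                    (Some(first_root), Some(last_root)) => {
                        write!(f, "Rooted slots span {} to {}", first_root, last_root)
                    }
                    _ => write!(f, "No rooted slots"),
                }
            }
            _ => write!(f, "Ledger is empty"),
        }
    }
}

fn main() {
    let bounds = SlotBounds {
        first_slot: Some(10),
        last_slot: Some(20),
        first_root: Some(12),
        last_root: Some(18),
    };
    // All information is gathered before anything is printed, so the
    // output format can be chosen in a single place.
    println!("{}", bounds);
    println!("{}", bounds.to_json());
}
```

Keeping the gathering and the printing separate is what makes it straightforward to add a second output format without touching the lookup logic.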
Removed an "ok" print that occurs after several commands; we already
print a log statement that indicates ledger-tool is done and how long it
took. Additionally, switch a println to info to avoid polluting stdout
in case we want to print information in a more easily readable format
such as json.
* init copy cmd
* extract creating emulator connection logic
* extract copy args as struct
* add new_for_emulator
* add TryFrom confirmed block to versioned block
* implement bigtable copy command
* use 'force' flag to force upload
* use unwrap_or
* remove redundant importing
* fix nightly lint
* explicit transactions missing error
* process ending_slot
* prevent start slot > end slot
* print skip slots in debug level
* fix destination bigtable should not be readonly
* combine is-emulator and endpoint into emulated source; conflicts with credential path
* wording
* log some error messages with error level
* nightly lint
* add dry-run
* extract create bigtable instances logic
* use a lighter way to check block
* use the latest futures version which is used in the repo
* use futures = "0.3"
* wording
* wording
Co-authored-by: Tyera <teulberg@gmail.com>
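
The slot-range check and the force / dry-run handling from the list above could look roughly like the sketch below. All names are hypothetical, and the rule that a block already present in the destination is skipped unless the force flag is set is an assumption, not something confirmed by the commit list.

```rust
/// Hypothetical outcome for a single block in the copy loop.
#[derive(Debug, PartialEq)]
enum CopyAction {
    Skip,
    Upload,
    DryRun,
}

/// Reject a range whose start lies after its end. An absent ending slot
/// is treated as "no upper bound" (assumption).
fn validate_slot_range(starting_slot: u64, ending_slot: Option<u64>) -> Result<(), String> {
    match ending_slot {
        Some(end) if starting_slot > end => Err(format!(
            "starting slot {} is greater than ending slot {}",
            starting_slot, end
        )),
        _ => Ok(()),
    }
}

/// Decide what to do with one block. Assumption: existing blocks are
/// skipped unless `force` is set, and `dry_run` never uploads.
fn plan_copy(exists_in_destination: bool, force: bool, dry_run: bool) -> CopyAction {
    if exists_in_destination && !force {
        CopyAction::Skip
    } else if dry_run {
        CopyAction::DryRun
    } else {
        CopyAction::Upload
    }
}

fn main() {
    match validate_slot_range(100, Some(50)) {
        Ok(()) => println!("range ok"),
        Err(err) => eprintln!("error: {}", err),
    }
    println!("{:?}", plan_copy(true, false, false));
}
```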
Currently, if starting-slot is unspecified, a value of 0 will be chosen.
In the common case where someone is operating on a much more recent
range, this would result in a ton of wasted operations & time.
Instead, choose a smarter default value for starting-slot based on what
we detect is currently in the blockstore.
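
As an illustration only, the default could be resolved as below. `lowest_in_blockstore` is a hypothetical stand-in for whatever lookup reports the lowest slot actually present in the blockstore; falling back to 0 for an empty blockstore is an assumption.

```rust
/// Pick the first slot to process.
/// - `requested`: the user-supplied starting-slot, if any.
/// - `lowest_in_blockstore`: hypothetical lookup result, `None` when the
///   blockstore is empty.
fn resolve_starting_slot(requested: Option<u64>, lowest_in_blockstore: Option<u64>) -> u64 {
    // An explicit value always wins; otherwise start where the data starts
    // instead of scanning up from slot 0.
    requested.or(lowest_in_blockstore).unwrap_or(0)
}

fn main() {
    let start = resolve_starting_slot(None, Some(150_000_000));
    println!("starting at slot {}", start);
}
```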
* Update cost model to use requested_cu instead of estimated cu #27608
* remove CostUpdate and CostModel from replay/tvu
* revive cost update service to send cost tracker stats
* CostModel is now static
* remove unused package
Co-authored-by: Tao Zhu <tao@solana.com>
A fifo rocksdb instance must be opened with max size parameter on the
fifo columns. To support this, we previously plumbed a constant up to
callers that provided a default if unbounded growth was desired.
This change attempts to be more rusty by exposing an option for this
value, and converting the option to a constant at the lowest level
possible.
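
A rough sketch of the pattern, with hypothetical names: callers pass an `Option`, and only the innermost layer turns `None` into the sentinel. Using `u64::MAX` as the "unbounded" sentinel is an assumption made for this example.

```rust
/// Hypothetical sentinel meaning "do not bound the FIFO column size".
const UNBOUNDED_FIFO_SIZE: u64 = u64::MAX;

/// Hypothetical options struct seen by callers; `None` means unbounded.
struct FifoOptions {
    max_size_bytes: Option<u64>,
}

/// The lowest layer is the only place that knows about the sentinel.
fn effective_fifo_max_size(options: &FifoOptions) -> u64 {
    options.max_size_bytes.unwrap_or(UNBOUNDED_FIFO_SIZE)
}

fn main() {
    let bounded = FifoOptions { max_size_bytes: Some(250 * 1024 * 1024 * 1024) };
    let unbounded = FifoOptions { max_size_bytes: None };
    println!("bounded:   {}", effective_fifo_max_size(&bounded));
    println!("unbounded: {}", effective_fifo_max_size(&unbounded));
}
```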
#### Summary of Changes
Removes the constant default for ShredStorageType::RocksFifo
as the shred storage size is either user-specified or derived
from --limit-ledger-size in #27459.
### Problem
When FIFO compaction is used while --rocksdb_fifo_shred_storage_size
is unspecified, the FIFO shred storage size is set to a const default based
on the default `--limit-ledger-size`.
### Summary of the Change
When --rocksdb_fifo_shred_storage_size is unspecified, it is now
derived from `--limit-ledger-size` by reserving 1500 bytes for each
shred.
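
The derivation is simple arithmetic. The sketch below assumes `--limit-ledger-size` is expressed as a number of shreds; the function and constant names are hypothetical.

```rust
/// Bytes reserved per shred, as stated in the change description.
const BYTES_RESERVED_PER_SHRED: u64 = 1500;

/// Use the explicit size when given; otherwise derive it from the ledger
/// size limit (assumed to be a shred count).
fn derive_fifo_shred_storage_size(explicit_size: Option<u64>, limit_ledger_size: u64) -> u64 {
    explicit_size
        .unwrap_or_else(|| limit_ledger_size.saturating_mul(BYTES_RESERVED_PER_SHRED))
}

fn main() {
    // 200 million shreds * 1500 bytes = 300 GB reserved.
    let size = derive_fifo_shred_storage_size(None, 200_000_000);
    println!("derived size: {} bytes", size);
}
```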
#### Problem
The ledger-tool copy command currently assumes target_db uses the same
shred storage type as the source ledger.
#### Summary of Changes
This PR enables ledger-tool copy command to infer the target_db shred storage
type based on whether rocksdb or rocksdb_fifo exists under the target_db directory
(the same way ledger-tool infers the shred storage type for the main db).
If the ledger-tool is not able to infer the shred storage type of the target_db, then
the default level compaction will be used.
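
A sketch of the directory-based inference, under stated assumptions: the enum and function names are hypothetical, and checking `rocksdb` before `rocksdb_fifo` when both exist is a choice made for this example only.

```rust
use std::path::Path;

/// Hypothetical stand-in for the shred storage type.
#[derive(Debug, PartialEq)]
enum ShredStorageKind {
    Level,
    Fifo,
}

/// Infer the storage kind from which directory exists under `ledger_dir`.
/// Falls back to level compaction when neither directory is present.
fn infer_shred_storage_kind(ledger_dir: &Path) -> ShredStorageKind {
    if ledger_dir.join("rocksdb").is_dir() {
        ShredStorageKind::Level
    } else if ledger_dir.join("rocksdb_fifo").is_dir() {
        ShredStorageKind::Fifo
    } else {
        // Nothing to infer from: use the default.
        ShredStorageKind::Level
    }
}

fn main() {
    let kind = infer_shred_storage_kind(Path::new("."));
    println!("inferred storage kind: {:?}", kind);
}
```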
Add ledger-tool command print-file-metadata
#### Summary of Changes
This PR adds a ledger tool subcommand print-file-metadata.
```
USAGE:
solana-ledger-tool print-file-metadata [FLAGS] [OPTIONS] [SST_FILE_NAME]
Prints the metadata of the specified ledger-store file.
If no file name is specified, then it will print the metadata of all ledger files
```
RocksDB settings include an option to create_if_missing, which will
create missing columns or the entire rocksdb directory if starting from
scratch. However, create_if_missing functionality only works if the
session has Primary (read+write) access. Many ledger-tool commands only
need Secondary (read-only) access to the database, so these commands are
unable to open the Blockstore when a column must be added.
This change detects when Secondary access open fails due to missing
column(s) or files, opens the database temporarily with Primary access,
and then reattempts to open the database with Secondary access.
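
The fallback can be sketched as follows. The types here are hypothetical stand-ins and the open function is passed in as a closure so the example stays self-contained; the real Blockstore API and error variants may differ.

```rust
/// Hypothetical access modes mirroring the description above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AccessType {
    Primary,
    Secondary,
}

/// Hypothetical error type; only `MissingColumn` triggers the fallback.
#[derive(Debug, PartialEq)]
enum OpenError {
    MissingColumn,
    Other(String),
}

/// Try Secondary first. If that fails because a column or file is missing,
/// open once with Primary so the missing pieces get created, drop that
/// handle, and then retry with Secondary.
fn open_with_fallback<T>(
    mut open: impl FnMut(AccessType) -> Result<T, OpenError>,
) -> Result<T, OpenError> {
    match open(AccessType::Secondary) {
        Err(OpenError::MissingColumn) => {
            // Temporary Primary session; released before the retry.
            drop(open(AccessType::Primary)?);
            open(AccessType::Secondary)
        }
        other => other,
    }
}

fn main() {
    // Simulated database: the column only exists after a Primary open.
    let mut column_exists = false;
    let result = open_with_fallback(|access| match access {
        AccessType::Primary => {
            column_exists = true;
            Ok("handle")
        }
        AccessType::Secondary if column_exists => Ok("handle"),
        AccessType::Secondary => Err(OpenError::MissingColumn),
    });
    println!("{:?}", result);
}
```

Any error other than the missing-column case is returned unchanged, so the Primary open is attempted only when it can actually help.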
Prior to this change, long running commands like `solana-ledger-tool
verify` would OOM due to AccountsDb cleanup not happening.
Co-authored-by: Michael Vines <mvines@gmail.com>
#### Problem
Ledger-tool doesn't support any shred-compaction-type other than the default rocksdb level compaction.
#### Summary of Changes
This PR enables ledger-tool to automatically detect the shred-compaction-type of the specified ledger.
#### Test Plan
New ledger-tool tests are added for both level and fifo compactions.
* slots_connected check used by ledger-tool should not require a full slot for snapshot slot
* Cleaner Result<Option<>> unwrap/default
* return false if no meta for starting slot
* Add clarifying comments
* working on local snapshot
* Parallelization for slot storage minimization
* Additional clean-up and fixes
* make --minimize an option of create-snapshot
* remove now unnecessary function
* Parallelize parts of minimized account set generation
* clippy fixes
* Add rent collection accounts and voting node_pubkeys
* Simplify programdata_accounts generation
* Loop over storages to get slot set
* Parallelize minimized slot set generation
* Parallelize adding owners and programdata_accounts
* Remove some now unnecessary checks on the blockstore
* Add a warning for minimized snapshots across epoch boundary
* Simplify ledger-tool minimize
* Clarify names of bank's minimization helper functions
* Remove unnecessary function, fix line spacing
* Use DashSets instead of HashSets for minimized account and slot sets
* Filter storages uses all threads instead of thread_pool
* Add some additional comments on functions for minimization
* Moved more into bank and parallelized
* Update programs/bpf/Cargo.lock for dashmap in ledger
* Clippy fix
* ledger-tool: convert minimize_bank_for_snapshot Measure into measure!
* bank.rs: convert minimize_bank_for_snapshot Measure into measure!
* accounts_db.rs: convert minimize_accounts_db Measure into measure!
* accounts_db.rs: add comment about use of minimize_accounts_db
* ledger-tool: CLI argument clarification
* minimization functions: make infos unique
* bank.rs: Add test_get_rent_collection_accounts_between_slots
* bank.rs: Add test_minimization_add_vote_accounts
* bank.rs: Add test_minimization_add_stake_accounts
* bank.rs: Add test_minimization_add_owner_accounts
* bank.rs: Add test_minimization_add_programdata_accounts
* accounts_db.rs: Add test_minimize_accounts_db
* bank.rs: Add negative case and comments in test_get_rent_collection_accounts_between_slots
* bank.rs: Negative test in test_minimization_add_programdata_accounts
* use new static runtime and sdk ids
* bank comments to doc comments
* Only need to insert the maximum slot a key is found in
* rename remove_pubkeys to purge_pubkeys
* add comment on builtins::get_pubkeys
* prevent excessive logging of removed dead slots
* don't need to remove slot from shrink slot candidates
* blockstore.rs: get_accounts_used_in_range shouldn't return Result
* blockstore.rs: get_accounts_used_in_range: parallelize slot loop
* report filtering progress on time instead of count
* parallelize loop over snapshot storages
* WIP: move some bank minimization functionality into a new class
* WIP: move some accounts_db minimization functionality into SnapshotMinimizer
* WIP: Use new SnapshotMinimizer
* SnapshotMinimizer: fix use statements
* remove bank and accounts_db minimization code, where possible
* measure! doesn't take a closure
* fix use statement in blockstore
* log_dead_slots does not need pub(crate)
* get_unique_accounts_from_storages does not need pub(crate)
* different way to get stake accounts/nodes
* fix tests
* move rent collection account functionality to snapshot minimizer
* move accounts_db minimize behavior to snapshot minimizer
* clean up
* Use bank reference instead of Arc. Additional comments
* Add a comment to blockstore function
* Additional clarifying comments
* Moved all non-transaction account accumulation into the SnapshotMinimizer.
* transaction_account_set does not need to be mutable now
* Add comment about load_to_collect_rent_eagerly
* Update log_dead_slots comment
* remove duplicate measure/print of get_minimized_slot_set
* Add ability to use a non-default app profile id in bigtable requests
* Only run subcommand once when getting global configs
* Remove unneeded scoping on option type
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads.
Fix pre-check of blockstore slots during load_bank_forks. Now iterates from starting_slot to halt_slot via slot_meta.next_slots to confirm they are connected.
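
A simplified sketch of such a connectivity check. The `HashMap` is a hypothetical stand-in for the per-slot metadata (slot mapped to its `next_slots`); this version only follows the links and does not check whether each slot is full.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Return true if `halt_slot` is reachable from `starting_slot` by following
/// next-slot links. `next_slots` is a hypothetical stand-in for slot metadata.
fn slots_connected(
    next_slots: &HashMap<u64, Vec<u64>>,
    starting_slot: u64,
    halt_slot: u64,
) -> bool {
    // No metadata for the starting slot means nothing can be confirmed.
    if !next_slots.contains_key(&starting_slot) {
        return false;
    }
    let mut queue = VecDeque::new();
    queue.push_back(starting_slot);
    let mut visited = HashSet::new();
    while let Some(slot) = queue.pop_front() {
        if slot == halt_slot {
            return true;
        }
        // Skip slots beyond the target and slots already seen.
        if slot > halt_slot || !visited.insert(slot) {
            continue;
        }
        if let Some(children) = next_slots.get(&slot) {
            queue.extend(children.iter().copied());
        }
    }
    false
}

fn main() {
    let mut next_slots = HashMap::new();
    next_slots.insert(1, vec![2]);
    next_slots.insert(2, vec![3, 4]);
    next_slots.insert(4, vec![5]);
    println!("1 -> 5 connected: {}", slots_connected(&next_slots, 1, 5));
    println!("1 -> 6 connected: {}", slots_connected(&next_slots, 1, 6));
}
```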
#### Problem
blockstore_db.rs has a mutual dependency with blockstore_metrics.rs.
#### Summary of Changes
This PR removes the mutual dependency by moving the option-related code
out of blockstore_db.rs into a new file, blockstore_options.rs.
By doing this, we address the mutual dependency and also make the code cleaner.
* Add a check during ledger-tool create-snapshot startup to see if the snapshot slot is available
* check all slots from the start to snapshot_slot during load_bank_forks
* unwrap_or_default incremental snapshot slot before comparison
* Improve error messages on missing or not full slots
value_t_or_exit!() will error if the user doesn't specify a value at
runtime; use value_of() instead, which will give either the default value
or whatever the user specified.
* Add ConfirmedBlockUploadConfig, no behavior changes
* Add comment
* A little DRY cleanup
* Add configurable limit to number of blocks to check in Blockstore and Bigtable before uploading
* Limit blockstore and bigtable look-ahead
* Exit iterator early when reaching ending_slot
* Use rooted_slot_iterator instead of slot_meta_iterator
* Only check blocks in the ledger
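
The look-ahead limit and the early exit at ending_slot can be expressed with iterator adapters, as in this sketch. The function and parameter names are hypothetical, and `rooted_slots` stands in for an iterator over rooted slots in ascending order.

```rust
/// Collect the slots to examine before uploading.
/// - stops as soon as a slot past `ending_slot` is seen (early exit)
/// - never returns more than `max_blocks_to_check` slots (look-ahead limit)
fn slots_to_check(
    rooted_slots: impl Iterator<Item = u64>,
    ending_slot: Option<u64>,
    max_blocks_to_check: usize,
) -> Vec<u64> {
    rooted_slots
        .take_while(|slot| ending_slot.map_or(true, |end| *slot <= end))
        .take(max_blocks_to_check)
        .collect()
}

fn main() {
    let rooted = vec![10, 11, 13, 20, 21];
    let selected = slots_to_check(rooted.into_iter(), Some(20), 3);
    println!("{:?}", selected);
}
```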
This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will
pass this struct down to LedgerColumn to allow it to perform metric reporting.
Creating a new ledger implicitly means that no other process could have
previously held access to it. Additionally, creating a new ledger
implicitly requires writing, so it follows that Primary access is
required and we can drop access type as an argument.
#### Summary of Changes
This PR further enables the group-by operation on storage type in blockstore_rocksdb_cfs metrics.
Such group-by allows us to further compare the performance metrics between rocks-level and
rocks-fifo.
To make things extensible, this PR introduces BlockstoreAdvancedOptions and moves shred_storage_type into it.
All fields in BlockstoreAdvancedOptions will support group-by operation in blockstore_rocksdb_cfs.
Dependency: #23580