Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies shred-version earlier in the pipeline in fetch stage.
A slot may be purged from the blockstore with clear_unconfirmed_slot().
If the slot is added back, the slot should only exist once in its'
parent SlotMeta::next_slots Vec. Prior to this change, repeated clearing
and re-adding of a slot could result in the slot existing in parent's
next_slots multiple times. The result is that if the next time the
parent slot is processed (node restart or ledger-tool-replay), slot
could be added to the queue of slots to play multiple times.
Added test that failed before change and works now as well
Fix pre-check of blockstore slts during load_bank_forks. Now iterates from starting_slot to halt_slot via slot_meta.next_slots to confirm they are connected.
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.
#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.
By doing this, we address the mutual dependency and also make the code cleaner.
In addition to thread_local -> lazy_static change, a number of thread-pools are
initialized with get_max_thread_count to achieve parity with the older code in
terms of number of validator threads.
* initial work for poh timing report service
* add poh_timing_report_service to validator
* fix comments
* clippy
* imrove test coverage
* delete record when complete
* rename shred full to slot full.
* debug logging
* fix slot full
* remove debug comments
* adding fmt trait
* derive default
* default for poh timing reporter
* better comments
* remove commented code
* fix test
* more test fixes
* delete timestamps for slot that are older than root_slot
* debug log
* record poh start end in bank reset
* report full to start time instead
* fix poh slot offset
* report poh start for normal ticks
* fix typo
* refactor out poh point report fn
* rename
* optimize delete - delete only when last_root changed
* change log level to trace
* convert if to match
* remove redudant check
* fix SlotPohTiming comments
* review feedback on poh timing reporter
* review feedback on poh_recorder
* add test case for out-of-order arrival of timing points and incomplete timing points
* refactor poh_timing_points into its own mod
* remove option for poh_timing_report service
* move poh_timing_point_sender to constructor
* clippy
* better comments
* more clippy
* more clippy
* add slot poh timing point macro
* clippy
* assert in test
* comments and display fmt
* fix check
* assert format
* revise comments
* refactor
* extrac send fn
* revert reporting_poh_timing_point
* align loggin
* small refactor
* move type declaration to the top of the module
* replace macro with constructor
* clippy: remove redundant closure
* review comments
* simplify poh timing point creation
Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>
Now that nodes correctly populate position field in coding shreds, and
first_coding_index in erasure meta, the old code to maintain backward
compatibility can be removed.
The commit is working towards changing erasure coding schema to 32:64.
Current slot stats are removed when the slot is full or every 30 seconds
if the slot is before root:
https://github.com/solana-labs/solana/blob/493a8e234/ledger/src/blockstore.rs#L2017-L2027
In order to track if the slot is ultimately marked as dead or rooted and
emit more metrics, this commit expands lifetime of SlotStats while
bounding total size of cache using an LRU eviction policy.
This PR does a refactoring on column family-related metrics reporting.
As the metric reporting is per column family basis, the PR creates
ColumnMetrics trait and move the metric reporting logic into it.
This refactoring will make future column metric reporting (such as
read PerfContext) much cleaner.
* transaction-status: Add return data to meta
* Add return data to simulation results
* Use pretty-hex for printing return data
* Update arg name, make TransactionRecord struct
* Rename TransactionRecord -> ExecutionRecord
This PR adds `--rocksdb-ledger-compression` as a hidden argument to the validator
for specifying the compression algorithm for TransactionStatus. Available compression
algorithms include `lz4`, `snappy`, `zlib`. The default value is `none`.
Experimental results show that with lz4 compression, we can achieve ~37% size-reduction
on the TransactionStatus column family, or ~8% size-reduction of the ledger store size.
This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will
pass-down this struct to LedgerColumn to allow it to perform metric reporting.
Creating a new ledger implicitly means that no other process could have
previously held access to it. Additionally, creating a new ledger
implicitly requires writing, so it follows that Primary access is
required and we can drop access type as an argument.
#### Summary of Changes
This PR further enables group by operation on storage type in blockstore_rocksdb_cfs metrics.
Such group-by allows us to further compare the performance metrics between rocks-level and
rocks-fifo.
To make things extensible, this PR introduces BlockstoreAdvancedOptions and move shred_storage_type.
All fields in BlockstoreAdvancedOptions will support group-by operation in blockstore_rocksdb_cfs.
Dependency: #23580
This PR enables blockstore to periodically report RocksDB column family properties.
The reported properties are under blockstore_rocksdb_cfs, and the properties also
support group by operation on cf_name.
#### Summary of Changes
To avoid mixing the use of different shred storage types, each shred storage type
will have its blockstore in a different directory.
This PR still keeps the RocksFifo setting hidden. The default ShredStorageType and
blockstore directory are still RocksLevel and `rocksdb`.
Will follow-up with PRs on making FIFO option public in ledger-tool and validator.
#### Test Plan
* Added a new test to verify the existence of `rocksdb-fifo` directory when FIFO compaction is used.
* Updated existing test to verify the current setting still store ledger under `rocksdb` directory.
* Manually ran ledger_cleanup_test with both level and fifo compaction and verified the resulting ledger.
* Ran a validator with this PR.
* Bump first-available block to first complete block
* Remove obsolete purges in tests (PrimaryIndex toggling no longer in use
* Check first-available block in Rpc check_slot_cleaned_up