This is a bugfix.
When `load_frozen_forks()` called `snapshot_bank()`, the accounts hash
was not computed. So then when a snapshot archive was eventually
created, its hash would be all 1s, which is clearly wrong. The fix is
to update the bank's accounts hash before taking the bank snapshot.
* Add TransactionMemos column family
* Traitify extract_memos
* Write TransactionMemos in TransactionStatusService
* Populate memos from column
* Dedupe and add unit test
Some of the blockstore_processor tests weren't using
`test_process_blockstore()`, but could. I updated those tests, and also
changed to using `..` for ignoring, instead of `_`, since I'll be adding
another return value here shortly (thus reducing the need to change
these again).
```text
error: all if blocks contain the same code at the end
--> ledger/src/blockstore.rs:3473:5
|
3473 | / Ok(insert_map.get(&slot).unwrap().clone())
3474 | | }
| |_____^
|
= note: `-D clippy::branches-sharing-code` implied by `-D warnings`
= note: The end suggestion probably needs some adjustments to use the expression result correctly
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#branches_sharing_code
help: consider moving the end statements out like this
|
3473 ~ }
3474 + Ok(insert_map.get(&slot).unwrap().clone())
|
error: could not compile `solana-ledger` due to previous error
```
#### Problem
Snapshot names are overloaded, and there are multiple terms that mean the same thing. This is confusing. Here's a list of ones in the codebase that I've found:
```
- snapshot_dir
- snapshots_dir
- snapshot_path
- snapshot_output_dir
- snapshot_package_output_path
- snapshot_archives_dir
```
#### Summary of Changes
For all the ones that are about the directory where snapshot archives are stored, ensure they are `snapshot_archives_dir`. For the ones about the (bank) snapshots directory, set to `bank_snapshots_dir`.
Co-authored-by: Michael Vines <mvines@gmail.com>
* increase block_cost_max by 10 times to accommodate updated account read and write costs; Otherwise during performace test, banking_stage will start reject and retry transactions in next block, result regressed performance.
* fix typo
Shreds recovered from erasure codes have not been received from turbine
and have not been retransmitted to other nodes downstream. This results
in more repairs across the cluster which is slower.
This commit channels through recovered shreds to retransmit stage in
order to further broadcast the shreds to downstream nodes in the tree.
* Handle cleaning zero-lamport accounts
Handle cleaning zero-lamport accounts in slots higher than the last full
snapshot slot. This is part of the Incremental Snapshot work.
Fixes#18825
* added realtime cost checking logic to reject block that would exceed max limit:
- defines max limits at block_cost_limits.rs
- right after each bath's execution, accumulate its cost and check again
limit, return error if limit is exceeded
* update abi that changed due to adding additional TransactionError
* To avoid counting stats mltiple times, only accumulate execute-timing when a bank is completed
* gate it by a feature
* move cost const def into block_cost_limits.rs
* redefine the cost for signature and account access, removed signer part as it is not well defined for now
* check if per_program_timings of execute_timings before sending
While reviewing PR #18565, as issue was brought up to refactor some code
around verifying the bank after rebuilding from snapshots. A new
top-level function has been added to get the latest snapshot archives
and load the bank then verify. Additionally, new tests have been
written and existing tests have been updated to use this new function.
Fixes#18973
While resolving the issue, it became clear there was some additional
low-hanging fruit this change enabled. Specifically, the functions
`bank_to_xxx_snapshot_archive()` now return their respective
`SnapshotArchiveInfo`. And on the flip side,
`bank_from_snapshot_archives()` now takes `SnapshotArchiveInfo`s instead
of separate paths and archive formats. This bundling simplifies bank
rebuilding.
This commit adds high-level functions for creating and loading-from
incremental snapshots, plus all low-level functions required to perform
those tasks. This commit **does not** add taking incremental snapshots
as part of a running validator, nor starting up a node with an
incremental snapshot; just laying ground work.
Additionally, `snapshot_utils` and `serde_snapshot` have been
refactored to use a common code paths for the different snapshots.
Also of note, some renaming has happened:
1. Snapshots are now either `full_` or `incremental_` throughout the
codebase. If not specified, the code applies to both.
2. Bank snapshots now are called "bank snapshots"
(before they were called "slot snapshots", "bank snapshots", or
just "snapshots"). The one exception is within `Bank`, where they
are still just "snapshots", because they are already "bank
snapshots".
3. Snapshot archives now have `_archive` in the code. This
should clear up an ambiguity between bank snapshots and snapshot
archives.
* Move transaction sanitization earlier in the pipeline
* Renamed HashedTransaction to SanitizedTransaction
* Implement deref for sanitized transaction
* bring back process_transactions test method
* Use sanitized transactions for cost model calculation
* Fix link target in doc comment
* Fix formatting of log examples in process_instruction
* Fix doc markdown in solana-gossip
* Fix doc markdown in solana-runtime
* Escape square braces in doc comments to avoid warnings
* Surround 'account references' doc items in code spans to avoid warnings
* Fix code block in loader_upgradeable_instruction
* Fix doctest for loader_upgradable_instruction
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`
* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup
* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
1. Added both options for measuring space usage using total accounts usage and for individual store shrink ratio using an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions added
4. The default implementations is kept at 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true
Fixes#17544
* Create solana-poh crate
* Move BigTableUploadService to solana-ledger
* Add solana-rpc to workspace
* Move dependencies to solana-rpc
* Move remaining rpc modules to solana-rpc
* Single use statement solana-poh
* Single use statement solana-rpc
* Update rocksdb to v0.16.0
* Promote the infrequent and important log to info!
* Force background compaction by ttl without manual compaction
* Fix test
* Support no compaction mode in test_ledger_cleanup_compaction
* Fix comment
* Make compaction_interval customizable
* Avoid major compaction with periodic filtering...
* Adress lazy_static, special cfs and range check
* Clean up a bit and add comment
* Add comment
* More comments...
* Config code cleanup
* Add comment
* Use .conflicts_with()
* Nullify unneeded delete_range ops for special CFs
* Some clean ups
* Clarify the locking intention
* Ensure special CFs' consistency with PurgeType::CompactionFilter
* Fix comment
* Fix bad copy paste
* Fix various types...
* Don't use tuples
* Add a unit test for compaction_filter
* Fix typo...
* Remove flag and just use new behavior always
* Fix wrong condition negation...
* Doc. about no set_last_purged_slot in purge_slots
* Write a test and fix off-by-one bug....
* Apply suggestions from code review
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
* Follow up to github review suggestions
* Fix line-wrapping
* Fix conflict
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
Refactor a few functions that are on the load-from-snapshot path, to facilitate
adding in incremental snapshots more easily.
Additionally, add some tests and doc comments.
* Add BlockHeight CF to blockstore
* Rename CacheBlockTimeService to be more general
* Cache block-height using service
* Fixup previous proto mishandling
* Add block_height to block structs
* Add block-height to solana block
* Fallback to BankForks if block time or block height are not yet written to Blockstore
* Add docs
* Review comments
* Add blockstore-root-scan for api nodes on boot
* Ensure cluster-confirmed root and parents are set as root in blockstore in load_frozen_forks()
* Plumb rpc-scan-and-fix-roots validator flag
* Upgrade Rust to 1.52.0
update nightly_version to newly pushed docker image
fix clippy lint errors
1.52 comes with grcov 0.8.0, include this version to script
* upgrade to Rust 1.52.1
* disabling Serum from downstream projects until it is upgraded to Rust 1.52.1
* Require that blockstore block-time only be recognized slot, instead of root
* Move cache_block_time to after Bank freeze
* Single use statement
* Pass transaction_status_sender by reference
* Remove unnecessary slot-existence check before caching block time altogether
* Move block-time existence check into Blockstore::cache_block_time, Blockstore no longer needed in blockstore_processor helper
CodingShredHeader.position is equal to
ShredCommonHeader.index - ShredCommonHeader.fec_set_index
and is so redundant. The extra position field can add bugs if not
consistent with index and fec_set_index.
Strip the zero-padding off of data shreds before insertion into blockstore
Co-authored-by: Stephen Akridge <sakridge@gmail.com>
Co-authored-by: Nathan Hawkins <utsl@utsl.org>
Number of parity coding shreds is always less than the number of data
shreds in FEC blocks:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719
Data shreds are batched in chunks of 32 shreds each:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714
However the very last batch of data shreds in a slot can be small, in
which case the loss rate can be exacerbated.
This commit expands the number of coding shreds in the last FEC block in
slots to: 64 - number of data shreds; so that FEC blocks are always 64
data and parity coding shreds each.
As a consequence of this, the last FEC block has more parity coding
shreds than data shreds. So for some shred indices we will have a coding
shred but no data shreds. This should not cause any kind of overlapping
FEC blocks as in:
https://github.com/solana-labs/solana/pull/10095
since this is done only for the very last batch in a slot, and the next
slot will reset the shred index.
* Track transaction check time separately from account loads
* banking packet process metrics
* Remove signature clone in status cache lookup
* Reduce allocations when converting packets to transactions
* Add blake3 hash of transaction messages in status cache
* Bug fixes
* fix tests and run fmt
* Address feedback
* fix simd tx entry verification
* Fix rebase
* Feedback
* clean up
* Add tests
* Remove feature switch and fall back to signature check
* Bump programs/bpf Cargo.lock
* clippy
* nudge benches
* Bump `BankSlotDelta` frozen ABI hash`
* Add blake3 to sdk/programs/Cargo.lock
* nudge bpf tests
* short circuit status cache checks
Co-authored-by: Trent Nelson <trent@solana.com>
If the vector is pinned and has a recycler, From<PinnedVec>
implementation of Vec should clone (instead of consuming) the underlying
vector so that the next allocation of a PinnedVec will recycle an
already pinned one.
* Update blockstore method to allow return of unfinalized signature
* Support confirmed sigs in getConfirmedSignaturesForAddress2
* Add deprecated comments
* Update docs
* Enable confirmed transaction-history in cli
* Return real confirmation_status; fill in not-yet-finalized block time if possible
* Plumb allow-dead-slots to ledger-tool verify
* ledger-tool cleanup and add some useful missing args
Print root slots and how many unrooted past last root.
* Update tonic & prost, and regenerate proto
* Reignore doc code
* Revert pull #14367, but pin tokio to v0.2 for jsonrpc
* Bump backoff and goauth -> and therefore tokio
* Bump tokio in faucet, net-utils
* Bump remaining tokio, plus tarpc
* Add validator flag to opt in to cpi and logs storage
* Default TestValidator to opt-in; allow using in multinode-demo
* No clone
Co-authored-by: Carl Lin <carl@solana.com>
* add block_time to get_confirmed_signatures_for_address2 and protobuf implementation for tx_by_addr
* add tests for convert
* update cargo lock
* run cargo format after rebase
* introduce legacy TransactionByAddrInfo
* move LegacyTransactionByAddrInfo back to storage-bigtable
* feat: store pre / post token balances
* move helper functions into separate include
* move token balance functionality to transaction-status crate
* fix blockstore processor test
* fix bigtable legacy test
* add caching to decimals
* Adds a CLI option to the validator to enable just-in-time compilation of BPF.
* Refactoring to use bpf_loader_program instead of feature_set to pass JIT flag from the validator CLI to the executor.
Gossip and other places repeatedly de-serialize vote-state stored in
vote accounts. Ideally the first de-serialization should cache the
result.
This commit adds new VoteAccount type which lazily de-serializes
VoteState from Account data and caches the result internally.
Serialize and Deserialize traits are manually implemented to match
existing code. So, despite changes to frozen_abi, this commit should be
backward compatible.
* Fix fragile tests in prep of stake rewrite pr
* Restore BOOTSTRAP_VALIDATOR_LAMPORTS where appropriate
* Further clean up
* Further clean up
* Aligh with other call site change
* Remove false warn!
* fix ci!
* Fix slow/stuck unstaking due to toggling in epoch
* nits
* nits
* Add stake_program_v2 feature status check to cli
Co-authored-by: Tyera Eulberg <tyera@solana.com>
ClusterInfo::process_packets handles incoming packets in a thread_pool:
https://github.com/solana-labs/solana/blob/87311cce7/core/src/cluster_info.rs#L2118-L2134
However, profiling runtime shows that threads are not well utilized and
a lot of the processing is done sequentially.
This commit redistributes the work done in parallel. Testing on a gce
cluster shows 20%+ improvement in processing gossip packets with much
smaller variations.
* Add Blockstore protobuf cf type
* Add Rewards message to proto and make generated pub
* Convert Rewards cf to ProtobufColumn
* Add bench
* Adjust tags
* Move solana proto definitions and conversion methods to new crate
* Move timestamp helper to sdk
* Add Bank method for getting timestamp estimate
* Return sysvar info from Bank::clock
* Add feature-gated timestamp correction
* Rename unix_timestamp method to be more descriptive
* Review comments
* Add timestamp metric
* introduce store program logs in blockstore / bigtable
* fix test, transaction logs created for successful transactions
* fix test for legacy bincode implementation around log_messages
* only api nodes should record logs
* truncate transaction logs to 100KB
* refactor log truncate for improved coverage
* Add blockstore column to store performance sampling data
* introduce timer and write performance metrics to blockstore
* introduce getRecentPerformanceSamples rpc
* only run on rpc nodes enabled with transaction history
* add unit tests for get_recent_performance_samples
* remove RpcResponse from rpc call
* refactor to use Instant::now and elapsed for timer
* switch to root bank and ensure not negative subraction
* Add PerfSamples to purge/compaction
* refactor to use Instant::now and elapsed for timer
* switch to root bank and ensure not negative subraction
* remove duplicate constants
Co-authored-by: Tyera Eulberg <tyera@solana.com>
* Save/restore Tower
* Avoid unwrap()
* Rebase cleanups
* Forcibly pass test
* Correct reconcilation of votes after validator resume
* d b g
* Add more tests
* fsync and fix test
* Add test
* Fix fmt
* Debug
* Fix tests...
* save
* Clarify error message and code cleaning around it
* Move most of code out of tower save hot codepath
* Proper comment for the lack of fsync on tower
* Clean up
* Clean up
* Simpler type alias
* Manage tower-restored ancestor slots without banks
* Add comment
* Extract long code blocks...
* Add comment
* Simplify returned tuple...
* Tweak too aggresive log
* Fix typo...
* Add test
* Update comment
* Improve test to require non-empty stray restored slots
* Measure tower save and dump all tower contents
* Log adjust and add threshold related assertions
* cleanup adjust
* Properly lower stray restored slots priority...
* Rust fmt
* Fix test....
* Clarify comments a bit and add TowerError::TooNew
* Further clean-up arround TowerError
* Truly create ancestors by excluding last vote slot
* Add comment for stray_restored_slots
* Add comment for stray_restored_slots
* Use BTreeSet
* Consider root_slot into post-replay adjustment
* Tweak logging
* Add test for stray_restored_ancestors
* Reorder some code
* Better names for unit tests
* Add frozen_abi to SavedTower
* Fold long lines
* Tweak stray ancestors and too old slot history
* Re-adjust error conditon of too old slot history
* Test normal ancestors is checked before stray ones
* Fix conflict, update tests, adjust behavior a bit
* Fix test
* Address review comments
* Last touch!
* Immediately after creating cleaning pr
* Revert stray slots
* Revert comment...
* Report error as metrics
* Revert not to panic! and ignore unfixable test...
* Normalize lockouts.root_slot more strictly
* Add comments for panic! and more assertions
* Proper initialize root without vote account
* Clarify code and comments based on review feedback
* Fix rebase
* Further simplify based on assured tower root
* Reorder code for more readability
Co-authored-by: Michael Vines <mvines@gmail.com>
* Check bank capitalization
* Simplify and unify capitalization calculation
* Improve and add tests
* Avoid overflow and inhibit automatic restart
* Fix test
* Tweak checked sum for cap. and add tests
* Fix broken build after merge conflicts..
* Rename to ClusterType
* Rename confusing method
* Clarify comment
* Verify cap. in rent and inflation tests
Co-authored-by: Stephen Akridge <sakridge@gmail.com>
* Add blockstore column to cache block times
* Add method to cache block time
* Add service to cache block time
* Update rpc getBlockTime to use new method, and refactor blockstore slightly
* Return block_time with confirmed block, if available
* Add measure and warning to cache-block-time
* Prevent unbound memory growth by blockstore_processor
* Promote log to info! considering infrequency
* Exclude the time of freeing from interval...
* Skip not-shrinkable slots even if forced
* Add comment
* Implement Debug for MessageProcessor
* Switch from delta-based gating to whole-set gating
* Remove dbg!
* Fix clippy
* Clippy
* Add test
* add loader to stable operating mode at proper epoch
* refresh_programs_and_inflation after ancestor setup
* Callback via snapshot; avoid account re-add; Debug
* Fix test
* Fix test and fix the past history
* Make callback management stricter and cleaner
* Fix test
* Test overwrite and frozen for native programs
* Test epoch callback with genesis-programs
* Add assertions for parent bank
* Add tests and some minor cleaning
* Remove unsteady assertion...
* Fix test...
* Fix DOS
* Skip ensuring account by dual (whole/delta) gating
* Fix frozen abi implementation...
* Move compute budget constatnt init back into bank
Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
* Submit a timestamp for every vote
* Submit at most one vote timestamp per second
* Submit a timestamp for every new vote
Co-authored-by: Tyera Eulberg <tyera@solana.com>
* Refactor bigtable apis to accept start and end keys
* Make helper fn to deserialize cell data
* Refactor get_confirmed_signatures_for_address to use get_row_data range
* Add until param to get_confirmed_signatures_for_address
* Add until param to blockstore api
* Plumb until through client/cli
* Simplify client params
* Freeze address-signature index in the middle of slot to show failure case
* Secondary filter on signature
* Use AddressSignatures iterator instead of manually decrementing slots
* Remove unused method
* Add metrics
* Add transaction-status-index doccumentation
* Add panicking test
* Add failing test: fresh transaction-status column shouldn't point at valid root 0
* Prevent transaction status match outside of primary-index bounds
* Initialize transaction-status and address-signature primer entries with Slot::MAX
* Revert "Add failing test: fresh transaction-status column shouldn't point at valid root 0"
This reverts commit cbad2a9fae22e5531e3b4ff1b0a9d6a223826c71.
* Revert "Initialize transaction-status and address-signature primer entries with Slot::MAX"
This reverts commit ffaeac0669d0cbe18dd68b5ce177e15a92360b72.
* Skip and warn for hard-forks which are less than the start slot
Option is used during a restart, but then after the restart is
complete, then the option is not needed if the starting slot
is past the hard-fork since the hard-fork should already be
in the snapshot it started from.
* Update ledger/src/blockstore_processor.rs
Co-authored-by: Michael Vines <mvines@gmail.com>
Co-authored-by: Michael Vines <mvines@gmail.com>