Commit Graph

858 Commits

Author SHA1 Message Date
Yueh-Hsuan Chiang ba3d9cd325
Add LedgerColumn::multi_get() (#26354)
#### Problem
Blockstore operations such as get_slots_since() issues multiple rocksdb::get()
at once which is not optimal for performance.

#### Summary of Changes
This PR adds LedgerColumn::multi_get() based on rocksdb::batched_multi_get(),
the optimized version of multi_get() where get requests are processed in batch
to minimize read I/O.
2022-09-12 15:01:22 -07:00
Jeff Washington (jwash) 765c628546
use exit signal for acct idx bg threads (#27483) 2022-09-12 11:51:12 -07:00
Jeff Washington (jwash) f0770c199e
fix ledger tool final hash calc (#27725)
fix ledger tool final hash calc
2022-09-12 11:09:34 -07:00
behzad nouri 37296caee8
removes set_{slot,index,last_in_slot} implementations for merkle shreds (#27689)
These methods are only used in tests but invoked on a merkle shred they
will always invalidate the shred because the merkle proof will no longer
verify. As a result the shred will not sanitize and blockstore will
avoid inserting them. Their use in tests will result in spurious test
coverage because the shreds will not be ingested.

The commit removes implementation of these methods for merkle shreds.
Follow up commits will entirely remove these methods from shreds api.
2022-09-09 17:58:04 +00:00
Yueh-Hsuan Chiang ed00365101
Add ledger tool command print-file-metadata (#26790)
Add ledger-tool command print-file-metadata

#### Summary of Changes
This PR adds a ledger tool subcommand print-file-metadata.
```
USAGE:
    solana-ledger-tool print-file-metadata [FLAGS] [OPTIONS] [SST_FILE_NAME]

    Prints the metadata of the specified ledger-store file.
    If no file name is unspecified, then it will print the metadata of all ledger files
```
2022-09-06 21:46:35 -07:00
Jeff Biseda 269eb519dd
track time to coalesce entries in recv_slot_entries (#27525) 2022-09-06 16:07:17 -07:00
Yueh-Hsuan Chiang c62aef6e02
Add code comments for lowest_cleanup_slot functions (#27497)
#### Summary of Changes
Add code comments for lowest_cleanup_slot related functions to improve
the code readability for the consistency between blockstore purge logic
and the read side.
2022-09-06 16:03:42 -07:00
Tao Zhu 8bb039d08d
collect min prioritization fees when replaying sanitized transactions (#26709)
* Collect blocks' minimum prioritization fees when replaying sanitized transactions

* Limits block min-fee metrics reporting to top 10 writable accounts

* Add service thread to asynchronously update and finalize prioritization fee cache

* Add bench test for prioritization_fee_cache

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2022-08-31 08:00:55 -05:00
Yueh-Hsuan Chiang 6d070cee08
Fix the boundary inconsistency between delete_file_in_range and delete_range (#27201)
#### Problem
RocksDB's delete_range applies to [from, to) while delete_file_in_range
applies to [from, to] by default, and the rust-rocksdb api does not include
the option to make delete_file_in_range apply to [from, to).  Such inconsistency
might cause `blockstore::run_purge` to produce an inconsistent result as it
invokes both delete_range and delete_file_in_range.

#### Summary of Changes
This PR makes all our purge / delete related functions to be inclusive
on both starting and ending slots.
2022-08-30 21:38:54 -07:00
Brennan Watt dfdb422fb1
Minor shred constant cleanup (#27472)
* Minor shred constant cleanup to eliminate magic number
2022-08-30 18:53:05 -07:00
Jeff Biseda d1522fc790
coalesce entries in recv_slot_entries to target byte count (#27321) 2022-08-25 13:51:55 -07:00
Brennan Watt 5150877fb3
Attempt to skip redundant startup account verification (#26999) 2022-08-25 09:29:57 -07:00
Brennan Watt e4a7d01e10
Rust v1.63 (#27303)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

* Update quinn-udp crate

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-22 18:01:03 -07:00
Michael Vines 3f4731b37f Standardize thread names
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
2022-08-20 07:49:39 -07:00
behzad nouri c0b63351ae
recovers merkle shreds from erasure codes (#27136)
The commit
* Identifies Merkle shreds when recovering from erasure codes and
  dispatches specialized code to reconstruct shreds.
* Coding shred headers are added to recovered erasure shards.
* Merkle tree is reconstructed for the erasure batch and added to
  recovered shreds.
* The common signature (for the root of Merkle tree) is attached to all
  recovered shreds.
2022-08-19 21:07:32 +00:00
apfitzge 40b9f2f2be
slots_connected: check if the range is connected (>= ending_slot) (#27152) 2022-08-19 09:33:50 -05:00
Brennan Watt 7573000d87
Revert "Rust v1.63.0 (#27148)" (#27245)
This reverts commit a2e7bdf50a.
2022-08-19 09:19:44 +01:00
HaoranYi 4634fb944c
Use from_secs api to create duration (#27222)
use from_secs api to create duration
2022-08-18 11:06:52 -05:00
Yueh-Hsuan Chiang 6d12bb6ec3
Fix a corner-case panic in get_entries_in_data_block() (#27195)
#### Problem
get_entries_in_data_block() panics when there's inconsistency between
slot_meta and data_shred.

However, as we don't lock on reads, reading across multiple column families is
not atomic (especially for older slots) and thus does not guarantee consistency
as the background cleanup service could purge the slot in the middle.  Such
panic was reported in #26980 when the validator serves a high load of RPC calls.

#### Summary of Changes
This PR makes get_entries_in_data_block() panic only when the inconsistency
between slot-meta and data-shred happens on a slot older than lowest_cleanup_slot.
2022-08-18 02:37:19 -07:00
Brennan Watt a2e7bdf50a
Rust v1.63.0 (#27148)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-17 15:48:33 -07:00
behzad nouri c813d75944
renames size_of_erasure_encoded_slice to ShredCode::capacity (#27157)
Maintain symmetry between code and data shreds:
* ShredData::capacity -> data buffer capacity
* ShredCode::capacity -> erasure code capacity
2022-08-16 12:30:38 +00:00
behzad nouri 0e30609394
adds Shred{Code,Data}::SIZE_OF_HEADERS trait constants (#27144) 2022-08-15 19:02:32 +00:00
behzad nouri b3b57a0f07
adjusts max coding shreds per slot (#27083)
As a consequence of removing buffering when generating coding shreds:
https://github.com/solana-labs/solana/pull/25807
more coding shreds are generated than data shreds, and so
MAX_CODE_SHREDS_PER_SLOT needs to be adjusted accordingly.

The respective value is tied to ERASURE_BATCH_SIZE.
2022-08-12 18:02:01 +00:00
steviez 14d9922105
Bump rust-rocksdb to 0.19.0 tag (#26949) 2022-08-11 09:41:01 -05:00
behzad nouri ac91cdab74
removes buffering when generating coding shreds in broadcast (#25807)
Given the 32:32 erasure recovery schema, current implementation requires
exactly 32 data shreds to generate coding shreds for the batch (except
for the final erasure batch in each slot).
As a result, when serializing ledger entries to data shreds, if the
number of data shreds is not a multiple of 32, the coding shreds for the
last batch cannot be generated until there are more data shreds to
complete the batch to 32 data shreds. This adds latency in generating
and broadcasting coding shreds.

In addition, with Merkle variants for shreds, data shreds cannot be
signed and broadcasted until coding shreds are also generated. As a
result *both* code and data shreds will be delayed before broadcast if
we still require exactly 32 data shreds for each batch.

This commit instead always generates and broadcast coding shreds as soon
as there any number of data shreds available. When serializing entries
to shreds:
* if the number of resulting data shreds is less than 32, then more
  coding shreds will be generated so that the resulting erasure batch
  has the same recovery probabilities as a 32:32 batch.
* if the number of data shreds is more than 32, then the data shreds are
  split uniformly into erasure batches with _at least_ 32 data shreds in
  each batch. Each erasure batch will have the same number of code and
  data shreds.

For example:
* If there are 19 data shreds, 27 coding shreds are generated. The
  resulting 19(data):27(code) erasure batch has the same recovery
  probabilities as a 32:32 batch.
* If there are 107 data shreds, they are split into 3 batches of 36:36,
  36:36 and 35:35 data:code shreds each.

A consequence of this change is that code and data shreds indices will
no longer align as there will be more coding shreds than data shreds
(not only in the last batch in each slot but also in the intermediate
ones);
2022-08-11 12:44:27 +00:00
behzad nouri e2a2d271f2
adds number of coding shreds to broadcast metrics (#27006) 2022-08-09 13:59:40 +00:00
steviez 1d66ae0f66
Bail out of execute_batches() early for empty batches slice (#26932)
The caller of execute_batches() that assembles batches may call this
function with empty batches; we know we can bail early in this scenario.
2022-08-08 12:14:49 -05:00
Yueh-Hsuan Chiang 99ef2184cc
Delete files older than the lowest_cleanup_slot in LedgerCleanupService::cleanup_ledger (#26651)
#### Problem
LedgerCleanupService requires compactions to propagate & digest range-delete tombstones
to eventually reclaim disk space.

#### Summary of Changes
This PR makes LedgerCleanupService::cleanup_ledger delete any file whose slot-range is
older than the lowest_cleanup_slot.  This allows us to reclaim disk space more often with
fewer IOps.  Experimental results on mainnet validators show that the PR can effectively
reduce 33% to 40% ledger disk size.
2022-08-09 00:48:06 +08:00
Richard Patel 270315a7f6
transaction-status, storage-proto: add compute_units_consumed (#26528)
* transaction-status, storage-proto: add compute_units_consumed

* fix bpf test

Co-authored-by: Justin Starry <justin@solana.com>
2022-08-06 17:14:31 +00:00
Justin Starry 69598ed4c0
Refactor: Add `RuntimeConfig` field to Bank (#26946)
* Refactor: Simplify arguments for bank constructor methods

* Refactor: Add RuntimeConfig to Bank fields

* Arc wrap runtime_config

* Arc wrap all runtime config usages

* Remove Copy trait derivation from RuntimeConfig

* Remove some arc wrapping
2022-08-05 20:49:00 +01:00
Brooks Prumo 4e43aa6c18
Remove accounts data size checks in blockstore_processor (#26776) 2022-08-05 09:55:41 -04:00
Tyera Eulberg 2dca239480
Remove runtime dependency from solana-transaction-status (#26930)
* Move RewardType out of runtime

* Move collect_token_balances to solana-ledger

* Remove solana-runtime dependency
2022-08-05 00:20:27 -06:00
steviez 300666dce7
Make `solana-ledger-tool` run AccountsBackgroundService (#26914)
Prior to this change, long running commands like `solana-ledger-tool
verify` would OOM due to AccountsDb cleanup not happening.

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-04 15:44:31 -05:00
behzad nouri 403b2e4841
records num data shreds obtained from serializing entries (#26888) 2022-08-03 17:07:40 +00:00
behzad nouri 1181510531
infers erasure batches from FEC-set indices of data shreds (#26873)
data_shreds_to_coding_shreds relies on the hardcoded
MAX_DATA_SHREDS_PER_FEC_BLOCK batches of data shreds:
https://github.com/solana-labs/solana/blob/e74ad90cd/ledger/src/shredder.rs#L175-L183

This hardcoded logic is unnecessary since the shreds belonging to the
same erasure batch can be identified from run of the same FEC-set index.
2022-08-02 16:05:27 +00:00
Jeff Washington (jwash) f6e4f76dac
--halt-at-slot-store-hash-raw-data works for root bank (#26865) 2022-08-01 14:25:47 -05:00
Jeff Biseda 857be1e237
sign repair requests (#26833) 2022-07-31 15:48:51 -07:00
Jeff Washington (jwash) c7462b7a52
ledger tool verify can store debug info on hash calc (#26837) 2022-07-29 15:54:56 -05:00
Brennan Watt 467cb5def5
Concurrent slot replay (#26465)
* Concurrent replay slots

* Split out concurrent and single bank replay paths

* Sub function processing of replay results for readability

* Add feature switch for concurrent replay
2022-07-28 11:33:19 -07:00
Yueh-Hsuan Chiang dcbda2c6c5
Allow ledger tool to automatically detect the shred compaction style (#26182)
#### Problem
Ledger-tool doesn't support shred-compaction-type other than the default rocksdb level compaction.

#### Summary of Changes
This PR enables ledger-tool to automatically detect the shred-compaction-type of the specified ledger.

#### Test Plan
New ledger-tool tests are added for both level and fifo compactions.
2022-07-19 01:19:17 +08:00
Jeff Washington (jwash) 47716a5e01
async hash verify on load (#26208)
* verify accounts hash in bg on startup

* fix some tests and loading from genesis

* add extra state for when background thread has completed
2022-07-15 14:29:56 -05:00
Alexander Meißner 038da82b6f
Feature: Early verification of account modifications in `BorrowedAccount` (#25899)
* Adjusts test cases for stricter requirements.

* Removes account reset in deserialization test.

* Removes verify related test cases.

* Replicates account modification verification logic of PreAccount in BorrowedAccount.

* Adds TransactionContext::account_touched_flags.

* Adds account modification verification to the BPF ABIv0 and ABIv1 deserialization, CPI syscall and program-test.

* Replicates the total sum of all lamports verification of PreAccounts in InstructionContext

* Check that the callers instruction balance is maintained during a call / push.

* Replicates PreAccount statistics in TransactionContext.

* Disable verify() and verify_and_update() if the feature enable_early_verification_of_account_modifications is enabled.

* Moves Option<Rent> of enable_early_verification_of_account_modifications into TransactionContext::new().

* Relaxes AccountDataMeter related test cases.

* Don't touch the account if nothing changes.

* Adds two tests to trigger InstructionError::UnbalancedInstruction.

Co-authored-by: Justin Starry <justin@solana.com>
2022-07-15 09:31:34 +02:00
Tao Zhu f13b5c832d
Remove obsoleted metrics reporting to reduce lock contention on cost_model (#26608)
remove obsoleted metrics reporting to reduce lock contention on cost_model
2022-07-14 23:02:49 -05:00
Yueh-Hsuan Chiang 493108f026
Move Blockstore::blockstore_directory() to ShredStorageType (#26179)
#### Summary of Changes
Blockstore::blockstore_directory() function takes a ShredStorageType and returns a path.
As it's a ShredStorageType related function, moving it under ShredStorageType as a
member function is a better fit.

In addition, as we start doing automatic shred compaction type selection under ledger-tool,
it becomes more convenient to reuse the function and const under blockstore_options
namespace instead of blockstore.
2022-07-13 00:05:54 +08:00
Yueh-Hsuan Chiang f284bba53b
Create const &strs for rocksdb perf write operation names (#26352)
#### Summary of Changes
Define PERF_METRIC_OP_NAME_PUT and PERF_METRIC_OP_NAME_WRITE_BATCH
to replace repetitive / hard-coded operation names for report_rocksdb_write_perf.
2022-07-13 00:05:26 +08:00
Yueh-Hsuan Chiang 9985215bc8
Make report_rocksdb_read_perf() to take a operation name. (#26351)
#### Problem
report_rocksdb_read_perf() always uses the hard-coded operation name "get"

#### Summary of Changes
As we will add a new read operation -- multi_get(), report_rocksdb_read_perf()
needs to have an input parameter for operation name.
2022-07-12 07:18:49 +08:00
steviez 26ad45e723
Add log for missing start_slot in blockstore_processor (#26566)
blockstore_processor can do nothing in this case; the log might help
user realize that their snapshot and ledger are disjoint.
2022-07-11 18:11:16 -05:00
apfitzge 1c2f6a4c41
Bugfix: slots_connected snapshot_slot not full (#26506)
* slots_connected check used by ledger-tool should not require a full slot for snapshot slot

* Cleaner Result<Option<>> unwrap/default

* return false if no meta for starting slot

* Add clarifying comments
2022-07-11 13:45:12 -05:00
Nicholas Clarke ee0a40937e
Add validator argument log_messages_bytes_limit to change log truncation limit.
Add new cli argument log_messages_bytes_limit to solana-validator to control how long program logs can be before truncation
2022-07-11 10:53:18 -05:00
behzad nouri ba785cf8ab
removes erroneous uses of std::mem::swap (#26536)
All instances should be replace by std::mem::{replace,take},
or just plain assignment.
2022-07-11 11:33:15 +00:00