Most nodes in the cluster receive the same shred from two different
nodes: parent, and the first node of their neighborhood:
https://github.com/solana-labs/solana/blob/a8c695ba5/core/src/cluster_nodes.rs#L178-L197
Because of the erasure codings, half of the shreds are already
redundant. So this redundant propagation path will only add extra
overhead.
Additionally the very first node of the broadcast tree has 2x fanout
(i.e. 400 nodes) which adds too much load at one node.
This commit simplifies the broadcast tree by dropping the redundant
propagation path and removing the 2x fanout at root node.
- Add write-lock-contention option, replacing same_payer
- write-lock-contention also has a same-batch-only value, where
contention happens only inside batches, not between them
- Rename num-threads to batches-per-iteration, which is closer to what
it is actually doing.
- Add num-banking-threads as a new option
- Rename packets-per-chunk to packets-per-batch, because this is closer
to what's happening; and it was previously confusing that num-chunks
had little to do with packets-per-chunk.
Example output for a iterations=100 and a permutation of inputs:
contention,threads,batchsize,batchcount,tps
none, 3,192, 4,65290.30
none, 4,192, 4,77358.06
none, 5,192, 4,86436.65
none, 3, 12,64,43944.57
none, 4, 12,64,65852.15
none, 5, 12,64,70674.37
same-batch-only,3,192, 4,3928.21
same-batch-only,4,192, 4,6460.15
same-batch-only,5,192, 4,7242.85
same-batch-only,3, 12,64,11377.58
same-batch-only,4, 12,64,19582.79
same-batch-only,5, 12,64,24648.45
full, 3,192, 4,3914.26
full, 4,192, 4,2102.99
full, 5,192, 4,3041.87
full, 3, 12,64,11316.17
full, 4, 12,64,2224.99
full, 5, 12,64,5240.32
1. use was_executed to correctly identify transactions requires cost adjustment;
2. add function to specifically handle executino cost adjustment without have to copy accounts
* Increase connection timeouts
* Bump quic connection cache to 1024
* Use constant for quic connection timeout and add warm cache service
* Fixes to QUIC warmup service
* fix check failure
* fixes after rebase
* fix timeout test
Co-authored-by: Pankaj Garg <pankaj@solana.com>
Benchmarks show roughly a 6% improvement. The impact could be more
significant when transactions need to be retried a lot.
after patch:
{'name': 'banking_bench_total', 'median': '72767.43'}
{'name': 'banking_bench_tx_total', 'median': '80240.38'}
{'name': 'banking_bench_success_tx_total', 'median': '72767.43'}
test bench_banking_stage_multi_accounts
... bench: 6,137,264 ns/iter (+/- 1,364,111)
test bench_banking_stage_multi_programs
... bench: 10,086,435 ns/iter (+/- 2,921,440)
before patch:
{'name': 'banking_bench_total', 'median': '68572.26'}
{'name': 'banking_bench_tx_total', 'median': '75704.75'}
{'name': 'banking_bench_success_tx_total', 'median': '68572.26'}
test bench_banking_stage_multi_accounts
... bench: 6,521,007 ns/iter (+/- 1,926,741)
test bench_banking_stage_multi_programs
... bench: 10,526,433 ns/iter (+/- 2,736,530)
- don't store pending tx signatures and costs in CostTracker
- apply tx costs to global state immediately again
- go from commit_or_cancel to update_or_remove, where the cost tracker
is either updated with the true costs for successful tx, or the costs
of a retryable tx is removed
- move the function into qos_service and hold the cost tracker lock for
the whole loop
* panic when test timeout
* nonblocking send when when droping banks
* debug log
* timeout for tvu
* unused varaible
* timeout for tpu
* Revert "debug log"
This reverts commit da780a3301a51d7c496141a85fcd35014fe6dff5.
* add timeout const
* fix typo
* Revert "nonblocking send when when droping banks".
I will create another pull request for this.
This reverts commit 088c98ec0facf825b5eca058fb860deba6d28888.
* Update core/src/tpu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/tpu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/tvu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/tvu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/validator.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
test_skip_repair in retransmit-stage is no longer relevant because
following: https://github.com/solana-labs/solana/pull/19233
repair packets are filtered out earlier in window-service and so
retransmit stage does not know if a shred is repaired or not.
Also, following turbine peer shuffle changes:
https://github.com/solana-labs/solana/pull/24080
the test has become flaky since it does not take into account how peers
are shuffled for each shred.
#### Summary of Changes
This PR enables RocksDB read side performance metrics to report to blockstore_rocksdb_read_perf.
The sampling rate is controlled by an env arg `SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K`,
specifies the number of perf samples for every 1000 operations. The default value is set to 10, meaning
we will report 10 out of 1000 (or 1/100) reads.
The metrics are based on the RocksDB [PerfContext](https://github.com/facebook/rocksdb/blob/main/include/rocksdb/perf_context.h).
It includes many useful metrics including block read time, cache hit rate, and time spent on decompressing the block.
* initial work for poh timing report service
* add poh_timing_report_service to validator
* fix comments
* clippy
* imrove test coverage
* delete record when complete
* rename shred full to slot full.
* debug logging
* fix slot full
* remove debug comments
* adding fmt trait
* derive default
* default for poh timing reporter
* better comments
* remove commented code
* fix test
* more test fixes
* delete timestamps for slot that are older than root_slot
* debug log
* record poh start end in bank reset
* report full to start time instead
* fix poh slot offset
* report poh start for normal ticks
* fix typo
* refactor out poh point report fn
* rename
* optimize delete - delete only when last_root changed
* change log level to trace
* convert if to match
* remove redudant check
* fix SlotPohTiming comments
* review feedback on poh timing reporter
* review feedback on poh_recorder
* add test case for out-of-order arrival of timing points and incomplete timing points
* refactor poh_timing_points into its own mod
* remove option for poh_timing_report service
* move poh_timing_point_sender to constructor
* clippy
* better comments
* more clippy
* more clippy
* add slot poh timing point macro
* clippy
* assert in test
* comments and display fmt
* fix check
* assert format
* revise comments
* refactor
* extrac send fn
* revert reporting_poh_timing_point
* align loggin
* small refactor
* move type declaration to the top of the module
* replace macro with constructor
* clippy: remove redundant closure
* review comments
* simplify poh timing point creation
Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>
* run validator_exit_test sequentially
* limit validator exit run to its own serial run subset
add 10ms delay in the validator exit tests
* fix intermittent validator exit failure
* no sleep
* undo the code move
* Revert "core: partial versioned transaction support for voting service"
This reverts commit eb3df4c20e.
* Manually serialize vote tx before sending to TPU
* transaction-status: Add return data to meta
* Add return data to simulation results
* Use pretty-hex for printing return data
* Update arg name, make TransactionRecord struct
* Rename TransactionRecord -> ExecutionRecord
This PR adds `--rocksdb-ledger-compression` as a hidden argument to the validator
for specifying the compression algorithm for TransactionStatus. Available compression
algorithms include `lz4`, `snappy`, `zlib`. The default value is `none`.
Experimental results show that with lz4 compression, we can achieve ~37% size-reduction
on the TransactionStatus column family, or ~8% size-reduction of the ledger store size.
* Use QUIC client in voting service
* guard quic-client usage with a flag
* add measure to time the quic client
* move time measure outside if block
* remove quic vs UDP flag from voting service
This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will
pass-down this struct to LedgerColumn to allow it to perform metric reporting.