Commit Graph

2527 Commits

Author SHA1 Message Date
Jeff Washington (jwash) fddd162645
reserialize bank in ahv by first writing to temp file in abs (#23947) 2022-04-06 21:39:26 -05:00
Tyera Eulberg fb67ff14de
Remove replica-node crates (#24152) 2022-04-06 16:52:19 -06:00
Brooks Prumo c322842257
Replace channel with Mutex<Option> for AccountsPackage (#24013) 2022-04-06 05:47:19 -05:00
HaoranYi 302142bb25
fix typo (#24123) 2022-04-05 15:55:47 -05:00
behzad nouri db23295e1c
removes legacy weighted_shuffle and weighted_best methods (#24125)
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
https://github.com/solana-labs/solana/pull/18343
and can be replaced with WeightedShuffle.

Also, as described in:
https://github.com/solana-labs/solana/pull/13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.
2022-04-05 19:19:22 +00:00
carllin 4ea59d8cb4
Set drop callback on first root bank (#23999) 2022-04-05 13:02:33 -05:00
behzad nouri 2282571493
removes outdated and flaky test_skip_repair from retransmit-stage (#24121)
test_skip_repair in retransmit-stage is no longer relevant because
following: https://github.com/solana-labs/solana/pull/19233
repair packets are filtered out earlier in window-service and so
retransmit stage does not know if a shred is repaired or not.
Also, following turbine peer shuffle changes:
https://github.com/solana-labs/solana/pull/24080
the test has become flaky since it does not take into account how peers
are shuffled for each shred.
2022-04-05 16:02:53 +00:00
behzad nouri 2b718d00b0 removes legacy compatibility turbine peers shuffle code 2022-04-05 12:04:12 +00:00
behzad nouri d0b850cdd9 removes turbine peers shuffle patch feature 2022-04-05 12:04:12 +00:00
behzad nouri 855801cc95 removes deterministic-shred-seed feature 2022-04-05 12:04:12 +00:00
Jeff Biseda ee6bb0d5d3
track fec set turbine stats (#23989) 2022-04-04 14:44:21 -07:00
HaoranYi 6ba4e870c4
Blockstore should drop signals before validator exit (#24025)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* wip

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

* clean up import

* add mutex for signal senders in blockstore

* remove mut

* refactor: extract add signal functions

* make blockstore signal private

* let compiler infer mutex type

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-04 11:38:05 -05:00
behzad nouri 7cb3b6cbe2
demotes WeightedShuffle failures to error metrics (#24079)
Since call-sites are calling unwrap anyways, panicking seems too punitive
for our use cases.
2022-04-03 16:20:06 +00:00
HaoranYi ffa4cafe1c
Revert sequential execution of validator_exit and validator_parallel_exit tests (#24048)
* handle channel disconnect

* revert sequential execution of validator_exit and parallel_validator_exit tests
2022-04-02 10:22:47 -05:00
Yueh-Hsuan Chiang 0b5ed87220
(LedgerStore) Enable performance sampling in column family get() (#23834)
#### Summary of Changes
This PR enables RocksDB read side performance metrics to report to blockstore_rocksdb_read_perf.
The sampling rate is controlled by an env arg `SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K`,
specifies the number of perf samples for every 1000 operations.  The default value is set to 10, meaning
we will report 10 out of 1000 (or 1/100) reads.

The metrics are based on the RocksDB [PerfContext](https://github.com/facebook/rocksdb/blob/main/include/rocksdb/perf_context.h).
It includes many useful metrics including block read time, cache hit rate, and time spent on decompressing the block.
2022-04-01 13:13:32 -07:00
Pankaj Garg df4d92f9cf
Revert voting service to use UDP instead of QUIC (#24032) 2022-04-01 09:34:18 -07:00
HaoranYi 51b37f0184
Modify rpc_completed_slot_service to be non-blocking (#24007)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-03-31 16:44:23 -05:00
Jeff Washington (jwash) 9c8dad33c7
add epoch_schedule and rent_collector to hash calc (#24012) 2022-03-31 10:51:18 -05:00
Jeff Washington (jwash) da001d54e5
calculate_accounts_hash_helper uses config (#24003) 2022-03-31 09:29:45 -05:00
Jeff Washington (jwash) 125f9634fd
add hash calc config.use_write_cache (#24005) 2022-03-30 17:19:34 -05:00
HaoranYi ba770832d0
Poh timing service (#23736)
* initial work for poh timing report service

* add poh_timing_report_service to validator

* fix comments

* clippy

* imrove test coverage

* delete record when complete

* rename shred full to slot full.

* debug logging

* fix slot full

* remove debug comments

* adding fmt trait

* derive default

* default for poh timing reporter

* better comments

* remove commented code

* fix test

* more test fixes

* delete timestamps for slot that are older than root_slot

* debug log

* record poh start end in bank reset

* report full to start time instead

* fix poh slot offset

* report poh start for normal ticks

* fix typo

* refactor out poh point report fn

* rename

* optimize delete - delete only when last_root changed

* change log level to trace

* convert if to match

* remove redudant check

* fix SlotPohTiming comments

* review feedback on poh timing reporter

* review feedback on poh_recorder

* add test case for out-of-order arrival of timing points and incomplete timing points

* refactor poh_timing_points into its own mod

* remove option for poh_timing_report service

* move poh_timing_point_sender to constructor

* clippy

* better comments

* more clippy

* more clippy

* add slot poh timing point macro

* clippy

* assert in test

* comments and display fmt

* fix check

* assert format

* revise comments

* refactor

* extrac send fn

* revert reporting_poh_timing_point

* align loggin

* small refactor

* move type declaration to the top of the module

* replace macro with constructor

* clippy: remove redundant closure

* review comments

* simplify poh timing point creation

Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>
2022-03-30 09:04:49 -05:00
Jeff Washington (jwash) c24de17278
remove index hash calculation as an option (#23928) 2022-03-25 15:32:53 -05:00
HaoranYi 01af40d6b6
Fix intermittent validator_exit test failure (#23594)
* run validator_exit_test sequentially

* limit validator exit run to its own serial run subset
add 10ms delay in the validator exit tests

* fix intermittent validator exit failure

* no sleep

* undo the code move
2022-03-25 14:38:19 -05:00
ryleung-solana 6b85c2104c
Implement forwarding via TpuConnection (#23817) 2022-03-25 11:31:40 -04:00
Steven Luscher f44c8f296f
fix: thread `enforce_ulimit_nofile` config down when opening blockstore (#23925) 2022-03-25 03:13:33 -05:00
Jeff Washington (jwash) 51f5524e2f
make verify_accounts_package_hash like other hash calc (#23906) 2022-03-24 17:49:48 -05:00
Jeff Washington (jwash) 55d61023f7
document 'accounts' hash (#23907) 2022-03-24 15:58:52 -05:00
HaoranYi fedf4e984f
typo (#23910) 2022-03-24 15:21:59 -05:00
Jeff Washington (jwash) 37c36ce3fa
pass stats separately from CalcAccountsHashConfig (#23892) 2022-03-24 12:48:47 -05:00
steviez c31db81ac4
Use VoteAccountsHashMap type alias in all applicable spots (#23904) 2022-03-24 12:09:48 -05:00
ryleung-solana 82945ba973
Optimize TpuConnection and its implementations and refactor connection-cache to not use dyn in order to enable those changes (#23877) 2022-03-24 11:40:26 -04:00
Jeff Washington (jwash) 5b916961b5
HashCalc uses self.accounts_cache (#23890) 2022-03-24 10:34:28 -05:00
Jeff Washington (jwash) b22165ad69
hash calc uses self.filler_account_suffix (#23887) 2022-03-24 09:58:06 -05:00
Jeff Washington (jwash) 9022931689
calc hash uses self.num_hash_scan_passes (#23883) 2022-03-24 09:44:42 -05:00
Jeff Washington (jwash) db5d68f01f
HashCalc uses self.accounts_hash_cache_path (#23882) 2022-03-24 09:31:55 -05:00
Jeff Washington (jwash) 3e22d4b286
calc hash uses self.thread_pool_clean (#23881) 2022-03-23 20:52:38 -05:00
Jeff Washington (jwash) 9e61fe7583
add AccountsHashConfig to manage parameters (#23850) 2022-03-23 13:44:23 -05:00
HaoranYi db49b826f0
seperate blockstore metrics from window service metrics (#23871) 2022-03-23 13:38:17 -05:00
HaoranYi 7ff8ed869c
typos (#23870) 2022-03-23 13:36:55 -05:00
Jeff Washington (jwash) b1280b670a
calculate_accounts_hash_without_index takes &self (#23846)
* calculate_accounts_hash_without_index takes &self

* Update runtime/src/snapshot_package.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-03-23 11:57:32 -05:00
Justin Starry 92462ae031
Manually serialize and use `send_wire_transaction` for votes (#23826)
* Revert "core: partial versioned transaction support for voting service"

This reverts commit eb3df4c20e.

* Manually serialize vote tx before sending to TPU
2022-03-23 09:47:55 +08:00
Jon Cinque 7af48465fa
transaction-status: Add return data to meta (#23688)
* transaction-status: Add return data to meta

* Add return data to simulation results

* Use pretty-hex for printing return data

* Update arg name, make TransactionRecord struct

* Rename TransactionRecord -> ExecutionRecord
2022-03-22 23:17:05 +01:00
Trent Nelson eb3df4c20e core: partial versioned transaction support for voting service 2022-03-21 22:59:05 -06:00
HaoranYi 45a7c6edfb
Fix typos and a small refactor (#23805)
* fix typo

* remove packet_has_more_unprocessed_transactions function
2022-03-21 18:35:31 -05:00
Pankaj Garg 5d03b188c8
Use QUIC client in voting service (#23713)
* Use QUIC client in voting service

* guard quic-client usage with a flag

* add measure to time the quic client

* move time measure outside if block

* remove quic vs UDP flag from voting service
2022-03-21 09:10:16 -07:00
Tao Zhu 71ea05c176 replace nested for_each with flat_map 2022-03-18 16:37:41 -05:00
Tao Zhu 1c369fb55f Scan entire UnprocessedPacketBatches buffer to produce stake and locator of each packet 2022-03-18 16:37:41 -05:00
Yueh-Hsuan Chiang f999eef452
(LedgerStore) Rename BlockstoreAdvancedOptions to LedgerColumnOptions (#23764)
This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will
pass-down this struct to LedgerColumn to allow it to perform metric reporting.
2022-03-18 11:13:35 -07:00
Tao Zhu 56428be629 Not exposing inner cost_table to encapsulating implementation details,
making future change easier.
2022-03-18 12:58:43 -05:00
Tao Zhu 0ed23899e7 directly use compute_budget MAX_UNITS and DEFAULT_UNITS 2022-03-18 08:53:11 -05:00
Tao Zhu a4cacf3389 add deterministic default cost 2022-03-18 08:53:11 -05:00
Tao Zhu c478fe2047 add timing metrics, some renaming 2022-03-17 19:31:28 -05:00
Tao Zhu fd515097d8 leader qos part 2: add stage to find sender stake, set to packet meta 2022-03-17 19:31:28 -05:00
Stephen Akridge 976b138e76 Add tx weighting stage 2022-03-17 19:31:28 -05:00
Michael Vines 3773b753d1 Configure shrink paths during blockstore load 2022-03-15 23:08:07 -07:00
Michael Vines ab373bb1a9 Refactor new_banks_from_ledger() into load and process steps 2022-03-15 23:08:07 -07:00
Michael Vines 2da4e3eb6c Add --no-os-memory-stats-reporting 2022-03-15 17:07:40 -07:00
Michael Vines dbc62f2e28 Use consistent variable naming for DropBankService 2022-03-15 17:07:13 -07:00
Michael Vines d44f3d7216 Remove unhelpful log message 2022-03-15 17:07:13 -07:00
Tao Zhu 2d3501dff9 make upsert infallible op 2022-03-15 17:05:41 -05:00
Tao Zhu 61cead9b9b Remove injection of exit signal into cost_update_service 2022-03-15 09:58:56 -05:00
Tao Zhu eb73dacd58 harden banking tests 2022-03-15 09:58:08 -05:00
Justin Starry 8c8f9694e0
Refactor: Sanitized transaction creation (#23558)
* Refactor: SanitizedTransaction::try_create optionally computes hash

* Refactor: Add SimpleAddressLoader
2022-03-15 12:02:22 +08:00
Tyera Eulberg 102dd68a03
Rename AccountsDb plugins to Geyser plugins (#23604) 2022-03-14 19:18:46 -06:00
Michael Vines 17cc095d28 Slot warping doesn't need to be in new_banks_from_ledger 2022-03-14 15:29:58 -07:00
Michael Vines 2e7ee0f177 Tower loading doesn't need to be in new_banks_from_ledger 2022-03-14 15:29:58 -07:00
Michael Vines 390dc24608 Create leader schedule before processing blockstore 2022-03-14 15:29:58 -07:00
Michael Vines 543d5d4a5d Reduce new_banks_from_ledger arguments 2022-03-14 15:29:58 -07:00
Michael Vines 115f376465 Factor out bank_forks_utils::load_bank_forks() 2022-03-14 15:29:58 -07:00
Michael Vines c2ce152be8 Inline do_process_blockstore_from_root 2022-03-14 15:29:58 -07:00
Tao Zhu 5ea6a1e500 code review 2022-03-14 13:14:27 -05:00
Tao Zhu 8590911b0a Replace type alias with newtype for UnprocesedPacketBatches 2022-03-14 13:14:27 -05:00
Brooks Prumo 7758c32035
Banking Stage drops transactions that'll exceed the total account data size limit (#23537) 2022-03-13 15:58:57 +00:00
Yueh-Hsuan Chiang 1e20bd8f9a
(LedgerStore) Include storage type as a tag in RocksDB metric reporting (#23523)
#### Summary of Changes
This PR further enables group by operation on storage type in blockstore_rocksdb_cfs metrics.
Such group-by allows us to further compare the performance metrics between rocks-level and
rocks-fifo.

To make things extensible, this PR introduces BlockstoreAdvancedOptions and move shred_storage_type. 
All fields in BlockstoreAdvancedOptions will support group-by operation in blockstore_rocksdb_cfs.

Dependency: #23580
2022-03-11 15:17:34 -08:00
Tao Zhu 35d1235ed0
- move `unprocessed_packet_batches` from `BankingStage` to its own (#23508)
module
- deserialize packets during receving and buffering
2022-03-10 18:47:46 +00:00
carllin 588414a776
Report even if slot begins and ends in process_buffered_packets() (#23549) 2022-03-09 23:42:35 -05:00
Tao Zhu f68c5a274d remove persist_cost_table code 2022-03-09 21:05:47 -07:00
Tao Zhu 9f71958d7d Patch validator from loading persisted program costs 2022-03-09 21:05:47 -07:00
sakridge 7a9884c831
Quic limit connections (#23283)
* quic server limit connections

* bump per_ip

* Review comments

* Make the connections per port
2022-03-09 10:52:31 +01:00
Carl Lin 5a0cd05866 Revert "- estimate a program cost as 2 standard deviation above mean"
This reverts commit a25ac1c988.
2022-03-08 17:18:44 -08:00
Carl Lin 9acbfa5eb1 Revert "use EMA in place of Welford"
This reverts commit 6587dbfa47.
2022-03-08 17:18:44 -08:00
Carl Lin c878c9e2cb Revert "1. Persist to blockstore less frequently;"
This reverts commit 7aa1fb4e24.
2022-03-08 17:18:44 -08:00
Carl Lin 0a17edcc1f Revert "fix tests after merge"
This reverts commit ba2d83f580.
2022-03-08 17:18:44 -08:00
Michael Vines b719d6a2ad `solana-validator set-identity` no longer writes a tower file unnecessarily 2022-03-08 15:34:23 -08:00
Justin Starry 3114c199bd
Add RPC support for versioned transactions (#22530)
* Add RPC support for versioned transactions

* fix doc tests

* Add rpc test for versioned txs

* Switch to preflight bank
2022-03-08 15:20:34 +08:00
HaoranYi 181fffb916
rename status filename to be consistent (#23501) 2022-03-07 17:34:35 +00:00
Yueh-Hsuan Chiang b8b7163b66
(Ledger Store) Report RocksDB Column Family Metrics (#22503)
This PR enables blockstore to periodically report RocksDB column family properties.
The reported properties are under blockstore_rocksdb_cfs, and the properties also
support group by operation on cf_name.
2022-03-05 16:13:03 -08:00
Yueh-Hsuan Chiang 62d2a4cd88
Make ShredStorageType::RocksLevel public (#23272)
#### Summary of Changes
This PR adds two hidden arguments to the validator that allow users to use RocksDB's FIFO compaction for storing shreds.

        --shred-storage <SHRED_STORAGE>
            EXPERIMENTAL: Controls how RocksDB compacts shreds.  *WARNING*: You will lose your ledger data
            when you switch between options. Possible values are: 'level': stores shreds using RocksDB's default (level)
            compaction. 'fifo': stores shreds under RocksDB's FIFO compaction. This option is more efficient on
            disk-write-bytes of the ledger store. [default: level]  [possible values: level, fifo]

        --shred-storage-size <SHRED_STORAGE_SIZE_BYTES>
            The shred storage size in bytes. The suggested value is 50% of your ledger storage size in bytes. [default:
            268435456000]
2022-03-03 12:43:58 -08:00
Jeff Washington (jwash) 26aa18b3f3
fmt (#23448) 2022-03-02 11:54:58 -06:00
HaoranYi 41f78b9925
small optimization. use shift for pow of 2. (#22975) 2022-03-02 09:11:12 -06:00
HaoranYi 8de88d0a55
Refactor packet_threshold adjustment code into its own struct (#23216)
* refactor packet_threshold adjustment code into own struct and add unittest for it

* fix a typo in error message

* code review feedbacks

* another code review feedback

* Update core/src/ancestor_hashes_service.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* share packet threshold with repair service (credit to carl)

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-03-02 09:09:06 -06:00
HaoranYi 86e2f728c3
Fix a batch limits bug in banking (#23327)
* add thread index in thread name for debugging

* fix batch_limit

* use NUM_VOTE_THREAD instead of hardcoded number (credit to carllin)
2022-03-02 09:08:08 -06:00
Jeff Biseda c69e3b73ff
bench get_retransmit_peers (#23292) 2022-03-01 19:10:29 -08:00
Brooks Prumo 533eca3b4c
Simplify replay_blockstore_into_bank() (#23282) 2022-02-25 06:57:04 -06:00
Trent Nelson d4292774c5 checks 2022-02-25 08:05:28 +00:00
Justin Starry d0e85c293f
Fix rustfmt check (#23296) 2022-02-23 16:38:53 +08:00
Gavin Chan 20d031e2b8
Refactor ExecuteTimings w/ enum-indexed array (#23085) 2022-02-22 14:46:56 -08:00
Tyera Eulberg 7e08ae1d0c
Revert "Add simulation detection countermeasure (#22880)" (#23261)
This reverts commit c42b80f099.
2022-02-21 21:15:37 +00:00
buffalu 70ebab2c82
Add rustfmt.toml and `cargo fmt` (#23238)
* fmt

* formatted

Co-authored-by: Lucas B <buffalu@jito.network>
2022-02-19 13:32:29 +08:00
carllin 619335df1a
Add execute timings (#23097) 2022-02-17 01:14:32 -05:00