2968 Commits

Author SHA1 Message Date
behzad nouri 705ea53353
moves sign_shred and new_coding_shred_header out of Shredder (#24487) 2022-04-19 20:00:05 +00:00
Tao Zhu 94b0186a96
Cost model tracks builtins and bpf programs separately (#24468)
* Cost model tracks builtins and bpf programs separately (enables adjusting block cost by actual bpf programs execution costs)

* Address reviews: expand test; add metrics stat
2022-04-19 13:25:47 -05:00
behzad nouri 3bbfaae7b6
moves shred stats to a separate file (#24484) 2022-04-19 18:25:09 +00:00
Jeff Washington (jwash) d9d0dad258
report swap mem as bytes like other metrics (#24455) 2022-04-19 10:03:25 -05:00
behzad nouri 039488b562
drops redundant turbine propagation path (#24351)
Most nodes in the cluster receive the same shred from two different
nodes: parent, and the first node of their neighborhood:
https://github.com/solana-labs/solana/blob/a8c695ba5/core/src/cluster_nodes.rs#L178-L197

Because of erasure coding, half of the shreds are already redundant,
so this second propagation path only adds extra overhead.

Additionally, the very first node of the broadcast tree has 2x fanout
(i.e. 400 nodes), which puts too much load on a single node.

This commit simplifies the broadcast tree by dropping the redundant
propagation path and removing the 2x fanout at root node.
2022-04-19 00:11:29 +00:00
behzad nouri 1d50832389
replaces counters with datapoints in gossip metrics (#24451) 2022-04-18 23:14:59 +00:00
Jason Davis c2f7f2fff8 Remove redundant epoch_schedule from AccountsPackage 2022-04-18 11:57:40 -05:00
Jason Davis 5472d2e605 Removing redundant EpochSchedule param from fns 2022-04-18 11:57:40 -05:00
Christian Kamm d2c6c04d3e banking-bench: Add and rearrange options
- Add write-lock-contention option, replacing same_payer
- write-lock-contention also has a same-batch-only value, where
  contention happens only inside batches, not between them
- Rename num-threads to batches-per-iteration, which is closer to what
  it is actually doing.
- Add num-banking-threads as a new option
- Rename packets-per-chunk to packets-per-batch, because this is closer
  to what's happening; and it was previously confusing that num-chunks
  had little to do with packets-per-chunk.

Example output for iterations=100 and a permutation of inputs:

contention,threads,batchsize,batchcount,tps
none,           3,192, 4,65290.30
none,           4,192, 4,77358.06
none,           5,192, 4,86436.65
none,           3, 12,64,43944.57
none,           4, 12,64,65852.15
none,           5, 12,64,70674.37
same-batch-only,3,192, 4,3928.21
same-batch-only,4,192, 4,6460.15
same-batch-only,5,192, 4,7242.85
same-batch-only,3, 12,64,11377.58
same-batch-only,4, 12,64,19582.79
same-batch-only,5, 12,64,24648.45
full,           3,192, 4,3914.26
full,           4,192, 4,2102.99
full,           5,192, 4,3041.87
full,           3, 12,64,11316.17
full,           4, 12,64,2224.99
full,           5, 12,64,5240.32
2022-04-18 09:43:46 -05:00
steviez 38f0d60b00
Move repeated logic into common function (#24373) 2022-04-18 00:16:06 -05:00
Tao Zhu 578d59c802 Remove the code that handles cost update for separate pr 2022-04-17 19:26:24 -05:00
Tao Zhu e97ffb55cb nit - renaming variables to concise names 2022-04-17 19:26:24 -05:00
Tao Zhu 6bc6384f8e refactor to consolidate info into single return field 2022-04-17 19:26:24 -05:00
Tao Zhu 9dadfb2e2c Add checked_add_signed() to apply cost adjustment to cost_tracker 2022-04-17 19:26:24 -05:00
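
The cost adjustment named in the entry above amounts to adding a signed delta to an unsigned running total without wrapping. The following is a minimal sketch of that idea, assuming a `u64` cost and an `i64` adjustment; the free function and its signature are illustrative assumptions, not the code from the commit.

```rust
/// Adds a signed `delta` to an unsigned `base`, returning `None` on
/// overflow or underflow instead of wrapping.
/// Hypothetical free function for illustration only.
fn checked_add_signed(base: u64, delta: i64) -> Option<u64> {
    if delta >= 0 {
        // Non-negative delta: a plain checked addition.
        base.checked_add(delta as u64)
    } else {
        // Negative delta: subtract its magnitude. `unsigned_abs` also
        // handles `i64::MIN`, whose magnitude does not fit in an i64.
        base.checked_sub(delta.unsigned_abs())
    }
}
```

A caller could treat `None` as "adjustment rejected" and leave the tracked cost unchanged; for example, `checked_add_signed(100, -30)` yields `Some(70)`, while `checked_add_signed(10, -11)` yields `None`.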
Tao Zhu 810b1dff40 undo cost of executed-but-not-recorded transactions from cost_tracker 2022-04-17 19:26:24 -05:00
Tao Zhu 23d365d02f Address review comment: extract transaction was_executed status to avoid cloning execution_results 2022-04-17 19:26:24 -05:00
Tao Zhu 094da35b91 Address review comments:
1. use was_executed to correctly identify transactions that require cost adjustment;
2. add function to specifically handle execution cost adjustment without having to copy accounts
2022-04-17 19:26:24 -05:00
Tao Zhu 29ca21ed78 undo transaction cost from cost_tracker if it was not executed successfully 2022-04-17 19:26:24 -05:00
sakridge d71986cecf
Separate staked and un-staked on quic tpu port (#24339) 2022-04-16 10:54:22 +02:00
sakridge 1b7d1f78de
Implement QUIC connection warmup service for future leaders (#24054)
* Increase connection timeouts

* Bump quic connection cache to 1024

* Use constant for quic connection timeout and add warm cache service

* Fixes to QUIC warmup service

* fix check failure

* fixes after rebase

* fix timeout test

Co-authored-by: Pankaj Garg <pankaj@solana.com>
2022-04-15 12:09:24 -07:00
Christian Kamm 97f2eb8e65 Banking stage: Deserialize packets only once
Benchmarks show roughly a 6% improvement. The impact could be more
significant when transactions need to be retried a lot.

after patch:
{'name': 'banking_bench_total', 'median': '72767.43'}
{'name': 'banking_bench_tx_total', 'median': '80240.38'}
{'name': 'banking_bench_success_tx_total', 'median': '72767.43'}
test bench_banking_stage_multi_accounts
... bench:   6,137,264 ns/iter (+/- 1,364,111)
test bench_banking_stage_multi_programs
... bench:  10,086,435 ns/iter (+/- 2,921,440)

before patch:
{'name': 'banking_bench_total', 'median': '68572.26'}
{'name': 'banking_bench_tx_total', 'median': '75704.75'}
{'name': 'banking_bench_success_tx_total', 'median': '68572.26'}
test bench_banking_stage_multi_accounts
... bench:   6,521,007 ns/iter (+/- 1,926,741)
test bench_banking_stage_multi_programs
... bench:  10,526,433 ns/iter (+/- 2,736,530)
2022-04-15 00:57:11 -06:00
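
The "roughly 6%" figure in the entry above can be checked against the medians quoted in the commit message. The sketch below only does the arithmetic; the function name is an assumption made for illustration.

```rust
/// Relative change from `before` to `after`, expressed in percent.
/// Positive means `after` is larger than `before`.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Change in the `banking_bench_total` median quoted in the commit message.
fn banking_bench_total_change() -> f64 {
    percent_change(68572.26, 72767.43)
}

/// Change in the `banking_bench_tx_total` median quoted in the commit message.
fn banking_bench_tx_total_change() -> f64 {
    percent_change(75704.75, 80240.38)
}
```

Both throughput medians come out at roughly 6% higher after the patch, which matches the claim. The ns/iter figures move in the opposite direction because they measure time per iteration, so a lower value is an improvement there.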
sakridge 7a4a6597c0
Don't enforce ulimit for validator test config (#24272) 2022-04-12 22:06:37 +02:00
Jon Cinque 9b8850f99e
test-validator: Add `--max-compute-units` flag (#24130)
* test-validator: Add `--max-compute-units` flag

* Add `RuntimeConfig` for tweaking runtime behavior

* Actually add the file

* Move RuntimeConfig to runtime
2022-04-12 02:28:10 +02:00
Michael Vines c1687b0604 Switch to await-aware tokio::sync::Mutex 2022-04-11 18:15:03 -04:00
Giorgio Gambino 60b2155bd3
Add accounts-filler-size command line option (#23896) 2022-04-11 13:10:09 -05:00
carllin ff3b6d2b8b
Remove duplicate increment (#24219) 2022-04-09 15:21:39 -05:00
Christian Kamm a058f348a2 Address review comments 2022-04-08 14:37:55 -05:00
Christian Kamm 2ed29771f2 Unittest for cost tracker after process_and_record_transactions 2022-04-08 14:37:55 -05:00
Christian Kamm 924b8ea1eb Adjustments to cost_tracker updates
- don't store pending tx signatures and costs in CostTracker
- apply tx costs to global state immediately again
- go from commit_or_cancel to update_or_remove, where the cost tracker
  is either updated with the true costs for successful tx, or the costs
  of a retryable tx is removed
- move the function into qos_service and hold the cost tracker lock for
  the whole loop
2022-04-08 14:37:55 -05:00
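
The update_or_remove flow described in the entry above can be sketched roughly as follows. Everything here is a simplified assumption: the `CostTracker` shape, the `TxOutcome` enum, and the function signatures are hypothetical stand-ins, and only the behavior (estimated cost applied immediately, then replaced by the true cost or removed, under one lock for the whole loop) is taken from the commit message.

```rust
use std::sync::Mutex;

/// Outcome of one transaction after a batch was processed (hypothetical type).
enum TxOutcome {
    /// Executed successfully; carries the true cost that was observed.
    Committed { actual_cost: u64 },
    /// Not recorded; the transaction will be retried later.
    Retryable,
}

/// Simplified stand-in for a cost tracker: a single running block cost.
#[derive(Default)]
struct CostTracker {
    block_cost: u64,
}

impl CostTracker {
    /// Applies the estimated cost to the global state immediately.
    fn add_estimated(&mut self, estimated: u64) {
        self.block_cost = self.block_cost.saturating_add(estimated);
    }

    /// Replaces the estimate with the true cost, or removes it entirely.
    fn update_or_remove(&mut self, estimated: u64, outcome: &TxOutcome) {
        let without_estimate = self.block_cost.saturating_sub(estimated);
        self.block_cost = match outcome {
            TxOutcome::Committed { actual_cost } => {
                without_estimate.saturating_add(*actual_cost)
            }
            TxOutcome::Retryable => without_estimate,
        };
    }
}

/// Settles a whole batch while holding the tracker lock for the entire loop.
fn settle_batch(tracker: &Mutex<CostTracker>, results: &[(u64, TxOutcome)]) {
    let mut guard = tracker.lock().unwrap();
    for (estimated, outcome) in results {
        guard.update_or_remove(*estimated, outcome);
    }
}
```

Taking the lock once per batch, rather than once per transaction, means other threads see either the estimates or the settled costs for the batch, never a partially settled mix.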
Tao Zhu 9e07272af8 - Only commit successfully executed transactions' cost to cost_tracker;
- In-flight transactions are held pending in cost_tracker until they are
  committed or cancelled;
2022-04-08 14:37:55 -05:00
Tyera Eulberg d2702201ca
Bump tonic, tonic-build, prost, and etcd-client (#24147)
* Bump tonic, prost, and etcd-client

* Restore doc ignores
2022-04-08 10:21:45 -06:00
Jeff Washington (jwash) 210f6a6fab
move hash calculation out of acct bg svc (#23689)
* move hash calculation out of acct bg svc

* pr feedback
2022-04-08 10:42:03 -05:00
steviez 1dd63631c0
Add high level overview comments on ledger_cleanup_service (#24184) 2022-04-08 00:49:21 -05:00
HaoranYi e105547c14
tvu and tpu time out when joining their microservices (#24111)
* panic when test timeout

* nonblocking send when dropping banks

* debug log

* timeout for tvu

* unused variable

* timeout for tpu

* Revert "debug log"

This reverts commit da780a3301a51d7c496141a85fcd35014fe6dff5.

* add timeout const

* fix typo

* Revert "nonblocking send when dropping banks".
I will create another pull request for this.

This reverts commit 088c98ec0facf825b5eca058fb860deba6d28888.

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-07 20:20:13 -05:00
Jeff Washington (jwash) c27150b1a3
reserialize_bank_fields_with_hash (#23916)
* reserialize_bank_with_new_accounts_hash

* Update runtime/src/serde_snapshot.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/serde_snapshot/tests.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/serde_snapshot/tests.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* pr feedback

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-04-07 14:05:57 -05:00
Jeff Washington (jwash) 550ca7bf92
compare contents of serialized banks instead of exact file format (#24141)
* compare contents of serialized banks instead of exact file format

* Update runtime/src/snapshot_utils.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/snapshot_utils.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* pr feedback

* get rid of clone

* pr feedback

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-04-06 21:55:44 -05:00
Jeff Washington (jwash) fddd162645
reserialize bank in ahv by first writing to temp file in abs (#23947) 2022-04-06 21:39:26 -05:00
Tyera Eulberg fb67ff14de
Remove replica-node crates (#24152) 2022-04-06 16:52:19 -06:00
Tyera Eulberg afeb1d3cca
Bump lru crate (#24150) 2022-04-06 16:18:42 -06:00
Brooks Prumo c322842257
Replace channel with Mutex<Option> for AccountsPackage (#24013) 2022-04-06 05:47:19 -05:00
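
The general pattern named in the entry above, a `Mutex<Option<T>>` used as a single-slot handoff in place of a channel, can be sketched as below. The `PendingSlot` type and its method names are hypothetical and use only the standard library; this is not the `AccountsPackage` code itself.

```rust
use std::sync::Mutex;

/// A single-slot handoff: the producer stores a value, the consumer takes it.
/// Hypothetical type for illustration.
struct PendingSlot<T> {
    slot: Mutex<Option<T>>,
}

impl<T> PendingSlot<T> {
    /// Creates an empty slot.
    fn new() -> Self {
        Self { slot: Mutex::new(None) }
    }

    /// Stores `value` and returns whatever was still waiting, if anything.
    fn submit(&self, value: T) -> Option<T> {
        self.slot.lock().unwrap().replace(value)
    }

    /// Removes and returns the waiting value, leaving the slot empty.
    fn take(&self) -> Option<T> {
        self.slot.lock().unwrap().take()
    }
}
```

Unlike a queue, such a slot holds at most one value, so a newer submission displaces an unconsumed older one. Whether that property is what motivated the commit is not stated in the entry.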
HaoranYi 302142bb25
fix typo (#24123) 2022-04-05 15:55:47 -05:00
behzad nouri db23295e1c
removes legacy weighted_shuffle and weighted_best methods (#24125)
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
https://github.com/solana-labs/solana/pull/18343
and can be replaced with WeightedShuffle.

Also, as described in:
https://github.com/solana-labs/solana/pull/13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightedShuffle::first.
2022-04-05 19:19:22 +00:00
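
For readers unfamiliar with the term, a weighted shuffle orders items by repeatedly drawing one of the remaining items with probability proportional to its weight. The sketch below shows that idea in its simplest quadratic-time form. It is not the `WeightedShuffle` implementation referenced in the entry above: the function name, the small SplitMix64 generator standing in for a real RNG, and the choice to drop zero-weight items are all assumptions made for illustration.

```rust
/// Tiny SplitMix64 generator, used here only to keep the sketch
/// dependency-free. It is a stand-in, not a recommendation.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in `0..bound` (bound must be non-zero), using
    /// rejection sampling so that no residue is favored.
    fn below(&mut self, bound: u64) -> u64 {
        let limit = u64::MAX - u64::MAX % bound;
        loop {
            let x = self.next_u64();
            if x < limit {
                return x % bound;
            }
        }
    }
}

/// Returns indices of `weights` in weighted-shuffle order.
/// Zero-weight entries are omitted (a simplification of this sketch).
/// Assumes the weights sum to less than `u64::MAX`.
fn weighted_shuffle(weights: &[u64], seed: u64) -> Vec<usize> {
    let mut rng = SplitMix64(seed);
    let mut remaining: Vec<(usize, u64)> = weights
        .iter()
        .copied()
        .enumerate()
        .filter(|&(_, w)| w > 0)
        .collect();
    let mut total: u64 = remaining.iter().map(|&(_, w)| w).sum();
    let mut order = Vec::with_capacity(remaining.len());
    while total > 0 {
        // Pick a point in 0..total and find the item whose weight span covers it.
        let mut target = rng.below(total);
        let pos = remaining
            .iter()
            .position(|&(_, w)| {
                if target < w {
                    true
                } else {
                    target -= w;
                    false
                }
            })
            .expect("target is below the sum of remaining weights");
        let (index, weight) = remaining.remove(pos);
        total -= weight;
        order.push(index);
    }
    order
}
```

Taking only the first element of the returned order gives a single weighted pick, which is the role the entry assigns to a "first" style accessor.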
carllin 4ea59d8cb4
Set drop callback on first root bank (#23999) 2022-04-05 13:02:33 -05:00
behzad nouri 2282571493
removes outdated and flaky test_skip_repair from retransmit-stage (#24121)
test_skip_repair in retransmit-stage is no longer relevant because,
following https://github.com/solana-labs/solana/pull/19233,
repair packets are filtered out earlier in window-service, so
retransmit stage does not know whether a shred is repaired or not.
Also, following turbine peer shuffle changes:
https://github.com/solana-labs/solana/pull/24080
the test has become flaky since it does not take into account how peers
are shuffled for each shred.
2022-04-05 16:02:53 +00:00
behzad nouri 2b718d00b0 removes legacy compatibility turbine peers shuffle code 2022-04-05 12:04:12 +00:00
behzad nouri d0b850cdd9 removes turbine peers shuffle patch feature 2022-04-05 12:04:12 +00:00
behzad nouri 855801cc95 removes deterministic-shred-seed feature 2022-04-05 12:04:12 +00:00
Jeff Biseda ee6bb0d5d3
track fec set turbine stats (#23989) 2022-04-04 14:44:21 -07:00
HaoranYi 6ba4e870c4
Blockstore should drop signals before validator exit (#24025)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* wip

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

* clean up import

* add mutex for signal senders in blockstore

* remove mut

* refactor: extract add signal functions

* make blockstore signal private

* let compiler infer mutex type

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-04 11:38:05 -05:00
behzad nouri 7cb3b6cbe2
demotes WeightedShuffle failures to error metrics (#24079)
Since call-sites are calling unwrap anyways, panicking seems too punitive
for our use cases.
2022-04-03 16:20:06 +00:00