Commit Graph

3057 Commits

Author SHA1 Message Date
carllin 870ac80b79
Prioritize BankingStage packets individually in min-max heap (#24187) 2022-05-04 21:50:56 -05:00
dependabot[bot] bece7f32c8
chore: bump log from 0.4.16 to 0.4.17 (#24987)
* chore: bump log from 0.4.16 to 0.4.17

Bumps [log](https://github.com/rust-lang/log) from 0.4.16 to 0.4.17.
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/log/commits/0.4.17)

---
updated-dependencies:
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-05 01:49:18 +00:00
dependabot[bot] 61a9faae17
chore: bump serde_json from 1.0.80 to 1.0.81 (#24960)
* chore: bump serde_json from 1.0.80 to 1.0.81

Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.80 to 1.0.81.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.80...v1.0.81)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <you@example.com>
2022-05-04 14:57:38 -06:00
dependabot[bot] 9258d81ba3
chore: bump serde from 1.0.136 to 1.0.137 (#24957)
* chore: bump serde from 1.0.136 to 1.0.137

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.136 to 1.0.137.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.136...v1.0.137)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-03 22:14:59 -06:00
dependabot[bot] 2c9d2a2140
chore: bump serde_json from 1.0.79 to 1.0.80 (#24943)
* chore: bump serde_json from 1.0.79 to 1.0.80

Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.79 to 1.0.80.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.79...v1.0.80)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-03 13:46:45 -06:00
Christian Kamm 503d0baf6d
SigVerify: Add total time metrics for dedup/discard/verify (#24768)
* SigVerify: Add total time metrics for dedup/discard/verify

Previously it was impossible to determine the total time the stage spent
on these activities within a measurement window.

* SigVerify: Add _us postfix to time metrics
2022-05-03 14:59:25 +02:00
behzad nouri eff59193db
enforces that LAST_SHRED_IN_SLOT is also DATA_COMPLETE_SHRED (#24892)
A data shred cannot be LAST_SHRED_IN_SLOT if not also DATA_COMPLETE_SHRED.
So LAST_SHRED_IN_SLOT should also imply DATA_COMPLETE_SHRED:
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shredder.rs#L116-L117
https://github.com/solana-labs/solana/blob/74b586ae7/core/src/broadcast_stage/standard_broadcast_run.rs#L80-L81

However current shred constructs allow specifying a shred which is
LAST_SHRED_IN_SLOT but not DATA_COMPLETE_SHRED:
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shred.rs#L117-L118
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shred.rs#L272-L273

The commit updates ShredFlags so that if a shred is not
DATA_COMPLETE_SHRED it cannot be LAST_SHRED_IN_SLOT either.
2022-05-02 23:33:53 +00:00
apfitzge 112a0b475a
Revert "Refactor to use EpochSchedule from within RentCollector struct" (#24893)
* Revert "Ran cargo fmt"

This reverts commit 9052e41b32.

* Revert "Fix build error introduced by my editor setup, part 2"

This reverts commit 4dfeab3b38.

* Revert "Fix build error introduced by my editor setup"

This reverts commit 87fb78dc56.

* Revert "Remove redundant epoch_schedule from AccountsPackage"

This reverts commit c2f7f2fff8.

* Revert "Fix a test"

This reverts commit 36c0bdaa78.

* Revert "Fixes to initial code"

This reverts commit ed7813e698.

* Revert "Removing redundant EpochSchedule param from fns"

This reverts commit 5472d2e605.
2022-05-02 13:46:17 -05:00
behzad nouri e812430e28
defines shred flags using bitflags crate (#24874)
Shred flags uses raw bit-masking ops which lacks type-safety:
https://github.com/solana-labs/solana/blob/a829ddc92/ledger/src/shred.rs#L112-L114

This commit instead uses bitflags crate to define shred flags.
2022-05-01 19:25:15 +00:00
Pankaj Garg 88c16c0176
Check if quic is enabled before warming up quic connections (#24821)
* Check if quic is enabled before warming up quic connections

* fix after rebase

* don't start warmup service if quic not enabled

* fix test
2022-05-01 03:52:38 +00:00
Justin Starry a61652104b
Avoid holding lock guards in match expressions (#24805)
* Avoid holding bank forks read lock for RPC requests

* Avoid using lock guards in temporaries

* revert fetch stage change
2022-04-29 16:32:46 +08:00
Justin Starry 4e58b3870c
Update all BankForks methods to return owned values (#24801) 2022-04-28 18:51:00 +00:00
dependabot[bot] b22a14ca68
chore: bump etcd-client from 0.9.0 to 0.9.1 (#24774)
* chore: bump etcd-client from 0.9.0 to 0.9.1

Bumps [etcd-client](https://github.com/etcdv3/etcd-client) from 0.9.0 to 0.9.1.
- [Release notes](https://github.com/etcdv3/etcd-client/releases)
- [Commits](https://github.com/etcdv3/etcd-client/compare/v0.9.0...v0.9.1)

---
updated-dependencies:
- dependency-name: etcd-client
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-04-28 12:30:37 -06:00
sakridge 5a430c15e2
Separate sigverify metrics for each verifier (#24744) 2022-04-28 01:16:17 -07:00
behzad nouri 0f60665100
replaces Shred::new_empty_coding with Shred::new_from_parity_shard (#24749)
Removing implementation details of shreds and payload offsets from
shredder, so that shredder does not need to mutate payload:
https://github.com/solana-labs/solana/blob/71ad12128/ledger/src/shred.rs#L968-L977

Also, Shred::new_from_data can simply obtain a slice as opposed to
Option<&[u8]>:
https://github.com/solana-labs/solana/blob/71ad12128/ledger/src/shred.rs#L268-L278
2022-04-27 18:04:10 +00:00
behzad nouri 081c844d6e
removes Shred::new_empty_data_shred (#24714)
Shred::new_empty_data_shred returns an invalid shred (i.e.
shred.sanitize() returns error). The method is only used in tests and
can be easily replaced with Shred::new_from_data. To keep the shred api
surface small, this commit removes this method.
2022-04-26 23:13:12 +00:00
behzad nouri 12ae8d3be5
returns Error when Shred::sanitize fails (#24653)
Including the error in the output allows to debug when Shred::sanitize
fails.
2022-04-25 23:19:37 +00:00
behzad nouri 895f76a93c
hides implementation details of shred from its public interface (#24563)
Working towards embedding versioning into shreds binary, so that a new
variant of shred struct can include merkle tree hashes of the erasure
set.
2022-04-25 12:43:22 +00:00
carllin 8a062273de
Move error counters to be reported by leader only at end of slot (#24581)
* Add error counters to leader metrics only

* Add dependencies
2022-04-23 18:10:47 -05:00
Michael Vines d0a8a16a57 ReplayStage no longer relies on Validator to reset the poh recorder at start 2022-04-22 21:17:49 -07:00
Michael Vines 84e3342612 Process blockstore after starting the TVU 2022-04-22 21:17:49 -07:00
Michael Vines 83e041299a Run real snapshot packager while processing blockstore at validator startup 2022-04-22 21:17:49 -07:00
HaoranYi 2d4defa477
fix typo (#24576) 2022-04-22 08:43:57 -05:00
Jon Cinque 0d51596224
sim: Override slot hashes account on simulation bank (#24543)
* sim: Override slot hashes during simulation

* Add simulation test program

* Address feedback

* Add AccountOverrides explicit type

* Cargo fmt
2022-04-22 12:32:31 +02:00
Justin Starry c544742091
Local cluster test cleanup and refactoring (#24559)
* remove FixedSchedule.start_epoch

* use duration for timing

* Rename to partition bool to turbine_disabled

* simplify partition config
2022-04-22 12:14:07 +08:00
behzad nouri 454ef38e43 caches StakeAccount instead of Delegation in Stakes
The commit makes values in stake_delegations map in Stakes struct
generic. Stakes<Delegation> is equivalent to the old code and is used
for backward compatibility in BankFieldsTo{Serialize,Deserialize}.

But banks cache Stakes<StakeAccount> which includes the entire stake
account and StakeState deserialized from account. Doing so, will remove
the need to load stake account from accounts-db when working with
stake-delegations.
2022-04-21 15:28:41 +00:00
Justin Starry 02bfb85c16
Refactor transaction processing in banking stage (#24336)
* Refactor transaction processing in banking stage

* feedback

* more feedback
2022-04-21 21:06:26 +08:00
Justin Starry d5127abf46
Only add hashes for completed blocks to recent blockhashes (#24389)
* Only add hashes for completed blocks to recent blockhashes

* feedback
2022-04-21 21:05:29 +08:00
Tao Zhu a21fc3f303
Apply transaction actual execution units to cost_tracker (#24311)
* Pass the sum of consumed compute units to cost_tracker

* cost model tracks builtins and bpf programs separately, enabling adjust block cost by actual bpf programs execution costs

* Copied nightly-only experimental `checked_add_(un)signed` implementation to sdk

* Add function to update cost tracker with execution cost adjustment

* Review suggestion - using enum instead of struct for CommitTransactionDetails
Co-authored-by: Justin Starry <justin.m.starry@gmail.com>

* review - rename variable to distinguish accumulated_consumed_units from individual compute_units_consumed

* not to use signed integer operations

* Review - using saturating_add_assign!(), and checked_*().unwrap_or()

* Review - using Ordering enum to cmp

* replace checked_ with saturating_

* review - remove unnecessary Option<>

* Review - add function to report number of non-zero units account to metrics
2022-04-21 07:38:07 +00:00
Justin Starry 79923c3b58
Refactor: Rename BlockhashQueue fields and methods for clarity (#24426) 2022-04-21 11:57:17 +08:00
HaoranYi d0761d0ca4
demote receive_window_num_slot_shreds to debug logging (#24505) 2022-04-20 08:51:46 -05:00
dependabot[bot] dd15193c69
chore: bump rayon from 1.5.1 to 1.5.2 (#24470)
* chore: bump rayon from 1.5.1 to 1.5.2

Bumps [rayon](https://github.com/rayon-rs/rayon) from 1.5.1 to 1.5.2.
- [Release notes](https://github.com/rayon-rs/rayon/releases)
- [Changelog](https://github.com/rayon-rs/rayon/blob/master/RELEASES.md)
- [Commits](https://github.com/rayon-rs/rayon/commits)

---
updated-dependencies:
- dependency-name: rayon
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-04-19 23:17:52 -06:00
Michael Vines 05f32f287c solana-validator monitor now reports slot-level progress while loading blockstore 2022-04-19 22:09:48 -07:00
Michael Vines 9e4999ef6a Remove halt_at_slot from RuntimeConfig, it's not a runtime concern 2022-04-19 19:23:58 -07:00
Michael Vines 988210908c Move verify_udp_stats_access out of the way 2022-04-19 19:23:58 -07:00
Michael Vines c6f3da4879 blockstore_processor now accepts an Arc<Rwlock<BankForks>> 2022-04-19 19:23:58 -07:00
Michael Vines 0e2e0c8b7d Extract most storage-related services from the Tvu abstraction 2022-04-19 19:23:58 -07:00
Michael Vines 268a2109de Relocate hard forks info log 2022-04-19 19:23:58 -07:00
Michael Vines dd766042df Remove LedgerMetricReportService from TVU 2022-04-19 19:23:58 -07:00
behzad nouri 705ea53353
moves sign_shred and new_coding_shred_header out of Shredder (#24487) 2022-04-19 20:00:05 +00:00
Tao Zhu 94b0186a96
Cost model tracks builtins and bpf programs separately (#24468)
* Cost model tracks builtins and bpf programs separatele (enables adjusting block cost by actual bpf programs execution costs)

* Address reviews: expand test; add metrics stat
2022-04-19 13:25:47 -05:00
behzad nouri 3bbfaae7b6
moves shred stats to a separate file (#24484) 2022-04-19 18:25:09 +00:00
Jeff Washington (jwash) d9d0dad258
report swap mem as bytes like other metrics (#24455) 2022-04-19 10:03:25 -05:00
behzad nouri 039488b562
drops redundant turbine propagation path (#24351)
Most nodes in the cluster receive the same shred from two different
nodes: parent, and the first node of their neighborhood:
https://github.com/solana-labs/solana/blob/a8c695ba5/core/src/cluster_nodes.rs#L178-L197

Because of the erasure codings, half of the shreds are already
redundant. So this redundant propagation path will only add extra
overhead.

Additionally the very first node of the broadcast tree has 2x fanout
(i.e. 400 nodes) which adds too much load at one node.

This commit simplifies the broadcast tree by dropping the redundant
propagation path and removing the 2x fanout at root node.
2022-04-19 00:11:29 +00:00
behzad nouri 1d50832389
replaces counters with datapoints in gossip metrics (#24451) 2022-04-18 23:14:59 +00:00
Jason Davis c2f7f2fff8 Remove redundant epoch_schedule from AccountsPackage 2022-04-18 11:57:40 -05:00
Jason Davis 5472d2e605 Removing redundant EpochSchedule param from fns 2022-04-18 11:57:40 -05:00
Christian Kamm d2c6c04d3e banking-bench: Add and rearrange options
- Add write-lock-contention option, replacing same_payer
- write-lock-contention also has a same-batch-only value, where
  contention happens only inside batches, not between them
- Rename num-threads to batches-per-iteration, which is closer to what
  it is actually doing.
- Add num-banking-threads as a new option
- Rename packets-per-chunk to packets-per-batch, because this is closer
  to what's happening; and it was previously confusing that num-chunks
  had little to do with packets-per-chunk.

Example output for a iterations=100 and a permutation of inputs:

contention,threads,batchsize,batchcount,tps
none,           3,192, 4,65290.30
none,           4,192, 4,77358.06
none,           5,192, 4,86436.65
none,           3, 12,64,43944.57
none,           4, 12,64,65852.15
none,           5, 12,64,70674.37
same-batch-only,3,192, 4,3928.21
same-batch-only,4,192, 4,6460.15
same-batch-only,5,192, 4,7242.85
same-batch-only,3, 12,64,11377.58
same-batch-only,4, 12,64,19582.79
same-batch-only,5, 12,64,24648.45
full,           3,192, 4,3914.26
full,           4,192, 4,2102.99
full,           5,192, 4,3041.87
full,           3, 12,64,11316.17
full,           4, 12,64,2224.99
full,           5, 12,64,5240.32
2022-04-18 09:43:46 -05:00
steviez 38f0d60b00
Move repeated logic into common function (#24373) 2022-04-18 00:16:06 -05:00
Tao Zhu 578d59c802 Remove the code that handles cost update for separate pr 2022-04-17 19:26:24 -05:00
Tao Zhu e97ffb55cb nit - renaming variables to concise names 2022-04-17 19:26:24 -05:00
Tao Zhu 6bc6384f8e refactor to consolidate info into single return field 2022-04-17 19:26:24 -05:00
Tao Zhu 9dadfb2e2c Add checked_add_signed() to apply cost adjustment to cost_tracker 2022-04-17 19:26:24 -05:00
Tao Zhu 810b1dff40 undo cost of executed-but-not-recorded transactions from cost_tracker 2022-04-17 19:26:24 -05:00
Tao Zhu 23d365d02f Address review comment: extract transaction was_executed status to avoid cloning execution_results 2022-04-17 19:26:24 -05:00
Tao Zhu 094da35b91 Address review comments:
1. use was_executed to correctly identify transactions requires cost adjustment;
2. add function to specifically handle executino cost adjustment without have to copy accounts
2022-04-17 19:26:24 -05:00
Tao Zhu 29ca21ed78 undo transaction cost from cost_tracker if it was not executed successfully 2022-04-17 19:26:24 -05:00
sakridge d71986cecf
Separate staked and un-staked on quic tpu port (#24339) 2022-04-16 10:54:22 +02:00
sakridge 1b7d1f78de
Implement QUIC connection warmup service for future leaders (#24054)
* Increase connection timeouts

* Bump quic connection cache to 1024

* Use constant for quic connection timeout and add warm cache service

* Fixes to QUIC warmup service

* fix check failure

* fixes after rebase

* fix timeout test

Co-authored-by: Pankaj Garg <pankaj@solana.com>
2022-04-15 12:09:24 -07:00
Christian Kamm 97f2eb8e65 Banking stage: Deserialize packets only once
Benchmarks show roughly a 6% improvement. The impact could be more
significant when transactions need to be retried a lot.

after patch:
{'name': 'banking_bench_total', 'median': '72767.43'}
{'name': 'banking_bench_tx_total', 'median': '80240.38'}
{'name': 'banking_bench_success_tx_total', 'median': '72767.43'}
test bench_banking_stage_multi_accounts
... bench:   6,137,264 ns/iter (+/- 1,364,111)
test bench_banking_stage_multi_programs
... bench:  10,086,435 ns/iter (+/- 2,921,440)

before patch:
{'name': 'banking_bench_total', 'median': '68572.26'}
{'name': 'banking_bench_tx_total', 'median': '75704.75'}
{'name': 'banking_bench_success_tx_total', 'median': '68572.26'}
test bench_banking_stage_multi_accounts
... bench:   6,521,007 ns/iter (+/- 1,926,741)
test bench_banking_stage_multi_programs
... bench:  10,526,433 ns/iter (+/- 2,736,530)
2022-04-15 00:57:11 -06:00
sakridge 7a4a6597c0
Don't enforce ulimit for validator test config (#24272) 2022-04-12 22:06:37 +02:00
Jon Cinque 9b8850f99e
test-validator: Add `--max-compute-units` flag (#24130)
* test-validator: Add `--max-compute-units` flag

* Add `RuntimeConfig` for tweaking runtime behavior

* Actually add the file

* Move RuntimeConfig to runtime
2022-04-12 02:28:10 +02:00
Michael Vines c1687b0604 Switch to await-aware tokio::sync::Mutex 2022-04-11 18:15:03 -04:00
Giorgio Gambino 60b2155bd3
Add accounts-filler-size command line option (#23896) 2022-04-11 13:10:09 -05:00
carllin ff3b6d2b8b
Remove duplicate increment (#24219) 2022-04-09 15:21:39 -05:00
Christian Kamm a058f348a2 Address review comments 2022-04-08 14:37:55 -05:00
Christian Kamm 2ed29771f2 Unittest for cost tracker after process_and_record_transactions 2022-04-08 14:37:55 -05:00
Christian Kamm 924b8ea1eb Adjustments to cost_tracker updates
- don't store pending tx signatures and costs in CostTracker
- apply tx costs to global state immediately again
- go from commit_or_cancel to update_or_remove, where the cost tracker
  is either updated with the true costs for successful tx, or the costs
  of a retryable tx is removed
- move the function into qos_service and hold the cost tracker lock for
  the whole loop
2022-04-08 14:37:55 -05:00
Tao Zhu 9e07272af8 - Only commit successfully executed transactions' cost to cost_tracker;
- In-fly transactions are pended in cost_tracker until being committed
  or cancelled;
2022-04-08 14:37:55 -05:00
Tyera Eulberg d2702201ca
Bump tonic, tonic-build, prost, and etcd-client (#24147)
* Bump tonic, prost, and etcd-client

* Restore doc ignores
2022-04-08 10:21:45 -06:00
Jeff Washington (jwash) 210f6a6fab
move hash calculation out of acct bg svc (#23689)
* move hash calculation out of acct bg svc

* pr feedback
2022-04-08 10:42:03 -05:00
steviez 1dd63631c0
Add high level overview comments on ledger_cleanup_service (#24184) 2022-04-08 00:49:21 -05:00
HaoranYi e105547c14
tvu and tpu timeout on joining its microservices (#24111)
* panic when test timeout

* nonblocking send when when droping banks

* debug log

* timeout for tvu

* unused varaible

* timeout for tpu

* Revert "debug log"

This reverts commit da780a3301a51d7c496141a85fcd35014fe6dff5.

* add timeout const

* fix typo

* Revert "nonblocking send when when droping banks".
I will create another pull request for this.

This reverts commit 088c98ec0facf825b5eca058fb860deba6d28888.

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-07 20:20:13 -05:00
Jeff Washington (jwash) c27150b1a3
reserialize_bank_fields_with_hash (#23916)
* reserialize_bank_with_new_accounts_hash

* Update runtime/src/serde_snapshot.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/serde_snapshot/tests.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/serde_snapshot/tests.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* pr feedback

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-04-07 14:05:57 -05:00
Jeff Washington (jwash) 550ca7bf92
compare contents of serialized banks instead of exact file format (#24141)
* compare contents of serialized banks instead of exact file format

* Update runtime/src/snapshot_utils.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/snapshot_utils.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* pr feedback

* get rid of clone

* pr feedback

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-04-06 21:55:44 -05:00
Jeff Washington (jwash) fddd162645
reserialize bank in ahv by first writing to temp file in abs (#23947) 2022-04-06 21:39:26 -05:00
Tyera Eulberg fb67ff14de
Remove replica-node crates (#24152) 2022-04-06 16:52:19 -06:00
Tyera Eulberg afeb1d3cca
Bump lru crate (#24150) 2022-04-06 16:18:42 -06:00
Brooks Prumo c322842257
Replace channel with Mutex<Option> for AccountsPackage (#24013) 2022-04-06 05:47:19 -05:00
HaoranYi 302142bb25
fix typo (#24123) 2022-04-05 15:55:47 -05:00
behzad nouri db23295e1c
removes legacy weighted_shuffle and weighted_best methods (#24125)
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
https://github.com/solana-labs/solana/pull/18343
and can be replaced with WeightedShuffle.

Also, as described in:
https://github.com/solana-labs/solana/pull/13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.
2022-04-05 19:19:22 +00:00
carllin 4ea59d8cb4
Set drop callback on first root bank (#23999) 2022-04-05 13:02:33 -05:00
behzad nouri 2282571493
removes outdated and flaky test_skip_repair from retransmit-stage (#24121)
test_skip_repair in retransmit-stage is no longer relevant because
following: https://github.com/solana-labs/solana/pull/19233
repair packets are filtered out earlier in window-service and so
retransmit stage does not know if a shred is repaired or not.
Also, following turbine peer shuffle changes:
https://github.com/solana-labs/solana/pull/24080
the test has become flaky since it does not take into account how peers
are shuffled for each shred.
2022-04-05 16:02:53 +00:00
behzad nouri 2b718d00b0 removes legacy compatibility turbine peers shuffle code 2022-04-05 12:04:12 +00:00
behzad nouri d0b850cdd9 removes turbine peers shuffle patch feature 2022-04-05 12:04:12 +00:00
behzad nouri 855801cc95 removes deterministic-shred-seed feature 2022-04-05 12:04:12 +00:00
Jeff Biseda ee6bb0d5d3
track fec set turbine stats (#23989) 2022-04-04 14:44:21 -07:00
HaoranYi 6ba4e870c4
Blockstore should drop signals before validator exit (#24025)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* wip

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

* clean up import

* add mutex for signal senders in blockstore

* remove mut

* refactor: extract add signal functions

* make blockstore signal private

* let compiler infer mutex type

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-04 11:38:05 -05:00
behzad nouri 7cb3b6cbe2
demotes WeightedShuffle failures to error metrics (#24079)
Since call-sites are calling unwrap anyways, panicking seems too punitive
for our use cases.
2022-04-03 16:20:06 +00:00
HaoranYi ffa4cafe1c
Revert sequential execution of validator_exit and validator_parallel_exit tests (#24048)
* handle channel disconnect

* revert sequential execution of validator_exit and parallel_validator_exit tests
2022-04-02 10:22:47 -05:00
Yueh-Hsuan Chiang 0b5ed87220
(LedgerStore) Enable performance sampling in column family get() (#23834)
#### Summary of Changes
This PR enables RocksDB read side performance metrics to report to blockstore_rocksdb_read_perf.
The sampling rate is controlled by an env arg `SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K`,
specifies the number of perf samples for every 1000 operations.  The default value is set to 10, meaning
we will report 10 out of 1000 (or 1/100) reads.

The metrics are based on the RocksDB [PerfContext](https://github.com/facebook/rocksdb/blob/main/include/rocksdb/perf_context.h).
It includes many useful metrics including block read time, cache hit rate, and time spent on decompressing the block.
2022-04-01 13:13:32 -07:00
Pankaj Garg df4d92f9cf
Revert voting service to use UDP instead of QUIC (#24032) 2022-04-01 09:34:18 -07:00
HaoranYi 51b37f0184
Modify rpc_completed_slot_service to be non-blocking (#24007)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-03-31 16:44:23 -05:00
Jeff Washington (jwash) 9c8dad33c7
add epoch_schedule and rent_collector to hash calc (#24012) 2022-03-31 10:51:18 -05:00
Jeff Washington (jwash) da001d54e5
calculate_accounts_hash_helper uses config (#24003) 2022-03-31 09:29:45 -05:00
Jeff Washington (jwash) 125f9634fd
add hash calc config.use_write_cache (#24005) 2022-03-30 17:19:34 -05:00
HaoranYi ba770832d0
Poh timing service (#23736)
* initial work for poh timing report service

* add poh_timing_report_service to validator

* fix comments

* clippy

* imrove test coverage

* delete record when complete

* rename shred full to slot full.

* debug logging

* fix slot full

* remove debug comments

* adding fmt trait

* derive default

* default for poh timing reporter

* better comments

* remove commented code

* fix test

* more test fixes

* delete timestamps for slot that are older than root_slot

* debug log

* record poh start end in bank reset

* report full to start time instead

* fix poh slot offset

* report poh start for normal ticks

* fix typo

* refactor out poh point report fn

* rename

* optimize delete - delete only when last_root changed

* change log level to trace

* convert if to match

* remove redudant check

* fix SlotPohTiming comments

* review feedback on poh timing reporter

* review feedback on poh_recorder

* add test case for out-of-order arrival of timing points and incomplete timing points

* refactor poh_timing_points into its own mod

* remove option for poh_timing_report service

* move poh_timing_point_sender to constructor

* clippy

* better comments

* more clippy

* more clippy

* add slot poh timing point macro

* clippy

* assert in test

* comments and display fmt

* fix check

* assert format

* revise comments

* refactor

* extrac send fn

* revert reporting_poh_timing_point

* align loggin

* small refactor

* move type declaration to the top of the module

* replace macro with constructor

* clippy: remove redundant closure

* review comments

* simplify poh timing point creation

Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>
2022-03-30 09:04:49 -05:00
Jeff Washington (jwash) c24de17278
remove index hash calculation as an option (#23928) 2022-03-25 15:32:53 -05:00
HaoranYi 01af40d6b6
Fix intermittent validator_exit test failure (#23594)
* run validator_exit_test sequentially

* limit validator exit run to its own serial run subset
add 10ms delay in the validator exit tests

* fix intermittent validator exit failure

* no sleep

* undo the code move
2022-03-25 14:38:19 -05:00
ryleung-solana 6b85c2104c
Implement forwarding via TpuConnection (#23817) 2022-03-25 11:31:40 -04:00