3984 Commits

Author SHA1 Message Date
Brooks 93f5b514fa
Adds StartingSnapshotStorages to AccountsHashVerifier (#58) 2024-03-04 16:32:51 -05:00
Wen bfe44d95f4
Wen restart aggregate last voted fork slots (#33892)
* Push and aggregate RestartLastVotedForkSlots.

* Fix API and lint errors.

* Reduce clutter.

* Put my own LastVotedForkSlots into the aggregate.

* Write LastVotedForkSlots aggregate progress into local file.

* Fix typo and name constants.

* Fix flaky test.

* Clarify the comments.

* - Use constant for wait_for_supermajority
- Avoid waiting after first shred when repair is in wen_restart

* Fix delay_after_first_shred and remove loop in wen_restart.

* Read wen_restart slots inside the loop instead.

* Discard turbine shreds while in wen_restart in windows insert rather than
shred_fetch_stage.

* Use the new Gossip API.

* Rename slots_to_repair_for_wen_restart and a few others.

* Rename a few more and list all states.

* Pipe exit down to aggregate loop so we can exit early.

* Fix import of RestartLastVotedForkSlots.

* Use the new method to generate test bank.

* Make linter happy.

* Use new bank constructor for tests.

* Fix a bad merge.

* - add new const for wen_restart
- fix the test to cover more cases
- add generate_repairs_for_slot_not_throtted_by_tick and
  generate_repairs_for_slot_throtted_by_tick to make it readable

* Add initialize and put the main logic into a loop.

* Change aggregate interface and other fixes.

* Add failure tests and tests for state transition.

* Add more tests and add ability to recover from written records in
last_voted_fork_slots_aggregate.

* Various name changes.

* We don't really care what type of error is returned.

* Wait on expected progress message in proto file instead of sleep.

* Code reorganization and cleanup.

* Make linter happy.

* Add WenRestartError.

* Split WenRestartErrors into separate errors per state.

* Revert "Split WenRestartErrors into separate errors per state."

This reverts commit 4c920cb8f8d492707560441912351cca779129f6.

* Use individual functions when testing for failures.

* Move initialization errors into initialize().

* Use anyhow instead of thiserror to generate backtrace for error.

* Add missing Cargo.lock.

* Add error log when last_vote is missing in the tower storage.

* Change error log info.

* Change test to match exact error.
2024-03-01 18:52:47 -08:00
steviez 7d6f1d5911
Give streamer::receiver() threads unique names (#35369)
The name was previously hard-coded to solReceiver. The use of the same
name makes it hard to figure out which thread is which when these
threads are handling many services (Gossip, Tvu, etc).
2024-03-01 13:36:08 -06:00
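Illustration for 7d6f1d5911: a minimal sketch of how a caller-supplied thread name can replace a hard-coded one, using only the standard library. The helper `spawn_named_receiver` and the example names are hypothetical and are not taken from the streamer code.

```rust
use std::thread::{self, JoinHandle};

/// Spawns a worker whose thread name is chosen by the caller instead of
/// being hard-coded. Hypothetical helper, not the real streamer API.
fn spawn_named_receiver(thread_name: String) -> JoinHandle<Option<String>> {
    thread::Builder::new()
        .name(thread_name)
        .spawn(|| {
            // Return the name this thread sees for itself, so the caller
            // can confirm which service the thread belongs to.
            thread::current().name().map(str::to_owned)
        })
        .expect("failed to spawn thread")
}
```

Calling `spawn_named_receiver("solRcvrGossip".to_string())` and `spawn_named_receiver("solRcvrTvu".to_string())` (illustrative names) would give two threads that can be told apart in a debugger or a thread listing.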
Andrew Fitzgerald ede9163633
Comments clarifying non-emptiness of threadset (#35388) 2024-03-01 11:18:42 -08:00
steviez 7c878973e2
Cleanup ReplayStage loop timing struct (#35361)
- Track loop_count in the struct
- Rename ReplayTiming ==> ReplayLoopTiming
- Make all metrics consistent to end with "_elapsed_us"
2024-03-01 12:30:50 -06:00
Pankaj Garg 990ca1d0b8
Add limit to looping in banking-stage (#35342) 2024-02-28 17:36:45 -08:00
Brooks 6aaaf858c9
Adds more info to panic message in AccountsHashVerifier (#35353) 2024-02-28 15:55:05 -05:00
behzad nouri a7a41e7631
adds Merkle shred variant with retransmitter's signature (#35293)
Moving towards locking down Turbine propagation path, the commit
reserves a buffer within shred payload for retransmitter's signature.
2024-02-28 20:31:40 +00:00
steviez 140818221c
Rename SamplePerformanceService thread for consistency (#35332)
- Rename thread
- Add uniform service start/stop logs
- Misc cleanup with variables / constants / exit flag check
2024-02-28 13:47:27 -06:00
Andrew Fitzgerald 9f581113bd
Scheduler: Leader-Slot metrics for Scheduler (#35087) 2024-02-23 17:06:22 -08:00
enjoyoor c02f47a6fb
fix: cleanup (#35298) 2024-02-23 19:59:52 +00:00
Tao Zhu 139b9c8c25
Add fee_details to fee calculation (#35021)
* add fee_details to fee calculation

* fix - no need to round after summing u64

* feature gate on removing unwanted rounding
2024-02-23 08:58:48 -06:00
Andrew Fitzgerald 367f489f63
scheduler inner metrics (#35271) 2024-02-22 15:01:08 -08:00
Ashwin Sekar 07955e79ad
replay: gracefully exit if tower load fails (#35269) 2024-02-21 18:51:30 -08:00
Ryo Onodera 024d6ecc4f
Add --unified-scheduler-handler-threads (#35195)
* Add --unified-scheduler-handler-threads

* Adjust value name

* Warn if the flag was ignored

* Tweak message a bit
2024-02-22 09:05:17 +09:00
DimAn 531793b4be
validator: ignore too old tower error (#35229)
* validator: ignore too old tower error

* Update core/src/replay_stage.rs

Co-authored-by: Ashwin Sekar <ashwin@solana.com>

* remove redundant references

---------

Co-authored-by: Ashwin Sekar <ashwin@solana.com>
2024-02-21 13:23:23 -05:00
steviez 4905076fb6
Remove channel that sends roots to BlockstoreCleanupService (#35211)
Currently, ReplayStage sends new roots to BlockstoreCleanupService, and
BlockstoreCleanupService decides when to clean based on advancement of
the latest root. This is totally unnecessary as the latest root is
cached by the Blockstore, and this value can simply be fetched.

This change removes the channel completely, and instead just fetches
the latest root from Blockstore directly. Moreover, some logic is added
to check the latest root less frequently, based on the set purge
interval.

All in all, we went from sending > 100 slots/min across a crossbeam
channel to reading an atomic roughly 3 times/min, while also removing
the need for an additional thread that read from the channel.
2024-02-21 10:16:16 -06:00
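Illustration for 4905076fb6: a rough sketch of the idea of reading a cached root from an atomic instead of receiving every root over a channel. `RootCache`, `root_to_purge`, and the interval value are assumptions made for this example and do not mirror the Blockstore or BlockstoreCleanupService code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a store that caches its latest root slot.
struct RootCache {
    latest_root: AtomicU64,
}

impl RootCache {
    /// Cheap read of the cached root; no channel or extra thread involved.
    fn latest_root(&self) -> u64 {
        self.latest_root.load(Ordering::Relaxed)
    }
}

/// Returns the current root only if it has advanced at least
/// `purge_interval` slots past `last_purge_root`; otherwise returns None.
fn root_to_purge(cache: &RootCache, last_purge_root: u64, purge_interval: u64) -> Option<u64> {
    let root = cache.latest_root();
    (root.saturating_sub(last_purge_root) >= purge_interval).then_some(root)
}
```

A cleanup loop could call `root_to_purge` on a timer and sleep in between, so the atomic is read only a few times per minute regardless of how often roots are set.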
steviez 5c04a9731c
Obtain BankForks read lock once to get ancestors and descendants (#35273)
No need to get the read lock twice; instead, hold it and get both items
2024-02-21 10:07:57 -06:00
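Illustration for 5c04a9731c: a small sketch of taking a read lock once and fetching both items under the same guard. The `Forks` struct and its fields are hypothetical placeholders, not the BankForks type.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for a fork structure guarded by an RwLock.
struct Forks {
    ancestors: Vec<u64>,
    descendants: Vec<u64>,
}

/// Acquires the read lock a single time and copies out both items,
/// rather than locking once per item.
fn ancestors_and_descendants(forks: &RwLock<Forks>) -> (Vec<u64>, Vec<u64>) {
    let guard = forks.read().unwrap();
    (guard.ancestors.clone(), guard.descendants.clone())
}
```

Besides saving one lock acquisition, holding a single guard means both values come from the same snapshot of the guarded data.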
Andrew Fitzgerald cd4cf814fc
Scheduler: Separate scheduler metrics module (#35216) 2024-02-20 19:39:00 -08:00
Ashwin Sekar b0134ab04d
validator: include waited_for_supermajority in startup metric (#35137) 2024-02-20 16:13:57 -08:00
Dmitri Makarov 0acee67891
SVM: move transaction_results from accounts-db to SVM (#35183)
SVM: Remove accounts-db deps in accounts_loader tests
2024-02-20 12:54:56 -08:00
Ashwin Sekar befe8b9d98
replay: reload tower if set-identity during startup (#35173)
* replay: reload tower if set-identity during startup

* pr feedback: add unit tests

* pr feedback: use tower.node_pubkey, more descriptive names
2024-02-20 09:30:46 -08:00
HaoranYi ebf60359f4
clean up dev-context-only attribute (#35201)
Co-authored-by: HaoranYi <haoran.yi@solana.com>
2024-02-19 07:56:27 -06:00
sakridge e21251090f
Remove spammy banking-stage retryable tx metric which is not needed (#35207)
Already covered by other metrics like the filtered retryable and the
number filtered.
2024-02-16 18:29:42 +01:00
steviez 897adb2711
Update the directory naming for incorrect shred version backup (#35158)
The directory is currently named with the expected_shred_version;
however, the backup contains shreds that do NOT match the
expected_shred_version. So, use the found (incorrect) shred version in
the name instead.
2024-02-13 09:42:05 -07:00
Andrew Fitzgerald 1517d22ecc
Scheduler - prioritization fees/cost (#34888) 2024-02-09 08:51:21 -08:00
Dmitri Makarov 245d1c4087
SVM: Move TransactionCheckResult definition from accounts-db to SVM (#35153) 2024-02-08 21:13:00 -05:00
Dmitri Makarov 2c0001b530
SVM: Move RewardInfo from accounts-db to Solana SDK (#35120) 2024-02-07 10:55:39 -08:00
Pankaj Garg 46b9586630
SVM: Move SVM code to its own crate folder (#35119) 2024-02-06 16:06:32 -08:00
Pankaj Garg 10defb161f
SVM: Move TransactionErrorMetrics to SVM folder (#35112) 2024-02-06 11:15:48 -08:00
Ashwin Sekar 3e24b410fb
replay: votes made before restart are eligible for refresh (#34737)
* replay: votes made before restart are eligible for refresh

* pr feedback: rename to mark

* pr feedback: limit scope to non voting validators
2024-02-06 11:09:59 -08:00
Andrew Fitzgerald 9dca15a5b7
Rename priority to compute_unit_price (#35062)
* rename several priorities to compute_unit_price

* TransactionPriorityDetails -> ComputeBudgetDetails

* prioritization_fee_cache: fix comment

* transaction_state: fix comments and variable names

* immutable_deserialized_packet: fix comment
2024-02-05 16:41:01 -08:00
Ashwin Sekar 0e4e81a44c
banking stage: remove spammy packet conversion metric (#35014) 2024-02-05 14:46:32 -08:00
Pankaj Garg 3cf5dd2afb
SVM: Move RuntimeConfig to svm folder (#35085) 2024-02-05 13:49:36 -08:00
Brooks f62293918d
Moves the async deleter code to accounts-db (#35040) 2024-02-02 09:21:26 -05:00
galactus 35f900b03b
Metrics prioritization fees (#34653)
* Adding metrics for prioritization fees min/max per thread

* Adding scheduled transaction prioritization fees to the metrics

* Changes after Andrew's comments

* Fixing Tao's comments

* Adding metrics to the new scheduler

* Fixing getting of min max for TransactionStateContainer

* Fix clippy CI Issue

* Changes after Andrew's comments about min/max for new scheduler

* Creating a new structure to store prio fee metrics

* Reporting with prio fee stats banking_stage_scheduler_counts

* merging prioritization stats into SchedulerCountMetrics

* Minor changes after Andrew's review
2024-02-01 15:06:45 -06:00
Brooks daa2449ad4
Removes RwLock on AccountsDb::shrink_paths (#35027) 2024-02-01 09:35:34 -05:00
Lijun Wang 8fde8d26c7
don't sign X.509 certs (#34896)
This gets rid of the third-party component rcgen in the path of private key access to make the code more secure.
2024-01-28 16:17:46 -08:00
behzad nouri 79bbe4381a
adds chained_merkle_root to shredder arguments (#34952)
Working towards chaining Merkle root of erasure batches, the commit adds
chained_merkle_root to shredder arguments.
2024-01-27 15:04:31 +00:00
behzad nouri d4fdcd940a
adds feature to enable chained Merkle shreds (#34916)
During a cluster upgrade when only half of the cluster can ingest the new shred
variant, sending shreds of the new variant can cause nodes to diverge.
The commit adds a feature to enable chained Merkle shreds explicitly.
2024-01-27 15:03:16 +00:00
Brooks 02062a6b6a
Removes unused AccountsHashFaultInjector (#34977) 2024-01-26 19:21:23 -05:00
Brooks e1260a9604
Removes unused parameters from AccountsHashVerifier::new() (#34976) 2024-01-26 21:52:05 +00:00
Pankaj Garg 0d117d420c
Remove BlockhashQueue dependency from SVM related code (#34974) 2024-01-26 13:46:44 -08:00
Brooks c656ca68b8
Stops pushing accounts hashes to gossip in AccountsHashVerifier (#34971) 2024-01-26 15:25:23 -05:00
Ashwin Sekar 93271d91b0
gossip: notify state machine of duplicate proofs (#32963)
* gossip: notify state machine of duplicate proofs

* Add feature flag for ingesting duplicate proofs from Gossip.

* Use the Epoch the shred is in instead of the root bank epoch.

* Fix unittest by activating the feature.

* Add a test for feature disabled case.

* EpochSchedule is now not copyable, clone it explicitly.

* pr feedback: read epoch schedule on startup, add guard for ff recache

* pr feedback: bank_forks lock, -cached_slots_in_epoch, init ff

* pr feedback: bank.forks_try_read() -> read()

* pr feedback: fix local-cluster setup

* local-cluster: do not expose gossip internals, use retry mechanism instead

* local-cluster: split out case 4b into separate test and ignore

* pr feedback: avoid taking lock if ff is already found

* pr feedback: do not cache ff epoch

* pr feedback: bank_forks lock, revert to cached_slots_in_epoch

* pr feedback: move local variable into helper function

* pr feedback: use let else, remove epoch 0 hack

---------

Co-authored-by: Wen <crocoxu@gmail.com>
2024-01-26 07:58:37 -08:00
Andrew Fitzgerald 29737ab5e4
Use ThreadLocalMultiIterator for tests (#34947)
* Use ThreadLocalMultiIterator for tests

* some validator config was not using default_for_test
2024-01-25 11:22:27 -07:00
Pankaj Garg b161f6ce08
Create SVM folder as a placeholder for the relevant code (#34942) 2024-01-25 06:20:00 -08:00
Andrew Fitzgerald 62e7ebd0cc
BlockProductionMethod::CentralScheduler as default (#34891) 2024-01-24 15:30:32 -08:00
Brooks b150de6d10
Replaces fs-err in clean_orphaned_account_snapshot_dirs() (#34902)
* Replaces fs-err in clean_orphaned_account_snapshot_dirs()

* pr: revert info message format changes
2024-01-23 19:46:02 +00:00
Andrew Fitzgerald bb829c0bcf
remove unused functions (#34895) 2024-01-23 09:32:35 -08:00