Commit Graph

4001 Commits

Author SHA1 Message Date
GHA: Update Upstream From Fork 27eff8408b Revert "Allow configuration of replay thread pools from CLI (#236)"
This reverts commit 973d05c098.
2024-03-22 15:58:10 -05:00
steviez 973d05c098 Allow configuration of replay thread pools from CLI (#236)
Bubble up the constants to the CLI that control the sizes of the
following two thread pools:
- The thread pool used to replay multiple forks in parallel
- The thread pool used to execute transactions in parallel
2024-03-20 16:29:43 -05:00
Tao Zhu 0119437764 qos service should also accumulate executed but errored units (#328)
qos service should also accumulate executed but errored units
2024-03-20 16:28:38 -05:00
Brennan 01e48239be fix polarity for concurrent replay (#297)
* fix polarity for concurrent replay
2024-03-20 16:28:37 -05:00
Pankaj Garg 403225f112 Remove public visibility of program cache from bank (#279) 2024-03-20 16:24:48 -05:00
Trent Nelson e80f8fa9e6 remove raptor coding experiments (#255) 2024-03-15 22:25:14 -05:00
Wen 9e394bd0e5 Wen_restart: check block full using blockstore (#250)
* Switch to blockstore.is_full() check because replay thread isn't active.

* Use make_chaining_slot_entries and add first_parent to the method.
Small style fixes.
2024-03-15 22:25:14 -05:00
steviez 55408093cd Make ReplayStage own the threadpool for tx replay (#190)
The threadpool used to replay multiple transactions in parallel is
currently global state via a lazy_static definition. Making this pool
owned by ReplayStage will enable subsequent work to make the pool
size configurable on the CLI.

This makes `ReplayStage` create and hold the threadpool which is passed
down to blockstore_processor::confirm_slot().

blockstore_processor::process_blockstore_from_root() now creates its
own threadpool as well; however, this pool is only alive for the
scope of that function and does not persist for the lifetime of the
process.
2024-03-15 22:22:45 -05:00
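
The ownership change described in #190 above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `TxReplayPool`, `map_sum`, and the simplified `confirm_slot` signature are hypothetical stand-ins built on the standard library only. The point is that the pool is created and held by `ReplayStage`, with its size passed in by the caller, and handed down by reference instead of living in global state.

```rust
use std::thread;

/// Hypothetical stand-in for a real thread pool type; it only records a size.
pub struct TxReplayPool {
    num_threads: usize,
}

impl TxReplayPool {
    pub fn new(num_threads: usize) -> Self {
        assert!(num_threads > 0, "pool needs at least one thread");
        Self { num_threads }
    }

    /// Applies `f` to every item on at most `num_threads` scoped threads
    /// and sums the results.
    pub fn map_sum(&self, items: &[u64], f: fn(u64) -> u64) -> u64 {
        if items.is_empty() {
            return 0;
        }
        // Ceiling division so that no more than `num_threads` chunks exist.
        let chunk_len = (items.len() + self.num_threads - 1) / self.num_threads;
        thread::scope(|s| {
            let handles: Vec<_> = items
                .chunks(chunk_len)
                .map(|chunk| s.spawn(move || chunk.iter().map(|&x| f(x)).sum::<u64>()))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker panicked"))
                .sum::<u64>()
        })
    }
}

/// Simplified, hypothetical signature: the pool is passed in by the caller
/// rather than read from a global.
pub fn confirm_slot(pool: &TxReplayPool, tx_costs: &[u64]) -> u64 {
    pool.map_sum(tx_costs, |cost| cost * 2)
}

/// The stage owns the pool, so its size can come from configuration.
pub struct ReplayStage {
    tx_replay_pool: TxReplayPool,
}

impl ReplayStage {
    pub fn new(num_threads: usize) -> Self {
        Self {
            tx_replay_pool: TxReplayPool::new(num_threads),
        }
    }

    pub fn replay(&self, tx_costs: &[u64]) -> u64 {
        confirm_slot(&self.tx_replay_pool, tx_costs)
    }
}
```

Because the size is a constructor argument here, wiring it to a CLI flag later only touches the call site that builds the stage.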
Brooks 096a1f4e5c Removes holding storages in AccountsHashVerifier for fastboot (#120) 2024-03-15 22:22:45 -05:00
steviez 1fcef51714 Make ReplayStage create the parallel fork replay threadpool (#137)
ReplayStage owning the pool allows for subsequent work to configure
the size of the pool; configuring the size of the pool inside of the
lazy_static would have been a little messy.
2024-03-09 13:28:08 -06:00
Lucas Steuernagel 2c55819e7b Gather recording booleans in a data structure (#134) 2024-03-09 13:28:08 -06:00
Dmitri Makarov 264f4dfdd0 [SVM] Move RuntimeConfig to program-runtime (#96)
RuntimeConfig doesn't use anything SVM specific and logically belongs
in program runtime rather than SVM.  This change moves the definition
of RuntimeConfig struct from the SVM crate to program-runtime and
adjusts `use` statements accordingly.
2024-03-09 13:27:11 -06:00
Tao Zhu e5ec7853c6 Combine builtin and BPF compute cost in cost model (#29)
* Combine builtin and BPF execution cost into programs_execution_cost since VM has started to consume CUs uniformly

* update tests

* apply suggestions from code review
2024-03-09 13:26:34 -06:00
steviez f3c6c08752 Give SigVerify and ShredFetch threads unique names (#98)
- solTvuFetchPmod ==> solTvuPktMod + solTvuRepPktMod
- solSigVerifier ==> solSigVerTpu + solSigVerTpuVot
2024-03-09 13:23:06 -06:00
steviez 4753dcae71 Rename and uniquify QUIC thread names (#28)
When viewing threads in tools such as gdb and perf, it is not easy to
distinguish which threads are serving which function (TPU or TPU FWD).
2024-03-09 13:23:06 -06:00
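
The two thread-naming commits above (#98 and #28) depend on each thread being given a distinct name when it is spawned. A minimal sketch using only the standard library is shown below; `spawn_named` is a hypothetical helper, and the names in the usage note are taken from the #98 message. Note that Linux limits thread names to 15 bytes, which may explain why these names are so terse.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical helper: spawns a thread with the given name and returns the
/// name the thread observes for itself.
pub fn spawn_named(name: &str) -> JoinHandle<Option<String>> {
    thread::Builder::new()
        .name(name.to_string())
        .spawn(|| thread::current().name().map(str::to_string))
        .expect("failed to spawn thread")
}
```

For example, `spawn_named("solSigVerTpu")` and `spawn_named("solSigVerTpuVot")` create two threads that tools such as gdb and perf can tell apart, where a shared `solSigVerifier` name would not.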
steviez e81ecfbe9f
Revert "remove repetitive words (#35434)" (#35436)
This reverts commit 556a749948.
2024-03-08 23:36:52 -06:00
gcmutator 556a749948
remove repetitive words (#35434)
Signed-off-by: gcmutator <329964069@qq.com>
2024-03-08 23:01:33 -06:00
Brooks f94752d514 Adds StartingSnapshotStorages to AccountsHashVerifier (#58) 2024-03-05 09:43:25 -06:00
Wen bfe44d95f4
Wen restart aggregate last voted fork slots (#33892)
* Push and aggregate RestartLastVotedForkSlots.

* Fix API and lint errors.

* Reduce clutter.

* Put my own LastVotedForkSlots into the aggregate.

* Write LastVotedForkSlots aggregate progress into local file.

* Fix typo and name constants.

* Fix flaky test.

* Clarify the comments.

* - Use constant for wait_for_supermajority
- Avoid waiting after first shred when repair is in wen_restart

* Fix delay_after_first_shred and remove loop in wen_restart.

* Read wen_restart slots inside the loop instead.

* Discard turbine shreds while in wen_restart in windows insert rather than
shred_fetch_stage.

* Use the new Gossip API.

* Rename slots_to_repair_for_wen_restart and a few others.

* Rename a few more and list all states.

* Pipe exit down to aggregate loop so we can exit early.

* Fix import of RestartLastVotedForkSlots.

* Use the new method to generate test bank.

* Make linter happy.

* Use new bank constructor for tests.

* Fix a bad merge.

* - add new const for wen_restart
- fix the test to cover more cases
- add generate_repairs_for_slot_not_throtted_by_tick and
  generate_repairs_for_slot_throtted_by_tick to make it readable

* Add initialize and put the main logic into a loop.

* Change aggregate interface and other fixes.

* Add failure tests and tests for state transition.

* Add more tests and add ability to recover from written records in
last_voted_fork_slots_aggregate.

* Various name changes.

* We don't really care what type of error is returned.

* Wait on expected progress message in proto file instead of sleep.

* Code reorganization and cleanup.

* Make linter happy.

* Add WenRestartError.

* Split WenRestartErrors into separate erros per state.

* Revert "Split WenRestartErrors into separate erros per state."

This reverts commit 4c920cb8f8d492707560441912351cca779129f6.

* Use individual functions when testing for failures.

* Move initialization errors into initialize().

* Use anyhow instead of thiserror to generate backtrace for error.

* Add missing Cargo.lock.

* Add error log when last_vote is missing in the tower storage.

* Change error log info.

* Change test to match exact error.
2024-03-01 18:52:47 -08:00
steviez 7d6f1d5911
Give streamer::receiver() threads unique names (#35369)
The name was previously hard-coded to solReceiver. The use of the same
name makes it hard to figure out which thread is which when these
threads are handling many services (Gossip, Tvu, etc).
2024-03-01 13:36:08 -06:00
Andrew Fitzgerald ede9163633
Comments clarifying non-emptiness of threadset (#35388) 2024-03-01 11:18:42 -08:00
steviez 7c878973e2
Cleanup ReplayStage loop timing struct (#35361)
- Track loop_count in the struct
- Rename ReplayTiming ==> ReplayLoopTiming
- Make all metrics consistently end with "_elapsed_us"
2024-03-01 12:30:50 -06:00
Pankaj Garg 990ca1d0b8
Add limit to looping in banking-stage (#35342) 2024-02-28 17:36:45 -08:00
Brooks 6aaaf858c9
Adds more info to panic message in AccountsHashVerifier (#35353) 2024-02-28 15:55:05 -05:00
behzad nouri a7a41e7631
adds Merkle shred variant with retransmitter's signature (#35293)
Moving towards locking down the Turbine propagation path, this commit
reserves a buffer within the shred payload for the retransmitter's signature.
2024-02-28 20:31:40 +00:00
steviez 140818221c
Rename SamplePerformanceService thread for consistency (#35332)
- Rename thread
- Add uniform service start/stop logs
- Misc cleanup with variables / constants / exit flag check
2024-02-28 13:47:27 -06:00
Andrew Fitzgerald 9f581113bd
Scheduler: Leader-Slot metrics for Scheduler (#35087) 2024-02-23 17:06:22 -08:00
enjoyoor c02f47a6fb
fix: cleanup (#35298) 2024-02-23 19:59:52 +00:00
Tao Zhu 139b9c8c25
Add fee_details to fee calculation (#35021)
* add fee_details to fee calculation

* fix - no need to round after summing u64

* feature gate on removing unwanted rounding
2024-02-23 08:58:48 -06:00
Andrew Fitzgerald 367f489f63
scheduler inner metrics (#35271) 2024-02-22 15:01:08 -08:00
Ashwin Sekar 07955e79ad
replay: gracefully exit if tower load fails (#35269) 2024-02-21 18:51:30 -08:00
Ryo Onodera 024d6ecc4f
Add --unified-scheduler-handler-threads (#35195)
* Add --unified-scheduler-handler-threads

* Adjust value name

* Warn if the flag was ignored

* Tweak message a bit
2024-02-22 09:05:17 +09:00
DimAn 531793b4be
validator: ignore too old tower error (#35229)
* validator: ignore too old tower error

* Update core/src/replay_stage.rs

Co-authored-by: Ashwin Sekar <ashwin@solana.com>

* remove redundant references

---------

Co-authored-by: Ashwin Sekar <ashwin@solana.com>
2024-02-21 13:23:23 -05:00
steviez 4905076fb6
Remove channel that sends roots to BlockstoreCleanupService (#35211)
Currently, ReplayStage sends new roots to BlockstoreCleanupService, and
BlockstoreCleanupService decides when to clean based on advancement of
the latest root. This is totally unnecessary as the latest root is
cached by the Blockstore, and this value can simply be fetched.

This change removes the channel completely, and instead just fetches
the latest root from Blockstore directly. Moreover, some logic is added
to check the latest root less frequently, based on the set purge
interval.

All in all, we went from sending > 100 slots/min across a crossbeam
channel to reading an atomic roughly 3 times/min, while also removing
the need for an additional thread that read from the channel.
2024-02-21 10:16:16 -06:00
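
The approach described in #35211 above, reading a cached root instead of receiving every root over a channel, might look roughly like the sketch below. All type and method names here (`Blockstore`, `max_root`, `CleanupState`, `should_purge`) are simplified assumptions for illustration and are not taken from the actual code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a store that caches its latest root in an atomic.
pub struct Blockstore {
    max_root: AtomicU64,
}

impl Blockstore {
    pub fn new() -> Self {
        Self {
            max_root: AtomicU64::new(0),
        }
    }

    /// Called by the writer (replay, in this sketch) whenever a new root is set.
    pub fn set_root(&self, slot: u64) {
        self.max_root.fetch_max(slot, Ordering::Relaxed);
    }

    /// Cheap read used by the cleanup side; no channel or extra thread needed.
    pub fn max_root(&self) -> u64 {
        self.max_root.load(Ordering::Relaxed)
    }
}

/// Hypothetical cleanup bookkeeping: purge only once the root has advanced
/// by at least `purge_interval` slots since the last purge.
pub struct CleanupState {
    last_purge_root: u64,
    purge_interval: u64,
}

impl CleanupState {
    pub fn new(purge_interval: u64) -> Self {
        Self {
            last_purge_root: 0,
            purge_interval,
        }
    }

    pub fn should_purge(&mut self, blockstore: &Blockstore) -> bool {
        let root = blockstore.max_root();
        if root.saturating_sub(self.last_purge_root) >= self.purge_interval {
            self.last_purge_root = root;
            true
        } else {
            false
        }
    }
}
```

A cleanup loop would call `should_purge` on a timer, so the cost per check is a single atomic load regardless of how many roots were set in between.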
steviez 5c04a9731c
Obtain BankForks read lock once to get ancestors and descendants (#35273)
No need to get the read lock twice; instead, hold it and get both items
2024-02-21 10:07:57 -06:00
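
The locking pattern in #35273 above can be illustrated with a small standard-library sketch. The `BankForks` struct here is a simplified, hypothetical stand-in that stores the two maps directly; the real type and its accessors are not shown in this log.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

type SlotMap = HashMap<u64, HashSet<u64>>;

/// Hypothetical stand-in holding precomputed ancestor and descendant maps.
pub struct BankForks {
    pub ancestors: SlotMap,
    pub descendants: SlotMap,
}

/// Takes the read lock once and copies out both maps while it is held,
/// instead of locking separately for each one.
pub fn ancestors_and_descendants(bank_forks: &RwLock<BankForks>) -> (SlotMap, SlotMap) {
    let guard = bank_forks.read().expect("lock poisoned");
    (guard.ancestors.clone(), guard.descendants.clone())
    // `guard` is dropped here, releasing the read lock.
}
```

Besides saving a lock acquisition, reading both maps under one guard means they reflect the same state, since no writer can run in between.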
Andrew Fitzgerald cd4cf814fc
Scheduler: Separate scheduler metrics module (#35216) 2024-02-20 19:39:00 -08:00
Ashwin Sekar b0134ab04d
validator: include waited_for_supermajority in startup metric (#35137) 2024-02-20 16:13:57 -08:00
Dmitri Makarov 0acee67891
SVM: move transaction_results from accounts-db to SVM (#35183)
SVM: Remove accounts-db deps in accounts_loader tests
2024-02-20 12:54:56 -08:00
Ashwin Sekar befe8b9d98
replay: reload tower if set-identity during startup (#35173)
* replay: reload tower if set-identity during startup

* pr feedback: add unit tests

* pr feedback: use tower.node_pubkey, more descriptive names
2024-02-20 09:30:46 -08:00
HaoranYi ebf60359f4
clean up dev-context-only attribute (#35201)
Co-authored-by: HaoranYi <haoran.yi@solana.com>
2024-02-19 07:56:27 -06:00
sakridge e21251090f
Remove spammy banking-stage retryable tx metric which is not needed (#35207)
Already covered by other metrics like the filtered retryable and the
number filtered.
2024-02-16 18:29:42 +01:00
steviez 897adb2711
Update the directory naming for incorrect shred version backup (#35158)
The directory is currently named with the expected_shred_version;
however, the backup contains shreds that do NOT match the
expected_shred_version. So, use the found (incorrect) shred version in
the name instead.
2024-02-13 09:42:05 -07:00
Andrew Fitzgerald 1517d22ecc
Scheduler - prioritization fees/cost (#34888) 2024-02-09 08:51:21 -08:00
Dmitri Makarov 245d1c4087
SVM: Move TransactionCheckResult definition from accounts-db to SVM (#35153) 2024-02-08 21:13:00 -05:00
Dmitri Makarov 2c0001b530
SVM: Move RewardInfo from accounts-db to Solana SDK (#35120) 2024-02-07 10:55:39 -08:00
Pankaj Garg 46b9586630
SVM: Move SVM code to its own crate folder (#35119) 2024-02-06 16:06:32 -08:00
Pankaj Garg 10defb161f
SVM: Move TransactionErrorMetrics to SVM folder (#35112) 2024-02-06 11:15:48 -08:00
Ashwin Sekar 3e24b410fb
replay: votes made before restart are eligible for refresh (#34737)
* replay: votes made before restart are eligible for refresh

* pr feedback: rename to mark

* pr feedback: limit scope to non voting validators
2024-02-06 11:09:59 -08:00
Andrew Fitzgerald 9dca15a5b7
Rename priority to compute_unit_price (#35062)
* rename several priorities to compute_unit_price

* TransactionPriorityDetails -> ComputeBudgetDetails

* prioritization_fee_cache: fix comment

* transaction_state: fix comments and variable names

* immutable_deserialized_packet: fix comment
2024-02-05 16:41:01 -08:00
Ashwin Sekar 0e4e81a44c
banking stage: remove spammy packet conversion metric (#35014) 2024-02-05 14:46:32 -08:00