Commit Graph

3739 Commits

Author SHA1 Message Date
Trent Nelson ad67fd5be5
validator: remove optional remote accounts hash consistency check (#31279) 2023-05-16 14:23:13 -06:00
Brennan a9b19f5b14
Add replay invalidator scaffolding upstream (#31567)
* Add replay invalidator scaffolding upstream
2023-05-16 13:08:39 -07:00
Brooks dd4cfe9924
Do not purge bank snapshots in AccountsBackgroundService (#31647) 2023-05-15 13:53:48 -04:00
Andrew Fitzgerald 694099bbe3
Remove unused debug (#31618) 2023-05-15 09:38:32 -07:00
Brooks bac4d50761
Uses `_` instead of `-` for datapoint field names (#31648) 2023-05-15 12:02:43 -04:00
Andrew Fitzgerald d2bd6c72aa
Keep signal_receiver in scope (#31625) 2023-05-15 08:56:57 -07:00
behzad nouri 5178d4d49b
adds quic tvu port to contact-info (#31614)
Working towards migrating turbine to QUIC.
2023-05-15 15:13:21 +00:00
Ashwin Sekar 3e8f5bad81
refactor: highest_cluster_confirmed_root -> highest_super_majority_root (#31619) 2023-05-14 00:42:03 -07:00
Ashwin Sekar c85b057cc8
disambiguate the matches then mismatches case for ancestor samples (#31617) 2023-05-13 11:12:21 -07:00
Ashwin Sekar ef75f1cb4e
Add ancestor hashes to state machine (#31627)
* Notify replay of pruned duplicate confirmed slots

* Ingest replay signal and run ancestor hashes for pruned

* Forward PDC to ancestor hashes and ingest pruned dumps from ancestor hashes service

* Add local-cluster test
2023-05-13 02:05:44 -07:00
Brooks 962650e88e
AccountsHashVerifier purges old bank snapshots (#31519) 2023-05-12 17:02:43 -04:00
Brooks 8e5e66fdb4
Revert "Revert "SnapshotPackagerService purges old bank snapshots (#31511)" (#31524)" (#31558)
This reverts commit 775639c058.
2023-05-12 15:39:14 -04:00
Andrew Fitzgerald 6adbb1254c
consumer bench (#31414) 2023-05-12 09:44:37 -07:00
behzad nouri 4e34abbf3d
specifies protocol in contact-info get-socket api (#31602) 2023-05-12 16:16:20 +00:00
Andrew Fitzgerald 2c869ef778
QoS refactor: Allow pre-filtering (#31542) 2023-05-12 08:53:22 -07:00
Jeff Washington (jwash) 3e543665c7
remove unused functions connecting hash calc and ancient append vec boundary (#31575)
remove coupling between hash calc and ancient append vec boundary
2023-05-11 13:30:44 -07:00
Jeff Washington (jwash) 122b05b9f5
pass include_slot_in_hash through hash calcs to allow rehashing if hash is not stored (#31579)
* pass include_slot_in_hash through hash calcs to allow rehashing

* tests use each include_slot_in_hash value

* move include_slot_in_hash

* typo

* reorder struct init

* spelling is hard
2023-05-11 13:23:29 -07:00
Brooks 93087324e3
Uses AccountsBackgroundService::setup_bank_drop_callback() in tests (#31598) 2023-05-11 12:40:03 -04:00
Tao Zhu 49f44f5ded
Refactor pass feature status to deserialized packet via packet meta (#31549)
Add a flag to packet, set its value by packet_deserializer when received by banking_stage with working_bank
2023-05-11 09:31:05 -05:00
Tyera 3f70ddb2c5
Add entry notification service for geyser (#31290)
* Move entry_notifier_interface

* Add EntryNotifierService

* Use descriptive struct in sender/receiver

* Optionally initialize EntryNotifierService in validator

* Plumb EntryNotfierSender into Tvu, blockstore_processor

* Plumb EntryNotfierSender into Tpu

* Only return one option when constructing EntryNotifierService
2023-05-10 17:20:51 -06:00
Ashwin Sekar c900ef8290
refactor: combine SlotStateUpdate impls (#31578) 2023-05-10 15:12:39 -06:00
steviez 18a118b438
Condense banking_stage counters into existing datapoint (#31564)
Counters incur additional overhead in sending points to the MetricsAgent
over a crossbeam channel. Additionally, some of these counters would be
submitted by non-voting nodes which is just extra overhead and noise.

This change condenses several updates of a counter into a field of the
existing BankingStageStats metrics struct.
2023-05-10 15:44:42 -05:00
Tao Zhu fb7ba97afc
refactor: move test and bench only code from main code to where they are needed (#31550)
* refactor: move test and bench only code to specific location
* remove inactive bench test
2023-05-09 16:39:23 -05:00
Brooks 3bb2e3b546
Purges incomplete snapshot dirs at startup (#31555) 2023-05-09 14:08:12 -04:00
steviez 4300d84c68
Remove counters from ReplayStage (#31532)
replay_stage-voted_empty_bank has been converted into a datapoint that
now includes slot number. replay_stage-replay_transactions has been
removed altogether as we can get similar information on a per-slot basis
from replay-slot-stats metric.
2023-05-09 11:44:02 -05:00
behzad nouri 6a4a0418a6
removes hard-coded QUIC_PORT_OFFSET from connection-cache (#31541)
New ContactInfo has api identifying QUIC vs UDP ports; no need to hard-code
port-offset deep in connection-cache.
2023-05-09 13:46:17 +00:00
Brooks 6e342ded42
clippy: Removes redundant async blocks (#31526) 2023-05-09 09:35:38 -04:00
behzad nouri 8e638b785a
removes feature gate code sending votes to tpu-vote-port (#31529) 2023-05-08 18:12:35 +00:00
Tao Zhu 1f91a90a53
Refactor remove unnecessary parameter (#31520)
Refactor: remove unnecessary parameter from DeserializedPacket constructor
2023-05-07 10:56:24 -05:00
Brooks 775639c058
Revert "SnapshotPackagerService purges old bank snapshots (#31511)" (#31524)
This reverts commit a6c39ded8e.
2023-05-06 09:18:03 -04:00
Tao Zhu b19cc03c9a
Refactor: remove test only public function, update tests (#31518) 2023-05-05 17:22:09 -05:00
Brooks a6c39ded8e
SnapshotPackagerService purges old bank snapshots (#31511) 2023-05-05 17:22:48 -04:00
Brooks c5e071c7fe
Upgrades nightly Rust to 2023-03-04 (#31487) 2023-05-05 08:28:23 -04:00
HaoranYi 0f4293914c
remove unnecessary-struct-initialization (#31486)
* remove unnecessary-struct-initialization

* more  remove unnecessary-struct-initialization

---------

Co-authored-by: haoran <haoran.yi@solana.com>
2023-05-04 17:48:33 -05:00
Brooks ef7470f50c
Removes needless borrows (#31489) 2023-05-04 18:09:17 +00:00
Andrew Fitzgerald 886aea21cb
Internal structs for ThreadAware AccountRead/WriteLocks (#31431) 2023-05-04 09:38:34 -07:00
Andrew Fitzgerald 18cd4311af
remove counters in hotpath (#31398) 2023-05-04 09:36:28 -07:00
Jeff Biseda 19319d5b70
Rationalize (Slot, Hash) in repair by removing SlotHash type (#31470) 2023-05-03 14:03:05 -07:00
Lijun Wang 7cf50e60fc
Fixed missing Root notifications via geyser plugin framework (#31180)
* Fixed missing Root notifications via geyser plugin framework

* Renamed a variable

* fmt issue

* Do not try the loop if no subscribers.

* Addressing some feedback -- passing parent roots from replay_stage to avoid race conditions

* clippy issue

* Address some reviewing findings

* Addressed some feedback from Carl

* fix a clippy issue

* Added comments on optimistically_confirmed_bank_tracker module to explain the workflow

* Addressed Trent's review
2023-05-03 18:50:00 +08:00
Brooks 90e1b00fb5
Uses get_bank_snapshot_dir() in `core/tests/snapshot.rs` (#31434) 2023-05-01 22:57:59 +00:00
Xiang Zhu 0a2e897f16
Clean up the outdated SnapshotPackage snapshot_links field (#31360)
* Remove snapshot_links

* Change the function name from snapshot_dir to bank_snapshot_dir

* Format fix

* Fix test_concurrent_snapshot_packaging

* Fix clippy error

* Fix nits

* Fix nits 2nd try

* Use get_bank_snapshots_dir

* Use slot_dir

* Revert "Use get_bank_snapshots_dir" because get_bank_snapshots_dir is private to crate

This reverts commit 1ed9b3b2c8e84689a918beee7159f63c56500a96.
2023-05-01 11:24:59 -07:00
steviez 427ad7b5bd
Combine AccountsHashVerifier metrics (#31420)
It is more efficient to submit the metrics together, and there is no
reason for them to be separate.
2023-04-30 21:39:44 -04:00
Jeff Biseda b5bb5c6da1
filter invalid repair requests by size (#30951) 2023-04-28 16:57:15 -07:00
Brennan e79b84ea70
Rework tx sig verify batching (#31355)
* Rework tx sig verify batching to eliminate special casing and increase the packet limit for sigverify from 2k to 5k
2023-04-28 09:21:12 -07:00
behzad nouri aafcac27d8
removes pubkey from LegacyContactInfo public interface (#31375)
Working towards LegacyContactInfo => ContactInfo migration, the commit
adds more api parity between the two.
2023-04-28 12:05:15 +00:00
Ryo Onodera a30830d7a9
ci: treewide: deny used_underscore_binding (#31319)
* Enforce used_underscore_binding

* Fix all

* Work around for cfg()-ed code...

* ci....

* Make clipply fixes more pleasant

* Clone exit signal while intentionally shadowing

* Use more verbose code to avoid any #[allow(...)]s
2023-04-27 10:10:16 +09:00
Xiang Zhu f3e94ca73c
AHV processes the snapshot dirs in place (#30978)
* AHV processes the snapshot dirs in place

Let account pacakge use the snapshot dir, so AHV computes the accounts hash and turns the pre snapshot dir into a post snapshot dir

* fix status cache path to maintain the archive layout for the in-place snapshot dir archiving

* fix test_package_snapshots

* Fix test_concurrent_snapshot_packaging

* Remove debug change.

* Fix snapshot_links path

* change to borrow for bank_snapshots_dir

* Reverted changes in create_and_verify_snapshot

* Fix param errors

* Fix rebase errors

* Remove NOTE 1

* Remove unwrap

* Remove the variables to make it apparent taht snapshot_links is the bank_snapshots_dir

* Use soft link instead of hard link for snapshot and status cache

* After switching to soft symlinking, the src path should be absolute
2023-04-26 11:48:48 -07:00
Andrew Fitzgerald 2d91c52373
derive Debug for multi_iterator_scanner::ProcessingDecision (#31312) 2023-04-26 08:53:31 -07:00
Brooks 32fb15655e
Upgrades fs_extra to v1.3.0 (#31341) 2023-04-25 18:29:19 -04:00
steviez 758bc1ca75
Make ReplayStage panic before dumping repeated-repair-attempt slots (#31333)
When ReplayStage repeatedly fails to compute the correct for a block
after purging and repairing, it panics on the assumption that something
is very wrong and will require human intervention.

If this is the case, there is typically something to be debugged, and
having the slot available locally is valuable. This change does the
retry check that will panic before purging the failure slot.
2023-04-25 11:50:47 -05:00
Xiang Zhu 16f3dcd5d2
Update fn create_and_verify_snapshot (#31245)
* only 1 snapshot per archive in create_and_verify_snapshot

* Update create_and_verify_snapshot with the newer funtion calls

* Fix test_package_snapshots

* Remove account path access change

* Rename slot to num_snapshots

* Remove unncessary purge_old_bank_snapshots in test

* Update non-deterministic format comment

* Cleanup unnecessary hash calls

* Use get_accounts_hash

* Remove extra declaration

* Remove rehash

* Remove clean_accounts

* Revert "Cleanup unnecessary hash calls"

This reverts commit 06b1457462cf6d7acf62e0e5531633caf5d9fc58.

Removing unncessary hash calls should be done for create_and_verify_snapshot,
not bank_to_full_snapshot_archive

* Fix typo appenedvecs

* Remove bank_snapshots_dir after rebasing
2023-04-24 18:52:50 -07:00
behzad nouri 1b08d01a80
removes shred_version from LegacyContactInfo public interface (#31304)
Working towards LegacyContactInfo => ContactInfo migration, the commit
adds more api parity between the two.
2023-04-24 15:19:33 +00:00
behzad nouri a88024e295
removes wallclock from LegacyContactInfo public interface (#31303) 2023-04-22 20:18:39 +00:00
Jeff Biseda 3cdd59e55f
check for prior discard in shred_fetch_stage (#31293) 2023-04-21 11:43:10 -07:00
behzad nouri cb65a785bc
makes sockets in LegacyContactInfo private (#31248)
Working towards LegacyContactInfo => ContactInfo migration, the commit
hides some implementation details of LegacyContactInfo and expands API
parity with the new ContactInfo.
2023-04-21 15:39:16 +00:00
Andrew Fitzgerald 7a393e479d
Scheduler Messages (#30976) 2023-04-19 15:14:47 -07:00
Andrew Fitzgerald 10d637d2e6
PohRecorder take Arc not &Arc for blockstore (#31234) 2023-04-19 11:41:18 -07:00
steviez 377ba53a31
Fix bug where ReplayStage holds an Arc<Bank> for process lifetime (#31267)
* Fix bug where ReplayStage holds an Arc<Bank> for process lifetime

When ReplayStage::new() kicks off, it needs to do some setup with the
working bank prior to entering the main processing loop. This setup is
done before entering the main processing loop; however, a bug made it
such that an Arc<Bank> remained in scope after the processing loop had
been entered. The processing loop is only exited when the process exits,
so this means that Bank was being held for the lifetime of the process.
This is a waste of resources and prevents background cleanup.

* clippy
2023-04-19 18:12:34 +00:00
Xiang Zhu a5275f8839
Remove bank_snapshots_dir param (#31249)
* Remove bank_snapshots_dir param

* Remove outdated comment

* Revert "Remove outdated comment"

This reverts commit e4441432bec57edb0dc22c4bacf4d48ce26ed818.

* Handle parent() error

* Fix format error
2023-04-19 09:37:46 -07:00
Andrew Fitzgerald 748220c9d3
Forwarder: Add common setup for tests (#31232) 2023-04-19 09:08:13 -07:00
Brooks ca1bde3591
Use Arc instead of &Arc in AccountsBackgroundService::new (#31268) 2023-04-19 11:10:41 -04:00
Brooks 80b27f3cd9
Use Arc instead of &Arc in AccountsHashVerifier::new (#31269) 2023-04-19 11:10:08 -04:00
Brooks 1d14156832
Use Arc instead of &Arc in SnapshotPackagerService::new (#31270) 2023-04-19 11:09:49 -04:00
bji a45710838d
Add new vote state version that replaces Lockout with LandedVote to a… (#30831)
Add new vote state version that replaces Lockout with LandedVote to allow vote latency to be tracked in a future change.

Includes a feature to be enabled which will when enabled cause the vote state to be written in the new form.
2023-04-18 20:27:38 -07:00
Trent Nelson f34a6bcfce
runtime: transpose `VoteAccount::vote_state()` return to improve ergonomics (#31256) 2023-04-18 14:48:52 -06:00
Brennan 2164a50d00
Move BankIncrementalSnapshotPersistence (#31236)
* Move BankIncrementalSnapshotPersistence

* Update bank serialize ABI digest
2023-04-18 11:18:17 -07:00
Xiang Zhu e74bc4e2e7
Add a filter_by_type param for purge_old_bank_snapshots (#31191)
* Add a type_select param for purge_old_bank_snapshots

* Use flags to make the function calls more readable

* Remove the extra purge calls

* replace select_type with filter_by_type

* Add test

* Use matches

* Fix CI test on reference

* use match and call do_purge once

* Let bank_snapshots_dir be TempDir

* remove account_paths in the test

* replace bank with _bank

* Remove create_snapshot_dirs_for_tests, will take the lastest from master

* Fix merge errors
2023-04-17 16:16:41 -07:00
Xiang Zhu 5747290d51
Move reference-holding last_snapshot_storages from ABS to AHV (#31175)
* Let AHV hold and update last_snapshot_storages

* Clean up comment

* Move cloning after enqueued_time

* Minor positon change

* Remove  type last_snapshot_storages annotation
2023-04-14 14:38:44 -07:00
Andrew Fitzgerald b657004141
store slot on BlockBatchUpdate (#31190) 2023-04-14 13:15:31 -07:00
Brooks d43e19bb03
Refactors the Full/Incremental SnapshotHash types (#31186) 2023-04-13 18:01:27 -04:00
Brooks 1f67591e21
Removes `base` from `IncrementalSnapshotHash` (#31185) 2023-04-13 17:35:35 +00:00
Brooks e05957d8fa
Push starting snapshot hashes in SnapshotGossipManager::new() (#31173) 2023-04-13 11:49:17 -04:00
Andrew Fitzgerald c847236147
decision maker perf (#30618) 2023-04-12 21:40:59 -07:00
Andrew Fitzgerald 01659edd16
Forwarder: forward_packets w/o metrics (#30925) 2023-04-12 14:09:24 -07:00
Brooks 602297e29f
Add comment in SolanaGossipMananger::update_latest_full_snapshot_hash() (#31171) 2023-04-12 16:52:23 +00:00
Brooks 1761c0947b
Removes unused arg from SnapshotGossipManager::new() (#31169) 2023-04-12 16:45:31 +00:00
Brooks 944b9d574a
Only push latest snapshot hashes in SnapshotGossipManager (#31154) 2023-04-12 11:00:26 -04:00
Brooks 965dd37924
Moves SnapshotGossipManager to its own file (#31147) 2023-04-11 15:51:52 -04:00
Brooks 453f272698
Rename IncrementalSnapshotHashes to SnapshotHashes (#31136) 2023-04-11 10:30:29 -04:00
Brooks f3083ad2e0
Rename SnapshotHashes to LegacySnapshotHashes (#31086) 2023-04-10 17:52:20 -04:00
behzad nouri ce21a58b65
reworks streamer::StakedNodes (#31082)
{min,max}_stake are computed but never assigned:
https://github.com/solana-labs/solana/blob/4564bcdc1/core/src/staked_nodes_updater_service.rs#L54-L57

The updater code is also inefficient and verbose.
2023-04-10 17:07:40 +00:00
Brooks f9276d1748
Uses MAX_ACCOUNTS_HASHES instead of MAX_SNAPSHOT_HASHES in accounts_hash_verifier.rs (#31114) 2023-04-10 10:44:40 -04:00
HaoranYi fcd1fe0959
Refactor fault hash injection into lambda (#31093)
* refactor out fault hash inject output AccountsHashVerifier

* refactor faught injector out of AccountHashVerifier

* use type alias

* Apply suggestions from code review

Co-authored-by: Brooks <brooks@prumo.org>

* move type alias

* rename

---------

Co-authored-by: Brooks <brooks@prumo.org>
2023-04-07 17:50:21 -05:00
Tao Zhu 9a7b6abc91
update benches after removing packet.sender_stake (#31110) 2023-04-07 14:27:29 -05:00
Andrew Fitzgerald 926bb0c794
MultiIteratorScanner::finalize returns (payload, already_processed) (#31054)
* finalize() returns (payload, already_processed)

* Additional testing around already_handled

* Return type wrapper and comment update
2023-04-07 11:17:36 -07:00
behzad nouri 466a9a2449
removes ip_stake_map field from streamer::StakedNodes (#31078) 2023-04-07 13:27:29 +00:00
Ryo Onodera f0432ec50f
Avoid overflow in ThreadSet::any() and nits (#31098)
Avoid overflow in ThreadSet::any and etc
2023-04-07 12:45:29 +09:00
behzad nouri 4d0abebe0e
removes Packet Meta.sender_stake and find_packet_sender_stake_stage (#31077)
Packet Meta.sender_stake is unused since
https://github.com/solana-labs/solana/pull/26512
removed sender_stake from banking-stage buffer prioritization.
2023-04-06 21:33:43 +00:00
Andrew Fitzgerald 00250819b8
ThreadAwareAccountLocks (#30422) 2023-04-06 10:12:03 -07:00
Ashwin Sekar 85dbd3d94d
Add stake breakdown to metrics for HeaviestForkFailures (#31067) 2023-04-05 20:35:12 -06:00
Brennan 60c4a718a5
enhance replay partition metrics (#31010)
* enhance replay partition metrics
2023-04-04 19:57:09 -07:00
Ashwin Sekar 9cbaf0e234
Rename DeadSlotAncestorRequestStatus -> AncestorRequestStatus (#31050) 2023-04-04 20:48:45 +00:00
Tyera 3442f184f7
Remove unneeded `clippy::new_ret_no_self` allows (#31035)
Remove unneeded allows
2023-04-03 20:35:20 -06:00
behzad nouri ff9a42a354
uses Duration type instead of untyped ..._ms: u64 (#30971) 2023-03-31 15:42:49 +00:00
steviez cc8e531a5d
Enforce a minimum of 1 on full and incremental snapshot retention (#30968) 2023-03-30 10:16:36 -05:00
Illia Bobyr fe5ae7733b
ledger: confirm_slot_entries(): confirmation_elapsed metric (#30807)
Measure total time spent inside `confirm_slot_entries()`.  Useful metric
in addition to `replay_elapsed` and
`poh_verify_elapsed`/`transaction_verify_elapsed`, as it shows how PoH
and transaction verification interact with the replay process.
2023-03-29 13:11:29 -07:00
Andrew Fitzgerald 8e910b494f
Forwarder: separate get_leader_and_addr (#30922) 2023-03-28 16:34:36 -07:00
Illia Bobyr 564f8c9b17
ledger: Extract `BatchExecutionTiming` (#30806)
Extracted time metrics related to transaction execution into a separate
structure.  This allows me to call `process_entries_with_callback()`
without locking the whole instance of `ConfirmationTiming`, passing just
the `BatchExecutionTiming` part.

I want to add a new metric that starts at the beginning of the
`confirm_slot_entries()` call and ends until the very end.  In order to
use a `scopeguard::defer`, I need to be able to have an excursive
reference to it for the whole body of `confirm_slot_entries()`.

Plus a few minor renamings to clarify which verifications and results
variables actually store.  And corrected a few messages, that
incorrectly stated PoH verification, while they were actually issued
for transaction verification failures.
2023-03-28 15:37:34 -07:00
Andrew Fitzgerald b72be0f086
Forwarder: clean up packet_vec filter (#30921) 2023-03-28 14:54:04 -07:00
Andrew Fitzgerald 2dfc46c71e
Forwarder: separate update_data_budget (#30920) 2023-03-28 12:58:39 -07:00