Commit Graph

189 Commits

Author SHA1 Message Date
behzad nouri 4d0abebe0e
removes Packet Meta.sender_stake and find_packet_sender_stake_stage (#31077)
Packet Meta.sender_stake is unused since
https://github.com/solana-labs/solana/pull/26512
removed sender_stake from banking-stage buffer prioritization.
2023-04-06 21:33:43 +00:00
steviez 5344a789d7
Revert "Add new vote state version that replaces Lockout with LandedV… (#30817)
Revert "Add new vote state version that replaces Lockout with LandedVote to a… (#29524)"

This reverts commit d77f0a22c7.
2023-03-21 22:54:13 +08:00
behzad nouri 5d9aba5548
increases retransmit-stage deduper capacity and reset-cycle (#30758)
For duplicate block detection, for each (slot, shred-index, shred-type)
we need to allow 2 different shreds to be retransmitted.
The commit implements this using two bloom-filter dedupers:
* Shreds are deduplicated using the 1st deduper.
* If a shred is not a duplicate, then we check if:
      (slot, shred-index, shred-type, k)
  is not a duplicate for either k = 0  or k = 1 using the 2nd deduper,
  and if so then the shred is retransmitted.

This allows to achieve larger capactiy compared to current LRU-cache.
2023-03-20 20:32:23 +00:00
bji d77f0a22c7
Add new vote state version that replaces Lockout with LandedVote to a… (#29524)
* Add new vote state version that replaces Lockout with LandedVote to allow vote latency to be tracked in a future change. Includes a feature to be enabled which will when enabled cause the vote state to be written in the new form.

* Update feature set key to one owned by ashwin

---------

Co-authored-by: Ashwin Sekar <ashwin@solana.com>
2023-03-20 08:31:46 -06:00
Tyera 7b1d446001
Admin RPC Service: move post-init activation to before wait-for-supermajority (#30544)
* Move AdminRpcRequestMetadataPostInit to solana-core

* Move AdminRpcRequestMetadataPostInit write to just before wait_for_supermajority

* Pass AdminRpcRequestMetadataPostInit in TestValidatorGenesis

* Fixup local-cluster
2023-03-01 19:38:11 -07:00
Proph3t 2271a3920b
chore: fix broken docs link (#30274)
docs: fix broken link
2023-02-13 13:31:16 -06:00
Ryo Onodera 40bbf99c74
Add fully-reproducible online tracer for banking (#29196)
* Add fully-reproducible online tracer for banking

* Don't use eprintln!()...

* Update programs/sbf/Cargo.lock...

* Remove meaningless assert_eq

* Group test-only code under aptly named mod

* Remove needless overflow handling in receive_until

* Delay stat aggregation as it's possible now

* Use Cow to avoid needless heap allocs

* Properly consume metrics action as soon as hold

* Trace UnprocessedTransactionStorage::len() instead

* Loosen joining api over type safety for replaystage

* Introce hash event to override these when simulating

* Use serde_with/serde_as instead of hacky workaround

* Update another Cargo.lock...

* Add detailed comment for Packet::buffer serialize

* Rename sender_overhead_minimized_receiver_loop()

* Use type interference for TraceError

* Another minor rename

* Retire now useless ForEach to simplify code

* Use type alias as much as possible

* Properly translate and propagate tracing errors

* Clarify --enable-banking-trace with better naming

* Consider unclean (signal-based) node restarts..

* Tweak logging and cli

* Remove Bank events as it's not needed anymore

* Make tpu own banking tracer thread

* Reduce diff a bit..

* Use latest serde_with

* Finally use the published rolling-file crate

* Make test code change more consistent

* Revive dead and non-terminating test code path...

* Dispose batches early now that possible

* Split off thread handle very early at ::new()

* Tweak message for TooSmallDirByteLimitl

* Remove too much of indirection

* Remove needless pub from ::channel()

* Clarify test comments

* Avoid needless event creation if tracer is disabled

* Write tests around file rotation and spill-over

* Remove unneeded PathBuf::clone()s...

* Introduce inner struct instead of tuple...

* Remove unused enum BankStatus...

* Avoid .unwrap() for the case of disabled tracer...
2023-01-25 21:54:38 +09:00
apfitzge 5fc83a3d19
BankingStage Refactor: Separate Next Leader Functions (#29401)
Separate next_leader functions
2023-01-20 10:02:29 -08:00
apfitzge bdd162492c
Feature/multi-iterator-scanner-read-locks (#28862) 2022-11-28 11:23:04 -06:00
Andrew Fitzgerald ee2f760d3d
MultiIteratorScanner - improve banking stage performance with high contention 2022-11-17 10:54:12 -06:00
Ashwin Sekar 9119dc13ec
Add structure to house unprocessed transactions in banking_stage (#27777)
Separate storage for voting and transaction threads:
- Voting threads utilize a shared reference in order to dedup extraneous
  votes
- Transactions have thread local storage like before
2022-09-14 10:40:44 -07:00
Ashwin Sekar c74df830b1
Add structure to collect and coalesce vote packets (#27558)
* Add structure to collect and coalesce vote packets

Will be used in banking stage to throw out extraneous vote packets
before processing

* pr comments

* Update inner lock to arc to improve performance
2022-09-14 00:44:26 -07:00
apfitzge 3bdc5b3f2b
separate packet_deserializer inside banking_stage (#27120)
* separate packet_deserializer inside banking_stage

* Make ReceivePacketResults into a struct with named fields
2022-09-01 10:00:48 -05:00
Tao Zhu 8bb039d08d
collect min prioritization fees when replaying sanitized transactions (#26709)
* Collect blocks' minimum prioritization fees when replaying sanitized transactions

* Limits block min-fee metrics reporting to top 10 writable accounts

* Add service thread to asynchronously update and finalize prioritization fee cache

* Add bench test for prioritization_fee_cache

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2022-08-31 08:00:55 -05:00
apfitzge c03f3b1436
Separate file for ImmutableDeserializedPacket type (#26951) 2022-08-09 22:39:01 -07:00
Tao Zhu 22d465cd57
Share function to get priority details from various transaction types (#26643) 2022-07-15 18:17:22 -05:00
Tao Zhu c1d89ad749
forward packets by prioritization in desc order (#25406)
- Forward packets by prioritization in desc order
- Add support of cost-tracking by transaction requested compute units
- Hook up account buckets to forwarder
- Add metrics for forwardable batches count
- Remove redundant invalid packets filtering at end of slot since forwarder will do the same when batch forwardable packets
- Add bench test for forwarding
2022-07-05 23:24:58 -05:00
carllin ce39c14025
Add end-to-end replay slot metrics (#25752) 2022-07-05 13:58:51 -05:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies shred-version earlier in the pipeline in fetch stage.
2022-06-22 12:17:37 +00:00
behzad nouri d2afa6b418
moves packet-hasher out of the mutex (#26091)
Packet-hasher is not mutated across threads and does not need to be
wrapped in a mutex.
2022-06-21 16:29:27 +00:00
carllin bf8faa8a30
Report banking stage tracer metrics (#25620) 2022-06-09 00:25:37 -05:00
sakridge d71986cecf
Separate staked and un-staked on quic tpu port (#24339) 2022-04-16 10:54:22 +02:00
sakridge 1b7d1f78de
Implement QUIC connection warmup service for future leaders (#24054)
* Increase connection timeouts

* Bump quic connection cache to 1024

* Use constant for quic connection timeout and add warm cache service

* Fixes to QUIC warmup service

* fix check failure

* fixes after rebase

* fix timeout test

Co-authored-by: Pankaj Garg <pankaj@solana.com>
2022-04-15 12:09:24 -07:00
HaoranYi ba770832d0
Poh timing service (#23736)
* initial work for poh timing report service

* add poh_timing_report_service to validator

* fix comments

* clippy

* imrove test coverage

* delete record when complete

* rename shred full to slot full.

* debug logging

* fix slot full

* remove debug comments

* adding fmt trait

* derive default

* default for poh timing reporter

* better comments

* remove commented code

* fix test

* more test fixes

* delete timestamps for slot that are older than root_slot

* debug log

* record poh start end in bank reset

* report full to start time instead

* fix poh slot offset

* report poh start for normal ticks

* fix typo

* refactor out poh point report fn

* rename

* optimize delete - delete only when last_root changed

* change log level to trace

* convert if to match

* remove redudant check

* fix SlotPohTiming comments

* review feedback on poh timing reporter

* review feedback on poh_recorder

* add test case for out-of-order arrival of timing points and incomplete timing points

* refactor poh_timing_points into its own mod

* remove option for poh_timing_report service

* move poh_timing_point_sender to constructor

* clippy

* better comments

* more clippy

* more clippy

* add slot poh timing point macro

* clippy

* assert in test

* comments and display fmt

* fix check

* assert format

* revise comments

* refactor

* extrac send fn

* revert reporting_poh_timing_point

* align loggin

* small refactor

* move type declaration to the top of the module

* replace macro with constructor

* clippy: remove redundant closure

* review comments

* simplify poh timing point creation

Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>
2022-03-30 09:04:49 -05:00
Tao Zhu fd515097d8 leader qos part 2: add stage to find sender stake, set to packet meta 2022-03-17 19:31:28 -05:00
Stephen Akridge 976b138e76 Add tx weighting stage 2022-03-17 19:31:28 -05:00
Tao Zhu 35d1235ed0
- move `unprocessed_packet_batches` from `BankingStage` to its own (#23508)
module
- deserialize packets during receving and buffering
2022-03-10 18:47:46 +00:00
Yueh-Hsuan Chiang b8b7163b66
(Ledger Store) Report RocksDB Column Family Metrics (#22503)
This PR enables blockstore to periodically report RocksDB column family properties.
The reported properties are under blockstore_rocksdb_cfs, and the properties also
support group by operation on cf_name.
2022-03-05 16:13:03 -08:00
HaoranYi 8de88d0a55
Refactor packet_threshold adjustment code into its own struct (#23216)
* refactor packet_threshold adjustment code into own struct and add unittest for it

* fix a typo in error message

* code review feedbacks

* another code review feedback

* Update core/src/ancestor_hashes_service.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* share packet threshold with repair service (credit to carl)

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-03-02 09:09:06 -06:00
carllin 619335df1a
Add execute timings (#23097) 2022-02-17 01:14:32 -05:00
carllin 2f9e30a1f7
Introduce slot-specific packet metrics (#22906) 2022-02-11 03:07:45 -05:00
Ashwin Sekar 5acf0f6331
Add feature gate for new vote instruction and plumb through replay (#21683)
* Add feature gate for new vote instruction and plumb through replay

Add tower versions

* Add check for slot hashes history

* Update is_recent check to exclude voting on hard fork root slot

* Move tower rollback test to flaky and ignore it until #22551 lands
2022-02-07 14:06:19 -08:00
anatoly yakovenko d343713f61
Optimize packet dedup (#22571)
* Use bloom filter to dedup packets

* dedup first

* Update bloom/src/bloom.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/sigverify_stage.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/sigverify_stage.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/sigverify_stage.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* fixup

* fixup

* fixup

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-01-19 13:58:20 -08:00
Justin Starry 93c776ce19
Refactor packet deduplication and harden bench test (#22080) 2021-12-22 23:05:10 -06:00
Jeff Biseda 97a1fa10a6
streamer send destination metrics for repair, gossip (#21564) 2021-12-17 15:21:05 -08:00
Jeff Biseda 2ed7e3af89
prioritize slot repairs for unknown last index and close to completion (#21070) 2021-11-19 19:17:30 -08:00
sakridge 0bda0c3e0c
Add bank drop service (#21322) 2021-11-19 17:20:18 +01:00
Tao Zhu 11153e1f87
refactor cost calculation (#21062)
* - cache calculated transaction cost to allow sharing;
- atomic cost tracking op;
- only lock accounts for transactions eligible for current block;
- moved qos service and stats reporting to its own model;
- add cost_weight default to neutral (as 1), vote has zero weight;

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Update core/src/qos_service.rs

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Update core/src/qos_service.rs

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-11-12 01:04:53 -06:00
sakridge a8d78e89d3
Move test-validator to own module to reduce core dependencies (#20658)
* Move test-validator to own module to reduce core dependencies

* Fix a few TestValidator paths

* Use solana_test_validator crate for solana_test_validator bin

* Move client int tests to separate crate

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-10-29 01:27:07 +00:00
Jeff Biseda 4cac66244d
report udp stats from validator (#20587) 2021-10-15 15:11:11 -07:00
Tao Zhu 005d6863fd
- move cost tracker into bank, so each bank has its own cost tracker; (#20527)
- move related modules to runtime
2021-10-12 08:51:33 -05:00
Tao Zhu 03913f6661
add tx count and thread id to stats, each stat reports and resets when slot changes (#20451) 2021-10-06 00:09:19 -05:00
Michael Vines e9722474eb Move tower storage into its own module 2021-08-11 00:20:46 -07:00
carllin 1ee64afb12
Introduce AncestorHashesService (#18812) 2021-07-23 16:54:47 -07:00
carllin 588c0464b8
Add sampling logic and DuplicateSlotRepairStatus module (#18721) 2021-07-21 11:15:08 -07:00
sakridge 0f8bcf65af
Add voting service (#18552) 2021-07-15 16:35:51 +02:00
carllin 4d3e301ee4
Introduce slot dumping to ReplayStage (#18160) 2021-07-08 19:07:32 -07:00
behzad nouri 04787be8b1
encapsulates turbine peers computations of broadcast & retransmit stages (#18238)
Broadcast stage and retransmit stage should arrange nodes on turbine
broadcast tree in exactly same order. Additionally any changes to this
ordering (e.g. updating how unstaked nodes are handled) requires feature
gating to keep the cluster in sync.

Current implementation is scattered out over several public methods and
exposes too much of implementation details (e.g. usize indices into
peers vector) which makes code changes and checking for feature
activations more difficult.

This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
2021-07-07 00:35:25 +00:00
Tao Zhu 5e424826ba
Persist cost table to blockstore (#18123)
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
2021-07-01 11:32:41 -05:00
Tao Zhu ae27fcbcda
replay stage feed back program cost (#17731)
* replay stage feeds back realtime per-program execution cost to cost model;

* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;

* changed cost unit to microsecond, using value collected from mainnet;

* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.
2021-06-09 17:10:59 -05:00