Commit Graph

3272 Commits

Author SHA1 Message Date
steviez 300666dce7
Make `solana-ledger-tool` run AccountsBackgroundService (#26914)
Prior to this change, long running commands like `solana-ledger-tool
verify` would OOM due to AccountsDb cleanup not happening.

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-04 15:44:31 -05:00
Boqin Qin(秦 伯钦) 83a0f5da0f
core: fix double-readlock in replay_stage (#26052) 2022-08-04 10:45:31 -04:00
Brennan Watt 457f9ef739
Reduce severity level of log in replay (#26893)
* Reduce active banks log severity from warn to trace
2022-08-03 13:51:16 -07:00
github-actions[bot] fbf1bf6d86
Bump Version to 1.11.6 (#26906)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-08-03 12:48:43 -05:00
Brennan Watt f3b760dd91
Add IO metrics (#26804)
* Add Disk IO metrics
2022-08-02 14:29:53 -07:00
behzad nouri ec36f0c5df removes redundant Option<&Arc<...>> wrapper for Blockstore in serve-repair 2022-08-02 15:30:53 +00:00
behzad nouri 6423da0218 removes redundant Arc<RwLock<...>> wrapper off ServeRepair 2022-08-02 15:30:53 +00:00
Jeff Biseda ded9a35cd6
mark repair ping packets for discard only after successful signature verification (#26878) 2022-08-01 16:17:19 -07:00
Jeff Biseda 857be1e237
sign repair requests (#26833) 2022-07-31 15:48:51 -07:00
apfitzge fbfcc3febf
Bugfix: VoteProcessingTiming reset both counters (#26843) 2022-07-29 12:56:04 -05:00
Pankaj Garg fb922f613c
Compute maximum parallel QUIC streams using client stake (#26802)
* Compute maximum parallel QUIC streams using client stake

* clippy fixes

* Add unit test
2022-07-29 08:44:24 -07:00
Brennan Watt 467cb5def5
Concurrent slot replay (#26465)
* Concurrent replay slots

* Split out concurrent and single bank replay paths

* Sub function processing of replay results for readability

* Add feature switch for concurrent replay
2022-07-28 11:33:19 -07:00
Jeff Washington (jwash) 817f65bb50
add full_snapshot to hash config (#26811) 2022-07-28 09:46:34 -05:00
Ashwin Sekar 8d69e8d447
Compact vote state updates to reduce block size (#26616)
* Compact vote state updates to reduce block size

* Add rpc transaction tests
2022-07-27 13:23:44 -06:00
Ashwin Sekar ed539d65b4
Only take the latest vote for each validator in gossip (#25934)
* Only take the latest vote for each validator in gossip

Since the new vote updates are no longer incremental, there
is no value in storing intermediate votes.

* Address pr feedback

* Handle potential downgrade path, FullTowerVote -> Incremental

* Rename sent to bank -> gossip slot

* Handle downgrade case properly

* Only downgrade for newer votes and feature flag, ignore incremental votes otherwise

* Update test
2022-07-26 16:38:30 -06:00
github-actions[bot] 5d038b9d2a
Bump Version to 1.11.5 (#26758)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-07-25 13:05:14 -06:00
carllin f6d5b253fb
Enforce a 12MB limit on outbound repair (#26493) 2022-07-24 20:44:22 -05:00
Jeff Washington (jwash) d9c7bc7e78
Revert "cleanup feature: default units per instruction (#26684)" (#26750)
This reverts commit 39a34db52a.
2022-07-23 11:03:46 -05:00
Trent Nelson a603c8b0bc Enable QUIC servers by default 2022-07-22 15:45:10 -06:00
Trent Nelson 2ee19f536a Revert "Revert "core: disable quic servers on mainnet-beta" (#26216)"
This reverts commit 4a7fb2a808.
2022-07-22 15:45:10 -06:00
Tao Zhu a6215c1b92
Remove unnecessary poh_recorder read lock acquire (#26743)
Remove unnecessary acquiring of poh_recorder read lock
2022-07-22 15:23:05 -05:00
Tao Zhu 333a48e4e2
Add comment for forwarder buffer iteration behavior (#26721)
Add comment of forwarder buffer iteration behavior
2022-07-22 01:15:17 +00:00
Brennan Watt 932abe98a7
Switch UDP stats from usize to u64 (#26700) 2022-07-20 15:20:30 -07:00
Jack May 39a34db52a
cleanup feature: default units per instruction (#26684) 2022-07-20 19:13:34 +00:00
Brennan Watt 502f249904
Add proc net dev metrics to net stats (#26603)
* Add proc net dev metrics to net stats
2022-07-20 11:44:36 -07:00
sakridge 4a7fb2a808
Revert "core: disable quic servers on mainnet-beta" (#26216)
Enable QUIC server
2022-07-20 20:37:24 +02:00
Jeff Washington (jwash) 263911e7fd
save off what we find when calculating hash (#26663) 2022-07-19 09:55:52 -05:00
behzad nouri 2dd8573287
removes erroneous allow(dead_code) annotations from core (#26660) 2022-07-18 17:15:47 +00:00
Tao Zhu 22d465cd57
Share function to get priority details from various transaction types (#26643) 2022-07-15 18:17:22 -05:00
Jeff Washington (jwash) 47716a5e01
async hash verify on load (#26208)
* verify accounts hash in bg on startup

* fix some tests and loading from genesis

* add extra state for when background thread has completed
2022-07-15 14:29:56 -05:00
Tao Zhu f13b5c832d
Remove obsoleted metrics reporting to reduce lock contention on cost_model (#26608)
remove obsoleted metrics reporting to reduce lock contention on cost_model
2022-07-14 23:02:49 -05:00
Pankaj Garg 49a112ae74
Use pubkey of peer for active QUIC connection table (#26597)
* Use pubkey of peer for active QUIC connection table

* clippy

* update code
2022-07-13 09:59:01 -07:00
HaoranYi bf14440895
clean up and optimize account hash verify (#26560)
* remove unused code

* extract test related fault hash inject fn

* use rotate to optimize hashes removal

* use rotate to optimize snapshot hashes removal

* address code reveiw feedbacks

* revise comments

* inline
2022-07-12 19:27:28 +00:00
github-actions[bot] fd5df1cf25
Bump Version to 1.11.4 (#26578)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-07-11 23:30:38 -05:00
Pankaj Garg ea7448c568
Use client certs in QUIC to get peer's stake (#26477)
* Use client certs in QUIC to get peer's stake

* fixes to cert processing

* integrate the code

* clippy

* more cleanup

* sort cargo deps

* test fixes

* info -> debug
2022-07-11 18:06:40 +00:00
Tao Zhu a3b094300b
Remove sender stakes from banking_stage buffer prioritization (#26512)
* remove sender stakes from banking_stage buffer prioritization
2022-07-11 12:46:15 -05:00
Nicholas Clarke ee0a40937e
Add validator argument log_messages_bytes_limit to change log truncation limit.
Add new cli argument log_messages_bytes_limit to solana-validator to control how long program logs can be before truncation
2022-07-11 10:53:18 -05:00
behzad nouri ba785cf8ab
removes erroneous uses of std::mem::swap (#26536)
All instances should be replace by std::mem::{replace,take},
or just plain assignment.
2022-07-11 11:33:15 +00:00
Jeff Washington (jwash) 275e47f931
do this right: add 2nd pass at hash calc when failure seen (#26392) (#26538) 2022-07-10 23:10:22 -05:00
Jeff Washington (jwash) 602da5e51f
add accounts db config to bank tests (#26517) 2022-07-10 19:42:06 -05:00
Ashwin Sekar 734fedea4c
Create a more compact vote state update transaction (#26092)
* Create a more compact vote state update transaction

* pr comments

* change root to not be an option and update abi
2022-07-07 22:29:02 -07:00
carllin 5bffee248c
Cleanup repair logging (#26461) 2022-07-07 15:02:43 -05:00
github-actions[bot] 9d937fb8a0
Bump Version to 1.11.3 (#26481)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-07-07 14:39:46 -05:00
dependabot[bot] 5e8e1beeb5
chore: bump serial_test from 0.6.0 to 0.8.0 (#26463)
Bumps [serial_test](https://github.com/palfrey/serial_test) from 0.6.0 to 0.8.0.
- [Release notes](https://github.com/palfrey/serial_test/releases)
- [Commits](https://github.com/palfrey/serial_test/compare/v0.6.0...v0.8.0)

---
updated-dependencies:
- dependency-name: serial_test
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-07 10:52:20 -06:00
behzad nouri 6f4838719b
decouples shreds sig-verify from tpu vote and transaction packets (#26300)
Shreds have different workload and traffic pattern from TPU vote and
transaction packets. Some of recent changes to SigVerifyStage are not
suitable or at least optimal for shreds sig-verify; e.g. random discard,
dedup with false positives, discard excess by IP-address, ...

SigVerifier trait is meant to abstract out the distinctions between the
two pipelines, but in practice it has led to more verbose and convoluted
code.

This commit discards SigVerifier implementation for shreds sig-verify
and instead provides a standalone stage for verifying shreds signatures.
2022-07-07 11:13:13 +00:00
behzad nouri d33c548660
bypasses window-service stage before retransmitting shreds (#26291)
With recent patches, window-service recv-window does not do much other
than redirecting packets/shreds to downstream channels.
The commit removes window-service recv-window and instead sends
packets/shreds directly from sigverify to retransmit-stage and
window-service insert thread.
2022-07-06 11:49:58 +00:00
dependabot[bot] 37f4621c06
chore: bump serde from 1.0.137 to 1.0.138 (#26421)
* chore: bump serde from 1.0.137 to 1.0.138

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.137 to 1.0.138.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.137...v1.0.138)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-07-05 23:18:08 -06:00
Tao Zhu c1d89ad749
forward packets by prioritization in desc order (#25406)
- Forward packets by prioritization in desc order
- Add support of cost-tracking by transaction requested compute units
- Hook up account buckets to forwarder
- Add metrics for forwardable batches count
- Remove redundant invalid packets filtering at end of slot since forwarder will do the same when batch forwardable packets
- Add bench test for forwarding
2022-07-05 23:24:58 -05:00
Jeff Washington (jwash) 8eba4d1698
add 2nd pass at hash calc when failure seen (#26392) 2022-07-05 18:01:02 -05:00
behzad nouri d3a14f5b30
simplifies packet/shred sanity checks (#26356) 2022-07-05 21:41:19 +00:00
carllin ce39c14025
Add end-to-end replay slot metrics (#25752) 2022-07-05 13:58:51 -05:00
Nick Rempel 7e4a5de99c
Refactor ConnectionCache::use_quic (#26235)
* Remove UseQuic type

Move to storing the UdpSocket on ConnectionCache and accepting a bool

* Remove use_quic from ConnectionCache constructor

Replace with separate with_udp constructor to force callers to choose
2022-07-05 10:49:42 -07:00
behzad nouri 61f0a7d9c3
replaces Mutex<PohRecorder> with RwLock<PohRecorder> (#26370)
Mutex causes superfluous lock contention when a read-only reference suffices.
2022-07-05 14:29:44 +00:00
Brooks Prumo 9ec38a3191
Cleanup snapshot integration tests (#26390) 2022-07-05 09:23:23 -05:00
Pankaj Garg 94685e1222
Implement randomized pruning of QUIC connection from staked peers (#26299) 2022-06-30 17:56:15 -07:00
behzad nouri 88599fd760
skips shreds deserialization before retransmit (#26230)
Fully deserializing shreds in window-service before sending them to
retransmit stage adds latency to shreds propagation.
This commit instead channels through the payload and relies on only
partial deserialization of a few required fields: slot, shred-index,
shred-type.
2022-06-30 12:13:00 +00:00
Jack May 4563bf40f6
cleanup feature: tx-wide-compute-cap (#26326) 2022-06-29 23:54:45 -07:00
Jeff Washington (jwash) 557bf6e656
allow initial hash calc to occur in bg (#26271)
* allow initial hash calc to occur in bg

* validator_initialized -> startup_verification_complete

* add infos for leader and vote

* rework snapshot for startup verification

* change to assert
2022-06-29 16:48:33 -05:00
behzad nouri f875733a9e
patches bug in retransmit stats where slot stats are erroneously dropped (#26317)
slot_stats are submitted at a different cadence from the rest of
RetransmitStats. Current code erroneously clears slot_stats before
submitting any metrics.
2022-06-29 21:35:58 +00:00
behzad nouri b3406b5b2a
removes IndexedParallelIterator::with_min_len from retransmit (#26305)
Testing on mainnet-beta, with_min_len does not seem to have much impact
in the current retransmit code.
2022-06-29 13:27:17 +00:00
behzad nouri 348fe9ebe2
verifies shred slot and parent in fetch stage (#26225)
Shred slot and parent are not verified until window-service where
resources are already wasted to sig-verify and deserialize shreds.
This commit moves above verification to earlier in the pipeline in fetch
stage.
2022-06-28 12:45:50 +00:00
Steven Luscher 9765034e04
Enable wire compression in Solana CLI and Rust client (#26236) 2022-06-27 15:38:07 -07:00
behzad nouri 39ca788b95
discards shreds in sigverify if the slot leader is the node itself (#26229)
Shreds are dropped in window-service if the slot leader is the node
itself:
https://github.com/solana-labs/solana/blob/cd2878acf/core/src/window_service.rs#L181-L185

However this is done after wasting resources verifying signature on
these shreds, and requires a redundant 2nd lookup of the slot leader.

This commit instead discards such shreds in sigverify stage where we
already know the leader for the slot.
2022-06-27 20:12:23 +00:00
behzad nouri 67936aaa74
moves Shred::seed to ShredId and adds test coverage (#26251)
Following commits will skip shreds deserializaton before retransmit, and
so we will only have a ShredId and not a fully deserialized shred to
obtain the shuffling seed from.
2022-06-27 17:58:43 +00:00
Brooks Prumo 662818ef0d
Use `VoteAccount::node_pubkey()` (#26207) 2022-06-27 09:09:06 -05:00
HaoranYi d5efbdb19b
Add timing measurement for gossip vote txn processing (#26163)
* add timing for gossip vote txn processing

* fix build

* fix too many arg error in clippy

* atomic interval
2022-06-27 08:53:34 -05:00
Ryo Onodera cd2878acf9
Avoid to miss to root for local slots before the hard fork (#19912)
* Make sure to root local slots even with hard fork

* Address review comments

* Cleanup a bit

* Further clean up

* Further clean up a bit

* Add comment

* Tweak hard fork reconciliation code placement
2022-06-26 15:14:17 +09:00
behzad nouri 30d2b112e4
bypasses rayon thread-pool for small retransmit shred batches (#26222)
In order to preserve current behavior, the threshold is set to the
current value of the argument to IndexedParallelIterator::with_min_len.
Follow up commits will recalibrate this threshold to optimize
performance on mainnet-beta.
2022-06-25 21:15:42 +00:00
Justin Starry 7cd7173b71
Refactor: Add get_delegated_stake method to VoteAccounts (#26221) 2022-06-25 16:41:35 +00:00
Justin Starry 44d1e62007
Refactor: No need to return stake in Bank::get_vote_account (#26220) 2022-06-25 16:27:43 +00:00
behzad nouri f1b82ec44d
factors out common retransmit work for shreds of the same slot (#26218)
Shreds arriving at a node for retransmit tend to belong to the same slot
(or a just a couple of different slots). Slot leader and cluster nodes
are common for the shreds of the same slot, and so the common work to
look up these values can be factored out.
This commit first group-bys shreds by slot to factor out that common
lookup work.
2022-06-25 15:49:05 +00:00
Jeff Washington (jwash) a3395a786a
vote_account uses AccountSharedData to avoid copies (#23687)
* vote_account uses AccountSharedData to avoid copies

* simpler deserialize
2022-06-24 15:08:01 -05:00
Brooks Prumo 877fedadac
Remove StatusCacheRc type and use StatusCache directly (#26184) 2022-06-24 08:38:56 -05:00
Brooks Prumo 23c50a2389
Add StatusCache::root_slot_deltas() and use it (#26170) 2022-06-23 15:19:06 -05:00
Tyera Eulberg a6ba5a9a05
Add transaction index in slot to geyser plugin TransactionInfo (#25688)
* Define shuffle to prep using same shuffle for multiple slices

* Determine transaction indexes and plumb to execute_batch

* Pair transaction_index with transaction in TransactionStatusService

* Add new ReplicaTransactionInfoVersion

* Plumb transaction_indexes through BankingStage

* Prepare BankingStage to receive transaction indexes from PohRecorder

* Determine transaction indexes in PohRecorder; add field to WorkingBank

* Add PohRecorder::record unit test

* Only pass starting_transaction_index around PohRecorder

* Add helper structs to simplify test DashMap

* Pass entry and starting-index into process_entries_with_callback together

* Add tx-index checks to test_rebatch_transactions

* Revert shuffle definition and use zip/unzip

* Only zip/unzip if randomize

* Add confirm_slot_entries test

* Review nits

* Add type alias to make sender docs more clear
2022-06-23 13:37:38 -06:00
behzad nouri f534b8981b
maps number of data shreds to erasure batch size (#25917)
In prepration of
https://github.com/solana-labs/solana/pull/25807
which reworks erasure batch sizes, this commit:
* adds a helper function mapping the number of data shreds to the
  erasure batch size.
* adds ProcessShredsStats to Shredder::entries_to_shreds in order to
  replace and remove entries_to_data_shreds from the public interface.
2022-06-23 13:27:54 +00:00
github-actions[bot] 5c2f819f99
Bump Version to 1.11.2 (#26159) 2022-06-22 21:16:18 -05:00
dependabot[bot] 55a7e53b9e
chore: bump lru from 0.7.6 to 0.7.7 (#26140)
* chore: bump lru from 0.7.6 to 0.7.7

Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.7.6 to 0.7.7.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.7.6...0.7.7)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-06-22 14:39:07 -06:00
Jeff Biseda bafdb7dd62
Revert handle start_http failure in rpc_service (#25400) (#26130)
* revert e263be2000
2022-06-22 10:52:27 -07:00
Michael Vines f3639b76ce Remove some clippy lints 2022-06-22 09:23:22 -07:00
HaoranYi b5d0c7b468
Revert "tvu and tpu timeout on joining its microservices (#24111)" (#26132)
This reverts commit e105547c14.
2022-06-22 10:57:46 -05:00
behzad nouri faa6c32162 removes packet modifier from shred_fetch_stage
... in favor of just passing packet flags.
2022-06-22 12:17:37 +00:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies shred-version earlier in the pipeline in fetch stage.
2022-06-22 12:17:37 +00:00
Pankaj Garg 43ff65ece9
Use single send socket in UdpTpuConnection (#26105) 2022-06-21 14:56:21 -07:00
behzad nouri 75425521b4
moves slot updates notifications after shreds retransmit (#26094)
RetransmitSlotStats can already be utilized to track when the first
shred for a slot was received; therefore
    first_shreds_received: &Mutex<BTreeSet<Slot>>

is redundant. Sending update notifications after shreds retransmit will
also bypass the need for a mutex.
2022-06-21 17:19:40 -04:00
Lijun Wang 61946a49c3
Weight concurrent streams by stake (#25993)
Weight concurrent streams by stake for staked nodes
Ported changes from #25056 after address merge conflicts and some refactoring
2022-06-21 12:06:44 -07:00
Will Hickey 51f26dc96e
Bump version to 1.11.1 (#26104) 2022-06-21 12:07:46 -05:00
behzad nouri d2afa6b418
moves packet-hasher out of the mutex (#26091)
Packet-hasher is not mutated across threads and does not need to be
wrapped in a mutex.
2022-06-21 16:29:27 +00:00
Pankaj Garg e344c8476f
Do not use UdpTpuConnection to forward votes (#26082)
* Do not use UdpTpuConnection to forward votes

* fix tests
2022-06-21 05:56:11 -07:00
Boqin Qin(秦 伯钦) 611d2ec73c
core: fix double-readlock in validator (#26053) 2022-06-20 15:07:00 +00:00
behzad nouri 47e62add5b
removes feature gate code adding shred-type to shred seed (#25963)
The feature is already activated on all clusters, and does not impact
processing of ledger/snapshots.
2022-06-20 14:39:24 +00:00
Trent Nelson a5f290a66f core: disable quic servers on mainnet-beta 2022-06-17 20:04:05 -06:00
behzad nouri b3d1f8d1ac
tracks number of shreds sent and received at different distances from the root (#25989) 2022-06-17 21:33:23 +00:00
behzad nouri eacb9183d4
patches bug where the 1st coding shred is not inserted into blockstore (#25916)
StandardBroadcastRun::insert skips 1st shred with index zero because
the 1st *data* shred is inserted synchronously:
https://github.com/solana-labs/solana/blob/53695ecd2/core/src/broadcast_stage/standard_broadcast_run.rs#L239-L246
https://github.com/solana-labs/solana/blob/53695ecd2/core/src/broadcast_stage/standard_broadcast_run.rs#L334-L339

https://github.com/solana-labs/solana/pull/7481
which added this code was not inserting coding shreds into blockstore.
Starting with
https://github.com/solana-labs/solana/pull/8899
coding shreds are inserted into blockstore as well as data shreds, but
the insert logic erroneously skips first coding shred because it does
not check if shred is code or data.
2022-06-16 13:59:15 +00:00
behzad nouri fe3c1d3d49
removes erroneous uses of &Arc<...> from broadcast-stage (#25962) 2022-06-15 13:44:24 +00:00
Brian Anderson db9004bd0f
Fix doc warnings (#25953) 2022-06-14 21:55:08 -06:00
Tao Zhu c96d9d127a
Include forwarding counters in leader slot metrics (#25874)
* To include forwarding counters in leader slot metrics

* Capture slot_end_detected time when checking leader slots, to be used in reporting later

* Simplify banking stage loop to report leader slot metrics

Co-authored-by: carllin <carl@solana.com>
2022-06-13 17:03:34 -05:00
Michael Vines ace24a7c82 A default tower is no longer considered to contain a stray last vote 2022-06-10 14:17:26 -07:00
Lijun Wang 29b597cea5
Connection pool support in connection cache and QUIC connection reliability improvement (#25793)
* Connection pool in connection cache and handle connection errors

1. The connection not has a pool of connections per address, configurable, default 4
2. The connections per address share a lazy initialized endpoint
3. Handle connection issues better, avoid race conditions
4. Various log improvement for help debug connection issues
2022-06-10 09:25:24 -07:00
Yueh-Hsuan Chiang ee4469c882
Skip compaction in backup_and_clear_blockstore (#25810)
#### Problem
blockstore clean and compact is quite slow with wait-for-supermajority purge and can take 20-30 minutes
as described in #25710.

#### Summary of Changes
This PR removes the compaction logic in backup_and_clear_blockstore as the
actual the restoration from a bad fork is handled by `blockstore.purge_slots`
(which is done by issuing rocksdb range-delete that makes the bad fork
unavailable.)

Compaction is irreverent to the shred version, as its main job in this context
is to reclaim disk storage from the deleted slots, which we can let the rocksdb
automatic background compaction to handle it.

Fixes #25710
2022-06-09 17:11:50 +08:00
carllin bf8faa8a30
Report banking stage tracer metrics (#25620) 2022-06-09 00:25:37 -05:00
Jon Cinque 79a8ecd0ac
client: Remove static connection cache, plumb it instead (#25667)
* client: Remove static connection cache, plumb it instead

* Add TpuClient::new_with_connection_cache to not break downstream

* Refactor get_connection and RwLock into ConnectionCache

* Fix merge conflicts from new async TpuClient

* Remove `ConnectionCache::set_use_quic`

* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
2022-06-08 13:57:12 +02:00
behzad nouri 6c9f2eac78
removes fec_set_offset from UnfinishedSlotInfo (#25815)
If the blockstore has shreds for a slot, it should not recreate the
slot:
https://github.com/solana-labs/solana/blob/ff68bf6c2/ledger/src/leader_schedule_cache.rs#L142-L146
https://github.com/solana-labs/solana/pull/15849/files#r596657314

Therefore in broadcast stage if UnfinishedSlotInfo is None, then
fec_set_offset will be zero:
https://github.com/solana-labs/solana/blob/ff68bf6c2/core/src/broadcast_stage/standard_broadcast_run.rs#L111-L120

As a result fec_set_offset will always be zero, and is so redundant and
can be removed.
2022-06-07 22:17:37 +00:00
Brennan Watt ba04063956
Add CPUmetrics (#25802)
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
2022-06-07 11:34:25 -07:00
apfitzge e6c21a3036
Convert Measure::this to measure! and remove Measure::this (#25776)
* Remove the args param from Measure::this since we don't ever use it

* banking_stage.rs: convert to measure!

* poh_recorder.rs: convert to measure!

* cost_update_service.rs: convert to measure!

* poh_service.rs: convert to measure!

* bank.rs: convert to measure!

* measure.rs: Remove Measure::this now that all have been converted to measure!
2022-06-06 20:21:05 -05:00
sakridge 447a3239e7
Add new replay metrics for replay blockstore_into_bank and complete (#25717) 2022-06-03 19:45:27 +02:00
behzad nouri 5dbf7d8f91
removes raw indexing into packet data (#25554)
Packets are at the boundary of the system where, vast majority of the
time, they are received from an untrusted source. Raw indexing into the
data buffer can open attack vectors if the offsets are invalid.
Validating offsets beforehand is verbose and error prone.

The commit updates Packet::data() api to take a SliceIndex and always to
return an Option. The call-sites are so forced to explicitly handle the
case where the offsets are invalid.
2022-06-03 01:05:06 +00:00
behzad nouri 81231a89b9 adds support for different variants of ShredCode and ShredData
The commit implements two new types:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
    }

Following commits will extend these types by adding merkle variants:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
        Merkle(merkle::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
        Merkle(merkle::ShredData),
    }
2022-06-02 18:55:50 +00:00
Pankaj Garg 1c2ae470c5
Fix forwarding of transactions over QUIC (#25674)
* Spawn QUIC server to receive forwarded txs

* Update validator port range

* forward votes using UDP

* no forwarding from unstaked nodes

* forwarding stats in banking stage

* fix test builds

* fix lifetime of forward sender
2022-06-02 11:14:58 -07:00
HaoranYi d3ac4e941b
Bench: preshrink + sigverify (#25480)
* double shrinking

* add bench

* rename

* aggregate timing

* remove pre/post shrink time

* update api after merge
2022-06-02 09:19:01 -05:00
Tao Zhu 51ac599915
Add user requested CU (eg. compute_budget.compute_unit_limit) to immutable_deserialized_packet, to be used in cost model and prioritized forwarding (#25695) 2022-06-01 22:43:48 +00:00
Ryo Onodera aedcb05dc8
Record solana-validator ver to metrics at startup (#25635)
* Record solana-validator ver to metrics at startup

* Update Cargo.lock
2022-06-01 13:37:50 +09:00
Christian Kamm 02b26ddd82
SigVerify: Fix num_valid_packets metric (#25643)
It used to report the number of packets with successful signature
validations but was accidentally changed to count packets passed into
the verifier by e4409a87fe.

This restores the previous meaning.
2022-05-31 18:51:20 +10:00
carllin 90a3315b69
Detect tracer key in sigverify (#25579)
* Mark the tracer transaction

* simplify tracer check
2022-05-30 18:41:54 -05:00
Justin Starry e4409a87fe
Add pre shrink pass before sigverify batch (#25136) 2022-05-28 01:51:55 +10:00
Yueh-Hsuan Chiang 5b67960c76
(Refactor) Move blocktore options related stuff to blockstore_options.rs (#25509)
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.

#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.

By doing this, we address the mutual dependency and also make the code cleaner.
2022-05-26 16:59:26 -07:00
dependabot[bot] 7f4128947b
chore: bump lru from 0.7.5 to 0.7.6 (#25572)
* chore: bump lru from 0.7.5 to 0.7.6

Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.7.5 to 0.7.6.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.7.5...0.7.6)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-26 19:05:02 +00:00
ryleung-solana 1ca5c3a7bd
Switch to using enum-dispatch to switch between UDP and Quic (#24713) 2022-05-26 11:21:16 -04:00
behzad nouri de612c25b3
removes shred wire layout specs from sigverify (#25520)
sigverify_shreds relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
https://github.com/solana-labs/solana/pull/25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
2022-05-26 13:06:27 +00:00
Christian Kamm 0efb7478cd
FindPacketSenderStake: Remove parallelism to improve performance (#25562)
* FindPacketSenderStake: Remove parallelism to improve performance

The work unit sizes were so small that using the thread pool
slowed down this stage significantly.

* fix checks

Co-authored-by: Justin Starry <justin@solana.com>
2022-05-26 21:17:52 +10:00
behzad nouri cafa85bfbb
includes shred-type when computing turbine broadcast seed (#25556)
Indices for code and data shreds of the same slot overlap; and so they
will have the same random number generator seed when shuffling cluster
nodes for turbine broadcast.

This results in the same propagation path for code and data shreds of
the same index and effectively smaller sample size for re-transmitter
nodes. For example a 32:32 batch (32 code + 32 data shreds), is
retransmitted through _at most_ 32 unique nodes, whereas ideally we want
~64 unique re-transmitters.

This commit adds shred-type to seed function so that code and data
sherds of the same (slot, index) will (most likely) have different
propagation paths.
2022-05-25 20:31:53 +00:00
behzad nouri 880684565c
limits read access into Packet data to Packet.meta.size (#25484)
Bytes past Packet.meta.size are not valid to read from.

The commit makes the buffer field private and instead provides two
methods:
* Packet::data() which returns an immutable reference to the underlying
  buffer up to Packet.meta.size. The rest of the buffer is not valid to
  read from.
* Packet::buffer_mut() which returns a mutable reference to the entirety
  of the underlying buffer to write into. The caller is responsible to
  update Packet.meta.size after writing to the buffer.
2022-05-25 16:52:54 +00:00
carllin 9651cdad99
Refactor Sigverify trait (#25359) 2022-05-24 16:01:41 -05:00
Jeff Biseda 61c5a471e8
preserve optimistic_slot in blockstore (#25311) 2022-05-24 12:03:28 -07:00
Justin Starry e66ea7cb6a Clean up Bank::commit_transactions parameters 2022-05-24 20:24:42 +08:00
Justin Starry cad1c41ce2 Add Packet::deserialize_slice convenience method 2022-05-24 17:31:14 +08:00
Tyera Eulberg 514f73f4b1
Remove retain_mut dep (#25494) 2022-05-23 21:45:49 +00:00
steviez ec7ca411dd
Make PacketBatch packets vector non-public (#25413)
Upcoming changes to PacketBatch to support variable sized packets will
modify the internals of PacketBatch. So, this change removes usage of
the internal packet struct and instead uses accessors (which are
currently just wrappers of Vector functions but will change down the
road).
2022-05-23 15:30:15 -05:00
dependabot[bot] e75569e85a
chore: bump systemstat from 0.1.10 to 0.1.11 (#25471)
Bumps [systemstat](https://github.com/unrelentingtech/systemstat) from 0.1.10 to 0.1.11.
- [Release notes](https://github.com/unrelentingtech/systemstat/releases)
- [Commits](https://github.com/unrelentingtech/systemstat/compare/v0.1.10...v0.1.11)

---
updated-dependencies:
- dependency-name: systemstat
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-23 20:08:47 +00:00
Christian Kamm 6429aff13b
findpacketsenderstake: add discard after receive (#25458)
This mimics a similar change in sigverify, see #25388
2022-05-23 21:27:20 +02:00
behzad nouri c248fb3f51
renames Packet Meta::{,set_}addr methods to {,set_}socket_addr (#25478)
In order to distinguish between Meta.addr field which is an IpAddr and
the methods which refer to a SocketAddr.
2022-05-23 15:48:59 +00:00
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
sakridge e22be02d3a
sigverify: add discard before dedup (#25388) 2022-05-23 03:40:33 +02:00
Brooks Prumo f8842032c6
clippy: fix "this let-binding has unit value" warnings (#25429) 2022-05-22 12:17:59 -04:00
Pankaj Garg 7fb0ef1fa5
Use async send for forwarding transactions (#25435) 2022-05-20 21:20:47 -07:00
Jeff Biseda e263be2000
handle start_http failure in rpc_service (#25400) 2022-05-20 17:59:23 -07:00
Brennan Watt e025376719
Fix packet accounting after dedup (#25357)
* Fix packet accounting after dedup
* Rename function to better represent intent
2022-05-20 17:00:13 -07:00
Brennan Watt 2fdc850176
Use Shared IP to Stake Map (#25377)
* Find packet sender stake stage use shared IP to stake map
2022-05-20 12:51:07 -07:00
Michael Vines c54e06355f
voteSubscribe pubsub notification now includes the vote transaction signature (#25291) 2022-05-19 18:28:46 -07:00
Michael Vines 97efbdc303
Defer tower saving until push_vote(), there's no need to do it sooner (#25374) 2022-05-19 18:27:58 -07:00
buffalu 971748b335
fix banking stage starvation (#25245) 2022-05-18 22:37:47 +02:00
dependabot[bot] 7402878628
chore: bump raptorq from 1.6.5 to 1.7.0 (#25330)
Bumps [raptorq](https://github.com/cberner/raptorq) from 1.6.5 to 1.7.0.
- [Release notes](https://github.com/cberner/raptorq/releases)
- [Commits](https://github.com/cberner/raptorq/compare/v1.6.5...v1.7.0)

---
updated-dependencies:
- dependency-name: raptorq
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-18 17:56:38 +00:00
dependabot[bot] 542bd0ec3c
chore: bump rayon from 1.5.2 to 1.5.3 (#25242)
* chore: bump rayon from 1.5.2 to 1.5.3

Bumps [rayon](https://github.com/rayon-rs/rayon) from 1.5.2 to 1.5.3.
- [Release notes](https://github.com/rayon-rs/rayon/releases)
- [Changelog](https://github.com/rayon-rs/rayon/blob/master/RELEASES.md)
- [Commits](https://github.com/rayon-rs/rayon/compare/v1.5.2...v1.5.3)

---
updated-dependencies:
- dependency-name: rayon
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-18 09:39:57 -06:00
Justin Starry 5548baf4dd
Don't drop transactions which use a request heap size ix (#25315) 2022-05-18 17:47:24 +08:00
steviez b27125815a
Simplify logic around MAX_ORPHAN_REPAIR_RESPONSES constant (#25032) 2022-05-17 19:45:45 -06:00
Tao Zhu b1b3702e6d
Prioritize transactions in banking stage by their compute unit price (#25178)
* - get prioritization fee from compute_budget instruction;
- update compute_budget::process_instruction function to take instruction iter to support sanitized versioned message;
- updated runtime.md

* update transaction fee calculation for prioritization fee rate as lamports per 10K CUs

* review changes

* fix test

* fix a bpf test

* fix bpf test

* patch feedback

* fix clippy

* fix bpf test

* feedback

* rename prioritization fee rate to compute unit price

* feedback

Co-authored-by: Justin Starry <justin@solana.com>
2022-05-16 12:06:33 +08:00
sakridge 52db2e19bc
Lower default batch size to 64 and add 2 banking threads (#25226) 2022-05-15 16:52:47 +02:00
sakridge 3d96a1ab76
Block packets in vote-only mode (#24906) 2022-05-14 17:53:37 +02:00
Pankaj Garg 71dd95e842
Tune banking_stage receive loop timing (#25172) 2022-05-13 03:42:08 +00:00
Jeff Washington (jwash) 896729f25e
keep track of oldest slot used by last hash calculation (#25152) 2022-05-12 11:18:08 -05:00