Commit Graph

3081 Commits

Author SHA1 Message Date
Ryo Onodera cd2878acf9
Avoid to miss to root for local slots before the hard fork (#19912)
* Make sure to root local slots even with hard fork

* Address review comments

* Cleanup a bit

* Further clean up

* Further clean up a bit

* Add comment

* Tweak hard fork reconciliation code placement
2022-06-26 15:14:17 +09:00
behzad nouri 30d2b112e4
bypasses rayon thread-pool for small retransmit shred batches (#26222)
In order to preserve current behavior, the threshold is set to the
current value of the argument to IndexedParallelIterator::with_min_len.
Follow up commits will recalibrate this threshold to optimize
performance on mainnet-beta.
2022-06-25 21:15:42 +00:00
Justin Starry 7cd7173b71
Refactor: Add get_delegated_stake method to VoteAccounts (#26221) 2022-06-25 16:41:35 +00:00
Justin Starry 44d1e62007
Refactor: No need to return stake in Bank::get_vote_account (#26220) 2022-06-25 16:27:43 +00:00
behzad nouri f1b82ec44d
factors out common retransmit work for shreds of the same slot (#26218)
Shreds arriving at a node for retransmit tend to belong to the same slot
(or a just a couple of different slots). Slot leader and cluster nodes
are common for the shreds of the same slot, and so the common work to
look up these values can be factored out.
This commit first group-bys shreds by slot to factor out that common
lookup work.
2022-06-25 15:49:05 +00:00
Jeff Washington (jwash) a3395a786a
vote_account uses AccountSharedData to avoid copies (#23687)
* vote_account uses AccountSharedData to avoid copies

* simpler deserialize
2022-06-24 15:08:01 -05:00
Tyera Eulberg a6ba5a9a05
Add transaction index in slot to geyser plugin TransactionInfo (#25688)
* Define shuffle to prep using same shuffle for multiple slices

* Determine transaction indexes and plumb to execute_batch

* Pair transaction_index with transaction in TransactionStatusService

* Add new ReplicaTransactionInfoVersion

* Plumb transaction_indexes through BankingStage

* Prepare BankingStage to receive transaction indexes from PohRecorder

* Determine transaction indexes in PohRecorder; add field to WorkingBank

* Add PohRecorder::record unit test

* Only pass starting_transaction_index around PohRecorder

* Add helper structs to simplify test DashMap

* Pass entry and starting-index into process_entries_with_callback together

* Add tx-index checks to test_rebatch_transactions

* Revert shuffle definition and use zip/unzip

* Only zip/unzip if randomize

* Add confirm_slot_entries test

* Review nits

* Add type alias to make sender docs more clear
2022-06-23 13:37:38 -06:00
behzad nouri f534b8981b
maps number of data shreds to erasure batch size (#25917)
In prepration of
https://github.com/solana-labs/solana/pull/25807
which reworks erasure batch sizes, this commit:
* adds a helper function mapping the number of data shreds to the
  erasure batch size.
* adds ProcessShredsStats to Shredder::entries_to_shreds in order to
  replace and remove entries_to_data_shreds from the public interface.
2022-06-23 13:27:54 +00:00
Jeff Biseda bafdb7dd62
Revert handle start_http failure in rpc_service (#25400) (#26130)
* revert e263be2000
2022-06-22 10:52:27 -07:00
Michael Vines f3639b76ce Remove some clippy lints 2022-06-22 09:23:22 -07:00
HaoranYi b5d0c7b468
Revert "tvu and tpu timeout on joining its microservices (#24111)" (#26132)
This reverts commit e105547c14.
2022-06-22 10:57:46 -05:00
behzad nouri faa6c32162 removes packet modifier from shred_fetch_stage
... in favor of just passing packet flags.
2022-06-22 12:17:37 +00:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies shred-version earlier in the pipeline in fetch stage.
2022-06-22 12:17:37 +00:00
Pankaj Garg 43ff65ece9
Use single send socket in UdpTpuConnection (#26105) 2022-06-21 14:56:21 -07:00
behzad nouri 75425521b4
moves slot updates notifications after shreds retransmit (#26094)
RetransmitSlotStats can already be utilized to track when the first
shred for a slot was received; therefore
    first_shreds_received: &Mutex<BTreeSet<Slot>>

is redundant. Sending update notifications after shreds retransmit will
also bypass the need for a mutex.
2022-06-21 17:19:40 -04:00
Lijun Wang 61946a49c3
Weight concurrent streams by stake (#25993)
Weight concurrent streams by stake for staked nodes
Ported changes from #25056 after address merge conflicts and some refactoring
2022-06-21 12:06:44 -07:00
behzad nouri d2afa6b418
moves packet-hasher out of the mutex (#26091)
Packet-hasher is not mutated across threads and does not need to be
wrapped in a mutex.
2022-06-21 16:29:27 +00:00
Pankaj Garg e344c8476f
Do not use UdpTpuConnection to forward votes (#26082)
* Do not use UdpTpuConnection to forward votes

* fix tests
2022-06-21 05:56:11 -07:00
Boqin Qin(秦 伯钦) 611d2ec73c
core: fix double-readlock in validator (#26053) 2022-06-20 15:07:00 +00:00
behzad nouri 47e62add5b
removes feature gate code adding shred-type to shred seed (#25963)
The feature is already activated on all clusters, and does not impact
processing of ledger/snapshots.
2022-06-20 14:39:24 +00:00
Trent Nelson a5f290a66f core: disable quic servers on mainnet-beta 2022-06-17 20:04:05 -06:00
behzad nouri b3d1f8d1ac
tracks number of shreds sent and received at different distances from the root (#25989) 2022-06-17 21:33:23 +00:00
behzad nouri eacb9183d4
patches bug where the 1st coding shred is not inserted into blockstore (#25916)
StandardBroadcastRun::insert skips 1st shred with index zero because
the 1st *data* shred is inserted synchronously:
https://github.com/solana-labs/solana/blob/53695ecd2/core/src/broadcast_stage/standard_broadcast_run.rs#L239-L246
https://github.com/solana-labs/solana/blob/53695ecd2/core/src/broadcast_stage/standard_broadcast_run.rs#L334-L339

https://github.com/solana-labs/solana/pull/7481
which added this code was not inserting coding shreds into blockstore.
Starting with
https://github.com/solana-labs/solana/pull/8899
coding shreds are inserted into blockstore as well as data shreds, but
the insert logic erroneously skips first coding shred because it does
not check if shred is code or data.
2022-06-16 13:59:15 +00:00
behzad nouri fe3c1d3d49
removes erroneous uses of &Arc<...> from broadcast-stage (#25962) 2022-06-15 13:44:24 +00:00
Brian Anderson db9004bd0f
Fix doc warnings (#25953) 2022-06-14 21:55:08 -06:00
Tao Zhu c96d9d127a
Include forwarding counters in leader slot metrics (#25874)
* To include forwarding counters in leader slot metrics

* Capture slot_end_detected time when checking leader slots, to be used in reporting later

* Simplify banking stage loop to report leader slot metrics

Co-authored-by: carllin <carl@solana.com>
2022-06-13 17:03:34 -05:00
Michael Vines ace24a7c82 A default tower is no longer considered to contain a stray last vote 2022-06-10 14:17:26 -07:00
Lijun Wang 29b597cea5
Connection pool support in connection cache and QUIC connection reliability improvement (#25793)
* Connection pool in connection cache and handle connection errors

1. The connection not has a pool of connections per address, configurable, default 4
2. The connections per address share a lazy initialized endpoint
3. Handle connection issues better, avoid race conditions
4. Various log improvement for help debug connection issues
2022-06-10 09:25:24 -07:00
Yueh-Hsuan Chiang ee4469c882
Skip compaction in backup_and_clear_blockstore (#25810)
#### Problem
blockstore clean and compact is quite slow with wait-for-supermajority purge and can take 20-30 minutes
as described in #25710.

#### Summary of Changes
This PR removes the compaction logic in backup_and_clear_blockstore as the
actual the restoration from a bad fork is handled by `blockstore.purge_slots`
(which is done by issuing rocksdb range-delete that makes the bad fork
unavailable.)

Compaction is irreverent to the shred version, as its main job in this context
is to reclaim disk storage from the deleted slots, which we can let the rocksdb
automatic background compaction to handle it.

Fixes #25710
2022-06-09 17:11:50 +08:00
carllin bf8faa8a30
Report banking stage tracer metrics (#25620) 2022-06-09 00:25:37 -05:00
Jon Cinque 79a8ecd0ac
client: Remove static connection cache, plumb it instead (#25667)
* client: Remove static connection cache, plumb it instead

* Add TpuClient::new_with_connection_cache to not break downstream

* Refactor get_connection and RwLock into ConnectionCache

* Fix merge conflicts from new async TpuClient

* Remove `ConnectionCache::set_use_quic`

* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
2022-06-08 13:57:12 +02:00
behzad nouri 6c9f2eac78
removes fec_set_offset from UnfinishedSlotInfo (#25815)
If the blockstore has shreds for a slot, it should not recreate the
slot:
https://github.com/solana-labs/solana/blob/ff68bf6c2/ledger/src/leader_schedule_cache.rs#L142-L146
https://github.com/solana-labs/solana/pull/15849/files#r596657314

Therefore in broadcast stage if UnfinishedSlotInfo is None, then
fec_set_offset will be zero:
https://github.com/solana-labs/solana/blob/ff68bf6c2/core/src/broadcast_stage/standard_broadcast_run.rs#L111-L120

As a result fec_set_offset will always be zero, and is so redundant and
can be removed.
2022-06-07 22:17:37 +00:00
Brennan Watt ba04063956
Add CPUmetrics (#25802)
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
2022-06-07 11:34:25 -07:00
apfitzge e6c21a3036
Convert Measure::this to measure! and remove Measure::this (#25776)
* Remove the args param from Measure::this since we don't ever use it

* banking_stage.rs: convert to measure!

* poh_recorder.rs: convert to measure!

* cost_update_service.rs: convert to measure!

* poh_service.rs: convert to measure!

* bank.rs: convert to measure!

* measure.rs: Remove Measure::this now that all have been converted to measure!
2022-06-06 20:21:05 -05:00
sakridge 447a3239e7
Add new replay metrics for replay blockstore_into_bank and complete (#25717) 2022-06-03 19:45:27 +02:00
behzad nouri 5dbf7d8f91
removes raw indexing into packet data (#25554)
Packets are at the boundary of the system where, vast majority of the
time, they are received from an untrusted source. Raw indexing into the
data buffer can open attack vectors if the offsets are invalid.
Validating offsets beforehand is verbose and error prone.

The commit updates Packet::data() api to take a SliceIndex and always to
return an Option. The call-sites are so forced to explicitly handle the
case where the offsets are invalid.
2022-06-03 01:05:06 +00:00
behzad nouri 81231a89b9 adds support for different variants of ShredCode and ShredData
The commit implements two new types:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
    }

Following commits will extend these types by adding merkle variants:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
        Merkle(merkle::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
        Merkle(merkle::ShredData),
    }
2022-06-02 18:55:50 +00:00
Pankaj Garg 1c2ae470c5
Fix forwarding of transactions over QUIC (#25674)
* Spawn QUIC server to receive forwarded txs

* Update validator port range

* forward votes using UDP

* no forwarding from unstaked nodes

* forwarding stats in banking stage

* fix test builds

* fix lifetime of forward sender
2022-06-02 11:14:58 -07:00
HaoranYi d3ac4e941b
Bench: preshrink + sigverify (#25480)
* double shrinking

* add bench

* rename

* aggregate timing

* remove pre/post shrink time

* update api after merge
2022-06-02 09:19:01 -05:00
Tao Zhu 51ac599915
Add user requested CU (eg. compute_budget.compute_unit_limit) to immutable_deserialized_packet, to be used in cost model and prioritized forwarding (#25695) 2022-06-01 22:43:48 +00:00
Ryo Onodera aedcb05dc8
Record solana-validator ver to metrics at startup (#25635)
* Record solana-validator ver to metrics at startup

* Update Cargo.lock
2022-06-01 13:37:50 +09:00
Christian Kamm 02b26ddd82
SigVerify: Fix num_valid_packets metric (#25643)
It used to report the number of packets with successful signature
validations but was accidentally changed to count packets passed into
the verifier by e4409a87fe.

This restores the previous meaning.
2022-05-31 18:51:20 +10:00
carllin 90a3315b69
Detect tracer key in sigverify (#25579)
* Mark the tracer transaction

* simplify tracer check
2022-05-30 18:41:54 -05:00
Justin Starry e4409a87fe
Add pre shrink pass before sigverify batch (#25136) 2022-05-28 01:51:55 +10:00
Yueh-Hsuan Chiang 5b67960c76
(Refactor) Move blocktore options related stuff to blockstore_options.rs (#25509)
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.

#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.

By doing this, we address the mutual dependency and also make the code cleaner.
2022-05-26 16:59:26 -07:00
ryleung-solana 1ca5c3a7bd
Switch to using enum-dispatch to switch between UDP and Quic (#24713) 2022-05-26 11:21:16 -04:00
behzad nouri de612c25b3
removes shred wire layout specs from sigverify (#25520)
sigverify_shreds relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
https://github.com/solana-labs/solana/pull/25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
2022-05-26 13:06:27 +00:00
Christian Kamm 0efb7478cd
FindPacketSenderStake: Remove parallelism to improve performance (#25562)
* FindPacketSenderStake: Remove parallelism to improve performance

The work unit sizes were so small that using the thread pool
slowed down this stage significantly.

* fix checks

Co-authored-by: Justin Starry <justin@solana.com>
2022-05-26 21:17:52 +10:00
behzad nouri cafa85bfbb
includes shred-type when computing turbine broadcast seed (#25556)
Indices for code and data shreds of the same slot overlap; and so they
will have the same random number generator seed when shuffling cluster
nodes for turbine broadcast.

This results in the same propagation path for code and data shreds of
the same index and effectively smaller sample size for re-transmitter
nodes. For example a 32:32 batch (32 code + 32 data shreds), is
retransmitted through _at most_ 32 unique nodes, whereas ideally we want
~64 unique re-transmitters.

This commit adds shred-type to seed function so that code and data
sherds of the same (slot, index) will (most likely) have different
propagation paths.
2022-05-25 20:31:53 +00:00
behzad nouri 880684565c
limits read access into Packet data to Packet.meta.size (#25484)
Bytes past Packet.meta.size are not valid to read from.

The commit makes the buffer field private and instead provides two
methods:
* Packet::data() which returns an immutable reference to the underlying
  buffer up to Packet.meta.size. The rest of the buffer is not valid to
  read from.
* Packet::buffer_mut() which returns a mutable reference to the entirety
  of the underlying buffer to write into. The caller is responsible to
  update Packet.meta.size after writing to the buffer.
2022-05-25 16:52:54 +00:00
carllin 9651cdad99
Refactor Sigverify trait (#25359) 2022-05-24 16:01:41 -05:00
Jeff Biseda 61c5a471e8
preserve optimistic_slot in blockstore (#25311) 2022-05-24 12:03:28 -07:00
Justin Starry e66ea7cb6a Clean up Bank::commit_transactions parameters 2022-05-24 20:24:42 +08:00
Justin Starry cad1c41ce2 Add Packet::deserialize_slice convenience method 2022-05-24 17:31:14 +08:00
steviez ec7ca411dd
Make PacketBatch packets vector non-public (#25413)
Upcoming changes to PacketBatch to support variable sized packets will
modify the internals of PacketBatch. So, this change removes usage of
the internal packet struct and instead uses accessors (which are
currently just wrappers of Vector functions but will change down the
road).
2022-05-23 15:30:15 -05:00
Christian Kamm 6429aff13b
findpacketsenderstake: add discard after receive (#25458)
This mimics a similar change in sigverify, see #25388
2022-05-23 21:27:20 +02:00
behzad nouri c248fb3f51
renames Packet Meta::{,set_}addr methods to {,set_}socket_addr (#25478)
In order to distinguish between Meta.addr field which is an IpAddr and
the methods which refer to a SocketAddr.
2022-05-23 15:48:59 +00:00
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
sakridge e22be02d3a
sigverify: add discard before dedup (#25388) 2022-05-23 03:40:33 +02:00
Pankaj Garg 7fb0ef1fa5
Use async send for forwarding transactions (#25435) 2022-05-20 21:20:47 -07:00
Jeff Biseda e263be2000
handle start_http failure in rpc_service (#25400) 2022-05-20 17:59:23 -07:00
Brennan Watt e025376719
Fix packet accounting after dedup (#25357)
* Fix packet accounting after dedup
* Rename function to better represent intent
2022-05-20 17:00:13 -07:00
Brennan Watt 2fdc850176
Use Shared IP to Stake Map (#25377)
* Find packet sender stake stage use shared IP to stake map
2022-05-20 12:51:07 -07:00
Michael Vines c54e06355f
voteSubscribe pubsub notification now includes the vote transaction signature (#25291) 2022-05-19 18:28:46 -07:00
Michael Vines 97efbdc303
Defer tower saving until push_vote(), there's no need to do it sooner (#25374) 2022-05-19 18:27:58 -07:00
buffalu 971748b335
fix banking stage starvation (#25245) 2022-05-18 22:37:47 +02:00
Justin Starry 5548baf4dd
Don't drop transactions which use a request heap size ix (#25315) 2022-05-18 17:47:24 +08:00
steviez b27125815a
Simplify logic around MAX_ORPHAN_REPAIR_RESPONSES constant (#25032) 2022-05-17 19:45:45 -06:00
Tao Zhu b1b3702e6d
Prioritize transactions in banking stage by their compute unit price (#25178)
* - get prioritization fee from compute_budget instruction;
- update compute_budget::process_instruction function to take instruction iter to support sanitized versioned message;
- updated runtime.md

* update transaction fee calculation for prioritization fee rate as lamports per 10K CUs

* review changes

* fix test

* fix a bpf test

* fix bpf test

* patch feedback

* fix clippy

* fix bpf test

* feedback

* rename prioritization fee rate to compute unit price

* feedback

Co-authored-by: Justin Starry <justin@solana.com>
2022-05-16 12:06:33 +08:00
sakridge 52db2e19bc
Lower default batch size to 64 and add 2 banking threads (#25226) 2022-05-15 16:52:47 +02:00
sakridge 3d96a1ab76
Block packets in vote-only mode (#24906) 2022-05-14 17:53:37 +02:00
Pankaj Garg 71dd95e842
Tune banking_stage receive loop timing (#25172) 2022-05-13 03:42:08 +00:00
Jeff Washington (jwash) 896729f25e
keep track of oldest slot used by last hash calculation (#25152) 2022-05-12 11:18:08 -05:00
Jeff Washington (jwash) 3a4f0d3397
println -> info (#25163) 2022-05-12 11:07:13 -05:00
Pankaj Garg bcf4d54235
Update test_banking_stage_entryfication to be more deterministic (#25146)
* Update test_banking_stage_entryfication to be more deterministic

* revert to original test with updated checks
2022-05-12 15:36:19 +00:00
HaoranYi 41d34d45e0
pass exit by ref (#25120) 2022-05-11 09:17:21 -05:00
DimAn 2fa9bc3e70
Add options to store full and/or incremental snapshots in separate locations (#24247) 2022-05-10 16:37:41 -04:00
Justin Starry e3bdc38f0a
Add sanitized types for use in banking stage (#25067) 2022-05-11 00:30:48 +08:00
HaoranYi de96663cc4
fix typo (#25083) 2022-05-09 12:42:58 -05:00
Pankaj Garg 362b0526cd
Greedy receive in banking stage (#25060)
* Greedy receive in banking stage

* add upperbound to batch size and batching time

* update test_banking_stage_entryfication test
2022-05-08 10:47:55 -07:00
HaoranYi 8e37e364b1
fix typo in measure name (#25058) 2022-05-07 10:04:42 -05:00
Justin Starry c920d411f7
Clean up logging and make variables consistent (#25049) 2022-05-07 03:52:45 +08:00
Justin Starry 082502d4f3
Fail tx sanitization when ix program id uses lookup table (#25035)
* Fail tx sanitization when ix program id uses lookup table

* feedback
2022-05-07 03:19:50 +08:00
Christian Kamm cb6cd5d60f
FindPacketSenderStake: Improve metrics (#24971)
- separate names for vote and non-vote thread
- time unit postfixes (one is in ns!)
2022-05-06 21:16:13 +02:00
behzad nouri 492f89a170
checks account owner when initializing a vote-account (#25018)
A VoteAccount may only wrap an account if the account owner is
solana_vote_program:id or equivalently this check returns true:
solana_vote_program::check_id(account.owner())
2022-05-06 16:22:49 +00:00
behzad nouri a01291069a
initializes thread-pools with lazy_static instead of thread_local (#24853)
In addition to thread_local -> lazy_static change, a number of thread-pools are
initialized with get_max_thread_count to achieve parity with the older code in
terms of number of validator threads.
2022-05-05 20:00:50 +00:00
Justin Starry 7100f1c94b
Collect stats in streamer receiver and report fetch stage metrics (#25010) 2022-05-06 02:56:18 +08:00
carllin 870ac80b79
Prioritize BankingStage packets individually in min-max heap (#24187) 2022-05-04 21:50:56 -05:00
Christian Kamm 503d0baf6d
SigVerify: Add total time metrics for dedup/discard/verify (#24768)
* SigVerify: Add total time metrics for dedup/discard/verify

Previously it was impossible to determine the total time the stage spent
on these activities within a measurement window.

* SigVerify: Add _us postfix to time metrics
2022-05-03 14:59:25 +02:00
behzad nouri eff59193db
enforces that LAST_SHRED_IN_SLOT is also DATA_COMPLETE_SHRED (#24892)
A data shred cannot be LAST_SHRED_IN_SLOT if not also DATA_COMPLETE_SHRED.
So LAST_SHRED_IN_SLOT should also imply DATA_COMPLETE_SHRED:
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shredder.rs#L116-L117
https://github.com/solana-labs/solana/blob/74b586ae7/core/src/broadcast_stage/standard_broadcast_run.rs#L80-L81

However current shred constructs allow specifying a shred which is
LAST_SHRED_IN_SLOT but not DATA_COMPLETE_SHRED:
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shred.rs#L117-L118
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shred.rs#L272-L273

The commit updates ShredFlags so that if a shred is not
DATA_COMPLETE_SHRED it cannot be LAST_SHRED_IN_SLOT either.
2022-05-02 23:33:53 +00:00
apfitzge 112a0b475a
Revert "Refactor to use EpochSchedule from within RentCollector struct" (#24893)
* Revert "Ran cargo fmt"

This reverts commit 9052e41b32.

* Revert "Fix build error introduced by my editor setup, part 2"

This reverts commit 4dfeab3b38.

* Revert "Fix build error introduced by my editor setup"

This reverts commit 87fb78dc56.

* Revert "Remove redundant epoch_schedule from AccountsPackage"

This reverts commit c2f7f2fff8.

* Revert "Fix a test"

This reverts commit 36c0bdaa78.

* Revert "Fixes to initial code"

This reverts commit ed7813e698.

* Revert "Removing redundant EpochSchedule param from fns"

This reverts commit 5472d2e605.
2022-05-02 13:46:17 -05:00
behzad nouri e812430e28
defines shred flags using bitflags crate (#24874)
Shred flags uses raw bit-masking ops which lacks type-safety:
https://github.com/solana-labs/solana/blob/a829ddc92/ledger/src/shred.rs#L112-L114

This commit instead uses bitflags crate to define shred flags.
2022-05-01 19:25:15 +00:00
Pankaj Garg 88c16c0176
Check if quic is enabled before warming up quic connections (#24821)
* Check if quic is enabled before warming up quic connections

* fix after rebase

* don't start warmup service if quic not enabled

* fix test
2022-05-01 03:52:38 +00:00
Justin Starry a61652104b
Avoid holding lock guards in match expressions (#24805)
* Avoid holding bank forks read lock for RPC requests

* Avoid using lock guards in temporaries

* revert fetch stage change
2022-04-29 16:32:46 +08:00
Justin Starry 4e58b3870c
Update all BankForks methods to return owned values (#24801) 2022-04-28 18:51:00 +00:00
sakridge 5a430c15e2
Separate sigverify metrics for each verifier (#24744) 2022-04-28 01:16:17 -07:00
behzad nouri 0f60665100
replaces Shred::new_empty_coding with Shred::new_from_parity_shard (#24749)
Removing implementation details of shreds and payload offsets from
shredder, so that shredder does not need to mutate payload:
https://github.com/solana-labs/solana/blob/71ad12128/ledger/src/shred.rs#L968-L977

Also, Shred::new_from_data can simply obtain a slice as opposed to
Option<&[u8]>:
https://github.com/solana-labs/solana/blob/71ad12128/ledger/src/shred.rs#L268-L278
2022-04-27 18:04:10 +00:00
behzad nouri 081c844d6e
removes Shred::new_empty_data_shred (#24714)
Shred::new_empty_data_shred returns an invalid shred (i.e.
shred.sanitize() returns error). The method is only used in tests and
can be easily replaced with Shred::new_from_data. To keep the shred api
surface small, this commit removes this method.
2022-04-26 23:13:12 +00:00
behzad nouri 12ae8d3be5
returns Error when Shred::sanitize fails (#24653)
Including the error in the output allows to debug when Shred::sanitize
fails.
2022-04-25 23:19:37 +00:00
behzad nouri 895f76a93c
hides implementation details of shred from its public interface (#24563)
Working towards embedding versioning into shreds binary, so that a new
variant of shred struct can include merkle tree hashes of the erasure
set.
2022-04-25 12:43:22 +00:00
carllin 8a062273de
Move error counters to be reported by leader only at end of slot (#24581)
* Add error counters to leader metrics only

* Add dependencies
2022-04-23 18:10:47 -05:00
Michael Vines d0a8a16a57 ReplayStage no longer relies on Validator to reset the poh recorder at start 2022-04-22 21:17:49 -07:00
Michael Vines 84e3342612 Process blockstore after starting the TVU 2022-04-22 21:17:49 -07:00
Michael Vines 83e041299a Run real snapshot packager while processing blockstore at validator startup 2022-04-22 21:17:49 -07:00
HaoranYi 2d4defa477
fix typo (#24576) 2022-04-22 08:43:57 -05:00
Jon Cinque 0d51596224
sim: Override slot hashes account on simulation bank (#24543)
* sim: Override slot hashes during simulation

* Add simulation test program

* Address feedback

* Add AccountOverrides explicit type

* Cargo fmt
2022-04-22 12:32:31 +02:00
Justin Starry c544742091
Local cluster test cleanup and refactoring (#24559)
* remove FixedSchedule.start_epoch

* use duration for timing

* Rename to partition bool to turbine_disabled

* simplify partition config
2022-04-22 12:14:07 +08:00
Justin Starry 02bfb85c16
Refactor transaction processing in banking stage (#24336)
* Refactor transaction processing in banking stage

* feedback

* more feedback
2022-04-21 21:06:26 +08:00
Justin Starry d5127abf46
Only add hashes for completed blocks to recent blockhashes (#24389)
* Only add hashes for completed blocks to recent blockhashes

* feedback
2022-04-21 21:05:29 +08:00
Tao Zhu a21fc3f303
Apply transaction actual execution units to cost_tracker (#24311)
* Pass the sum of consumed compute units to cost_tracker

* cost model tracks builtins and bpf programs separately, enabling adjust block cost by actual bpf programs execution costs

* Copied nightly-only experimental `checked_add_(un)signed` implementation to sdk

* Add function to update cost tracker with execution cost adjustment

* Review suggestion - using enum instead of struct for CommitTransactionDetails
Co-authored-by: Justin Starry <justin.m.starry@gmail.com>

* review - rename variable to distinguish accumulated_consumed_units from individual compute_units_consumed

* not to use signed integer operations

* Review - using saturating_add_assign!(), and checked_*().unwrap_or()

* Review - using Ordering enum to cmp

* replace checked_ with saturating_

* review - remove unnecessary Option<>

* Review - add function to report number of non-zero units account to metrics
2022-04-21 07:38:07 +00:00
Justin Starry 79923c3b58
Refactor: Rename BlockhashQueue fields and methods for clarity (#24426) 2022-04-21 11:57:17 +08:00
HaoranYi d0761d0ca4
demote receive_window_num_slot_shreds to debug logging (#24505) 2022-04-20 08:51:46 -05:00
Michael Vines 05f32f287c solana-validator monitor now reports slot-level progress while loading blockstore 2022-04-19 22:09:48 -07:00
Michael Vines 9e4999ef6a Remove halt_at_slot from RuntimeConfig, it's not a runtime concern 2022-04-19 19:23:58 -07:00
Michael Vines 988210908c Move verify_udp_stats_access out of the way 2022-04-19 19:23:58 -07:00
Michael Vines c6f3da4879 blockstore_processor now accepts an Arc<Rwlock<BankForks>> 2022-04-19 19:23:58 -07:00
Michael Vines 0e2e0c8b7d Extract most storage-related services from the Tvu abstraction 2022-04-19 19:23:58 -07:00
Michael Vines 268a2109de Relocate hard forks info log 2022-04-19 19:23:58 -07:00
Michael Vines dd766042df Remove LedgerMetricReportService from TVU 2022-04-19 19:23:58 -07:00
behzad nouri 705ea53353
moves sign_shred and new_coding_shred_header out of Shredder (#24487) 2022-04-19 20:00:05 +00:00
Tao Zhu 94b0186a96
Cost model tracks builtins and bpf programs separately (#24468)
* Cost model tracks builtins and bpf programs separatele (enables adjusting block cost by actual bpf programs execution costs)

* Address reviews: expand test; add metrics stat
2022-04-19 13:25:47 -05:00
behzad nouri 3bbfaae7b6
moves shred stats to a separate file (#24484) 2022-04-19 18:25:09 +00:00
Jeff Washington (jwash) d9d0dad258
report swap mem as bytes like other metrics (#24455) 2022-04-19 10:03:25 -05:00
behzad nouri 039488b562
drops redundant turbine propagation path (#24351)
Most nodes in the cluster receive the same shred from two different
nodes: parent, and the first node of their neighborhood:
https://github.com/solana-labs/solana/blob/a8c695ba5/core/src/cluster_nodes.rs#L178-L197

Because of the erasure codings, half of the shreds are already
redundant. So this redundant propagation path will only add extra
overhead.

Additionally the very first node of the broadcast tree has 2x fanout
(i.e. 400 nodes) which adds too much load at one node.

This commit simplifies the broadcast tree by dropping the redundant
propagation path and removing the 2x fanout at root node.
2022-04-19 00:11:29 +00:00
behzad nouri 1d50832389
replaces counters with datapoints in gossip metrics (#24451) 2022-04-18 23:14:59 +00:00
Jason Davis c2f7f2fff8 Remove redundant epoch_schedule from AccountsPackage 2022-04-18 11:57:40 -05:00
Jason Davis 5472d2e605 Removing redundant EpochSchedule param from fns 2022-04-18 11:57:40 -05:00
Christian Kamm d2c6c04d3e banking-bench: Add and rearrange options
- Add write-lock-contention option, replacing same_payer
- write-lock-contention also has a same-batch-only value, where
  contention happens only inside batches, not between them
- Rename num-threads to batches-per-iteration, which is closer to what
  it is actually doing.
- Add num-banking-threads as a new option
- Rename packets-per-chunk to packets-per-batch, because this is closer
  to what's happening; and it was previously confusing that num-chunks
  had little to do with packets-per-chunk.

Example output for a iterations=100 and a permutation of inputs:

contention,threads,batchsize,batchcount,tps
none,           3,192, 4,65290.30
none,           4,192, 4,77358.06
none,           5,192, 4,86436.65
none,           3, 12,64,43944.57
none,           4, 12,64,65852.15
none,           5, 12,64,70674.37
same-batch-only,3,192, 4,3928.21
same-batch-only,4,192, 4,6460.15
same-batch-only,5,192, 4,7242.85
same-batch-only,3, 12,64,11377.58
same-batch-only,4, 12,64,19582.79
same-batch-only,5, 12,64,24648.45
full,           3,192, 4,3914.26
full,           4,192, 4,2102.99
full,           5,192, 4,3041.87
full,           3, 12,64,11316.17
full,           4, 12,64,2224.99
full,           5, 12,64,5240.32
2022-04-18 09:43:46 -05:00
steviez 38f0d60b00
Move repeated logic into common function (#24373) 2022-04-18 00:16:06 -05:00
Tao Zhu 578d59c802 Remove the code that handles cost update for separate pr 2022-04-17 19:26:24 -05:00
Tao Zhu e97ffb55cb nit - renaming variables to concise names 2022-04-17 19:26:24 -05:00
Tao Zhu 6bc6384f8e refactor to consolidate info into single return field 2022-04-17 19:26:24 -05:00
Tao Zhu 9dadfb2e2c Add checked_add_signed() to apply cost adjustment to cost_tracker 2022-04-17 19:26:24 -05:00
Tao Zhu 810b1dff40 undo cost of executed-but-not-recorded transactions from cost_tracker 2022-04-17 19:26:24 -05:00
Tao Zhu 23d365d02f Address review comment: extract transaction was_executed status to avoid cloning execution_results 2022-04-17 19:26:24 -05:00
Tao Zhu 094da35b91 Address review comments:
1. use was_executed to correctly identify transactions requires cost adjustment;
2. add function to specifically handle executino cost adjustment without have to copy accounts
2022-04-17 19:26:24 -05:00
Tao Zhu 29ca21ed78 undo transaction cost from cost_tracker if it was not executed successfully 2022-04-17 19:26:24 -05:00
sakridge d71986cecf
Separate staked and un-staked on quic tpu port (#24339) 2022-04-16 10:54:22 +02:00
sakridge 1b7d1f78de
Implement QUIC connection warmup service for future leaders (#24054)
* Increase connection timeouts

* Bump quic connection cache to 1024

* Use constant for quic connection timeout and add warm cache service

* Fixes to QUIC warmup service

* fix check failure

* fixes after rebase

* fix timeout test

Co-authored-by: Pankaj Garg <pankaj@solana.com>
2022-04-15 12:09:24 -07:00
Christian Kamm 97f2eb8e65 Banking stage: Deserialize packets only once
Benchmarks show roughly a 6% improvement. The impact could be more
significant when transactions need to be retried a lot.

after patch:
{'name': 'banking_bench_total', 'median': '72767.43'}
{'name': 'banking_bench_tx_total', 'median': '80240.38'}
{'name': 'banking_bench_success_tx_total', 'median': '72767.43'}
test bench_banking_stage_multi_accounts
... bench:   6,137,264 ns/iter (+/- 1,364,111)
test bench_banking_stage_multi_programs
... bench:  10,086,435 ns/iter (+/- 2,921,440)

before patch:
{'name': 'banking_bench_total', 'median': '68572.26'}
{'name': 'banking_bench_tx_total', 'median': '75704.75'}
{'name': 'banking_bench_success_tx_total', 'median': '68572.26'}
test bench_banking_stage_multi_accounts
... bench:   6,521,007 ns/iter (+/- 1,926,741)
test bench_banking_stage_multi_programs
... bench:  10,526,433 ns/iter (+/- 2,736,530)
2022-04-15 00:57:11 -06:00
sakridge 7a4a6597c0
Don't enforce ulimit for validator test config (#24272) 2022-04-12 22:06:37 +02:00
Jon Cinque 9b8850f99e
test-validator: Add `--max-compute-units` flag (#24130)
* test-validator: Add `--max-compute-units` flag

* Add `RuntimeConfig` for tweaking runtime behavior

* Actually add the file

* Move RuntimeConfig to runtime
2022-04-12 02:28:10 +02:00
Michael Vines c1687b0604 Switch to await-aware tokio::sync::Mutex 2022-04-11 18:15:03 -04:00
Giorgio Gambino 60b2155bd3
Add accounts-filler-size command line option (#23896) 2022-04-11 13:10:09 -05:00
carllin ff3b6d2b8b
Remove duplicate increment (#24219) 2022-04-09 15:21:39 -05:00
Christian Kamm a058f348a2 Address review comments 2022-04-08 14:37:55 -05:00
Christian Kamm 2ed29771f2 Unittest for cost tracker after process_and_record_transactions 2022-04-08 14:37:55 -05:00
Christian Kamm 924b8ea1eb Adjustments to cost_tracker updates
- don't store pending tx signatures and costs in CostTracker
- apply tx costs to global state immediately again
- go from commit_or_cancel to update_or_remove, where the cost tracker
  is either updated with the true costs for successful tx, or the costs
  of a retryable tx is removed
- move the function into qos_service and hold the cost tracker lock for
  the whole loop
2022-04-08 14:37:55 -05:00
Tao Zhu 9e07272af8 - Only commit successfully executed transactions' cost to cost_tracker;
- In-fly transactions are pended in cost_tracker until being committed
  or cancelled;
2022-04-08 14:37:55 -05:00
Jeff Washington (jwash) 210f6a6fab
move hash calculation out of acct bg svc (#23689)
* move hash calculation out of acct bg svc

* pr feedback
2022-04-08 10:42:03 -05:00
steviez 1dd63631c0
Add high level overview comments on ledger_cleanup_service (#24184) 2022-04-08 00:49:21 -05:00
HaoranYi e105547c14
tvu and tpu timeout on joining its microservices (#24111)
* panic when test timeout

* nonblocking send when when droping banks

* debug log

* timeout for tvu

* unused varaible

* timeout for tpu

* Revert "debug log"

This reverts commit da780a3301a51d7c496141a85fcd35014fe6dff5.

* add timeout const

* fix typo

* Revert "nonblocking send when when droping banks".
I will create another pull request for this.

This reverts commit 088c98ec0facf825b5eca058fb860deba6d28888.

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-07 20:20:13 -05:00
Jeff Washington (jwash) c27150b1a3
reserialize_bank_fields_with_hash (#23916)
* reserialize_bank_with_new_accounts_hash

* Update runtime/src/serde_snapshot.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/serde_snapshot/tests.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/serde_snapshot/tests.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* pr feedback

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-04-07 14:05:57 -05:00
Jeff Washington (jwash) 550ca7bf92
compare contents of serialized banks instead of exact file format (#24141)
* compare contents of serialized banks instead of exact file format

* Update runtime/src/snapshot_utils.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/snapshot_utils.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* pr feedback

* get rid of clone

* pr feedback

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-04-06 21:55:44 -05:00
Jeff Washington (jwash) fddd162645
reserialize bank in ahv by first writing to temp file in abs (#23947) 2022-04-06 21:39:26 -05:00
Tyera Eulberg fb67ff14de
Remove replica-node crates (#24152) 2022-04-06 16:52:19 -06:00
Brooks Prumo c322842257
Replace channel with Mutex<Option> for AccountsPackage (#24013) 2022-04-06 05:47:19 -05:00
HaoranYi 302142bb25
fix typo (#24123) 2022-04-05 15:55:47 -05:00
behzad nouri db23295e1c
removes legacy weighted_shuffle and weighted_best methods (#24125)
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
https://github.com/solana-labs/solana/pull/18343
and can be replaced with WeightedShuffle.

Also, as described in:
https://github.com/solana-labs/solana/pull/13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.
2022-04-05 19:19:22 +00:00
carllin 4ea59d8cb4
Set drop callback on first root bank (#23999) 2022-04-05 13:02:33 -05:00
behzad nouri 2282571493
removes outdated and flaky test_skip_repair from retransmit-stage (#24121)
test_skip_repair in retransmit-stage is no longer relevant because
following: https://github.com/solana-labs/solana/pull/19233
repair packets are filtered out earlier in window-service and so
retransmit stage does not know if a shred is repaired or not.
Also, following turbine peer shuffle changes:
https://github.com/solana-labs/solana/pull/24080
the test has become flaky since it does not take into account how peers
are shuffled for each shred.
2022-04-05 16:02:53 +00:00
behzad nouri 2b718d00b0 removes legacy compatibility turbine peers shuffle code 2022-04-05 12:04:12 +00:00
behzad nouri d0b850cdd9 removes turbine peers shuffle patch feature 2022-04-05 12:04:12 +00:00
behzad nouri 855801cc95 removes deterministic-shred-seed feature 2022-04-05 12:04:12 +00:00
Jeff Biseda ee6bb0d5d3
track fec set turbine stats (#23989) 2022-04-04 14:44:21 -07:00
HaoranYi 6ba4e870c4
Blockstore should drop signals before validator exit (#24025)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* wip

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

* clean up import

* add mutex for signal senders in blockstore

* remove mut

* refactor: extract add signal functions

* make blockstore signal private

* let compiler infer mutex type

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-04 11:38:05 -05:00
behzad nouri 7cb3b6cbe2
demotes WeightedShuffle failures to error metrics (#24079)
Since call-sites are calling unwrap anyways, panicking seems too punitive
for our use cases.
2022-04-03 16:20:06 +00:00
HaoranYi ffa4cafe1c
Revert sequential execution of validator_exit and validator_parallel_exit tests (#24048)
* handle channel disconnect

* revert sequential execution of validator_exit and parallel_validator_exit tests
2022-04-02 10:22:47 -05:00
Yueh-Hsuan Chiang 0b5ed87220
(LedgerStore) Enable performance sampling in column family get() (#23834)
#### Summary of Changes
This PR enables RocksDB read side performance metrics to report to blockstore_rocksdb_read_perf.
The sampling rate is controlled by an env arg `SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K`,
specifies the number of perf samples for every 1000 operations.  The default value is set to 10, meaning
we will report 10 out of 1000 (or 1/100) reads.

The metrics are based on the RocksDB [PerfContext](https://github.com/facebook/rocksdb/blob/main/include/rocksdb/perf_context.h).
It includes many useful metrics including block read time, cache hit rate, and time spent on decompressing the block.
2022-04-01 13:13:32 -07:00
Pankaj Garg df4d92f9cf
Revert voting service to use UDP instead of QUIC (#24032) 2022-04-01 09:34:18 -07:00
HaoranYi 51b37f0184
Modify rpc_completed_slot_service to be non-blocking (#24007)
* timeout for validator exits

* clippy

* print backtrace when panic

* add backtrace package

* increase time out to 30s

* debug logging

* make rpc complete service non blocking

* reduce log level

* remove logging

* recv_timeout

* remove backtrace

* remove sleep

* remove unused variable

* add comments

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* whitespace

* more whitespace

* fix build

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-03-31 16:44:23 -05:00
Jeff Washington (jwash) 9c8dad33c7
add epoch_schedule and rent_collector to hash calc (#24012) 2022-03-31 10:51:18 -05:00
Jeff Washington (jwash) da001d54e5
calculate_accounts_hash_helper uses config (#24003) 2022-03-31 09:29:45 -05:00
Jeff Washington (jwash) 125f9634fd
add hash calc config.use_write_cache (#24005) 2022-03-30 17:19:34 -05:00
HaoranYi ba770832d0
Poh timing service (#23736)
* initial work for poh timing report service

* add poh_timing_report_service to validator

* fix comments

* clippy

* imrove test coverage

* delete record when complete

* rename shred full to slot full.

* debug logging

* fix slot full

* remove debug comments

* adding fmt trait

* derive default

* default for poh timing reporter

* better comments

* remove commented code

* fix test

* more test fixes

* delete timestamps for slot that are older than root_slot

* debug log

* record poh start end in bank reset

* report full to start time instead

* fix poh slot offset

* report poh start for normal ticks

* fix typo

* refactor out poh point report fn

* rename

* optimize delete - delete only when last_root changed

* change log level to trace

* convert if to match

* remove redudant check

* fix SlotPohTiming comments

* review feedback on poh timing reporter

* review feedback on poh_recorder

* add test case for out-of-order arrival of timing points and incomplete timing points

* refactor poh_timing_points into its own mod

* remove option for poh_timing_report service

* move poh_timing_point_sender to constructor

* clippy

* better comments

* more clippy

* more clippy

* add slot poh timing point macro

* clippy

* assert in test

* comments and display fmt

* fix check

* assert format

* revise comments

* refactor

* extrac send fn

* revert reporting_poh_timing_point

* align loggin

* small refactor

* move type declaration to the top of the module

* replace macro with constructor

* clippy: remove redundant closure

* review comments

* simplify poh timing point creation

Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>
2022-03-30 09:04:49 -05:00
Jeff Washington (jwash) c24de17278
remove index hash calculation as an option (#23928) 2022-03-25 15:32:53 -05:00
HaoranYi 01af40d6b6
Fix intermittent validator_exit test failure (#23594)
* run validator_exit_test sequentially

* limit validator exit run to its own serial run subset
add 10ms delay in the validator exit tests

* fix intermittent validator exit failure

* no sleep

* undo the code move
2022-03-25 14:38:19 -05:00
ryleung-solana 6b85c2104c
Implement forwarding via TpuConnection (#23817) 2022-03-25 11:31:40 -04:00
Steven Luscher f44c8f296f
fix: thread `enforce_ulimit_nofile` config down when opening blockstore (#23925) 2022-03-25 03:13:33 -05:00
Jeff Washington (jwash) 51f5524e2f
make verify_accounts_package_hash like other hash calc (#23906) 2022-03-24 17:49:48 -05:00
Jeff Washington (jwash) 55d61023f7
document 'accounts' hash (#23907) 2022-03-24 15:58:52 -05:00
HaoranYi fedf4e984f
typo (#23910) 2022-03-24 15:21:59 -05:00
Jeff Washington (jwash) 37c36ce3fa
pass stats separately from CalcAccountsHashConfig (#23892) 2022-03-24 12:48:47 -05:00
steviez c31db81ac4
Use VoteAccountsHashMap type alias in all applicable spots (#23904) 2022-03-24 12:09:48 -05:00
ryleung-solana 82945ba973
Optimize TpuConnection and its implementations and refactor connection-cache to not use dyn in order to enable those changes (#23877) 2022-03-24 11:40:26 -04:00
Jeff Washington (jwash) 5b916961b5
HashCalc uses self.accounts_cache (#23890) 2022-03-24 10:34:28 -05:00
Jeff Washington (jwash) b22165ad69
hash calc uses self.filler_account_suffix (#23887) 2022-03-24 09:58:06 -05:00
Jeff Washington (jwash) 9022931689
calc hash uses self.num_hash_scan_passes (#23883) 2022-03-24 09:44:42 -05:00
Jeff Washington (jwash) db5d68f01f
HashCalc uses self.accounts_hash_cache_path (#23882) 2022-03-24 09:31:55 -05:00
Jeff Washington (jwash) 3e22d4b286
calc hash uses self.thread_pool_clean (#23881) 2022-03-23 20:52:38 -05:00
Jeff Washington (jwash) 9e61fe7583
add AccountsHashConfig to manage parameters (#23850) 2022-03-23 13:44:23 -05:00
HaoranYi db49b826f0
seperate blockstore metrics from window service metrics (#23871) 2022-03-23 13:38:17 -05:00
HaoranYi 7ff8ed869c
typos (#23870) 2022-03-23 13:36:55 -05:00
Jeff Washington (jwash) b1280b670a
calculate_accounts_hash_without_index takes &self (#23846)
* calculate_accounts_hash_without_index takes &self

* Update runtime/src/snapshot_package.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-03-23 11:57:32 -05:00
Justin Starry 92462ae031
Manually serialize and use `send_wire_transaction` for votes (#23826)
* Revert "core: partial versioned transaction support for voting service"

This reverts commit eb3df4c20e.

* Manually serialize vote tx before sending to TPU
2022-03-23 09:47:55 +08:00
Jon Cinque 7af48465fa
transaction-status: Add return data to meta (#23688)
* transaction-status: Add return data to meta

* Add return data to simulation results

* Use pretty-hex for printing return data

* Update arg name, make TransactionRecord struct

* Rename TransactionRecord -> ExecutionRecord
2022-03-22 23:17:05 +01:00
Trent Nelson eb3df4c20e core: partial versioned transaction support for voting service 2022-03-21 22:59:05 -06:00
HaoranYi 45a7c6edfb
Fix typos and a small refactor (#23805)
* fix typo

* remove packet_has_more_unprocessed_transactions function
2022-03-21 18:35:31 -05:00
Pankaj Garg 5d03b188c8
Use QUIC client in voting service (#23713)
* Use QUIC client in voting service

* guard quic-client usage with a flag

* add measure to time the quic client

* move time measure outside if block

* remove quic vs UDP flag from voting service
2022-03-21 09:10:16 -07:00
Tao Zhu 71ea05c176 replace nested for_each with flat_map 2022-03-18 16:37:41 -05:00
Tao Zhu 1c369fb55f Scan entire UnprocessedPacketBatches buffer to produce stake and locator of each packet 2022-03-18 16:37:41 -05:00
Yueh-Hsuan Chiang f999eef452
(LedgerStore) Rename BlockstoreAdvancedOptions to LedgerColumnOptions (#23764)
This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will
pass-down this struct to LedgerColumn to allow it to perform metric reporting.
2022-03-18 11:13:35 -07:00
Tao Zhu 56428be629 Not exposing inner cost_table to encapsulating implementation details,
making future change easier.
2022-03-18 12:58:43 -05:00
Tao Zhu 0ed23899e7 directly use compute_budget MAX_UNITS and DEFAULT_UNITS 2022-03-18 08:53:11 -05:00
Tao Zhu a4cacf3389 add deterministic default cost 2022-03-18 08:53:11 -05:00
Tao Zhu c478fe2047 add timing metrics, some renaming 2022-03-17 19:31:28 -05:00
Tao Zhu fd515097d8 leader qos part 2: add stage to find sender stake, set to packet meta 2022-03-17 19:31:28 -05:00
Stephen Akridge 976b138e76 Add tx weighting stage 2022-03-17 19:31:28 -05:00
Michael Vines 3773b753d1 Configure shrink paths during blockstore load 2022-03-15 23:08:07 -07:00
Michael Vines ab373bb1a9 Refactor new_banks_from_ledger() into load and process steps 2022-03-15 23:08:07 -07:00
Michael Vines 2da4e3eb6c Add --no-os-memory-stats-reporting 2022-03-15 17:07:40 -07:00
Michael Vines dbc62f2e28 Use consistent variable naming for DropBankService 2022-03-15 17:07:13 -07:00
Michael Vines d44f3d7216 Remove unhelpful log message 2022-03-15 17:07:13 -07:00
Tao Zhu 2d3501dff9 make upsert infallible op 2022-03-15 17:05:41 -05:00
Tao Zhu 61cead9b9b Remove injection of exit signal into cost_update_service 2022-03-15 09:58:56 -05:00
Tao Zhu eb73dacd58 harden banking tests 2022-03-15 09:58:08 -05:00
Justin Starry 8c8f9694e0
Refactor: Sanitized transaction creation (#23558)
* Refactor: SanitizedTransaction::try_create optionally computes hash

* Refactor: Add SimpleAddressLoader
2022-03-15 12:02:22 +08:00
Tyera Eulberg 102dd68a03
Rename AccountsDb plugins to Geyser plugins (#23604) 2022-03-14 19:18:46 -06:00
Michael Vines 17cc095d28 Slot warping doesn't need to be in new_banks_from_ledger 2022-03-14 15:29:58 -07:00
Michael Vines 2e7ee0f177 Tower loading doesn't need to be in new_banks_from_ledger 2022-03-14 15:29:58 -07:00
Michael Vines 390dc24608 Create leader schedule before processing blockstore 2022-03-14 15:29:58 -07:00
Michael Vines 543d5d4a5d Reduce new_banks_from_ledger arguments 2022-03-14 15:29:58 -07:00
Michael Vines 115f376465 Factor out bank_forks_utils::load_bank_forks() 2022-03-14 15:29:58 -07:00
Michael Vines c2ce152be8 Inline do_process_blockstore_from_root 2022-03-14 15:29:58 -07:00
Tao Zhu 5ea6a1e500 code review 2022-03-14 13:14:27 -05:00
Tao Zhu 8590911b0a Replace type alias with newtype for UnprocesedPacketBatches 2022-03-14 13:14:27 -05:00
Brooks Prumo 7758c32035
Banking Stage drops transactions that'll exceed the total account data size limit (#23537) 2022-03-13 15:58:57 +00:00
Yueh-Hsuan Chiang 1e20bd8f9a
(LedgerStore) Include storage type as a tag in RocksDB metric reporting (#23523)
#### Summary of Changes
This PR further enables group by operation on storage type in blockstore_rocksdb_cfs metrics.
Such group-by allows us to further compare the performance metrics between rocks-level and
rocks-fifo.

To make things extensible, this PR introduces BlockstoreAdvancedOptions and move shred_storage_type. 
All fields in BlockstoreAdvancedOptions will support group-by operation in blockstore_rocksdb_cfs.

Dependency: #23580
2022-03-11 15:17:34 -08:00
Tao Zhu 35d1235ed0
- move `unprocessed_packet_batches` from `BankingStage` to its own (#23508)
module
- deserialize packets during receving and buffering
2022-03-10 18:47:46 +00:00
carllin 588414a776
Report even if slot begins and ends in process_buffered_packets() (#23549) 2022-03-09 23:42:35 -05:00
Tao Zhu f68c5a274d remove persist_cost_table code 2022-03-09 21:05:47 -07:00
Tao Zhu 9f71958d7d Patch validator from loading persisted program costs 2022-03-09 21:05:47 -07:00
sakridge 7a9884c831
Quic limit connections (#23283)
* quic server limit connections

* bump per_ip

* Review comments

* Make the connections per port
2022-03-09 10:52:31 +01:00
Carl Lin 5a0cd05866 Revert "- estimate a program cost as 2 standard deviation above mean"
This reverts commit a25ac1c988.
2022-03-08 17:18:44 -08:00
Carl Lin 9acbfa5eb1 Revert "use EMA in place of Welford"
This reverts commit 6587dbfa47.
2022-03-08 17:18:44 -08:00
Carl Lin c878c9e2cb Revert "1. Persist to blockstore less frequently;"
This reverts commit 7aa1fb4e24.
2022-03-08 17:18:44 -08:00
Carl Lin 0a17edcc1f Revert "fix tests after merge"
This reverts commit ba2d83f580.
2022-03-08 17:18:44 -08:00
Michael Vines b719d6a2ad `solana-validator set-identity` no longer writes a tower file unnecessarily 2022-03-08 15:34:23 -08:00
Justin Starry 3114c199bd
Add RPC support for versioned transactions (#22530)
* Add RPC support for versioned transactions

* fix doc tests

* Add rpc test for versioned txs

* Switch to preflight bank
2022-03-08 15:20:34 +08:00
HaoranYi 181fffb916
rename status filename to be consistent (#23501) 2022-03-07 17:34:35 +00:00
Yueh-Hsuan Chiang b8b7163b66
(Ledger Store) Report RocksDB Column Family Metrics (#22503)
This PR enables blockstore to periodically report RocksDB column family properties.
The reported properties are under blockstore_rocksdb_cfs, and the properties also
support group by operation on cf_name.
2022-03-05 16:13:03 -08:00
Yueh-Hsuan Chiang 62d2a4cd88
Make ShredStorageType::RocksLevel public (#23272)
#### Summary of Changes
This PR adds two hidden arguments to the validator that allow users to use RocksDB's FIFO compaction for storing shreds.

        --shred-storage <SHRED_STORAGE>
            EXPERIMENTAL: Controls how RocksDB compacts shreds.  *WARNING*: You will lose your ledger data
            when you switch between options. Possible values are: 'level': stores shreds using RocksDB's default (level)
            compaction. 'fifo': stores shreds under RocksDB's FIFO compaction. This option is more efficient on
            disk-write-bytes of the ledger store. [default: level]  [possible values: level, fifo]

        --shred-storage-size <SHRED_STORAGE_SIZE_BYTES>
            The shred storage size in bytes. The suggested value is 50% of your ledger storage size in bytes. [default:
            268435456000]
2022-03-03 12:43:58 -08:00
Jeff Washington (jwash) 26aa18b3f3
fmt (#23448) 2022-03-02 11:54:58 -06:00
HaoranYi 41f78b9925
small optimization. use shift for pow of 2. (#22975) 2022-03-02 09:11:12 -06:00
HaoranYi 8de88d0a55
Refactor packet_threshold adjustment code into its own struct (#23216)
* refactor packet_threshold adjustment code into own struct and add unittest for it

* fix a typo in error message

* code review feedbacks

* another code review feedback

* Update core/src/ancestor_hashes_service.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* share packet threshold with repair service (credit to carl)

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-03-02 09:09:06 -06:00
HaoranYi 86e2f728c3
Fix a batch limits bug in banking (#23327)
* add thread index in thread name for debugging

* fix batch_limit

* use NUM_VOTE_THREAD instead of hardcoded number (credit to carllin)
2022-03-02 09:08:08 -06:00
Jeff Biseda c69e3b73ff
bench get_retransmit_peers (#23292) 2022-03-01 19:10:29 -08:00
Brooks Prumo 533eca3b4c
Simplify replay_blockstore_into_bank() (#23282) 2022-02-25 06:57:04 -06:00
Trent Nelson d4292774c5 checks 2022-02-25 08:05:28 +00:00
Justin Starry d0e85c293f
Fix rustfmt check (#23296) 2022-02-23 16:38:53 +08:00
Gavin Chan 20d031e2b8
Refactor ExecuteTimings w/ enum-indexed array (#23085) 2022-02-22 14:46:56 -08:00
Tyera Eulberg 7e08ae1d0c
Revert "Add simulation detection countermeasure (#22880)" (#23261)
This reverts commit c42b80f099.
2022-02-21 21:15:37 +00:00
buffalu 70ebab2c82
Add rustfmt.toml and `cargo fmt` (#23238)
* fmt

* formatted

Co-authored-by: Lucas B <buffalu@jito.network>
2022-02-19 13:32:29 +08:00
carllin 619335df1a
Add execute timings (#23097) 2022-02-17 01:14:32 -05:00
anatoly yakovenko 83d31c9e65
shrink batches when over 80% of the space is wasted (#23066)
* shrink batches when over 80% of the space is wasted
2022-02-16 08:18:17 -08:00
Jeff Biseda 115d71536b
forward_buffered_packets return packet count in error path (#23167) 2022-02-16 07:46:32 -08:00
Michael Vines a6d736572c `solana-validator set-identity` now supports the `--require-tower` flag 2022-02-15 19:45:00 -08:00
Tao Zhu 03bf66a51b flag end-of-slot when poh bank is gone 2022-02-15 15:01:27 -06:00
Ashwin Sekar ab92578b02
Fix the flaky test test_restart_tower_rollback (#23129)
* Add flag to disable voting until a slot to avoid duplicate voting

* Fix the tower rollback test and remove it from flaky.
2022-02-15 13:19:34 -07:00
Michael Vines c42b80f099
Add simulation detection countermeasure (#22880)
* Add simulation detection countermeasures

* Add program and test using TestValidator

* Remove incinerator deposit

* Remove incinerator

* Update Cargo.lock

* Add more features to simulation bank

* Update Cargo.lock per rebase

Co-authored-by: Jon Cinque <jon.cinque@gmail.com>
2022-02-15 13:09:59 +01:00
Lijun Wang c04438be4b
Retaining transaction logs when transaction plugin is loaded. (#22874)
Transaction logs are not being saved to the database through the plugin interface.

Summary of Changes

Retain the transaction logs when transaction notification plugin is loaded.

Fixes #
lijunwangs/solana-accountsdb-plugin-postgres#6
2022-02-11 20:29:07 -08:00
carllin 2f9e30a1f7
Introduce slot-specific packet metrics (#22906) 2022-02-11 03:07:45 -05:00
Justin Starry d5dec989b9
Enforce tx metadata upload with static types (#23028) 2022-02-10 13:28:18 +08:00
Yueh-Hsuan Chiang 1b287f1b59
(Ledger Cleanup) Add code comments for ledger_cleanup. (#22807) 2022-02-08 22:48:56 -08:00
Tao Zhu ba2d83f580 fix tests after merge 2022-02-08 16:18:23 -06:00
Tao Zhu 7aa1fb4e24 1. Persist to blockstore less frequently;
2. reduce alpha for EMA to 1 percent to have roughly 200 data points for estimatio
2022-02-08 16:18:23 -06:00
Tao Zhu 6587dbfa47 use EMA in place of Welford 2022-02-08 16:18:23 -06:00
Tao Zhu a25ac1c988 - estimate a program cost as 2 standard deviation above mean
- replaced get_average / get_mode with get_default to assign max units to unknown program
2022-02-08 16:18:23 -06:00
Tao Zhu e52e48076e
bench should update leader schedule cache (#22991) 2022-02-08 02:28:28 +00:00
Ashwin Sekar 5acf0f6331
Add feature gate for new vote instruction and plumb through replay (#21683)
* Add feature gate for new vote instruction and plumb through replay

Add tower versions

* Add check for slot hashes history

* Update is_recent check to exclude voting on hard fork root slot

* Move tower rollback test to flaky and ignore it until #22551 lands
2022-02-07 14:06:19 -08:00
behzad nouri 27aaf9df85
removes VoteTracker::new in favor of VoteTracker::default (#22941)
VoteTracker::new does not need a bank and is so redundant:
https://github.com/solana-labs/solana/blob/5a230f418/core/src/cluster_info_vote_listener.rs#L103-L107
2022-02-04 19:01:59 +00:00
sakridge 5a230f418d
Add quic port for accepting transactions (#22753)
using quinn library

streamer: Sign TLS cert with validator identity key

Handle multiple incoming chunks
2022-02-04 15:27:09 +01:00
Tao Zhu 4bec182b32
Allow buffered packets be consumed if bank is active, regardless leader schedule (#22913) 2022-02-03 21:29:41 +00:00
Justin Starry 60af1a4cce
Refactor: Add trait for loading addresses (#22903) 2022-02-03 11:00:12 +00:00
carllin bd1850df25
Return actual committed transactions from process_transactions() (#22802) 2022-02-03 03:56:36 -05:00
Trent Nelson c62f9839a2 test-validator-bin: reinstate full rpc method set 2022-02-03 02:43:03 +00:00
Ikko Ashimine 58a70d76a3
fix typo in broadcast_duplicates_run.rs (#22888)
Creat -> Create
2022-02-02 12:29:14 -07:00
behzad nouri dccbddad80
adds reverse lookup index to cluster-nodes (#22892)
retransmit has to exclude slot leader from set of nodes for each shred; 
which currently requires a linear scan:
https://github.com/solana-labs/solana/blob/e3b137066/core/src/cluster_nodes.rs#L238-L242

This commit adds a reverse lookup index to avoid linear scan.
2022-02-02 19:27:50 +00:00
behzad nouri e3b137066d
caches WeightedShuffle struct in ClusterNodes (#22877)
Instead of reconstructing WeightedShuffle struct for each shred
broadcast or retransmit, we can use the same struct with minimal
mutations.
2022-02-02 15:12:26 +00:00
Trent Nelson eac4a6df68 rpc: use minimal mode by default 2022-02-01 19:00:06 -07:00
behzad nouri 45e09664b8
removes Rng field from WeightedShuffle struct (#22850) 2022-02-01 15:27:23 +00:00
behzad nouri 604ca9316c
includes zero weighted entries in WeightedShuffle (#22829)
Current WeightedShuffle implementation excludes zero weighted entries
from the shuffle:
https://github.com/solana-labs/solana/blob/13e631dcf/gossip/src/weighted_shuffle.rs#L29-L30

Though mathematically this might make more sense, for our use-cases
(turbine specifically), this results in less efficient code:
https://github.com/solana-labs/solana/blob/13e631dcf/core/src/cluster_nodes.rs#L409-L430

This commit changes the implementation so that zero weighted indices are
also included in the shuffle but appear only at the end after non-zero
weighted indices.
2022-01-31 16:23:50 +00:00
Justin Starry 220aa6ada0
Fix poh recorder initialization on startup (#22755) 2022-01-28 14:21:15 +08:00
Michael Vines 331b953551 Add vote account address to vote subscription 2022-01-27 08:22:29 -08:00
Justin Starry d9c259a231
Set the correct root in block commitment cache initialization (#22750)
* Set the correct root in block commitment cache initialization

* clean up test

* bump
2022-01-27 00:48:00 +08:00
sakridge 2e56c59bcb
Handle already discarded packets in gpu sigverify path (#22680) 2022-01-24 14:35:47 +01:00
Justin Starry 1240217a73
Refactor: Rename variables and helper method to `PohRecorder` (#22676)
* Refactor: Rename leader_first_tick_height field

* Refactor: add `PohRecorder::slot_for_tick_height` helper

* Refactor: Add type for poh leader status
2022-01-23 10:28:50 +08:00
anatoly yakovenko d6011ba14d
Dedup bloom filter is too slow (#22607)
* Faster dedup

* use ahash

* fixup

* single threaded

* use duration type

* remove the count

* fixup
2022-01-21 20:23:48 -07:00
Michael Vines 6d5bbca630 Pacify clippy 2022-01-21 19:12:57 -08:00
Michael Vines ce4f7601af Avoid unstable_name_collisions warning 2022-01-21 19:12:57 -08:00
sakridge 38b02bbcc0
Handle already discarded packets in discard_excess_packets (#22594) 2022-01-21 17:22:50 +01:00
Jeff Biseda e7777281d6
regularly report network limits (#22563) 2022-01-20 12:38:42 -08:00
Trent Nelson cca3dbc76d system-monitor-service: support percentages from bigger numbers 2022-01-20 09:51:23 +00:00
Justin Starry 7f20c6149e
Refactor: move simple vote parsing to runtime (#22537) 2022-01-20 10:39:21 +08:00
anatoly yakovenko d343713f61
Optimize packet dedup (#22571)
* Use bloom filter to dedup packets

* dedup first

* Update bloom/src/bloom.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/sigverify_stage.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/sigverify_stage.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/sigverify_stage.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* fixup

* fixup

* fixup

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-01-19 13:58:20 -08:00
behzad nouri dcf44d2523
improves sigverify discard_excess_packets performance (#22577)
As shown by the added benchmark, current code does worse if there is a
spam address plus a lot of unique addresses.

on current master:
test bench_packet_discard_many_senders  ... bench:   1,997,960 ns/iter (+/- 103,715)
test bench_packet_discard_mixed_senders ... bench:  14,256,116 ns/iter (+/- 534,865)
test bench_packet_discard_single_sender ... bench:   1,306,809 ns/iter (+/- 61,992)

with this commit:
test bench_packet_discard_many_senders  ... bench:   1,644,025 ns/iter (+/- 83,715)
test bench_packet_discard_mixed_senders ... bench:   1,089,789 ns/iter (+/- 86,324)
test bench_packet_discard_single_sender ... bench:     955,234 ns/iter (+/- 55,953)
2022-01-19 18:10:02 +00:00
buffalu 650882217c
Add PacketBatch packet_indexes stat (#22564)
* collect stats on packet batch indicies

* cleanup

* cleanup

* cleanup

* change name
2022-01-19 08:13:07 +00:00
anatoly yakovenko e616a7ebfc
Track discard time of excess packets in sigverify (#22554)
* discard time histogram

* closer to the if

* update
2022-01-18 15:09:39 -07:00
Michael Vines 8dc6f9f589 Remove unused mut 2022-01-18 12:10:31 -08:00
sakridge 49443406fd
Use VecDeque instead of Vec in sigverify stage (#22538)
avoid bad performance of remove(0) for a single sender
2022-01-17 18:37:05 +01:00
anatoly yakovenko 2d94e6e5d3
metrics for generate new bank forks (#22492)
* metrics for generate new bank forks

* fixed

* Apply suggestions from code review

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* --fixup

* fixup!

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-01-17 09:59:47 -07:00
Tao Zhu a724fa2347
Add hidden cli option to allow validator reports replayed transaction cost metrics (#22369)
* add hidden cli option to allow validator reports replayed transaction cost detail metrics

* Update validator/src/main.rs

Co-authored-by: Michael Vines <mvines@gmail.com>

* - rebase master, using unbounded instead of channel; dowgrade to datapoint_trace

* removed cli arg, prefer log at trace

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-01-15 00:31:21 +00:00
Tao Zhu 1309a9cea0
Add estimated and actual block cost units metrics (#22326)
* - report cost details for transactions selected to be packed into block;
- report estimated execution units packed into block, and actual units and time after execution

* revert reporting per-transaction details

* rollup transaction cost details (eg signature cost, wirte lock, data cost and execution costs) into block stats

* change naming from units to cu, use struct to replace tuple
2022-01-14 23:44:18 +00:00
Tao Zhu 9c9f2dd5bd port counting vote CUs to block cost (#22477) 2022-01-14 10:50:29 -06:00
Justin Starry f804ccdece
Store address table lookups in blockstore and bigtable (#22402) 2022-01-14 15:24:41 +08:00
carllin 4ab7d6c23e
Filter out outdated slots (#22450)
* Filter out outdated slots

* Fixup error
2022-01-13 19:51:00 -05:00
Tao Zhu 6614727be8 downgrade individual per-program-timing to trace to reduce writes to influx 2022-01-12 18:52:13 -06:00
Tyera Eulberg 637e366b18
Prevent rent-paying account creation (#22292)
* Fixup typo

* Add new feature

* Add new TransactionError

* Add framework for checking account state before and after transaction processing

* Fail transactions that leave new rent-paying accounts

* Only check rent-state of writable tx accounts

* Review comments: combine process_result success behavior; log and metrics before feature activation

* Fix tests that assume rent-exempt accounts are okay

* Remove test no longer relevant

* Remove native/sysvar special case

* Move metrics submission to report legacy->legacy rent paying transitions as well
2022-01-11 11:32:25 -07:00
Jeff Biseda 8b66625c95
convert std::sync::mpsc to crossbeam_channel (#22264) 2022-01-11 02:44:46 -08:00
steviez 5f1f4dcbdd
Use struct to pass all Tpu sockets as one argument to Tpu::new() (#21965)
Tpu::new() now matches Tvu::new() in having struct to reduce argument
list. Additionally, Rust supports partial moves, so there is no need to
clone the Tvu sockets out of Node object.
2022-01-10 11:29:48 -06:00
Ashwin Sekar eeec1ce2ad
Add local cluster test to repro slot hash expiry bug (#21873) 2022-01-10 00:58:21 -05:00
Justin Starry 52d12cc802
Add runtime support for address table lookups (#22223)
* Add support for address table lookups in runtime

* feedback

* feedback
2022-01-07 11:59:09 +08:00
Trent Nelson 390ef0fbcd Consolidate process instruction execution timings to own struct 2022-01-06 03:56:46 -07:00
Trent Nelson 848b6dfbdd Add metrics for executor creation 2022-01-06 03:56:46 -07:00
Carl Lin b25e4a200b Add execute metrics 2022-01-06 03:56:46 -07:00
Trent Nelson 7d32909e17 move `ExecuteTimings` from `runtime::bank` to `program_runtime::timings` 2022-01-06 03:56:46 -07:00
Justin Starry 45458e7139
Refactor: Improve type safety and readability of transaction execution (#22215)
* Refactor Bank::load_and_execute_transactions

* Refactor: improve type safety of TransactionExecutionResult

* Add enum for extra type safety in execution results

* feedback
2022-01-05 10:15:15 +08:00
behzad nouri 4b24499916 removes total-size from return value of recv_mmsg 2022-01-04 21:06:59 +00:00
behzad nouri 01a096adc8 adds bitflags to Packet.Meta
Instead of a separate bool type for each flag, all the flags can be
encoded in a type-safe bitflags encoded in a single u8:
https://github.com/solana-labs/solana/blob/d6ec103be/sdk/src/packet.rs#L19-L31
2022-01-04 13:53:40 +00:00
behzad nouri 73a7741c49 uses std::net::IpAddr type for Packet.Meta.addr 2022-01-04 13:53:40 +00:00
sakridge 2486e21ffe
Lower vote-only-mode to 400 (#22210) 2022-01-04 12:49:14 +01:00
Jeff Biseda ca8fef5855
retransmit consecutive leader blocks (#22157) 2022-01-04 00:24:16 -08:00
Yueh-Hsuan Chiang e8b7f96a89
Add struct BlockstoreOptions (#22121) 2022-01-03 18:30:45 -10:00
carllin 005592998d
Fix bug, add error specific timings (#22225) 2022-01-03 16:33:54 -05:00
behzad nouri 69d71f8f86
removes epoch_authorized_voters from VoteTracker (#22192)
https://github.com/solana-labs/solana/pull/22169
verifies authorized-voter early on in vote-listener pipeline; and so
VoteTracker no longer needs to maintain and check for epoch authorized
voters.
2022-01-03 21:07:47 +00:00
Jeff Biseda 0e4ede46d1
work around rust 39364 for stats_reporter_sender (#22227) 2022-01-03 11:46:02 -08:00
carllin d06e6c7425
Count compute units even when transaction errors (#22182) 2021-12-30 21:21:42 -05:00
Jeff Biseda 95dfcc546a
bypass retransmission for slots without propagated stats (#22176) 2021-12-30 16:07:34 -08:00
behzad nouri c0c6038654
checks for authorized voter early on in the vote-listener pipeline (#22169)
Before votes are verified that they are signed by the authorized voter,
they might be dropped in verified-vote-packets code. If there are
enough many spam votes from unauthorized voters, this may potentially
drop valid votes but keep the false ones.
https://github.com/solana-labs/solana/blob/57986f982/core/src/verified_vote_packets.rs#L165-L168
2021-12-30 15:03:14 +00:00
carllin 33d0b5e011
Revert "Count compute units even when transaction errors (#22059)" (#22174)
This reverts commit eaa8c67bde.
2021-12-30 02:42:32 -05:00
Lijun Wang f14928a970
Stream additional block metadata via plugin (#22023)
* Stream additional block metadata through plugin
blockhash, block_height, block_time, rewards are streamed
2021-12-29 15:12:01 -08:00
Justin Starry b1d9a2e60e
Don't forward packets received from TPU forwards port (#22078)
* Don't forward packets received from TPU forwards port

* Add banking stage test
2021-12-29 19:34:31 +01:00
carllin eaa8c67bde
Count compute units even when transaction errors (#22059) 2021-12-28 17:05:11 -05:00
carllin f061059e45
Prevent log spam (#22148) 2021-12-28 17:04:48 -05:00
Tao Zhu 3d6ab96587 push live packets straight to buffer, leader only process packets from buffer 2021-12-28 15:21:24 -06:00
Yueh-Hsuan Chiang b89cd8cd1a
Avoid cloning Vec<Entry> when calling entries_to_test_shreds() (#22093) 2021-12-24 12:32:43 -08:00
Justin Starry 93c776ce19
Refactor packet deduplication and harden bench test (#22080) 2021-12-22 23:05:10 -06:00
Tao Zhu dd80a525ef
Leader QoS service metrics (#21708)
* - qos_service metrics tagged with leader thread ids to separate gossip/tpu votes and transactions;
- qos_service metrics is reported with bank slot;
- replaced timer-based reporting with signal via channel; removed async report test as qos_service now lives within a thread

* - add tpu live packets (eg, not buffered packets) states to qos metrics reporting
2021-12-22 21:39:59 +00:00
behzad nouri 4d62f03297
uses enum instead of trait for VoteTransaction (#22019)
Box<dyn Trait> involves runtime dispatch, has significant overhead and
is slow. It also requires hacky boilerplate code for implementing Clone
or other basic traits:
https://github.com/solana-labs/solana/blob/e92a81b74/programs/vote/src/vote_state/mod.rs#L70-L102

Only limited known types can be VoteTransaction and they are all defined
in the same crate. So using a trait here only adds overhead.
https://github.com/solana-labs/solana/blob/e92a81b74/programs/vote/src/vote_state/mod.rs#L125-L165
https://github.com/solana-labs/solana/blob/e92a81b74/programs/vote/src/vote_state/mod.rs#L221-L264
2021-12-22 14:25:46 +00:00
Tao Zhu 9c5d82557a skip reporting all-zero stats 2021-12-21 16:20:36 -06:00
behzad nouri 65d59f4ef0
tracks erasure coding shreds' indices explicitly (#21822)
The indices for erasure coding shreds are tied to data shreds:
https://github.com/solana-labs/solana/blob/90f41fd9b/ledger/src/shred.rs#L921

However with the upcoming changes to erasure schema, there will be more
erasure coding shreds than data shreds and we can no longer infer coding
shreds indices from data shreds.

The commit adds constructs to track coding shreds indices explicitly.
2021-12-19 22:37:55 +00:00
behzad nouri 7476dfeec0
removes Select in favor of recv_timeout/try_iter (#21981)
crossbeam_channel::Select::ready_timeout might return with success spuriously.
2021-12-18 17:39:07 +00:00
Jeff Biseda 3fe942ab30
new net-stats require a new table (#21996) 2021-12-18 00:13:16 -08:00
carllin 7f6fb6937a
Ensure AncestorHashesSerice selects an open port (#21919) 2021-12-18 00:44:01 -05:00
Jeff Biseda 97a1fa10a6
streamer send destination metrics for repair, gossip (#21564) 2021-12-17 15:21:05 -08:00
segfaultdoctor 76098dd42a
RPC Block Subscription (#21787)
* add stuff

* compiling

* add notify block

* wip

* feat: add blockSubscribe pubsub method

* address PR comments

Co-authored-by: Lucas B <buffalu@jito.network>
Co-authored-by: Zano <segfaultdoctor@protonmail.com>
2021-12-17 16:03:09 -07:00
behzad nouri 89d66c3210
removes next_shred_index from return value of entries to shreds api (#21961)
next-shred-index is already readily available from returned data shreds.
The commit simplifies the api for upcoming changes to erasure coding
schema which will require explicit tracking of indices for coding shreds
as well as data shreds.
2021-12-17 15:01:55 +00:00
Jeff Biseda 7ec39f5a1e
time based retransmit in replay_stage (#21498) 2021-12-17 05:44:40 -08:00
carllin 385efae4b3
Remove need to send bank in retransmit request from ReplayStage (#21943)
* Remove need to send bank in retransmitter
2021-12-16 21:11:01 -05:00
Justin Starry 6ff0be6a82
Clean up demote program write lock feature (#21949)
* Clean up demote program write lock feature

* fix test
2021-12-16 17:27:22 -05:00
carllin cb395abff7
Fix subtraction overflow (#21871) 2021-12-14 14:24:22 -05:00
behzad nouri 8d980f07ba
uses Option<Slot> for SlotMeta.parent_slot (#21808)
SlotMeta.parent_slot for the head of a detached chain of slots is
unknown and that is indicated by u64::MAX which lacks type-safety:
https://github.com/solana-labs/solana/blob/6c108c8fc/ledger/src/blockstore_meta.rs#L203-L205

The commit changes the type to Option<Slot>. Backward compatibility is
maintained by customizing serde serialize/deserialize implementations.
2021-12-14 18:57:11 +00:00
behzad nouri 4ceb2689f5
adds ShredId uniquely identifying each shred (#21820) 2021-12-14 17:34:02 +00:00
Jeff Washington (jwash) 90f41fd9b7
use cost model to limit new account creation (#21369)
* use cost model to limit new account creation

* handle every system instruction

* remove &

* simplify match

* simplify match

* add datapoint for account data size

* add postgres error handling

* handle accounts:unlock_accounts
2021-12-12 14:57:18 -06:00
behzad nouri e08139f949
uses Option<u64> for SlotMeta.last_index (#21775)
SlotMeta.last_index may be unknown and current code is using u64::MAX to
indicate that:
https://github.com/solana-labs/solana/blob/6c108c8fc/ledger/src/blockstore_meta.rs#L169-L174

This lacks type-safety and can introduce bugs if not always checked for
Several instances of slot_meta.last_index + 1 are also subject to
overflow.

This commit updates the type to Option<u64>. Backward compatibility is
maintained by customizing serde serialize/deserialize implementations.
2021-12-11 14:47:20 +00:00
Justin Starry 254ef3e7b6
Rename Packets to PacketBatch (#21794) 2021-12-11 09:44:15 -05:00
behzad nouri 8063273d09
adds more sanity checks to shreds (#21675) 2021-12-09 16:43:57 +00:00
Ashwin Sekar f0acf7681e
Add vote instructions that directly update on chain vote state (#21531)
* Add vote state instructions

UpdateVoteState and UpdateVoteStateSwitch

* cargo tree

* extract vote state version conversion to common fn
2021-12-07 16:47:26 -08:00
behzad nouri cd17f63d81
adds back position field to coding-shred-header (#21600)
https://github.com/solana-labs/solana/pull/17004
removed position field from coding-shred-header because as it stands the
field is redundant and unused.
However, with the upcoming changes to erasure coding schema this field
will no longer be redundant and needs to be populated.
2021-12-05 14:42:09 +00:00
Jeff Biseda 9c6b95e1e1
fix distance calculation in get_closest_completion (#21601) 2021-12-03 22:36:46 -08:00
Justin Starry 1430b58a6d
Remove deprecated slow epoch boundary methods (#21568) 2021-12-03 17:59:10 +00:00
Michael Vines b8837c04ec Reformat imports to a consistent style for imports
rustfmt.toml configuration:
  imports_granularity = "One"
  group_imports = "One"
2021-12-03 09:19:13 -08:00
Alexander Meißner b78f5b6032
Refactor: Cleanup InstructionProcessor (#21404)
* Moves create_message(), native_invoke() and process_cross_program_instruction()
from the InstructionProcessor to the InvokeContext so that they can have a useful "self" parameter.

* Moves InstructionProcessor into InvokeContext and Bank.

* Moves ExecuteDetailsTimings into its own file.

* Moves Executor into invoke_context.rs

* Moves PreAccount into its own file.

* impl AbiExample for BuiltinPrograms
2021-12-01 08:54:42 +01:00
Michael Vines ba9dfa0d22 Remove frozen account support 2021-11-29 08:38:11 -08:00
Tao Zhu 9edfc5936d
Refactor accounts.rs with Justin's comments to improve lock accounts (#21406)
with results code path.
- fix a bug that could unlock accounts that weren't locked
- add test to the refactored function
- skip enumerating transaction accounts if qos results is an error
- add #[must_use] annotation
- avoid clone error in results
- add qos error code to unlock_accounts match statement
- remove unnecessary AbiExample
2021-11-23 21:17:55 +00:00
Lijun Wang c29838fce1
Accountsdb plugin transaction part 3: Transaction Notifier (#21374)
The TransactionNotifierInterface interface for notifying transactions.
Changes to transaction_status_service to notify the notifier of the transaction data.
Interface to query the plugin's interest in transaction data
2021-11-23 09:55:53 -08:00
Tao Zhu 2602e7c3bc
Fix flaky test (#21402)
* the async test is flaky on ci

* fix unstable test by increasing stats repoting time
2021-11-23 09:47:17 -06:00
behzad nouri dd338b6c9f
changes Shred::parent return type to Option<Slot> (#21370)
Shred::parent can return garbage if the struct fields are invalid:
https://github.com/solana-labs/solana/blob/8a50b6302/ledger/src/shred.rs#L446-L453

The commit adds more sanity checks and changes the return type to Option<Slot>.
2021-11-23 14:45:26 +00:00
Tao Zhu cd5a39ee43
the async test is flaky on ci (#21365) 2021-11-22 18:16:20 -06:00
Jeff Washington (jwash) 87831e7f8d
start system monitor earlier in validator so we get memory stats at startup (#21372) 2021-11-22 14:37:17 -06:00
sakridge f31ca8ba8c
Report cluster slots size (#21380) 2021-11-22 17:47:58 +01:00
Jeff Biseda 2ed7e3af89
prioritize slot repairs for unknown last index and close to completion (#21070) 2021-11-19 19:17:30 -08:00
sakridge 0bda0c3e0c
Add bank drop service (#21322) 2021-11-19 17:20:18 +01:00
behzad nouri 48dfdfb4d5 changes Blockstore::is_shred_duplicate arg type to ShredType 2021-11-19 14:16:39 +00:00
behzad nouri 57057f8d39 uses enum for shred type
Current code is using u8 which does not have any type-safety and can
contain invalid values:
https://github.com/solana-labs/solana/blob/66fa062f1/ledger/src/shred.rs#L167

Checks for invalid shred-types are scattered through the code:
https://github.com/solana-labs/solana/blob/66fa062f1/ledger/src/blockstore.rs#L849-L851
https://github.com/solana-labs/solana/blob/66fa062f1/ledger/src/shred.rs#L346-L348

The commit uses enum for shred type with #[repr(u8)]. Backward
compatibility is maintained by implementing Serialize and Deserialize
compatible with u8, and adding a test to assert that.
2021-11-19 14:16:39 +00:00
carllin b30c94ce55
ClusterInfoVoteListener send only missing votes to BankingStage (#20873) 2021-11-18 15:20:41 -08:00
Tao Zhu 0ca255220e
- Encapsulate QoS Service metrics reporting within QosServioce, so client (#21191)
code (eg banking_stage) doesn't need to worry about it.
- Remove dead cost_* stats from banking_stage, clean up call path.
2021-11-18 15:35:30 -06:00
Lijun Wang 89c45a57f8
Refactor slot status notification to decouple from accounts notifications (#21308)
Problem

Slot status can be used of in other scenarios in addition to account information such as transactions, blocks. The current implementation is too tightly coupled.

Summary of Changes

Decouple the slot status notification from accounts notification. Created a new slot status notification module.
2021-11-17 17:11:38 -08:00
Jeff Biseda d5de0c8e12
add --no-os-network-stats-reporting option (#21296) 2021-11-16 10:26:03 -08:00
sakridge 398af132a5
More set_root metrics (#21286) 2021-11-15 16:28:18 -07:00
Jeff Washington (jwash) f2bd9947cc
mem stats: rescale from kb to bytes (#21282) 2021-11-15 14:42:41 -06:00
Jeff Washington (jwash) f8dcb2f38b
report mem stats (#21258) 2021-11-13 00:59:41 +00:00
Michael Keleti b0ca335463
Rename "trusted" to "known" in `validators/` (#21197)
* Replaced trusted with known validator

* Format Convention
2021-11-12 11:57:55 -07:00
Tao Zhu 11153e1f87
refactor cost calculation (#21062)
* - cache calculated transaction cost to allow sharing;
- atomic cost tracking op;
- only lock accounts for transactions eligible for current block;
- moved qos service and stats reporting to its own model;
- add cost_weight default to neutral (as 1), vote has zero weight;

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Update core/src/qos_service.rs

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Update core/src/qos_service.rs

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-11-12 01:04:53 -06:00
Ivan Mironov c78f474373 Add validator option to change niceness of snapshot packager thread 2021-11-04 17:16:46 -06:00
Alexander Meißner 7200c5106e
Replaces MockInvokeContext by ThisInvokeContext in tests (#20881)
* Replaces MockInvokeContext by ThisInvokeContext in BpfLoader, SystemInstructionProcessor, CLIs, ConfigProcessor, StakeProcessor and VoteProcessor.

* Finally, removes MockInvokeContext, MockComputeMeter and MockLogger.

* Adjusts assert_instruction_count test.

* Moves ThisInvokeContext to the program-runtime crate.
2021-11-04 21:47:32 +01:00
Justin Starry 140a5f633d
Simplify replay vote tracking by using packet metadata (#21112) 2021-11-03 09:02:48 +00:00
steviez e6280fc1fa
Add additional checks for should_retransmit_and_persist() (#20672)
Add additional checks to should_retransmit_and_persist()

- Check invalid shred index
- Update cases that check if node was leader
- Some comments and variable rename for clarity
2021-11-03 02:01:07 -05:00
sakridge a8d78e89d3
Move test-validator to own module to reduce core dependencies (#20658)
* Move test-validator to own module to reduce core dependencies

* Fix a few TestValidator paths

* Use solana_test_validator crate for solana_test_validator bin

* Move client int tests to separate crate

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-10-29 01:27:07 +00:00
Justin Starry 036d7fcc81
Clean up sanitized tx creation for tests (#21006) 2021-10-27 18:09:16 +01:00
sakridge 261dd96ae3
Swap banking stage vote channels (#20987) 2021-10-26 21:20:31 +02:00
behzad nouri 1297a13586
adds metrics tracking crds writes and votes (#20953) 2021-10-26 13:02:30 +00:00
Jeff Washington (jwash) 43ea579f63
add cli for --accounts-hash-num-passes (#20827) 2021-10-25 09:45:46 -05:00
Tao Zhu c2bfce90b3
- cost_tracker is data member of a bank, it can report metrics when bank is frozen (#20802)
- removed cost_tracker_stats and histogram
- move stats reporting outside of bank freeze
2021-10-24 22:19:23 -05:00
behzad nouri 5e1cf39c74
adds metrics for number of outgoing shreds in retransmit stage (#20882) 2021-10-24 13:12:27 +00:00
Michael Vines 350bb561eb Clippy 2021-10-23 08:21:20 +00:00
Jack May bfbbc53dac
Divorce the runtime from FeeCalculator (#20737) 2021-10-22 14:32:40 -07:00
Justin Starry 735016661b
Report timing info for stakes cache updates from txs (#20856) 2021-10-22 12:49:02 -04:00
Tao Zhu 71d0bd4605
Add counter for dropped duplicated packets, fix dropped_packets_count (#20834) 2021-10-21 02:56:48 +00:00
Trent Nelson fe098b5ddc rpc-send-tx-svc: add with_config constructor 2021-10-20 13:43:27 -06:00
Jeff Washington (jwash) 95e91a4863
disable gossip publish of snapshots when using filler accts (#20824) 2021-10-20 18:07:29 +00:00
Tao Zhu 7496b5784b
- make cost_tracker a member of bank, remove shared instance from TPU; (#20627)
- decouple cost_model from cost_tracker; allowing one cost_model
  instance being shared within a validator;
- update cost_model api to calculate_cost(&self...)->transaction_cost
2021-10-19 14:37:33 -05:00
Jeff Biseda 4cac66244d
report udp stats from validator (#20587) 2021-10-15 15:11:11 -07:00
carllin 44ff30b65b
Retry `SampleNotDuplicateConfirmed` decisions in AncestorHashesService (#20240) 2021-10-15 11:40:03 -07:00
behzad nouri 0f03971c3c
adds counters for errors in window-service run_insert (#20670) 2021-10-15 14:13:26 +00:00
behzad nouri 0c0384ec32
revises turbine peers shuffling order (#20480)
Turbine randomly shuffles cluster nodes on a broadcast tree for each
shred. This requires knowing the stakes and nodes' contact-infos (from
gossip).

However gossip is subject to partitioning and propogation delays.
Additionally unstaked nodes may join and leave the cluster at any
moment, changing the cluster view from one node to another.

This commit:
* Always arranges the unstaked nodes at the bottom of turbine broadcast
  tree.
* Staked nodes are always included regardless of if their contact-info
  is available in gossip or not.
* Uses the unbiased WeightedShuffle construct for shuffling nodes.
2021-10-14 15:09:36 +00:00
sakridge 588168b99d
Add check for shred data header size (#20668) 2021-10-14 05:56:14 +02:00
Jack May da45be366a
Remove blockhash from fee calculation (#20641) 2021-10-13 13:10:58 -07:00
Tao Zhu 005d6863fd
- move cost tracker into bank, so each bank has its own cost tracker; (#20527)
- move related modules to runtime
2021-10-12 08:51:33 -05:00
Jeff Washington (jwash) a8e000a2a6
add filler accounts to bloat validator and predict failure (#20491)
* add filler accounts to bloat validator and predict failure

* assert no accounts match filler

* cleanup magic numbers

* panic if can't load from snapshot with filler accounts specified

* some renames

* renames

* into_par_iter

* clean filler accts, too
2021-10-11 12:46:27 -05:00
Michael Vines c16510152e Rework AVX/AVX2 detection again 2021-10-10 12:22:10 -07:00
carllin 838ff3b871
Separate out interrupted slots broadcast metrics (#20537) 2021-10-09 01:46:06 -07:00
Lijun Wang d621994fee
Accountsdb stream plugin improvement (#20419)
Support using connection pooling and use multiple threads to do Postgres db operations. The performance is improved from 1500 RPS to 40,000 RPS measured during validator start.

Support multiple plugins at the same time.
2021-10-08 20:06:58 -07:00
Brooks Prumo 5440c1d2e1
SnapshotPackagerService pushes incremental snapshot hashes to CRDS (#20442)
Now that CRDS supports incremental snapshot hashes,
SnapshotPackagerService needs to push 'em!

This commit does two main things:

1. SnapshotPackagerService now knows about incremental snapshot hashes,
   and will push SnapshotPackage::IncrementalSnapshot hashes to CRDS.
2. At startup, when loading from a full + incremental snapshot, the
   hashes need to be passed all the way to SnapshotPackagerService so it
   can push these starting hashes to CRDS.  Those values have been piped
   through.

Fixes #20441 and #20423
2021-10-08 15:14:56 -05:00
Tao Zhu 675fa6993b
- update const cost values with data collected by #19627 (#20314)
- update cost calculation to closely proposed fee schedule #16984
2021-10-08 14:48:50 -05:00
Tao Zhu 0ebd8c53ee
cost model to ignore vote transactions (#20510) 2021-10-07 12:49:07 -05:00
Tao Zhu 177a375479
Tpu vote 1.7 (#20187) (#20494)
* Add separate vote processing tpu port

* Add feature to send to tpu vote port

* Add vote rejecting sigverify mode

* use packet.meta.is_simple_vote_tx in place of deserialization

* consolidate code that identifies vote tx atcommon path for cpu and gpu

* new key for feature set

* banking forward tpu vote

* add tpu vote port to dockerfile and other review changes

* Simplify thread id compare

* fix a test; updated cluster_info ABI change

Co-authored-by: Tao Zhu <tao@solana.com>

Co-authored-by: sakridge <sakridge@gmail.com>
2021-10-07 09:38:23 +00:00
Michael Vines 7027d56064 Resolve nightly-2021-10-05 clippy complaints 2021-10-06 10:37:58 -07:00
Tao Zhu 03913f6661
add tx count and thread id to stats, each stat reports and resets when slot changes (#20451) 2021-10-06 00:09:19 -05:00
Justin Starry 129716f3f0
Optimize stakes cache and rewards at epoch boundaries (#20432)
* Optimize stakes cache and rewards at epoch boundaries

* Fetch from accounts db

* Add cli flag for disabling epoch boundary optimization
2021-10-06 00:53:26 -04:00
Tao Zhu 6ff508c643
add transaction cost histogram metrics (#20350) 2021-10-05 08:57:39 -05:00
Brooks Prumo 4cd50f5d45
Don't gossip more snapshot hashes than what we retain (#20379) 2021-10-01 15:59:45 -05:00
Lijun Wang fe97cb2ddf
AccountsDb plugin framework (#20047)
Summary of Changes

Create a plugin mechanism in the accounts update path so that accounts data can be streamed out to external data stores (be it Kafka or Postgres). The plugin mechanism allows

Data stores of connection strings/credentials to be configured,
Accounts with patterns to be streamed
PostgreSQL implementation of the streaming for different destination stores to be plugged in.

The code comprises 4 major parts:

accountsdb-plugin-intf: defines the plugin interface which concrete plugin should implement.
accountsdb-plugin-manager: manages the load/unload of plugins and provide interfaces which the validator can notify of accounts update to plugins.
accountsdb-plugin-postgres: the concrete plugin implementation for PostgreSQL
The validator integrations: updated streamed right after snapshot restore and after account update from transaction processing or other real updates.
The plugin is optionally loaded on demand by new validator CLI argument -- there is no impact if the plugin is not loaded.
2021-09-30 14:26:17 -07:00
Jeff Biseda 3854cfaa00
Use batch_send in forward_buffered_packets (#20330) 2021-09-29 20:49:43 -07:00
sakridge 94668c95c2
Prune sigverify queue (#20331) 2021-09-30 05:41:05 +02:00
Brooks Prumo 3ea6a01254
Only gossip snapshot hashes for full snapshots (#20271) 2021-09-27 19:29:08 -05:00
Jeff Biseda 640e93187c
periodically report sigverify_stage stats (#19674) 2021-09-21 10:37:58 -07:00
sakridge 013e1d9d49
Limit transaction forwarding from banking_stage (#19940) 2021-09-21 08:49:41 -07:00
carllin e6b4dd3866
Add bank to banking stage regardless of if there is a working bank (#19855) 2021-09-17 16:55:53 -07:00
Pavel Strakhov 65227f44dc
Optimize RPC pubsub for multiple clients with the same subscription (#18943)
* reimplement rpc pubsub with a broadcast queue

* update tests for new pubsub implementation

* fix: fix review suggestions

* chore(rpc): add additional pubsub metrics

* integrate max subscriptions check into SubscriptionTracker to reduce locking

* separate subscription control from tracker

* limit memory usage of items in pubsub broadcast queue, improve error handling

* add more pubsub metrics

* add final count metrics to pubsub

* add metric for total number of subscriptions

* fix small review suggestions

* remove by_params from SubscriptionTracker and add node_progress_watchers map instead

* add subscription tracker tests

* add metrics for number of pubsub notifications as a counter

* ignore clippy lint in TokenCounter

* fix underflow in token counter

* reduce queue capacity in pubsub tests

* fix(rpc): fix test timeouts

* fix race in account subscription test

* Add RpcSubscriptions::new_for_tests

Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-09-17 13:40:14 -06:00
sakridge dc69cc1ae4
Only allow votes when root distance gets too high (#19917) 2021-09-16 15:12:26 +02:00
Justin Starry ca3f147670
Add banking metrics for buffered and dropped packets (#19902) 2021-09-15 15:53:55 -05:00
Tao Zhu 67fa9945e1
Add few more metrics data points (#19624)
* Add slot, count and accumulated-units to per-program-timings for determining transaction cost elements

* correct the stats naming; fixes the dirty bit resetting
2021-09-15 09:49:49 -05:00
Justin Starry 34c1a9ac85
Report consumed_buffered_packets_count stat to metrics (#19900) 2021-09-15 14:19:39 +00:00
Tyera Eulberg c91519961c
Use f64 for stake math in get_stake_percent_in_gossip (#19895) 2021-09-14 23:36:30 -06:00
Michael 4ff50519ff
Add an info log to indicate the node has reached supermajority and print the active stake percentage (#19893) 2021-09-14 21:48:15 -06:00
Jeff Washington (jwash) b57e86abf2
cache account hash info (#19426)
* cache account hash info

* ledger_path -> accounts_hash_cache_path
2021-09-13 20:39:26 -05:00
carllin 87a7f00926
Track reset bank in PohRecorder (#19810) 2021-09-13 16:55:35 -07:00
Brooks Prumo 62c8bcf565
Add default() to SnapshotConfig (#19776) 2021-09-12 13:44:27 -05:00
Brooks Prumo 7aa5f6b833
Add CLI args for incremental snapshots (#19694)
Add `--incremental-snapshots` flag to enable incremental snapshots.
This will allow setting `--full-snapshot-interval-slots` and
`--incremental-snapshot-interval-slots`.

Also added `--maximum-incremental-snapshots-to-retain`.

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-09-10 15:59:26 -05:00
Michael Vines 4386e09710 Reduce wait for supermajority threshold back to 80% 2021-09-09 21:17:35 -07:00
sakridge 3a8c678f62
Remove some copying (#19691) 2021-09-08 18:32:38 +02:00
Jeff Washington (jwash) 456bf15012
AccountsIndexConfig -> AccountsDbConfig (#19687) 2021-09-08 04:30:38 +00:00
Jeff Washington (jwash) d3f938f0cf
Remove Copy from AccountsIndexConfig. Not all types will support it (#19686) 2021-09-07 20:09:40 -05:00
Brooks Prumo a0552e5b46
Make startup aware of Incremental Snapshots (#19600) 2021-09-07 20:43:43 +00:00
behzad nouri 01a7ec8198
uses rayon thread-pool for retransmit-stage parallelization (#19486) 2021-09-07 15:15:01 +00:00
Brooks Prumo fe8ba81ce6
Rename to is_valid instead of is_invalid (#19670) 2021-09-07 09:31:54 -05:00
Brooks Prumo 9d9482b9d8
Plumb `maximum_incremental_snapshot_archives_to_retain` (#19640) 2021-09-06 18:01:56 -05:00
Sean Young d461a9ac10 verify_precompiles needs FeatureSet
Rather than pass in individual features, pass in the entire feature set
so that we can add the ed25519 program feature in a later commit.
2021-09-05 18:59:37 +01:00
Tyera Eulberg decec3cd8b
Demote write locks on transaction program ids (#19593)
* Add feature

* Demote write lock on program ids

* Fixup bpf tests

* Update MappedMessage::is_writable

* Comma nit

* Review comments
2021-09-04 03:05:30 +00:00
Brooks Prumo 1828579580
Pass SnapshotConfig to SnapshotPackagerService (#19616) 2021-09-03 21:42:32 +00:00
Brooks Prumo 5e25ee5ebe
Add maximum_incremental_snapshot_archives_to_retain to SnapshotConfig (#19612) 2021-09-03 20:21:32 +00:00
Brooks Prumo 7ab0aec61f
Rename maximum_full_snapshot_archives_to_retain (#19610)
To prepare for adding maximum_incremental_snapshot_archives_to_retain,
rename the current field in SnapshotConfig.
2021-09-03 11:28:10 -05:00
Brooks Prumo e9374d32a3
Revert "Make startup aware of Incremental Snapshots (#19550)" (#19599)
This reverts commit d45ced0a5d.
2021-09-02 19:14:41 -05:00
Brooks Prumo d45ced0a5d
Make startup aware of Incremental Snapshots (#19550) 2021-09-02 19:05:15 -05:00
Jeff Biseda 7a8eba10b2
add synchronization comment to handle_new_root (#19571) 2021-09-02 13:52:14 -07:00
Lijun Wang 8378e8790f
Accountsdb replication installment 2 (#19325)
This is the 2nd installment for the AccountsDb replication.

Summary of Changes

The basic google protocol buffer protocol for replicating updated slots and accounts. tonic/tokio is used for transporting the messages.

The basic framework of the client and server for replicating slots and accounts -- the persisting of accounts in the replica-side will be done at the next PR -- right now -- the accounts are streamed to the replica-node and dumped. Replication for information about Bank is also not done in this PR -- to be addressed in the next PR to limit the change size.

Functionality used by both the client and server side are encapsulated in the replica-lib crate.

There is no impact to the existing validator by default.

Tests:

Observe the confirmed slots replicated to the replica-node.
Observe the accounts for the confirmed slot are received at the replica-node side.
2021-09-01 14:10:16 -07:00
behzad nouri 6d9818b8e4
skips retransmit for shreds with unknown slot leader (#19472)
Shreds' signatures should be verified before they reach retransmit
stage, and if the leader is unknown they should fail signature check.
Therefore retransmit-stage can as well expect to know who the slot
leader is and otherwise just skip the shred.

Blockstore checking signature of recovered shreds before sending them to
retransmit stage:
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/blockstore.rs#L884-L930

Shred signature verifier:
https://github.com/solana-labs/solana/blob/4305d4b7b/core/src/sigverify_shreds.rs#L41-L57
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/sigverify_shreds.rs#L105
2021-09-01 15:44:26 +00:00
Brooks Prumo 1d5a8ebc6a
Revert "Add LastFullSnapshotSlot to SnapshotConfig (#19341)" (#19529)
This reverts commit 4d361af976.
2021-08-31 22:03:19 -05:00
Brooks Prumo fe9ee9134a
Make background services aware of incremental snapshots (#19401)
AccountsBackgroundService now knows about incremental snapshots.  It is
now also in charge of deciding if an AccountsPackage is destined to be a
SnapshotPackage or not (or just used by AccountsHashVerifier).

!!! New behavior changes !!!

Taking snapshots (both bank and archive) **MUST** succeed.

This is required because of how the last full snapshot slot is
calculated, which is used by AccountsBackgroundService when calling
`clean_accounts()`.

File system calls are now unwrapped and will result in a crash. As Trent told me:

>Well I think if a snapshot fails due to some IO error, it's very likely that the operator is going to have to intervene before it works.  We should exit error in this case, otherwise the validator might happily spin for several more hours, never successfully writing a complete snapshot, before something else brings it down.  This would leave the validator's last local snapshot many more slots behind than it would be had we exited outright and potentially force the operator to abandon ledger continuity in favor of a quick catchup

Other errors will set the `exit` flag to `true`, and the node will gracefully shutdown.

Fixes #19167 
Fixes #19168
2021-08-31 18:33:27 -05:00
Tyera Eulberg a3bef2e537
Fix shreds-to-hours/days estimations (#19477) 2021-08-30 13:16:06 -06:00
behzad nouri 8ad52fa095
implements copy-on-write for vote-accounts (#19362)
Bank::vote_accounts redundantly clones vote-accounts HashMap even though
an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/95c998a19/runtime/src/bank.rs#L5174-L5186

This commit implements copy-on-write semantics for vote-accounts by
wrapping the underlying HashMap in Arc<...>.
2021-08-30 15:54:01 +00:00
carllin 84db04ce6c
Fix duplicate broadcast test (#19365) 2021-08-27 17:53:24 -07:00
Justin Starry 2d7f036afd
Add solana-program-runtime crate (#19438) 2021-08-27 00:30:36 +00:00
Brooks Prumo 6d939811e9
Name snapshots consistently (#19346)
#### Problem

Snapshot names are overloaded, and there are multiple terms that mean the same thing. This is confusing. Here's a list of ones in the codebase that I've found:

```
- snapshot_dir
- snapshots_dir
- snapshot_path
- snapshot_output_dir
- snapshot_package_output_path
- snapshot_archives_dir
```

#### Summary of Changes

For all the ones that are about the directory where snapshot archives are stored, ensure they are `snapshot_archives_dir`. For the ones about the (bank) snapshots directory, set to `bank_snapshots_dir`.


Co-authored-by: Michael Vines <mvines@gmail.com>
2021-08-21 15:41:03 -05:00
Brooks Prumo 4d361af976
Add LastFullSnapshotSlot to SnapshotConfig (#19341) 2021-08-20 17:06:53 +00:00
behzad nouri 1deb4add81
removes Slot from TransmitShreds (#19327)
An earlier version of the code was funneling through stakes along with
shreds to broadcast:
https://github.com/solana-labs/solana/blob/b67ffab37/core/src/broadcast_stage.rs#L127

This was changed to only slots as stakes computation was pushed further
down the pipeline in:
https://github.com/solana-labs/solana/pull/18971

However shreds themselves embody which slot they belong to. So pairing
them with slot is redundant and adds rooms for bugs should they become
inconsistent.
2021-08-20 13:48:33 +00:00
Trent Nelson e0bc5fa690 validator: Trusted validators are now called known validators 2021-08-19 22:43:49 -06:00
Jack May 3ec33e7d02
Fail secp256k1 if the instruction data looks incorrect (#19300) 2021-08-19 13:13:54 -07:00
Tao Zhu 4982dc20f9
replace function with const var for better readability (#19285) 2021-08-19 14:59:53 -05:00
Justin Starry c50b01cb60
Store versioned transactions in the ledger, disabled by default (#19139)
* Add support for versioned transactions, but disable by default

* merge conflicts

* trent's feedback

* bump Cargo.lock

* Fix transaction error encoding

* Rename legacy_transaction method

* cargo clippy

* Clean up casts, int arithmetic, and unused methods

* Check for duplicates in sanitized message conversion

* fix clippy

* fix new test

* Fix bpf conditional compilation for message module
2021-08-17 15:17:56 -07:00
Jeff Washington (jwash) 7c70f2158b
accounts_index_bins to AccountsIndexConfig (#19257)
* accounts_index_bins to AccountsIndexConfig

* rename param bins -> config

* rename BINS_FOR* to ACCOUNTS_INDEX_CONFIG_FOR*
2021-08-17 14:50:01 -05:00
Brooks Prumo f9986c66b8
Make SnapshotPackagerService aware of Incremental Snapshots (#19254)
Add a field to SnapshotPackage that is an enum for SnapshotType, so archive_snapshot_package() will do the right thing.

Fixes #19166
2021-08-17 13:01:59 -05:00
behzad nouri 7a8807b8bb retransmits shreds recovered from erasure codes
Shreds recovered from erasure codes have not been received from turbine
and have not been retransmitted to other nodes downstream. This results
in more repairs across the cluster which is slower.

This commit channels through recovered shreds to retransmit stage in
order to further broadcast the shreds to downstream nodes in the tree.
2021-08-17 13:44:10 +00:00
behzad nouri 3efccbffab sends shreds (instead of packets) to retransmit stage
Working towards channelling through shreds recovered from erasure codes
to retransmit stage.
2021-08-17 13:44:10 +00:00
behzad nouri 6e413331b5 removes erroneous uses of Arc<...> from retransmit stage 2021-08-17 13:44:10 +00:00
behzad nouri 8198a7eae1 adds packet/shred count stats to window-service
Adding back these metrics from the earlier commit which removed them
from retransmit stage.
2021-08-17 13:44:10 +00:00
behzad nouri bf437b0336 removes packet-count metrics from retransmit stage
Working towards sending shreds (instead of packets) to retransmit stage
so that shreds recovered from erasure codes are as well retransmitted.

Following commit will add these metrics back to window-service, earlier
in the pipeline.
2021-08-17 13:44:10 +00:00
behzad nouri 563aec0b4d
discards epoch-slots epochs ahead of the current root (#19256)
Cross cluster gossip contamination is causing cluster-slots hash map to
contain a lot of bogus values and consume too much memory:
https://github.com/solana-labs/solana/issues/17789

If a node is using the same identity key across clusters, then these
erroneous values might not be filtered out by shred-versions check,
because one of the variants of the contact-info will have matching
shred-version:
https://github.com/solana-labs/solana/issues/17789#issuecomment-896304969

The cluster-slots hash-map is bounded and trimmed at the lower end by
the current root. This commit also discards slots epochs ahead of the
root.
2021-08-17 13:13:28 +00:00
behzad nouri f33b7abffb
adds back cluster partitions to broadcast-duplicates (#19253)
An earlier version of this code was aiming to create a partition by
manipulating stakes, and setting some of them to zero:
https://github.com/solana-labs/solana/blob/cde146155/core/src/broadcast_stage/broadcast_duplicates_run.rs#L65-L116

https://github.com/solana-labs/solana/pull/18971
moved stakes computation further down the stream, and so that logic
could no longer live there. This commit adds back cluster partitions
by intercepting packets before send.
2021-08-16 22:24:30 +00:00
Michael Vines 3e5ba594e0 Revert `TestValidatorGenesis::start()` to v1.7.8 signature; add `TestValidatorGenesis::start_with_socket_addr_space()` 2021-08-16 06:37:23 +00:00
Michael Vines b15fa9fbd2 Add EtcdTowerStorage 2021-08-14 09:46:36 -07:00
carllin 22674000bd
Add EpochSlots frozen state transition (#19112) 2021-08-13 14:21:52 -07:00
Brooks Prumo 176036aa58
Rename AccountsPacakge to SnapshotPackage and AccountsPackagePre to AccountsPackage (#19231)
Renaming these types to better communicate their usages, which will
further diverge as incremental snapshot support is added.

With the new names, AccountsPacakge now refers to the type between
AccountsBackgroundProcess and AccountsHashVerifier, and SnapshotPackage
refers to the type between AccountsHashVerifier and
SnapshotPackagerService.
2021-08-13 16:08:09 -05:00
behzad nouri b64eeb7729 removes erroneous uses of &Arc<...> from window-service 2021-08-13 17:26:31 +00:00
behzad nouri d57398a959 removes repeated bank-forks locking in window-service
Window service is repeatedly locking bank-forks to look-up working-bank
for every single shred:
https://github.com/solana-labs/solana/blob/5fde4ee3a/core/src/window_service.rs#L597-L606

This commit updates shred_filter signature in recv_window so that where
we already obtain the lock on bank-forks, we can also look-up
working-bank once for all packets:
https://github.com/solana-labs/solana/blob/5fde4ee3a/core/src/window_service.rs#L256-L277
2021-08-13 17:26:31 +00:00
Jack May 0b50bb2b20
Deprecate FeeCalculator returning APIs (#19120) 2021-08-13 09:08:20 -07:00
behzad nouri 7a789e0763
filters for recent contact-infos when checking for live stake (#19204)
Contact-infos are saved to disk:
https://github.com/solana-labs/solana/blob/9dfeee299/gossip/src/cluster_info.rs#L1678-L1683

and restored on validator start-up:
https://github.com/solana-labs/solana/blob/9dfeee299/core/src/validator.rs#L450

Staked nodes entries will not expire until an epoch after. So when the
validator checks for online stake it is erroneously picking up
contact-infos restored from disk, which breaks the entire
wait-for-supermajority logic:
https://github.com/solana-labs/solana/blob/9dfeee299/core/src/validator.rs#L1515-L1561

This commit adds an extra check for the age of contact-info entries and
filters out old ones.
2021-08-13 12:12:40 +00:00
Tao Zhu 414d904959
Reject blocks for costs above the max block cost (#18994)
* added realtime cost checking logic to reject block that would exceed max limit:
- defines max limits at block_cost_limits.rs
- right after each bath's execution, accumulate its cost and check again
  limit, return error if limit is exceeded

* update abi that changed due to adding additional TransactionError

* To avoid counting stats mltiple times, only accumulate execute-timing when a bank is completed

* gate it by a feature

* move cost const def into block_cost_limits.rs

* redefine the cost for signature and account access, removed signer part as it is not well defined for now

* check if per_program_timings of execute_timings before sending
2021-08-12 10:48:47 -05:00
Jeff Washington (jwash) e91988c977
cli for num account index bins (#19085) 2021-08-11 11:45:25 -05:00
Michael Vines 7ddda30126 `solana-test-validator` now uses FileTowerStorage 2021-08-11 00:20:46 -07:00
Michael Vines e9722474eb Move tower storage into its own module 2021-08-11 00:20:46 -07:00
Michael Vines d7ab510229 Move tower save into the VotingService 2021-08-11 00:20:46 -07:00
behzad nouri 00e5e12906 renames solana_runtime::vote_account::VoteAccount
Rename:
  VoteAccount    -> VoteAccountInner  # the private type
  ArcVoteAccount -> VoteAccount       # the public type
2021-08-10 22:54:17 +00:00
Brooks Prumo ccfa82461b
Pass SnapshotConfig to AccountsHashVerifier (#19154)
AccountsHashVerifier will need access to both the full and incremental
snapshot archive interval slots config values, which is in the
SnapshotConfig.

Also, cleanup some `Option<>` params and their references.
2021-08-10 14:02:34 -05:00
Brooks Prumo fd937548a0
Move SnapshotArchiveInfo and friends into its own module (#19114) 2021-08-08 07:57:06 -05:00
Brooks Prumo 00890957ee
Add snapshot_utils::bank_from_latest_snapshot_archives() (#18983)
While reviewing PR #18565, as issue was brought up to refactor some code
around verifying the bank after rebuilding from snapshots.  A new
top-level function has been added to get the latest snapshot archives
and load the bank then verify.  Additionally, new tests have been
written and existing tests have been updated to use this new function.

Fixes #18973

While resolving the issue, it became clear there was some additional
low-hanging fruit this change enabled.  Specifically, the functions
`bank_to_xxx_snapshot_archive()` now return their respective
`SnapshotArchiveInfo`.  And on the flip side,
`bank_from_snapshot_archives()` now takes `SnapshotArchiveInfo`s instead
of separate paths and archive formats.  This bundling simplifies bank
rebuilding.
2021-08-06 20:16:06 -05:00
Michael Vines 397801a2d8 Extract tower storage details from Tower struct 2021-08-06 10:04:37 -07:00
behzad nouri e4be00fece falls back on working-bank if root-bank::epoch-staked-nodes is none
bank.get_leader_schedule_epoch(shred_slot)
is one epoch after epoch_schedule.get_epoch(shred_slot).

At epoch boundaries, shred is already one epoch after the root-slot. So
we need epoch-stakes 2 epochs ahead of the root. But the root bank only
has epoch-stakes for one epoch ahead, and as a result looking up epoch
staked-nodes from the root-bank fails.

To be backward compatible with the current master code, this commit
implements a fallback on working-bank if epoch staked-nodes obtained
from the root-bank is none.
2021-08-05 21:47:33 +00:00
behzad nouri eaf927cf49 allows only one thread to update cluster-nodes cache entry for an epoch
If two threads simultaneously call into ClusterNodesCache::get for the
same epoch, and the cache entry is outdated, then both threads recompute
cluster-nodes for the epoch and redundantly overwrite each other.

This commit wraps ClusterNodesCache entries in Arc<Mutex<...>>, so that
when needed only one thread does the computations to update the entry.
2021-08-05 21:47:33 +00:00