Commit Graph

3081 Commits

Author SHA1 Message Date
Jeff Biseda c6cd96635f
get_best_weighted_repairs parameter cleanup (#30010) 2023-01-31 03:12:25 -08:00
Jeff Biseda 6163a6c279
restructure repair decode error handling (#29977) 2023-01-31 02:44:58 -08:00
Xiang Zhu 856598969c
Account path add run parent with old path cleanup (#29942)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explict type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation

* Clean up files in the old account_path before trasitioning to the new run path

* try_exist and accounts_dir removing extra

* sync rmdir, is_dir check

* handle the account_path not deletable case
2023-01-30 10:26:43 -08:00
Jeff Biseda 7cacbdcca2
track repair handle_requests time (#29940) 2023-01-27 15:50:18 -08:00
behzad nouri 7f173ce7c7
feature gates merkle shreds on all clusters (#29957) 2023-01-27 21:02:51 +00:00
behzad nouri efb8a53b28
removes staked-nodes updater service excessive locks on gossip (#29936) 2023-01-26 23:31:35 +00:00
Andrew Fitzgerald fbb90603a9
BankingStage Refactor: Separate transaction commiting module (#29808)
Separate transaction commiting module
2023-01-25 19:02:21 -08:00
Kirill Fomichev b4d1769688
geyser: add parent slot/blockhash to block (#29855) 2023-01-25 14:20:24 -08:00
steviez fa39bfef6b
Move Deduper into a separate file (#29891) 2023-01-25 15:34:53 -06:00
Andrew Fitzgerald 704472ae13
BankingStage Refactor: Separate Forwarder Module (#29402)
Separate Forwarder module
2023-01-25 12:31:59 -08:00
Xiang Zhu 4ebcacb4a3
Revert "Add run parent directory for accounts files (#29794)" (#29899)
This PR is causing OOM on master.  Reverting it for now.

This reverts commit 74f89d1494.
2023-01-25 10:03:01 -08:00
Ryo Onodera 40bbf99c74
Add fully-reproducible online tracer for banking (#29196)
* Add fully-reproducible online tracer for banking

* Don't use eprintln!()...

* Update programs/sbf/Cargo.lock...

* Remove meaningless assert_eq

* Group test-only code under aptly named mod

* Remove needless overflow handling in receive_until

* Delay stat aggregation as it's possible now

* Use Cow to avoid needless heap allocs

* Properly consume metrics action as soon as hold

* Trace UnprocessedTransactionStorage::len() instead

* Loosen joining api over type safety for replaystage

* Introce hash event to override these when simulating

* Use serde_with/serde_as instead of hacky workaround

* Update another Cargo.lock...

* Add detailed comment for Packet::buffer serialize

* Rename sender_overhead_minimized_receiver_loop()

* Use type interference for TraceError

* Another minor rename

* Retire now useless ForEach to simplify code

* Use type alias as much as possible

* Properly translate and propagate tracing errors

* Clarify --enable-banking-trace with better naming

* Consider unclean (signal-based) node restarts..

* Tweak logging and cli

* Remove Bank events as it's not needed anymore

* Make tpu own banking tracer thread

* Reduce diff a bit..

* Use latest serde_with

* Finally use the published rolling-file crate

* Make test code change more consistent

* Revive dead and non-terminating test code path...

* Dispose batches early now that possible

* Split off thread handle very early at ::new()

* Tweak message for TooSmallDirByteLimitl

* Remove too much of indirection

* Remove needless pub from ::channel()

* Clarify test comments

* Avoid needless event creation if tracer is disabled

* Write tests around file rotation and spill-over

* Remove unneeded PathBuf::clone()s...

* Introduce inner struct instead of tuple...

* Remove unused enum BankStatus...

* Avoid .unwrap() for the case of disabled tracer...
2023-01-25 21:54:38 +09:00
steviez ac65343f01
Remove duplicate bank frozen log from ReplayStage (#29821)
We emit a similar log with more information shortly after from Bank, so
this logline is extra that occurs for every slot.
2023-01-24 20:29:14 -06:00
Xiang Zhu 74f89d1494
Add run parent directory for accounts files (#29794)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explict type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation
2023-01-24 16:44:35 -08:00
Brennan Watt 0be194145b
Include own node in stake table (#29838) 2023-01-24 09:34:44 -08:00
behzad nouri 1c7662a37f
asserts that cluster-info keypair is consistent with contact-info id (#29818) 2023-01-24 16:57:55 +00:00
steviez be7ec87b9b
Reduce cpuid reporting frequency to once an hour (#29849) 2023-01-24 09:27:43 -06:00
Kevin Ji dd92f225bb
Use Ipv4Addr::{LOCALHOST, UNSPECIFIED} constants (#29813) 2023-01-23 16:49:51 -06:00
steviez f1b2e49b03
Cleanup FindPacketSenderStakeReceiver function args (#29834)
find_packet_sender_stake_stage::FindPacketSenderStakeReceiver is quite
verbose to include in function arguments, and type name is descriptive
enough that it doesn't need to be qualified with the crate name in every
instance.
2023-01-23 16:40:18 -06:00
Ashwin Sekar 3e8874e3a2
Clear parent in repair weighting when dumping from replay (#29770) 2023-01-23 12:55:09 -08:00
behzad nouri bd9b311c63
adds frozen_abi annotations to repair service enums/structs (#29820)
... in order to keep types backward compatible.
2023-01-23 16:49:06 +00:00
steviez 206a1c7296
Reduce the amount of IO that LedgerCleanupService performs (#29239)
Currently, the cleanup service counts the number of shreds in the
database by iterating the entire SlotMeta column and reading the number
of received shreds for each slot. This gives us a fairly accurate count
at the expense of performing a good amount of IO.

Instead of counting the individual slots, use the live_files()
rust-rocksdb entrypoint that we expose in Blockstore. This API allows us
to get the number of entries (shreds) in the data shred column family by
reading file metadata. This is much more efficient from IO perspective.
2023-01-23 04:39:47 -06:00
behzad nouri d75303f541
patches bug in sigverify-shreds when identity is hot-swapped (#29802)
Sigverify-shreds discards shreds from node's own leader slots:
https://github.com/solana-labs/solana/blob/6baab92ab/core/src/sigverify_shreds.rs#L153-L154

But if the identity is hot-swapped the pubkey would be wrong since it
is instantiated only once at startup:
https://github.com/solana-labs/solana/blob/6baab92ab/core/src/tvu.rs#L168
2023-01-21 20:07:41 +00:00
apfitzge 8c793da7d0
BankingStage Refactor: Move decision making functions to new module (#29788)
Move decision making functions to new module
2023-01-20 10:10:47 -08:00
apfitzge 5fc83a3d19
BankingStage Refactor: Separate Next Leader Functions (#29401)
Separate next_leader functions
2023-01-20 10:02:29 -08:00
behzad nouri 64c13b74d8
errors out when retransmit loopbacks to the slot leader (#29789)
When broadcasting shreds, turbine excludes the slot leader from the
random shuffle. Doing so, shreds should never loopback to the leader.
If shreds reaching retransmit stage are from the node's own leader slots
they should not be retransmited to any nodes.
2023-01-20 17:20:51 +00:00
Wen b36791956e
Ingest duplicate proofs sent through Gossip (#29227)
* First draft of ingesting duplicate proofs in Gossip into blockstore.

* Add more unittests.

* Add more unittests for bad cases.

* Fix lint errors for tests.

* More linter fixes for tests.

* Lint fixes

* Rename get_entries, move location of comment.

* Some renaming changes and comment fixes.

* Fix compile warning, this enum is not used.

* Fix lint errors.

* Slow down cleanup because this could potentially be expensive.

* Forgot to reset cleanup count.

* Add protection against attackers when constructing chunk map when
we ingest Gossip proofs.

* Use duplicate shred index instead of get_entries.

* Rename ClusterInfoDuplicateShredListener and fix a few small problems.

* Use into_shreds to piece together the proof.

* Remove redundant code.

* Address a few small errors.

* Discard slots too advanced in the future.

* - Use oldest proof for each pubkey
- Limit number of pubkeys in each slot to 100

* Disable duplicate shred handling for now.

* Revert "Disable duplicate shred handling for now."

This reverts commit c3fcf403876cfbf90afe4d2265a826f21a5e24ab.
2023-01-19 13:00:56 -08:00
apfitzge 2c347ac0a5
BankingStage Refactor: Move packet receiving and buffering functions to separate module (#29761)
Move packet receiving and buffering functions to separate module
2023-01-19 08:52:32 -08:00
Trent Nelson c4e43f1de4
vote: encapsulate `Lockout` (#29753) 2023-01-18 19:28:28 -07:00
Ryo Onodera 4973fe18f1
Rename banking stage packet receivers consistently (#29752)
Rename banking stage batch receivers consistently
2023-01-19 10:04:55 +09:00
Ryo Onodera 55d743c49a
Rename remaining ones to replay_vote_{sender,receiver} (#29716)
* Rename remaining ones to replay_vote_{sender,receiver}

* Fix typo...
2023-01-18 14:14:04 +09:00
Jeff Biseda f9062718c4
prioritize repair requests by stake (#29730) 2023-01-17 18:38:10 -08:00
Brennan Watt aa40c2b712
Increase turbine propagation const (#29742)
* Increase turbine propagation const

Value is used as a delay threshold for issuing shred repairs and analysis is showing we are overly aggressive in requesting repairs. Shreds show up via turbine before the repair completes the vast majority of the time

* Use Duration type for MAX_TURBINE_PROPAGATION
2023-01-17 15:01:00 -08:00
Jeff Biseda f6fcb14a3e
adjust normalized stake calculation in compute_weight (#29694) 2023-01-17 11:27:57 -08:00
Ryo Onodera 156454c980
Remove PacketDeserializer's extra overflow guard (#29715) 2023-01-17 14:21:17 +09:00
Brooks 0db14ad39c
Removes full_snapshot from CalcAccountsHashConfig (#29722) 2023-01-16 16:22:46 -05:00
behzad nouri 80a39bd6a5
adds feature to (temporarily) drop merkle shreds from testnet (#29711) 2023-01-15 15:41:58 +00:00
behzad nouri 5b5a3ebce8
adds metrics for num merkle shreds on the receiving end (#29710) 2023-01-14 23:07:42 +00:00
Illia Bobyr 59fde130d6
ledger/blockstore: PerfSampleV2: num_non_vote_transactions (#29404)
Store non-vote transaction counts that are now recorded by the banks
into the `blockstore`.

`SamplePerformanceService` now populates `PerfSampleV2` with counts from
the banks.
2023-01-12 19:14:04 -08:00
Jeff Washington (jwash) 544b9745c2
snapshot storage path uses 1 append vec per slot (#29627) 2023-01-11 12:05:15 -08:00
behzad nouri 8c212f59ad
renames ContactInfo to LegacyContactInfo (#29566)
Working towards adding a new ContactInfo where new sockets can be
added in a backward compatible way.
2023-01-08 16:00:55 +00:00
Brian Anderson 43a0745b37
Fix doc warnings (#29537) 2023-01-07 09:24:50 +00:00
behzad nouri 283a2b1540
removes #[allow(clippy::same_item_push)] (#29543) 2023-01-06 17:32:26 +00:00
behzad nouri 12da2da389
fixes errors from clippy::redundant_clone (#29536)
https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone
2023-01-05 18:42:19 +00:00
behzad nouri 5c9beef498
fixes errors from clippy::useless_conversion (#29534)
https://rust-lang.github.io/rust-clippy/master/index.html#useless_conversion
2023-01-05 18:05:32 +00:00
Lijun Wang 1e8a8e07b6
Stream the executed transaction count in the block notification (#29272)
Problem

The plugins need to know when all transactions for a block have been all notified to serve getBlock request correctly. As block and transaction notifications are sent asynchronously to each other it will be difficult.

Summary of Changes

Include the executed transaction count in block notification which can be used to check if all transactions have been notified.
2023-01-05 09:36:19 -08:00
Jeff Biseda 832302485e
require repair request signature, ping/pong for Testnet, Development clusters (#29351) 2023-01-04 14:54:19 -08:00
Illia Bobyr d7bd1bf970
bank: Record non-vote transaction count (#29383)
A subsequent change to `SamplePerformanceService` introduces non-vote transaction counts, which `bank`s need to store.

Part of work on https://github.com/solana-labs/solana/issues/29159
2023-01-03 14:46:20 -08:00
Xiang Zhu 3363c08ac0
Move async remove to snapshot_utils.rs (#29406) 2023-01-03 06:15:32 -08:00
behzad nouri 754ecf467b
generalizes the return type of Shred::get_signed_data (#29446)
The commit adds an associated SignedData type to Shred trait so that
merkle and legacy shreds can return different types for signed_data
method.
This would allow legacy shreds to point to a section of the shred
payload, whereas merkle shreds would compute and return the merkle root.
Ultimately this would allow to remove the merkle root from the shreds
binary.
2022-12-31 17:08:25 +00:00
Ashwin Sekar 17b64005d3
Add more logging and documentation to flaky optimistic confirmation tests (#29418)
* Revert "add retry for flakey local cluster test (#29228)"

This reverts commit 7a97121747.

* Add logging for repair
2022-12-27 10:47:45 -07:00
behzad nouri 456d06785d
experiments different turbine fanouts for propagating shreds (#29393)
The commit allocates 2% of slots to running experiments with different
turbine fanouts based on the slot number.
The experiment is feature gated with an additional feature to disable
the experiment.
2022-12-26 14:18:56 +00:00
Ashwin Sekar f2ba16ee87
Plumb dumps from replay_stage to repair (#29058)
* Plumb dumps from replay_stage to repair

When dumping a slot from replay_stage as a result of duplicate or
ancestor hashes, properly update repair subtrees to keep weighting and
forks view accurate.

* add test

* pr comments
2022-12-25 09:58:30 -07:00
behzad nouri 558292466b
rolls back merkle shreds on testnet (#29340)
https://github.com/solana-labs/solana/pull/29339
adds hash domain to merkle shreds. In order to merge that change, need
to temporarily disable merkle shreds on testnet.
2022-12-20 18:33:48 +00:00
Brooks 053775ad77
Elides unnecessary lifetimes (#29299) 2022-12-20 12:44:17 -05:00
Tao Zhu c657f42d77
remove a wrapper function (#29305) 2022-12-19 16:10:16 +00:00
Brennan Watt 86b2e545e1
Prune redundant const SLOT_MS (#29278)
* Alias redundant const SLOT_MS to DEFAULT_MS_PER_SLOT

* Slate SLOT_MS for deprecation

* Add doc comments

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-12-16 08:05:09 -08:00
Jeff Biseda a44ea779bd
add support for a repair protocol whitelist (#29161) 2022-12-15 19:24:23 -08:00
Jeff Washington (jwash) b95835143e
remove AccountsBackgroundService::new(caching_enabled) (#29234) 2022-12-13 07:18:02 -08:00
Jeff Washington (jwash) fec8f61566
remove ProcessOptions::accounts_db_caching_enabled (#29217) 2022-12-12 20:25:00 -08:00
Jeff Biseda 88a8f40bd2
apply [limit repairs to top staked... #28673] to non-MainnetBeta clusters (#29163) 2022-12-11 15:52:41 -08:00
behzad nouri 4ee318b2b2
fixes rust code formatting in core/src/consensus.rs (#29204) 2022-12-11 23:20:52 +00:00
Jeff Washington (jwash) 631a98a3b6
warp_from_parents works with write_cache enabled (#29185) 2022-12-09 14:28:18 -08:00
apfitzge cd9f1f1862
Typo/filter_and_forward_with_account_limits (#29183) 2022-12-09 16:22:25 -06:00
Jeff Washington (jwash) 560143a267
remove ValidatorConfig.caching_enabled (#29172) 2022-12-09 11:31:55 -08:00
Lijun Wang ecea802fe6
Bidirectional quic communication support (#29155)
* Support bi-directional quic communication, use the same endpoint for the quic server and client
This is needed for supporting using quic for repair

* Added comments on the bi-directional communication tests

* Removed some debug logs

* clippy issue
2022-12-09 10:59:43 -08:00
HaoranYi ca5d8c4b4d
refactor sysmonitor config (#29132) 2022-12-09 07:43:03 -06:00
Jason Davis 049fb3d725 Remove an unnecessary clone of a PohConfig inside Validator::new 2022-12-07 13:03:14 -06:00
Jason Davis 8f24ceffbd Removed Arcs from PohConfig parameters and pass the struct by reference only 2022-12-07 10:52:07 -06:00
HaoranYi 582397ad48
fix solRptLdgrStat thread hang (#29118) 2022-12-06 17:09:56 -06:00
HaoranYi 33b15240ac
Revert #28945 (#29127)
revert #28945
2022-12-06 17:08:56 -06:00
steviez aeb6b53502
Remove unused Option<> around ValidatorConfig's SnapshotConfig (#29090)
Remove Option<> around ValidatorConfig's SnapshotConfig

The SnapshotConfig is required and is currently hard-coded to be a
Some().
2022-12-06 22:47:55 +00:00
behzad nouri df7fd8ae5f
patches rust code formatting in core/src/replay_stage.rs (#29123) 2022-12-06 22:09:57 +00:00
behzad nouri 9524c9dbff patches errors from clippy::uninlined_format_args
https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
2022-12-06 19:32:15 +00:00
behzad nouri 9433c06745 patches errors from clippy::unchecked_duration_subtraction
https://rust-lang.github.io/rust-clippy/master/index.html#unchecked_duration_subtraction
2022-12-06 19:32:15 +00:00
Haoran Yi bbd49acb2f fix merge error 2022-12-06 13:31:50 -06:00
haoran 7a512d7f27 report number of open files 2022-12-06 13:31:50 -06:00
haoran f716cad4af don't use procfs as it is not supported on mac and windows.
make open_fd stats only on linux platform
2022-12-06 13:31:50 -06:00
haoran fc97d818b6 share code 2022-12-06 13:31:50 -06:00
haoran 36dc3a457f get mmap with wc-l 2022-12-06 13:31:50 -06:00
haoran dd81af9fff increase fd report interval to 3 minutes 2022-12-06 13:31:50 -06:00
haoran 0ce507c20d refactor SystemMonitorReportConfig 2022-12-06 13:31:50 -06:00
Haoran Yi 914f7bd85d fix 2022-12-06 13:31:50 -06:00
Haoran Yi 1635b99486 add mmmap file count 2022-12-06 13:31:50 -06:00
Haoran Yi e1ba5a2a63 add monitoring for open file descriptors stat 2022-12-06 13:31:50 -06:00
Tao Zhu 7ed22f7b18
Remove gate from accepting packets for forwarding (#29049) 2022-12-06 12:13:01 -06:00
Jon Cinque b1340d77a2
sdk: Make Packet::meta private, use accessor functions (#29092)
sdk: Make packet meta private
2022-12-06 12:54:49 +01:00
apfitzge fd3b5d08d7
Refactor/banking_stage_make_decision_consume_bank (#28946) 2022-12-02 10:07:01 -06:00
Tao Zhu 5850af5316
Refactor to remove requested_cu from cost_trarcker (#29015)
* refactor cost tracker by removing requested_cu from it, call sites to use cost_model forr consistency

* review fix
2022-12-02 00:25:09 +00:00
steviez 3c42c87098
Remove obsoleted return value from Blockstore insert shred method (#28992) 2022-12-01 11:17:46 -06:00
steviez b6dce6cf3b
Move BlockstoreInsertionMetrics field update to blockstore.rs (#28991)
The num_repair field is only blockstore insertion metric being updated
outside of Blockstore::insert() call chain; move the update to insert()
with the rest of the fields in BlockstoreInsertionMetrics struct.
2022-11-30 11:46:35 -06:00
Ashwin Sekar edacd3c411
Add dump_node to update stake for heaviest subtrees (#28827)
* Add dump_node to update stake for heaviest subtrees

Additionally refactor subtrees to store children as a hashset

* Add a more complicated forks test

* chose -> choose

* remove is_dumped flag and reuse latest_invalid_ancestor instead
2022-11-30 09:26:13 -08:00
apfitzge 4d338ed882
Bugfix/mi_remove_never_entries (#28978) 2022-11-29 16:00:21 -06:00
Ashwin Sekar 0d0a491f27
More documentation + small refactor for RepairService (#28933) 2022-11-28 19:46:06 -08:00
Tao Zhu 9f370475d4
remove obsoleted comment (#28960) 2022-11-28 13:39:40 -06:00
behzad nouri 7d99cddb9f
dedups turbine retransmit peers by tvu socket addresses (#28944)
No need to send duplicate shreds if several nodes have the same tvu
socket address because they are behind a relayer or whatever.
2022-11-28 19:23:02 +00:00
HaoranYi 7e87998091
reduce memory usage report freq to 1 per 5s (#28327) 2022-11-28 19:08:06 +00:00
apfitzge bdd162492c
Feature/multi-iterator-scanner-read-locks (#28862) 2022-11-28 11:23:04 -06:00
Brooks Prumo 9327658007
Promotes accounts hash to a strong type (#28930) 2022-11-28 10:09:47 -05:00
apfitzge 38f7122605
separate make_decision in BankingStage (#28884) 2022-11-22 19:01:09 -06:00
Maximilian Schneider c8b0c3ede9
Update cost model to use requested_cu instead of estimated cu #27608 (#28281)
* Update cost model to use requested_cu instead of estimated cu #27608

* remove CostUpdate and CostModel from replay/tvu

* revive cost update service to send cost tracker stats

* CostModel is now static

* remove unused package

Co-authored-by: Tao Zhu <tao@solana.com>
2022-11-22 11:55:56 -06:00
apfitzge 637e8a937b
clean up: remove my_pubkey arg from consume_buffered_packets (#28888) 2022-11-22 11:40:04 -06:00
Jeff Washington (jwash) 20d8b5e98b
default some tests to write cache = true (#28917) 2022-11-21 15:53:39 -08:00
apfitzge dd723210ca
remove unnecessary clippy attributes (#28891) 2022-11-21 12:54:54 -06:00
behzad nouri d43b001189
rolls out merkle shreds to ~20% of testnet (#28905) 2022-11-21 16:20:02 +00:00
Michael Vines c6927151ef
Sort offline/wrong-shred nodes by stake weight while waiting for supermajority (#28872) 2022-11-18 15:26:21 -08:00
apfitzge a636038fff
Clean up: banking_stage_prepare_sanitized_batch (#28841)
Use measure! for bank.prepare_sanitized_batch_with_results
2022-11-18 14:04:44 -06:00
Tyera c32377b5af
Split out quic- and udp-client definitions (#28762)
* Move ConnectionCache back to solana-client, and duplicate ThinClient, TpuClient there

* Dedupe thin_client modules

* Dedupe tpu_client modules

* Move TpuClient to TpuConnectionCache

* Move ThinClient to TpuConnectionCache

* Move TpuConnection and quic/udp trait implementations back to solana-client

* Remove enum_dispatch from solana-tpu-client

* Move udp-client to its own crate

* Move quic-client to its own crate
2022-11-18 12:21:45 -07:00
apfitzge 88e6ea37d9
refactor: move more BankingStage cost_model stuff into qos_service (#28840) 2022-11-17 14:03:17 -06:00
Andrew Fitzgerald ee2f760d3d
MultiIteratorScanner - improve banking stage performance with high contention 2022-11-17 10:54:12 -06:00
Jeff Biseda 17ee3349f8
limit repairs to top staked requests in batch (#28673) 2022-11-16 16:30:41 -08:00
Ashwin Sekar ddf4ff2d26
Repair service documentation (#28592)
* repair doc update

* tree_root rename

* remove extra todo
2022-11-16 02:38:07 +00:00
Jeff Biseda e10d958352
signed repair request test fixes/cleanup (#28691) 2022-11-15 16:46:17 -08:00
Tao Zhu e5ae0b3371
check is_forwarded packet earlier (#28159)
* check and filter is_forwarded packet earlier

* review fix: renaming; and rebase
2022-11-11 23:32:03 +00:00
Brooks Prumo 4d6653598b
Upgrades to Rust 1.65.0 (#28741) 2022-11-09 17:15:03 -05:00
Brooks Prumo d1ba42180d
clippy for rust 1.65.0 (#28765) 2022-11-09 19:39:38 +00:00
Brooks Prumo 9e1cdc7e60
Enables not taking a bank snapshot (#28756) 2022-11-09 12:43:33 -05:00
Brooks Prumo 0b9426e734
Simplifies AHV's `test_max_hashes()` (#28754) 2022-11-07 02:32:33 +00:00
Brooks Prumo 064cfc70d2
Removes cluster_type from AccountsPackage (#28725) 2022-11-02 18:21:13 -04:00
Brooks Prumo d0f639745a
Uses AccountsPackage::default_for_tests() in AHV tests (#28723) 2022-11-02 14:13:35 -04:00
Lijun Wang f156bc12ca
Enforce stream receive timeout (#28513)
In the quic server handle_connection, when we timed out in receiving the chunks, we loop forever to wait for the chunk. If the client never provide another chunk, the server can hopelessly wait for that chunk and wasting server resources. Instead WAIT_FOR_CHUNK_TIMEOUT_MS is introduced to bound this to 10 seconds at maximum. The stream will be dropped if it times out.
2022-11-02 10:09:32 -07:00
Brooks Prumo 59bf1809fe
Uses SnapshotHash type in snapshot archive fields (#28681) 2022-10-31 14:28:35 -04:00
Dmitri Makarov 34865d032c chore: update Solana docs and code comments that specify "BPF" to "SBF" 2022-10-31 14:14:25 -04:00
Brooks Prumo 37507a2de6
Removes EAH parameter from serde_snapshot::reserialize_bank() (#28669) 2022-10-31 09:43:17 -04:00
sakridge 340ad68223
Banking stage refactor commit transactions (#28660)
* Refactor commit transactions step

* Cleanup token pre-balances

* Collect prebalances together

* Collect pre/post balances in separate function

* Fix clippy
2022-10-29 21:36:57 +02:00
steviez 6b93d05c37
Add LedgerCleanupService::find_slots_to_clean() test (#28656)
Add a test to better exercise find_slots_to_clean(), as well as a minor
bug fix to this method that was found as a result of writing test.
2022-10-29 00:55:21 +02:00
apfitzge 22ce49ae7f
Maintain original queue capacity for unprocessed packet buffer (#28661) 2022-10-28 16:37:21 -05:00
apfitzge 0a148b2bf7
remove unused handle_retryable_packets_elapsed (#28355) 2022-10-28 16:36:41 -05:00
steviez 2272fd807e
Remove Blockstore manual compaction code (#28409)
The manual Blockstore compaction that was being initiated from
LedgerCleanupService has been disabled for quite some time in favor of
several optimizations.

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
2022-10-28 10:39:00 +02:00
Ashwin Sekar ae557a9eb5
Exit when stuck in an unrecoverable repair/purge loop (#28596)
* Exit when stuck in an unrecoverable repair/purge loop

* add tests
2022-10-27 20:06:06 -07:00
apfitzge 340d3b5468
rename and change capacity on unprocessed transaction storage - max_receive_size (#28586) 2022-10-26 10:03:47 -05:00
Brooks Prumo f158bab0ef
Tracks how long background requests wait before processing (#28581) 2022-10-25 12:10:53 -04:00
Brooks Prumo bc02789c43
Renames fn to calculate_accounts_hash_from_storages() (#28566) 2022-10-24 21:07:00 -04:00
Brooks Prumo 2354a0a343
Renames fn to calculate_accounts_hash_from_index() (#28568) 2022-10-24 19:20:08 -04:00
Ashwin Sekar 9eafad467c
Add convenience methods to VoteInstruction to distinguish vote types (#28526)
* Add convenience methods to VoteInstruction to distinguish vote types

* use matches! macro instead
2022-10-21 14:17:40 -06:00
Ashwin Sekar f207af765e
Split out voting and banking threads in banking stage (#27931)
* Split out voting and banking threads in banking stage

Additionally this allows us to aggressively prune the buffer for voting threads
as with the new vote state only the latest vote from each validator is
necessary.

* Update local cluster test to use new Vote ix

* Encapsulate transaction storage filtering better

* Address pr comments

* Commit cargo lock change

* clippy

* Remove unsafe impls

* pr comments

* compute_sanitized_transaction -> build_sanitized_transaction

* &Arc -> Arc

* Move test

* Refactor metrics enums

* clippy
2022-10-20 21:10:48 +00:00
Jeff Biseda 0df4be06a0
enable repair ping/pong cache (#28408) 2022-10-19 14:55:55 -07:00
carllin 274d9ea607
Check for valid address in broadcast (#28432)
Check for valid address
2022-10-19 14:49:22 -05:00
Brooks Prumo 1cc9cf927c
Supports warping with Epoch Accounts Hash (#28459) 2022-10-19 10:37:14 -04:00
behzad nouri e283461d99
enforces hash domain for ping-pong protocol (#28433)
https://github.com/solana-labs/solana/pull/27193
added hash domain to ping-pong protocol.
For backward compatibility responses both with and without domain were
generated and accepted.
Now that all clusters are upgraded, this commit enforces the hash domain
by removing the response without the domain.
2022-10-18 18:17:12 +00:00
Jeff Washington (jwash) 28a89a1d99
remove expected rent collection and rehashing completely (#28422) 2022-10-17 07:24:42 -07:00
steviez 39fa297bf6
Report total_transactions in replay-slot-stats (#28382)
We have transactions counted in replay-slot-end-to-end-stats, but that
metric is broken down to report things per thread.

So, report total_transactions for the entire slot (all threads) in
replay-slot-stats.
2022-10-15 14:07:03 +01:00
Brooks Prumo dd7fee8f32
Re-enqueues unhandled ABS requests (#28362) 2022-10-13 16:25:39 -04:00
Brooks Prumo 9cbd00fdbc
Converts PendingAccountsPackage to a channel (#28352) 2022-10-13 12:47:36 -04:00
Jason Davis e2fc9d51de Increase cpu metric reporting interval from 1s to 10s 2022-10-11 10:44:59 -05:00
Jeff Biseda 15050b14b9
use signed repair request variants (#28283) 2022-10-10 14:09:45 -07:00
Tao Zhu 50985f79a1
Correctly mark packets as forwarded (#28161)
Only mark packets accepted for forwarding as `forwarded`
2022-10-07 11:50:57 -05:00
Tao Zhu 0324573667
report additional transaction errors to metrics (#28285) 2022-10-07 10:36:22 -05:00
Brooks Prumo a8c6a9e5fc
Bank::freeze() waits for EAH calculation to complete (#28170) 2022-10-05 17:44:35 -04:00
Jason Davis c899ededfc Minor refactoring and cleaning of cpuid code 2022-10-05 11:43:27 -05:00
Jason Davis 3b2ab313de Use num-enum crate to make everything typesafe 2022-10-05 11:43:27 -05:00
Jason Davis 1e1455688d Convert magic numbers to named constants 2022-10-05 11:43:27 -05:00
Jason Davis fac772ff90 Update naming, style after PR review comments 2022-10-05 11:43:27 -05:00
Jason Davis 13b095b4ab Fix a fmt problem, I think. Shows up in the git check, but not when I run here 2022-10-05 11:43:27 -05:00
Jason Davis c8584b0cdd Cargo fmt applied 2022-10-05 11:43:27 -05:00
Jason Davis d841286c21 Add cpuid calls and metric reporting; change cpu info sampling interval from 1s to 10s 2022-10-05 11:43:27 -05:00
Jeff Biseda e3e888c0e0
stats for staked/unstaked repair requests (#28215) 2022-10-04 17:37:24 -07:00
behzad nouri 9e7a0e7420
rolls out merkle shreds to ~5% of testnet (#28199) 2022-10-04 19:36:16 +00:00
carllin 14a415ccf3
Consensus Logging (#28176) 2022-10-03 20:45:55 -05:00
haoran c4aab3f178 typo 2022-10-03 09:41:15 -05:00
Justin Starry c2bb2b8e60
Allow validators to reset to the slot which matches their last voted slot (#28172)
* Add failing test

* Allow resetting to duplicate was confirmed

* feedback

* feedback

* bump

* simplify change

* Revert "simplify change"

This reverts commit 72e5de3e5bdac595f71dc7fc01650ca3bc7da98e.

* update comment

* Update core/src/replay_stage.rs
2022-10-03 16:49:47 +08:00
Jeff Washington (jwash) cfc124c825
acct idx can no longer use write cache (#28150) 2022-09-30 10:55:27 -07:00
apfitzge 82558226f7
ImmutableDeserializedPacket rc to arc (#28145) 2022-09-30 12:07:48 -05:00
Tao Zhu 82e65593ee
Batch filtering invalid transactions before forwarding (#26798)
- Batch filtering invalid transactions (fail to sanitize, too old or already processed) before forwarding
- Combine packet filtering and forwarding to share sanitized transactions
- `iter_desc` is no longer needed, remove it;
- Add a method to share the logic of removing packets from buffer after they were removed from MinMaxHeap
- Add test coverage for forward_packet_batches_by_accounts
- rebase, resolve conflicts
2022-09-29 16:33:40 -05:00
Jeff Biseda 8b0f9b4917
make ping cache rate limit delay configurable (#27955) 2022-09-26 14:16:56 -07:00
behzad nouri f49beb0cbc
caches reed-solomon encoder/decoder instance (#27510)
ReedSolomon::new(...) initializes a matrix and a data-decode-matrix cache:
https://github.com/rust-rse/reed-solomon-erasure/blob/273ebbced/src/core.rs#L460-L466

In order to cache this computation, this commit caches the reed-solomon
encoder/decoder instance for each (data_shards, parity_shards) pair.
2022-09-25 18:09:47 +00:00
Jeff Biseda 9816c94d7e
metrics to distinguish why repair packets are dropped (#27960) 2022-09-24 23:20:05 -07:00
Jeff Biseda 8b43215ddd
count unsigned repair requests (#27953) 2022-09-24 12:56:02 -07:00
Tao Zhu e51cf46d6b
Remove priority from vote transactions (#28030)
vote transactions have same priority fee
2022-09-24 00:31:50 +00:00
behzad nouri 9ee53e594d
patches clippy errors from new rust nightly release (#28028) 2022-09-23 20:57:27 +00:00
Brooks Prumo d9b31fd6b0
ahv: Add debug logging for EAH (#27998) 2022-09-23 14:04:48 -04:00
Jeff Biseda 206cc9407b
allow unsigned repair requests (#27910) 2022-09-23 10:11:08 -07:00
behzad nouri 97c9af4c6b plumbs through flag to generate merkle variant of shreds 2022-09-23 16:45:18 +00:00
steviez e4affb9fea
Add Blockstore::highest_slot() method (#27981) 2022-09-23 04:53:43 -05:00
behzad nouri 9a57c64f21
patches clippy errors from new rust nightly release (#27996) 2022-09-22 22:23:03 +00:00
behzad nouri abfb996135
tracks number of staked/stale/dead nodes in turbine cluster-nodes (#27915) 2022-09-19 18:16:04 +00:00
Ashwin Sekar 9119dc13ec
Add structure to house unprocessed transactions in banking_stage (#27777)
Separate storage for voting and transaction threads:
- Voting threads utilize a shared reference in order to dedup extraneous
  votes
- Transactions have thread local storage like before
2022-09-14 10:40:44 -07:00
Ashwin Sekar c74df830b1
Add structure to collect and coalesce vote packets (#27558)
* Add structure to collect and coalesce vote packets

Will be used in banking stage to throw out extraneous vote packets
before processing

* pr comments

* Update inner lock to arc to improve performance
2022-09-14 00:44:26 -07:00
apfitzge 079bf561b0
Clean_up/upb_push_comment (#27707) 2022-09-12 18:59:41 -05:00
Jeff Washington (jwash) 765c628546
use exit signal for acct idx bg threads (#27483) 2022-09-12 11:51:12 -07:00
behzad nouri 4f22ee8f9b uses varint encoding for vote-state lockout offsets
The commit removes CompactVoteStateUpdate and instead reduces serialized
size of VoteStateUpdate using varint encoding for vote-state lockout
offsets.
2022-09-12 16:31:20 +00:00
Christian Kamm 90b8a3a44d
Remove KeypairInsecureClone trait and add insecure_clone() instead (#27396)
See discussion in #26248
2022-09-12 14:59:41 +00:00
Michael Vines 83d4d128c2 Add --process-ledger-before-service flag to solana-validator 2022-09-11 07:58:42 -07:00
Jeff Washington (jwash) 1f00b468e5
add enable_rehashing to AccountsPackage (#27644) 2022-09-08 09:25:25 -07:00
apfitzge a9c5adbf88
UnprocessedPacketBatches pop_max fn are only used in tests (#27645) 2022-09-08 11:01:14 -05:00
Maximilian Schneider cc58968b76
add new leader slot metric to track account contention throttling (#27654) 2022-09-08 09:22:58 -05:00
Xiang Zhu 4308c300b4
In ledger-tool delete the account files in the async way (#27622)
* In ledger-tool delete the account files in the async way

* format changes by ./cargo nightly fmt --all
2022-09-07 14:35:06 -07:00
Brooks Prumo 6a322de845
Make Accounts Background Services aware of Epoch Accounts Hash (#27626) 2022-09-07 20:41:40 +00:00
Lijun Wang 7f223dc582
Added option to turn on UDP for TPU transaction and make UDP based TPU off by default (#27462)
--tpu-enable-udp is introduced. And when this is on, the transaction receive and transaction forward is enabled using udp.

Except for a few tests which was hard-coded sending transactions using udp, most tests are being done with udp based tpu disabled.
2022-09-07 13:19:14 -07:00
apfitzge c04747dd66
cluster_slot_state_verifier: clippy nightly fixes (#27521)
clippy nightly fixes
2022-09-07 15:04:56 -05:00
apfitzge 1465ec947d
replay_stage: clippy nightly fixes (#27520)
clippy nightly fixes
2022-09-07 15:04:46 -05:00
apfitzge d6a1e7498f
Add tests for deserialize_and_collect_packets (#27623) 2022-09-07 12:52:18 -05:00
Jeff Washington (jwash) 22007a3c96
allow accounts hash calc to specify enable_rehashing (#27615) 2022-09-07 10:16:52 -07:00
Jeff Washington (jwash) a31d4a597d
serialize epoch_accounts_hash (#27516) 2022-09-07 10:07:00 -07:00
Brooks Prumo 93a4f80a2c
Handling snapshot requests is now required (#27537) 2022-09-07 10:08:42 -04:00
Jeff Biseda 269eb519dd
track time to coalesce entries in recv_slot_entries (#27525) 2022-09-06 16:07:17 -07:00
apfitzge a67d56f462
refactor: add function for deserializing and collecting packets - separate from channel receive (#27548) 2022-09-06 15:54:31 -05:00
Brooks Prumo 6684c62280
Add SnapshotUsage to SnapshotConfig (#27508) 2022-09-02 08:56:23 -04:00
Brennan Watt 242c9cb442
RPC Notifier Signal when Setup Complete (#27481)
* RPC notifier signal when ready
2022-09-01 16:39:55 -07:00
Tyera Eulberg 9b8bed86f9
Add getRecentPrioritizationFees RPC endpoint (#27278)
* Plumb priority_fee_cache into rpc

* Add PrioritizationFeeCache api

* Add getRecentPrioritizationFees rpc endpoint

* Use MAX_TX_ACCOUNT_LOCKS to limit input keys

* Remove unused cache apis

* Map fee data by slot, and make rpc account inputs optional

* Add priority_fee_cache to rpc test framework, and add test

* Add endpoint to jsonrpc docs

* Update docs/src/developing/clients/jsonrpc-api.md

* Update docs/src/developing/clients/jsonrpc-api.md
2022-09-01 23:12:12 +00:00