Commit Graph

3839 Commits

Author SHA1 Message Date
Jeff Biseda 180273b97d
defer HighestShred repairs during shred propagation threshold (#30142) 2023-02-09 14:57:55 -08:00
Ashwin Sekar 67f644473b
Fix repair behavior concerning our own leader slots (#30200)
panic when trying to dump & repair a block that we produced
2023-02-09 14:30:12 -07:00
Andrew Fitzgerald 4b17acf64e
BankingStage Refactor: Add state to Committer (#30107) 2023-02-09 13:22:42 -08:00
Andrew Fitzgerald 058738424d
BankingStage Refactor: transaction recorder record transactions (#30106) 2023-02-09 08:34:02 -08:00
steviez d3dab24bbe
chore: Use `i` over `ix` variable name when naming worker threads (#30206) 2023-02-09 01:24:57 +00:00
behzad nouri 1ad69cfc38
removes dynamic cast and dynamic dispatch from connection-cache (#30128)
Dynamic dispatch forces heap allocation and adds extra overhead.
Dynamic casting, as in the examples below, lacks compile-time type safety:
https://github.com/solana-labs/solana/blob/eeb622c4e/quic-client/src/lib.rs#L172-L175
https://github.com/solana-labs/solana/blob/eeb622c4e/udp-client/src/lib.rs#L52-L55

The commit removes all instances of Any, Box<dyn ...>, and Arc<dyn ...>,
and instead uses generic and associated types.

There are only two protocols, QUIC and UDP, and the code which has to
work with both protocols can use a trivial thin enum wrapper.

With respect to connection-cache specifically:
* connection-cache/ConnectionCache is a single-protocol cache which
  allows using either QUIC or UDP without any build dependency on the
  other protocol.
* client/ConnectionCache is an enum wrapper around both protocols and
  can be used in the code which has to work with both QUIC and UDP.

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2023-02-09 00:50:44 +00:00
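Illustration of the approach described in the commit above: a minimal sketch of replacing Box<dyn ...> with a generic single-protocol cache plus a thin enum wrapper. All type and method names here (ClientConnection, QuicConnection, UdpConnection, AnyConnectionCache, send_all) are hypothetical stand-ins and are not taken from the actual crates; send_data only returns the byte count instead of doing I/O.

```rust
use std::net::SocketAddr;

/// Hypothetical per-protocol connection interface (illustration only).
pub trait ClientConnection {
    fn protocol_name(&self) -> &'static str;
    /// Stand-in for a real send: returns the number of bytes "sent".
    fn send_data(&self, data: &[u8]) -> usize;
}

pub struct QuicConnection {
    pub server_addr: SocketAddr,
}

pub struct UdpConnection {
    pub server_addr: SocketAddr,
}

impl ClientConnection for QuicConnection {
    fn protocol_name(&self) -> &'static str {
        "quic"
    }
    fn send_data(&self, data: &[u8]) -> usize {
        data.len()
    }
}

impl ClientConnection for UdpConnection {
    fn protocol_name(&self) -> &'static str {
        "udp"
    }
    fn send_data(&self, data: &[u8]) -> usize {
        data.len()
    }
}

/// Single-protocol cache: generic over the connection type, so calls are
/// statically dispatched and no `Box<dyn ...>` or `Any` downcast is needed.
pub struct ConnectionCache<C: ClientConnection> {
    connections: Vec<C>,
}

impl<C: ClientConnection> ConnectionCache<C> {
    pub fn new(connections: Vec<C>) -> Self {
        Self { connections }
    }

    /// Sends `data` over every cached connection; returns total bytes sent.
    pub fn send_all(&self, data: &[u8]) -> usize {
        self.connections.iter().map(|c| c.send_data(data)).sum()
    }

    pub fn protocol_name(&self) -> Option<&'static str> {
        self.connections.first().map(|c| c.protocol_name())
    }
}

/// Thin enum wrapper for code that has to work with both protocols.
pub enum AnyConnectionCache {
    Quic(ConnectionCache<QuicConnection>),
    Udp(ConnectionCache<UdpConnection>),
}

impl AnyConnectionCache {
    pub fn send_all(&self, data: &[u8]) -> usize {
        match self {
            Self::Quic(cache) => cache.send_all(data),
            Self::Udp(cache) => cache.send_all(data),
        }
    }
}
```

Code that only needs one protocol would depend on ConnectionCache<C> alone; only callers that must handle both would use the enum.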
Illia Bobyr cf77f5dbb8
doc: ledger: Document `completed_data_sets_service` module (#30001) 2023-02-07 21:20:09 -08:00
Andrew Fitzgerald 2b99756b3e
BankingStage Refactor: Move counters out of record_transactions (#30093)
Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
2023-02-07 07:45:50 -08:00
Andrew Fitzgerald d9444a6576
remove unnecessary clippy warning ignore (#30100) 2023-02-06 08:27:18 -08:00
Andrew Fitzgerald 7fb2fc6f27
Add comment on the closure (#30091) 2023-02-06 08:24:36 -08:00
Wen 151585e596
Filter pubkey in gossip duplicateproof ingestion (#29879) 2023-02-03 11:41:32 -08:00
Andrew Fitzgerald 8914d1af27
BankingStage Refactor: Add state to PacketReceiver (#30090) 2023-02-03 11:35:43 -08:00
Pankaj Garg be8e463a51
Use TPU IP instead of gossip for QUIC client certificate info (#30105) 2023-02-03 04:16:57 +00:00
Andrew Fitzgerald 8fa396a321
BankingStage Refactor: Add state to Forwarder (#29403) 2023-02-02 11:09:08 -08:00
Andrew Fitzgerald fd3f26380e
BankingStage Refactor: Simplify PacketReceiver (#29784) 2023-02-02 07:58:55 -08:00
Lijun Wang ada6136a6c
Refactor connection cache to support generic msgs (#29774)
tpu-client/tpu_connection_cache is refactored out of its module and moved to connection-cache/connection_cache, and the logic in client/connection_cache is consolidated into connection-cache/connection_cache as well. client/connection_cache now only has a thin wrapper which forwards calls to connection-cache/connection_cache and deals with constructing the quic/udp connection caches for clients that use both.

The TpuConnection is refactored to ClientConnection to make it generic, and functions are renamed to suit other workflows, e.g. tpu_addr -> server_addr, send_transaction -> send_data, etc.

The enum dispatch is removed so that we can make the bulk of the quic and udp code agnostic of each other. The client can load only quic or only udp into its runtime.

The generic type parameter in the tpu-client/tpu_connection_cache is removed in order to create both quic and udp connection cache and use the object to send transactions with multiple branching when sending data. The generic type parameters and associated types are dropped in other types in order to make the trait "object safe" for this purpose.

I have annotated the code explaining the reasoning and the refactoring source -> destination.

There are no functional changes.

bench-tps has been run for rpc-client, thin-client and tpu-client, and the performance numbers largely match those from before the refactoring.
2023-02-01 18:10:06 -08:00
Xiang Zhu f107b8b607
Add slot deltas into the bank snapshot directory (#29409) 2023-02-01 16:51:32 -08:00
Andrew Fitzgerald c549129974
BankingStage Refactor: Committer Simplify (#29958) 2023-02-01 15:44:53 -08:00
dependabot[bot] 232e252014
Bump serde from 1.0.144 to 1.0.152 (#29696)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
Co-authored-by: Tyera <tyera@solana.com>
2023-02-01 16:27:17 -07:00
Brooks d048a1903f
Splits up AccountsDb::bank_hashes (#30024) 2023-02-01 14:32:35 -05:00
Andrew Fitzgerald c06053f505
BankingStage Refactor: Add state to DecisionMaker (#29806) 2023-02-01 09:18:40 -08:00
behzad nouri ffc9c90cb4
expands api parity between the new and the legacy contact-info (#30038)
Working towards replacing the legacy contact-info with the new one, the
commit expands api compatibility between the two.
2023-02-01 13:07:42 +00:00
Will Hickey 04a6a631bc
Bump version to v1.16 (#30028) 2023-01-31 17:48:33 -06:00
carllin b345d97f67
Add local cluster test for optimistic confirmation with malformed votes (#29822) 2023-01-31 14:19:45 -06:00
joeaba a12bf8c003
Update maintainers references (#29997)
* update maintainers references

* chore: update maintainers reference
2023-01-31 08:07:13 -05:00
Jeff Biseda c6cd96635f
get_best_weighted_repairs parameter cleanup (#30010) 2023-01-31 03:12:25 -08:00
Jeff Biseda 6163a6c279
restructure repair decode error handling (#29977) 2023-01-31 02:44:58 -08:00
Xiang Zhu 856598969c
Account path add run parent with old path cleanup (#29942)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explicit type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation

* Clean up files in the old account_path before transitioning to the new run path

* try_exist and accounts_dir removing extra

* sync rmdir, is_dir check

* handle the account_path not deletable case
2023-01-30 10:26:43 -08:00
Jeff Biseda 7cacbdcca2
track repair handle_requests time (#29940) 2023-01-27 15:50:18 -08:00
behzad nouri 7f173ce7c7
feature gates merkle shreds on all clusters (#29957) 2023-01-27 21:02:51 +00:00
behzad nouri efb8a53b28
removes staked-nodes updater service excessive locks on gossip (#29936) 2023-01-26 23:31:35 +00:00
Andrew Fitzgerald fbb90603a9
BankingStage Refactor: Separate transaction committing module (#29808)
Separate transaction committing module
2023-01-25 19:02:21 -08:00
Kirill Fomichev b4d1769688
geyser: add parent slot/blockhash to block (#29855) 2023-01-25 14:20:24 -08:00
steviez fa39bfef6b
Move Deduper into a separate file (#29891) 2023-01-25 15:34:53 -06:00
Andrew Fitzgerald 704472ae13
BankingStage Refactor: Separate Forwarder Module (#29402)
Separate Forwarder module
2023-01-25 12:31:59 -08:00
Xiang Zhu 4ebcacb4a3
Revert "Add run parent directory for accounts files (#29794)" (#29899)
This PR is causing OOM on master.  Reverting it for now.

This reverts commit 74f89d1494.
2023-01-25 10:03:01 -08:00
Ryo Onodera 40bbf99c74
Add fully-reproducible online tracer for banking (#29196)
* Add fully-reproducible online tracer for banking

* Don't use eprintln!()...

* Update programs/sbf/Cargo.lock...

* Remove meaningless assert_eq

* Group test-only code under aptly named mod

* Remove needless overflow handling in receive_until

* Delay stat aggregation as it's possible now

* Use Cow to avoid needless heap allocs

* Properly consume metrics action as soon as hold

* Trace UnprocessedTransactionStorage::len() instead

* Loosen joining api over type safety for replaystage

* Introduce hash event to override these when simulating

* Use serde_with/serde_as instead of hacky workaround

* Update another Cargo.lock...

* Add detailed comment for Packet::buffer serialize

* Rename sender_overhead_minimized_receiver_loop()

* Use type inference for TraceError

* Another minor rename

* Retire now useless ForEach to simplify code

* Use type alias as much as possible

* Properly translate and propagate tracing errors

* Clarify --enable-banking-trace with better naming

* Consider unclean (signal-based) node restarts..

* Tweak logging and cli

* Remove Bank events as it's not needed anymore

* Make tpu own banking tracer thread

* Reduce diff a bit..

* Use latest serde_with

* Finally use the published rolling-file crate

* Make test code change more consistent

* Revive dead and non-terminating test code path...

* Dispose batches early now that possible

* Split off thread handle very early at ::new()

* Tweak message for TooSmallDirByteLimit

* Remove too much of indirection

* Remove needless pub from ::channel()

* Clarify test comments

* Avoid needless event creation if tracer is disabled

* Write tests around file rotation and spill-over

* Remove unneeded PathBuf::clone()s...

* Introduce inner struct instead of tuple...

* Remove unused enum BankStatus...

* Avoid .unwrap() for the case of disabled tracer...
2023-01-25 21:54:38 +09:00
Yihau Chen 9193b4221d
Revert "chore: workspace inheritance (#29509)" (#29892)
This reverts commit a67d239dde.
2023-01-25 15:50:41 +08:00
Yihau Chen a67d239dde
chore: workspace inheritance (#29509)
* introduce workspace.package

* introduce workspace.dependencies

* read version from root cargo.toml

* pass check when version = { workspace = true }

* don't bump version when version = { workspace = true }

* including workspace Cargo.toml when bump version

* programs/sbf use workspace inheritance

* fix increasing cargo version ignore program/sbf/Cargo.toml
2023-01-25 13:59:59 +08:00
steviez ac65343f01
Remove duplicate bank frozen log from ReplayStage (#29821)
We emit a similar log with more information shortly after from Bank, so
this log line is redundant and occurs for every slot.
2023-01-24 20:29:14 -06:00
Xiang Zhu 74f89d1494
Add run parent directory for accounts files (#29794)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explicit type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation
2023-01-24 16:44:35 -08:00
Brennan Watt 0be194145b
Include own node in stake table (#29838) 2023-01-24 09:34:44 -08:00
behzad nouri 1c7662a37f
asserts that cluster-info keypair is consistent with contact-info id (#29818) 2023-01-24 16:57:55 +00:00
steviez be7ec87b9b
Reduce cpuid reporting frequency to once an hour (#29849) 2023-01-24 09:27:43 -06:00
Kevin Ji dd92f225bb
Use Ipv4Addr::{LOCALHOST, UNSPECIFIED} constants (#29813) 2023-01-23 16:49:51 -06:00
steviez f1b2e49b03
Cleanup FindPacketSenderStakeReceiver function args (#29834)
find_packet_sender_stake_stage::FindPacketSenderStakeReceiver is quite
verbose to include in function arguments, and type name is descriptive
enough that it doesn't need to be qualified with the crate name in every
instance.
2023-01-23 16:40:18 -06:00
Ashwin Sekar 3e8874e3a2
Clear parent in repair weighting when dumping from replay (#29770) 2023-01-23 12:55:09 -08:00
behzad nouri bd9b311c63
adds frozen_abi annotations to repair service enums/structs (#29820)
... in order to keep types backward compatible.
2023-01-23 16:49:06 +00:00
steviez 206a1c7296
Reduce the amount of IO that LedgerCleanupService performs (#29239)
Currently, the cleanup service counts the number of shreds in the
database by iterating the entire SlotMeta column and reading the number
of received shreds for each slot. This gives us a fairly accurate count
at the expense of performing a good amount of IO.

Instead of counting the individual slots, use the live_files()
rust-rocksdb entrypoint that we expose in Blockstore. This API allows us
to get the number of entries (shreds) in the data shred column family by
reading file metadata. This is much more efficient from IO perspective.
2023-01-23 04:39:47 -06:00
behzad nouri d75303f541
patches bug in sigverify-shreds when identity is hot-swapped (#29802)
Sigverify-shreds discards shreds from node's own leader slots:
https://github.com/solana-labs/solana/blob/6baab92ab/core/src/sigverify_shreds.rs#L153-L154

But if the identity is hot-swapped the pubkey would be wrong since it
is instantiated only once at startup:
https://github.com/solana-labs/solana/blob/6baab92ab/core/src/tvu.rs#L168
2023-01-21 20:07:41 +00:00
apfitzge 8c793da7d0
BankingStage Refactor: Move decision making functions to new module (#29788)
Move decision making functions to new module
2023-01-20 10:10:47 -08:00
apfitzge 5fc83a3d19
BankingStage Refactor: Separate Next Leader Functions (#29401)
Separate next_leader functions
2023-01-20 10:02:29 -08:00
behzad nouri 64c13b74d8
errors out when retransmit loopbacks to the slot leader (#29789)
When broadcasting shreds, turbine excludes the slot leader from the
random shuffle. Doing so, shreds should never loopback to the leader.
If shreds reaching retransmit stage are from the node's own leader slots
they should not be retransmitted to any nodes.
2023-01-20 17:20:51 +00:00
Wen b36791956e
Ingest duplicate proofs sent through Gossip (#29227)
* First draft of ingesting duplicate proofs in Gossip into blockstore.

* Add more unittests.

* Add more unittests for bad cases.

* Fix lint errors for tests.

* More linter fixes for tests.

* Lint fixes

* Rename get_entries, move location of comment.

* Some renaming changes and comment fixes.

* Fix compile warning, this enum is not used.

* Fix lint errors.

* Slow down cleanup because this could potentially be expensive.

* Forgot to reset cleanup count.

* Add protection against attackers when constructing chunk map when
we ingest Gossip proofs.

* Use duplicate shred index instead of get_entries.

* Rename ClusterInfoDuplicateShredListener and fix a few small problems.

* Use into_shreds to piece together the proof.

* Remove redundant code.

* Address a few small errors.

* Discard slots too advanced in the future.

* - Use oldest proof for each pubkey
- Limit number of pubkeys in each slot to 100

* Disable duplicate shred handling for now.

* Revert "Disable duplicate shred handling for now."

This reverts commit c3fcf403876cfbf90afe4d2265a826f21a5e24ab.
2023-01-19 13:00:56 -08:00
apfitzge 2c347ac0a5
BankingStage Refactor: Move packet receiving and buffering functions to separate module (#29761)
Move packet receiving and buffering functions to separate module
2023-01-19 08:52:32 -08:00
Trent Nelson c4e43f1de4
vote: encapsulate `Lockout` (#29753) 2023-01-18 19:28:28 -07:00
Ryo Onodera 4973fe18f1
Rename banking stage packet receivers consistently (#29752)
Rename banking stage batch receivers consistently
2023-01-19 10:04:55 +09:00
Ryo Onodera 55d743c49a
Rename remaining ones to replay_vote_{sender,receiver} (#29716)
* Rename remaining ones to replay_vote_{sender,receiver}

* Fix typo...
2023-01-18 14:14:04 +09:00
Jeff Biseda f9062718c4
prioritize repair requests by stake (#29730) 2023-01-17 18:38:10 -08:00
Brennan Watt aa40c2b712
Increase turbine propagation const (#29742)
* Increase turbine propagation const

The value is used as a delay threshold for issuing shred repairs, and analysis shows we are overly aggressive in requesting repairs. Shreds show up via turbine before the repair completes the vast majority of the time.

* Use Duration type for MAX_TURBINE_PROPAGATION
2023-01-17 15:01:00 -08:00
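Illustration of the change described in the commit above: a minimal sketch of a propagation threshold expressed as a Duration and used to defer repair requests. The constant's value (200 ms) and the function should_request_repair are assumptions chosen for illustration; the commit message does not state the real value or the real call site.

```rust
use std::time::Duration;

/// Hypothetical value for illustration; the real constant is not given in
/// the commit message.
pub const MAX_TURBINE_PROPAGATION: Duration = Duration::from_millis(200);

/// Returns true once a missing shred has been outstanding for longer than
/// the propagation threshold, i.e. turbine has had a fair chance to deliver
/// it and a repair request is now reasonable.
pub fn should_request_repair(time_since_first_seen: Duration) -> bool {
    time_since_first_seen > MAX_TURBINE_PROPAGATION
}
```

Using Duration rather than a bare integer keeps the unit in the type, so a millisecond value cannot be compared against a second or slot count by accident.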
Jeff Biseda f6fcb14a3e
adjust normalized stake calculation in compute_weight (#29694) 2023-01-17 11:27:57 -08:00
Ryo Onodera 156454c980
Remove PacketDeserializer's extra overflow guard (#29715) 2023-01-17 14:21:17 +09:00
Brooks 0db14ad39c
Removes full_snapshot from CalcAccountsHashConfig (#29722) 2023-01-16 16:22:46 -05:00
behzad nouri 80a39bd6a5
adds feature to (temporarily) drop merkle shreds from testnet (#29711) 2023-01-15 15:41:58 +00:00
behzad nouri 5b5a3ebce8
adds metrics for num merkle shreds on the receiving end (#29710) 2023-01-14 23:07:42 +00:00
Illia Bobyr 59fde130d6
ledger/blockstore: PerfSampleV2: num_non_vote_transactions (#29404)
Store non-vote transaction counts that are now recorded by the banks
into the `blockstore`.

`SamplePerformanceService` now populates `PerfSampleV2` with counts from
the banks.
2023-01-12 19:14:04 -08:00
Jeff Washington (jwash) 544b9745c2
snapshot storage path uses 1 append vec per slot (#29627) 2023-01-11 12:05:15 -08:00
behzad nouri 8c212f59ad
renames ContactInfo to LegacyContactInfo (#29566)
Working towards adding a new ContactInfo where new sockets can be
added in a backward compatible way.
2023-01-08 16:00:55 +00:00
Brian Anderson 43a0745b37
Fix doc warnings (#29537) 2023-01-07 09:24:50 +00:00
behzad nouri 283a2b1540
removes #[allow(clippy::same_item_push)] (#29543) 2023-01-06 17:32:26 +00:00
behzad nouri 12da2da389
fixes errors from clippy::redundant_clone (#29536)
https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone
2023-01-05 18:42:19 +00:00
behzad nouri 5c9beef498
fixes errors from clippy::useless_conversion (#29534)
https://rust-lang.github.io/rust-clippy/master/index.html#useless_conversion
2023-01-05 18:05:32 +00:00
Lijun Wang 1e8a8e07b6
Stream the executed transaction count in the block notification (#29272)
Problem

The plugins need to know when all transactions for a block have been notified in order to serve getBlock requests correctly. Because block and transaction notifications are sent asynchronously to each other, this is difficult to determine.

Summary of Changes

Include the executed transaction count in block notification which can be used to check if all transactions have been notified.
2023-01-05 09:36:19 -08:00
Jeff Biseda 832302485e
require repair request signature, ping/pong for Testnet, Development clusters (#29351) 2023-01-04 14:54:19 -08:00
Illia Bobyr d7bd1bf970
bank: Record non-vote transaction count (#29383)
A subsequent change to `SamplePerformanceService` introduces non-vote transaction counts, which `bank`s need to store.

Part of work on https://github.com/solana-labs/solana/issues/29159
2023-01-03 14:46:20 -08:00
Xiang Zhu 3363c08ac0
Move async remove to snapshot_utils.rs (#29406) 2023-01-03 06:15:32 -08:00
behzad nouri 754ecf467b
generalizes the return type of Shred::get_signed_data (#29446)
The commit adds an associated SignedData type to Shred trait so that
merkle and legacy shreds can return different types for signed_data
method.
This would allow legacy shreds to point to a section of the shred
payload, whereas merkle shreds would compute and return the merkle root.
Ultimately this would allow removing the merkle root from the shreds
binary.
2022-12-31 17:08:25 +00:00
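Illustration of the idea described in the commit above: a minimal sketch of a trait with an associated SignedData type, so that one implementation can borrow a slice of the payload while another returns a computed value. The struct names, SIGNATURE_LEN, and toy_digest are assumptions for illustration; in particular toy_digest is a simple FNV-1a hash standing in for a merkle root and is not how the real root is computed.

```rust
/// Assumed signature size for this sketch (an Ed25519 signature is 64 bytes).
pub const SIGNATURE_LEN: usize = 64;

pub trait Shred<'a> {
    /// Each shred kind chooses its own signed-data representation.
    type SignedData: AsRef<[u8]>;
    fn signed_data(&'a self) -> Self::SignedData;
}

pub struct LegacyShred {
    pub payload: Vec<u8>,
}

pub struct MerkleShred {
    pub payload: Vec<u8>,
}

impl<'a> Shred<'a> for LegacyShred {
    // Borrows the part of the payload after the signature: no copy.
    type SignedData = &'a [u8];
    fn signed_data(&'a self) -> Self::SignedData {
        self.payload.get(SIGNATURE_LEN..).unwrap_or(&[])
    }
}

impl<'a> Shred<'a> for MerkleShred {
    // Returns an owned, computed value instead of a slice of the payload.
    type SignedData = [u8; 8];
    fn signed_data(&'a self) -> Self::SignedData {
        toy_digest(&self.payload)
    }
}

/// Placeholder digest (64-bit FNV-1a). NOT a merkle root; illustration only.
fn toy_digest(bytes: &[u8]) -> [u8; 8] {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash.to_le_bytes()
}
```

Callers that only need bytes can use signed_data().as_ref() for either kind.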
Ashwin Sekar 17b64005d3
Add more logging and documentation to flaky optimistic confirmation tests (#29418)
* Revert "add retry for flakey local cluster test (#29228)"

This reverts commit 7a97121747.

* Add logging for repair
2022-12-27 10:47:45 -07:00
behzad nouri 456d06785d
experiments different turbine fanouts for propagating shreds (#29393)
The commit allocates 2% of slots to running experiments with different
turbine fanouts based on the slot number.
The experiment is feature gated with an additional feature to disable
the experiment.
2022-12-26 14:18:56 +00:00
Ashwin Sekar f2ba16ee87
Plumb dumps from replay_stage to repair (#29058)
* Plumb dumps from replay_stage to repair

When dumping a slot from replay_stage as a result of duplicate or
ancestor hashes, properly update repair subtrees to keep weighting and
forks view accurate.

* add test

* pr comments
2022-12-25 09:58:30 -07:00
behzad nouri 558292466b
rolls back merkle shreds on testnet (#29340)
https://github.com/solana-labs/solana/pull/29339
adds hash domain to merkle shreds. In order to merge that change, need
to temporarily disable merkle shreds on testnet.
2022-12-20 18:33:48 +00:00
Brooks 053775ad77
Elides unnecessary lifetimes (#29299) 2022-12-20 12:44:17 -05:00
Tao Zhu c657f42d77
remove a wrapper function (#29305) 2022-12-19 16:10:16 +00:00
Brennan Watt 86b2e545e1
Prune redundant const SLOT_MS (#29278)
* Alias redundant const SLOT_MS to DEFAULT_MS_PER_SLOT

* Slate SLOT_MS for deprecation

* Add doc comments

Co-authored-by: Brooks Prumo <brooks@prumo.org>
2022-12-16 08:05:09 -08:00
Jeff Biseda a44ea779bd
add support for a repair protocol whitelist (#29161) 2022-12-15 19:24:23 -08:00
dependabot[bot] dca5d7f9b4
chore: bump test-case from 2.1.0 to 2.2.2 (#28184)
Bumps [test-case](https://github.com/frondeus/test-case) from 2.1.0 to 2.2.2.
- [Release notes](https://github.com/frondeus/test-case/releases)
- [Changelog](https://github.com/frondeus/test-case/blob/master/CHANGELOG.md)
- [Commits](https://github.com/frondeus/test-case/compare/v2.1.0...v2.2.2)

---
updated-dependencies:
- dependency-name: test-case
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-13 16:07:50 +00:00
Jeff Washington (jwash) b95835143e
remove AccountsBackgroundService::new(caching_enabled) (#29234) 2022-12-13 07:18:02 -08:00
Jeff Washington (jwash) bb0bfc4214
remove bank_from_latest_snapshot_archives(caching_enabled) (#29238) 2022-12-13 07:16:24 -08:00
Jeff Washington (jwash) fec8f61566
remove ProcessOptions::accounts_db_caching_enabled (#29217) 2022-12-12 20:25:00 -08:00
Jeff Washington (jwash) 2c2324f4ea
remove caching_enabled from Bank::new_with_paths_for_tests (#29214) 2022-12-12 15:30:46 -08:00
Brooks Prumo 1b0aaf1607
Makes a new PathBuf instead of moving the test's TempDir (#29220) 2022-12-12 18:29:36 -05:00
apfitzge 249607dbfe
Use a different tempdir for unpacking snapshots (#29219) 2022-12-12 17:26:52 -06:00
Brooks Prumo 391f68da61
Uses Storages to calculate accounts hash in EAH warp tests (#29192) 2022-12-12 13:30:23 -05:00
Jeff Biseda 88a8f40bd2
apply [limit repairs to top staked... #28673] to non-MainnetBeta clusters (#29163) 2022-12-11 15:52:41 -08:00
behzad nouri 4ee318b2b2
fixes rust code formatting in core/src/consensus.rs (#29204) 2022-12-11 23:20:52 +00:00
Jeff Washington (jwash) 631a98a3b6
warp_from_parents works with write_cache enabled (#29185) 2022-12-09 14:28:18 -08:00
apfitzge cd9f1f1862
Typo/filter_and_forward_with_account_limits (#29183) 2022-12-09 16:22:25 -06:00
Jeff Washington (jwash) 560143a267
remove ValidatorConfig.caching_enabled (#29172) 2022-12-09 11:31:55 -08:00
Lijun Wang ecea802fe6
Bidirectional quic communication support (#29155)
* Support bi-directional quic communication, using the same endpoint for the quic server and client.
This is needed to support using quic for repair.

* Added comments on the bi-directional communication tests

* Removed some debug logs

* clippy issue
2022-12-09 10:59:43 -08:00
Jeff Washington (jwash) 6a90abd056
remove handle_snapshot_requests.caching_enabled (#29174) 2022-12-09 10:51:44 -08:00
Jeff Washington (jwash) 45ba5ef6fd
remove bank_from_snapshot_archives caching_enabled (#29171) 2022-12-09 10:45:21 -08:00
Jeff Washington (jwash) ec5098a723
remove bank_test_config_caching_enabled (#29170) 2022-12-09 08:28:02 -08:00
HaoranYi ca5d8c4b4d
refactor sysmonitor config (#29132) 2022-12-09 07:43:03 -06:00
Jason Davis 049fb3d725 Remove an unnecessary clone of a PohConfig inside Validator::new 2022-12-07 13:03:14 -06:00
Jason Davis 8f24ceffbd Removed Arcs from PohConfig parameters and pass the struct by reference only 2022-12-07 10:52:07 -06:00
HaoranYi 582397ad48
fix solRptLdgrStat thread hang (#29118) 2022-12-06 17:09:56 -06:00
HaoranYi 33b15240ac
Revert #28945 (#29127)
revert #28945
2022-12-06 17:08:56 -06:00
steviez aeb6b53502
Remove unused Option<> around ValidatorConfig's SnapshotConfig (#29090)
Remove Option<> around ValidatorConfig's SnapshotConfig

The SnapshotConfig is required and is currently hard-coded to be a
Some().
2022-12-06 22:47:55 +00:00
behzad nouri df7fd8ae5f
patches rust code formatting in core/src/replay_stage.rs (#29123) 2022-12-06 22:09:57 +00:00
behzad nouri 9524c9dbff patches errors from clippy::uninlined_format_args
https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
2022-12-06 19:32:15 +00:00
behzad nouri 9433c06745 patches errors from clippy::unchecked_duration_subtraction
https://rust-lang.github.io/rust-clippy/master/index.html#unchecked_duration_subtraction
2022-12-06 19:32:15 +00:00
Haoran Yi bbd49acb2f fix merge error 2022-12-06 13:31:50 -06:00
haoran 7a512d7f27 report number of open files 2022-12-06 13:31:50 -06:00
haoran 412cf3df27 sort deps 2022-12-06 13:31:50 -06:00
haoran f716cad4af don't use procfs as it is not supported on mac and windows.
make open_fd stats only on linux platform
2022-12-06 13:31:50 -06:00
haoran fc97d818b6 share code 2022-12-06 13:31:50 -06:00
haoran 36dc3a457f get mmap with wc-l 2022-12-06 13:31:50 -06:00
haoran dd81af9fff increase fd report interval to 3 minutes 2022-12-06 13:31:50 -06:00
haoran 0ce507c20d refactor SystemMonitorReportConfig 2022-12-06 13:31:50 -06:00
Haoran Yi 914f7bd85d fix 2022-12-06 13:31:50 -06:00
Haoran Yi 1635b99486 add mmap file count 2022-12-06 13:31:50 -06:00
Haoran Yi e1ba5a2a63 add monitoring for open file descriptors stat 2022-12-06 13:31:50 -06:00
Tao Zhu 7ed22f7b18
Remove gate from accepting packets for forwarding (#29049) 2022-12-06 12:13:01 -06:00
Jon Cinque b1340d77a2
sdk: Make Packet::meta private, use accessor functions (#29092)
sdk: Make packet meta private
2022-12-06 12:54:49 +01:00
apfitzge fd3b5d08d7
Refactor/banking_stage_make_decision_consume_bank (#28946) 2022-12-02 10:07:01 -06:00
Tao Zhu 5850af5316
Refactor to remove requested_cu from cost_tracker (#29015)
* refactor cost tracker by removing requested_cu from it; call sites use cost_model for consistency

* review fix
2022-12-02 00:25:09 +00:00
steviez 3c42c87098
Remove obsoleted return value from Blockstore insert shred method (#28992) 2022-12-01 11:17:46 -06:00
steviez b6dce6cf3b
Move BlockstoreInsertionMetrics field update to blockstore.rs (#28991)
The num_repair field is the only blockstore insertion metric being updated
outside of the Blockstore::insert() call chain; move the update to insert()
with the rest of the fields in the BlockstoreInsertionMetrics struct.
2022-11-30 11:46:35 -06:00
Ashwin Sekar edacd3c411
Add dump_node to update stake for heaviest subtrees (#28827)
* Add dump_node to update stake for heaviest subtrees

Additionally refactor subtrees to store children as a hashset

* Add a more complicated forks test

* chose -> choose

* remove is_dumped flag and reuse latest_invalid_ancestor instead
2022-11-30 09:26:13 -08:00
apfitzge 4d338ed882
Bugfix/mi_remove_never_entries (#28978) 2022-11-29 16:00:21 -06:00
Ashwin Sekar 0d0a491f27
More documentation + small refactor for RepairService (#28933) 2022-11-28 19:46:06 -08:00
Tao Zhu 9f370475d4
remove obsoleted comment (#28960) 2022-11-28 13:39:40 -06:00
behzad nouri 7d99cddb9f
dedups turbine retransmit peers by tvu socket addresses (#28944)
No need to send duplicate shreds if several nodes have the same tvu
socket address because they are behind a relayer or whatever.
2022-11-28 19:23:02 +00:00
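Illustration of the deduplication described in the commit above: a minimal sketch that keeps only the first peer seen for each tvu socket address. The Peer struct and dedup_by_tvu function are hypothetical names for illustration; the real retransmit code and its tie-breaking order are not shown in the commit message.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical peer record for this sketch.
pub struct Peer {
    pub id: u32,
    pub tvu: SocketAddr,
}

/// Keeps the first peer for each distinct tvu address, preserving order, so
/// the same shred is not sent twice to one socket.
pub fn dedup_by_tvu(peers: Vec<Peer>) -> Vec<Peer> {
    let mut seen = HashSet::new();
    peers
        .into_iter()
        // `insert` returns false when the address was already present.
        .filter(|peer| seen.insert(peer.tvu))
        .collect()
}
```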
HaoranYi 7e87998091
reduce memory usage report freq to 1 per 5s (#28327) 2022-11-28 19:08:06 +00:00
apfitzge bdd162492c
Feature/multi-iterator-scanner-read-locks (#28862) 2022-11-28 11:23:04 -06:00
Brooks Prumo 9327658007
Promotes accounts hash to a strong type (#28930) 2022-11-28 10:09:47 -05:00
Brooks Prumo 638b26ea65
Renames EAH test fn (#28939) 2022-11-23 05:18:50 +00:00
apfitzge 38f7122605
separate make_decision in BankingStage (#28884) 2022-11-22 19:01:09 -06:00
Maximilian Schneider c8b0c3ede9
Update cost model to use requested_cu instead of estimated cu #27608 (#28281)
* Update cost model to use requested_cu instead of estimated cu #27608

* remove CostUpdate and CostModel from replay/tvu

* revive cost update service to send cost tracker stats

* CostModel is now static

* remove unused package

Co-authored-by: Tao Zhu <tao@solana.com>
2022-11-22 11:55:56 -06:00
apfitzge 637e8a937b
clean up: remove my_pubkey arg from consume_buffered_packets (#28888) 2022-11-22 11:40:04 -06:00
Jeff Washington (jwash) 20d8b5e98b
default some tests to write cache = true (#28917) 2022-11-21 15:53:39 -08:00
apfitzge dd723210ca
remove unnecessary clippy attributes (#28891) 2022-11-21 12:54:54 -06:00
behzad nouri d43b001189
rolls out merkle shreds to ~20% of testnet (#28905) 2022-11-21 16:20:02 +00:00
Michael Vines c6927151ef
Sort offline/wrong-shred nodes by stake weight while waiting for supermajority (#28872) 2022-11-18 15:26:21 -08:00
Jeff Washington (jwash) f22104d46b
use write cache by default in some tests (#28876) 2022-11-18 14:35:52 -08:00
apfitzge a636038fff
Clean up: banking_stage_prepare_sanitized_batch (#28841)
Use measure! for bank.prepare_sanitized_batch_with_results
2022-11-18 14:04:44 -06:00
Tyera c32377b5af
Split out quic- and udp-client definitions (#28762)
* Move ConnectionCache back to solana-client, and duplicate ThinClient, TpuClient there

* Dedupe thin_client modules

* Dedupe tpu_client modules

* Move TpuClient to TpuConnectionCache

* Move ThinClient to TpuConnectionCache

* Move TpuConnection and quic/udp trait implementations back to solana-client

* Remove enum_dispatch from solana-tpu-client

* Move udp-client to its own crate

* Move quic-client to its own crate
2022-11-18 12:21:45 -07:00
apfitzge 88e6ea37d9
refactor: move more BankingStage cost_model stuff into qos_service (#28840) 2022-11-17 14:03:17 -06:00
Andrew Fitzgerald ee2f760d3d
MultiIteratorScanner - improve banking stage performance with high contention 2022-11-17 10:54:12 -06:00
Brooks Prumo 2bafb0cb12
Requires EAH state cannot be Invalid (#28817) 2022-11-17 11:01:01 -05:00
Brooks Prumo ae0bb44401
Fixes test_snapshots_with_background_services (#28848) 2022-11-17 00:49:28 -05:00
Jeff Biseda 17ee3349f8
limit repairs to top staked requests in batch (#28673) 2022-11-16 16:30:41 -08:00
Ashwin Sekar ddf4ff2d26
Repair service documentation (#28592)
* repair doc update

* tree_root rename

* remove extra todo
2022-11-16 02:38:07 +00:00
Jeff Biseda e10d958352
signed repair request test fixes/cleanup (#28691) 2022-11-15 16:46:17 -08:00
Brooks Prumo d798e751a0
Disables EAH with short epochs (#28803) 2022-11-15 13:26:19 -05:00
Brooks Prumo d4cf18421d
Use 400 slots-per-epoch in EAH tests (#28801) 2022-11-14 17:49:20 -05:00
Brooks Prumo 0bfea02056
Snapshots wait for EAH calculations to complete (#28777) 2022-11-14 11:34:44 -06:00
Tao Zhu e5ae0b3371
check is_forwarded packet earlier (#28159)
* check and filter is_forwarded packet earlier

* review fix: renaming; and rebase
2022-11-11 23:32:03 +00:00
Brooks Prumo 4d6653598b
Upgrades to Rust 1.65.0 (#28741) 2022-11-09 17:15:03 -05:00
Brooks Prumo d1ba42180d
clippy for rust 1.65.0 (#28765) 2022-11-09 19:39:38 +00:00
Brooks Prumo 9e1cdc7e60
Enables not taking a bank snapshot (#28756) 2022-11-09 12:43:33 -05:00
Brooks Prumo d4c2900590
Removes `snapshot_bank()` wrapper fn (#28753) 2022-11-07 15:09:31 +00:00
Brooks Prumo 0b9426e734
Simplifies AHV's `test_max_hashes()` (#28754) 2022-11-07 02:32:33 +00:00
Brooks Prumo 064cfc70d2
Removes cluster_type from AccountsPackage (#28725) 2022-11-02 18:21:13 -04:00
Brooks Prumo d0f639745a
Uses AccountsPackage::default_for_tests() in AHV tests (#28723) 2022-11-02 14:13:35 -04:00
Lijun Wang f156bc12ca
Enforce stream receive timeout (#28513)
In the QUIC server's handle_connection, when receiving chunks timed out, we looped forever waiting for the chunk. If the client never provides another chunk, the server can wait for it indefinitely, wasting server resources. Instead, WAIT_FOR_CHUNK_TIMEOUT_MS is introduced to bound this wait to 10 seconds at most. The stream is dropped if it times out.
2022-11-02 10:09:32 -07:00
Brooks Prumo 59bf1809fe
Uses SnapshotHash type in snapshot archive fields (#28681) 2022-10-31 14:28:35 -04:00
Dmitri Makarov 34865d032c chore: update Solana docs and code comments that specify "BPF" to "SBF" 2022-10-31 14:14:25 -04:00
Brooks Prumo 37507a2de6
Removes EAH parameter from serde_snapshot::reserialize_bank() (#28669) 2022-10-31 09:43:17 -04:00
sakridge 340ad68223
Banking stage refactor commit transactions (#28660)
* Refactor commit transactions step

* Cleanup token pre-balances

* Collect prebalances together

* Collect pre/post balances in separate function

* Fix clippy
2022-10-29 21:36:57 +02:00
steviez 6b93d05c37
Add LedgerCleanupService::find_slots_to_clean() test (#28656)
Add a test to better exercise find_slots_to_clean(), as well as a minor
bug fix to this method that was found as a result of writing the test.
2022-10-29 00:55:21 +02:00
apfitzge 22ce49ae7f
Maintain original queue capacity for unprocessed packet buffer (#28661) 2022-10-28 16:37:21 -05:00
apfitzge 0a148b2bf7
remove unused handle_retryable_packets_elapsed (#28355) 2022-10-28 16:36:41 -05:00
Brooks Prumo 5a3d252899
Renames fn to Bank::update_accounts_hash_for_tests() (#28620) 2022-10-28 14:33:05 -04:00
steviez 2272fd807e
Remove Blockstore manual compaction code (#28409)
The manual Blockstore compaction that was being initiated from
LedgerCleanupService has been disabled for quite some time in favor of
several optimizations.

Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
2022-10-28 10:39:00 +02:00
Ashwin Sekar ae557a9eb5
Exit when stuck in an unrecoverable repair/purge loop (#28596)
* Exit when stuck in an unrecoverable repair/purge loop

* add tests
2022-10-27 20:06:06 -07:00
apfitzge 340d3b5468
rename and change capacity on unprocessed transaction storage - max_receive_size (#28586) 2022-10-26 10:03:47 -05:00
Brooks Prumo f158bab0ef
Tracks how long background requests wait before processing (#28581) 2022-10-25 12:10:53 -04:00
Brooks Prumo bc02789c43
Renames fn to calculate_accounts_hash_from_storages() (#28566) 2022-10-24 21:07:00 -04:00
Brooks Prumo 2354a0a343
Renames fn to calculate_accounts_hash_from_index() (#28568) 2022-10-24 19:20:08 -04:00
Ashwin Sekar 9eafad467c
Add convenience methods to VoteInstruction to distinguish vote types (#28526)
* Add convenience methods to VoteInstruction to distinguish vote types

* use matches! macro instead
2022-10-21 14:17:40 -06:00
Ashwin Sekar f207af765e
Split out voting and banking threads in banking stage (#27931)
* Split out voting and banking threads in banking stage

Additionally this allows us to aggressively prune the buffer for voting threads
as with the new vote state only the latest vote from each validator is
necessary.

* Update local cluster test to use new Vote ix

* Encapsulate transaction storage filtering better

* Address pr comments

* Commit cargo lock change

* clippy

* Remove unsafe impls

* pr comments

* compute_sanitized_transaction -> build_sanitized_transaction

* &Arc -> Arc

* Move test

* Refactor metrics enums

* clippy
2022-10-20 21:10:48 +00:00
Jeff Biseda 0df4be06a0
enable repair ping/pong cache (#28408) 2022-10-19 14:55:55 -07:00
Brooks Prumo 12f3e8c9cc
Ignores errors when joining background threads in snapshot tests (#28480) 2022-10-19 16:54:59 -04:00
carllin 274d9ea607
Check for valid address in broadcast (#28432)
Check for valid address
2022-10-19 14:49:22 -05:00
HaoranYi d81d2bba59
comments out print in test (#28475) 2022-10-19 10:25:11 -05:00
Brooks Prumo 1cc9cf927c
Supports warping with Epoch Accounts Hash (#28459) 2022-10-19 10:37:14 -04:00
behzad nouri e283461d99
enforces hash domain for ping-pong protocol (#28433)
https://github.com/solana-labs/solana/pull/27193
added hash domain to ping-pong protocol.
For backward compatibility responses both with and without domain were
generated and accepted.
Now that all clusters are upgraded, this commit enforces the hash domain
by removing the response without the domain.
2022-10-18 18:17:12 +00:00
Jeff Washington (jwash) 28a89a1d99
remove expected rent collection and rehashing completely (#28422) 2022-10-17 07:24:42 -07:00
steviez 39fa297bf6
Report total_transactions in replay-slot-stats (#28382)
We have transactions counted in replay-slot-end-to-end-stats, but that
metric is broken down to report things per thread.

So, report total_transactions for the entire slot (all threads) in
replay-slot-stats.
2022-10-15 14:07:03 +01:00
Brooks Prumo 31c2b29941
Sends both an EAH and a snapshot request from `set_root()` (#28363) 2022-10-14 11:00:04 -04:00
Brooks Prumo dd7fee8f32
Re-enqueues unhandled ABS requests (#28362) 2022-10-13 16:25:39 -04:00
Brooks Prumo 9cbd00fdbc
Converts PendingAccountsPackage to a channel (#28352) 2022-10-13 12:47:36 -04:00
Jason Davis e2fc9d51de Increase cpu metric reporting interval from 1s to 10s 2022-10-11 10:44:59 -05:00
Jeff Biseda 15050b14b9
use signed repair request variants (#28283) 2022-10-10 14:09:45 -07:00
Brooks Prumo 5a08eed82d
Cleans up debugging code in EAH tests (#28324) 2022-10-10 16:07:55 +00:00
Brooks Prumo 27cd2c324e
Adds tests for EAH and snapshot interactions (#28304) 2022-10-10 10:16:13 -04:00
Tao Zhu 50985f79a1
Correctly mark packets as forwarded (#28161)
Only mark packets accepted for forwarding as `forwarded`
2022-10-07 11:50:57 -05:00
Tao Zhu 0324573667
report additional transaction errors to metrics (#28285) 2022-10-07 10:36:22 -05:00
Brooks Prumo 981c9d07a4
Rearranges eah TestEnvironment fields to ensure drop order (#28270) 2022-10-06 16:17:32 -04:00
Brooks Prumo 2d936784dd
Ignore errors when joining background threads for EAH tests (#28263) 2022-10-06 18:43:56 +00:00
Brooks Prumo a8c6a9e5fc
Bank::freeze() waits for EAH calculation to complete (#28170) 2022-10-05 17:44:35 -04:00
Jason Davis c899ededfc Minor refactoring and cleaning of cpuid code 2022-10-05 11:43:27 -05:00
Jason Davis 3b2ab313de Use num-enum crate to make everything typesafe 2022-10-05 11:43:27 -05:00
Jason Davis 1e1455688d Convert magic numbers to named constants 2022-10-05 11:43:27 -05:00
Jason Davis fac772ff90 Update naming, style after PR review comments 2022-10-05 11:43:27 -05:00
Jason Davis 13b095b4ab Fix a fmt problem, I think. Shows up in the git check, but not when I run here 2022-10-05 11:43:27 -05:00
Jason Davis c8584b0cdd Cargo fmt applied 2022-10-05 11:43:27 -05:00
Jason Davis d841286c21 Add cpuid calls and metric reporting; change cpu info sampling interval from 1s to 10s 2022-10-05 11:43:27 -05:00
Jeff Biseda e3e888c0e0
stats for staked/unstaked repair requests (#28215) 2022-10-04 17:37:24 -07:00
behzad nouri 9e7a0e7420
rolls out merkle shreds to ~5% of testnet (#28199) 2022-10-04 19:36:16 +00:00
carllin 14a415ccf3
Consensus Logging (#28176) 2022-10-03 20:45:55 -05:00
haoran c4aab3f178 typo 2022-10-03 09:41:15 -05:00
Justin Starry c2bb2b8e60
Allow validators to reset to the slot which matches their last voted slot (#28172)
* Add failing test

* Allow resetting to duplicate was confirmed

* feedback

* feedback

* bump

* simplify change

* Revert "simplify change"

This reverts commit 72e5de3e5bdac595f71dc7fc01650ca3bc7da98e.

* update comment

* Update core/src/replay_stage.rs
2022-10-03 16:49:47 +08:00
Yueh-Hsuan Chiang 6b17bee5a8
Remove the const default for RocksFifo (#27965)
#### Summary of Changes
Removes the constant default for ShredStorageType::RocksFifo
as the shred storage size is either user-specified or derived
from --limit-ledger-size in #27459.
2022-10-01 15:10:54 -07:00
Brooks Prumo 8877ac2aa9
Fix call to calculate_accounts_hash() (#28169) 2022-09-30 15:29:18 -04:00
Brooks Prumo 2f8f6c6a31
Send Epoch Accounts Hash requests from set_root() (#27764) 2022-09-30 14:59:41 -04:00
Jeff Washington (jwash) cfc124c825
acct idx can no longer use write cache (#28150) 2022-09-30 10:55:27 -07:00
apfitzge 82558226f7
ImmutableDeserializedPacket rc to arc (#28145) 2022-09-30 12:07:48 -05:00
Tao Zhu 82e65593ee
Batch filtering invalid transactions before forwarding (#26798)
- Batch filtering invalid transactions (fail to sanitize, too old or already processed) before forwarding
- Combine packet filtering and forwarding to share sanitized transactions
- `iter_desc` is no longer needed, remove it;
- Add a method to share the logic of removing packets from buffer after they were removed from MinMaxHeap
- Add test coverage for forward_packet_batches_by_accounts
- rebase, resolve conflicts
2022-09-29 16:33:40 -05:00
Ashwin Sekar 84acef007c
Add bench test for voting threads (#28031) 2022-09-27 12:12:22 -07:00
Jeff Biseda 8b0f9b4917
make ping cache rate limit delay configurable (#27955) 2022-09-26 14:16:56 -07:00
behzad nouri f49beb0cbc
caches reed-solomon encoder/decoder instance (#27510)
ReedSolomon::new(...) initializes a matrix and a data-decode-matrix cache:
https://github.com/rust-rse/reed-solomon-erasure/blob/273ebbced/src/core.rs#L460-L466

In order to cache this computation, this commit caches the reed-solomon
encoder/decoder instance for each (data_shards, parity_shards) pair.
2022-09-25 18:09:47 +00:00
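The caching idea in the commit above (one encoder/decoder instance per (data_shards, parity_shards) pair) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Codec` and `CodecCache` are hypothetical stand-ins, and `Codec::new` only mimics an expensive constructor rather than calling the reed-solomon-erasure crate.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an encoder/decoder that is expensive to build.
/// It is not the reed-solomon-erasure API.
#[derive(Debug)]
pub struct Codec {
    pub data_shards: usize,
    pub parity_shards: usize,
}

impl Codec {
    fn new(data_shards: usize, parity_shards: usize) -> Self {
        // A real constructor would initialize the encoding matrix here.
        Self { data_shards, parity_shards }
    }
}

/// Caches one shared codec per (data_shards, parity_shards) pair.
#[derive(Default)]
pub struct CodecCache {
    entries: Mutex<HashMap<(usize, usize), Arc<Codec>>>,
}

impl CodecCache {
    /// Returns the cached codec for the pair, building it on first use only.
    pub fn get(&self, data_shards: usize, parity_shards: usize) -> Arc<Codec> {
        let mut entries = self.entries.lock().unwrap();
        entries
            .entry((data_shards, parity_shards))
            .or_insert_with(|| Arc::new(Codec::new(data_shards, parity_shards)))
            .clone()
    }

    /// Number of distinct (data_shards, parity_shards) pairs cached so far.
    pub fn cached_pairs(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}
```

Under these assumptions, two calls to `get(32, 32)` hand back the same `Arc`, so the matrix setup cost is paid once per pair.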
Jeff Biseda 9816c94d7e
metrics to distinguish why repair packets are dropped (#27960) 2022-09-24 23:20:05 -07:00
Jeff Biseda 8b43215ddd
count unsigned repair requests (#27953) 2022-09-24 12:56:02 -07:00
Tao Zhu e51cf46d6b
Remove priority from vote transactions (#28030)
vote transactions have same priority fee
2022-09-24 00:31:50 +00:00
behzad nouri 9ee53e594d
patches clippy errors from new rust nightly release (#28028) 2022-09-23 20:57:27 +00:00
Brooks Prumo d9b31fd6b0
ahv: Add debug logging for EAH (#27998) 2022-09-23 14:04:48 -04:00
Jeff Biseda 206cc9407b
allow unsigned repair requests (#27910) 2022-09-23 10:11:08 -07:00
behzad nouri 97c9af4c6b plumbs through flag to generate merkle variant of shreds 2022-09-23 16:45:18 +00:00
steviez e4affb9fea
Add Blockstore::highest_slot() method (#27981) 2022-09-23 04:53:43 -05:00
behzad nouri 9a57c64f21
patches clippy errors from new rust nightly release (#27996) 2022-09-22 22:23:03 +00:00
Brooks Prumo ff71df4695
Remove unnecessary call to `set_startup_verification_complete()` (#27986) 2022-09-22 16:54:17 -04:00
Brooks Prumo 1ee595ca9c
remove AccountsDb::initial_blockstore_processing_complete (#27974) 2022-09-22 13:52:04 -04:00
dependabot[bot] c4fa849844
chore: bump itertools from 0.10.3 to 0.10.5 (#27962)
* chore: bump itertools from 0.10.3 to 0.10.5

Bumps [itertools](https://github.com/rust-itertools/itertools) from 0.10.3 to 0.10.5.
- [Release notes](https://github.com/rust-itertools/itertools/releases)
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/commits)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-09-21 15:28:49 -06:00
dependabot[bot] 11b7c45bff
chore: bump systemstat from 0.1.11 to 0.2.0 (#27967)
Bumps [systemstat](https://github.com/unrelentingtech/systemstat) from 0.1.11 to 0.2.0.
- [Release notes](https://github.com/unrelentingtech/systemstat/releases)
- [Commits](https://github.com/unrelentingtech/systemstat/compare/v0.1.11...v0.2.0)

---
updated-dependencies:
- dependency-name: systemstat
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-21 15:28:31 -06:00
Jeff Washington (jwash) f2d6a7ecea
bank.initial_blockstore_processing_complete to avoid concurrent hash calculations (#27776)
* bank.initial_blockstore_processing_complete to avoid concurrent hash calculations

* Update runtime/src/bank.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Update runtime/src/bank.rs

Co-authored-by: Brooks Prumo <brooks@prumo.org>

* Rename TestValidator::set_startup_verification_complete()

* Initialize with `AtomicBool::new(false)` instead of `default()`

* snapshot tests: move where `initial_blockstore_processing_completed()` is called

* fixup bank_forks.rs calling `is_initial_blockstore_processing_complete()`

* only call initial_blockstore_processing_completed() in blockstore_processor

Co-authored-by: Brooks Prumo <brooks@prumo.org>
Co-authored-by: Brooks Prumo <brooks@solana.com>
2022-09-19 13:00:21 -07:00
behzad nouri abfb996135
tracks number of staked/stale/dead nodes in turbine cluster-nodes (#27915) 2022-09-19 18:16:04 +00:00
Ashwin Sekar 9119dc13ec
Add structure to house unprocessed transactions in banking_stage (#27777)
Separate storage for voting and transaction threads:
- Voting threads utilize a shared reference in order to dedup extraneous
  votes
- Transactions have thread local storage like before
2022-09-14 10:40:44 -07:00
Ashwin Sekar c74df830b1
Add structure to collect and coalesce vote packets (#27558)
* Add structure to collect and coalesce vote packets

Will be used in banking stage to throw out extraneous vote packets
before processing

* pr comments

* Update inner lock to arc to improve performance
2022-09-14 00:44:26 -07:00
Will Hickey c0e4379f43
Whickey/version v1.15 (#27739)
* Bump version to v1.13.0
* Bump version to v1.14.0
* Bump version to v1.15.0
2022-09-13 09:06:15 -05:00
apfitzge 079bf561b0
Clean_up/upb_push_comment (#27707) 2022-09-12 18:59:41 -05:00
Jeff Washington (jwash) 765c628546
use exit signal for acct idx bg threads (#27483) 2022-09-12 11:51:12 -07:00
behzad nouri 4f22ee8f9b uses varint encoding for vote-state lockout offsets
The commit removes CompactVoteStateUpdate and instead reduces serialized
size of VoteStateUpdate using varint encoding for vote-state lockout
offsets.
2022-09-12 16:31:20 +00:00
Christian Kamm 90b8a3a44d
Remove KeypairInsecureClone trait and add insecure_clone() instead (#27396)
See discussion in #26248
2022-09-12 14:59:41 +00:00
Michael Vines 83d4d128c2 Add --process-ledger-before-service flag to solana-validator 2022-09-11 07:58:42 -07:00
Jeff Washington (jwash) abd01553d5
tests: Keypair::new().pubkey() -> pubkey::new_rand (#27705) 2022-09-10 13:56:45 -07:00
Jeff Washington (jwash) 1f00b468e5
add enable_rehashing to AccountsPackage (#27644) 2022-09-08 09:25:25 -07:00
apfitzge a9c5adbf88
UnprocessedPacketBatches pop_max fn are only used in tests (#27645) 2022-09-08 11:01:14 -05:00
Maximilian Schneider cc58968b76
add new leader slot metric to track account contention throttling (#27654) 2022-09-08 09:22:58 -05:00
dependabot[bot] f338aa62ba
chore: bump serde from 1.0.143 to 1.0.144 (#27511)
* chore: bump serde from 1.0.143 to 1.0.144

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.143 to 1.0.144.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.143...v1.0.144)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-09-07 16:54:44 -06:00
Xiang Zhu 4308c300b4
In ledger-tool delete the account files in the async way (#27622)
* In ledger-tool delete the account files in the async way

* format changes by ./cargo nightly fmt --all
2022-09-07 14:35:06 -07:00
Brooks Prumo 6a322de845
Make Accounts Background Services aware of Epoch Accounts Hash (#27626) 2022-09-07 20:41:40 +00:00
Lijun Wang 7f223dc582
Added option to turn on UDP for TPU transaction and make UDP based TPU off by default (#27462)
--tpu-enable-udp is introduced. When it is on, transaction receiving and transaction forwarding are enabled over UDP.

Except for a few tests that were hard-coded to send transactions over UDP, most tests now run with UDP-based TPU disabled.
2022-09-07 13:19:14 -07:00
apfitzge c04747dd66
cluster_slot_state_verifier: clippy nightly fixes (#27521)
clippy nightly fixes
2022-09-07 15:04:56 -05:00
apfitzge 1465ec947d
replay_stage: clippy nightly fixes (#27520)
clippy nightly fixes
2022-09-07 15:04:46 -05:00
apfitzge 452866dbcf
shredder: clippy nightly fixes (#27522)
clippy nightly fixes
2022-09-07 15:04:32 -05:00
apfitzge d6a1e7498f
Add tests for deserialize_and_collect_packets (#27623) 2022-09-07 12:52:18 -05:00
Jeff Washington (jwash) 22007a3c96
allow accounts hash calc to specify enable_rehashing (#27615) 2022-09-07 10:16:52 -07:00
Jeff Washington (jwash) a31d4a597d
serialize epoch_accounts_hash (#27516) 2022-09-07 10:07:00 -07:00
Brooks Prumo 93a4f80a2c
Handling snapshot requests is now required (#27537) 2022-09-07 10:08:42 -04:00
Jeff Biseda 269eb519dd
track time to coalesce entries in recv_slot_entries (#27525) 2022-09-06 16:07:17 -07:00
apfitzge a67d56f462
refactor: add function for deserializing and collecting packets - separate from channel receive (#27548) 2022-09-06 15:54:31 -05:00
Brooks Prumo 6684c62280
Add SnapshotUsage to SnapshotConfig (#27508) 2022-09-02 08:56:23 -04:00
Brennan Watt 242c9cb442
RPC Notifier Signal when Setup Complete (#27481)
* RPC notifier signal when ready
2022-09-01 16:39:55 -07:00
Tyera Eulberg 9b8bed86f9
Add getRecentPrioritizationFees RPC endpoint (#27278)
* Plumb priority_fee_cache into rpc

* Add PrioritizationFeeCache api

* Add getRecentPrioritizationFees rpc endpoint

* Use MAX_TX_ACCOUNT_LOCKS to limit input keys

* Remove unused cache apis

* Map fee data by slot, and make rpc account inputs optional

* Add priority_fee_cache to rpc test framework, and add test

* Add endpoint to jsonrpc docs

* Update docs/src/developing/clients/jsonrpc-api.md

* Update docs/src/developing/clients/jsonrpc-api.md
2022-09-01 23:12:12 +00:00
apfitzge 3bdc5b3f2b
separate packet_deserializer inside banking_stage (#27120)
* separate packet_deserializer inside banking_stage

* Make ReceivePacketResults into a struct with named fields
2022-09-01 10:00:48 -05:00
dependabot[bot] 66717ff87d
chore: bump chrono from 0.4.21 to 0.4.22 (#27509)
* chore: bump chrono from 0.4.21 to 0.4.22

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.21 to 0.4.22.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/v0.4.22/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.21...v0.4.22)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-08-31 22:39:12 +00:00
Tao Zhu 8bb039d08d
collect min prioritization fees when replaying sanitized transactions (#26709)
* Collect blocks' minimum prioritization fees when replaying sanitized transactions

* Limits block min-fee metrics reporting to top 10 writable accounts

* Add service thread to asynchronously update and finalize prioritization fee cache

* Add bench test for prioritization_fee_cache

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2022-08-31 08:00:55 -05:00
Tyera Eulberg 7bd08ad3ae
Featurize spinner in rpc-client and tpu-client (#27381)
* Add spinner features to rpc-client and tpu-client, and disable where unneeded

* Add doc comment
2022-08-30 09:56:56 -06:00
Haoran Yi 5b64107626 make pruned_bank channel unbounded.
In kin-sim, we found that a bounded channel causes the accounts
background services to halt. As the number of accounts grows, the time for
pruning and cleaning increases, which leads to longer intervals
between the pruning of dead bank slots. With 1.7B accounts, we will
exceed the 10K bounded channel threshold, which causes the accounts
background services to halt. Without pruning, the node will eventually run out
of memory.
2022-08-29 19:06:30 -05:00
Brooks Prumo 3c7cd62030
Move pruned_banks_receiver into PrunedBanksRequestHandler (#27445) 2022-08-29 13:30:06 -04:00
Brennan Watt 46a48760db
Switch concurrent replay from feature to param (#27401)
* Switch concurrent replay from feature to param
2022-08-26 12:36:02 -07:00
Will Hickey 5eefc256d6
Fix startup panic if removing accounts directory fails (#27386)
* Remove contents of accounts directory if deleting the directory fails.
2022-08-25 20:35:12 -05:00
Trent Nelson b1cff5d740 make fatal log message sound fatal 2022-08-25 21:49:12 +00:00
Jeff Biseda d1522fc790
coalesce entries in recv_slot_entries to target byte count (#27321) 2022-08-25 13:51:55 -07:00
Jeff Washington (jwash) 2da93bd45a
add text to assert (#27377) 2022-08-24 14:11:53 -05:00
Tyera Eulberg b8b3d723da
Use new client crates (#27360)
* Update ancillary cli crates

* Update cli

* Update command-line tools

* Update rpc, etc

* Update client-test

* Update core, validator

* Update local-cluster
2022-08-24 10:47:02 -06:00
Ashwin Sekar efa6201eda
Check overflow on vote tx compaction boundary (#27185)
* Check overflow on vote tx compaction boundary

Check for overflow during the conversion between VoteStateUpdate and
CompactVoteStateUpdate.

* Try removing clippy suppress
2022-08-23 22:29:03 -07:00
Xiang Zhu 827d8e4bc0
Fallback to synchronous rm_dir call if path moving fails (#27306)
Remove some log lines, as suggested in PR #26910
2022-08-22 22:47:39 -07:00
Brennan Watt e4a7d01e10
Rust v1.63 (#27303)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

* Update quinn-udp crate

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-22 18:01:03 -07:00
Michael Vines 7bdeea10ad Assign custom names to the Rayon global thread pool 2022-08-22 17:56:55 +00:00
Michael Vines 7c01c1ecc6
Update delete_path thread name 2022-08-22 09:01:26 -07:00
Jeff Washington (jwash) fc1a4dd11a
run hash calc with index on failure (#27279) 2022-08-22 10:58:04 -05:00
Michael Vines 3f4731b37f Standardize thread names
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
2022-08-20 07:49:39 -07:00
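The three tenets in the commit above can be expressed as a small check. This is only a sketch: `follows_thread_name_tenets` is a hypothetical helper, and "Camel case" is approximated here as "an uppercase letter right after the `sol` prefix, and only ASCII letters or digits", which is an assumption rather than a rule stated in the commit.

```rust
/// Maximum thread-name length from tenet 1.
const MAX_THREAD_NAME_LEN: usize = 15;

/// Returns true if `name` satisfies the three tenets, under the assumption
/// that "Camel case" means no separators and an uppercase letter after "sol".
pub fn follows_thread_name_tenets(name: &str) -> bool {
    // Tenet 1: at most 15 characters.
    let within_limit = name.len() <= MAX_THREAD_NAME_LEN;

    // Tenet 2: every name is prefixed with "sol".
    let rest = match name.strip_prefix("sol") {
        Some(rest) => rest,
        None => return false,
    };

    // Tenet 3 (approximation): no '_' or '-' separators, capitalized first word.
    let starts_upper = rest.chars().next().map_or(false, |c| c.is_ascii_uppercase());
    let no_separators = rest.chars().all(|c| c.is_ascii_alphanumeric());

    within_limit && starts_upper && no_separators
}
```

For example, a made-up name such as `solReplayStage` passes, while `sol-replay-stage` fails the separator check and exceeds the length limit.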
Xiang Zhu c54824e4f5
Account files remove (#26910)
* Create a new function cleanup_accounts_paths, a trivial change

* Remove account files asynchronously

* Update and simplify the implementation after the validator test runs.

* Fixes after testing on the dev device

* Discard tokio.  Use thread instead

* Fix comments format

* Fix config type to pass the github test

* Fix failed tests.  Handle the case of non-existing path

* Final cleanup, addressing the review comments
Avoided OsString.
Made the function more generic with "impl AsRef<Path>"

Co-authored-by: Jeff Washington <jeff.washington@solana.com>
2022-08-19 23:56:52 -07:00
apfitzge eb06bb61e8
banking stage: actually aggregate tracer packet stats (#27118)
* aggregated_tracer_packet_stats_option was always None

* Actually accumulate tracer packet stats
2022-08-19 15:16:56 -05:00
Will Hickey dba2fd5a16
Enable QUIC client by default. Add arg to disable QUIC client. (Forward port #26927) (#27194)
Enable QUIC client by default. Add arg to disable QUIC client.

* Enable QUIC client by default. Add arg to disable QUIC client.
* Deprecate --disable-quic-servers arg
* Add #[ignore] annotation to failing tests
2022-08-19 09:15:15 -05:00
Brennan Watt 7573000d87
Revert "Rust v1.63.0 (#27148)" (#27245)
This reverts commit a2e7bdf50a.
2022-08-19 09:19:44 +01:00
behzad nouri 6928b2a5af
adds hash domain to ping-pong protocol (#27193)
In order to maintain backward compatibility, for now the responding node
will hash the token both with and without domain so that the other node
will accept the response regardless of its upgrade status.
Once the cluster has upgraded to the new code, we will remove the legacy
domain = false case.
2022-08-18 22:39:31 +00:00
Brennan Watt a2e7bdf50a
Rust v1.63.0 (#27148)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-17 15:48:33 -07:00
behzad nouri fea66c8b63
derives Error trait for ClusterInfoError and core::result::Error (#27208) 2022-08-17 22:01:51 +00:00
Jeff Washington (jwash) 225cddcffb
serialize incremental_snapshot_hash (#26839)
* serialize incremental_snapshot_hash

* pr feedback
2022-08-17 15:14:31 -05:00
behzad nouri 3b87aa9227
reverts wide fanout in broadcast when the root node is down (#26359)
A change included in
https://github.com/solana-labs/solana/pull/20480
was that when the root node in turbine broadcast tree is down, the
leader will broadcast the shred to all nodes in the first layer.
The intention was to mitigate the impact of dead nodes on shred
propagation, because if the root node is down, then the entire cluster
will miss out on the shred.
On the other hand, if x% of stake is down, this will cause 200*x% + 1
packets/shreds ratio at the broadcast stage which might contribute to
line-rate saturation and packet drop.
To avoid this bandwidth saturation issue, this commit reverts that logic
and always broadcasts shreds from the leader only to the root node.
As before we rely on erasure codes to recover shreds lost due to staked
nodes being offline.
2022-08-16 19:40:06 +00:00
dependabot[bot] a0d1f4ef88
chore: bump serial_test from 0.8.0 to 0.9.0 (#27097)
Bumps [serial_test](https://github.com/palfrey/serial_test) from 0.8.0 to 0.9.0.
- [Release notes](https://github.com/palfrey/serial_test/releases)
- [Commits](https://github.com/palfrey/serial_test/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: serial_test
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-15 15:53:12 -06:00
Justin Starry bdce208fe5
clean feature: `request_units_deprecated` (#27102)
clean feature: request_units_deprecated
2022-08-13 13:12:35 +01:00
Jeff Biseda e50013acdf
Handle JsonRpcService startup failure (#27075) 2022-08-11 23:25:20 -07:00
janlegner fc6cee9c06
allow staked nodes weight override (#26870)
* Allowed staked nodes weight override (#26407)

* Allowed staked nodes weight override, passing only HashMap over to core module

Co-authored-by: Ondra Chaloupka <chalda@chainkeepers.io>
2022-08-11 14:34:04 -07:00
dependabot[bot] f641d3bad6
chore: bump chrono from 0.4.19 to 0.4.21 (#27076)
* chore: bump chrono from 0.4.19 to 0.4.21

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.19 to 0.4.21.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.19...v0.4.21)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <you@example.com>
2022-08-11 09:28:50 -06:00
behzad nouri ac91cdab74
removes buffering when generating coding shreds in broadcast (#25807)
Given the 32:32 erasure recovery schema, current implementation requires
exactly 32 data shreds to generate coding shreds for the batch (except
for the final erasure batch in each slot).
As a result, when serializing ledger entries to data shreds, if the
number of data shreds is not a multiple of 32, the coding shreds for the
last batch cannot be generated until there are more data shreds to
complete the batch to 32 data shreds. This adds latency in generating
and broadcasting coding shreds.

In addition, with Merkle variants for shreds, data shreds cannot be
signed and broadcasted until coding shreds are also generated. As a
result *both* code and data shreds will be delayed before broadcast if
we still require exactly 32 data shreds for each batch.

This commit instead always generates and broadcasts coding shreds as soon
as any number of data shreds is available. When serializing entries
to shreds:
* if the number of resulting data shreds is less than 32, then more
  coding shreds will be generated so that the resulting erasure batch
  has the same recovery probabilities as a 32:32 batch.
* if the number of data shreds is more than 32, then the data shreds are
  split uniformly into erasure batches with _at least_ 32 data shreds in
  each batch. Each erasure batch will have the same number of code and
  data shreds.

For example:
* If there are 19 data shreds, 27 coding shreds are generated. The
  resulting 19(data):27(code) erasure batch has the same recovery
  probabilities as a 32:32 batch.
* If there are 107 data shreds, they are split into 3 batches of 36:36,
  36:36 and 35:35 data:code shreds each.

A consequence of this change is that code and data shred indices will
no longer align, as there will be more coding shreds than data shreds
(not only in the last batch in each slot but also in the intermediate
ones).
2022-08-11 12:44:27 +00:00
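The uniform split described in the commit above (107 data shreds becoming batches of 36, 36 and 35) can be sketched as below. `split_data_shreds` is a hypothetical helper written for illustration, not the function from the commit. It only covers how data shreds are divided; the number of coding shreds for batches smaller than 32 (such as 27 for 19 data shreds) is not derived here, since the commit message does not give the rule for it.

```rust
/// Nominal number of data shreds per erasure batch (the 32:32 schema).
const DATA_SHREDS_PER_BATCH: usize = 32;

/// Splits `num_data_shreds` into erasure batches as evenly as possible,
/// keeping at least 32 data shreds per batch whenever 32 or more exist.
/// Returns the number of data shreds in each batch.
pub fn split_data_shreds(num_data_shreds: usize) -> Vec<usize> {
    if num_data_shreds == 0 {
        return Vec::new();
    }
    // Fewer than 32 data shreds still form a single, smaller batch.
    let num_batches = (num_data_shreds / DATA_SHREDS_PER_BATCH).max(1);
    let base = num_data_shreds / num_batches;
    let extra = num_data_shreds % num_batches;
    // The first `extra` batches take one additional shred each.
    (0..num_batches)
        .map(|i| if i < extra { base + 1 } else { base })
        .collect()
}
```

Because the batch count is the floor of `num_data_shreds / 32`, every batch ends up with at least 32 data shreds once there are 32 or more to distribute.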
Michael Vines 4e79d78629 `solana-validator monitor` now displays slot and gossip stake % while waiting for supermajority 2022-08-10 11:13:25 -07:00
dependabot[bot] e3a8d2ecdd
chore: bump serde_json from 1.0.81 to 1.0.83 (#27036)
* chore: bump serde_json from 1.0.81 to 1.0.83

Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.81 to 1.0.83.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.81...v1.0.83)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-08-10 09:45:42 -06:00
apfitzge c03f3b1436
Separate file for ImmutableDeserializedPacket type (#26951) 2022-08-09 22:39:01 -07:00
dependabot[bot] ae5b680c6f
chore: bump serde from 1.0.138 to 1.0.143 (#27015)
* chore: bump serde from 1.0.138 to 1.0.143

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.138 to 1.0.143.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.138...v1.0.143)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-08-10 05:08:43 +00:00
Jeff Biseda 370de8129e
ancestor hashes socket ping/pong support (#26866) 2022-08-09 21:39:55 -07:00
Michael Vines ccfbc54195 Move vote program state and instructions to solana-program 2022-08-09 20:52:47 -07:00
apfitzge c2455e7aa4
Fix typo in test function (#27031) 2022-08-09 12:39:22 -07:00
Lijun Wang a69470fd45
Set receive_window per quic connection (#26936)
This change sets the receive_window for non-staked nodes to 1 * PACKET_DATA_SIZE, and maps a staked node's connection receive_window to between 1.2 * PACKET_DATA_SIZE and 10 * PACKET_DATA_SIZE based on its stake.

The change is based on a Quinn library change that supports per-connection receive_window tuning on the server side: quinn-rs/quinn#1393
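
A rough sketch of such a stake-based mapping is shown below. The commit message does not give the exact formula, so the linear interpolation over `stake / max_stake`, the function name `receive_window_bytes`, and the value used for `PACKET_DATA_SIZE` are all assumptions made for illustration.

```rust
/// Assumed packet payload size in bytes (1280 - 40 - 8); treat as illustrative.
const PACKET_DATA_SIZE: u64 = 1232;

/// Ratios expressed in tenths to stay in integer arithmetic:
/// 12 tenths = 1.2x and 100 tenths = 10x.
const MIN_STAKED_RATIO_TENTHS: u128 = 12;
const MAX_STAKED_RATIO_TENTHS: u128 = 100;

/// Hypothetical mapping: unstaked peers get 1 * PACKET_DATA_SIZE, staked
/// peers are interpolated linearly between 1.2x and 10x by stake share.
fn receive_window_bytes(stake: u64, max_stake: u64) -> u64 {
    if stake == 0 || max_stake == 0 {
        return PACKET_DATA_SIZE;
    }
    let stake = u128::from(stake.min(max_stake));
    let span = MAX_STAKED_RATIO_TENTHS - MIN_STAKED_RATIO_TENTHS;
    let ratio_tenths = MIN_STAKED_RATIO_TENTHS + stake * span / u128::from(max_stake);
    (u128::from(PACKET_DATA_SIZE) * ratio_tenths / 10) as u64
}
```

With these assumed constants, an unstaked peer gets 1232 bytes and the highest-staked peer gets 12320 bytes.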
2022-08-09 10:02:47 -07:00
behzad nouri e2a2d271f2
adds number of coding shreds to broadcast metrics (#27006) 2022-08-09 13:59:40 +00:00
apfitzge b6d38aad69
tracer-packet-stats reporting should not reset id (#27012) 2022-08-09 06:38:08 -07:00
Yueh-Hsuan Chiang 99ef2184cc
Delete files older than the lowest_cleanup_slot in LedgerCleanupService::cleanup_ledger (#26651)
#### Problem
LedgerCleanupService requires compactions to propagate & digest range-delete tombstones
to eventually reclaim disk space.

#### Summary of Changes
This PR makes LedgerCleanupService::cleanup_ledger delete any file whose slot-range is
older than the lowest_cleanup_slot.  This allows us to reclaim disk space more often with
fewer IOps.  Experimental results on mainnet validators show that the PR can reduce
ledger disk size by 33% to 40%.
2022-08-09 00:48:06 +08:00
Will Hickey ed8c224374
Bump version to v1.12 (#26967) 2022-08-06 13:20:30 -05:00
Christian Kamm cf58640937
Keypair: implement clone() (#26248)
* Keypair: implement clone()

This was not implemented upstream in ed25519-dalek to force everyone to
think twice before creating another copy of a potentially sensitive
private key in memory.

See https://github.com/dalek-cryptography/ed25519-dalek/issues/76

However, there are now 9 instances of
  Keypair::from_bytes(&keypair.to_bytes())
in the solana codebase and it would be preferable to have a function.

In particular since this also comes up when writing programs and can
cause users to either start messing with lifetimes or discover the
from_bytes() workaround themselves.

This patch opts to not implement the Clone trait. This avoids automatic
use in order to preserve some of the original "let developers think
twice about this" intention.

* Use Keypair::clone
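
The design choice described above, an inherent `clone()` method without a `Clone` trait implementation, can be illustrated with a stand-in type. `DemoKeypair` and its simplified `from_bytes`/`to_bytes` signatures are hypothetical; the real `Keypair` wraps an ed25519-dalek key pair and is not reproduced here.

```rust
/// Hypothetical stand-in for a key pair holding sensitive bytes.
pub struct DemoKeypair {
    secret: [u8; 32],
}

impl DemoKeypair {
    pub fn from_bytes(bytes: &[u8; 32]) -> Self {
        Self { secret: *bytes }
    }

    pub fn to_bytes(&self) -> [u8; 32] {
        self.secret
    }

    /// Inherent method, deliberately not an implementation of `Clone`.
    /// Callers must copy the key explicitly; `#[derive(Clone)]` on an
    /// enclosing struct and `T: Clone` bounds will not pick it up.
    pub fn clone(&self) -> Self {
        Self::from_bytes(&self.to_bytes())
    }
}
```

Because the trait is not implemented, a struct containing a `DemoKeypair` cannot derive `Clone`, so every copy of the key material remains an explicit call at the use site.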
2022-08-06 11:54:38 -06:00
Richard Patel 270315a7f6
transaction-status, storage-proto: add compute_units_consumed (#26528)
* transaction-status, storage-proto: add compute_units_consumed

* fix bpf test

Co-authored-by: Justin Starry <justin@solana.com>
2022-08-06 17:14:31 +00:00
Justin Starry 69598ed4c0
Refactor: Add `RuntimeConfig` field to Bank (#26946)
* Refactor: Simplify arguments for bank constructor methods

* Refactor: Add RuntimeConfig to Bank fields

* Arc wrap runtime_config

* Arc wrap all runtime config usages

* Remove Copy trait derivation from RuntimeConfig

* Remove some arc wrapping
2022-08-05 20:49:00 +01:00
Brennan Watt 5bc81a6c35
Io stats v2 (#26898)
* Use sysfs instead of procfs for disk stats

* Filter map to filter dmcrypt and mdraid volumes

* Unit test cover different kernel formats
2022-08-05 10:38:49 -07:00
Tyera Eulberg 2dca239480
Remove runtime dependency from solana-transaction-status (#26930)
* Move RewardType out of runtime

* Move collect_token_balances to solana-ledger

* Remove solana-runtime dependency
2022-08-05 00:20:27 -06:00
steviez 300666dce7
Make `solana-ledger-tool` run AccountsBackgroundService (#26914)
Prior to this change, long running commands like `solana-ledger-tool
verify` would OOM due to AccountsDb cleanup not happening.

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-04 15:44:31 -05:00
Boqin Qin(秦 伯钦) 83a0f5da0f
core: fix double-readlock in replay_stage (#26052) 2022-08-04 10:45:31 -04:00
Brennan Watt 457f9ef739
Reduce severity level of log in replay (#26893)
* Reduce active banks log severity from warn to trace
2022-08-03 13:51:16 -07:00
github-actions[bot] fbf1bf6d86
Bump Version to 1.11.6 (#26906)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-08-03 12:48:43 -05:00
Brennan Watt f3b760dd91
Add IO metrics (#26804)
* Add Disk IO metrics
2022-08-02 14:29:53 -07:00
behzad nouri ec36f0c5df removes redundant Option<&Arc<...>> wrapper for Blockstore in serve-repair 2022-08-02 15:30:53 +00:00
behzad nouri 6423da0218 removes redundant Arc<RwLock<...>> wrapper off ServeRepair 2022-08-02 15:30:53 +00:00
Jeff Biseda ded9a35cd6
mark repair ping packets for discard only after successful signature verification (#26878) 2022-08-01 16:17:19 -07:00
Jeff Biseda 857be1e237
sign repair requests (#26833) 2022-07-31 15:48:51 -07:00
apfitzge fbfcc3febf
Bugfix: VoteProcessingTiming reset both counters (#26843) 2022-07-29 12:56:04 -05:00
Pankaj Garg fb922f613c
Compute maximum parallel QUIC streams using client stake (#26802)
* Compute maximum parallel QUIC streams using client stake

* clippy fixes

* Add unit test
2022-07-29 08:44:24 -07:00
Brennan Watt 467cb5def5
Concurrent slot replay (#26465)
* Concurrent replay slots

* Split out concurrent and single bank replay paths

* Sub function processing of replay results for readability

* Add feature switch for concurrent replay
2022-07-28 11:33:19 -07:00
Jeff Washington (jwash) 817f65bb50
add full_snapshot to hash config (#26811) 2022-07-28 09:46:34 -05:00
Ashwin Sekar 8d69e8d447
Compact vote state updates to reduce block size (#26616)
* Compact vote state updates to reduce block size

* Add rpc transaction tests
2022-07-27 13:23:44 -06:00
Ashwin Sekar ed539d65b4
Only take the latest vote for each validator in gossip (#25934)
* Only take the latest vote for each validator in gossip

Since the new vote updates are no longer incremental, there
is no value in storing intermediate votes.

* Address pr feedback

* Handle potential downgrade path, FullTowerVote -> Incremental

* Rename sent to bank -> gossip slot

* Handle downgrade case properly

* Only downgrade for newer votes and feature flag, ignore incremental votes otherwise

* Update test
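
The idea of keeping only the latest vote per validator can be sketched with a map keyed by validator. `DemoVote`, the string validator key, and `keep_latest` are hypothetical names for illustration; the feature-flag and downgrade handling mentioned in the bullets above is not modeled.

```rust
use std::collections::HashMap;

type Slot = u64;

/// Hypothetical vote record; only the fields needed for the comparison.
#[derive(Debug, Clone, PartialEq)]
struct DemoVote {
    validator: String,
    slot: Slot,
}

/// Stores `vote` only if it is newer than the vote already held for the
/// same validator. Returns true when the map was updated.
fn keep_latest(latest: &mut HashMap<String, DemoVote>, vote: DemoVote) -> bool {
    let is_newer = latest
        .get(&vote.validator)
        .map_or(true, |existing| vote.slot > existing.slot);
    if is_newer {
        // Overwrites any older vote, so no intermediate votes accumulate.
        latest.insert(vote.validator.clone(), vote);
    }
    is_newer
}
```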
2022-07-26 16:38:30 -06:00
github-actions[bot] 5d038b9d2a
Bump Version to 1.11.5 (#26758)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-07-25 13:05:14 -06:00
carllin f6d5b253fb
Enforce a 12MB limit on outbound repair (#26493) 2022-07-24 20:44:22 -05:00
Jeff Washington (jwash) d9c7bc7e78
Revert "cleanup feature: default units per instruction (#26684)" (#26750)
This reverts commit 39a34db52a.
2022-07-23 11:03:46 -05:00
Trent Nelson a603c8b0bc Enable QUIC servers by default 2022-07-22 15:45:10 -06:00
Trent Nelson 2ee19f536a Revert "Revert "core: disable quic servers on mainnet-beta" (#26216)"
This reverts commit 4a7fb2a808.
2022-07-22 15:45:10 -06:00
Tao Zhu a6215c1b92
Remove unnecessary poh_recorder read lock acquire (#26743)
Remove unnecessary acquiring of poh_recorder read lock
2022-07-22 15:23:05 -05:00
Tao Zhu 333a48e4e2
Add comment for forwarder buffer iteration behavior (#26721)
Add comment of forwarder buffer iteration behavior
2022-07-22 01:15:17 +00:00
Brennan Watt 932abe98a7
Switch UDP stats from usize to u64 (#26700) 2022-07-20 15:20:30 -07:00
Jack May 39a34db52a
cleanup feature: default units per instruction (#26684) 2022-07-20 19:13:34 +00:00
Brennan Watt 502f249904
Add proc net dev metrics to net stats (#26603)
* Add proc net dev metrics to net stats
2022-07-20 11:44:36 -07:00
sakridge 4a7fb2a808
Revert "core: disable quic servers on mainnet-beta" (#26216)
Enable QUIC server
2022-07-20 20:37:24 +02:00
Jeff Washington (jwash) 263911e7fd
save off what we find when calculating hash (#26663) 2022-07-19 09:55:52 -05:00
behzad nouri 2dd8573287
removes erroneous allow(dead_code) annotations from core (#26660) 2022-07-18 17:15:47 +00:00
Tao Zhu 22d465cd57
Share function to get priority details from various transaction types (#26643) 2022-07-15 18:17:22 -05:00
Jeff Washington (jwash) 47716a5e01
async hash verify on load (#26208)
* verify accounts hash in bg on startup

* fix some tests and loading from genesis

* add extra state for when background thread has completed
2022-07-15 14:29:56 -05:00
Tao Zhu f13b5c832d
Remove obsoleted metrics reporting to reduce lock contention on cost_model (#26608)
remove obsoleted metrics reporting to reduce lock contention on cost_model
2022-07-14 23:02:49 -05:00
Pankaj Garg 49a112ae74
Use pubkey of peer for active QUIC connection table (#26597)
* Use pubkey of peer for active QUIC connection table

* clippy

* update code
2022-07-13 09:59:01 -07:00
HaoranYi bf14440895
clean up and optimize account hash verify (#26560)
* remove unused code

* extract test related fault hash inject fn

* use rotate to optimize hashes removal

* use rotate to optimize snapshot hashes removal

* address code review feedback

* revise comments

* inline
2022-07-12 19:27:28 +00:00
github-actions[bot] fd5df1cf25
Bump Version to 1.11.4 (#26578)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-07-11 23:30:38 -05:00
Pankaj Garg ea7448c568
Use client certs in QUIC to get peer's stake (#26477)
* Use client certs in QUIC to get peer's stake

* fixes to cert processing

* integrate the code

* clippy

* more cleanup

* sort cargo deps

* test fixes

* info -> debug
2022-07-11 18:06:40 +00:00
Tao Zhu a3b094300b
Remove sender stakes from banking_stage buffer prioritization (#26512)
* remove sender stakes from banking_stage buffer prioritization
2022-07-11 12:46:15 -05:00
Nicholas Clarke ee0a40937e
Add validator argument log_messages_bytes_limit to change log truncation limit.
Add new cli argument log_messages_bytes_limit to solana-validator to control how long program logs can be before truncation
2022-07-11 10:53:18 -05:00
behzad nouri ba785cf8ab
removes erroneous uses of std::mem::swap (#26536)
All instances should be replaced by std::mem::{replace,take},
or just plain assignment.
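
The difference can be seen in a small, generic example that is not taken from the commit; the function names are made up for illustration. Each `std::mem` call below is a standard library function.

```rust
use std::mem;

/// Swap-based version: needs a temporary and a second mutable binding.
fn drain_with_swap(buffer: &mut Vec<u32>) -> Vec<u32> {
    let mut out = Vec::new();
    mem::swap(buffer, &mut out);
    out
}

/// `mem::take` leaves `Default::default()` behind and returns the old value.
fn drain_with_take(buffer: &mut Vec<u32>) -> Vec<u32> {
    mem::take(buffer)
}

/// `mem::replace` installs a specific new value and returns the old one.
fn set_with_replace(slot: &mut String, new_value: String) -> String {
    mem::replace(slot, new_value)
}
```

Both drain functions behave identically; the `take` version simply states the intent in one expression.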
2022-07-11 11:33:15 +00:00
Jeff Washington (jwash) 275e47f931
do this right: add 2nd pass at hash calc when failure seen (#26392) (#26538) 2022-07-10 23:10:22 -05:00
Jeff Washington (jwash) 602da5e51f
add accounts db config to bank tests (#26517) 2022-07-10 19:42:06 -05:00
Ashwin Sekar 734fedea4c
Create a more compact vote state update transaction (#26092)
* Create a more compact vote state update transaction

* pr comments

* change root to not be an option and update abi
2022-07-07 22:29:02 -07:00
carllin 5bffee248c
Cleanup repair logging (#26461) 2022-07-07 15:02:43 -05:00
github-actions[bot] 9d937fb8a0
Bump Version to 1.11.3 (#26481)
Co-authored-by: willhickey <willhickey@users.noreply.github.com>
2022-07-07 14:39:46 -05:00
dependabot[bot] 5e8e1beeb5
chore: bump serial_test from 0.6.0 to 0.8.0 (#26463)
Bumps [serial_test](https://github.com/palfrey/serial_test) from 0.6.0 to 0.8.0.
- [Release notes](https://github.com/palfrey/serial_test/releases)
- [Commits](https://github.com/palfrey/serial_test/compare/v0.6.0...v0.8.0)

---
updated-dependencies:
- dependency-name: serial_test
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-07 10:52:20 -06:00
behzad nouri 6f4838719b
decouples shreds sig-verify from tpu vote and transaction packets (#26300)
Shreds have a different workload and traffic pattern from TPU vote and
transaction packets. Some of the recent changes to SigVerifyStage are not
suitable, or at least not optimal, for shreds sig-verify; e.g. random discard,
dedup with false positives, discard excess by IP-address, ...

SigVerifier trait is meant to abstract out the distinctions between the
two pipelines, but in practice it has led to more verbose and convoluted
code.

This commit discards SigVerifier implementation for shreds sig-verify
and instead provides a standalone stage for verifying shreds signatures.
2022-07-07 11:13:13 +00:00
behzad nouri d33c548660
bypasses window-service stage before retransmitting shreds (#26291)
With recent patches, window-service recv-window does not do much other
than redirecting packets/shreds to downstream channels.
The commit removes window-service recv-window and instead sends
packets/shreds directly from sigverify to retransmit-stage and
window-service insert thread.
2022-07-06 11:49:58 +00:00
dependabot[bot] 37f4621c06
chore: bump serde from 1.0.137 to 1.0.138 (#26421)
* chore: bump serde from 1.0.137 to 1.0.138

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.137 to 1.0.138.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.137...v1.0.138)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-07-05 23:18:08 -06:00
Tao Zhu c1d89ad749
forward packets by prioritization in desc order (#25406)
- Forward packets by prioritization in desc order
- Add support of cost-tracking by transaction requested compute units
- Hook up account buckets to forwarder
- Add metrics for forwardable batches count
- Remove redundant invalid packets filtering at end of slot since forwarder will do the same when batching forwardable packets
- Add bench test for forwarding
2022-07-05 23:24:58 -05:00
Jeff Washington (jwash) 8eba4d1698
add 2nd pass at hash calc when failure seen (#26392) 2022-07-05 18:01:02 -05:00
behzad nouri d3a14f5b30
simplifies packet/shred sanity checks (#26356) 2022-07-05 21:41:19 +00:00
carllin ce39c14025
Add end-to-end replay slot metrics (#25752) 2022-07-05 13:58:51 -05:00
Nick Rempel 7e4a5de99c
Refactor ConnectionCache::use_quic (#26235)
* Remove UseQuic type

Move to storing the UdpSocket on ConnectionCache and accepting a bool

* Remove use_quic from ConnectionCache constructor

Replace with separate with_udp constructor to force callers to choose
2022-07-05 10:49:42 -07:00
behzad nouri 61f0a7d9c3
replaces Mutex<PohRecorder> with RwLock<PohRecorder> (#26370)
Mutex causes superfluous lock contention when a read-only reference suffices.
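
A minimal sketch of the pattern, using a hypothetical `Recorder` struct in place of `PohRecorder`: readers take a shared lock and can proceed in parallel, while only mutation takes the exclusive lock.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for shared recorder state.
struct Recorder {
    tick_height: u64,
}

/// Spawns `readers` threads that each take a shared read lock.
/// With a Mutex these reads would be serialized behind one another.
fn read_concurrently(recorder: Arc<RwLock<Recorder>>, readers: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let recorder = Arc::clone(&recorder);
            thread::spawn(move || recorder.read().unwrap().tick_height)
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

/// Mutation still requires the exclusive write lock.
fn advance(recorder: &RwLock<Recorder>) {
    recorder.write().unwrap().tick_height += 1;
}
```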
2022-07-05 14:29:44 +00:00
Brooks Prumo 9ec38a3191
Cleanup snapshot integration tests (#26390) 2022-07-05 09:23:23 -05:00
Pankaj Garg 94685e1222
Implement randomized pruning of QUIC connection from staked peers (#26299) 2022-06-30 17:56:15 -07:00
behzad nouri 88599fd760
skips shreds deserialization before retransmit (#26230)
Fully deserializing shreds in window-service before sending them to
retransmit stage adds latency to shreds propagation.
This commit instead channels through the payload and relies on only
partial deserialization of a few required fields: slot, shred-index,
shred-type.
2022-06-30 12:13:00 +00:00
Jack May 4563bf40f6
cleanup feature: tx-wide-compute-cap (#26326) 2022-06-29 23:54:45 -07:00
Jeff Washington (jwash) 557bf6e656
allow initial hash calc to occur in bg (#26271)
* allow initial hash calc to occur in bg

* validator_initialized -> startup_verification_complete

* add infos for leader and vote

* rework snapshot for startup verification

* change to assert
2022-06-29 16:48:33 -05:00
behzad nouri f875733a9e
patches bug in retransmit stats where slot stats are erroneously dropped (#26317)
slot_stats are submitted at a different cadence from the rest of
RetransmitStats. Current code erroneously clears slot_stats before
submitting any metrics.
2022-06-29 21:35:58 +00:00
behzad nouri b3406b5b2a
removes IndexedParallelIterator::with_min_len from retransmit (#26305)
Testing on mainnet-beta, with_min_len does not seem to have much impact
in the current retransmit code.
2022-06-29 13:27:17 +00:00
behzad nouri 348fe9ebe2
verifies shred slot and parent in fetch stage (#26225)
Shred slot and parent are not verified until window-service where
resources are already wasted to sig-verify and deserialize shreds.
This commit moves the above verification earlier in the pipeline, to the
fetch stage.
2022-06-28 12:45:50 +00:00
Steven Luscher 9765034e04
Enable wire compression in Solana CLI and Rust client (#26236) 2022-06-27 15:38:07 -07:00
behzad nouri 39ca788b95
discards shreds in sigverify if the slot leader is the node itself (#26229)
Shreds are dropped in window-service if the slot leader is the node
itself:
https://github.com/solana-labs/solana/blob/cd2878acf/core/src/window_service.rs#L181-L185

However this is done after wasting resources verifying signature on
these shreds, and requires a redundant 2nd lookup of the slot leader.

This commit instead discards such shreds in sigverify stage where we
already know the leader for the slot.
2022-06-27 20:12:23 +00:00
behzad nouri 67936aaa74
moves Shred::seed to ShredId and adds test coverage (#26251)
Following commits will skip shreds deserialization before retransmit, and
so we will only have a ShredId and not a fully deserialized shred to
obtain the shuffling seed from.
2022-06-27 17:58:43 +00:00
Brooks Prumo 662818ef0d
Use `VoteAccount::node_pubkey()` (#26207) 2022-06-27 09:09:06 -05:00
HaoranYi d5efbdb19b
Add timing measurement for gossip vote txn processing (#26163)
* add timing for gossip vote txn processing

* fix build

* fix too many arg error in clippy

* atomic interval
2022-06-27 08:53:34 -05:00
Ryo Onodera cd2878acf9
Avoid failing to root local slots before the hard fork (#19912)
* Make sure to root local slots even with hard fork

* Address review comments

* Cleanup a bit

* Further clean up

* Further clean up a bit

* Add comment

* Tweak hard fork reconciliation code placement
2022-06-26 15:14:17 +09:00
behzad nouri 30d2b112e4
bypasses rayon thread-pool for small retransmit shred batches (#26222)
In order to preserve current behavior, the threshold is set to the
current value of the argument to IndexedParallelIterator::with_min_len.
Follow up commits will recalibrate this threshold to optimize
performance on mainnet-beta.
2022-06-25 21:15:42 +00:00
Justin Starry 7cd7173b71
Refactor: Add get_delegated_stake method to VoteAccounts (#26221) 2022-06-25 16:41:35 +00:00
Justin Starry 44d1e62007
Refactor: No need to return stake in Bank::get_vote_account (#26220) 2022-06-25 16:27:43 +00:00
behzad nouri f1b82ec44d
factors out common retransmit work for shreds of the same slot (#26218)
Shreds arriving at a node for retransmit tend to belong to the same slot
(or just a couple of different slots). Slot leader and cluster nodes
are common for the shreds of the same slot, and so the common work to
look up these values can be factored out.
This commit first groups shreds by slot to factor out that common
lookup work.
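
The group-by step can be sketched as below. `DemoShred`, `retransmit_targets`, and the string leader value are hypothetical; the point is only that the per-slot lookup runs once per slot rather than once per shred.

```rust
use std::collections::BTreeMap;

type Slot = u64;

/// Hypothetical minimal shred; only the slot and index matter here.
#[derive(Debug, Clone, PartialEq)]
struct DemoShred {
    slot: Slot,
    index: u32,
}

/// Groups shreds by slot, then calls `lookup_leader` once per group and
/// reuses the result for every shred in that slot.
fn retransmit_targets<F>(shreds: Vec<DemoShred>, mut lookup_leader: F) -> Vec<(DemoShred, String)>
where
    F: FnMut(Slot) -> String,
{
    let mut by_slot: BTreeMap<Slot, Vec<DemoShred>> = BTreeMap::new();
    for shred in shreds {
        by_slot.entry(shred.slot).or_default().push(shred);
    }
    let mut out = Vec::new();
    for (slot, group) in by_slot {
        let leader = lookup_leader(slot); // common work, done once per slot
        for shred in group {
            out.push((shred, leader.clone()));
        }
    }
    out
}
```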
2022-06-25 15:49:05 +00:00
Jeff Washington (jwash) a3395a786a
vote_account uses AccountSharedData to avoid copies (#23687)
* vote_account uses AccountSharedData to avoid copies

* simpler deserialize
2022-06-24 15:08:01 -05:00
Brooks Prumo 877fedadac
Remove StatusCacheRc type and use StatusCache directly (#26184) 2022-06-24 08:38:56 -05:00
Brooks Prumo 23c50a2389
Add StatusCache::root_slot_deltas() and use it (#26170) 2022-06-23 15:19:06 -05:00
Tyera Eulberg a6ba5a9a05
Add transaction index in slot to geyser plugin TransactionInfo (#25688)
* Define shuffle to prep using same shuffle for multiple slices

* Determine transaction indexes and plumb to execute_batch

* Pair transaction_index with transaction in TransactionStatusService

* Add new ReplicaTransactionInfoVersion

* Plumb transaction_indexes through BankingStage

* Prepare BankingStage to receive transaction indexes from PohRecorder

* Determine transaction indexes in PohRecorder; add field to WorkingBank

* Add PohRecorder::record unit test

* Only pass starting_transaction_index around PohRecorder

* Add helper structs to simplify test DashMap

* Pass entry and starting-index into process_entries_with_callback together

* Add tx-index checks to test_rebatch_transactions

* Revert shuffle definition and use zip/unzip

* Only zip/unzip if randomize

* Add confirm_slot_entries test

* Review nits

* Add type alias to make sender docs more clear
2022-06-23 13:37:38 -06:00
behzad nouri f534b8981b
maps number of data shreds to erasure batch size (#25917)
In preparation for
https://github.com/solana-labs/solana/pull/25807
which reworks erasure batch sizes, this commit:
* adds a helper function mapping the number of data shreds to the
  erasure batch size.
* adds ProcessShredsStats to Shredder::entries_to_shreds in order to
  replace and remove entries_to_data_shreds from the public interface.
2022-06-23 13:27:54 +00:00
github-actions[bot] 5c2f819f99
Bump Version to 1.11.2 (#26159) 2022-06-22 21:16:18 -05:00
dependabot[bot] 55a7e53b9e
chore: bump lru from 0.7.6 to 0.7.7 (#26140)
* chore: bump lru from 0.7.6 to 0.7.7

Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.7.6 to 0.7.7.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.7.6...0.7.7)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-06-22 14:39:07 -06:00
Jeff Biseda bafdb7dd62
Revert handle start_http failure in rpc_service (#25400) (#26130)
* revert e263be2000
2022-06-22 10:52:27 -07:00
Michael Vines f3639b76ce Remove some clippy lints 2022-06-22 09:23:22 -07:00
HaoranYi b5d0c7b468
Revert "tvu and tpu timeout on joining its microservices (#24111)" (#26132)
This reverts commit e105547c14.
2022-06-22 10:57:46 -05:00
behzad nouri faa6c32162 removes packet modifier from shred_fetch_stage
... in favor of just passing packet flags.
2022-06-22 12:17:37 +00:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies shred-version earlier in the pipeline in fetch stage.
2022-06-22 12:17:37 +00:00