Commit Graph

3459 Commits

Author SHA1 Message Date
behzad nouri efb8a53b28
removes staked-nodes updater service excessive locks on gossip (#29936) 2023-01-26 23:31:35 +00:00
Andrew Fitzgerald fbb90603a9
BankingStage Refactor: Separate transaction commiting module (#29808)
Separate transaction commiting module
2023-01-25 19:02:21 -08:00
Kirill Fomichev b4d1769688
geyser: add parent slot/blockhash to block (#29855) 2023-01-25 14:20:24 -08:00
steviez fa39bfef6b
Move Deduper into a separate file (#29891) 2023-01-25 15:34:53 -06:00
Andrew Fitzgerald 704472ae13
BankingStage Refactor: Separate Forwarder Module (#29402)
Separate Forwarder module
2023-01-25 12:31:59 -08:00
Xiang Zhu 4ebcacb4a3
Revert "Add run parent directory for accounts files (#29794)" (#29899)
This PR is causing OOM on master.  Reverting it for now.

This reverts commit 74f89d1494.
2023-01-25 10:03:01 -08:00
Ryo Onodera 40bbf99c74
Add fully-reproducible online tracer for banking (#29196)
* Add fully-reproducible online tracer for banking

* Don't use eprintln!()...

* Update programs/sbf/Cargo.lock...

* Remove meaningless assert_eq

* Group test-only code under aptly named mod

* Remove needless overflow handling in receive_until

* Delay stat aggregation as it's possible now

* Use Cow to avoid needless heap allocs

* Properly consume metrics action as soon as hold

* Trace UnprocessedTransactionStorage::len() instead

* Loosen joining api over type safety for replaystage

* Introce hash event to override these when simulating

* Use serde_with/serde_as instead of hacky workaround

* Update another Cargo.lock...

* Add detailed comment for Packet::buffer serialize

* Rename sender_overhead_minimized_receiver_loop()

* Use type interference for TraceError

* Another minor rename

* Retire now useless ForEach to simplify code

* Use type alias as much as possible

* Properly translate and propagate tracing errors

* Clarify --enable-banking-trace with better naming

* Consider unclean (signal-based) node restarts..

* Tweak logging and cli

* Remove Bank events as it's not needed anymore

* Make tpu own banking tracer thread

* Reduce diff a bit..

* Use latest serde_with

* Finally use the published rolling-file crate

* Make test code change more consistent

* Revive dead and non-terminating test code path...

* Dispose batches early now that possible

* Split off thread handle very early at ::new()

* Tweak message for TooSmallDirByteLimitl

* Remove too much of indirection

* Remove needless pub from ::channel()

* Clarify test comments

* Avoid needless event creation if tracer is disabled

* Write tests around file rotation and spill-over

* Remove unneeded PathBuf::clone()s...

* Introduce inner struct instead of tuple...

* Remove unused enum BankStatus...

* Avoid .unwrap() for the case of disabled tracer...
2023-01-25 21:54:38 +09:00
Yihau Chen 9193b4221d
Revert "chore: workspace inheritance (#29509)" (#29892)
This reverts commit a67d239dde.
2023-01-25 15:50:41 +08:00
Yihau Chen a67d239dde
chore: workspace inheritance (#29509)
* introduce workspace.package

* introduce workspace.dependencies

* read version from root cargo.toml

* pass check when version = { workspace = true }

* don't bump version when version = { workspace = true }

* including workspace Cargo.toml when bump version

* programs/sbf use workspace inheritance

* fix increasing cargo version ignore program/sbf/Cargo.toml
2023-01-25 13:59:59 +08:00
steviez ac65343f01
Remove duplicate bank frozen log from ReplayStage (#29821)
We emit a similar log with more information shortly after from Bank, so
this logline is extra that occurs for every slot.
2023-01-24 20:29:14 -06:00
Xiang Zhu 74f89d1494
Add run parent directory for accounts files (#29794)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explict type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation
2023-01-24 16:44:35 -08:00
Brennan Watt 0be194145b
Include own node in stake table (#29838) 2023-01-24 09:34:44 -08:00
behzad nouri 1c7662a37f
asserts that cluster-info keypair is consistent with contact-info id (#29818) 2023-01-24 16:57:55 +00:00
steviez be7ec87b9b
Reduce cpuid reporting frequency to once an hour (#29849) 2023-01-24 09:27:43 -06:00
Kevin Ji dd92f225bb
Use Ipv4Addr::{LOCALHOST, UNSPECIFIED} constants (#29813) 2023-01-23 16:49:51 -06:00
steviez f1b2e49b03
Cleanup FindPacketSenderStakeReceiver function args (#29834)
find_packet_sender_stake_stage::FindPacketSenderStakeReceiver is quite
verbose to include in function arguments, and type name is descriptive
enough that it doesn't need to be qualified with the crate name in every
instance.
2023-01-23 16:40:18 -06:00
Ashwin Sekar 3e8874e3a2
Clear parent in repair weighting when dumping from replay (#29770) 2023-01-23 12:55:09 -08:00
behzad nouri bd9b311c63
adds frozen_abi annotations to repair service enums/structs (#29820)
... in order to keep types backward compatible.
2023-01-23 16:49:06 +00:00
steviez 206a1c7296
Reduce the amount of IO that LedgerCleanupService performs (#29239)
Currently, the cleanup service counts the number of shreds in the
database by iterating the entire SlotMeta column and reading the number
of received shreds for each slot. This gives us a fairly accurate count
at the expense of performing a good amount of IO.

Instead of counting the individual slots, use the live_files()
rust-rocksdb entrypoint that we expose in Blockstore. This API allows us
to get the number of entries (shreds) in the data shred column family by
reading file metadata. This is much more efficient from IO perspective.
2023-01-23 04:39:47 -06:00
behzad nouri d75303f541
patches bug in sigverify-shreds when identity is hot-swapped (#29802)
Sigverify-shreds discards shreds from node's own leader slots:
https://github.com/solana-labs/solana/blob/6baab92ab/core/src/sigverify_shreds.rs#L153-L154

But if the identity is hot-swapped the pubkey would be wrong since it
is instantiated only once at startup:
https://github.com/solana-labs/solana/blob/6baab92ab/core/src/tvu.rs#L168
2023-01-21 20:07:41 +00:00
apfitzge 8c793da7d0
BankingStage Refactor: Move decision making functions to new module (#29788)
Move decision making functions to new module
2023-01-20 10:10:47 -08:00
apfitzge 5fc83a3d19
BankingStage Refactor: Separate Next Leader Functions (#29401)
Separate next_leader functions
2023-01-20 10:02:29 -08:00
behzad nouri 64c13b74d8
errors out when retransmit loopbacks to the slot leader (#29789)
When broadcasting shreds, turbine excludes the slot leader from the
random shuffle. Doing so, shreds should never loopback to the leader.
If shreds reaching retransmit stage are from the node's own leader slots
they should not be retransmited to any nodes.
2023-01-20 17:20:51 +00:00
Wen b36791956e
Ingest duplicate proofs sent through Gossip (#29227)
* First draft of ingesting duplicate proofs in Gossip into blockstore.

* Add more unittests.

* Add more unittests for bad cases.

* Fix lint errors for tests.

* More linter fixes for tests.

* Lint fixes

* Rename get_entries, move location of comment.

* Some renaming changes and comment fixes.

* Fix compile warning, this enum is not used.

* Fix lint errors.

* Slow down cleanup because this could potentially be expensive.

* Forgot to reset cleanup count.

* Add protection against attackers when constructing chunk map when
we ingest Gossip proofs.

* Use duplicate shred index instead of get_entries.

* Rename ClusterInfoDuplicateShredListener and fix a few small problems.

* Use into_shreds to piece together the proof.

* Remove redundant code.

* Address a few small errors.

* Discard slots too advanced in the future.

* - Use oldest proof for each pubkey
- Limit number of pubkeys in each slot to 100

* Disable duplicate shred handling for now.

* Revert "Disable duplicate shred handling for now."

This reverts commit c3fcf403876cfbf90afe4d2265a826f21a5e24ab.
2023-01-19 13:00:56 -08:00
apfitzge 2c347ac0a5
BankingStage Refactor: Move packet receiving and buffering functions to separate module (#29761)
Move packet receiving and buffering functions to separate module
2023-01-19 08:52:32 -08:00
Trent Nelson c4e43f1de4
vote: encapsulate `Lockout` (#29753) 2023-01-18 19:28:28 -07:00
Ryo Onodera 4973fe18f1
Rename banking stage packet receivers consistently (#29752)
Rename banking stage batch receivers consistently
2023-01-19 10:04:55 +09:00
Ryo Onodera 55d743c49a
Rename remaining ones to replay_vote_{sender,receiver} (#29716)
* Rename remaining ones to replay_vote_{sender,receiver}

* Fix typo...
2023-01-18 14:14:04 +09:00
Jeff Biseda f9062718c4
prioritize repair requests by stake (#29730) 2023-01-17 18:38:10 -08:00
Brennan Watt aa40c2b712
Increase turbine propagation const (#29742)
* Increase turbine propagation const

Value is used as a delay threshold for issuing shred repairs and analysis is showing we are overly aggressive in requesting repairs. Shreds show up via turbine before the repair completes the vast majority of the time

* Use Duration type for MAX_TURBINE_PROPAGATION
2023-01-17 15:01:00 -08:00
Jeff Biseda f6fcb14a3e
adjust normalized stake calculation in compute_weight (#29694) 2023-01-17 11:27:57 -08:00
Ryo Onodera 156454c980
Remove PacketDeserializer's extra overflow guard (#29715) 2023-01-17 14:21:17 +09:00
Brooks 0db14ad39c
Removes full_snapshot from CalcAccountsHashConfig (#29722) 2023-01-16 16:22:46 -05:00
behzad nouri 80a39bd6a5
adds feature to (temporarily) drop merkle shreds from testnet (#29711) 2023-01-15 15:41:58 +00:00
behzad nouri 5b5a3ebce8
adds metrics for num merkle shreds on the receiving end (#29710) 2023-01-14 23:07:42 +00:00
Illia Bobyr 59fde130d6
ledger/blockstore: PerfSampleV2: num_non_vote_transactions (#29404)
Store non-vote transaction counts that are now recorded by the banks
into the `blockstore`.

`SamplePerformanceService` now populates `PerfSampleV2` with counts from
the banks.
2023-01-12 19:14:04 -08:00
Jeff Washington (jwash) 544b9745c2
snapshot storage path uses 1 append vec per slot (#29627) 2023-01-11 12:05:15 -08:00
behzad nouri 8c212f59ad
renames ContactInfo to LegacyContactInfo (#29566)
Working towards adding a new ContactInfo where new sockets can be
added in a backward compatible way.
2023-01-08 16:00:55 +00:00
Brian Anderson 43a0745b37
Fix doc warnings (#29537) 2023-01-07 09:24:50 +00:00
behzad nouri 283a2b1540
removes #[allow(clippy::same_item_push)] (#29543) 2023-01-06 17:32:26 +00:00
behzad nouri 12da2da389
fixes errors from clippy::redundant_clone (#29536)
https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone
2023-01-05 18:42:19 +00:00
behzad nouri 5c9beef498
fixes errors from clippy::useless_conversion (#29534)
https://rust-lang.github.io/rust-clippy/master/index.html#useless_conversion
2023-01-05 18:05:32 +00:00
Lijun Wang 1e8a8e07b6
Stream the executed transaction count in the block notification (#29272)
Problem

The plugins need to know when all transactions for a block have been all notified to serve getBlock request correctly. As block and transaction notifications are sent asynchronously to each other it will be difficult.

Summary of Changes

Include the executed transaction count in block notification which can be used to check if all transactions have been notified.
2023-01-05 09:36:19 -08:00
Jeff Biseda 832302485e
require repair request signature, ping/pong for Testnet, Development clusters (#29351) 2023-01-04 14:54:19 -08:00
Illia Bobyr d7bd1bf970
bank: Record non-vote transaction count (#29383)
A subsequent change to `SamplePerformanceService` introduces non-vote transaction counts, which `bank`s need to store.

Part of work on https://github.com/solana-labs/solana/issues/29159
2023-01-03 14:46:20 -08:00
Xiang Zhu 3363c08ac0
Move async remove to snapshot_utils.rs (#29406) 2023-01-03 06:15:32 -08:00
behzad nouri 754ecf467b
generalizes the return type of Shred::get_signed_data (#29446)
The commit adds an associated SignedData type to Shred trait so that
merkle and legacy shreds can return different types for signed_data
method.
This would allow legacy shreds to point to a section of the shred
payload, whereas merkle shreds would compute and return the merkle root.
Ultimately this would allow to remove the merkle root from the shreds
binary.
2022-12-31 17:08:25 +00:00
Ashwin Sekar 17b64005d3
Add more logging and documentation to flaky optimistic confirmation tests (#29418)
* Revert "add retry for flakey local cluster test (#29228)"

This reverts commit 7a97121747.

* Add logging for repair
2022-12-27 10:47:45 -07:00
behzad nouri 456d06785d
experiments different turbine fanouts for propagating shreds (#29393)
The commit allocates 2% of slots to running experiments with different
turbine fanouts based on the slot number.
The experiment is feature gated with an additional feature to disable
the experiment.
2022-12-26 14:18:56 +00:00
Ashwin Sekar f2ba16ee87
Plumb dumps from replay_stage to repair (#29058)
* Plumb dumps from replay_stage to repair

When dumping a slot from replay_stage as a result of duplicate or
ancestor hashes, properly update repair subtrees to keep weighting and
forks view accurate.

* add test

* pr comments
2022-12-25 09:58:30 -07:00