3595 Commits

Author SHA1 Message Date
Brooks 6e5615e32d
Revert "AccountsHashVerifier remembers last full snapshot info (#30582)" (#30660) 2023-03-13 14:48:16 -04:00
Brooks 505e3ff5c7
AccountsHashVerifier updates AccountsDb after calculating accounts hash (#30658) 2023-03-13 16:41:24 +00:00
Trent Nelson a15139ef15
tests: share `GenesisConfig` in `validator_parallel_exit` (#30692) 2023-03-13 10:12:35 -06:00
Brooks a43f803604
AccountsHashVerifier purges old accounts hashes (#30644) 2023-03-13 11:12:11 -04:00
behzad nouri c4b2639a86
patches flaky test_retransmit_latest_unpropagated_leader_slot (#30686) 2023-03-12 22:46:05 +00:00
behzad nouri f9805b6fbb
stops nodes from broadcasting slots twice (#30681)
https://github.com/solana-labs/solana/blob/94ef881de/core/src/progress_map.rs#L178
always returns true the first time around because retry_time is None.
So every slot is broadcast twice.
2023-03-11 02:46:08 +00:00
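The failure mode described in the commit message above can be illustrated with a minimal sketch. The names `RetryState`, `reached_threshold`, and `mark_attempt` are hypothetical and are not taken from progress_map.rs; the sketch only shows how a check that treats a missing `retry_time` as "due" fires on the very first call.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for per-slot retry bookkeeping (not the real type).
struct RetryState {
    /// Time of the last recorded attempt; `None` until one is recorded.
    retry_time: Option<Instant>,
}

impl RetryState {
    fn new() -> Self {
        Self { retry_time: None }
    }

    /// Returns true when a retry is considered due.
    /// With no recorded attempt, `map_or` falls back to `true`.
    fn reached_threshold(&self, delay: Duration) -> bool {
        self.retry_time.map_or(true, |t| t.elapsed() >= delay)
    }

    /// Records that an attempt happened now.
    fn mark_attempt(&mut self) {
        self.retry_time = Some(Instant::now());
    }
}
```

Under these assumptions, a freshly created state reports that a retry is due immediately, and only starts honoring the delay once an attempt has been recorded. If the initial broadcast does not record an attempt, that is one way a slot could be sent a second time right away.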
Andrew Fitzgerald 5f6755f58b
remove test fn (#30616)
kill banking_stage::Consumer test_fn
2023-03-10 09:30:25 -08:00
Andrew Fitzgerald b0112a5f43
BankingStage Consumer: test_buffered_packets* reworking (#30615)
* refactor test_consume_buffered_packets_interrupted without test_fn

* Fix comment

* Also check retries

* Add retryable test case
2023-03-09 09:13:04 -08:00
Ashwin Sekar 11e554b922
Fix repair_weight test bad merge failure (#30649) 2023-03-08 21:23:37 -07:00
Ashwin Sekar 31712d38de
Track pruned subtrees in repair weight (#29922) 2023-03-08 18:38:32 -07:00
Jeff Biseda 4c0ce84488
increase retransmit shreds received cache size (#30556) 2023-03-07 13:03:52 -08:00
HaoranYi 8a1c7614f1
typos (#30604) 2023-03-07 11:08:46 -06:00
Brooks 70c6c7e1f7
Uses strong types for snapshot hashes in SnapshotPackagerService (#30603) 2023-03-06 16:50:45 -05:00
Andrew Fitzgerald bba0ed702f
BankingStage Refactor: Consumer State (#30288)
* BankingStage Refactor: Consumer add state

* remove trailing comma
2023-03-06 09:13:28 -08:00
Brooks 120b0c92d1
AccountsHashVerifier remembers last full snapshot info (#30582)
AHV remembers last full snapshot info
2023-03-06 16:40:46 +00:00
steviez a8bff33387
Make backup_and_clear_blockstore() honor ValidatorConfig options (#30538)
* Add helper function to create BlockstoreOptions from ValidatorConfig

* Make backup_and_clear_blockstore() honor ValidatorConfig options

backup_and_clear_blockstore() opens a Blockstore session; however, it
is currently using Blockstore::open(). This Blockstore method uses
BlockstoreOption::default() under the hood. As a result, any validator
args that adjust Blockstore settings are not considered in
backup_and_clear_blockstore().

This is especially problematic if the non-default value of
--rocksdb-shred-compaction is being used. In this case,
backup_and_clear_blockstore() was opening the wrong directory and
incorrectly finding an empty ledger.

This change plumbs any blockstore configuration to
backup_and_clear_blockstore().
2023-03-04 21:09:41 -08:00
Brooks 6972f92c29
AHV loop uses let-else (#30583) 2023-03-04 01:59:29 +00:00
sakridge 7a8563f2c8
Panic when shred index exceeds the max per slot (#30555)
Assert when shred index exceeds the max per slot
2023-03-04 02:49:23 +01:00
Brooks 1cf0ce1215
AHV logs when stopped (#30585) 2023-03-03 23:44:30 +00:00
Brooks cd652a7e20
AHV uses metrics names like SPS's (#30584) 2023-03-03 23:38:35 +00:00
Tyera 7b1d446001
Admin RPC Service: move post-init activation to before wait-for-supermajority (#30544)
* Move AdminRpcRequestMetadataPostInit to solana-core

* Move AdminRpcRequestMetadataPostInit write to just before wait_for_supermajority

* Pass AdminRpcRequestMetadataPostInit in TestValidatorGenesis

* Fixup local-cluster
2023-03-01 19:38:11 -07:00
HaoranYi 16db984cb5
improve supermajority waiting logging (#30479)
make logging for supermajority waiting and stake percent from gossip in sync
2023-03-01 08:57:42 -06:00
Jeff Biseda 781a7cbd28
cleanup get_closest_completion (#30516) 2023-02-27 16:56:40 -08:00
Brooks 89c07d259a
AccountsHashVerifier uses AccountsHashEnum (#30514) 2023-02-24 17:17:54 -05:00
Brennan 7847661511
Process tower after warping bank forks (#30467)
This helps ensure tower and bank forks are in sync in terms of root slot
2023-02-23 16:23:18 -08:00
Jeff Washington (jwash) 2441a06e78
drop default from PhantomData::default() (#30476) 2023-02-23 14:59:08 -08:00
Yihau Chen df3ef111f7
chore: workspace inheritance (#29893)
* introduce workspace.package

* introduce workspace.dependencies

* read version from root cargo.toml

* pass check when version = { workspace = true }

* don't bump version when version = { workspace = true }

* including workspace Cargo.toml when bump version

* programs/sbf use workspace inheritance

* fix increasing cargo version ignore program/sbf/Cargo.toml
2023-02-23 22:01:54 +08:00
Michael Vines 5136ed3448
Update homepage value for all crates (#30444) 2023-02-23 02:20:18 +00:00
Jeff Biseda 55f601b25c
prevent revisiting slots in get_closest_completion (#30458) 2023-02-22 18:16:17 -08:00
Brooks 69a9520f79
Flushes accounts cache before warping (#30437) 2023-02-22 21:13:31 -05:00
Jeff Biseda 5221049595
stop get_unrepaired_path at root slot (#30450) 2023-02-22 15:04:09 -08:00
Brennan d2c6bd1410
Metrics for repair trees & closest completion slots (#30448) 2023-02-22 14:33:02 -08:00
Brennan e7a69dcec5
get_best_repairs minor cleanup (#30439) 2023-02-22 12:15:42 -08:00
Brennan 96dd621426
Remove ignored slots from repair (#30438) 2023-02-22 12:15:17 -08:00
Brooks 1689586213
Uses a channel for AHV -> SPS (#30406) 2023-02-22 03:36:29 +00:00
Brooks 35328ca63d
Makes AccountsHash an enum (#30416) 2023-02-21 15:20:51 -05:00
Brooks bcc4bc80c9
Removes unnecessary derives from Accounts{Delta}Hash (#30392) 2023-02-20 16:00:53 -05:00
Andrew Fitzgerald 50f553e245
Clean up: consumer saturating add assign (#30347)
Use saturating_add_assign where appropriate in Consumer
2023-02-16 15:19:43 -08:00
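As a rough illustration of the saturating add-assign pattern named in the commit above, the sketch below uses a plain function. The function name and signature are assumptions made for this example, not the helper the commit itself uses; only `u64::saturating_add` from the standard library is relied on.

```rust
/// Adds `delta` to `*total`, clamping at `u64::MAX` instead of overflowing.
/// Hypothetical helper, written as a function for illustration.
fn saturating_add_assign(total: &mut u64, delta: u64) {
    *total = (*total).saturating_add(delta);
}
```

For counters and timing accumulators, clamping is usually preferable to a panic in debug builds or a silent wraparound in release builds.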
Brooks 4ba80ad785
Inline format args (#30364)
clippy fixes
2023-02-16 17:00:43 +00:00
Brooks febaf36e6d
Apply clippy fixes for future rust upgrade (#30363) 2023-02-16 16:12:51 +00:00
Andrew Fitzgerald 4194661bcf
Rewrite accumulate_execute_units_and_time without allocation (#30338) 2023-02-15 17:22:24 -08:00
Andrew Fitzgerald 1cefb90271
BankingStage Refactor: Simplify Consumer (#30253)
* measure! to measure_us!

* Consistent naming of transaction_recorder

* Remove outdated comment - Instant cannot be None

* use local

* Remove measure! import
2023-02-15 17:20:55 -08:00
Jeff Biseda 20614fa746
restore timestamp() in find_missing_indexes (#30345) 2023-02-15 16:12:36 -08:00
Andrew Fitzgerald b86bfbb5c5
measure_us! use Instant and duration_to_us internally (#30339) 2023-02-15 12:43:47 -08:00
Xiang Zhu 4909267c88
Add accounts hard-link files into the bank snapshot directory (#29496)
* Add accounts hard-link files into the bank snapshot directory

* Small adjustments and fixes.

* Address some of the review issues

* Fix compilation issues

* Change the latest slot snapshot storage from VecDeque to Option

* IoWithSourceAndFile and expanded comments on accounts

* last_slot_snapshot_storages in return value

* Update comments following the review input

* rename dir_accounts_hard_links to hard_link_path

* Add dir_full_state flag for add_bank_snapshot

* Let appendvec files hardlinking work with multiple accounts paths across multiple partitions

* Fixes for rebasing

* fix tests which generate account_path without adding run/

* rebasing fixes

* fix account path test failures

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* update with more review comments

* handle error from create_accounts_run_and_snapshot_dirs

* fix rebasing duplicate

* minor accounts_dir path cleanup

* minor cleanup, remove commented code

* misc review comments

* build error fix

* Fix test_incremental_snapshot_download_with_crossing_full_snapshot_interval_at_startup

* fix build error on MAX_BANK_SNAPSHOTS_TO_RETAIN

* rebase fix, update hardlink filename

* minor comment spelling fix

* rebasing fixes

* fix rebase issues; with_extension

* comments changes for review

* misc minor review issues

* bank.fill_bank_with_ticks_for_tests

* error handling on appendvec path

* fix use_jit

* minor comments refining

* Remove type AccountStorages

* get_account_path_from_appendvec_path return changed to Option

* removed appendvec_path.to_path_buf in create_accounts_run_and_snapshot_dirs

* add test_get_snapshot_accounts_hardlink_dir

* update last_snapshot_storages comment

* update last_snapshot_storages comment

* symlink map_err

* simplify test_get_snapshot_accounts_hardlink_dir with fake paths

* log last_snapshot_storages at the end of the loop
2023-02-15 09:52:07 -08:00
Tyera a020f3eb60
Add clarifying comments to SamplePerformanceService (#30296)
* Add clarifying comment

* Make jsonrpc docs more explicit
2023-02-15 10:02:53 -07:00
Andrew Fitzgerald beb3cd5ed9
BankingStage Refactor: Separate Consumer Module (#30238) 2023-02-15 08:52:13 -08:00
Illia Bobyr d2b21c09ff
SamplePerformanceService: Refactor stats snapshot logic (#30297)
Snapshot construction and interaction code was a bit more manual than necessary, even causing a bug to slip past a review.  Separated snapshot construction from the diffing of two snapshots.
This should make the logic clearer.
2023-02-14 19:01:23 -08:00
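The separation described in the commit above (building a snapshot versus diffing two snapshots) might look roughly like the following. `StatsSnapshot`, `SampleDelta`, `capture`, and `diff_since`, along with their fields, are hypothetical names chosen for this sketch and are not taken from SamplePerformanceService.

```rust
/// Hypothetical cumulative counters captured at one point in time.
#[derive(Clone, Copy, Debug, PartialEq)]
struct StatsSnapshot {
    num_transactions: u64,
    num_non_vote_transactions: u64,
    num_slots: u64,
}

/// Hypothetical per-interval sample derived from two snapshots.
#[derive(Debug, PartialEq)]
struct SampleDelta {
    num_transactions: u64,
    num_non_vote_transactions: u64,
    num_slots: u64,
}

impl StatsSnapshot {
    /// Step 1: construction only records the current cumulative totals.
    fn capture(num_transactions: u64, num_non_vote_transactions: u64, num_slots: u64) -> Self {
        Self { num_transactions, num_non_vote_transactions, num_slots }
    }

    /// Step 2: diffing subtracts an earlier snapshot field by field.
    /// `saturating_sub` keeps a counter reset from underflowing.
    fn diff_since(&self, earlier: &StatsSnapshot) -> SampleDelta {
        SampleDelta {
            num_transactions: self.num_transactions.saturating_sub(earlier.num_transactions),
            num_non_vote_transactions: self
                .num_non_vote_transactions
                .saturating_sub(earlier.num_non_vote_transactions),
            num_slots: self.num_slots.saturating_sub(earlier.num_slots),
        }
    }
}
```

Keeping every field subtraction in one place makes it harder to diff one counter while accidentally storing the cumulative value of another.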
Tyera 7c35191322
Scope SamplePerformanceService Bank only for initial sample snapshot (#30316)
* Scope Bank only for initial sample snapshot

* Remove nesting
2023-02-14 23:32:26 +00:00
steviez dd9d6e308c
Fix transactions counts stored by SamplePerformanceService (#30280)
A recent change to this service to store the number of non-vote
transactions introduced a bug in the computation of the number of
transactions during the time interval. This resulted in bogus values
being stored in Blockstore and eventually getting served through RPC for
the TPS chart on explorer.
2023-02-13 19:51:34 +00:00
Proph3t 2271a3920b
chore: fix broken docs link (#30274)
docs: fix broken link
2023-02-13 13:31:16 -06:00
Jeff Biseda f4fe550004
remove sleeps from repair tests (#30252) 2023-02-13 10:28:30 -08:00
Tao Zhu 60bfc2524b
implement From trait for CostTrackerError to TransactionError (#30267)
implement From trait for CostTrackerError to TransactionError
2023-02-13 11:06:39 -06:00
Trent Nelson 8770b15bb2
remove recommendations to skip validator startup tests on failure (#30250) 2023-02-10 18:14:47 -07:00
behzad nouri ded457cd73
embeds the new gossip ContactInfo in ClusterInfo (#30022)
Working towards replacing the legacy gossip contact-info with the new
one, the commit updates the respective field in gossip cluster-info.
2023-02-10 20:07:45 +00:00
Andrew Fitzgerald 60cf8ce65b
remove unnecessary lifetime (#30108)
Remove unnecessary lifetime on function
2023-02-09 21:17:41 -08:00
Jeff Biseda 180273b97d
defer HighestShred repairs during shred propagation threshold (#30142) 2023-02-09 14:57:55 -08:00
Ashwin Sekar 67f644473b
Fix repair behavior concerning our own leader slots (#30200)
panic when trying to dump & repair a block that we produced
2023-02-09 14:30:12 -07:00
Andrew Fitzgerald 4b17acf64e
BankingStage Refactor: Add state to Committer (#30107) 2023-02-09 13:22:42 -08:00
Andrew Fitzgerald 058738424d
BankingStage Refactor: transaction recorder record transactions (#30106) 2023-02-09 08:34:02 -08:00
steviez d3dab24bbe
chore: Use `i` over `ix` variable name when naming worker threads (#30206) 2023-02-09 01:24:57 +00:00
behzad nouri 1ad69cfc38
removes dynamic cast and dynamic dispatch from connection-cache (#30128)
Dynamic dispatch forces heap allocation and adds extra overhead.
Dynamic casting, as in the examples below, lacks compile-time type safety:
https://github.com/solana-labs/solana/blob/eeb622c4e/quic-client/src/lib.rs#L172-L175
https://github.com/solana-labs/solana/blob/eeb622c4e/udp-client/src/lib.rs#L52-L55

The commit removes all instances of Any, Box<dyn ...>, and Arc<dyn ...>,
and instead uses generic and associated types.

There are only two protocols QUIC and UDP; and the code which has to
work with both protocols can use a trivial thin enum wrapper.

With respect to connection-cache specifically:
* connection-cache/ConnectionCache is a single-protocol cache which
  allows using either QUIC or UDP without any build dependency on the
  other protocol.
* client/ConnectionCache is an enum wrapper around both protocols and
  can be used in the code which has to work with both QUIC and UDP.

Co-authored-by: Tyera Eulberg <tyera@solana.com>
2023-02-09 00:50:44 +00:00
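The approach described in the commit above (generics and associated types instead of `Box<dyn ...>`, plus a thin enum wrapper for code that handles both protocols) can be sketched as follows. All type, trait, and method names here are simplified, hypothetical stand-ins, not the real connection-cache API, and `send_data` returns a label string purely so the dispatch is observable.

```rust
/// Hypothetical connection trait; implementations are resolved at compile time.
trait ClientConnection {
    fn send_data(&self, data: &[u8]) -> String;
}

struct QuicConnection;
struct UdpConnection;

impl ClientConnection for QuicConnection {
    fn send_data(&self, data: &[u8]) -> String {
        format!("quic:{}", data.len())
    }
}

impl ClientConnection for UdpConnection {
    fn send_data(&self, data: &[u8]) -> String {
        format!("udp:{}", data.len())
    }
}

/// Hypothetical single-protocol cache with an associated connection type.
trait ConnectionCache {
    type Connection: ClientConnection;
    fn get_connection(&self) -> Self::Connection;
}

struct QuicCache;
struct UdpCache;

impl ConnectionCache for QuicCache {
    type Connection = QuicConnection;
    fn get_connection(&self) -> QuicConnection {
        QuicConnection
    }
}

impl ConnectionCache for UdpCache {
    type Connection = UdpConnection;
    fn get_connection(&self) -> UdpConnection {
        UdpConnection
    }
}

/// Generic code works with a single protocol and is statically dispatched.
fn send_via<C: ConnectionCache>(cache: &C, data: &[u8]) -> String {
    cache.get_connection().send_data(data)
}

/// Thin enum wrapper for code that must work with both protocols.
enum AnyCache {
    Quic(QuicCache),
    Udp(UdpCache),
}

impl AnyCache {
    fn send(&self, data: &[u8]) -> String {
        match self {
            AnyCache::Quic(cache) => send_via(cache, data),
            AnyCache::Udp(cache) => send_via(cache, data),
        }
    }
}
```

In this sketch, code that needs only one protocol depends on `send_via` with a concrete cache type, while the `match` in `AnyCache::send` is the only place where both protocols meet.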
Illia Bobyr cf77f5dbb8
doc: ledger: Document `completed_data_sets_service` module (#30001) 2023-02-07 21:20:09 -08:00
Andrew Fitzgerald 2b99756b3e
BankingStage Refactor: Move counters out of record_transactions (#30093)
Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
2023-02-07 07:45:50 -08:00
Andrew Fitzgerald d9444a6576
remove unnecessary clippy warning ignore (#30100) 2023-02-06 08:27:18 -08:00
Andrew Fitzgerald 7fb2fc6f27
Add comment on the closure (#30091) 2023-02-06 08:24:36 -08:00
Wen 151585e596
Filter pubkey in gossip duplicateproof ingestion (#29879) 2023-02-03 11:41:32 -08:00
Andrew Fitzgerald 8914d1af27
BankingStage Refactor: Add state to PacketReceiver (#30090) 2023-02-03 11:35:43 -08:00
Pankaj Garg be8e463a51
Use TPU IP instead of gossip for QUIC client certificate info (#30105) 2023-02-03 04:16:57 +00:00
Andrew Fitzgerald 8fa396a321
BankingStage Refactor: Add state to Forwarder (#29403) 2023-02-02 11:09:08 -08:00
Andrew Fitzgerald fd3f26380e
BankingStage Refactor: Simplify PacketReceiver (#29784) 2023-02-02 07:58:55 -08:00
Lijun Wang ada6136a6c
Refactor connection cache to support generic msgs (#29774)
tpu-client/tpu_connection_cache is refactored out of the module and moved to connection-cache/connection_cache, and the logic in client/connection_cache is consolidated into connection-cache/connection_cache as well. client/connection_cache now has only a thin wrapper which forwards calls to connection-cache/connection_cache and deals with construction of the quic/udp connection caches for clients that use both.

The TpuConnection is refactored to ClientConnection to make it generic, and functions are renamed to suit other workflows, e.g. tpu_addr -> server_addr, send_transaction -> send_data, etc.

The enum dispatch is removed so that the bulk of the quic and udp code can be agnostic of each other. The client can load only quic or only udp into its runtime.

The generic type parameter in tpu-client/tpu_connection_cache is removed in order to create both quic and udp connection caches and use the object to send transactions without multiple branching when sending data. The generic type parameters and associated types are dropped in other types in order to make the trait "object safe" for this purpose.

I have annotated the code explaining the reasoning and the refactoring source -> destination.

There are no functional changes.

bench-tps has been run for rpc-client, thin-client and tpu-client, and the performance numbers largely match the ones from before the refactoring.
2023-02-01 18:10:06 -08:00
Xiang Zhu f107b8b607
Add slot deltas into the bank snapshot directory (#29409) 2023-02-01 16:51:32 -08:00
Andrew Fitzgerald c549129974
BankingStage Refactor: Committer Simplify (#29958) 2023-02-01 15:44:53 -08:00
dependabot[bot] 232e252014
Bump serde from 1.0.144 to 1.0.152 (#29696)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
Co-authored-by: Tyera <tyera@solana.com>
2023-02-01 16:27:17 -07:00
Brooks d048a1903f
Splits up AccountsDb::bank_hashes (#30024) 2023-02-01 14:32:35 -05:00
Andrew Fitzgerald c06053f505
BankingStage Refactor: Add state to DecisionMaker (#29806) 2023-02-01 09:18:40 -08:00
behzad nouri ffc9c90cb4
expands api parity between the new and the legacy contact-info (#30038)
Working towards replacing the legacy contact-info with the new one, the
commit expands api compatibility between the two.
2023-02-01 13:07:42 +00:00
Will Hickey 04a6a631bc
Bump version to v1.16 (#30028) 2023-01-31 17:48:33 -06:00
carllin b345d97f67
Add local cluster test for optimistic confirmation with malformed votes (#29822) 2023-01-31 14:19:45 -06:00
joeaba a12bf8c003
Update maintainers references (#29997)
* update maintainers references

* chore: update maintainers reference
2023-01-31 08:07:13 -05:00
Jeff Biseda c6cd96635f
get_best_weighted_repairs parameter cleanup (#30010) 2023-01-31 03:12:25 -08:00
Jeff Biseda 6163a6c279
restructure repair decode error handling (#29977) 2023-01-31 02:44:58 -08:00
Xiang Zhu 856598969c
Account path add run parent with old path cleanup (#29942)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explicit type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation

* Clean up files in the old account_path before transitioning to the new run path

* try_exist and accounts_dir removing extra

* sync rmdir, is_dir check

* handle the account_path not deletable case
2023-01-30 10:26:43 -08:00
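The "use then instead of then_some for lazy evaluation" item in the commit above refers to a difference between two standard-library methods on `bool`: `then_some` takes an already computed value, while `then` takes a closure that runs only when the condition is true. A minimal sketch follows; `build_path` and the "accounts/run" string are hypothetical placeholders.

```rust
use std::cell::Cell;

/// Hypothetical expensive computation; bumps `calls` so evaluation is observable.
fn build_path(calls: &Cell<u32>) -> String {
    calls.set(calls.get() + 1);
    String::from("accounts/run")
}

/// `then_some` takes a value, so the argument is computed even when `enabled` is false.
fn eager(enabled: bool, calls: &Cell<u32>) -> Option<String> {
    enabled.then_some(build_path(calls))
}

/// `then` takes a closure, which runs only when `enabled` is true.
fn lazy(enabled: bool, calls: &Cell<u32>) -> Option<String> {
    enabled.then(|| build_path(calls))
}
```

Both functions return `None` when `enabled` is false, but only `eager` still pays for `build_path` in that case, which matters when the computation allocates or touches the filesystem.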
Jeff Biseda 7cacbdcca2
track repair handle_requests time (#29940) 2023-01-27 15:50:18 -08:00
behzad nouri 7f173ce7c7
feature gates merkle shreds on all clusters (#29957) 2023-01-27 21:02:51 +00:00
behzad nouri efb8a53b28
removes staked-nodes updater service excessive locks on gossip (#29936) 2023-01-26 23:31:35 +00:00
Andrew Fitzgerald fbb90603a9
BankingStage Refactor: Separate transaction committing module (#29808)
Separate transaction committing module
2023-01-25 19:02:21 -08:00
Kirill Fomichev b4d1769688
geyser: add parent slot/blockhash to block (#29855) 2023-01-25 14:20:24 -08:00
steviez fa39bfef6b
Move Deduper into a separate file (#29891) 2023-01-25 15:34:53 -06:00
Andrew Fitzgerald 704472ae13
BankingStage Refactor: Separate Forwarder Module (#29402)
Separate Forwarder module
2023-01-25 12:31:59 -08:00
Xiang Zhu 4ebcacb4a3
Revert "Add run parent directory for accounts files (#29794)" (#29899)
This PR is causing OOM on master.  Reverting it for now.

This reverts commit 74f89d1494.
2023-01-25 10:03:01 -08:00
Ryo Onodera 40bbf99c74
Add fully-reproducible online tracer for banking (#29196)
* Add fully-reproducible online tracer for banking

* Don't use eprintln!()...

* Update programs/sbf/Cargo.lock...

* Remove meaningless assert_eq

* Group test-only code under aptly named mod

* Remove needless overflow handling in receive_until

* Delay stat aggregation as it's possible now

* Use Cow to avoid needless heap allocs

* Properly consume metrics action as soon as hold

* Trace UnprocessedTransactionStorage::len() instead

* Loosen joining api over type safety for replaystage

* Introduce hash event to override these when simulating

* Use serde_with/serde_as instead of hacky workaround

* Update another Cargo.lock...

* Add detailed comment for Packet::buffer serialize

* Rename sender_overhead_minimized_receiver_loop()

* Use type inference for TraceError

* Another minor rename

* Retire now useless ForEach to simplify code

* Use type alias as much as possible

* Properly translate and propagate tracing errors

* Clarify --enable-banking-trace with better naming

* Consider unclean (signal-based) node restarts..

* Tweak logging and cli

* Remove Bank events as it's not needed anymore

* Make tpu own banking tracer thread

* Reduce diff a bit..

* Use latest serde_with

* Finally use the published rolling-file crate

* Make test code change more consistent

* Revive dead and non-terminating test code path...

* Dispose batches early now that possible

* Split off thread handle very early at ::new()

* Tweak message for TooSmallDirByteLimitl

* Remove too much of indirection

* Remove needless pub from ::channel()

* Clarify test comments

* Avoid needless event creation if tracer is disabled

* Write tests around file rotation and spill-over

* Remove unneeded PathBuf::clone()s...

* Introduce inner struct instead of tuple...

* Remove unused enum BankStatus...

* Avoid .unwrap() for the case of disabled tracer...
2023-01-25 21:54:38 +09:00
Yihau Chen 9193b4221d
Revert "chore: workspace inheritance (#29509)" (#29892)
This reverts commit a67d239dde.
2023-01-25 15:50:41 +08:00
Yihau Chen a67d239dde
chore: workspace inheritance (#29509)
* introduce workspace.package

* introduce workspace.dependencies

* read version from root cargo.toml

* pass check when version = { workspace = true }

* don't bump version when version = { workspace = true }

* including workspace Cargo.toml when bump version

* programs/sbf use workspace inheritance

* fix increasing cargo version ignore program/sbf/Cargo.toml
2023-01-25 13:59:59 +08:00
steviez ac65343f01
Remove duplicate bank frozen log from ReplayStage (#29821)
We emit a similar log with more information shortly after from Bank, so
this logline is extra that occurs for every slot.
2023-01-24 20:29:14 -06:00
Xiang Zhu 74f89d1494
Add run parent directory for accounts files (#29794)
* Add run parent directory for accounts files

* fix test test_concurrent_snapshot_packaging

* review comments.  renamed the path setup function

* Addressed most of the review comments

* remove explicit type def for map result

* handle create_accounts_run_and_snapshot_dirs error with expect

* update with more review comments

* minor fixes from review comments

* simplify account_filename option assignment

* handle error from create_accounts_run_and_snapshot_dirs

* use then instead of then_some for lazy evaluation
2023-01-24 16:44:35 -08:00
Brennan Watt 0be194145b
Include own node in stake table (#29838) 2023-01-24 09:34:44 -08:00
behzad nouri 1c7662a37f
asserts that cluster-info keypair is consistent with contact-info id (#29818) 2023-01-24 16:57:55 +00:00
steviez be7ec87b9b
Reduce cpuid reporting frequency to once an hour (#29849) 2023-01-24 09:27:43 -06:00