534 Commits

Author SHA1 Message Date
sakridge 398af132a5
More set_root metrics (#21286) 2021-11-15 16:28:18 -07:00
Tao Zhu c2bfce90b3
- cost_tracker is data member of a bank, it can report metrics when bank is frozen (#20802)
- removed cost_tracker_stats and histogram
- move stats reporting outside of bank freeze
2021-10-24 22:19:23 -05:00
Tao Zhu 177a375479
Tpu vote 1.7 (#20187) (#20494)
* Add separate vote processing tpu port

* Add feature to send to tpu vote port

* Add vote rejecting sigverify mode

* use packet.meta.is_simple_vote_tx in place of deserialization

* consolidate code that identifies vote tx at common path for cpu and gpu

* new key for feature set

* banking forward tpu vote

* add tpu vote port to dockerfile and other review changes

* Simplify thread id compare

* fix a test; updated cluster_info ABI change

Co-authored-by: Tao Zhu <tao@solana.com>

Co-authored-by: sakridge <sakridge@gmail.com>
2021-10-07 09:38:23 +00:00
Michael Vines 7027d56064 Resolve nightly-2021-10-05 clippy complaints 2021-10-06 10:37:58 -07:00
Justin Starry 129716f3f0
Optimize stakes cache and rewards at epoch boundaries (#20432)
* Optimize stakes cache and rewards at epoch boundaries

* Fetch from accounts db

* Add cli flag for disabling epoch boundary optimization
2021-10-06 00:53:26 -04:00
Pavel Strakhov 65227f44dc
Optimize RPC pubsub for multiple clients with the same subscription (#18943)
* reimplement rpc pubsub with a broadcast queue

* update tests for new pubsub implementation

* fix: fix review suggestions

* chore(rpc): add additional pubsub metrics

* integrate max subscriptions check into SubscriptionTracker to reduce locking

* separate subscription control from tracker

* limit memory usage of items in pubsub broadcast queue, improve error handling

* add more pubsub metrics

* add final count metrics to pubsub

* add metric for total number of subscriptions

* fix small review suggestions

* remove by_params from SubscriptionTracker and add node_progress_watchers map instead

* add subscription tracker tests

* add metrics for number of pubsub notifications as a counter

* ignore clippy lint in TokenCounter

* fix underflow in token counter

* reduce queue capacity in pubsub tests

* fix(rpc): fix test timeouts

* fix race in account subscription test

* Add RpcSubscriptions::new_for_tests

Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-09-17 13:40:14 -06:00
sakridge dc69cc1ae4
Only allow votes when root distance gets too high (#19917) 2021-09-16 15:12:26 +02:00
carllin 87a7f00926
Track reset bank in PohRecorder (#19810) 2021-09-13 16:55:35 -07:00
Jeff Biseda 7a8eba10b2
add synchronization comment to handle_new_root (#19571) 2021-09-02 13:52:14 -07:00
behzad nouri 8ad52fa095
implements copy-on-write for vote-accounts (#19362)
Bank::vote_accounts redundantly clones vote-accounts HashMap even though
an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/95c998a19/runtime/src/bank.rs#L5174-L5186

This commit implements copy-on-write semantics for vote-accounts by
wrapping the underlying HashMap in Arc<...>.
2021-08-30 15:54:01 +00:00
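The copy-on-write idea in #19362 can be illustrated with a minimal sketch. The `VoteAccounts` wrapper, `Pubkey` alias, and `Stake` alias below are hypothetical stand-ins, not the types in the Solana runtime; only `Arc` and `Arc::make_mut` from the standard library are assumed. Readers share the map through a reference-counted handle, and the map is cloned only when a writer mutates it while another handle is still alive.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-ins for illustration; the real types live in the runtime.
type Pubkey = [u8; 32];
type Stake = u64;

#[derive(Clone, Default)]
pub struct VoteAccounts {
    // Shared map: cloning VoteAccounts only bumps a reference count.
    inner: Arc<HashMap<Pubkey, Stake>>,
}

impl VoteAccounts {
    /// Read path: hands out a shared handle without copying the map.
    pub fn snapshot(&self) -> Arc<HashMap<Pubkey, Stake>> {
        Arc::clone(&self.inner)
    }

    /// Write path: Arc::make_mut clones the map only if another handle
    /// to it is still alive; otherwise it mutates in place.
    pub fn insert(&mut self, key: Pubkey, stake: Stake) {
        Arc::make_mut(&mut self.inner).insert(key, stake);
    }
}
```

A snapshot taken before a write keeps seeing the old contents, while the writer ends up with its own copy.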
carllin 22674000bd
Add EpochSlots frozen state transition (#19112) 2021-08-13 14:21:52 -07:00
Tao Zhu 414d904959
Reject blocks for costs above the max block cost (#18994)
* added realtime cost checking logic to reject block that would exceed max limit:
- defines max limits at block_cost_limits.rs
- right after each batch's execution, accumulate its cost and check against
  limit, return error if limit is exceeded

* update abi that changed due to adding additional TransactionError

* To avoid counting stats multiple times, only accumulate execute-timing when a bank is completed

* gate it by a feature

* move cost const def into block_cost_limits.rs

* redefine the cost for signature and account access, removed signer part as it is not well defined for now

* check if per_program_timings of execute_timings before sending
2021-08-12 10:48:47 -05:00
Michael Vines e9722474eb Move tower storage into its own module 2021-08-11 00:20:46 -07:00
Michael Vines d7ab510229 Move tower save into the VotingService 2021-08-11 00:20:46 -07:00
Michael Vines 397801a2d8 Extract tower storage details from Tower struct 2021-08-06 10:04:37 -07:00
Jeff Washington (jwash) 14361906ca
for all tests, bank::new -> bank::new_for_tests (#19064) 2021-08-05 08:42:38 -05:00
carllin 03353d500f
Actively manage dead slots in AncestorHashesService (#18912) 2021-08-02 14:33:28 -07:00
carllin c0704d4ec9
Plumb signal from replay to ancestor hashes service (#18880) 2021-07-26 20:59:00 -07:00
behzad nouri d2d5f36a3c
adds validator flag to allow private ip addresses (#18850) 2021-07-23 15:25:03 +00:00
Michael Vines 61865c0ee0 `solana-validator set-identity` now loads the tower file for the new identity 2021-07-21 22:22:08 -07:00
carllin ce467bea20
Add frozen hashes and marking DuplicateConfirmed in blockstore to state machine (#18648) 2021-07-18 17:04:25 -07:00
sakridge 0f8bcf65af
Add voting service (#18552) 2021-07-15 16:35:51 +02:00
Michael Vines b30b32300d `solana-validator set-identity` now works for voting validators 2021-07-14 09:42:35 -07:00
Michael Vines 62d864559f Tower cleanup: reduce fn visibility, remove unnecessary new_with_key() 2021-07-14 09:42:35 -07:00
sakridge 7f2254225e
Move entry/poh to own crate to speed up poh bench build (#18225) 2021-07-14 14:16:29 +02:00
carllin 4d3e301ee4
Introduce slot dumping to ReplayStage (#18160) 2021-07-08 19:07:32 -07:00
Michael Vines 1e0942e900 Rename ClusterInfo::send_vote to ClusterInfo::send_transaction 2021-07-07 15:51:14 -07:00
Tao Zhu 7cd6224caf
log warning when channel send fails (#18391) 2021-07-02 19:04:09 +00:00
carllin 0eca92de18
Make set roots an iterator (#18357) 2021-07-01 20:02:40 -07:00
Michael Vines b6792a3328 Add ability to change the validator identity at runtime 2021-07-01 17:50:04 -07:00
Tao Zhu 5e424826ba
Persist cost table to blockstore (#18123)
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alongside TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
2021-07-01 11:32:41 -05:00
carllin 68c87469c3
Cleanup ReplayStage tests (#18241) 2021-06-28 20:19:42 -07:00
Tao Zhu 9d6f1ebef4
investigate system performance test degradation (#17919)
* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table

* Change mutex on cost_tracker to RwLock

* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.

* acquire and hold locks for block of TXs, instead of acquire and release per transaction;

* remove redundant would_fit check from cost_tracker update execution path

* refactor cost checking with less frequent lock acquiring

* avoid many Transaction_cost heap allocations when calculating cost, which
is in the hot path - executed per transaction.

* create hashmap with new_capacity to reduce runtime heap realloc.

* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default

* address potential deadlock by acquiring locks one at a time
2021-06-28 21:34:04 -05:00
sakridge 5d08bf9aa3
More detailed voting timings in replay stage (#18229) 2021-06-26 17:32:08 +02:00
Michael Vines 2435ea3ad8 Remove redundant ReplayStageConfig::my_pubkey field 2021-06-21 21:29:52 -07:00
Alexander Meißner 789f33e8db chore: cargo fmt 2021-06-18 10:42:46 -07:00
Alexander Meißner 6514096a67 chore: cargo +nightly clippy --fix -Z unstable-options 2021-06-18 10:42:46 -07:00
Michael Vines fa04531c7a Extricate RpcCompletedSlotsService from RetransmitStage 2021-06-16 16:20:35 -07:00
carllin ccc013e134
Handle removing slots during account scans (#17471) 2021-06-14 21:04:01 -07:00
carllin c8535be0e1
Port unconfirmed duplicate tracking logic from ProgressMap to ForkChoice (#17779) 2021-06-11 03:09:57 -07:00
carllin afafa624a3
Account for duplicate before a bank is frozen or replayed (#17866) 2021-06-10 22:28:23 -07:00
Tao Zhu ae27fcbcda
replay stage feed back program cost (#17731)
* replay stage feeds back realtime per-program execution cost to cost model;

* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;

* changed cost unit to microsecond, using value collected from mainnet;

* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.
2021-06-09 17:10:59 -05:00
Michael Vines e5e7390d44 Wrap long lines 2021-06-08 12:05:29 -07:00
Tyera Eulberg 544b3c0d17
Create solana-poh and move remaining rpc modules to solana-rpc (#17698)
* Create solana-poh crate

* Move BigTableUploadService to solana-ledger

* Add solana-rpc to workspace

* Move dependencies to solana-rpc

* Move remaining rpc modules to solana-rpc

* Single use statement solana-poh

* Single use statement solana-rpc
2021-06-04 09:23:06 -06:00
carllin 96ba2edfeb
Switch EpochSlots to be frozen slots, not completed slots (#17168) 2021-06-03 00:20:00 +00:00
Tyera Eulberg ab581dafc2
Add block height to ConfirmedBlock structs (#17523)
* Add BlockHeight CF to blockstore

* Rename CacheBlockTimeService to be more general

* Cache block-height using service

* Fixup previous proto mishandling

* Add block_height to block structs

* Add block-height to solana block

* Fallback to BankForks if block time or block height are not yet written to Blockstore

* Add docs

* Review comments
2021-05-26 22:16:16 -06:00
Tyera Eulberg 9a5330b7eb
Move gossip modules into solana-gossip crate (#17352)
* Move gossip modules to solana-gossip

* Update Protocol abi digest due to move

* Move gossip benches and hook up CI

* Remove unneeded Result entries

* Single use statements
2021-05-26 09:15:46 -06:00
Tyera Eulberg 827355a6b1
Create solana-rpc crate and move subscriptions (#17320)
* Move non_circulating_supply to runtime

* Add solana-rpc crate and move max_slots

* Move subscriptions to solana-rpc

* Single use statements
2021-05-19 00:54:28 -06:00
Tyera Eulberg 6e9deaf1bd
Move block-time caching earlier (#17109)
* Require that blockstore block-time only be recognized slot, instead of root

* Move cache_block_time to after Bank freeze

* Single use statement

* Pass transaction_status_sender by reference

* Remove unnecessary slot-existence check before caching block time altogether

* Move block-time existence check into Blockstore::cache_block_time, Blockstore no longer needed in blockstore_processor helper
2021-05-10 13:14:56 -06:00
behzad nouri fa86a335b0
implements cursor for gossip crds table queries (#16952)
VersionedCrdsValue.insert_timestamp is used for fetching crds values
inserted since last query:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298

So it is crucial that insert_timestamp does not go backward in time when
new values are inserted into the table. However std::time::SystemTime is
not monotonic, or due to workload, lock contention, thread scheduling,
etc, ... new values may be inserted with a stalled timestamp way in the
past. Additionally, reading system time for the above purpose is
inefficient/unnecessary.

This commit adds an ordinal index to crds values indicating their insert
order. Additionally, it implements a new Cursor type for fetching values
inserted since last query.
2021-05-06 14:04:17 +00:00
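The ordinal-index approach in #16952 can be sketched as follows. This is an illustration under assumptions, not the crds implementation: `Table`, `Cursor`, and the `since` method are hypothetical names, and values are plain strings. Each insert receives a strictly increasing ordinal, so "inserted since last query" no longer depends on the system clock.

```rust
use std::collections::BTreeMap;

/// Hypothetical cursor: the next ordinal the caller has not yet seen.
#[derive(Default)]
pub struct Cursor(u64);

/// Hypothetical table keyed by insertion ordinal instead of wall-clock time.
#[derive(Default)]
pub struct Table {
    next_ordinal: u64,
    entries: BTreeMap<u64, String>,
}

impl Table {
    /// Every insert gets a strictly increasing ordinal, so the insert
    /// order can never go backward the way a timestamp can.
    pub fn insert(&mut self, value: String) {
        self.entries.insert(self.next_ordinal, value);
        self.next_ordinal += 1;
    }

    /// Returns the values inserted since the cursor's position,
    /// then advances the cursor past them.
    pub fn since(&self, cursor: &mut Cursor) -> Vec<&str> {
        let out: Vec<&str> = self
            .entries
            .range(cursor.0..)
            .map(|(_, v)| v.as_str())
            .collect();
        cursor.0 = self.next_ordinal;
        out
    }
}
```

Each caller keeps its own `Cursor`, so independent readers can consume the table at different rates.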
carllin bc7e741514
Integrate gossip votes into switching threshold (#16973) 2021-05-04 00:51:42 -07:00
carllin 5981399612
Distinguish max replayed and max observed vote (#16936) 2021-04-29 14:43:28 -07:00
carllin b5d30846d6
Retry latest vote if expired (#16735) 2021-04-28 11:46:16 -07:00
steviez bc31378797
Trim extra shred bytes in blockstore (#16602)
Strip the zero-padding off of data shreds before insertion into blockstore

Co-authored-by: Stephen Akridge <sakridge@gmail.com>
Co-authored-by: Nathan Hawkins <utsl@utsl.org>
2021-04-27 17:40:41 -05:00
behzad nouri 0f3ac51cf1
limits to data_header.size when combining shreds' payloads (#16708)
Shredder::deshred is ignoring data_header.size when combining shreds' payloads:
https://github.com/solana-labs/solana/blob/37b8587d4/ledger/src/shred.rs#L940-L961

Also adding more sanity checks on the alignment of data shreds indices.
2021-04-27 12:04:44 +00:00
carllin 4c94f8933f
Ingest votes from gossip into fork choice (#16560) 2021-04-21 14:40:35 -07:00
Michael Vines c8b474cd0b Send votes to next leader's TPU instead of our TPU 2021-04-20 00:38:21 -07:00
Michael Vines 6907a2366e Remove unnecessary clone 2021-04-17 10:23:13 -07:00
Michael Vines 2229b70c4e Add authorized-voter add/remove-all commands 2021-04-12 15:55:28 -07:00
carllin dc7030ffaa
Allow fork choice to support multiple versions of a slot (#16266) 2021-04-12 01:00:59 -07:00
carllin 99b3aab703
Track gossip vote updates per hash for replay stage (#16421)
* Track gossip vote updates per hash for replay stage
2021-04-10 17:34:45 -07:00
Christian Drappi 54a04bac3d
Apple M1 compatibility (#16346)
Co-authored-by: Christian Drappi <christiandrappi@Christians-MacBook-Pro.local>
2021-04-09 17:21:01 -07:00
carllin 4e5ef6bce2
Add cluster state verifier logging (#16330)
* Add cluster state verifier logging

* Add duplicate-slots iterator to ledger tool
2021-04-02 21:48:44 -07:00
behzad nouri 3f63ed9a72
removes OrderedIterator and transaction batch iteration order (#16153)
In TransactionBatch,
https://github.com/solana-labs/solana/blob/e50f59844/runtime/src/transaction_batch.rs#L4-L11
lock_results[i] is aligned with transactions[iteration_order[i]]:
https://github.com/solana-labs/solana/blob/e50f59844/runtime/src/bank.rs#L2414-L2424
https://github.com/solana-labs/solana/blob/e50f59844/runtime/src/accounts.rs#L788-L817

However load_and_execute_transactions is iterating over
  lock_results[iteration_order[i]]
https://github.com/solana-labs/solana/blob/e50f59844/runtime/src/bank.rs#L2878-L2889
and then returning i as the index of the retryable transaction.

If iteration_order is [1, 2, 0], and i is 0, then:
  lock_results[iteration_order[i]] = lock_results[1]
which corresponds to
  transactions[iteration_order[1]] = transactions[2]
so neither i = 0, nor iteration_order[i] = 1 gives the correct index for the
corresponding transaction (which is 2).

This commit removes OrderedIterator and transaction batch iteration order
entirely. There is only one place in blockstore processor where the
iteration order is not ordinal:
https://github.com/solana-labs/solana/blob/e50f59844/ledger/src/blockstore_processor.rs#L269-L271
It seems like, instead of using an iteration order, that can shuffle entry
transactions in-place.
2021-03-31 23:59:19 +00:00
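The index mismatch described in #16153 can be checked with a small worked example. The function names below are hypothetical and exist only to spell out the arithmetic from the commit message; they are not part of the runtime.

```rust
/// Index the pre-fix loop reads at position i: lock_results[iteration_order[i]].
pub fn lock_result_index_read(iteration_order: &[usize], i: usize) -> usize {
    iteration_order[i]
}

/// Transaction that lock_results[j] belongs to, given that lock_results[j]
/// is aligned with transactions[iteration_order[j]].
pub fn transaction_for_lock_result(iteration_order: &[usize], j: usize) -> usize {
    iteration_order[j]
}

/// Transaction index that loop position i actually refers to in the
/// pre-fix code: the iteration order ends up applied twice.
pub fn actual_transaction_index(iteration_order: &[usize], i: usize) -> usize {
    let j = lock_result_index_read(iteration_order, i);
    transaction_for_lock_result(iteration_order, j)
}
```

With `iteration_order = [1, 2, 0]` and `i = 0`, the result is 2, which matches neither `i` nor `iteration_order[i]`. With the identity order `[0, 1, 2]` the double lookup is harmless, which is why the bug only appears when the order is shuffled.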
sakridge 60b4771fc6
Only print skipped leader slot message when the node is actually leader (#16156)
Also, check vote signature after the vote is signed
2021-03-26 17:45:53 -07:00
Tyera Eulberg 433f1ead1c
Rpc: enable getConfirmedBlock and getConfirmedTransaction to return confirmed (not yet finalized) data (#16142)
* Add Blockstore block and tx apis that allow unrooted responses

* Add TransactionStatusMessage, and send on bank freeze; also refactor TransactionStatusSender

* Track highest slot with tx-status writes complete

* Rename and unpub fn

* Add commitment to GetConfirmed input configs

* Support confirmed blocks in getConfirmedBlock

* Support confirmed txs in getConfirmedTransaction

* Update sigs-for-addr2 comment

* Enable confirmed block in cli

* Enable confirmed transaction in cli

* Review comments

* Rename blockstore method
2021-03-26 16:47:35 -06:00
sakridge b99ae8f334
Skip leader slots until a vote lands (#15607) 2021-03-25 18:54:51 -07:00
sakridge 9b94741290
Fix test_replay_commitment_cache (#16131) 2021-03-25 21:16:39 +00:00
carllin 52703badfa
Setup ReplayStage confirmation scaffolding for duplicate slots (#9698) 2021-03-24 23:41:52 -07:00
Justin Starry 918d04e3f0
Add more slot update notifications (#15734)
* Add more slot update notifications

* fix merge

* Address feedback and add integration test

* switch to datapoint

* remove unused shred method

* fix clippy

* new thread for rpc completed slots

* remove extra constant

* fixes

* rely on channel closing

* fix check
2021-03-12 21:44:06 +08:00
sakridge f1223fb783
Lower blockstore processor error severity (#15578) 2021-03-01 14:57:37 -08:00
Michael Vines 5df36aec7d Pacify clippy 2021-02-19 20:08:41 -08:00
Michael Vines fd3b71a2c6 cargo fmt 2021-02-19 20:08:41 -08:00
Tyera Eulberg 170cb792eb
Return blockstore error if previous_blockhash cannot be determined (#15382)
* Return blockstore error if previous_blockhash cannot be determined

* Add require_previous_blockshash flag
2021-02-18 01:04:52 +00:00
behzad nouri b6f231b60e
removes locked pubkey references (#15152) 2021-02-08 02:07:00 +00:00
sakridge bbae23358c
ledger-tool cleanup and additions (#15179)
* Plumb allow-dead-slots to ledger-tool verify

* ledger-tool cleanup and add some useful missing args

Print root slots and how many unrooted past last root.
2021-02-06 17:26:42 -08:00
behzad nouri 6fd5ec0e4c
caches descendants in bank forks (#15107) 2021-02-05 18:00:45 +00:00
behzad nouri 86467d825a
removes pubkey references (#15050) 2021-02-03 23:02:11 +00:00
Tyera Eulberg cbb8b79a60
Add validator flag to opt in to cpi and logs storage (#14922)
* Add validator flag to opt in to cpi and logs storage

* Default TestValidator to opt-in; allow using in multinode-demo

* No clone

Co-authored-by: Carl Lin <carl@solana.com>
2021-02-01 14:00:51 -07:00
behzad nouri 8e581601d6
patches crds vote-index assignment bug (#14438)
If tower is full, old votes are evicted from the front of the deque:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L367-L373
whereas recent votes, if expired, are evicted from the back:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L529-L537

As a result, from a single tower_index scalar, we cannot infer which crds-vote
should be overwritten:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L576

In addition there is an off by one bug in the existing code. tower_index is
bounded by MAX_LOCKOUT_HISTORY - 1:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/consensus.rs#L382
So, it is at most 30, whereas MAX_VOTES is 32:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L29
Which means that this branch is never taken:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L590-L593
so crds table always keeps 29 **oldest** votes by wallclock, and then
only overrides the 30th one each time. (i.e. a tally of only two most
recent votes).
2021-01-21 13:08:07 +00:00
Jeff Washington (jwash) 935dfdf0f6
fill in timing gaps in replay_stage (#14550)
* fill in timing gaps in replay_stage

* add replay_stage bank_count metric

* formatting

* handle another gap

* cleanup wait_receive_time to be more straightforward
2021-01-13 10:08:53 -06:00
behzad nouri 49019c6613
obtains staked-nodes from the root-bank (#14257)
... as opposed to the working bank
2020-12-27 13:28:05 +00:00
Trent Nelson 5b903318b2 vote: Add helper for creating current-versioned states 2020-12-22 19:37:26 -07:00
carllin 75e9e321de
Fix race between setting tick height and calculating accounts hash (#14101)
Co-authored-by: Carl Lin <carl@solana.com>
2020-12-15 12:45:40 -08:00
Michael Vines 7143aaa89b Clippy 2020-12-14 08:03:29 -08:00
carllin 55fc963595
Move slot cleanup to AccountsBackgroundService (#13911)
* Move bank drop to AccountsBackgroundService

* Send to ABS on drop instead, protects against other places banks are dropped

* Fix Abi

* test

Co-authored-by: Carl Lin <carl@solana.com>
2020-12-13 01:22:34 +00:00
sakridge c5fe076432
Better dupe detection (#13992) 2020-12-09 23:14:31 -08:00
carllin 239a191612
Remove unneeded BankWeight fork choice (#13978)
Co-authored-by: Carl Lin <carl@solana.com>
2020-12-07 13:47:14 -08:00
carllin 34b68288c8
Fix propagation skip check (#13933)
Co-authored-by: Carl Lin <carl@solana.com>
2020-12-03 12:31:38 -08:00
behzad nouri e1793e5a13
caches vote-state de-serialized from vote accounts (#13795)
Gossip and other places repeatedly de-serialize vote-state stored in
vote accounts. Ideally the first de-serialization should cache the
result.

This commit adds new VoteAccount type which lazily de-serializes
VoteState from Account data and caches the result internally.

Serialize and Deserialize traits are manually implemented to match
existing code. So, despite changes to frozen_abi, this commit should be
backward compatible.
2020-11-30 17:18:33 +00:00
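The lazy de-serialize-and-cache pattern from #13795 can be sketched roughly as below. This is an assumption-laden illustration: the `VoteState` struct, the toy `decode` function, and the `decode_count` counter are hypothetical, and `std::sync::OnceLock` is used here as one convenient way to cache a value on first access, not necessarily the mechanism the commit uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical decoded state; the real VoteState has many more fields.
#[derive(Debug, PartialEq)]
pub struct VoteState {
    pub root_slot: u64,
}

pub struct VoteAccount {
    data: Vec<u8>,
    // Filled on first access, then reused by every later call.
    cached: OnceLock<Option<VoteState>>,
    // Counts decode calls so the caching is observable (illustration only).
    decodes: AtomicUsize,
}

impl VoteAccount {
    pub fn new(data: Vec<u8>) -> Self {
        Self {
            data,
            cached: OnceLock::new(),
            decodes: AtomicUsize::new(0),
        }
    }

    /// Decodes the account data at most once; later calls return the cache.
    pub fn vote_state(&self) -> Option<&VoteState> {
        self.cached
            .get_or_init(|| {
                self.decodes.fetch_add(1, Ordering::Relaxed);
                decode(&self.data)
            })
            .as_ref()
    }

    pub fn decode_count(&self) -> usize {
        self.decodes.load(Ordering::Relaxed)
    }
}

/// Toy decoder (assumption): the first 8 bytes as a little-endian u64.
fn decode(data: &[u8]) -> Option<VoteState> {
    if data.len() < 8 {
        return None;
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&data[..8]);
    Some(VoteState {
        root_slot: u64::from_le_bytes(bytes),
    })
}
```

Caching the `Option` rather than only the success case means a failed decode is also remembered, so malformed account data is not re-parsed on every access.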
Michael Vines 959880db60 Remove unused pubkey::Pubkey imports 2020-10-21 19:08:13 -07:00
Michael Vines 7bc073defe Run `codemod --extensions rs Pubkey::new_rand solana_sdk::pubkey::new_rand` 2020-10-21 19:08:13 -07:00
Ryo Onodera a44e4d386f
Better tower logs for SwitchForkDecision and etc (#12875)
* Better tower logs for SwitchForkDecision and etc

* nits

* Update comment
2020-10-15 18:30:33 +09:00
Michael Vines c5c8da1ac0 Expose all rewards (fees, rent, voting and staking) in RPC getConfirmedBlock and the cli 2020-10-09 21:54:13 -07:00
Tyera Eulberg 89621adca7
Rpc -> proper optimistic confirmation (#12514)
* Add service to track the most recent optimistically confirmed bank

* Plumb service into ClusterInfoVoteListener and ReplayStage

* Clean up test

* Use OptimisticallyConfirmedBank in RPC

* Remove superfluous notifications from RpcSubscriptions

* Use crossbeam to avoid mpsc recv_timeout panic

* Review comments

* Remove superfluous last_checked_slots, but pass in OptimisticallyConfirmedBank for complete correctness
2020-09-28 20:43:05 -06:00
carllin 06f84c65f1
Fix rooted accounts cleanup, simplify locking (#12194)
Co-authored-by: Carl Lin <carl@solana.com>
2020-09-28 16:04:46 -07:00
Justin Starry 731a943239
Remove transaction encoding from storage layer (#12404) 2020-09-24 13:10:29 +08:00
Ryo Onodera cb8661bd49
Persistent tower (#10718)
* Save/restore Tower

* Avoid unwrap()

* Rebase cleanups

* Forcibly pass test

* Correct reconciliation of votes after validator resume

* d b g

* Add more tests

* fsync and fix test

* Add test

* Fix fmt

* Debug

* Fix tests...

* save

* Clarify error message and code cleaning around it

* Move most of code out of tower save hot codepath

* Proper comment for the lack of fsync on tower

* Clean up

* Clean up

* Simpler type alias

* Manage tower-restored ancestor slots without banks

* Add comment

* Extract long code blocks...

* Add comment

* Simplify returned tuple...

* Tweak too aggressive log

* Fix typo...

* Add test

* Update comment

* Improve test to require non-empty stray restored slots

* Measure tower save and dump all tower contents

* Log adjust and add threshold related assertions

* cleanup adjust

* Properly lower stray restored slots priority...

* Rust fmt

* Fix test....

* Clarify comments a bit and add TowerError::TooNew

* Further clean-up around TowerError

* Truly create ancestors by excluding last vote slot

* Add comment for stray_restored_slots

* Add comment for stray_restored_slots

* Use BTreeSet

* Consider root_slot into post-replay adjustment

* Tweak logging

* Add test for stray_restored_ancestors

* Reorder some code

* Better names for unit tests

* Add frozen_abi to SavedTower

* Fold long lines

* Tweak stray ancestors and too old slot history

* Re-adjust error condition of too old slot history

* Test normal ancestors is checked before stray ones

* Fix conflict, update tests, adjust behavior a bit

* Fix test

* Address review comments

* Last touch!

* Immediately after creating cleaning pr

* Revert stray slots

* Revert comment...

* Report error as metrics

* Revert not to panic! and ignore unfixable test...

* Normalize lockouts.root_slot more strictly

* Add comments for panic! and more assertions

* Proper initialize root without vote account

* Clarify code and comments based on review feedback

* Fix rebase

* Further simplify based on assured tower root

* Reorder code for more readability

Co-authored-by: Michael Vines <mvines@gmail.com>
2020-09-19 14:03:54 +09:00
carllin 9c490e06b0
Fix propagation on startup from snapshot (#12177) 2020-09-11 02:03:11 -07:00
Tyera Eulberg 05db41fe9c
Cache block time in Blockstore (#11955)
* Add blockstore column to cache block times

* Add method to cache block time

* Add service to cache block time

* Update rpc getBlockTime to use new method, and refactor blockstore slightly

* Return block_time with confirmed block, if available

* Add measure and warning to cache-block-time
2020-09-09 09:33:14 -06:00
Ryo Onodera 53b8ea4464
Rename to ClusterType and restore devnet compat. (#12068)
* Rename to ClusterType and restore devnet compat.

* De-duplicate parse code and add comments

* Adjust default Devnet genesis & reduce it in tests
2020-09-08 23:55:09 +09:00
carllin 7e25130529
Send votes from banking stage to vote listener (#11434)
*  Send votes from banking stage to vote listener

Co-authored-by: Carl <carl@solana.com>
2020-08-07 11:21:35 -07:00
carllin d7e961dac4
Enable new fork choice on mainnet, 400_000 slots into epoch 61 (#11312)
Co-authored-by: Carl <carl@solana.com>
2020-07-31 20:37:58 +00:00
carllin bf18524368
Add hook for getting vote transactions on replay (#11264)
* Add hook for getting vote transactions on replay

Co-authored-by: Carl <carl@solana.com>
2020-07-29 23:17:40 -07:00
carllin a7ea340f22
Track votes from gossip for optimistic confirmation (#11209)
* Add check in cluster_info_vote_listener to see if optimistic conf was achieved
Add OptimisticConfirmationVerifier

* More fixes

* Fix merge conflicts

* Remove gossip notification

* Add dashboards

* Fix rebase

* Count switch votes as well toward optimistic conf

* rename

Co-authored-by: Carl <carl@solana.com>
2020-07-28 09:33:27 +00:00
carllin c0dc21620b
Test cleanup (#11192)
Co-authored-by: Carl <carl@solana.com>
2020-07-24 09:55:25 +00:00
carllin 6578ad7d08
Speed up local cluster partitioning tests (#11177)
* Fix long local cluster partition tests by skipping slot warmup

Co-authored-by: Carl <carl@solana.com>
2020-07-23 18:50:42 -07:00
carllin 73f3d04798
Add replay votes to gossip vote tracking (#11119)
* Plumb replay vote channel for notifying vote listener of replay votes

* Keep gossip only notification for debugging gossip in the future

Co-authored-by: Carl <carl@solana.com>
2020-07-20 17:29:07 -07:00
Greg Fitzgerald 2fdbb97244
Rename largest_confirmed_root to highest_confirmed_root (#10947) 2020-07-07 23:59:46 +00:00
carllin 4b93a7c1f6
Fix fork detection (#10839)
* Fix fork detection

Co-authored-by: Carl <carl@solana.com>
2020-06-29 18:49:57 -07:00
sakridge 17a2128a8f
More replay stage timing metrics (#10828) 2020-06-28 10:04:15 -07:00
Greg Fitzgerald 50b3fa83a0
Move BankCommitmentCache to solana_runtime (#10816)
* Remove Blockstore member variable from BlockCommitmentCache

* Hoist is_confirmed_rooted() to its only caller

BlockCommitmentCache no longer depends on Blockstore

* Move BlockCommitmentCache to solana_runtime
2020-06-25 22:06:58 -06:00
Ryo Onodera 44f5452013
Remove unused StakeLockout::lockout (#10719)
* Remove unused StakeLockout::lockout

* Revert...

* Really revert to the original behavior...

* Use consistent naming after StakeLockout removal

* Further clean up

* Missed type aliases...

* More...

* Even more...
2020-06-23 10:30:09 +09:00
Greg Fitzgerald 0550b893b0
Fix typos (#10675)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-06-17 20:54:52 -07:00
Justin Starry 39984cdcc3
Wait until bank is frozen before sending RPC notifications (#10654) 2020-06-18 00:44:51 +08:00
Greg Fitzgerald 6ee222363e
Move BankForks to solana_runtime (#10637)
* Move BankForks to solana_runtime

* Update imports
2020-06-17 15:27:03 +00:00
carllin f8b88d717e
Enable fork choice and switch votes, devnet => now, testnet => epoch 63 (#10615)
* Enable fork choice, devnet => now, testnet => epoch 63

* Set development to 0

* Enable switch vote slot

Co-authored-by: Carl <carl@solana.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-06-16 09:55:36 +00:00
anatoly yakovenko ba83e4ca50
Fix fanout gossip bench (#10509)
* Gossip benchmark

* Rayon tweaking

* push pulls

* fanout to max nodes

* fixup! fanout to max nodes

* fixup! fixup! fanout to max nodes

* update

* multi vote test

* fixup prune

* fast propagation

* fixups

* compute up to 95%

* test for specific tx

* stats

* stats

* fixed tests

* rename

* track a lagging view of which nodes have the local node in their active set in the local received_cache

* test fixups

* dups are old now

* dont prune your own origin

* send vote to tpu

* tests

* fixed tests

* fixed test

* update

* ignore scale

* lint

* fixup

* fixup

* fixup

* cleanup

Co-authored-by: Stephen Akridge <sakridge@gmail.com>
2020-06-13 22:03:38 -07:00
Greg Fitzgerald 8dd5384d6d
Split commitment module (#10541)
automerge
2020-06-12 17:16:10 -07:00
Greg Fitzgerald 2eb6f498a8
Remove redundant BankForks parameter (#10537) 2020-06-12 11:04:17 -06:00
carllin 2e1d59ff85
Adopt heaviest subtree fork choice rule (#10441)
* Add HeaviestSubtreeForkChoice

* Make replay stage switch between two fork choice rules

Co-authored-by: Carl <carl@solana.com>
2020-06-11 12:16:04 -07:00
Kristofer Peterson e23340d89e
Clippy cleanup for all targets and nightly rust (also support 1.44.0) (#10445)
* address warnings from 'rustup run beta cargo clippy --workspace'

minor refactoring in:
- cli/src/cli.rs
- cli/src/offline/blockhash_query.rs
- logger/src/lib.rs
- runtime/src/accounts_db.rs

expect some performance improvement in AccountsDB::clean_accounts()

* address warnings from 'rustup run beta cargo clippy --workspace --tests'

* address warnings from 'rustup run nightly cargo clippy --workspace --all-targets'

* rustfmt

* fix warning stragglers

* properly fix clippy warnings test_vote_subscribe()
replace ref-to-arc with ref parameters where arc not cloned

* Remove lock around JsonRpcRequestProcessor (#10417)

automerge

* make ancestors parameter optional to avoid forcing construction of empty hash maps

Co-authored-by: Greg Fitzgerald <greg@solana.com>
2020-06-09 09:38:14 +09:00
carllin d9366776b2
Add operating mode gating (#10332)
Co-authored-by: Carl <carl@solana.com>
2020-05-30 00:03:19 -07:00
carllin 778078e1dc
Distinguish switch/non-switching votes in ReplayStage (#10218)
* Add SwitchForkDecision, change vote instruction based on decision

* Factor out SelectVoteAndResetForkResult

Co-authored-by: Carl <carl@solana.com>
2020-05-29 14:40:36 -07:00
Kristofer Peterson fb4d8e1f62
cleanup clippy tests (#10172)
automerge
2020-05-29 00:26:06 -07:00
Tyera Eulberg bac4aec16f
Trigger RPC notifications after block commitment cache update (#10077)
* Fixup commitment-aggregation metric

* Trigger notifications after commitment-cache update

* Fixup fn name

* Add single-confirmation commitment level

* Rename to highest_confirmed_slot

* Pass commitment-cache info directly to notifications

* Use match

* Update commitment docs

* Update out of date pubsub docs
2020-05-18 12:49:01 -06:00
Kristofer Peterson 58ef02f02b
9951 clippy errors in the test suite (#10030)
automerge
2020-05-15 09:35:43 -07:00
Jack May eb1acaf927
Remove archiver and storage program (#9992)
automerge
2020-05-14 18:22:47 -07:00
carllin 59de1b3b62
Compute Switch Threshold (#9218)
* Add switching threshold check

Co-authored-by: Carl <carl@solana.com>
2020-05-11 22:20:11 -07:00
carllin 01ab1d1369
Add metrics for logging time taken in replaystage steps (#9933)
automerge
2020-05-08 03:46:29 -07:00
carllin e970c58330
Properly handle ancestor/descendant maps (#9932)
* Account for descendants < root not existing in BankForks, purge ancestors/descendants map for consistency with BankForks and progress map


Co-authored-by: Carl <carl@solana.com>
2020-05-07 23:39:57 -07:00
Tyera Eulberg 754c65c066
Refactor RPC subscriptions account handling (#9888)
* Switch subscriptions to use commitment instead of confirmations

* Add bank method to return account and last-modified slot

* Add last_modified_slot to subscription data and use to filter account subscriptions

* Update tests to non-zero last_notified_slot

* Add accounts subscriptions to test; fails at higher tx load

* Pass BankForks to RpcSubscriptions

* Use BankForks on add_account_subscription to properly initialize last_notified_slot

* Bundle subscriptions

* Check for non-equality

* Use commitment to initialize last_notified_slot; revert context.slot change
2020-05-07 00:23:06 -06:00
carllin 445e6668c2
Fix (#9896)
Co-authored-by: Carl <carl@solana.com>
2020-05-06 11:44:49 -07:00
Michael Vines 72312ad615
Avoid holding the entire rooted path while loading bank forks (#9885) 2020-05-05 19:45:41 -07:00
carllin 3442f36f8a
Repair alternate versions of dead slots (#9805)
Co-authored-by: Carl <carl@solana.com>
2020-05-05 14:07:21 -07:00
Tyera Eulberg a7f33b5014
Cache banks in BankForks until optional largest_confirmed_root (#9678)
automerge
2020-04-24 15:49:57 -07:00
Greg Fitzgerald 76b1c2baf0
One less alloc per transaction (#9705)
* One less alloc per transaction

* Fix benches

* Fix compiler warnings in bench build

* Fix move build

* Fix bench
2020-04-24 13:03:46 -06:00
Tyera Eulberg d5abff82e0
Add largest_confirmed_root to BlockCommitmentCache (#9640)
* Add largest_confirmed_root to BlockCommitmentCache

* clippy

* Add blockstore to BlockCommitmentCache to check root

* Add rooted_stake helper fn and test

* Nodes that are behind should correctly id confirmed roots

* Simplify rooted_stake collector
2020-04-22 12:22:09 -06:00
carllin bab3502260
Push down cluster_info lock (#9594)
* Push down cluster_info lock

* Rework budget decrement

Co-authored-by: Carl <carl@solana.com>
2020-04-21 12:54:45 -07:00
carllin 1607891b29
log proper slot (#9576)
Co-authored-by: Carl <carl@solana.com>
2020-04-19 14:24:45 -07:00
sakridge 66abe45ea1
Decouple accounts hash calculation from snapshot hash (#9507) 2020-04-16 15:12:20 -07:00
carllin 3037eb8d4f
Remove slot field, add test (#9444)
Co-authored-by: Carl <carl@solana.com>
2020-04-10 23:52:37 -07:00
carllin aa8dfac313
Simplify vote simulation (#9435)
Co-authored-by: Carl <carl@solana.com>
2020-04-10 15:16:12 -07:00
carllin 4522e85ac4
Add Metrics/Dashboards tracking block production (#9342)
* Add metric tracking blocks/dropped blocks

Co-authored-by: Carl <carl@solana.com>
2020-04-08 14:35:24 -07:00
Michael Vines ad0997e15f
RPC: add `err` field to TransactionStatus, alongside the now deprecated `status` field (#9296)
automerge
2020-04-04 16:13:26 -07:00
carllin 0139236464
ReplayStage fixes (#9271) (#9279)
automerge
2020-04-02 21:05:33 -07:00
Jack May 268e04cb4a
Rename CustomError to Custom (#9207) 2020-04-01 09:01:11 -07:00
Michael Vines 0e2722c638
solana-validator now supports multiple --authorized-voter arguments (#9174)
* Use Epoch type

* Vote account's authorized voter is now supported without a validator restart
2020-03-31 08:23:42 -07:00
carllin 66946a4680
Check ClusterSlots for confirmation of block propagation (#9115) 2020-03-30 19:57:11 -07:00
Tyera Eulberg 50fa577af8
Use cluster confirmations in rpc and pubsub (#9138)
* Add runtime methods to simply get status and slot

* Add helper function to get slot confirmation_count from BlockCommitmentCache

* Return cluster confirmations in getSignatureStatus

* Remove use of invalid get_signature_confirmation_status

* Remove unused methods

* Update pubsub to use cluster confirmations

* Fix test_check_signature_subscribe failure

* Refactor confirmations to read commitment cache only once

* Review comments

* Use bank, root from BlockCommitmentCache

* Update docs

* Add metric for block-commitment aggregations

Co-authored-by: Justin Starry <justin@solana.com>
2020-03-30 17:53:25 -06:00
Tyera Eulberg 62040cef56
Store BlockCommitmentCache slot and root metadata (#9154)
automerge
2020-03-30 10:29:30 -07:00
Justin Starry c1a3b6ecc2
Add RPC subscription api for rooted slots (#9118)
automerge
2020-03-27 09:33:40 -07:00
carllin d47262d233
Reduce transmit frequency (#9113)
Co-authored-by: Carl <carl@solana.com>
2020-03-26 23:33:28 -07:00
carllin 5a8658283a
Add check for propagation of leader block before generating further blocks (#8758)
Co-authored-by: Carl <carl@solana.com>
2020-03-26 19:57:27 -07:00
carllin f3d556e3f9
Refactor VoteTracker (#9084)
* Refactor VoteTracker

Co-authored-by: Carl <carl@solana.com>
2020-03-26 17:55:17 -07:00
sakridge b7b4aa5d4d
move rpc types from client to client-types crate (#9039)
* Separate client types into own crate, so ledger does not need it

Removes about 50 crates of dependency from ledger

* Drop Rpc name from transaction-status types
2020-03-26 13:29:30 -07:00
Michael Vines aa24181a53
Remove blockstream unix socket support. RPC or bust (#9004)
automerge
2020-03-21 20:17:11 -07:00
Michael Vines 18c1f0dfe9
Remove stub core/src/genesis_utils.rs (#8999) 2020-03-21 10:54:40 -07:00
carllin dc1db33ec9
Add Capabilities to Signal BroadcastStage to Retransmit (#8899) 2020-03-19 23:35:01 -07:00
sakridge dc347dd3d7
Add Accounts hash consistency halting (#8772)
* Accounts hash consistency halting

* Add option to inject account hash faults for testing.

Enable option in local cluster test to see that node halts.
2020-03-16 08:37:31 -07:00
Carl ead6dc553a If let 2020-03-16 07:57:07 -07:00
Carl 009c124fac Remove generic 2020-03-16 07:57:07 -07:00
Carl 9411fc00b8 Lower error level 2020-03-16 07:57:07 -07:00
carllin 53b8d0d528
Remove holding Poh lock (#8838)
automerge
2020-03-13 15:15:13 -07:00
carllin 9872430bd2
Add VoteTracker for tracking cluster's votes in gossip (#8327)
Track votes by slot in cluster_vote_listener
2020-03-09 22:03:09 -07:00
carllin f23dc11a86
compute_bank_stats needs to return newly computed ForkStats (#8608)
* Fix broken confirmation, add test
2020-03-04 11:49:56 -08:00
carllin 8ef8c9094a
Add ReplayStage changes for checking switch threshold (#8504)
* Refactor for supporting switch threshold check
2020-03-02 12:43:43 -08:00
carllin 5f766cd20b
Remove loop (#8493) 2020-02-26 19:59:28 -08:00
carllin d47a47924a
Update voting simulation (#8460) 2020-02-26 14:09:07 -08:00
carllin 7a2bf7e7eb
Limit leader schedule search space (#8468)
* Limit leader schedule search space

* Fix and add test

* Rename
2020-02-26 13:35:50 -08:00
carllin d821fd29d6
Add versioning (#8348)
automerge
2020-02-25 17:12:01 -08:00
sakridge 004f1d5aed
Combine replay stage memory reporting (#8455)
automerge
2020-02-25 16:04:27 -08:00
sakridge 1caeea8bc2
Refactor new bank paths into common function (#8454) 2020-02-25 15:49:59 -08:00
Pankaj Garg aa80f69171
Promote some datapoints to `info` to fix dashboard (#8381)
automerge
2020-02-21 13:41:49 -08:00
Tyera Eulberg ab361a8073
Rename KeypairUtil to Signer (#8360)
automerge
2020-02-20 13:28:55 -08:00
anatoly yakovenko ccad5d5aaf
change warnings to infos (#8322) 2020-02-19 14:25:49 -08:00
carllin 73a278dc64
Factor out creating genesis with vote accounts into a utility function (#8315)
automerge
2020-02-18 02:39:47 -08:00
anatoly yakovenko 17fb8258e5
Datapoints overwhelm the metrics queue and blow up ram usage. (#8272)
automerge
2020-02-14 11:11:55 -08:00
Michael Vines c4fd81fc1c The getConfirmedBlock RPC API is now disabled by default
The --enable-rpc-get-confirmed-block flag allows validators to opt-in to
the higher disk usage and IOPS.
2020-02-11 22:24:08 -07:00
Michael Vines 72b11081a4 Report validator rewards in getConfirmedBlock JSON RPC 2020-02-11 17:25:45 -07:00
carllin 0c8cee8c4a
Refactor select_fork() to avoid clones and for clarity (#8081)
* Refactor select_fork() to avoid clones and for clarity

* Add test that fork weights are increasing
2020-02-03 16:48:24 -08:00
carllin 4197cce8c9
Tower tests (#7974)
* Add testing framework for voting
2020-01-28 16:02:28 -08:00
carllin 4ffd7693d6
Add lock to make sure slot-based locktree calls are safe (#7993) 2020-01-28 13:45:41 -08:00
Sunny Gleason 5cf090c896
feat: implement RPC notification queue (#7863) 2020-01-20 16:08:29 -05:00
Tyera Eulberg 6d3b8b6d7d
Remove tuples from JSON RPC responses (#7806)
* Remove RpcConfirmedBlock tuple

* Remove getRecentBlockhash tuple

* Remove getProgramAccounts tuple

* Remove tuple from get_signature_confirmation_status

* Collect Rpc response types

* Camel-case epoch schedule for rpc response

* Remove getBlockCommitment tuple

* Remove getStorageTurn tuple

* Update json-rpc docs
2020-01-15 00:25:45 -07:00
Justin Starry ff1ca1e0d3
Consolidate entry tick verification into one function (#7740)
* Consolidate entry tick verification into one function

* Mark bad slots as dead in blocktree processor

* more feedback

* Add bank.is_complete

* feedback
2020-01-15 09:15:26 +08:00
Greg Fitzgerald b5dba77056 Rename blocktree to blockstore (#7757)
automerge
2020-01-13 13:13:52 -08:00
Tyera Eulberg a17d5795fb getConfirmedBlock: add encoding optional parameter (#7756)
automerge
2020-01-12 21:34:30 -08:00
sakridge 73c93cc345
Print bank hash and hash inputs. (#7733) 2020-01-09 16:33:10 -08:00
Michael Vines 4fe0b116ae Measure heap usage while processing the ledger 2020-01-03 13:25:37 -07:00
Michael Vines a0fb9de515 Move thread_mem_usage module into measure/ 2020-01-03 13:25:37 -07:00
Pankaj Garg 87b2525e03
Limit maximum number of shreds in a slot to 32K (#7584)
* Limit maximum number of shreds in a slot to 32K

* mark dead slot replay as fatal error
2019-12-30 07:42:09 -08:00
Parth 727be309b2 fix entryverification state (#7169)
automerge
2019-12-23 23:26:27 -08:00
Sagar Dhawan 6a9005645a
Update "limit-ledger-size" to use DeleteRange for much faster deletes (#7515)
* Update "limit-ledger-size" to use DeleteRange for much faster deletes

* Update core/src/ledger_cleanup_service.rs

Co-Authored-By: Michael Vines <mvines@gmail.com>

* Rewrite more idiomatically

* Move max_ledger_slots to a fn for clippy

* Remove unused import

* Detect when all columns have been purged and fix a bug in deletion

* Check that more than 1 column is actually deleted

* Add helper to test that ledger meets minimum slot bounds

* Remove manual batching of deletes

* Refactor to keep some N slots older than the highest root

* Define MAX_LEDGER_SLOTS that ledger_cleanup_service will try to keep around

* Refactor compact range
2019-12-18 11:50:09 -08:00
sakridge 98b80288ed
Optimize bank_forks critical section (#7477) 2019-12-13 17:20:31 -08:00
sakridge dd54fff978
Use pinned memory for entry verify (#7440) 2019-12-12 10:36:27 -08:00
Parth 6d2861f358
add unit test for minority fork overcommit attack (#7292)
* add unit test for minority fork overcommit attack

* add generic function to simulate fork selection
2019-12-10 22:06:16 +05:30
TristanDebrunner 9ecb844de7 Split up ReplayStageConfig to make it derive Default (#7334)
automerge
2019-12-06 14:39:35 -08:00
Tyera Eulberg 3ab8185777
Add intermittent Timestamping to Votes (#7233)
* Add intermittent timestamp to Vote

* Add timestamp to VoteState, add timestamp processing to program

* Print recent timestamp with solana show-vote-account

* Add offset of 1 to timestamp Vote interval to initialize at node boot (slot 1)

* Review comments

* Cache last_timestamp in Tower and use for interval check

* Move work into Tower method

* Clarify timestamping interval

* Replace tuple with struct
2019-12-06 14:38:49 -07:00
TristanDebrunner fae9c08815
Add ReplayStageConfig (#7195) 2019-12-04 11:17:17 -07:00