Commit Graph

543 Commits

Author SHA1 Message Date
Lucas Steuernagel b97b3dd4ab
Use BankForks on tests - Part 3 (#34248)
* Add BankForks to core tests

* Refactor functions under DCOU
2023-12-01 13:47:22 -03:00
steviez 935e06f8f1
Output BankHashDetails file when leader drops its' own block (#34256)
Currently, the file is generated when a node drops a block that was
produced by another node. However, it would also be beneficial to see
the account state when a node drops its' own block.

Output the file in this additional failure codepath
2023-11-29 17:20:27 -06:00
Ashwin Sekar 504f2ee892
Add deepest slot metric (#34186)
* reset to deepest slot when last vote is for an invalid fork.

* pr feedback: comments, height starts at 1
2023-11-21 04:07:00 -05:00
Ashwin Sekar b5256997f8
refactor: GossipDuplicateConfirmed/cluster_confirmed -> DuplicateConf… (#34012)
refactor: GossipDuplicateConfirmed/cluster_confirmed -> DuplicateConfirmed
2023-11-10 14:47:42 -05:00
Lijun Wang 69cec7e7b7
Remove RwLock on BlockNotifier (#33981) 2023-11-08 10:27:50 -08:00
Tyera d6ac9bea84
Geyser: return real parent blockhash, or default (#33873)
Return real parent blockhash, or default
2023-11-06 11:14:18 -07:00
Jeff Biseda 63abc72e86
remove unused replay-loop-voting-stats values (#33935) 2023-10-31 23:40:45 -07:00
Ryo Onodera 136ab21f34
Define InstalledScheduler::wait_for_termination() (#33922)
* Define InstalledScheduler::wait_for_termination()

* Rename to wait_for_scheduler_termination

* Comment wait_for_termination and WaitReason better
2023-10-31 14:33:36 +09:00
Ryo Onodera 080285cb95
Adjust solana-core for cleaner scheduler-pr diff (#33881) 2023-10-27 12:29:41 +09:00
Pankaj Garg 9d42cd7efe
Initialize fork graph in program cache during bank_forks creation (#33810)
* Initialize fork graph in program cache during bank_forks creation

* rename BankForks::new to BankForks::new_rw_arc

* fix compilation

* no need to set fork_graph on insert()

* fix partition tests
2023-10-23 09:32:41 -07:00
steviez 56ccffdaa5
Replace get_tmp_ledger_path!() with self cleaning version (#33702)
This macro is used a lot for tests to create a ledger path in order to
open a Blockstore. Files will be left on disk unless the test remembers
to call Blockstore::destroy() on the directory. So, instead of requiring
this, use the get_tmp_ledger_path_auto_delete!() macro that creates a
TempDir (which automatically deletes itself when it goes out of scope).
2023-10-21 11:38:31 +02:00
Ryo Onodera 5a963529a8
Add BankWithScheduler for upcoming scheduler code (#33704)
* Add BankWithScheduler for upcoming scheduler code

* Remove too confusing insert_without_scheduler()

* Add doc comment as a bonus

* Simplify BankForks::banks()

* Add derive(Debug) on BankWithScheduler
2023-10-21 15:56:43 +09:00
Pankaj Garg 59cb3b57ee
Set a global fork graph in program cache (#33776)
* Set a global fork graph in program cache

* fix deadlock

* review feedback
2023-10-20 08:47:03 -07:00
Ryo Onodera 1704789247
Define tick related helper test methods (#33537)
* Define tick related helper methods

* dcou VoteSimulator

* blacklist ledger-tool for dcou

* fix dcou ci...

* github
2023-10-10 09:23:18 +09:00
Wen 630feeddf2
Add wen_restart module (#33344)
* Add wen_restart module:
- Implement reading LastVotedForkSlots from blockstore.
- Add proto file to record the intermediate results.
- Also link wen_restart into validator.
- Move recreation of tower outside replay_stage so we can get last_vote.

* Update lock file.

* Fix linter errors.

* Fix depencies order.

* Update wen_restart explanation and small fixes.

* Generate tower outside tvu.

* Update validator/src/cli.rs

Co-authored-by: Tyera <teulberg@gmail.com>

* Update wen-restart/protos/wen_restart.proto

Co-authored-by: Tyera <teulberg@gmail.com>

* Update wen-restart/build.rs

Co-authored-by: Tyera <teulberg@gmail.com>

* Update wen-restart/src/wen_restart.rs

Co-authored-by: Tyera <teulberg@gmail.com>

* Rename proto directory.

* Rename InitRecord to MyLastVotedForkSlots, add imports.

* Update wen-restart/Cargo.toml

Co-authored-by: Tyera <teulberg@gmail.com>

* Update wen-restart/src/wen_restart.rs

Co-authored-by: Tyera <teulberg@gmail.com>

* Move prost-build dependency to project toml.

* No need to continue if the distance between slot and last_vote is
already larger than MAX_SLOTS_ON_VOTED_FORKS.

* Use 16k slots instead of 81k slots, a few more wording changes.

* Use AncestorIterator which does the same thing.

* Update Cargo.lock

* Update Cargo.lock

---------

Co-authored-by: Tyera <teulberg@gmail.com>
2023-10-06 15:04:37 -07:00
carllin ec2e1241a1
Cleanup select_vote_and_reset_forks() (#33421) 2023-09-29 15:11:25 -07:00
Tyera ddd029774a
Add geyser block-metadata notification with entry count (#33359)
* Add new ReplicaBlockInfoVersions variant

* Use new variant to return entry count
2023-09-26 10:13:17 -06:00
Ashwin Sekar 85cc6ace05
Update is_locked_out cache when adopting on chain vote state (#33341)
* Update is_locked_out cache when adopting on chain vote state

* extend to all cached tower checks

* upgrade error to panic
2023-09-25 12:33:38 -07:00
Pankaj Garg f50342a790
Split vote related code from runtime to its own crate (#32882)
* Move vote related code to its own crate

* Update imports in code and tests

* update programs/sbf/Cargo.lock

* fix check errors

* update abi_digest

* rebase fixes

* fixes after rebase
2023-09-19 10:46:37 -07:00
Ashwin Sekar a8e83c8720
replay: send duplicate proofs from blockstore to state machine (#32962)
* replay: send duplicate proofs from blockstore to state machine

* pr feedback: bank.slot() -> slot

* pr feedback
2023-09-05 21:29:53 -07:00
Ashwin Sekar 025651e0d4
ff cleanup: allow_votes_to_directly_update_vote_state and compact_vot… (#32967)
ff cleanup: allow_votes_to_directly_update_vote_state and compact_vote_state_updates
2023-08-30 17:00:19 -07:00
Trent Nelson b8dc5daedb
preliminaries for bumping nightly to 2023-08-25 (#33047)
* remove unnecessary hashes around raw string literals

* remove unncessary literal `unwrap()`s

* remove panicking `unwrap()`

* remove unnecessary `unwrap()`

* use `[]` instead of `vec![]` where applicable

* remove (more) unnecessary explicit `into_iter()` calls

* remove redundant pattern matching

* don't cast to same type and constness

* do not `cfg(any(...` a single item

* remove needless pass by `&mut`

* prefer `or_default()` to `or_insert_with(T::default())`

* `filter_map()` better written as `filter()`

* incorrect `PartialOrd` impl on `Ord` type

* replace "slow zero-filled `Vec` initializations"

* remove redundant local bindings

* add required lifetime to associated constant
2023-08-29 23:05:35 +00:00
Pankaj Garg dbe4017143
Prune programs deployed in duplicate unconfirmed slot (#32999)
* Prune programs deployed in duplicate unconfirmed slot

* unit test
2023-08-25 11:02:20 -07:00
Ashwin Sekar 329c6f131b
tower: when syncing from vote state, update last_vote (#32944)
* tower: when syncing from vote state, update last_vote

* pr: bubble error through unchecked
2023-08-23 13:15:57 -07:00
steviez a4c8cc3ce0
Remove improper uses of &Arc<Bank> (#32802)
In most cases, either a &Bank or an Arc<Bank> is more proper.
- &Bank is used if the function only needs a momentary reference
- Arc<Bank> is used if the function needs its' own copy

This PR leaves several instances of &Arc<Bank> around; these instances
are situations where a clone may only happen conditionally.
2023-08-18 16:46:34 -05:00
steviez 6bbf514e78
Add ability to output components that go into Bank hash (#32632)
When a consensus divergance occurs, the current workflow involves a
handful of manual steps to hone in on the offending slot and
transaction. This process isn't overly difficult to execute; however, it
is tedious and currently involves creating and parsing logs.

This change introduces functionality to output a debug file that
contains the components go into the bank hash. The file can be generated
in two ways:
- Via solana-validator when the node realizes it has diverged
- Via solana-ledger-tool verify by passing a flag

When a divergance occurs now, the steps to debug would be:
- Grab the file from the node that diverged
- Generate a file for the same slot with ledger-tool with a known good
  version
- Diff the files, they are pretty-printed json
2023-08-15 00:12:05 -05:00
Tao Zhu ef6af307a4
improve prioritization fee cache accuracy (#32692)
* improve prioritization cache accuracy
2023-08-07 19:27:28 -05:00
steviez 20fc3a5ded
Remove improper &Arc<Blockstore> instances (#32698)
Update to either &Blockstore if the function just needs a ref, or
Arc<Blockstore> if the function needs to hang onto a copy.
2023-08-03 15:10:25 -06:00
Brooks 7143667dbe
Removes unnecessary mut (#32476) 2023-07-13 13:14:33 -04:00
behzad nouri d54b6204be
removes instances of clippy::manual_let_else (#32417) 2023-07-09 21:41:36 +00:00
Ashwin Sekar c172dfd268
Only send earliest mismatched ancestor to replay (#31842) 2023-07-06 21:31:12 -07:00
Illia Bobyr 282e043177
`cargo fmt` using 1.6.0-nightly (#32390)
Seems like rustfmt 1.6.0 can now format `let/else` statements.
1.5.2 we use in our `solana-nightly` does not mind the reformatting.

```
$ cargo +nightly fmt --version
rustfmt 1.6.0-nightly (f20afcc 2023-07-04)

$ cargo +nightly fmt
$ git add -u
$ git commit

$ ./cargo nightly fmt --version
+ exec cargo +nightly-2023-04-19 fmt --version
rustfmt 1.5.2-nightly (c609da5 2023-04-18)

$ ./cargo nightly fmt
$ git diff
[empty output]
```
2023-07-06 20:45:29 -07:00
Jeff Biseda bad5197cb0
refactor core to create repair module (#32303) 2023-07-05 12:20:46 -07:00
Ashwin Sekar e1576b5352
Don't attempt to refresh votes on non voting validators (#32315) 2023-06-30 17:53:06 -07:00
Jeff Biseda 87c1b67d53
refactor core to create consensus module (#32282) 2023-06-27 17:25:08 -07:00
Wen 6f72258e3e
Vote refresh fix when outside slothash (#29948)
* When there are too many pubkeys in one slot, kick the one with lowest
stake out.

* Cache last_root to reduce read locks we need.

* Use slots_in_epoch to limit number of slots in the map.

* Fix lint errors.

* Only cache stake and slots per epoch once per epoch.

* Revert "Only cache stake and slots per epoch once per epoch."

This reverts commit 8658aad0083456794b4c4403adaf9c74d1a71d09.

* Vote at the tip of current fork if last vote is outside SlotHash
of the tip and last vote expired.

* Add unittest when last vote is outside slothash, we should vote at the tip
of the current fork.

* Revert "Use slots_in_epoch to limit number of slots in the map."

This reverts commit 93574f57a48d2a70fbbc0f62fa8810d3b6bee0af.

* Revert "Cache last_root to reduce read locks we need."

This reverts commit bb114ec2b62cb9c0207328b19c415f6116be0f1c.

* Revert "When there are too many pubkeys in one slot, kick the one with lowest"

This reverts commit 711e29a6a025fd4f11fbc97dcbbe90e4832be04c.

* Move new vote generation when last vote is outside slothash into the
main path, this actually makes more sense since we don't select where
to vote in two different places, and all the vote generation logic
is seamlessly inherited.

* - Move vote refresh to be behind select vote and do not refresh vote if a new
  vote is selected.
- Check whether last vote is inside slothash inside select_vote_and_reset_forks
- rename slot_within_slothash to is_in_slothashes_history
- remove one unittest for now, more tests will be added in a separate CL

* Remove new test, it will be in another file.

* Add is_in_slot_hashes_history test in the new file.

* Add unittest for the case when last vote is outside slot hashes.

* Small improvements and more unittests.

* Fix bad merge.

* Update docs/src/terminology.md

Co-authored-by: mvines <mvines@gmail.com>

* Put SwitchForkDecision::FailedSwitchThreshold logic into separate function.

* Make linter happy.

---------

Co-authored-by: mvines <mvines@gmail.com>
2023-06-26 18:21:24 -07:00
Jeff Biseda 5ca1b40f11
refactor core to create cluster_slots_service module (#32119) 2023-06-26 08:54:49 -07:00
behzad nouri f6e039b0b3
moves turbine to a separate crate out of solana/core (#32226) 2023-06-22 16:22:11 +00:00
Ashwin Sekar 8135cf35bf
Only dump duplicate descendants in dump & repair (#31559) 2023-06-21 11:28:42 -07:00
Ashwin Sekar 01d3546de0
Increment timestamp on refreshed votes (#31908) 2023-06-15 10:38:22 -07:00
Illia Bobyr 4353ac6797
Pass Arc<AtomicBool> by value, not by reference. (#31916)
`Arc` is already a reference internally, so it does not seem to be
beneficial to pass a reference to it.  Just adds an extra layer of
indirection.

Functions that need to be able to increment `Arc` reference count need
to take `Arc<AtomicBool>`, but those that just want to read the
`AtomicBool` value can accept `&AtomicBool`, making them a bit more
generic.

This change focuses specifically on `Arc<AtomicBool>`.  There are other
uses of `&Arc<T>` in the code base that could be converted in a similar
manner.  But it would make the change even larger.
2023-06-01 17:25:48 -07:00
Andrew Fitzgerald 02ac8a46d6
set_bank takes owned Arc<Bank> (#31717) 2023-05-23 09:41:27 -07:00
Lijun Wang 917f3d2586
Use unwrap_or_else for efficiency (#31747)
Use unwrap_or_else for efficiency.
2023-05-22 09:58:24 -07:00
Ashwin Sekar 3e8f5bad81
refactor: highest_cluster_confirmed_root -> highest_super_majority_root (#31619) 2023-05-14 00:42:03 -07:00
Ashwin Sekar ef75f1cb4e
Add ancestor hashes to state machine (#31627)
* Notify replay of pruned duplicate confirmed slots

* Ingest replay signal and run ancestor hashes for pruned

* Forward PDC to ancestor hashes and ingest pruned dumps from ancestor hashes service

* Add local-cluster test
2023-05-13 02:05:44 -07:00
Tyera 3f70ddb2c5
Add entry notification service for geyser (#31290)
* Move entry_notifier_interface

* Add EntryNotifierService

* Use descriptive struct in sender/receiver

* Optionally initialize EntryNotifierService in validator

* Plumb EntryNotfierSender into Tvu, blockstore_processor

* Plumb EntryNotfierSender into Tpu

* Only return one option when constructing EntryNotifierService
2023-05-10 17:20:51 -06:00
steviez 4300d84c68
Remove counters from ReplayStage (#31532)
replay_stage-voted_empty_bank has been converted into a datapoint that
now includes slot number. replay_stage-replay_transactions has been
removed altogether as we can get similar information on a per-slot basis
from replay-slot-stats metric.
2023-05-09 11:44:02 -05:00
behzad nouri 8e638b785a
removes feature gate code sending votes to tpu-vote-port (#31529) 2023-05-08 18:12:35 +00:00
Lijun Wang 7cf50e60fc
Fixed missing Root notifications via geyser plugin framework (#31180)
* Fixed missing Root notifications via geyser plugin framework

* Renamed a variable

* fmt issue

* Do not try the loop if no subscribers.

* Addressing some feedback -- passing parent roots from replay_stage to avoid race conditions

* clippy issue

* Address some reviewing findings

* Addressed some feedback from Carl

* fix a clippy issue

* Added comments on optimistically_confirmed_bank_tracker module to explain the workflow

* Addressed Trent's review
2023-05-03 18:50:00 +08:00
steviez 758bc1ca75
Make ReplayStage panic before dumping repeated-repair-attempt slots (#31333)
When ReplayStage repeatedly fails to compute the correct for a block
after purging and repairing, it panics on the assumption that something
is very wrong and will require human intervention.

If this is the case, there is typically something to be debugged, and
having the slot available locally is valuable. This change does the
retry check that will panic before purging the failure slot.
2023-04-25 11:50:47 -05:00