* gossip: notify state machine of duplicate proofs
* Add feature flag for ingesting duplicate proofs from Gossip.
* Use the Epoch the shred is in instead of the root bank epoch.
* Fix unittest by activating the feature.
* Add a test for feature disabled case.
* EpochSchedule is now not copyable, clone it explicitly.
* pr feedback: read epoch schedule on startup, add guard for ff recache
* pr feedback: bank_forks lock, -cached_slots_in_epoch, init ff
* pr feedback: bank.forks_try_read() -> read()
* pr feedback: fix local-cluster setup
* local-cluster: do not expose gossip internals, use retry mechanism instead
* local-cluster: split out case 4b into separate test and ignore
* pr feedback: avoid taking lock if ff is already found
* pr feedback: do not cache ff epoch
* pr feedback: bank_forks lock, revert to cached_slots_in_epoch
* pr feedback: move local variable into helper function
* pr feedback: use let else, remove epoch 0 hack
---------
Co-authored-by: Wen <crocoxu@gmail.com>
The function open_genesis_config() performs several operations that
could fail. If any of these fail, the process exits immediately.
Instead of exiting immediately, bubble up the error and let the caller
decide the appropriate action. solana-validator and solana-ledger-tool
will functionally be unchanged, but this consolidates startup failures
for both of these processes.
* wire up fork stats fork_weight
* convert weight to percentage for logging
* add bank_stake
* remove old fork choice measure and stats - vote stake * lockout
* update tests
* fix bank_weight and rename it to fork_weight
* fix u64 multiple overflow in fork_weight calculation
* format fork_weight as percentage in logging
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
```
warning: writing `&Vec` instead of `&[_]` involves a new object where a slice will do
--> core/src/banking_stage/consumer.rs:153:29
|
153 | packets_to_process: &Vec<Arc<ImmutableDeserializedPacket>>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: change this to: `&[Arc<ImmutableDeserializedPacket>]`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg
= note: `#[warn(clippy::ptr_arg)]` on by default
warning: `solana-core` (lib) generated 1 warning
```
```
warning: unit tests in doctest are not executed
--> core/tests/fork-selection.rs:15:5
|
15 | //! #[test]
| _____^
16 | | //! #[ignore]
17 | | //! fn test_all_partitions() {
| |__________________________^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#test_attr_in_doctest
= note: `#[warn(clippy::test_attr_in_doctest)]` on by default
warning: `solana-core` (test "fork-selection") generated 1 warning
```
On mainnet-beta, respective QUIC endpoint are unnecessary for now until
testnet has fully migrated to QUIC. The commit disables turbine and
repair QUIC endpoints on mainnet-beta.
* Finalize unified scheduler plumbing with min impl
* Fix comment
* Rename leftover type name...
* Make logging text less ambiguous
* Make PhantomData simplyer without already used S
* Make TaskHandler stateless again
* Introduce HandlerContext to simplify TaskHandler
* Add comment for coexistence of Pool::{new,new_dyn}
* Fix grammar
* Remove confusing const for upcoming changes
* Demote InstalledScheduler::context() into dcou
* Delay drop of context up to return_to_pool()-ing
* Revert "Demote InstalledScheduler::context() into dcou"
This reverts commit 049a126c905df0ba8ad975c5cb1007ae90a21050.
* Revert "Delay drop of context up to return_to_pool()-ing"
This reverts commit 60b1bd2511a714690b0b2331e49bc3d0c72e3475.
* Make context handling really type-safe
* Update comment
* Fix grammar...
* Refine type aliases for boxed traits
* Swap the tuple order for readability & semantics
* Simplify PooledScheduler::result_with_timings type
* Restore .in_sequence()
* Use where for aesthetics
* Simplify if...
* Fix typo...
* Polish ::schedule_execution() a bit
* Fix rebase conflicts..
* Make test more readable
* Fix test failures after rebase...
BankForks has two constructors; one that takes a single Bank (the root)
and one that can take an arbitrary number of Banks plus the root slot.
However, the constructor that accepts multiple banks is unnecessary; it
isn't used in production and is only used in several tests.
So, remove the multi-bank constructor and update unit tests.
There are operations in bank_fork_utils that may fail; we explicitly
call std::process::exit() on several of these. Granted we may end up
exiting the process higher up the callstack, bubbling the errors up
allow a caller that could handle the error to do so.
Currently, the file is generated when a node drops a block that was
produced by another node. However, it would also be beneficial to see
the account state when a node drops its' own block.
Output the file in this additional failure codepath
The Blockstore currently maintains a RwLock<Slot> of the maximum root
it has seen inserted. The value is initialized during
Blockstore::open() and updated during calls to Blockstore::set_roots().
The max root is queried fairly often for several use cases, and caching
the value is cheaper than constructing an iterator to look it up every
time.
However, the access patterns of these RwLock match that of an atomic.
That is, there is no critical section of code that is run while the
lock is head. Rather, read/write locks are acquired in order to read/
update, respectively. So, change the RwLock<u64> to an AtomicU64.
These services currently live in core/; however, they operate on the
ledger. Mores so, these two services operate on the blockstore only,
and not necessarily the entire ledger. So, it makes sense to move these
services out of core and into ledger. We've recently been doing similar
changes with breaking things out into individual crates in order to
reduce the scope of core.
So, this change moves the services from core/ to ledger/, and replaces
ledger with blockstore.
* Remove RWLock from EntryNotifier because it causes perf degradation when entry notifications are enabled on geyser
* remove unused RWLock
* Remove RWLock
* Initialize fork graph in program cache during bank_forks creation
* rename BankForks::new to BankForks::new_rw_arc
* fix compilation
* no need to set fork_graph on insert()
* fix partition tests
This macro is used a lot for tests to create a ledger path in order to
open a Blockstore. Files will be left on disk unless the test remembers
to call Blockstore::destroy() on the directory. So, instead of requiring
this, use the get_tmp_ledger_path_auto_delete!() macro that creates a
TempDir (which automatically deletes itself when it goes out of scope).
The commit implements lazy eviction for turbine QUIC connections.
The cache is allowed to grow to 2 x capacity at which point at least
half of the entries with lowest stake are evicted, resulting in an
amortized O(1) performance.
The commit implements lazy eviction for repair QUIC connections.
The cache is allowed to grow to 2 x capacity at which point at least
half of the entries with lowest stake are evicted, resulting in an
amortized O(1) performance.
Currently each outgoing repair request will attempt to establish a
connection if one does not already exist. This is very wasteful and
consumes many tokio tasks if the remote node is down or unresponsive.
The commit decouples routing packets from establishing connections by
adding a buffering channel for each remote address. Outgoing packets are
always sent down this channel to be processed once the connection is
established. If connecting attempt fails, all packets already pushed to
the channel are dropped at once, reducing the number of attempts to make
a connection if the remote node is down or unresponsive.
The current getHealth mechanism checks a local accounts hash slot vs.
those of other nodes as specified by --known-validator. This is a
very coarse comparison given that the default for this value is 100
slots. More so, any nodes using a value larger than the default
(ie --incremental-snapshot-interval 500) will likely see getHealth
return status behind at some point.
Change the underlying mechanism of how health is computed. Instead of
using the accounts hash slots published in gossip, use the latest
optimistically confirmed slot from the cluster. Even when a node is
behind, it is able to observe cluster optimistically confirmed by slots
by viewing votes published in gossip.
Thus, the latest cluster optimistically confirmed slot can be compared
against the latest optimistically confirmed bank from replay to
determine health. This new comparison is much more granular, and not
needing to depend on individual known validators is also a plus.