reworks max number of outgoing push messages (#3016)
max_bytes for outgoing push messages is quite outdated and does not
allow gossip to function properly at the current testnet cluster size.
In particular, it does not allow clearing out the queue of pending push
messages unless new_push_messages is called very frequently, which
involves repeatedly locking/unlocking the CRDS table.
Additionally, leaving gossip entries in the queue for the next round
adds delay to propagating push messages, which can compound as messages
go through several hops.
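A rough sketch of the idea (illustrative names, not the actual agave
code): capping each round by message count instead of payload bytes
lets a handful of calls always drain the queue.

```rust
struct CrdsValue; // stand-in for a gossip CRDS entry

/// Drain up to `max_num_messages` pending entries per round so the
/// queue cannot grow without bound between calls.
fn new_push_messages(
    queue: &mut Vec<CrdsValue>,
    max_num_messages: usize,
) -> Vec<CrdsValue> {
    let num = queue.len().min(max_num_messages);
    queue.drain(..num).collect()
}
```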
(cherry picked from commit 489f483e1d)
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
excludes node's pubkey from bloom filter of pruned origins (#2990)
The bloom filter of pruned origins can return a false positive for a
node's own pubkey, but a node should always be able to push its own
values to other nodes in the cluster.
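A minimal sketch of the guard, using a HashSet as a stand-in for the
real bloom filter (type and method names are illustrative):

```rust
use std::collections::HashSet;

type Pubkey = [u8; 32];

/// Stand-in for the bloom filter of pruned origins; a real bloom
/// filter may report false positives, which is the case guarded here.
struct PrunedOrigins(HashSet<Pubkey>);

impl PrunedOrigins {
    /// A node must always be able to push its own values, so its own
    /// pubkey is never treated as pruned.
    fn contains(&self, origin: &Pubkey, self_pubkey: &Pubkey) -> bool {
        origin != self_pubkey && self.0.contains(origin)
    }
}
```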
(cherry picked from commit bce28c0282)
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
gossip: demote invalid duplicate proof errors to info (#2754)
* gossip: demote invalid duplicate proof errors to info
* pr feedback: explicitly list every enum
(cherry picked from commit 7b6e6c179f)
Co-authored-by: Ashwin Sekar <ashwin@anza.xyz>
* customizes override logic for gossip ContactInfo (#2579)
If there are two running instances of the same node, we want the
ContactInfo with the more recent start time to be propagated through
gossip regardless of wallclocks.
The commit adds custom override logic for ContactInfo that first
compares by outset timestamp.
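A sketch of the override rule over an assumed two-field ContactInfo:
outset (instance start time) is compared first, and wallclock only
breaks ties between the same instance.

```rust
use std::cmp::Ordering;

/// Illustrative subset of ContactInfo fields relevant here.
struct ContactInfo {
    outset: u64,    // timestamp when this node instance started
    wallclock: u64, // last time the value was refreshed
}

/// A value from a more recently started instance wins regardless of
/// wallclock; wallclock only decides between equal outsets.
fn overrides(new: &ContactInfo, old: &ContactInfo) -> bool {
    match new.outset.cmp(&old.outset) {
        Ordering::Greater => true,
        Ordering::Less => false,
        Ordering::Equal => new.wallclock > old.wallclock,
    }
}
```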
* updates ContactInfo.outset when hot-swapping identity (#2613)
When hot-swapping identity, ContactInfo.outset should be updated so
that the new ContactInfo overrides the older node with the same pubkey.
* patches bug causing false duplicate nodes error (#2666)
The bootstrap code during validator start pushes a contact-info with a
more recent timestamp to gossip. If the node is staked, the
contact-info lingers in gossip, causing false duplicate-node-instance
errors when the fully initialized node joins gossip later on.
The commit refreshes the timestamp on the contact-info so that it
overrides the one pushed by bootstrap and avoids the false-duplicates
error.
---------
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
* checks for duplicate instances using the new ContactInfo (#2506)
Working towards deprecating the NodeInstance CRDS value, the commit
adds a check for duplicate instances using the new ContactInfo.
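A hedged sketch of such a check (field names are assumptions, not the
actual ContactInfo definition):

```rust
type Pubkey = [u8; 32];

struct ContactInfo {
    pubkey: Pubkey,
    outset: u64, // instance start timestamp
}

#[derive(Debug)]
struct DuplicateInstance;

/// Another entry with our pubkey but a different outset implies a
/// second instance of this node is running somewhere.
fn check_duplicate_instance(
    self_info: &ContactInfo,
    other: &ContactInfo,
) -> Result<(), DuplicateInstance> {
    if other.pubkey == self_info.pubkey && other.outset != self_info.outset {
        Err(DuplicateInstance)
    } else {
        Ok(())
    }
}
```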
(cherry picked from commit 1d825df4e1)
* removes unwrap
---------
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
gossip: do not allow duplicate proofs for incorrect shred versions (#1931)
* gossip: do not allow duplicate proofs for incorrect shred versions
* pr feedback: refactor test function to take shred_version
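A minimal sketch of the version check described above (the error type
is illustrative):

```rust
#[derive(Debug)]
enum Error {
    InvalidShredVersion(u16),
}

/// Duplicate-shred proofs built from shreds with a mismatched shred
/// version are rejected up front; they cannot refer to this cluster.
fn verify_shred_version(proof_version: u16, cluster_version: u16) -> Result<(), Error> {
    if proof_version == cluster_version {
        Ok(())
    } else {
        Err(Error::InvalidShredVersion(proof_version))
    }
}
```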
(cherry picked from commit 69ea21e947)
Co-authored-by: Ashwin Sekar <ashwin@anza.xyz>
* put most AbiExample derivations behind a cfg_attr (see the sketch after this list)
* feature gate all `extern crate solana_frozen_abi_macro;`
* use cfg_attr wherever we were deriving both AbiExample and AbiEnumVisitor
* fix cases where AbiEnumVisitor was still being derived unconditionally
* fix a case where AbiExample was derived unconditionally
* fix more cases where both AbiEnumVisitor and AbiExample were derived unconditionally
* two more cases where AbiExample and AbiEnumVisitor were unconditionally derived
* fix remaining unconditional derivations of AbiEnumVisitor
* fix cases where AbiExample is the first thing derived
* fix most remaining unconditional derivations of AbiExample
* move all `frozen_abi(digest =` behind cfg_attr
* replace incorrect cfg with cfg_attr
* fix one more unconditionally derived AbiExample
* feature gate AbiExample impls
* add frozen-abi feature to required Cargo.toml files
* make frozen-abi features activate recursively
* fmt
* add missing feature gating
* fix accidentally changed digest
* activate frozen-abi in relevant test scripts
* don't activate solana-program's frozen-abi in sdk dev-dependencies
* update to handle AbiExample derivation on new AppendVecFileBacking enum
* revert toml formatting
* remove unused frozen-abi entries from address-lookup-table Cargo.toml
* remove toml references to solana-address-lookup-table-program/frozen-abi
* update lock file
* remove no-longer-used generic param
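The pattern behind most of the bullets above looks roughly like this
(the struct and enum are illustrative; AbiExample and AbiEnumVisitor
come from solana-frozen-abi-macro):

```rust
// The macro crate itself is only pulled in when the feature is on.
#[cfg(feature = "frozen-abi")]
#[macro_use]
extern crate solana_frozen_abi_macro;

// Derivations compile only when the `frozen-abi` feature is enabled,
// so normal builds drop them entirely.
#[cfg_attr(feature = "frozen-abi", derive(AbiExample))]
pub struct Node {
    pub id: u64,
}

#[cfg_attr(feature = "frozen-abi", derive(AbiExample, AbiEnumVisitor))]
pub enum Message {
    Ping(u64),
    Pong(u64),
}
```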
For duplicate-blocks prevention we want to verify that the last erasure
batch was sufficiently propagated through turbine. This requires
additional bookkeeping because, depending on the erasure coding schema,
the entire batch might be recovered from only a few coding shreds.
In order to simplify the above, this commit instead ensures that the
last erasure batch has >= 32 data shreds, so that the batch cannot be
recovered unless 32+ shreds are received from turbine or repair.
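One way to sketch the guarantee (this is not the actual shredder logic;
the constant name is an assumption) is to pad the final batch up to the
minimum data-shred count:

```rust
/// Minimum data shreds in the last erasure batch of a slot, so that the
/// batch cannot be recovered from just a few coding shreds.
const MIN_DATA_SHREDS_PER_LAST_BATCH: usize = 32;

/// Pad the final batch with empty data shreds until it meets the
/// minimum, guaranteeing 32+ shreds must arrive via turbine or repair.
fn pad_last_erasure_batch(mut data_shreds: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    while data_shreds.len() < MIN_DATA_SHREDS_PER_LAST_BATCH {
        data_shreds.push(Vec::new()); // placeholder/padding data shred
    }
    data_shreds
}
```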
* add PacketFlags::FROM_STAKED_NODE
* Only forward packets from staked nodes (see the sketch after this list)
* fix local-cluster test forwarding
* review comment
* tpu_votes get marked as from_staked_node
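A sketch of the flag and the forwarding predicate, assuming the
bitflags crate (the real PacketFlags type has more variants):

```rust
use bitflags::bitflags;

bitflags! {
    /// Illustrative subset of packet flags.
    struct PacketFlags: u8 {
        const FROM_STAKED_NODE = 0b0000_0001;
    }
}

/// Only packets marked as coming from a staked node are forwarded.
fn should_forward(flags: &PacketFlags) -> bool {
    flags.contains(PacketFlags::FROM_STAKED_NODE)
}
```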
The IP echo server currently spins up a worker thread for every thread
on the machine. Observing some data for nodes:
- MNB validators and RPC nodes look to get several hundred of these
requests per day
- MNB entrypoint nodes look to get 2-3 requests per second on average

In both instances, the current threadpool is severely overprovisioned,
which is a waste of resources. This PR plumbs a flag to control the
number of worker threads for this pool and sets a default of two
threads for this server. Two threads allow one thread to always listen
on the TCP port while the other processes requests.
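A hedged sketch of the plumbing, assuming tokio (the thread name and
default constant are illustrative, not the actual flag wiring):

```rust
/// Default worker threads: one can keep listening on the TCP port
/// while the other processes a request.
const DEFAULT_IP_ECHO_SERVER_THREADS: usize = 2;

fn ip_echo_runtime(num_threads: usize) -> std::io::Result<tokio::runtime::Runtime> {
    tokio::runtime::Builder::new_multi_thread()
        .worker_threads(num_threads)
        .thread_name("solIpEchoSrv")
        .enable_all()
        .build()
}
```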
* add metric for duplicate push messages
* add in num_total_push
* address comments. don't lock stats each time
* address comments. remove num_total_push
* change dup push message name in code to reflect metric name
This is a port of firedancer's implementation of weighted shuffle:
https://github.com/firedancer-io/firedancer/blob/3401bfc26/src/ballet/wsample/fd_wsample.c
https://github.com/anza-xyz/agave/pull/185 implemented weighted shuffle
using a binary tree. Though asymptotically a binary tree has better
performance compared to a Fenwick tree, it has less cache locality,
resulting in smaller improvements and, in particular, a slower
WeightedShuffle::new.
In order to improve cache locality and reduce the overheads of
traversing the tree, this commit instead uses a generalized N-ary tree
with fanout of 16, showing significant improvements in both
WeightedShuffle::new and WeightedShuffle::shuffle.
With 4000 weights:
N-ary tree (fanout 16):
test bench_weighted_shuffle_new ... bench: 36,244 ns/iter (+/- 243)
test bench_weighted_shuffle_shuffle ... bench: 149,082 ns/iter (+/- 1,474)
Binary tree:
test bench_weighted_shuffle_new ... bench: 58,514 ns/iter (+/- 229)
test bench_weighted_shuffle_shuffle ... bench: 269,961 ns/iter (+/- 16,446)
Fenwick tree:
test bench_weighted_shuffle_new ... bench: 39,413 ns/iter (+/- 179)
test bench_weighted_shuffle_shuffle ... bench: 364,771 ns/iter (+/- 2,078)
The improvements become even more significant as there are more items to
shuffle. With 20_000 weights:
N-ary tree (fanout 16):
test bench_weighted_shuffle_new ... bench: 200,659 ns/iter (+/- 4,395)
test bench_weighted_shuffle_shuffle ... bench: 941,928 ns/iter (+/- 26,492)
Binary tree:
test bench_weighted_shuffle_new ... bench: 881,114 ns/iter (+/- 12,343)
test bench_weighted_shuffle_shuffle ... bench: 1,822,257 ns/iter (+/- 12,772)
Fenwick tree:
test bench_weighted_shuffle_new ... bench: 276,936 ns/iter (+/- 14,692)
test bench_weighted_shuffle_shuffle ... bench: 2,644,713 ns/iter (+/- 49,252)
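The core of the fanout-16 idea can be reduced to a single descend step:
each node stores 16 per-child subtree sums, so picking a child is a
linear scan over contiguous memory (a sketch, not the agave
implementation):

```rust
const FANOUT: usize = 16;

/// Given one node's per-child subtree sums and a draw `u` within this
/// subtree, pick the child to descend into and the residual draw.
fn descend(child_sums: &[u64; FANOUT], mut u: u64) -> (usize, u64) {
    for (child, &sum) in child_sums.iter().enumerate() {
        if u < sum {
            return (child, u);
        }
        u -= sum;
    }
    unreachable!("u must be less than this node's total weight");
}
```

Each level touches one small contiguous array instead of scattered
binary-tree nodes, which is where the locality win comes from.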
This is a partial port of firedancer's implementation of weighted
shuffle:
https://github.com/firedancer-io/firedancer/blob/3401bfc26/src/ballet/wsample/fd_wsample.c
Though Fenwick trees use less space, inverse queries require an
additional O(log n) factor for binary search, resulting in an overall
O(n log n log n) performance for weighted shuffle.
This commit instead uses a binary tree where each node contains the sum
of all weights in its left sub-tree. The weights themselves are
implicitly stored at the leaves. Inverse queries and updates to the
tree can all be done in O(log n), resulting in an overall O(n log n)
weighted-shuffle implementation.
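A compact sketch of the scheme, assuming the rand crate (illustrative,
not the agave implementation): each internal node stores its
left-subtree sum, and a draw in [0, total) is mapped to a leaf and
removed in O(log n).

```rust
use rand::Rng;

struct WeightTree {
    left_sums: Vec<u64>, // 1-indexed heap layout of left-subtree sums
    weights: Vec<u64>,   // leaf weights, zeroed once sampled
    num_leaves: usize,   // power of two >= weights.len()
    total: u64,
}

impl WeightTree {
    fn new(weights: Vec<u64>) -> Self {
        let num_leaves = weights.len().next_power_of_two();
        let mut sums = vec![0u64; 2 * num_leaves]; // full subtree sums
        sums[num_leaves..num_leaves + weights.len()].copy_from_slice(&weights);
        let mut left_sums = vec![0u64; num_leaves];
        for i in (1..num_leaves).rev() {
            left_sums[i] = sums[2 * i];
            sums[i] = sums[2 * i] + sums[2 * i + 1];
        }
        let total = sums[1];
        Self { left_sums, weights, num_leaves, total }
    }

    /// Inverse query: map a draw in [0, total) to a leaf, then remove
    /// its weight along the same O(log n) path.
    fn pop(&mut self, mut u: u64) -> usize {
        let mut i = 1;
        while i < self.num_leaves {
            if u < self.left_sums[i] {
                i *= 2; // descend left
            } else {
                u -= self.left_sums[i];
                i = 2 * i + 1; // descend right
            }
        }
        let leaf = i - self.num_leaves;
        let w = std::mem::take(&mut self.weights[leaf]);
        self.total -= w;
        // Walk back up; subtract w wherever we were in a left subtree.
        while i > 1 {
            if i % 2 == 0 {
                self.left_sums[i / 2] -= w;
            }
            i /= 2;
        }
        leaf
    }

    /// O(n log n) weighted shuffle of the original indices; indices
    /// with zero weight are never drawn in this sketch.
    fn shuffle<R: Rng>(mut self, rng: &mut R) -> Vec<usize> {
        let mut out = Vec::with_capacity(self.weights.len());
        while self.total > 0 {
            let u = rng.gen_range(0..self.total);
            out.push(self.pop(u));
        }
        out
    }
}
```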
Based on benchmarks, this results in a 24% improvement in
WeightedShuffle::shuffle:
Fenwick tree:
test bench_weighted_shuffle_new ... bench: 36,686 ns/iter (+/- 191)
test bench_weighted_shuffle_shuffle ... bench: 342,625 ns/iter (+/- 4,067)
Binary tree:
test bench_weighted_shuffle_new ... bench: 59,131 ns/iter (+/- 362)
test bench_weighted_shuffle_shuffle ... bench: 260,194 ns/iter (+/- 11,195)
Though WeightedShuffle::new is now slower, it can generally be cached
and reused, as in Turbine:
https://github.com/anza-xyz/agave/blob/b3fd87fe8/turbine/src/cluster_nodes.rs#L68
Additionally, the new code has better asymptotic performance. For
example, with 20_000 weights, WeightedShuffle::shuffle is 31% faster:
Fenwick tree:
test bench_weighted_shuffle_new ... bench: 255,071 ns/iter (+/- 9,591)
test bench_weighted_shuffle_shuffle ... bench: 2,466,058 ns/iter (+/- 9,873)
Binary tree:
test bench_weighted_shuffle_new ... bench: 830,727 ns/iter (+/- 10,210)
test bench_weighted_shuffle_shuffle ... bench: 1,696,160 ns/iter (+/- 75,271)
The name was previously hard-coded to solReceiver. Using the same name
makes it hard to figure out which thread is which when these threads
are handling many services (Gossip, Tvu, etc.).
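For example (the service-specific names here are illustrative, not the
actual ones), std::thread::Builder makes distinct per-service names
straightforward:

```rust
use std::thread::{Builder, JoinHandle};

/// Spawn a receiver thread with a service-specific name (e.g.
/// "solRcvrGossip") so it can be told apart in profilers and dumps.
fn spawn_receiver(name: &str) -> std::io::Result<JoinHandle<()>> {
    Builder::new().name(name.to_string()).spawn(|| {
        // ... packet receive loop ...
    })
}
```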
* gossip: notify state machine of duplicate proofs
* Add feature flag for ingesting duplicate proofs from Gossip (see the sketch after this list).
* Use the Epoch the shred is in instead of the root bank epoch.
* Fix unittest by activating the feature.
* Add a test for feature disabled case.
* EpochSchedule is now not copyable, clone it explicitly.
* pr feedback: read epoch schedule on startup, add guard for ff recache
* pr feedback: bank_forks lock, -cached_slots_in_epoch, init ff
* pr feedback: bank.forks_try_read() -> read()
* pr feedback: fix local-cluster setup
* local-cluster: do not expose gossip internals, use retry mechanism instead
* local-cluster: split out case 4b into separate test and ignore
* pr feedback: avoid taking lock if ff is already found
* pr feedback: do not cache ff epoch
* pr feedback: bank_forks lock, revert to cached_slots_in_epoch
* pr feedback: move local variable into helper function
* pr feedback: use let else, remove epoch 0 hack
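A hypothetical sketch of the epoch-based gating (names and exact
semantics are assumptions based on the bullets above, not the actual
feature-set API):

```rust
/// The duplicate-proof feature is checked against the epoch the
/// shred's slot belongs to, not the root bank's epoch, since a proof
/// may reference a slot in a newer epoch than the root.
fn feature_active_for_shred(
    feature_activated_slot: Option<u64>, // None => feature not active
    first_slot_in_shred_epoch: u64,
) -> bool {
    feature_activated_slot
        .is_some_and(|slot| slot <= first_slot_in_shred_epoch)
}
```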
---------
Co-authored-by: Wen <crocoxu@gmail.com>
* Add RestartHeaviestFork to Gossip (see the sketch after this list).
* Add a test for out of bound value.
* Send observed_stake and total_epoch_stake in RestartHeaviestFork.
* Remove total_epoch_stake from RestartHeaviestFork.
* Forgot to update ABI digest.
* Remove checking of whether stake is zero.
* Remove unnecessary new function and make new_rand pub(crate).
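After these changes, the value's shape is roughly as follows (field
names and types are assumptions, not the exact definition):

```rust
type Pubkey = [u8; 32];
type Hash = [u8; 32];
type Slot = u64;

/// Illustrative shape of the new CRDS value: total_epoch_stake was
/// dropped, observed_stake kept.
struct RestartHeaviestFork {
    from: Pubkey,
    wallclock: u64,
    last_slot: Slot,
    last_slot_hash: Hash,
    observed_stake: u64,
}
```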
* handle ContactInfo in places where only LegacyContactInfo was used (see the sketch after this list)
* missed a spot
* missed a spot
* import contact info for crds lookup
* cargo fmt
* rm contactinfo from crds_entry. not supported yet
* typo
* remove crds.nodes insert for ContactInfo. not supported yet
* forgot to remove clusterinfo in remove()
* move around contactinfo match arm
* remove contactinfo updating crds.shred_version
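A sketch of the recurring change, with stub types standing in for the
real CRDS definitions:

```rust
type Pubkey = [u8; 32];

struct LegacyContactInfo { pubkey: Pubkey }
struct ContactInfo { pubkey: Pubkey }

/// Illustrative subset of the CRDS data enum.
enum CrdsData {
    LegacyContactInfo(LegacyContactInfo),
    ContactInfo(ContactInfo),
}

/// Match arms that used to handle only LegacyContactInfo now handle
/// the new ContactInfo as well.
fn pubkey(data: &CrdsData) -> &Pubkey {
    match data {
        CrdsData::LegacyContactInfo(node) => &node.pubkey,
        CrdsData::ContactInfo(node) => &node.pubkey,
    }
}
```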