Commit Graph

26 Commits

Author SHA1 Message Date
behzad nouri 039488b562
drops redundant turbine propagation path (#24351)
Most nodes in the cluster receive the same shred from two different
nodes: parent, and the first node of their neighborhood:
https://github.com/solana-labs/solana/blob/a8c695ba5/core/src/cluster_nodes.rs#L178-L197

Because of the erasure codings, half of the shreds are already
redundant. So this redundant propagation path will only add extra
overhead.

Additionally the very first node of the broadcast tree has 2x fanout
(i.e. 400 nodes) which adds too much load at one node.

This commit simplifies the broadcast tree by dropping the redundant
propagation path and removing the 2x fanout at root node.
2022-04-19 00:11:29 +00:00
behzad nouri 2b718d00b0 removes legacy compatibility turbine peers shuffle code 2022-04-05 12:04:12 +00:00
behzad nouri d0b850cdd9 removes turbine peers shuffle patch feature 2022-04-05 12:04:12 +00:00
behzad nouri 855801cc95 removes deterministic-shred-seed feature 2022-04-05 12:04:12 +00:00
behzad nouri 7cb3b6cbe2
demotes WeightedShuffle failures to error metrics (#24079)
Since call-sites are calling unwrap anyways, panicking seems too punitive
for our use cases.
2022-04-03 16:20:06 +00:00
Tao Zhu c478fe2047 add timing metrics, some renaming 2022-03-17 19:31:28 -05:00
Tao Zhu fd515097d8 leader qos part 2: add stage to find sender stake, set to packet meta 2022-03-17 19:31:28 -05:00
Jeff Biseda c69e3b73ff
bench get_retransmit_peers (#23292) 2022-03-01 19:10:29 -08:00
behzad nouri dccbddad80
adds reverse lookup index to cluster-nodes (#22892)
retransmit has to exclude slot leader from set of nodes for each shred; 
which currently requires a linear scan:
https://github.com/solana-labs/solana/blob/e3b137066/core/src/cluster_nodes.rs#L238-L242

This commit adds a reverse lookup index to avoid linear scan.
2022-02-02 19:27:50 +00:00
behzad nouri e3b137066d
caches WeightedShuffle struct in ClusterNodes (#22877)
Instead of reconstructing WeightedShuffle struct for each shred
broadcast or retransmit, we can use the same struct with minimal
mutations.
2022-02-02 15:12:26 +00:00
behzad nouri 45e09664b8
removes Rng field from WeightedShuffle struct (#22850) 2022-02-01 15:27:23 +00:00
behzad nouri 604ca9316c
includes zero weighted entries in WeightedShuffle (#22829)
Current WeightedShuffle implementation excludes zero weighted entries
from the shuffle:
https://github.com/solana-labs/solana/blob/13e631dcf/gossip/src/weighted_shuffle.rs#L29-L30

Though mathematically this might make more sense, for our use-cases
(turbine specifically), this results in less efficient code:
https://github.com/solana-labs/solana/blob/13e631dcf/core/src/cluster_nodes.rs#L409-L430

This commit changes the implementation so that zero weighted indices are
also included in the shuffle but appear only at the end after non-zero
weighted indices.
2022-01-31 16:23:50 +00:00
behzad nouri 1297a13586
adds metrics tracking crds writes and votes (#20953) 2021-10-26 13:02:30 +00:00
Michael Vines 350bb561eb Clippy 2021-10-23 08:21:20 +00:00
behzad nouri 0c0384ec32
revises turbine peers shuffling order (#20480)
Turbine randomly shuffles cluster nodes on a broadcast tree for each
shred. This requires knowing the stakes and nodes' contact-infos (from
gossip).

However gossip is subject to partitioning and propogation delays.
Additionally unstaked nodes may join and leave the cluster at any
moment, changing the cluster view from one node to another.

This commit:
* Always arranges the unstaked nodes at the bottom of turbine broadcast
  tree.
* Staked nodes are always included regardless of if their contact-info
  is available in gossip or not.
* Uses the unbiased WeightedShuffle construct for shuffling nodes.
2021-10-14 15:09:36 +00:00
behzad nouri 6d9818b8e4
skips retransmit for shreds with unknown slot leader (#19472)
Shreds' signatures should be verified before they reach retransmit
stage, and if the leader is unknown they should fail signature check.
Therefore retransmit-stage can as well expect to know who the slot
leader is and otherwise just skip the shred.

Blockstore checking signature of recovered shreds before sending them to
retransmit stage:
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/blockstore.rs#L884-L930

Shred signature verifier:
https://github.com/solana-labs/solana/blob/4305d4b7b/core/src/sigverify_shreds.rs#L41-L57
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/sigverify_shreds.rs#L105
2021-09-01 15:44:26 +00:00
behzad nouri 1deb4add81
removes Slot from TransmitShreds (#19327)
An earlier version of the code was funneling through stakes along with
shreds to broadcast:
https://github.com/solana-labs/solana/blob/b67ffab37/core/src/broadcast_stage.rs#L127

This was changed to only slots as stakes computation was pushed further
down the pipeline in:
https://github.com/solana-labs/solana/pull/18971

However shreds themselves embody which slot they belong to. So pairing
them with slot is redundant and adds rooms for bugs should they become
inconsistent.
2021-08-20 13:48:33 +00:00
behzad nouri e4be00fece falls back on working-bank if root-bank::epoch-staked-nodes is none
bank.get_leader_schedule_epoch(shred_slot)
is one epoch after epoch_schedule.get_epoch(shred_slot).

At epoch boundaries, shred is already one epoch after the root-slot. So
we need epoch-stakes 2 epochs ahead of the root. But the root bank only
has epoch-stakes for one epoch ahead, and as a result looking up epoch
staked-nodes from the root-bank fails.

To be backward compatible with the current master code, this commit
implements a fallback on working-bank if epoch staked-nodes obtained
from the root-bank is none.
2021-08-05 21:47:33 +00:00
behzad nouri eaf927cf49 allows only one thread to update cluster-nodes cache entry for an epoch
If two threads simultaneously call into ClusterNodesCache::get for the
same epoch, and the cache entry is outdated, then both threads recompute
cluster-nodes for the epoch and redundantly overwrite each other.

This commit wraps ClusterNodesCache entries in Arc<Mutex<...>>, so that
when needed only one thread does the computations to update the entry.
2021-08-05 21:47:33 +00:00
behzad nouri fb69f45f14 adds fallback & metric for when epoch staked-nodes are none 2021-08-05 21:47:33 +00:00
behzad nouri 50d0e830c9 unifies cluster-nodes computation & caching across turbine stages
Broadcast-stage is using epoch_staked_nodes based on the same slot that
shreds belong to:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349

But retransmit-stage is using bank-epoch of the working-bank:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/retransmit_stage.rs#L272-L289

So the two are not consistent at epoch boundaries where some nodes may
have a working bank (or similarly a root bank) lagging other nodes. As a
result the node which obtains a packet may construct turbine broadcast
tree inconsistently with its parent node in the tree and so some packets
may fail to reach all nodes in the tree.
2021-08-05 21:47:33 +00:00
behzad nouri ecc1c7957f implements cluster-nodes cache
Cluster nodes are cached keyed by the respective epoch from which stakes
are obtained, and so if epoch changes cluster-nodes will be recomputed.

A time-to-live eviction policy is enforced to refresh entries in case
gossip contact-infos are updated.
2021-08-05 21:47:33 +00:00
behzad nouri d2d5f36a3c
adds validator flag to allow private ip addresses (#18850) 2021-07-23 15:25:03 +00:00
carllin 588c0464b8
Add sampling logic and DuplicateSlotRepairStatus module (#18721) 2021-07-21 11:15:08 -07:00
behzad nouri cf31afdd6a
makes CrdsGossip thread-safe (#18615) 2021-07-14 22:27:17 +00:00
behzad nouri 04787be8b1
encapsulates turbine peers computations of broadcast & retransmit stages (#18238)
Broadcast stage and retransmit stage should arrange nodes on turbine
broadcast tree in exactly same order. Additionally any changes to this
ordering (e.g. updating how unstaked nodes are handled) requires feature
gating to keep the cluster in sync.

Current implementation is scattered out over several public methods and
exposes too much of implementation details (e.g. usize indices into
peers vector) which makes code changes and checking for feature
activations more difficult.

This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
2021-07-07 00:35:25 +00:00