Working towards the LegacyContactInfo => ContactInfo migration, the commit
hides some implementation details of LegacyContactInfo and expands API
parity with the new ContactInfo.
Amplifying gossip peer sampling weights by the time since the last
pull-request has the undesired consequence that a node coming back online
will see a huge number of pull-requests all at once.
This "time since last request" is also unnecessary to include in the
weights because, as long as sampling probabilities are non-zero, a node
will almost surely be included in the sample periodically.
The commit reworks peer sampling probabilities by just using (dampened)
stakes as weights.
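As a rough sketch of that sampling scheme (the square-root dampening and
the helper below are illustrative assumptions, not the actual
implementation):

    use rand::distributions::{Distribution, WeightedIndex};
    use rand::Rng;

    // Sample one peer index with probability proportional to a dampened
    // stake, so high-stake nodes are favored but not by the full stake
    // ratio. Zero-stake nodes keep a small non-zero weight so they are
    // still periodically selected.
    fn sample_peer<R: Rng>(rng: &mut R, stakes: &[u64]) -> Option<usize> {
        let weights: Vec<f64> = stakes
            .iter()
            .map(|&stake| (stake.max(1) as f64).sqrt())
            .collect();
        let dist = WeightedIndex::new(&weights).ok()?;
        Some(dist.sample(rng))
    }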
Gossip push samples nodes by stake. This is unnecessarily wasteful and
creates too much congestion at high-staked nodes if the CRDS value to be
propagated is from a node with low or zero stake.
This commit instead maintains several active-sets for push, each
corresponding to a stake bucket. Peer sampling weights are accordingly
capped by the respective bucket's stake.
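A rough sketch of the idea; the number of buckets, the power-of-two bucket
boundaries, and the weight formula below are assumptions for illustration:

    // Assumed number of stake buckets, for illustration only.
    const NUM_BUCKETS: usize = 25;
    const LAMPORTS_PER_SOL: u64 = 1_000_000_000;

    // Map a stake to a bucket index; buckets grow roughly as powers of two
    // in SOL here, which is only one plausible choice.
    fn stake_bucket(stake_lamports: u64) -> usize {
        let sol = stake_lamports / LAMPORTS_PER_SOL;
        let bucket = u64::BITS - sol.leading_zeros(); // ~ floor(log2(sol)) + 1
        (bucket as usize).min(NUM_BUCKETS - 1)
    }

    // When rotating the active-set for a given bucket, cap each peer's
    // sampling weight at that bucket's stake so that low-stake origins do
    // not funnel all of their push traffic through the highest-stake nodes.
    fn push_weight(bucket_stake: u64, peer_stake: u64) -> u64 {
        peer_stake.min(bucket_stake).max(1)
    }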
As described here:
https://github.com/solana-labs/solana/issues/28642#issuecomment-1337449607
the current gossip pruning code fails to maintain spanning trees across
the cluster.
This commit instead implements pruning based on the timeliness of
delivered messages. If a message is delivered timely enough (in terms of
the number of duplicates already observed for that value), it counts
towards the respective node's score. Once there are enough CRDS upserts
from a specific origin, redundant nodes are pruned based on the tracked
score.
Since the pruning leaves some configurable redundancy and the scores are
reset frequently, it should better tolerate active-set rotations.
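A much-simplified sketch of this score tracking; the struct shape, the
timeliness cutoff, and the pruning rule below are assumptions rather than
the actual constants:

    use std::collections::HashMap;

    type Pubkey = [u8; 32];

    // Per-origin record of observed upserts and, for each relaying node,
    // how many deliveries were "timely" (few duplicates seen before it).
    #[derive(Default)]
    struct ReceivedCacheEntry {
        num_upserts: usize,
        scores: HashMap<Pubkey, usize>,
    }

    impl ReceivedCacheEntry {
        // Record a delivery of a value from this entry's origin, relayed by
        // `peer`; `num_dups` is how many copies were already observed.
        fn record(&mut self, peer: Pubkey, num_dups: usize) {
            // Assumed cutoff: only the first couple of copies count as timely.
            const TIMELINESS_CUTOFF: usize = 2;
            if num_dups == 0 {
                self.num_upserts += 1; // first copy => a new CRDS upsert
            }
            if num_dups < TIMELINESS_CUTOFF {
                *self.scores.entry(peer).or_default() += 1;
            }
        }

        // Once enough upserts from this origin have been seen, keep the top
        // `keep` scorers (the configurable redundancy) and return the rest
        // as prune candidates; scores are reset afterwards.
        fn prune(&mut self, min_upserts: usize, keep: usize) -> Vec<Pubkey> {
            if self.num_upserts < min_upserts {
                return Vec::new();
            }
            let mut nodes: Vec<(Pubkey, usize)> = self.scores.drain().collect();
            nodes.sort_unstable_by_key(|&(_, score)| std::cmp::Reverse(score));
            self.num_upserts = 0;
            nodes.into_iter().skip(keep).map(|(node, _)| node).collect()
        }
    }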
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character-dense than Snake or Kebab case
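For example, a gossip thread named under these tenets might look like the
following; the specific name is illustrative:

    use std::thread::Builder;

    fn main() {
        // "solGossipPush" is 13 characters: within the 15-char limit,
        // "sol"-prefixed, and camel case.
        let handle = Builder::new()
            .name("solGossipPush".to_string())
            .spawn(|| {
                // ... thread body ...
            })
            .unwrap();
        handle.join().unwrap();
    }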
Each time a node generates gossip pull-requests, it sends out all the
requests to a single randomly selected peer:
https://github.com/solana-labs/solana/blob/fd7ad31ee/gossip/src/crds_gossip_pull.rs#L253-L266
This causes a burst of pull-requests arriving at a single node all at
once. To make gossip inbound traffic less bursty, this commit fans out
gossip pull-requests to several randomly selected peers.
This should reduce spikes in inbound gossip traffic without changing the
average load, which may help reduce the number of times the outbound data
budget is exhausted when responding to gossip pull-requests at the
receiving node, and reduce the number of pull-requests dropped.
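A sketch of the fan-out; the helper, the fan-out width, and the
round-robin assignment below are illustrative assumptions:

    use std::net::SocketAddr;

    use rand::seq::SliceRandom;
    use rand::Rng;

    // Assumed fan-out width, for illustration only.
    const PULL_REQUEST_FANOUT: usize = 2;

    // Instead of addressing every generated request to one sampled peer,
    // spread the requests round-robin over a few randomly chosen peers.
    fn fan_out_pull_requests<R: Rng, T>(
        rng: &mut R,
        peers: &[SocketAddr],
        requests: Vec<T>,
    ) -> Vec<(SocketAddr, T)> {
        let targets: Vec<SocketAddr> = peers
            .choose_multiple(rng, PULL_REQUEST_FANOUT.min(peers.len()))
            .copied()
            .collect();
        requests
            .into_iter()
            .zip(targets.into_iter().cycle())
            .map(|(request, addr)| (addr, request))
            .collect()
    }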
This commit adds the CrdsEntry trait, which allows generic lookups into
the crds table. For example, to get the ContactInfo or LowestSlot
associated with a Pubkey, the lookup code would be, respectively:
crds.get::<&ContactInfo>(pubkey)
crds.get::<&LowestSlot>(pubkey)
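A much-simplified sketch of how such a trait can drive generic lookups;
the real table stores versioned values keyed by both origin and value
type, so the types below are stand-ins:

    use std::collections::HashMap;

    type Pubkey = [u8; 32];

    struct ContactInfo;
    struct LowestSlot;

    #[derive(Default)]
    struct Crds {
        contact_infos: HashMap<Pubkey, ContactInfo>,
        lowest_slots: HashMap<Pubkey, LowestSlot>,
    }

    // Each entry type describes how it is looked up in the table, so the
    // lookup code can stay generic over the requested type.
    trait CrdsEntry<'a>: Sized {
        fn get_entry(crds: &'a Crds, key: Pubkey) -> Option<Self>;
    }

    impl<'a> CrdsEntry<'a> for &'a ContactInfo {
        fn get_entry(crds: &'a Crds, key: Pubkey) -> Option<Self> {
            crds.contact_infos.get(&key)
        }
    }

    impl<'a> CrdsEntry<'a> for &'a LowestSlot {
        fn get_entry(crds: &'a Crds, key: Pubkey) -> Option<Self> {
            crds.lowest_slots.get(&key)
        }
    }

    impl Crds {
        // Generic lookup, e.g. crds.get::<&ContactInfo>(pubkey).
        fn get<'a, T: CrdsEntry<'a>>(&'a self, key: Pubkey) -> Option<T> {
            T::get_entry(self, key)
        }
    }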
ClusterInfo is the gateway to CrdsGossip function calls, and it already
has the node's pubkey and shred version (the full ContactInfo and Keypair,
in fact).
Duplicating this data in CrdsGossip adds redundancy and the possibility of
bugs should it become inconsistent with ClusterInfo.
* Move gossip modules to solana-gossip
* Update Protocol abi digest due to move
* Move gossip benches and hook up CI
* Remove unneeded Result entries
* Single use statements