This commit adds CrdsEntry trait which allows generic lookups into crds
table. For example to get ContactInfo or LowestSlot associated with a
Pubkey, the lookup code would be respectively:
crds.get::<&ContactInfo>(pubkey)
crds.get::<&LowestSlot>(pubkey)
* Fix link target in doc comment
* Fix formatting of log examples in process_instruction
* Fix doc markdown in solana-gossip
* Fix doc markdown in solana-runtime
* Escape square braces in doc comments to avoid warnings
* Surround 'account references' doc items in code spans to avoid warnings
* Fix code block in loader_upgradeable_instruction
* Fix doctest for loader_upgradable_instruction
push_lowest_slot cannot sign the new crds-value unless the id (pubkey)
argument passed-in is the same pubkey as in ClusterInfo::keypair(), in
which case the id argument is redundant:
https://github.com/solana-labs/solana/blob/bb41cf346/gossip/src/cluster_info.rs#L824-L845
Additionally, the lookup is done with self.id(), but insert is done with
the id argument, which is logically a bug.
ClusterInfo is the gateway to CrdsGossip function calls, and it already
has node's pubkey and shred version (full ContactInfo and Keypair in
fact).
Duplicating these data in CrdsGossip adds redundancy and possibility for
bugs should they not be consistent with ClusterInfo.
Current implementation of weighted_shuffle:
https://github.com/solana-labs/solana/blob/b08f8bd1b/gossip/src/weighted_shuffle.rs#L11-L37
uses a heuristic which results in biased samples.
For example, if the weights are [1, 10, 100], then the 3rd index should
come first 100 times more often than the 1st index. However,
weighted_shuffle is picking the 3rd index 200+ times more often than the
1st index, showing a disproportional bias in favor of higher weights.
This commit implements weighted shuffle using binary indexed tree to
maintain cumulative sum of weights while sampling. The resulting samples
are demonstrably unbiased and precisely proportional to the weights.
Additionally the iterator interface allows to skip computations when
not all indices are processed.
Of the use cases of weighted_shuffle, changing turbine code requires
feature-gating to keep the cluster in sync. That is not updated in
this commit, but can be done together with future updates to turbine.
Broadcast stage and retransmit stage should arrange nodes on turbine
broadcast tree in exactly same order. Additionally any changes to this
ordering (e.g. updating how unstaked nodes are handled) requires feature
gating to keep the cluster in sync.
Current implementation is scattered out over several public methods and
exposes too much of implementation details (e.g. usize indices into
peers vector) which makes code changes and checking for feature
activations more difficult.
This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
Calling ClusterInfo::id repeatedly in for loops or iterators is
inefficient, because it acquires a lock on ClusterInfo.my_contact_info,
and clones the entire contact-info.
filter_by_shred_version does not check the shred-version of the owner of
the crds-value. It only checks the shred-version of the node which is
relaying the value:
https://github.com/solana-labs/solana/blob/5cc073420/gossip/src/cluster_info.rs#L2274-L2289
So crds-values with different shred versions can still pass through this
function as long as they are relayed by a node with matching shred
version; and so, a single node can bridge different shred values
through-out the cluster.
When starting a validator, the node initially joins gossip with
shred_verison = 0, until it adopts the entrypoint's shred-version:
https://github.com/solana-labs/solana/blob/9b182f408/validator/src/main.rs#L417
Depending on the load on the entrypoint, this adopting entrypoint
shred-version through gossip sometimes becomes very slow, and causes
several problems in gossip because we have to partially support
shred_version == 0 which is a source of leaking crds values from one
cluster to another. e.g. see
https://github.com/solana-labs/solana/pull/17899
and the other linked issues there.
In order to remove shred_version == 0 from gossip, this commit adds
shred-version to ip-echo-server response. Once the entrypoints are
updated, on validator start-up, if --expected_shred_version is not
specified we will obtain shred-version from the entrypoint using
ip-echo-server.
Crds values of nodes with different shred versions are creeping into
gossip table resulting in runtime issues as the one addressed in:
https://github.com/solana-labs/solana/pull/17899
This commit works towards enforcing more checks and filtering based on
shred version by adding necessary mapping and api to gossip table.
Once populated, pubkey->shred-version mapping persists as long as there
are any values associated with the pubkey.