It was not immediately clear why the second `CrdsValue` insertion in the
test must always succeed. Turns out the test was relying on hash values
having a specific relationship. It is confusing to someone not deeply
familiar with the test.
As overwrite based on the hash value is not part of the behavior that we
consider valuable, we just remove that check.
Unified assertion between two checks into one.
Gossip push samples nodes by stake. This is unnecessarily wasteful and
creates too much congestion at high staked nodes if the CRDS value to be
propagated is from a node with low or zero stake.
This commit instead maintains several active-sets for push, each
corresponding with a stake bucket. Peer sampling weights are accordingly
capped by the respective bucket stake.
Since
https://github.com/solana-labs/solana/pull/20480
turbine includes all epoch staked nodes in tree construction and no
longer relies on obtaining their contact-info from gossip; and so
distinguishing between is_valid_address and is_valid_tvu_address is no
longer necessary and the latter can be removed.
dedups gossip addresses, keeping only the one with the highest weight
In order to avoid traffic congestion or sending duplicate packets, when
sampling gossip nodes if several nodes have the same gossip address
(because they are behind a relayer or whatever), they need to be
deduplicated into one.
As described here:
https://github.com/solana-labs/solana/issues/28642#issuecomment-1337449607
current gossip pruning code fails to maintain spanning trees across
cluster.
This commit instead implements a pruning code based on timeliness of
delivered messages. If a messages is delivered timely enough (in terms
of number of duplicates already observed for that value), it counts
towards the respective node's score. Once there are enough many CRDS
upserts from a specific origin, redundant nodes are pruned based on the
tracked score.
Since the pruning leaves some configurable redundancy and the scores are
reset frequently, it should better tolerate active-set rotations.
The commit tracks number of times duplicates of a CRDS value is received
from gossip push. Following commits will utilize this metric to score
gossip nodes in terms of timeliness of their push messages, in order to
better pick which nodes to prune.
* Move ConnectionCache back to solana-client, and duplicate ThinClient, TpuClient there
* Dedupe thin_client modules
* Dedupe tpu_client modules
* Move TpuClient to TpuConnectionCache
* Move ThinClient to TpuConnectionCache
* Move TpuConnection and quic/udp trait implementations back to solana-client
* Remove enum_dispatch from solana-tpu-client
* Move udp-client to its own crate
* Move quic-client to its own crate
https://github.com/solana-labs/solana/pull/27193
added hash domain to ping-pong protocol.
For backward compatibility responses both with and without domain were
generated and accepted.
Now that all clusters are upgraded, this commit enforces the hash domain
by removing the response without the domain.
Currently, gossip packets are counted after excess packets are dropped.
This makes it difficult to debug gossip traffic spikes if the majority
of the packets are dropped.
This commit instead counts gossip packets received before excess packets
are dropped
* Fix test_bench_tps_local_cluster_solana
* Remove #[ignore] annotations from dos tests (which are also fixed by this change)
* Remove #[ignore] annotations from local cluster tests (which are also fixed by this change)
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
In order to maintain backward compatibility, for now the responding node
will hash the token both with and without domain so that the other node
will accept the response regardless of its upgrade status.
Once the cluster has upgraded to the new code, we will remove the legacy
domain = false case.
cargo test --package solana-gossip --release test_push_votes_with_tower
occasionally fails because with --release all votes are generated at
the same wallclock (milliseconds resolution) and so the new ones will
not necessarily override existing entries in the table.
The commit ensures that the new vote is pushed with a wallclock later
than existing entries.