This commit implements a new ContactInfo where:
* Ports and IP addresses are specified separately so that each unique IP
address is specified only once.
* Different sockets (tvu, tpu, etc) are specified by opaque u8 tags so
that adding and removing sockets is backward and forward compatible.
* solana_version::Version is also embedded so that it won't need to
be gossiped separately.
* NodeInstance is also rolled in by adding a field identifying when the
instance was first created so that it won't need to be gossiped
separately.
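The layout described above can be sketched roughly as follows. This is a minimal illustration, not the actual type: the struct name, field names, tag values, and the tuple standing in for solana_version::Version are all assumptions made for this example.

```rust
use std::net::{IpAddr, SocketAddr};

// Hypothetical tag values; the real tag assignments are not given here.
const SOCKET_TAG_TPU: u8 = 1;
const SOCKET_TAG_TVU: u8 = 2;

/// One socket: an opaque tag, an index into `addrs`, and a port.
#[derive(Clone, Debug, PartialEq, Eq)]
struct SocketEntry {
    tag: u8,
    addr_index: u8,
    port: u16,
}

/// Simplified stand-in for the new ContactInfo (field names are assumed).
#[derive(Clone, Debug, Default)]
struct ContactInfoSketch {
    /// Each unique IP address is stored once.
    addrs: Vec<IpAddr>,
    /// Sockets refer to addresses by index instead of repeating them.
    sockets: Vec<SocketEntry>,
    /// Stand-in for the rolled-in NodeInstance field: when this instance
    /// was first created.
    instance_created_ms: u64,
    /// Stand-in for the embedded solana_version::Version.
    version: (u16, u16, u16),
}

impl ContactInfoSketch {
    /// Sets (or replaces) the socket for `tag`, reusing an existing address
    /// entry when the IP is already known. Unused addresses are not
    /// garbage-collected in this sketch.
    fn set_socket(&mut self, tag: u8, socket: SocketAddr) {
        self.sockets.retain(|entry| entry.tag != tag);
        let index = match self.addrs.iter().position(|addr| *addr == socket.ip()) {
            Some(index) => index,
            None => {
                self.addrs.push(socket.ip());
                self.addrs.len() - 1
            }
        };
        let addr_index = u8::try_from(index).expect("sketch assumes at most 256 addresses");
        self.sockets.push(SocketEntry { tag, addr_index, port: socket.port() });
    }

    /// Looks up a socket by tag. Tags this node does not know about are
    /// simply never requested, so unknown entries are harmless.
    fn socket(&self, tag: u8) -> Option<SocketAddr> {
        let entry = self.sockets.iter().find(|entry| entry.tag == tag)?;
        let ip = self.addrs.get(usize::from(entry.addr_index))?;
        Some(SocketAddr::new(*ip, entry.port))
    }
}
```

Because sockets are keyed by an opaque u8 rather than by named struct fields, adding or removing a socket type changes only which tags appear, not the shape of the serialized data.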
Update plan:
* Once the cluster is able to ingest the new type (i.e. this patch), a
2nd patch will start gossiping the new ContactInfo along with the
LegacyContactInfo.
* Once all nodes in the cluster gossip the new ContactInfo, a 3rd patch
will start solely using the new ContactInfo while still gossiping the
old LegacyContactInfo.
* Once all nodes in the cluster solely use the new ContactInfo, a 4th
patch will stop gossiping the old LegacyContactInfo.
Retransmit code has moved to core/src/cluster_nodes.rs and has been
significantly revised.
gossip/tests/cluster_info.rs is testing the old code which is no longer
relevant.
* First draft of ingesting duplicate proofs in Gossip into blockstore.
* Add more unittests.
* Add more unittests for bad cases.
* Fix lint errors for tests.
* More linter fixes for tests.
* Lint fixes
* Rename get_entries, move location of comment.
* Some renaming changes and comment fixes.
* Fix compile warning, this enum is not used.
* Fix lint errors.
* Slow down cleanup because this could potentially be expensive.
* Forgot to reset cleanup count.
* Add protection against attackers when constructing chunk map when
we ingest Gossip proofs.
* Use duplicate shred index instead of get_entries.
* Rename ClusterInfoDuplicateShredListener and fix a few small problems.
* Use into_shreds to piece together the proof.
* Remove redundant code.
* Address a few small errors.
* Discard slots too advanced in the future.
* Use oldest proof for each pubkey
* Limit number of pubkeys in each slot to 100
* Disable duplicate shred handling for now.
* Revert "Disable duplicate shred handling for now."
This reverts commit c3fcf403876cfbf90afe4d2265a826f21a5e24ab.
Amplifying gossip peer sampling weights by the time since the last
pull-request has the undesired consequence that a node coming back
online will see a huge number of pull requests all at once.
This "time since last request" is also unnecessary to include in
weights because as long as sampling probabilities are non-zero, a node
will almost surely be selected in the sample periodically.
This commit reworks peer sampling probabilities by just using (dampened)
stakes as weights.
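As a rough illustration of sampling by dampened stake, the sketch below uses the square of the stake's bit length (in SOL) as the weight. The exact dampening function is an assumption for this example, as are the function names; the point is only that weights grow much more slowly than stake and stay non-zero for unstaked nodes.

```rust
const LAMPORTS_PER_SOL: u64 = 1_000_000_000;

/// Dampened weight (assumed form): roughly (log2(stake in SOL) + 2)^2,
/// so a node with 1000x more stake gets only a modestly larger weight.
fn dampened_weight(stake_lamports: u64) -> u64 {
    let stake_sol = stake_lamports / LAMPORTS_PER_SOL;
    // Bit length: 0 for zero stake, floor(log2(stake)) + 1 otherwise.
    let bits = u64::from(u64::BITS - stake_sol.leading_zeros());
    // Adding 1 keeps the weight non-zero even for unstaked nodes.
    (bits + 1).pow(2)
}

/// Picks an index with probability proportional to its weight.
/// `draw` is a caller-supplied random number; taking it modulo the total
/// has a slight bias, which is acceptable for a sketch.
fn weighted_pick(weights: &[u64], draw: u64) -> Option<usize> {
    let total: u64 = weights.iter().sum();
    if total == 0 {
        return None;
    }
    let mut target = draw % total;
    for (index, weight) in weights.iter().enumerate() {
        if target < *weight {
            return Some(index);
        }
        target -= *weight;
    }
    None
}
```

Note that no "time since last request" term appears: every peer has a non-zero weight, so each is still selected from time to time.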
Gossip push samples nodes by stake. This is unnecessarily wasteful and
creates too much congestion at high-staked nodes if the CRDS value to be
propagated is from a node with low or zero stake.
This commit instead maintains several active-sets for push, each
corresponding to a stake bucket. Peer sampling weights are accordingly
capped by the respective bucket stake.
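One way to picture the bucketing is sketched below. The number of buckets, the log2-based bucket function, and the squared weight are assumptions chosen for illustration; only the idea of capping a peer's weight by the bucket comes from the description above.

```rust
const LAMPORTS_PER_SOL: u64 = 1_000_000_000;
// Assumed number of stake buckets (one active-set per bucket).
const NUM_BUCKETS: usize = 25;

/// Maps a stake to a bucket by its bit length in SOL (assumed scheme).
fn stake_bucket(stake_lamports: u64) -> usize {
    let stake_sol = stake_lamports / LAMPORTS_PER_SOL;
    let bucket = (u64::BITS - stake_sol.leading_zeros()) as usize;
    bucket.min(NUM_BUCKETS - 1)
}

/// Sampling weight of a peer inside the active-set for `bucket`.
/// The peer's own bucket is capped by the active-set's bucket, so a
/// high-staked peer gains no extra weight in low-stake active-sets.
fn push_weight(peer_stake_lamports: u64, bucket: usize) -> u64 {
    let capped = stake_bucket(peer_stake_lamports).min(bucket) as u64;
    (capped + 1).pow(2)
}
```

With this shape, a value originating from a zero-stake node is pushed through the bucket-0 active-set, where every peer has the same weight regardless of its stake.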
Since
https://github.com/solana-labs/solana/pull/20480
turbine includes all epoch-staked nodes in tree construction and no
longer relies on obtaining their contact-info from gossip. Distinguishing
between is_valid_address and is_valid_tvu_address is therefore no
longer necessary, and the latter can be removed.
As described here:
https://github.com/solana-labs/solana/issues/28642#issuecomment-1337449607
the current gossip pruning code fails to maintain spanning trees across
the cluster.
This commit instead implements pruning code based on the timeliness of
delivered messages. If a message is delivered early enough (in terms
of the number of duplicates already observed for that value), it counts
towards the respective node's score. Once there are enough CRDS
upserts from a specific origin, redundant nodes are pruned based on the
tracked score.
Since the pruning leaves some configurable redundancy and the scores are
reset frequently, it should better tolerate active-set rotations.
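The scoring and pruning idea can be sketched as below, for a single origin. All constants, the type name, and the use of string names in place of pubkeys are assumptions for illustration; the actual thresholds and data structures are not given here.

```rust
use std::collections::HashMap;

// Hypothetical parameters.
/// A delivery counts as timely if fewer than this many copies of the
/// value were already observed.
const TIMELY_DUPLICATES: usize = 2;
/// Number of upserts from an origin required before pruning.
const MIN_UPSERTS: usize = 20;
/// Number of best-scoring relaying nodes kept (the configurable redundancy).
const KEEP: usize = 2;

/// Scores of the nodes relaying values from one origin.
#[derive(Default)]
struct OriginScores {
    num_upserts: usize,
    scores: HashMap<&'static str, usize>, // relaying node -> score
}

impl OriginScores {
    /// Records that `node` delivered a value after `num_dups` copies of
    /// it had already been seen.
    fn record(&mut self, node: &'static str, num_dups: usize) {
        if num_dups == 0 {
            // First delivery of this value: it upserts the table.
            self.num_upserts += 1;
        }
        let score = self.scores.entry(node).or_default();
        if num_dups < TIMELY_DUPLICATES {
            *score += 1;
        }
    }

    /// Returns the nodes to prune and resets all scores. Returns nothing
    /// until enough upserts have been observed.
    fn prune(&mut self) -> Vec<&'static str> {
        if self.num_upserts < MIN_UPSERTS {
            return Vec::new();
        }
        self.num_upserts = 0;
        let mut ranked: Vec<_> = self.scores.drain().collect();
        // Highest score first; ties broken by name for determinism.
        ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
        ranked.into_iter().skip(KEEP).map(|(node, _)| node).collect()
    }
}
```

Because scores are cleared on every prune, a node that was slow under one active-set rotation gets a fresh chance under the next.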
https://github.com/solana-labs/solana/pull/27193
added a hash domain to the ping-pong protocol.
For backward compatibility, responses both with and without the domain
were generated and accepted.
Now that all clusters are upgraded, this commit enforces the hash domain
by removing the response without the domain.
Currently, gossip packets are counted after excess packets are dropped.
This makes it difficult to debug gossip traffic spikes if the majority
of the packets are dropped.
This commit instead counts gossip packets received before excess packets
are dropped.
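The ordering change can be illustrated with a small sketch. The stats struct, the function name, and the choice to drop packets from the tail are assumptions for this example only.

```rust
/// Hypothetical counters for this example.
#[derive(Default, Debug)]
struct GossipStats {
    packets_received: usize,
    packets_dropped: usize,
}

/// Counts all received packets first, then drops the excess.
fn ingest(
    mut packets: Vec<Vec<u8>>,
    max_packets: usize,
    stats: &mut GossipStats,
) -> Vec<Vec<u8>> {
    // Count before dropping so that traffic spikes remain visible even
    // when most of the packets are discarded.
    stats.packets_received += packets.len();
    stats.packets_dropped += packets.len().saturating_sub(max_packets);
    // Which packets get dropped is an assumption here (the tail).
    packets.truncate(max_packets);
    packets
}
```

With both counters reported, the ratio of dropped to received packets also shows how severe a spike was.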
* Fix test_bench_tps_local_cluster_solana
* Remove #[ignore] annotations from dos tests (which are also fixed by this change)
* Remove #[ignore] annotations from local cluster tests (which are also fixed by this change)
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use camel case. It is more character-dense than snake or kebab case
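A minimal sketch of checking and applying these tenets follows. The helper names and the example thread name are hypothetical; std::thread::Builder::name is the standard way to name a thread in Rust.

```rust
use std::thread;

const MAX_THREAD_NAME_LEN: usize = 15;

/// Checks the three tenets: at most 15 characters, "sol" prefix, and no
/// snake-case or kebab-case separators.
fn is_valid_thread_name(name: &str) -> bool {
    name.len() <= MAX_THREAD_NAME_LEN
        && name.starts_with("sol")
        && !name.contains(|c: char| c == '_' || c == '-')
}

/// Spawns a named thread that reports its own name back to the caller.
fn spawn_named(name: &str) -> thread::JoinHandle<Option<String>> {
    assert!(is_valid_thread_name(name), "invalid thread name: {name}");
    thread::Builder::new()
        .name(name.to_string())
        .spawn(|| thread::current().name().map(str::to_owned))
        .expect("failed to spawn thread")
}
```

For example, "solGossipListen" (exactly 15 characters) passes the check, while "sol_gossip_listen" fails both the length and the separator rules.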
In order to maintain backward compatibility, for now the responding node
will hash the token both with and without domain so that the other node
will accept the response regardless of its upgrade status.
Once the cluster has upgraded to the new code, we will remove the legacy
domain = false case.
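The transition can be sketched as follows. This is only an illustration: std's DefaultHasher stands in for the real cryptographic hash, and the domain prefix bytes and function names are made up for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Hypothetical domain prefix; the real prefix is not given here.
const PING_PONG_DOMAIN: &[u8] = b"EXAMPLE_PING_PONG";

/// Hashes a ping token, optionally prefixed with the hash domain.
/// DefaultHasher is a non-cryptographic stand-in for the real hash.
fn hash_token(token: &[u8], domain: bool) -> u64 {
    let mut hasher = DefaultHasher::new();
    if domain {
        hasher.write(PING_PONG_DOMAIN);
    }
    hasher.write(token);
    hasher.finish()
}

/// Transitional responder: produces both hashes so that upgraded and
/// non-upgraded peers each find one they accept.
fn pong_hashes(token: &[u8]) -> [u64; 2] {
    [hash_token(token, true), hash_token(token, false)]
}

/// Verifier on an upgraded node. The legacy (domain = false) hash is
/// accepted only while `accept_legacy` is set; clearing the flag models
/// the later removal of the legacy case.
fn verify_pong(token: &[u8], response: u64, accept_legacy: bool) -> bool {
    response == hash_token(token, true)
        || (accept_legacy && response == hash_token(token, false))
}
```

Once every node sends and checks the domain-prefixed hash, the second array element and the `accept_legacy` branch can be deleted without a further protocol change.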
cargo test --package solana-gossip --release test_push_votes_with_tower
occasionally fails because with --release all votes are generated at
the same wallclock (millisecond resolution), so the new ones do
not necessarily override existing entries in the table.
This commit ensures that the new vote is pushed with a wallclock later
than existing entries.
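The fix amounts to something like the sketch below, where the function name and the slice of existing wallclocks are assumptions for illustration.

```rust
/// Returns a wallclock (in milliseconds) for a new vote that is strictly
/// later than every existing entry, even if the system clock has not
/// advanced since those entries were created.
fn next_vote_wallclock(now_ms: u64, existing: &[u64]) -> u64 {
    match existing.iter().max() {
        Some(&latest) => now_ms.max(latest.saturating_add(1)),
        None => now_ms,
    }
}
```

When two votes are created within the same millisecond, the second one is stamped one millisecond later and therefore overrides the first.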
* add three gossip metrics measuring gossip loop times
* add 5 metrics
* rm space
* rm space
* Update SECURITY.md
- fix nav link
- add bounty split policy for duplicate reports
* Add transaction index in slot to geyser plugin TransactionInfo (#25688)
* Define shuffle to prep using same shuffle for multiple slices
* Determine transaction indexes and plumb to execute_batch
* Pair transaction_index with transaction in TransactionStatusService
* Add new ReplicaTransactionInfoVersion
* Plumb transaction_indexes through BankingStage
* Prepare BankingStage to receive transaction indexes from PohRecorder
* Determine transaction indexes in PohRecorder; add field to WorkingBank
* Add PohRecorder::record unit test
* Only pass starting_transaction_index around PohRecorder
* Add helper structs to simplify test DashMap
* Pass entry and starting-index into process_entries_with_callback together
* Add tx-index checks to test_rebatch_transactions
* Revert shuffle definition and use zip/unzip
* Only zip/unzip if randomize
* Add confirm_slot_entries test
* Review nits
* Add type alias to make sender docs more clear
* Update SECURITY.md
finish filling out the table....
* rpc: fix possible deadlock in rpc (#26051)
* Add StatusCache::root_slot_deltas() and use it (#26170)
* Remove InMemAccountsIndex::map() and use map_internal directly (#26189)
* [quic]Decrement total_streams correctly (#26158)
* remove comment
* alphabetical metrics. no abbreviations
* remove trailing white space
* cargo fmt to update code format/readability
Co-authored-by: Trent Nelson <trent@solana.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
Co-authored-by: Boqin Qin(秦 伯钦) <Bobbqqin@gmail.com>
Co-authored-by: Brooks Prumo <brooks@solana.com>
Co-authored-by: Miles Obare <bdhobare@gmail.com>
* Spawn QUIC server to receive forwarded txs
* Update validator port range
* forward votes using UDP
* no forwarding from unstaked nodes
* forwarding stats in banking stage
* fix test builds
* fix lifetime of forward sender