Commit Graph

149 Commits

Author SHA1 Message Date
behzad nouri 0da01270ef
removes redundant recycler clones (#32401) 2023-07-06 18:25:20 +00:00
behzad nouri 469661d217
removes outdated tvu_forward socket (#32101)
Shreds are no longer sent to tvu_forward socket.
2023-06-20 20:50:16 +00:00
behzad nouri 987e8eeeaf
removes feature gate code dropping redundant turbine path (#32075) 2023-06-16 19:53:05 +00:00
behzad nouri aed4ecb633
adds quic receiver to shred-fetch-stage (#31576)
Working towards migrating turbine to QUIC.
2023-06-12 13:16:27 +00:00
behzad nouri 5178d4d49b
adds quic tvu port to contact-info (#31614)
Working towards migrating turbine to QUIC.
2023-05-15 15:13:21 +00:00
behzad nouri 4e34abbf3d
specifies protocol in contact-info get-socket api (#31602) 2023-05-12 16:16:20 +00:00
Illia Bobyr 2d090d4547
test_gossip_node: Use random port (#31490)
Using a fixed port could cause a false negative, if the port is
currently in use.  We actually see this test failing regularly with an
error that port `1111` is already in use.

A quick search did not show any tests that hardcode port 1111, so it is
unclear why this is happening.  But using hardcoded ports is not good
practice anyway.
2023-05-05 18:47:24 -07:00
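
One common way to avoid a hardcoded test port is to bind to port 0 and let the OS pick a free one. The sketch below shows that idea with the standard library only; it is not the code from #31490, and the helper name `bind_on_free_port` is hypothetical.

```rust
use std::net::{Ipv4Addr, SocketAddr, UdpSocket};

/// Binds a UDP socket on localhost and lets the OS choose a free port
/// (port 0 means "any available port").
/// Hypothetical helper for illustration; not taken from the commit.
fn bind_on_free_port() -> std::io::Result<(UdpSocket, SocketAddr)> {
    let socket = UdpSocket::bind((Ipv4Addr::LOCALHOST, 0))?;
    // Ask the socket which port the OS actually assigned.
    let addr = socket.local_addr()?;
    Ok((socket, addr))
}

fn main() -> std::io::Result<()> {
    // Both sockets stay open, so the OS must hand out two distinct ports.
    let (_first, first_addr) = bind_on_free_port()?;
    let (_second, second_addr) = bind_on_free_port()?;
    println!("bound to {first_addr} and {second_addr}");
    Ok(())
}
```

Because the port is chosen at bind time, two tests running in parallel cannot collide on a fixed number such as `1111`.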
Brooks c5e071c7fe
Upgrades nightly Rust to 2023-03-04 (#31487) 2023-05-05 08:28:23 -04:00
behzad nouri aafcac27d8
removes pubkey from LegacyContactInfo public interface (#31375)
Working towards LegacyContactInfo => ContactInfo migration, the commit
adds more api parity between the two.
2023-04-28 12:05:15 +00:00
DimAn bd782634dc
Add ability to manage public TPU Forwards address (#31201)
A node operator can manage a public TPU address (at node startup and while it's running), but doesn't have the ability to manage the TPU Forwards address as well.

This PR adds that functionality.

Added the start argument --public-tpu-forwards-address
Reworked the set-public-tpu-address subcommand into the set-public-address subcommand
2023-04-24 23:04:36 +00:00
behzad nouri 1b08d01a80
removes shred_version from LegacyContactInfo public interface (#31304)
Working towards LegacyContactInfo => ContactInfo migration, the commit
adds more api parity between the two.
2023-04-24 15:19:33 +00:00
behzad nouri a88024e295
removes wallclock from LegacyContactInfo public interface (#31303) 2023-04-22 20:18:39 +00:00
behzad nouri cb65a785bc
makes sockets in LegacyContactInfo private (#31248)
Working towards LegacyContactInfo => ContactInfo migration, the commit
hides some implementation details of LegacyContactInfo and expands API
parity with the new ContactInfo.
2023-04-21 15:39:16 +00:00
DimAn b81b7ebf03
validator: `--tpu-host-addr` -> `--public-tpu-address` (#31137) 2023-04-12 13:33:09 +00:00
DimAn 9136f80d36
validator: add `set-public-tpu-address` command (#30452) 2023-04-12 13:32:22 +00:00
Brooks 453f272698
Rename IncrementalSnapshotHashes to SnapshotHashes (#31136) 2023-04-11 10:30:29 -04:00
Brooks f3083ad2e0
Rename SnapshotHashes to LegacySnapshotHashes (#31086) 2023-04-10 17:52:20 -04:00
Brooks e8ea722061
Uses AccountsHashes type for AccountsHashes CrdsData variant (#31003) 2023-04-03 16:42:21 -04:00
behzad nouri d4b30adffe
reworks gossip crds timeouts (#30468)
CrdsGossipPull::make_timeouts iterates over the stakes hashmap and
creates a new hashmap which is unnecessary:
https://github.com/solana-labs/solana/blob/c032dc275/gossip/src/crds_gossip_pull.rs#L517-L539

The commit instead keeps a reference to the stakes hashmap.
2023-03-27 21:52:48 +00:00
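
The idea of borrowing the stakes map rather than building a second map can be sketched as follows. This is a minimal illustration, not the code from #30468: the `Timeouts` type, the `NodeId` alias, and the rule that staked nodes get a longer timeout are all assumptions.

```rust
use std::collections::HashMap;
use std::time::Duration;

// Hypothetical stand-in for a node identity; not the real Pubkey type.
type NodeId = [u8; 32];

/// Borrows the stakes map instead of copying it into a per-node timeout map.
struct Timeouts<'a> {
    stakes: &'a HashMap<NodeId, u64>,
    default_timeout: Duration,
    staked_timeout: Duration,
}

impl<'a> Timeouts<'a> {
    fn new(
        stakes: &'a HashMap<NodeId, u64>,
        default_timeout: Duration,
        staked_timeout: Duration,
    ) -> Self {
        Self { stakes, default_timeout, staked_timeout }
    }

    /// Computes the timeout at lookup time.
    /// Assumed rule: nodes with nonzero stake get the longer timeout.
    fn get(&self, node: &NodeId) -> Duration {
        match self.stakes.get(node) {
            Some(&stake) if stake > 0 => self.staked_timeout,
            _ => self.default_timeout,
        }
    }
}

fn main() {
    let mut stakes = HashMap::new();
    stakes.insert([1u8; 32], 500u64);
    let timeouts = Timeouts::new(&stakes, Duration::from_secs(15), Duration::from_secs(60));
    println!("staked node:  {:?}", timeouts.get(&[1u8; 32]));
    println!("unknown node: {:?}", timeouts.get(&[9u8; 32]));
}
```

No allocation happens per call, and the timeout for any node is derived from the borrowed map when it is needed.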
kirill lykov ee1717b24b
Make clippy happy (#30394)
* replace default implementation with default enum

* fix format to make clippy happy
2023-02-17 20:51:18 +01:00
Brennan b678ee3583
Clean up socket binding reuse (#30367) 2023-02-16 11:07:22 -08:00
behzad nouri 4892a6a910
removes redundant CrdsGossipPull.msg_timeout (#30334) 2023-02-16 00:23:27 +00:00
behzad nouri ded457cd73
embeds the new gossip ContactInfo in ClusterInfo (#30022)
Working towards replacing the legacy gossip contact-info with the new
one, the commit updates the respective field in gossip cluster-info.
2023-02-10 20:07:45 +00:00
behzad nouri 544fbded07
removes wallclock from duplicate-shreds handler (#30187) 2023-02-08 17:29:30 +00:00
Kevin Ji cd51499ab9
Use Ipv4Addr constants in socketaddr! (#30095) 2023-02-02 16:48:21 -07:00
behzad nouri 4cc07a176e
reduces number of gossip pull requests/responses (#29974) 2023-01-30 17:59:56 +00:00
behzad nouri d6fbf3fb17
adds new contact-info with forward compatible sockets (#29596)
The commit implements a new ContactInfo where
* Ports and IP addresses are specified separately so that unique IP
  addresses can only be specified once.
* Different sockets (tvu, tpu, etc) are specified by opaque u8 tags so
  that adding and removing sockets is backward and forward compatible.
* solana_version::Version is also embedded so that it won't need to
  be gossiped separately.
* NodeInstance is also rolled in by adding a field identifying when the
  instance was first created so that it won't need to be gossiped
  separately.

Update plan:
* Once the cluster is able to ingest the new type (i.e. this patch), a
  2nd patch will start gossiping the new ContactInfo along with the
  LegacyContactInfo.
* Once all nodes in the cluster gossip the new ContactInfo, a 3rd patch
  will start solely using the new ContactInfo while still gossiping the
  old LegacyContactInfo.
* Once all nodes in the cluster solely use the new ContactInfo, a 4th
  patch will stop gossiping the old LegacyContactInfo.
2023-01-26 17:02:18 +00:00
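
The first two design points above (each IP address stored once, sockets identified by opaque u8 tags) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the layout from #29596: the type `ContactInfoSketch`, the `SocketEntry` fields, and the tag values are all hypothetical.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

// Hypothetical tag values; the commit message does not list the real ones.
const TAG_GOSSIP: u8 = 0;
const TAG_TVU: u8 = 1;

/// One socket: an opaque tag, an index into the shared address list, and a port.
struct SocketEntry {
    tag: u8,
    addr_index: usize,
    port: u16,
}

#[derive(Default)]
struct ContactInfoSketch {
    addrs: Vec<IpAddr>,        // each unique IP address appears once
    sockets: Vec<SocketEntry>, // unknown tags can be carried without being understood
}

impl ContactInfoSketch {
    fn set_socket(&mut self, tag: u8, socket: SocketAddr) {
        // Reuse an existing address entry if this IP is already known.
        let addr_index = match self.addrs.iter().position(|a| *a == socket.ip()) {
            Some(index) => index,
            None => {
                self.addrs.push(socket.ip());
                self.addrs.len() - 1
            }
        };
        // Replace any previous entry with the same tag.
        self.sockets.retain(|entry| entry.tag != tag);
        self.sockets.push(SocketEntry { tag, addr_index, port: socket.port() });
    }

    fn get_socket(&self, tag: u8) -> Option<SocketAddr> {
        let entry = self.sockets.iter().find(|entry| entry.tag == tag)?;
        let ip = self.addrs.get(entry.addr_index)?;
        Some(SocketAddr::new(*ip, entry.port))
    }
}

fn main() {
    let mut info = ContactInfoSketch::default();
    let ip = IpAddr::V4(Ipv4Addr::new(10, 0, 0, 1));
    info.set_socket(TAG_GOSSIP, SocketAddr::new(ip, 8001));
    info.set_socket(TAG_TVU, SocketAddr::new(ip, 8002));
    println!("{} address(es), tvu = {:?}", info.addrs.len(), info.get_socket(TAG_TVU));
}
```

Looking up a tag that a node does not know simply yields `None`, which is what makes adding or removing socket kinds compatible in both directions in this sketch.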
behzad nouri 1c7662a37f
asserts that cluster-info keypair is consistent with contact-info id (#29818) 2023-01-24 16:57:55 +00:00
Kevin Ji dd92f225bb
Use Ipv4Addr::{LOCALHOST, UNSPECIFIED} constants (#29813) 2023-01-23 16:49:51 -06:00
behzad nouri 590b75140f
removes legacy retransmit tests (#29817)
Retransmit code has moved to core/src/cluster_nodes.rs and has been
significantly revised.
gossip/tests/cluster_info.rs is testing the old code which is no longer
relevant.
2023-01-21 22:28:48 +00:00
behzad nouri 272e667cb2
deprecates Pubkey::new in favor of Pubkey::{,try_}from (#29805)
The commit deprecates Pubkey::new which lacks type-safety and instead
implements TryFrom<&[u8]> and TryFrom<Vec<u8>> for Pubkey.
2023-01-21 18:06:27 +00:00
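
The type-safety argument can be illustrated with a stand-in type. The sketch below uses a hypothetical `Key32` wrapper rather than the real `Pubkey`, and shows how a `TryFrom<&[u8]>` implementation reports a wrong-length slice as an error instead of accepting any input.

```rust
use std::array::TryFromSliceError;
use std::convert::TryFrom;

/// Hypothetical 32-byte key type standing in for Pubkey.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Key32([u8; 32]);

/// Infallible: a fixed-size array always has the right length.
impl From<[u8; 32]> for Key32 {
    fn from(bytes: [u8; 32]) -> Self {
        Key32(bytes)
    }
}

/// Fallible: a slice may have any length, so the conversion can fail.
impl TryFrom<&[u8]> for Key32 {
    type Error = TryFromSliceError;

    fn try_from(bytes: &[u8]) -> Result<Self, Self::Error> {
        <[u8; 32]>::try_from(bytes).map(Key32)
    }
}

fn main() {
    let good = vec![7u8; 32];
    let bad = vec![7u8; 31];
    println!("32 bytes: {:?}", Key32::try_from(good.as_slice()).is_ok());
    println!("31 bytes: {:?}", Key32::try_from(bad.as_slice()).is_ok());
}
```

Callers holding a `[u8; 32]` use the infallible `From`, while callers holding arbitrary bytes are forced to handle the error case.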
Wen b36791956e
Ingest duplicate proofs sent through Gossip (#29227)
* First draft of ingesting duplicate proofs in Gossip into blockstore.

* Add more unittests.

* Add more unittests for bad cases.

* Fix lint errors for tests.

* More linter fixes for tests.

* Lint fixes

* Rename get_entries, move location of comment.

* Some renaming changes and comment fixes.

* Fix compile warning, this enum is not used.

* Fix lint errors.

* Slow down cleanup because this could potentially be expensive.

* Forgot to reset cleanup count.

* Add protection against attackers when constructing chunk map when
we ingest Gossip proofs.

* Use duplicate shred index instead of get_entries.

* Rename ClusterInfoDuplicateShredListener and fix a few small problems.

* Use into_shreds to piece together the proof.

* Remove redundant code.

* Address a few small errors.

* Discard slots too advanced in the future.

* - Use oldest proof for each pubkey
- Limit number of pubkeys in each slot to 100

* Disable duplicate shred handling for now.

* Revert "Disable duplicate shred handling for now."

This reverts commit c3fcf403876cfbf90afe4d2265a826f21a5e24ab.
2023-01-19 13:00:56 -08:00
behzad nouri 0941d133a8
adds new solana_version::Version with ClientId (#29649) 2023-01-17 22:21:14 +00:00
behzad nouri d4ce59eee7
reworks weights for gossip pull-requests peer sampling (#28463)
Amplifying gossip peer sampling weights by the time since the last
pull-request has the undesired consequence that a node coming back online
will see a huge number of pull requests all at once.
This "time since last request" is also unnecessary to include in
weights because as long as sampling probabilities are non-zero, a node
will be almost surely periodically selected in the sample.
The commit reworks peer sampling probabilities by just using (dampened)
stakes as weights.
2023-01-14 15:44:38 +00:00
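
Sampling with dampened stakes as weights might look roughly like the sketch below. The commit message does not say how stakes are dampened, so the `log2`-based `dampened_weight` function is purely an assumption, and `sample_index` is a hypothetical helper that takes a uniform draw `u` in [0, 1) from the caller instead of depending on a random number crate.

```rust
/// Assumed damping function (not from the commit): grows with log2 of the
/// stake and stays nonzero for unstaked nodes, so every node can be sampled.
fn dampened_weight(stake: u64) -> f64 {
    (stake as f64 + 1.0).log2() + 1.0
}

/// Picks an index with probability proportional to its weight.
/// `u` must be a uniform draw in [0, 1) supplied by the caller.
fn sample_index(weights: &[f64], u: f64) -> Option<usize> {
    let total: f64 = weights.iter().sum();
    if weights.is_empty() || total <= 0.0 {
        return None;
    }
    let target = u * total;
    let mut cumulative = 0.0;
    for (index, weight) in weights.iter().enumerate() {
        cumulative += weight;
        if target < cumulative {
            return Some(index);
        }
    }
    // Guard against floating-point rounding at the upper edge.
    Some(weights.len() - 1)
}

fn main() {
    let stakes = [0u64, 1_000, 1_000_000];
    let weights: Vec<f64> = stakes.iter().map(|&s| dampened_weight(s)).collect();
    println!("weights = {weights:?}");
    println!("u = 0.5 selects index {:?}", sample_index(&weights, 0.5));
}
```

Since every weight is strictly positive, each node keeps a nonzero sampling probability, which is the property the commit message relies on.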
behzad nouri d89cf0d28b
includes origin's stake in gossip push nodes sampling (#29343)
Gossip push samples nodes by stake. This is unnecessarily wasteful and
creates too much congestion at high staked nodes if the CRDS value to be
propagated is from a node with low or zero stake.
This commit instead maintains several active-sets for push, each
corresponding with a stake bucket. Peer sampling weights are accordingly
capped by the respective bucket stake.
2023-01-11 19:46:32 +00:00
behzad nouri 677b6d6458
removes LegacyContactInfo::is_valid_tvu_address (#29570)
Since
https://github.com/solana-labs/solana/pull/20480
turbine includes all epoch staked nodes in tree construction and no
longer relies on obtaining their contact-info from gossip; and so
distinguishing between is_valid_address and is_valid_tvu_address is no
longer necessary and the latter can be removed.
2023-01-08 22:53:45 +00:00
behzad nouri 8c212f59ad
renames ContactInfo to LegacyContactInfo (#29566)
Working towards adding a new ContactInfo where new sockets can be
added in a backward compatible way.
2023-01-08 16:00:55 +00:00
behzad nouri 283a2b1540
removes #[allow(clippy::same_item_push)] (#29543) 2023-01-06 17:32:26 +00:00
behzad nouri 2d849a2eae
indexes duplicate-shreds in gossip crds table (#29317)
Also adding Crds::get_duplicate_shreds which retrieves all upserted
duplicate-shreds since a given cursor using the index.
2022-12-20 13:48:05 +00:00
behzad nouri 78a04ed432
ignores pubkey in Protocol::PruneMessage (#29280)
The Pubkey in Protocol::PruneMessage(Pubkey, _) is the same as
PruneData.pubkey and so is redundant and can be ignored:
https://github.com/solana-labs/solana/blob/95d339300/gossip/src/cluster_info.rs#LL277-L279
https://github.com/solana-labs/solana/blob/95d339300/gossip/src/cluster_info.rs#L361-L367
2022-12-15 17:51:12 +00:00
behzad nouri a5c8c7c536
locks crds table only once to process push messages (#29218)
Processing push messages locks and unlocks the crds table for each
push message:
https://github.com/solana-labs/solana/blob/536b879aa/gossip/src/cluster_info.rs#L2266-L2276
https://github.com/solana-labs/solana/blob/536b879aa/gossip/src/crds_gossip_push.rs#L215C9-L260

This commit instead locks the crds table once for all the received push
messages.
2022-12-15 16:02:46 +00:00
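
The change from per-message locking to a single lock for the whole batch can be sketched as below. This is an illustration only, not the code from #29218: the `Table` alias, the `process_batch` function, and the use of `Mutex` with a "newer wallclock wins" rule are assumptions.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical table: maps a value label to the wallclock of its latest version.
type Table = HashMap<String, u64>;

/// Takes the lock once, then applies every message in the batch.
/// Returns how many messages actually updated the table.
fn process_batch(table: &Mutex<Table>, messages: &[(String, u64)]) -> usize {
    let mut guard = table.lock().unwrap(); // single lock acquisition
    let mut upserts = 0;
    for (label, wallclock) in messages {
        // Assumed rule: only insert when the entry is new or strictly newer.
        let is_newer = guard.get(label).map_or(true, |old| old < wallclock);
        if is_newer {
            guard.insert(label.clone(), *wallclock);
            upserts += 1;
        }
    }
    upserts
    // The guard is dropped here, releasing the lock after the whole batch.
}

fn main() {
    let table = Mutex::new(Table::new());
    let messages = vec![
        ("a".to_string(), 1),
        ("a".to_string(), 3),
        ("a".to_string(), 2),
        ("b".to_string(), 5),
    ];
    let upserts = process_batch(&table, &messages);
    println!("{upserts} upserts, table = {:?}", table.lock().unwrap());
}
```

Holding the guard across the loop trades slightly longer lock hold times for far fewer lock and unlock operations when batches are large.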
behzad nouri 95d3393008
prunes gossip nodes based on timeliness of delivered messages
As described here:
https://github.com/solana-labs/solana/issues/28642#issuecomment-1337449607
the current gossip pruning code fails to maintain spanning trees across
the cluster.

This commit instead implements pruning based on the timeliness of
delivered messages. If a message is delivered in a timely enough manner
(in terms of the number of duplicates already observed for that value), it
counts towards the respective node's score. Once there are enough CRDS
upserts from a specific origin, redundant nodes are pruned based on the
tracked score.

Since the pruning leaves some configurable redundancy and the scores are
reset frequently, it should better tolerate active-set rotations.
2022-12-15 13:28:27 +00:00
behzad nouri 8ea5dd8b28
removes metric for process_push_success (#29211)
This is already tracked in CrdsDataStats:
https://github.com/solana-labs/solana/blob/5e799ad56/gossip/src/crds.rs#L96-L106
https://github.com/solana-labs/solana/blob/5e799ad56/gossip/src/cluster_info_metrics.rs#L652-L656
and so is duplicated.
Removing the metric would simplify this code path for upcoming commits.
2022-12-12 22:10:38 +00:00
behzad nouri 9524c9dbff
patches errors from clippy::uninlined_format_args
https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
2022-12-06 19:32:15 +00:00
Jon Cinque b1340d77a2
sdk: Make Packet::meta private, use accessor functions (#29092)
sdk: Make packet meta private
2022-12-06 12:54:49 +01:00
behzad nouri 718f433206
adds metrics for gossip push fanout (#29065) 2022-12-04 15:20:51 +00:00
Brooks Prumo d1ba42180d
clippy for rust 1.65.0 (#28765) 2022-11-09 19:39:38 +00:00
behzad nouri f703275fc4
pings peers before sending push messages (#28537) 2022-10-25 00:01:23 +00:00
behzad nouri e283461d99
enforces hash domain for ping-pong protocol (#28433)
https://github.com/solana-labs/solana/pull/27193
added hash domain to ping-pong protocol.
For backward compatibility responses both with and without domain were
generated and accepted.
Now that all clusters are upgraded, this commit enforces the hash domain
by removing the response without the domain.
2022-10-18 18:17:12 +00:00
Brooks Prumo 12df0f234d
Upgrade to Rust 1.64.0 (#28034) 2022-09-29 09:32:24 -04:00