* remove unnecessary hashes around raw string literals
* remove unncessary literal `unwrap()`s
* remove panicking `unwrap()`
* remove unnecessary `unwrap()`
* use `[]` instead of `vec![]` where applicable
* remove (more) unnecessary explicit `into_iter()` calls
* remove redundant pattern matching
* don't cast to same type and constness
* do not `cfg(any(...` a single item
* remove needless pass by `&mut`
* prefer `or_default()` to `or_insert_with(T::default())`
* `filter_map()` better written as `filter()`
* incorrect `PartialOrd` impl on `Ord` type
* replace "slow zero-filled `Vec` initializations"
* remove redundant local bindings
* add required lifetime to associated constant
* sdk: Add concurrent support for rand 0.7 and 0.8
* Update rand, rand_chacha, and getrandom versions
* Run command to replace `gen_range`
Run `git grep -l gen_range | xargs sed -i'' -e 's/gen_range(\(\S*\), /gen_range(\1../'
* sdk: Fix users of older `gen_range`
* Replace `hash::new_rand` with `hash::new_with_thread_rng`
Run:
```
git grep -l hash::new_rand | xargs sed -i'' -e 's/hash::new_rand([^)]*/hash::new_with_thread_rng(/'
```
* perf: Use `Keypair::new()` instead of `generate`
* Use older rand version in zk-token-sdk
* program-runtime: Inline random key generation
* bloom: Fix clippy warnings in tests
* streamer: Scope rng usage correctly
* perf: Fix clippy warning
* accounts-db: Map to char to generate a random string
* Remove `from_secret_key_bytes`, it's just `keypair_from_seed`
* ledger: Generate keypairs by hand
* ed25519-tests: Use new rand
* runtime: Use new rand in all tests
* gossip: Clean up clippy and inline keypair generators
* core: Inline keypair generation for tests
* Push sbf lockfile change
* sdk: Sort dependencies correctly
* Remove `hash::new_with_thread_rng`, use `Hash::new_unique()`
* Use Keypair::new where chacha isn't used
* sdk: Fix build by marking rand 0.7 optional
* Hardcode secret key length, add static assertion
* Unify `getrandom` crate usage to fix linking errors
* bloom: Fix tests that require a random hash
* Remove some dependencies, try to unify others
* Remove unnecessary uses of rand and rand_core
* Update lockfiles
* Add back some dependencies to reduce rebuilds
* Increase max rebuilds from 14 to 15
* frozen-abi: Remove `getrandom`
* Bump rebuilds to 17
* Remove getrandom from zk-token-proof
The commit revises socket tag for TVU over QUIC to reuse old
TVU-forwards socket tag. This avoids gaps in socket tags which wastes
space in ContactInfo.cache.
The change is backward compatible because TVU-forwards is not used any
more and TVU over QUIC is not released yet.
The commit adds a Vec<Extension> to ContactInfo so that future additions
to ContactInfo can be done by only adding new Extensions instead of
modifying the entire ContactInfo.
* we only want to report received message signatures on PUSH requests, not PULL requests
* woops accidently had it has LocalMessage not PushMessage
* switch from match to if let statement
* convert if let to matches macro
* add in from field in PushMessage for message tracking
* update with cargo fmt
* remove display for gossip route and add lifetime param to pubkey reference in gossiproute enum
* forgot to run fmt
* we only want to report received message signatures on PUSH requests, not PULL requests
* woops accidently had it has LocalMessage not PushMessage
* switch from match to if let statement
* convert if let to matches macro
process_gossip_packets_iterations_since_last_report measures the same loop count as gossip_listen_loop_iterations_since_last_report which already exists
* screwed up old branch and syncing with upstream branch
* add fixed size ring buff instead of variable sized vecdeque for reporting signatures
* modify difficulty to take in n 0 bits and check if signature ending ends in n 0 bits
* update to only push every 18 trailing zero bits. and clean up
* report origin with signature. and set trailing 0s to 8 for local testing
* change back to 18 trailing zeros and rm unused imports
* run cargo rmt
* run ./scripts/cargo-for-all-lock-files.sh tree
* allow integer arithmetic for bit comparison
* rm unused lifetime
* rm duplicate entry?
* re implement ring buf
* put ringbuf in sorted order
* ringbuf in cargo.toml now in sorted order
* rm ring buf, refactor, fix trailing zero bug
* fix bug in trailing zeros. was comparing wrong ones
* fix needless range loop bug
* fix bug
* change trailing zero checking to build in methods and only report first 8 bytes of signature and origin pubkey
* report full origin string and first 8 bytes of signature
* set SIGNATURE_SAMPLE_TRAILING_ZEROS back to 18
* forgot to run cargo tree
* avoid panic and change working
* rm bs58
* pass in Option<String> into datapoint_info
* shorten metric names
`Arc` is already a reference internally, so it does not seem to be
beneficial to pass a reference to it. Just adds an extra layer of
indirection.
Functions that need to be able to increment `Arc` reference count need
to take `Arc<AtomicBool>`, but those that just want to read the
`AtomicBool` value can accept `&AtomicBool`, making them a bit more
generic.
This change focuses specifically on `Arc<AtomicBool>`. There are other
uses of `&Arc<T>` in the code base that could be converted in a similar
manner. But it would make the change even larger.
Using a fixed port could cause a false negative, if the port is
currently in use. We actually see this test failing regularly with an
error that port `1111` is already in use.
Quick search did not show any tests that hardcode port 1111, so it is
unclear why is this happening. But using hardcoded ports is not a good
practice anyways.
A node operator can manage a public TPU address (at node startup and while it's running), but doesn't have the ability to manage the TPU Forwards address as well.
This PR adds that functionality.
Added the start argument --public-tpu-forwards-address
Reworked the set-public-tpu-address subcommand into the set-public-address subcommand
Working towards LegacyContactInfo => ContactInfo migration, the commit
hides some implementation details of LegacyContactInfo and expands API
parity with the new ContactInfo.
* introduce workspace.package
* introduce workspace.dependencies
* read version from root cargo.toml
* pass check when version = { workspace = true }
* don't bump version when version = { workspace = true }
* including workspace Cargo.toml when bump version
* programs/sbf use workspace inheritance
* fix increasing cargo version ignore program/sbf/Cargo.toml
Local-timestamp used in current gossip eviction code:
https://github.com/solana-labs/solana/blob/a36e1b211/gossip/src/crds.rs#L469-L511
does not indicate how old the entry is but how recently it was received.
The commit instead uses origin's wallclock to identify old values. In
order to avoid cases where the wallclock on the entry is bogus, it is
capped by local-timestamp.
Duplicate-shreds handler is using a nested hash-map for the incomplete
chunks buffered. This is resulting in a convoluted logic to limit the
number of entries:
https://github.com/solana-labs/solana/blob/427bd6264/gossip/src/duplicate_shred_handler.rs#L62
This commit instead uses a flat buffer mapping (Slot, Pubkey) pairs to
the respective duplicate shreds chunks. The buffer is allowed to grow to
twice the intended capacity, at which point the extraneous entries are
removed in linear time, resulting an amortized O(1) performance.
The commit implement new ContactInfo where
* Ports and IP addresses are specified separately so that unique IP
addresses can only be specified once.
* Different sockets (tvu, tpu, etc) are specified by opaque u8 tags so
that adding and removing sockets is backward and forward compatible.
* solana_version::Version is also embedded in so that it won't need to
be gossiped separately.
* NodeInstance is also rolled in by adding a field identifying when the
instance was first created so that it won't need to be gossiped
separately.
Update plan:
* Once the cluster is able to ingest the new type (i.e. this patch), a
2nd patch will start gossiping the new ContactInfo along with the
LegacyContactInfo.
* Once all nodes in the cluster gossip the new ContactInfo, a 3rd patch
will start solely using the new ContactInfo while still gossiping the
old LegacyContactInfo.
* Once all nodes in the cluster solely use the new ContactInfo, a 4th
patch will stop gossiping the old LegacyContactInfo.
* introduce workspace.package
* introduce workspace.dependencies
* read version from root cargo.toml
* pass check when version = { workspace = true }
* don't bump version when version = { workspace = true }
* including workspace Cargo.toml when bump version
* programs/sbf use workspace inheritance
* fix increasing cargo version ignore program/sbf/Cargo.toml
Retransmit code has moved to core/src/cluster_nodes.rs and has been
significantly revised.
gossip/tests/cluster_info.rs is testing the old code which is no longer
relevant.
* First draft of ingesting duplicate proofs in Gossip into blockstore.
* Add more unittests.
* Add more unittests for bad cases.
* Fix lint errors for tests.
* More linter fixes for tests.
* Lint fixes
* Rename get_entries, move location of comment.
* Some renaming changes and comment fixes.
* Fix compile warning, this enum is not used.
* Fix lint errors.
* Slow down cleanup because this could potentially be expensive.
* Forgot to reset cleanup count.
* Add protection against attackers when constructing chunk map when
we ingest Gossip proofs.
* Use duplicate shred index instead of get_entries.
* Rename ClusterInfoDuplicateShredListener and fix a few small problems.
* Use into_shreds to piece together the proof.
* Remove redundant code.
* Address a few small errors.
* Discard slots too advanced in the future.
* - Use oldest proof for each pubkey
- Limit number of pubkeys in each slot to 100
* Disable duplicate shred handling for now.
* Revert "Disable duplicate shred handling for now."
This reverts commit c3fcf403876cfbf90afe4d2265a826f21a5e24ab.
Amplifying gossip peer sampling weights by the time since last
pull-request has undesired consequence that a node coming back online
will see a huge number of pull requests all at once.
This "time since last request" is also unnecessary to include in
weights because as long as sampling probabilities are non-zero, a node
will be almost surely periodically selected in the sample.
The commit reworks peer sampling probabilities by just using (dampened)
stakes as weights.
It was not immediately clear why the second `CrdsValue` insertion in the
test must always succeed. Turns out the test was relying on hash values
having a specific relationship. It is confusing to someone not deeply
familiar with the test.
As overwrite based on the hash value is not part of the behavior that we
consider valuable, we just remove that check.
Unified assertion between two checks into one.
Gossip push samples nodes by stake. This is unnecessarily wasteful and
creates too much congestion at high staked nodes if the CRDS value to be
propagated is from a node with low or zero stake.
This commit instead maintains several active-sets for push, each
corresponding with a stake bucket. Peer sampling weights are accordingly
capped by the respective bucket stake.
Since
https://github.com/solana-labs/solana/pull/20480
turbine includes all epoch staked nodes in tree construction and no
longer relies on obtaining their contact-info from gossip; and so
distinguishing between is_valid_address and is_valid_tvu_address is no
longer necessary and the latter can be removed.
dedups gossip addresses, keeping only the one with the highest weight
In order to avoid traffic congestion or sending duplicate packets, when
sampling gossip nodes if several nodes have the same gossip address
(because they are behind a relayer or whatever), they need to be
deduplicated into one.
As described here:
https://github.com/solana-labs/solana/issues/28642#issuecomment-1337449607
current gossip pruning code fails to maintain spanning trees across
cluster.
This commit instead implements a pruning code based on timeliness of
delivered messages. If a messages is delivered timely enough (in terms
of number of duplicates already observed for that value), it counts
towards the respective node's score. Once there are enough many CRDS
upserts from a specific origin, redundant nodes are pruned based on the
tracked score.
Since the pruning leaves some configurable redundancy and the scores are
reset frequently, it should better tolerate active-set rotations.
The commit tracks number of times duplicates of a CRDS value is received
from gossip push. Following commits will utilize this metric to score
gossip nodes in terms of timeliness of their push messages, in order to
better pick which nodes to prune.
* Move ConnectionCache back to solana-client, and duplicate ThinClient, TpuClient there
* Dedupe thin_client modules
* Dedupe tpu_client modules
* Move TpuClient to TpuConnectionCache
* Move ThinClient to TpuConnectionCache
* Move TpuConnection and quic/udp trait implementations back to solana-client
* Remove enum_dispatch from solana-tpu-client
* Move udp-client to its own crate
* Move quic-client to its own crate