behzad nouri
90f8cf0920
makes CrdsGossipPush thread-safe ( #18581 )
2021-07-13 14:04:25 +00:00
behzad nouri
e7a1f2c9b0
makes CrdsGossipPull thread-safe ( #18578 )
2021-07-11 15:32:10 +00:00
carllin
175083c4c1
Add updated duplicate broadcast test ( #18506 )
2021-07-10 22:22:07 -07:00
behzad nouri
918b5c28b2
removes redundant (mutable) self receivers ( #18574 )
2021-07-10 22:16:33 +00:00
behzad nouri
fd9c10c2e2
adds a generic implementation of Gossip{Read,Write}Lock ( #18559 )
2021-07-10 14:13:52 +00:00
behzad nouri
4e1333fbe6
removes id and shred_version from CrdsGossip ( #18505 )
...
ClusterInfo is the gateway to CrdsGossip function calls, and it already
has node's pubkey and shred version (full ContactInfo and Keypair in
fact).
Duplicating these data in CrdsGossip adds redundancy and possibility for
bugs should they not be consistent with ClusterInfo.
2021-07-09 13:10:08 +00:00
behzad nouri
27cc7577a1
skips process_push_message for local messages ( #18493 )
...
received_cache is not relevant for local messages, and does not need to
be updated:
https://github.com/solana-labs/solana/blob/92c5cdab6/gossip/src/crds_gossip_push.rs#L166-L189
2021-07-09 01:42:13 +00:00
Michael Vines
1e0942e900
Rename ClusterInfo::send_vote to ClusterInfo::send_transaction
2021-07-07 15:51:14 -07:00
Justin Starry
92c5cdab62
Fix cargo check ( #18499 )
2021-07-07 14:21:08 -05:00
behzad nouri
dba42c57b4
implements an unbiased weighted shuffle using binary indexed tree ( #18343 )
...
Current implementation of weighted_shuffle:
https://github.com/solana-labs/solana/blob/b08f8bd1b/gossip/src/weighted_shuffle.rs#L11-L37
uses a heuristic which results in biased samples.
For example, if the weights are [1, 10, 100], then the 3rd index should
come first 100 times more often than the 1st index. However,
weighted_shuffle is picking the 3rd index 200+ times more often than the
1st index, showing a disproportional bias in favor of higher weights.
This commit implements weighted shuffle using binary indexed tree to
maintain cumulative sum of weights while sampling. The resulting samples
are demonstrably unbiased and precisely proportional to the weights.
Additionally, the iterator interface allows skipping computations when
not all indices are processed.
Of the use cases of weighted_shuffle, changing turbine code requires
feature-gating to keep the cluster in sync. That is not updated in
this commit, but can be done together with future updates to turbine.
2021-07-07 14:14:43 +00:00
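The binary-indexed-tree approach described in #18343 above can be sketched as follows. This is a minimal illustration, not the code from the commit: the names `Fenwick`, `WeightedShuffle`, and `weighted_shuffle` are hypothetical, randomness is supplied by the caller as a closure `rand_below(bound)` that is assumed to return a uniform value in `[0, bound)`, the sum of weights is assumed to fit in a `u64`, and zero-weight indices are simply never yielded.

```rust
/// Binary indexed (Fenwick) tree of weights; `tree[0]` is unused.
struct Fenwick {
    tree: Vec<u64>,
}

impl Fenwick {
    fn new(weights: &[u64]) -> Self {
        let mut fenwick = Self { tree: vec![0; weights.len() + 1] };
        for (index, &weight) in weights.iter().enumerate() {
            let mut k = index + 1;
            while k < fenwick.tree.len() {
                fenwick.tree[k] += weight;
                k += k & k.wrapping_neg(); // jump by the lowest set bit
            }
        }
        fenwick
    }

    /// Subtracts `weight` at `index`, i.e. removes a sampled item.
    fn remove(&mut self, index: usize, weight: u64) {
        let mut k = index + 1;
        while k < self.tree.len() {
            self.tree[k] -= weight;
            k += k & k.wrapping_neg();
        }
    }

    /// Smallest 0-based index whose cumulative weight exceeds `target`.
    /// Requires `target` to be less than the current total weight.
    fn search(&self, mut target: u64) -> usize {
        let n = self.tree.len() - 1;
        let mut pos = 0;
        let mut step = n.next_power_of_two();
        while step > 0 {
            let next = pos + step;
            if next <= n && self.tree[next] <= target {
                target -= self.tree[next];
                pos = next;
            }
            step >>= 1;
        }
        pos
    }
}

/// Iterator yielding indices, each drawn in proportion to its weight
/// among the indices that have not been yielded yet.
pub struct WeightedShuffle<R: FnMut(u64) -> u64> {
    weights: Vec<u64>,
    tree: Fenwick,
    total: u64,
    rand_below: R, // assumption: uniform in [0, bound)
}

pub fn weighted_shuffle<R: FnMut(u64) -> u64>(
    weights: &[u64],
    rand_below: R,
) -> WeightedShuffle<R> {
    let total: u64 = weights.iter().sum();
    WeightedShuffle {
        weights: weights.to_vec(),
        tree: Fenwick::new(weights),
        total,
        rand_below,
    }
}

impl<R: FnMut(u64) -> u64> Iterator for WeightedShuffle<R> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.total == 0 {
            return None; // only zero-weight (or no) indices remain
        }
        let target = (self.rand_below)(self.total);
        let index = self.tree.search(target);
        let weight = self.weights[index];
        self.tree.remove(index, weight);
        self.total -= weight;
        Some(index)
    }
}
```

Each call to `next` costs O(log n), and because the type is an iterator, a caller that only needs the first few indices (for example via `.take(3)`) does not pay for the rest.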
behzad nouri
04787be8b1
encapsulates turbine peers computations of broadcast & retransmit stages ( #18238 )
...
Broadcast stage and retransmit stage should arrange nodes on the turbine
broadcast tree in exactly the same order. Additionally, any change to this
ordering (e.g. updating how unstaked nodes are handled) requires feature
gating to keep the cluster in sync.
The current implementation is scattered over several public methods and
exposes too many implementation details (e.g. usize indices into the
peers vector), which makes code changes and checking for feature
activations more difficult.
This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
2021-07-07 00:35:25 +00:00
Michael Vines
c17451ca73
Acquire instance read lock once
2021-07-01 17:50:04 -07:00
Michael Vines
db3a9ae7fb
Fully replace NodeInstance
2021-07-01 17:50:04 -07:00
Michael Vines
71efac46cb
Hoist keypair() out of some loops
2021-07-01 17:50:04 -07:00
Michael Vines
b6792a3328
Add ability to change the validator identity at runtime
2021-07-01 17:50:04 -07:00
Michael Vines
bf157506e8
Remove id ref
2021-07-01 17:50:04 -07:00
Ashwin Sekar
f4fb5de545
Consider all peers as potential candidates during pull-request in case of offline nodes ( #18333 )
...
* Try all peers during pull-request in case of offline nodes
* fix clippy err
2021-07-01 12:00:10 -07:00
dependabot[bot]
78968d132f
chore: bump log from 0.4.11 to 0.4.14 ( #18323 )
...
* chore: bump log from 0.4.11 to 0.4.14
Bumps [log](https://github.com/rust-lang/log ) from 0.4.11 to 0.4.14.
- [Release notes](https://github.com/rust-lang/log/releases )
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md )
- [Commits](https://github.com/rust-lang/log/compare/0.4.11...0.4.14 )
---
updated-dependencies:
- dependency-name: log
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* Make version consistent
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-06-30 18:32:01 +00:00
dependabot[bot]
e9165232ef
chore: bump indexmap from 1.6.2 to 1.7.0 ( #18322 )
...
* chore: bump indexmap from 1.6.2 to 1.7.0
Bumps [indexmap](https://github.com/bluss/indexmap ) from 1.6.2 to 1.7.0.
- [Release notes](https://github.com/bluss/indexmap/releases )
- [Commits](https://github.com/bluss/indexmap/compare/1.6.2...1.7.0 )
---
updated-dependencies:
- dependency-name: indexmap
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* [auto-commit] Update all Cargo lock files
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2021-06-30 09:25:01 -06:00
behzad nouri
9d983a34a0
debug logs when crds table trim failed ( #18307 )
...
reports of this error being possibly spammy:
https://discord.com/channels/428295358100013066/689412830075551748/859441080054710293
The commit changes the log level to debug.
It also adds a new metric to track the frequency of this error.
2021-06-29 19:39:46 +00:00
behzad nouri
d7b8329b45
removes repeated calls to ClusterInfo::id in iterators and contact-info clone ( #18174 )
...
Calling ClusterInfo::id repeatedly in for loops or iterators is
inefficient, because it acquires a lock on ClusterInfo.my_contact_info,
and clones the entire contact-info.
2021-06-23 16:30:14 +00:00
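The cost described in #18174 above can be illustrated with a small sketch. All types here are hypothetical stand-ins (a `u64` in place of a pubkey, a trimmed-down `ContactInfo`), and the `lock_count` field exists only to make the number of lock acquisitions observable; none of it is taken from the actual ClusterInfo code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

// Hypothetical, simplified stand-in for the real contact-info type.
#[derive(Clone)]
pub struct ContactInfo {
    pub id: u64,
    pub addresses: Vec<String>,
}

pub struct ClusterInfo {
    my_contact_info: RwLock<ContactInfo>,
    lock_count: AtomicUsize, // instrumentation for this example only
}

impl ClusterInfo {
    pub fn new(id: u64) -> Self {
        Self {
            my_contact_info: RwLock::new(ContactInfo { id, addresses: Vec::new() }),
            lock_count: AtomicUsize::new(0),
        }
    }

    /// Acquires the lock and clones the whole contact-info to read one field,
    /// which is the pattern the commit message calls inefficient in loops.
    pub fn id(&self) -> u64 {
        self.lock_count.fetch_add(1, Ordering::Relaxed);
        self.my_contact_info.read().unwrap().clone().id
    }

    pub fn lock_count(&self) -> usize {
        self.lock_count.load(Ordering::Relaxed)
    }
}

/// Counts peers other than this node, reading the id once before iterating.
pub fn count_other_peers(cluster_info: &ClusterInfo, peers: &[u64]) -> usize {
    let self_id = cluster_info.id(); // hoisted: one lock and one clone in total
    peers.iter().filter(|&&peer| peer != self_id).count()
}
```

Calling `cluster_info.id()` inside the `filter` closure instead would take the lock and clone once per peer.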
behzad nouri
69a5f0e6cd
filters crds values obtained through gossip by their shred version ( #18072 )
...
filter_by_shred_version does not check the shred-version of the owner of
the crds-value. It only checks the shred-version of the node which is
relaying the value:
https://github.com/solana-labs/solana/blob/5cc073420/gossip/src/cluster_info.rs#L2274-L2289
So crds-values with different shred versions can still pass through this
function as long as they are relayed by a node with a matching shred
version; and so, a single node can bridge different shred versions
throughout the cluster.
2021-06-23 14:16:05 +00:00
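A minimal sketch of filtering by the owner's shred version rather than the relayer's, as motivated in #18072 above. The types and the function name are hypothetical (a `u64` stands in for a pubkey), and dropping values whose owner has an unknown shred version is an assumption of this sketch, not a statement about what the commit does.

```rust
use std::collections::HashMap;

pub type Pubkey = u64; // stand-in for a real public key type

pub struct CrdsValue {
    pub owner: Pubkey, // node that originated the value, not the relayer
    pub label: &'static str,
}

/// Keeps only values whose *owner* is known to have our shred version.
/// Which node relayed the value plays no role in the decision.
pub fn filter_by_owner_shred_version(
    values: Vec<CrdsValue>,
    shred_versions: &HashMap<Pubkey, u16>, // owner pubkey -> shred version
    self_shred_version: u16,
) -> Vec<CrdsValue> {
    values
        .into_iter()
        .filter(|value| shred_versions.get(&value.owner) == Some(&self_shred_version))
        .collect()
}
```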
dependabot[bot]
2156712768
chore: bump lru from 0.6.1 to 0.6.5 ( #18138 )
...
Bumps [lru](https://github.com/jeromefroe/lru-rs ) from 0.6.1 to 0.6.5.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases )
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md )
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.6.1...0.6.5 )
---
updated-dependencies:
- dependency-name: lru
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-22 17:07:45 +00:00
Michael Vines
84b9de8c18
Shredder no longer holds a keypair
2021-06-21 21:29:52 -07:00
Michael Vines
553fc210f5
Remove duplicated id field
2021-06-21 21:29:52 -07:00
behzad nouri
598093b5db
adds shred-version to ip-echo-server response
...
When starting a validator, the node initially joins gossip with
shred_version = 0, until it adopts the entrypoint's shred-version:
https://github.com/solana-labs/solana/blob/9b182f408/validator/src/main.rs#L417
Depending on the load on the entrypoint, adopting the entrypoint's
shred-version through gossip is sometimes very slow, and causes
several problems in gossip because we have to partially support
shred_version == 0, which is a source of leaking crds values from one
cluster to another. e.g. see
https://github.com/solana-labs/solana/pull/17899
and the other linked issues there.
In order to remove shred_version == 0 from gossip, this commit adds
shred-version to ip-echo-server response. Once the entrypoints are
updated, on validator start-up, if --expected_shred_version is not
specified we will obtain shred-version from the entrypoint using
ip-echo-server.
2021-06-21 19:37:16 +00:00
dependabot[bot]
d458fac2ff
chore: bump bincode from 1.3.1 to 1.3.3 ( #18087 )
...
* chore: bump bincode from 1.3.1 to 1.3.3
Bumps [bincode](https://github.com/servo/bincode ) from 1.3.1 to 1.3.3.
- [Release notes](https://github.com/servo/bincode/releases )
- [Commits](https://github.com/servo/bincode/compare/v1.3.1...v1.3.3 )
---
updated-dependencies:
- dependency-name: bincode
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* [auto-commit] Update all Cargo lock files
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2021-06-21 16:06:31 +00:00
Alexander Meißner
789f33e8db
chore: cargo fmt
2021-06-18 10:42:46 -07:00
Alexander Meißner
6514096a67
chore: cargo +nightly clippy --fix -Z unstable-options
2021-06-18 10:42:46 -07:00
behzad nouri
5a99fa3790
adds mapping from nodes pubkeys to their shred-version ( #17940 )
...
Crds values of nodes with different shred versions are creeping into
the gossip table, resulting in runtime issues such as the one addressed in:
https://github.com/solana-labs/solana/pull/17899
This commit works towards enforcing more checks and filtering based on
shred version by adding necessary mapping and api to gossip table.
Once populated, pubkey->shred-version mapping persists as long as there
are any values associated with the pubkey.
2021-06-18 15:56:04 +00:00
dependabot[bot]
a0872232d3
chore: bump itertools from 0.9.0 to 0.10.1 ( #17929 )
...
* chore: bump itertools from 0.9.0 to 0.10.1
Bumps [itertools](https://github.com/rust-itertools/itertools ) from 0.9.0 to 0.10.1.
- [Release notes](https://github.com/rust-itertools/itertools/releases )
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md )
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.9.0...v0.10.1 )
---
updated-dependencies:
- dependency-name: itertools
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* Fix versions
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-06-14 18:32:20 +00:00
sakridge
eeee75c5be
Don't use pinned memory when unnecessary ( #17832 )
...
There were reports of excessive GPU memory usage and errors
from cudaHostRegister. There are some cases where pinning is
not required.
2021-06-14 16:10:04 +02:00
behzad nouri
cca46308bc
short cuts expiration check if origin's contact-info is still valid ( #17918 )
...
Crds::find_old_labels can skip checking values' timestamps if the
origin's contact info hasn't expired yet:
https://github.com/solana-labs/solana/blob/985280ec0/gossip/src/crds.rs#L394-L408
2021-06-13 19:47:07 +00:00
behzad nouri
985280ec0b
excludes epoch-slots from nodes with unknown or different shred version ( #17899 )
...
Inspecting TDS gossip table shows that crds values of nodes with
different shred-versions are creeping in. Their epoch-slots are
accumulated in ClusterSlots, causing bogus slots very far from the current
root which are not purged and so cause ClusterSlots to keep consuming more
memory:
https://github.com/solana-labs/solana/issues/17789
https://github.com/solana-labs/solana/issues/14366#issuecomment-769896036
https://github.com/solana-labs/solana/issues/14366#issuecomment-832754654
This commit updates ClusterInfo::get_epoch_slots, and discards entries
from nodes with unknown or different shred-version.
Follow up commits will patch gossip not to waste bandwidth and memory
over crds values of nodes with different shred-version.
2021-06-13 14:08:08 +00:00
dependabot[bot]
2aa7df23b5
chore: bump indexmap from 1.5.1 to 1.6.2 ( #17884 )
...
* chore: bump indexmap from 1.5.1 to 1.6.2
Bumps [indexmap](https://github.com/bluss/indexmap ) from 1.5.1 to 1.6.2.
- [Release notes](https://github.com/bluss/indexmap/releases )
- [Commits](https://github.com/bluss/indexmap/compare/1.5.1...1.6.2 )
---
updated-dependencies:
- dependency-name: indexmap
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* [auto-commit] Update all Cargo lock files
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2021-06-11 09:50:09 -06:00
dependabot[bot]
f08ed1eb2d
chore: bump rayon from 1.5.0 to 1.5.1 ( #17869 )
...
* chore: bump rayon from 1.5.0 to 1.5.1
Bumps [rayon](https://github.com/rayon-rs/rayon ) from 1.5.0 to 1.5.1.
- [Release notes](https://github.com/rayon-rs/rayon/releases )
- [Changelog](https://github.com/rayon-rs/rayon/blob/master/RELEASES.md )
- [Commits](https://github.com/rayon-rs/rayon/compare/rayon-core-v1.5.0...v1.5.1 )
---
updated-dependencies:
- dependency-name: rayon
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* [auto-commit] Update all Cargo lock files
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <you@example.com>
2021-06-10 13:18:47 -06:00
dependabot[bot]
9a2ca8dd2f
chore: bump rustc_version from 0.2.3 to 0.4.0 ( #17854 )
...
* chore: bump rustc_version from 0.2.3 to 0.4.0
Bumps [rustc_version](https://github.com/Kimundi/rustc-version-rs ) from 0.2.3 to 0.4.0.
- [Release notes](https://github.com/Kimundi/rustc-version-rs/releases )
- [Commits](https://github.com/Kimundi/rustc-version-rs/compare/v0.2.3...v0.4.0 )
---
updated-dependencies:
- dependency-name: rustc_version
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* Make versions consistent
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-06-09 16:57:39 +00:00
behzad nouri
cab30e2356
parallelizes gossip packets receiver with processing of requests ( #17647 )
...
Gossip packet processing is composed of two stages:
* The first is consuming packets from the socket, deserializing,
sanitizing and verifying them:
https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2510-L2521
* The second is actually processing the requests/messages:
https://github.com/solana-labs/solana/blob/7f0349b29/gossip/src/cluster_info.rs#L2585-L2605
The former does not acquire any locks and so can be parallelized with
the latter, allowing better pipelining properties and lower latency in
responding to gossip requests or propagating messages.
2021-06-07 18:36:06 +00:00
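The two-stage split described in #17647 above can be sketched with two threads connected by a channel. This is only an illustration of the pipelining idea: `run_pipeline` is a hypothetical name, the "verification" is a placeholder that drops empty packets, and the "processing" just records packet lengths.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs a two-stage pipeline over `raw_packets` and returns what stage two handled.
pub fn run_pipeline(raw_packets: Vec<Vec<u8>>) -> Vec<usize> {
    let (sender, receiver) = mpsc::channel::<Vec<u8>>();

    // Stage 1: consume, sanitize and verify packets. No shared locks are needed.
    let ingest = thread::spawn(move || {
        for packet in raw_packets {
            if packet.is_empty() {
                continue; // placeholder for deserialize/sanitize/verify failures
            }
            if sender.send(packet).is_err() {
                break; // stage 2 has gone away
            }
        }
        // `sender` is dropped here, which closes the channel.
    });

    // Stage 2: process verified packets (the lock-taking part in a real system).
    let process = thread::spawn(move || {
        let mut handled = Vec::new();
        for packet in receiver {
            handled.push(packet.len());
        }
        handled
    });

    ingest.join().unwrap();
    process.join().unwrap()
}
```

Because stage one keeps reading while stage two is busy, a slow request no longer stalls intake of the next packets.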
behzad nouri
60b0a13444
writes epoch-slots to crds table synchronously ( #17719 )
...
epoch-slots may be overwritten before they are written to crds table:
https://github.com/solana-labs/solana/issues/17711
This commit writes new epoch-slots to crds table synchronously with
push_epoch_slots. The function is still not thread-safe as commented in
the code; however, currently only one thread is invoking this code.
2021-06-04 13:56:51 +00:00
behzad nouri
be957f25c9
adds fallback logic if retransmit multicast fails ( #17714 )
...
In retransmit-stage, based on the packet.meta.seed and resulting
children/neighbors, each packet is sent to a different set of peers:
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L421-L457
However, current code errors out as soon as a multicast call fails,
which will skip all the remaining packets:
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470
This can exacerbate packet loss in turbine.
This commit:
* keeps iterating over retransmit packets in the for loop even if some
intermediate sends fail.
* adds a fallback to UdpSocket::send_to if multicast fails.
Recent discord chat:
https://discord.com/channels/428295358100013066/689412830075551748/849530845052403733
2021-06-04 12:16:37 +00:00
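The two changes listed in #17714 above (keep looping after a failed send, and fall back to per-peer sends) can be sketched as below. The `SendTo` trait, `retransmit_all`, and the `batch_send` closure are hypothetical names introduced for this example; `batch_send` stands in for whatever multicast or batched-send routine is in use, and only `UdpSocket::send_to` is a real standard-library call.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Minimal sending interface so the loop can be exercised without a network.
pub trait SendTo {
    fn send_to(&self, data: &[u8], addr: &SocketAddr) -> io::Result<usize>;
}

impl SendTo for UdpSocket {
    fn send_to(&self, data: &[u8], addr: &SocketAddr) -> io::Result<usize> {
        UdpSocket::send_to(self, data, addr)
    }
}

/// Sends every packet to its own set of peers and returns the number of
/// per-peer sends that still failed after the fallback.
pub fn retransmit_all<S, B>(
    socket: &S,
    packets: &[(Vec<u8>, Vec<SocketAddr>)],
    mut batch_send: B, // hypothetical stand-in for a multicast send
) -> usize
where
    S: SendTo,
    B: FnMut(&S, &[u8], &[SocketAddr]) -> io::Result<()>,
{
    let mut failed_sends = 0;
    for (data, peers) in packets {
        if batch_send(socket, data.as_slice(), peers.as_slice()).is_ok() {
            continue;
        }
        // Fallback: one send_to per peer. A failure is counted, but the
        // loop moves on so the remaining packets are not skipped.
        for peer in peers {
            if socket.send_to(data.as_slice(), peer).is_err() {
                failed_sends += 1;
            }
        }
    }
    failed_sends
}
```

Returning a failure count instead of the first error is a design choice of this sketch; it lets the caller report the failures without cutting the loop short.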
Tyera Eulberg
3a647c4bea
Rename ValidatorExit and move to sdk ( #17728 )
2021-06-04 03:06:13 +00:00
dependabot[bot]
3670435db4
chore: bump serial_test from 0.4.0 to 0.5.1 ( #17705 )
...
Bumps [serial_test](https://github.com/palfrey/serial_test ) from 0.4.0 to 0.5.1.
- [Release notes](https://github.com/palfrey/serial_test/releases )
- [Commits](https://github.com/palfrey/serial_test/compare/v0.4.0...v0.5.1 )
---
updated-dependencies:
- dependency-name: serial_test
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-03 10:07:56 -06:00
dependabot[bot]
ab0f4ff835
Bump serde from 1.0.122 to 1.0.126 ( #17618 )
...
* Bump serde from 1.0.122 to 1.0.126
Bumps [serde](https://github.com/serde-rs/serde ) from 1.0.122 to 1.0.126.
- [Release notes](https://github.com/serde-rs/serde/releases )
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.122...v1.0.126 )
Signed-off-by: dependabot[bot] <support@github.com>
* [auto-commit] Update all Cargo lock files
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <anatoly+githubjenkins@solana.io>
2021-05-31 22:41:25 +00:00
behzad nouri
7cf6e66ddd
excludes caller's crds values from pull responses ( #17542 )
...
If the crds entry belongs to the caller itself, then the caller will
always have the more recent version of it, regardless of it being
filtered out by the bloom filter or not.
The exception is node-instance types which are meant to detect duplicate
running instances, and those are exempted.
2021-05-28 13:19:14 +00:00
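The rule in #17542 above, including the node-instance exemption, amounts to a simple filter. The sketch below uses hypothetical, heavily simplified types (a `u64` for the pubkey and a three-variant `CrdsData` enum) and leaves out the bloom-filter check entirely.

```rust
// Hypothetical, reduced set of crds value kinds for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CrdsData {
    ContactInfo,
    Vote,
    NodeInstance,
}

pub struct CrdsValue {
    pub owner: u64, // stand-in for the owner's pubkey
    pub data: CrdsData,
}

/// Values eligible for a pull response to `caller`: everything the caller
/// does not own, plus node-instance values even when the caller owns them,
/// since those exist to detect duplicate running instances.
pub fn pull_response_candidates(table: &[CrdsValue], caller: u64) -> Vec<&CrdsValue> {
    table
        .iter()
        .filter(|value| value.owner != caller || matches!(value.data, CrdsData::NodeInstance))
        .collect()
}
```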
Michael Vines
8eab0e8602
Bump version to v1.8.0 ( #17541 )
2021-05-27 08:51:53 -07:00
Tyera Eulberg
9a5330b7eb
Move gossip modules into solana-gossip crate ( #17352 )
...
* Move gossip modules to solana-gossip
* Update Protocol abi digest due to move
* Move gossip benches and hook up CI
* Remove unneeded Result entries
* Single use statements
2021-05-26 09:15:46 -06:00
behzad nouri
cf1acfb021
uses Duration type for gossip discover timeout
2021-05-22 19:17:36 +00:00
Michael Vines
a911ae00ba
clippy
2021-04-18 20:55:02 -07:00
Michael Vines
a2eb655322
=1.7.0
2021-03-16 07:51:07 +00:00
Michael Vines
0c9ca5522c
Bump version to v1.7.0
2021-03-13 09:01:21 +00:00