Commit Graph

147 Commits

Author SHA1 Message Date
dependabot[bot] c16bf02448
chore: bump serde from 1.0.126 to 1.0.127 (#19010)
* chore: bump serde from 1.0.126 to 1.0.127

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.126 to 1.0.127.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.126...v1.0.127)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <you@example.com>
2021-08-02 21:16:34 +00:00
behzad nouri 049fb0417f
allows sendmmsg api taking owned values (as well as references) (#18999)
Current signature of api in sendmmsg requires a slice of inner
references:
https://github.com/solana-labs/solana/blob/fe1ee4980/streamer/src/sendmmsg.rs#L130-L152

That forces the call-site to convert owned values to references even
though doing so is redundant and adds an extra level of indirection:
https://github.com/solana-labs/solana/blob/fe1ee4980/core/src/repair_service.rs#L291

This commit expands the api using AsRef and Borrow traits to allow
calling the method with owned values (as well as references like
before).
2021-07-30 20:58:49 +00:00
behzad nouri 81026f9ea5
passes through --allow-private-addr to validators in system perf tests (#18876) 2021-07-29 19:04:45 +00:00
dependabot[bot] 5f23d8530d
chore: bump lru from 0.6.5 to 0.6.6 (#18963)
Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.6.5 to 0.6.6.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.6.5...0.6.6)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-07-29 08:33:41 -06:00
behzad nouri f1198fc6d5
filters crds values in parallel when responding to gossip pull-requests (#18877)
When responding to gossip pull-requests, filter_crds_values takes a lot of time
while holding onto read-lock:
https://github.com/solana-labs/solana/blob/f51d64868/gossip/src/crds_gossip_pull.rs#L509-L566

This commit will filter-crds-values in parallel using rayon thread-pools.
2021-07-26 17:13:11 +00:00
behzad nouri d2d5f36a3c
adds validator flag to allow private ip addresses (#18850) 2021-07-23 15:25:03 +00:00
carllin 588c0464b8
Add sampling logic and DuplicateSlotRepairStatus module (#18721) 2021-07-21 11:15:08 -07:00
behzad nouri bbd22f06f4
implements generic lookups into gossip crds table (#18765)
This commit adds CrdsEntry trait which allows generic lookups into crds
table. For example to get ContactInfo or LowestSlot associated with a
Pubkey, the lookup code would be respectively:
   crds.get::<&ContactInfo>(pubkey)
   crds.get::<&LowestSlot>(pubkey)
2021-07-21 12:16:26 +00:00
Justin Starry 207c90bd8b
Shorten long SerializeWith type paths in abi digest (#18734) 2021-07-20 08:59:50 -05:00
behzad nouri 8da261cf5c
locks crds only once in ClusterInfo::repair_peers (#18752)
ClusterInfo::repair_peers locks crds table twice, and shows performance
regression if the RwLock is not reader-preferred:
https://github.com/solana-labs/solana/blob/269028360/gossip/src/cluster_info.rs#L1188-L1210
2021-07-18 16:55:58 +00:00
behzad nouri e316586516 excludes private ip addresses 2021-07-16 20:05:48 -06:00
Jeff Biseda ae5ad5cf9b
sendmmsg cleanup #18589
Rationalize usage of sendmmsg(2). Skip packets which failed to send and track failures.
2021-07-16 14:36:49 -07:00
Brian Anderson 37ee0b5599
Eliminate doc warnings and fix some markdown (#18566)
* Fix link target in doc comment

* Fix formatting of log examples in process_instruction

* Fix doc markdown in solana-gossip

* Fix doc markdown in solana-runtime

* Escape square braces in doc comments to avoid warnings

* Surround 'account references' doc items in code spans to avoid warnings

* Fix code block in loader_upgradeable_instruction

* Fix doctest for loader_upgradable_instruction
2021-07-16 00:40:07 +00:00
behzad nouri cf31afdd6a
makes CrdsGossip thread-safe (#18615) 2021-07-14 22:27:17 +00:00
sakridge 7f2254225e
Move entry/poh to own crate to speed up poh bench build (#18225) 2021-07-14 14:16:29 +02:00
behzad nouri c90af3cd63
removes id from push_lowest_slot args (#18645)
push_lowest_slot cannot sign the new crds-value unless the id (pubkey)
argument passed-in is the same pubkey as in ClusterInfo::keypair(), in
which case the id argument is redundant:
https://github.com/solana-labs/solana/blob/bb41cf346/gossip/src/cluster_info.rs#L824-L845

Additionally, the lookup is done with self.id(), but insert is done with
the id argument, which is logically a bug.
2021-07-13 22:32:59 +00:00
behzad nouri 90f8cf0920
makes CrdsGossipPush thread-safe (#18581) 2021-07-13 14:04:25 +00:00
behzad nouri e7a1f2c9b0
makes CrdsGossipPull thread-safe (#18578) 2021-07-11 15:32:10 +00:00
carllin 175083c4c1
Add updated duplicate broadcast test (#18506) 2021-07-10 22:22:07 -07:00
behzad nouri 918b5c28b2
removes redundant (mutable) self receivers (#18574) 2021-07-10 22:16:33 +00:00
behzad nouri fd9c10c2e2
adds a generic implementation of Gossip{Read,Write}Lock (#18559) 2021-07-10 14:13:52 +00:00
behzad nouri 4e1333fbe6
removes id and shred_version from CrdsGossip (#18505)
ClusterInfo is the gateway to CrdsGossip function calls, and it already
has node's pubkey and shred version (full ContactInfo and Keypair in
fact).
Duplicating these data in CrdsGossip adds redundancy and possibility for
bugs should they not be consistent with ClusterInfo.
2021-07-09 13:10:08 +00:00
behzad nouri 27cc7577a1
skips process_push_message for local messages (#18493)
received_cache is not relevant for local messages, and does not need to
be updated:
https://github.com/solana-labs/solana/blob/92c5cdab6/gossip/src/crds_gossip_push.rs#L166-L189
2021-07-09 01:42:13 +00:00
Michael Vines 1e0942e900 Rename ClusterInfo::send_vote to ClusterInfo::send_transaction 2021-07-07 15:51:14 -07:00
Justin Starry 92c5cdab62
Fix cargo check (#18499) 2021-07-07 14:21:08 -05:00
behzad nouri dba42c57b4
implements an unbiased weighted shuffle using binary indexed tree (#18343)
Current implementation of weighted_shuffle:
https://github.com/solana-labs/solana/blob/b08f8bd1b/gossip/src/weighted_shuffle.rs#L11-L37
uses a heuristic which results in biased samples.

For example, if the weights are [1, 10, 100], then the 3rd index should
come first 100 times more often than the 1st index. However,
weighted_shuffle is picking the 3rd index 200+ times more often than the
1st index, showing a disproportional bias in favor of higher weights.

This commit implements weighted shuffle using binary indexed tree to
maintain cumulative sum of weights while sampling. The resulting samples
are demonstrably unbiased and precisely proportional to the weights.

Additionally the iterator interface allows to skip computations when
not all indices are processed.

Of the use cases of weighted_shuffle, changing turbine code requires
feature-gating to keep the cluster in sync. That is not updated in
this commit, but can be done together with future updates to turbine.
2021-07-07 14:14:43 +00:00
behzad nouri 04787be8b1
encapsulates turbine peers computations of broadcast & retransmit stages (#18238)
Broadcast stage and retransmit stage should arrange nodes on turbine
broadcast tree in exactly same order. Additionally any changes to this
ordering (e.g. updating how unstaked nodes are handled) requires feature
gating to keep the cluster in sync.

Current implementation is scattered out over several public methods and
exposes too much of implementation details (e.g. usize indices into
peers vector) which makes code changes and checking for feature
activations more difficult.

This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
2021-07-07 00:35:25 +00:00
Michael Vines c17451ca73 Acquire instance read lock once 2021-07-01 17:50:04 -07:00
Michael Vines db3a9ae7fb Fully replace NodeInstance 2021-07-01 17:50:04 -07:00
Michael Vines 71efac46cb Hoist keypair() out of some loops 2021-07-01 17:50:04 -07:00
Michael Vines b6792a3328 Add ability to change the validator identity at runtime 2021-07-01 17:50:04 -07:00
Michael Vines bf157506e8 Remove id ref 2021-07-01 17:50:04 -07:00
Ashwin Sekar f4fb5de545
Consider all peers as potential candidates during pull-request in case of offline nodes (#18333)
* Try all peers during pull-request in case of offline nodes

* fix clippy err
2021-07-01 12:00:10 -07:00
dependabot[bot] 78968d132f
chore: bump log from 0.4.11 to 0.4.14 (#18323)
* chore: bump log from 0.4.11 to 0.4.14

Bumps [log](https://github.com/rust-lang/log) from 0.4.11 to 0.4.14.
- [Release notes](https://github.com/rust-lang/log/releases)
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/log/compare/0.4.11...0.4.14)

---
updated-dependencies:
- dependency-name: log
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Make version consistent

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-06-30 18:32:01 +00:00
dependabot[bot] e9165232ef
chore: bump indexmap from 1.6.2 to 1.7.0 (#18322)
* chore: bump indexmap from 1.6.2 to 1.7.0

Bumps [indexmap](https://github.com/bluss/indexmap) from 1.6.2 to 1.7.0.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Commits](https://github.com/bluss/indexmap/compare/1.6.2...1.7.0)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2021-06-30 09:25:01 -06:00
behzad nouri 9d983a34a0
debug logs when crds table trim failed (#18307)
reports of this error being possibly spammy:
https://discord.com/channels/428295358100013066/689412830075551748/859441080054710293

The commit changes the log level to debug.
Additionally adding a new metric to understand the frequency of this error.
2021-06-29 19:39:46 +00:00
behzad nouri d7b8329b45
removes repeated calls to ClusterInfo::id in iterators and contact-info clone (#18174)
Calling ClusterInfo::id repeatedly in for loops or iterators is
inefficient, because it acquires a lock on ClusterInfo.my_contact_info,
and clones the entire contact-info.
2021-06-23 16:30:14 +00:00
behzad nouri 69a5f0e6cd
filters crds values obtained through gossip by their shred version (#18072)
filter_by_shred_version does not check the shred-version of the owner of
the crds-value. It only checks the shred-version of the node which is
relaying the value:
https://github.com/solana-labs/solana/blob/5cc073420/gossip/src/cluster_info.rs#L2274-L2289

So crds-values with different shred versions can still pass through this
function as long as they are relayed by a node with matching shred
version; and so, a single node can bridge different shred values
through-out the cluster.
2021-06-23 14:16:05 +00:00
dependabot[bot] 2156712768
chore: bump lru from 0.6.1 to 0.6.5 (#18138)
Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.6.1 to 0.6.5.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.6.1...0.6.5)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-22 17:07:45 +00:00
Michael Vines 84b9de8c18 Shredder no longer holds a keypair 2021-06-21 21:29:52 -07:00
Michael Vines 553fc210f5 Remove duplicated id field 2021-06-21 21:29:52 -07:00
behzad nouri 598093b5db adds shred-version to ip-echo-server response
When starting a validator, the node initially joins gossip with
shred_verison = 0, until it adopts the entrypoint's shred-version:
https://github.com/solana-labs/solana/blob/9b182f408/validator/src/main.rs#L417

Depending on the load on the entrypoint, this adopting entrypoint
shred-version through gossip sometimes becomes very slow, and causes
several problems in gossip because we have to partially support
shred_version == 0 which is a source of leaking crds values from one
cluster to another. e.g. see
https://github.com/solana-labs/solana/pull/17899
and the other linked issues there.

In order to remove shred_version == 0 from gossip, this commit adds
shred-version to ip-echo-server response. Once the entrypoints are
updated, on validator start-up, if --expected_shred_version is not
specified we will obtain shred-version from the entrypoint using
ip-echo-server.
2021-06-21 19:37:16 +00:00
dependabot[bot] d458fac2ff
chore: bump bincode from 1.3.1 to 1.3.3 (#18087)
* chore: bump bincode from 1.3.1 to 1.3.3

Bumps [bincode](https://github.com/servo/bincode) from 1.3.1 to 1.3.3.
- [Release notes](https://github.com/servo/bincode/releases)
- [Commits](https://github.com/servo/bincode/compare/v1.3.1...v1.3.3)

---
updated-dependencies:
- dependency-name: bincode
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2021-06-21 16:06:31 +00:00
Alexander Meißner 789f33e8db chore: cargo fmt 2021-06-18 10:42:46 -07:00
Alexander Meißner 6514096a67 chore: cargo +nightly clippy --fix -Z unstable-options 2021-06-18 10:42:46 -07:00
behzad nouri 5a99fa3790
adds mapping from nodes pubkeys to their shred-version (#17940)
Crds values of nodes with different shred versions are creeping into
gossip table resulting in runtime issues as the one addressed in:
https://github.com/solana-labs/solana/pull/17899

This commit works towards enforcing more checks and filtering based on
shred version by adding necessary mapping and api to gossip table.
Once populated, pubkey->shred-version mapping persists as long as there
are any values associated with the pubkey.
2021-06-18 15:56:04 +00:00
dependabot[bot] a0872232d3
chore: bump itertools from 0.9.0 to 0.10.1 (#17929)
* chore: bump itertools from 0.9.0 to 0.10.1

Bumps [itertools](https://github.com/rust-itertools/itertools) from 0.9.0 to 0.10.1.
- [Release notes](https://github.com/rust-itertools/itertools/releases)
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.9.0...v0.10.1)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix versions

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-06-14 18:32:20 +00:00
sakridge eeee75c5be
Don't use pinned memory when unnecessary (#17832)
Reports of excessive GPU memory usage and errors
from cudaHostRegister. There are some cases where pinning is
not required.
2021-06-14 16:10:04 +02:00
behzad nouri cca46308bc
short cuts expiration check if origin's contact-info is still valid (#17918)
Crds::find_old_labels can skip checking values timestamps if the
origin's contact info hasn't expired yet:
https://github.com/solana-labs/solana/blob/985280ec0/gossip/src/crds.rs#L394-L408
2021-06-13 19:47:07 +00:00
behzad nouri 985280ec0b
excludes epoch-slots from nodes with unknown or different shred version (#17899)
Inspecting TDS gossip table shows that crds values of nodes with
different shred-versions are creeping in. Their epoch-slots are
accumulated in ClusterSlots causing bogus slots very far from current
root which are not purged and so cause ClusterSlots keep consuming more
memory:
https://github.com/solana-labs/solana/issues/17789
https://github.com/solana-labs/solana/issues/14366#issuecomment-769896036
https://github.com/solana-labs/solana/issues/14366#issuecomment-832754654

This commit updates ClusterInfo::get_epoch_slots, and discards entries
from nodes with unknown or different shred-version.

Follow up commits will patch gossip not to waste bandwidth and memory
over crds values of nodes with different shred-version.
2021-06-13 14:08:08 +00:00