Commit Graph

123 Commits

Author SHA1 Message Date
Michael Vines 3f4731b37f Standardize thread names
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
2022-08-20 07:49:39 -07:00
behzad nouri 3b87aa9227
reverts wide fanout in broadcast when the root node is down (#26359)
A change included in
https://github.com/solana-labs/solana/pull/20480
was that when the root node in turbine broadcast tree is down, the
leader will broadcast the shred to all nodes in the first layer.
The intention was to mitigate the impact of dead nodes on shreds
propagation, because if the root node is down, then the entire cluster
will miss out the shred.
On the other hand, if x% of stake is down, this will cause 200*x% + 1
packets/shreds ratio at the broadcast stage which might contribute to
line-rate saturation and packet drop.
To avoid this bandwidth saturation issue, this commit reverts that logic
and always broadcasts shreds from the leader only to the root node.
As before we rely on erasure codes to recover shreds lost due to staked
nodes being offline.
2022-08-16 19:40:06 +00:00
behzad nouri 88599fd760
skips shreds deserialization before retransmit (#26230)
Fully deserializing shreds in window-service before sending them to
retransmit stage adds latency to shreds propagation.
This commit instead channels through the payload and relies on only
partial deserialization of a few required fields: slot, shred-index,
shred-type.
2022-06-30 12:13:00 +00:00
behzad nouri f534b8981b
maps number of data shreds to erasure batch size (#25917)
In prepration of
https://github.com/solana-labs/solana/pull/25807
which reworks erasure batch sizes, this commit:
* adds a helper function mapping the number of data shreds to the
  erasure batch size.
* adds ProcessShredsStats to Shredder::entries_to_shreds in order to
  replace and remove entries_to_data_shreds from the public interface.
2022-06-23 13:27:54 +00:00
behzad nouri fe3c1d3d49
removes erroneous uses of &Arc<...> from broadcast-stage (#25962) 2022-06-15 13:44:24 +00:00
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
behzad nouri 895f76a93c
hides implementation details of shred from its public interface (#24563)
Working towards embedding versioning into shreds binary, so that a new
variant of shred struct can include merkle tree hashes of the erasure
set.
2022-04-25 12:43:22 +00:00
Jeff Biseda 8b66625c95
convert std::sync::mpsc to crossbeam_channel (#22264) 2022-01-11 02:44:46 -08:00
behzad nouri 65d59f4ef0
tracks erasure coding shreds' indices explicitly (#21822)
The indices for erasure coding shreds are tied to data shreds:
https://github.com/solana-labs/solana/blob/90f41fd9b/ledger/src/shred.rs#L921

However with the upcoming changes to erasure schema, there will be more
erasure coding shreds than data shreds and we can no longer infer coding
shreds indices from data shreds.

The commit adds constructs to track coding shreds indices explicitly.
2021-12-19 22:37:55 +00:00
carllin 385efae4b3
Remove need to send bank in retransmit request from ReplayStage (#21943)
* Remove need to send bank in retransmitter
2021-12-16 21:11:01 -05:00
Michael Vines b8837c04ec Reformat imports to a consistent style for imports
rustfmt.toml configuration:
  imports_granularity = "One"
  group_imports = "One"
2021-12-03 09:19:13 -08:00
behzad nouri 0c0384ec32
revises turbine peers shuffling order (#20480)
Turbine randomly shuffles cluster nodes on a broadcast tree for each
shred. This requires knowing the stakes and nodes' contact-infos (from
gossip).

However gossip is subject to partitioning and propogation delays.
Additionally unstaked nodes may join and leave the cluster at any
moment, changing the cluster view from one node to another.

This commit:
* Always arranges the unstaked nodes at the bottom of turbine broadcast
  tree.
* Staked nodes are always included regardless of if their contact-info
  is available in gossip or not.
* Uses the unbiased WeightedShuffle construct for shuffling nodes.
2021-10-14 15:09:36 +00:00
behzad nouri 6d9818b8e4
skips retransmit for shreds with unknown slot leader (#19472)
Shreds' signatures should be verified before they reach retransmit
stage, and if the leader is unknown they should fail signature check.
Therefore retransmit-stage can as well expect to know who the slot
leader is and otherwise just skip the shred.

Blockstore checking signature of recovered shreds before sending them to
retransmit stage:
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/blockstore.rs#L884-L930

Shred signature verifier:
https://github.com/solana-labs/solana/blob/4305d4b7b/core/src/sigverify_shreds.rs#L41-L57
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/sigverify_shreds.rs#L105
2021-09-01 15:44:26 +00:00
behzad nouri 1deb4add81
removes Slot from TransmitShreds (#19327)
An earlier version of the code was funneling through stakes along with
shreds to broadcast:
https://github.com/solana-labs/solana/blob/b67ffab37/core/src/broadcast_stage.rs#L127

This was changed to only slots as stakes computation was pushed further
down the pipeline in:
https://github.com/solana-labs/solana/pull/18971

However shreds themselves embody which slot they belong to. So pairing
them with slot is redundant and adds rooms for bugs should they become
inconsistent.
2021-08-20 13:48:33 +00:00
behzad nouri aa32738dd5 uses cluster-nodes cache in broadcast-stage
* Current caching mechanism does not update cluster-nodes when the epoch
  (and so epoch staked nodes) changes:
  https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344

* Additionally, the cache update has a concurrency bug in which the
  thread which does compare_and_swap may be blocked when it tries to
  obtain the write-lock on cache, while other threads will keep running
  ahead with the outdated cache (since the atomic timestamp is already
  updated).

In the new ClusterNodesCache, entries are keyed by epoch, and so if
epoch changes cluster-nodes will be recalculated. The time-to-live
eviction policy is also encapsulated and rigidly enforced.
2021-08-05 21:47:33 +00:00
behzad nouri 44b11154ca sends slots (instead of stakes) through broadcast flow
Current broadcast code is computing stakes for each slot before sending
them down the channel:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349

Since the stakes are a function of epoch the slot belongs to (and so
does not necessarily change from one slot to another), forwarding the
slot itself would allow better caching downstream.

In addition we need to invalidate the cache if the epoch changes (which
the current code does not do), and that requires to know which slot (and
so epoch) current broadcasted shreds belong to:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344
2021-08-05 21:47:33 +00:00
Jeff Washington (jwash) 14361906ca
for all tests, bank::new -> bank::new_for_tests (#19064) 2021-08-05 08:42:38 -05:00
behzad nouri 049fb0417f
allows sendmmsg api taking owned values (as well as references) (#18999)
Current signature of api in sendmmsg requires a slice of inner
references:
https://github.com/solana-labs/solana/blob/fe1ee4980/streamer/src/sendmmsg.rs#L130-L152

That forces the call-site to convert owned values to references even
though doing so is redundant and adds an extra level of indirection:
https://github.com/solana-labs/solana/blob/fe1ee4980/core/src/repair_service.rs#L291

This commit expands the api using AsRef and Borrow traits to allow
calling the method with owned values (as well as references like
before).
2021-07-30 20:58:49 +00:00
sakridge 84e78316b1
Write helper for multithread update (#18808) 2021-07-29 03:16:36 +02:00
behzad nouri d2d5f36a3c
adds validator flag to allow private ip addresses (#18850) 2021-07-23 15:25:03 +00:00
behzad nouri e316586516 excludes private ip addresses 2021-07-16 20:05:48 -06:00
Jeff Biseda ae5ad5cf9b
sendmmsg cleanup #18589
Rationalize usage of sendmmsg(2). Skip packets which failed to send and track failures.
2021-07-16 14:36:49 -07:00
sakridge 7f2254225e
Move entry/poh to own crate to speed up poh bench build (#18225) 2021-07-14 14:16:29 +02:00
carllin 175083c4c1
Add updated duplicate broadcast test (#18506) 2021-07-10 22:22:07 -07:00
jbiseda a86ced0bac
generate deterministic seeds for shreds (#17950)
* generate shred seed from leader pubkey

* clippy

* clippy

* review

* review 2

* fmt

* review

* check

* review

* cleanup

* fmt
2021-07-07 08:21:12 -07:00
behzad nouri 04787be8b1
encapsulates turbine peers computations of broadcast & retransmit stages (#18238)
Broadcast stage and retransmit stage should arrange nodes on turbine
broadcast tree in exactly same order. Additionally any changes to this
ordering (e.g. updating how unstaked nodes are handled) requires feature
gating to keep the cluster in sync.

Current implementation is scattered out over several public methods and
exposes too much of implementation details (e.g. usize indices into
peers vector) which makes code changes and checking for feature
activations more difficult.

This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
2021-07-07 00:35:25 +00:00
Michael Vines b6792a3328 Add ability to change the validator identity at runtime 2021-07-01 17:50:04 -07:00
Michael Vines 84b9de8c18 Shredder no longer holds a keypair 2021-06-21 21:29:52 -07:00
Michael Vines 4a12c715a3 Drop Error suffix from enum values to avoid the enum_variant_names clippy lint 2021-06-18 23:02:13 +00:00
Alexander Meißner 6514096a67 chore: cargo +nightly clippy --fix -Z unstable-options 2021-06-18 10:42:46 -07:00
Justin Starry 050bb5446d
Add local cluster tests that broadcast duplicate slots (#13995)
* Add duplicate node local cluster test

* fix clippy

* remove dupe test
2021-06-09 15:01:48 -07:00
Tyera Eulberg 544b3c0d17
Create solana-poh and move remaining rpc modules to solana-rpc (#17698)
* Create solana-poh crate

* Move BigTableUploadService to solana-ledger

* Add solana-rpc to workspace

* Move dependencies to solana-rpc

* Move remaining rpc modules to solana-rpc

* Single use statement solana-poh

* Single use statement solana-rpc
2021-06-04 09:23:06 -06:00
Tyera Eulberg 9a5330b7eb
Move gossip modules into solana-gossip crate (#17352)
* Move gossip modules to solana-gossip

* Update Protocol abi digest due to move

* Move gossip benches and hook up CI

* Remove unneeded Result entries

* Single use statements
2021-05-26 09:15:46 -06:00
behzad nouri 37b8587d4e
expands number of erasure coding shreds in the last batch in slots (#16484)
Number of parity coding shreds is always less than the number of data
shreds in FEC blocks:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719

Data shreds are batched in chunks of 32 shreds each:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714

However the very last batch of data shreds in a slot can be small, in
which case the loss rate can be exacerbated.

This commit expands the number of coding shreds in the last FEC block in
slots to: 64 - number of data shreds; so that FEC blocks are always 64
data and parity coding shreds each.

As a consequence of this, the last FEC block has more parity coding
shreds than data shreds. So for some shred indices we will have a coding
shred but no data shreds. This should not cause any kind of overlapping
FEC blocks as in:
https://github.com/solana-labs/solana/pull/10095
since this is done only for the very last batch in a slot, and the next
slot will reset the shred index.
2021-04-21 12:47:50 +00:00
behzad nouri 570fd3f810
makes turbine peer computation consistent between broadcast and retransmit (#14910)
get_broadcast_peers is using tvu_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/broadcast_stage.rs#L362-L370
which is potentially inconsistent with retransmit_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1332-L1345

Also, the leader does not include its own contact-info when broadcasting
shreds:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1324
but on the retransmit side, slot leader is removed only _after_ neighbors and
children are computed:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/retransmit_stage.rs#L383-L384
So the turbine broadcast tree is different between the two stages.

This commit:
* Removes retransmit_peers. Broadcast and retransmit stages will use tvu_peers
  consistently.
* Retransmit stage removes slot leader _before_ computing children and
  neighbors.
2021-03-24 13:34:48 +00:00
behzad nouri 4f82b897bc
buffers data shreds to make larger erasure coded sets (#15849)
Broadcast stage batches up to 8 entries:
https://github.com/solana-labs/solana/blob/79280b304/core/src/broadcast_stage/broadcast_utils.rs#L26-L29
which will be serialized into some number of shreds and chunked into FEC
sets of at most 32 shreds each:
https://github.com/solana-labs/solana/blob/79280b304/ledger/src/shred.rs#L576-L597
So depending on the size of entries, FEC sets can be small, which may
aggravate loss rate.
For example 16 FEC sets of 2:2 data/code shreds each have higher loss
rate than one 32:32 set.

This commit broadcasts data shreds immediately, but also buffers them
until it has a batch of 32 data shreds, at which point 32 coding shreds
are generated and broadcasted.
2021-03-23 14:52:38 +00:00
Michael Vines 5df36aec7d Pacify clippy 2021-02-19 20:08:41 -08:00
behzad nouri e1021d9f83
removes redundant epoch stakes cache in retransmit (#14781)
Following d6d76219b, staked nodes computed from vote accounts are
already cached in runtime::Stakes, so the caching in retransmit_stage is
redundant.
2021-01-24 21:15:09 +00:00
Michael Vines cbffab7850 Upgrade to Rust v1.49.0 2021-01-23 19:16:36 -08:00
sakridge f8a4afc7c1
Fix flaky broadcast test (#14329) 2020-12-29 12:35:04 -08:00
sakridge c693ffaa08
Fix subtraction overflow in metrics (#14290) 2020-12-27 16:26:22 -08:00
behzad nouri d6d76219b6
caches staked nodes computed from vote-accounts (#13929) 2020-12-17 21:22:50 +00:00
Michael Vines 7143aaa89b Clippy 2020-12-14 08:03:29 -08:00
sakridge b4cf968e14
Add back shredding broadcast stats (#13463) 2020-11-09 23:04:27 -08:00
Michael Vines d15173ad9d Address latest nightly clippy lints, but globally disable stable_sort_primitive 2020-08-17 22:36:10 -07:00
sakridge 2cf719ac2c
Cache tvu peers for broadcast (#10373) 2020-06-03 08:24:05 -07:00
carllin 97f2bcff69
master: Add nonce to shreds repairs, add shred data size to header (#10109)
* Add nonce to shreds/repairs

* Add data shred size to header

Co-authored-by: Carl <carl@solana.com>
2020-05-19 12:38:18 -07:00
Kristofer Peterson 58ef02f02b
9951 clippy errors in the test suite (#10030)
automerge
2020-05-15 09:35:43 -07:00
carllin bab3502260
Push down cluster_info lock (#9594)
* Push down cluster_info lock

* Rework budget decrement

Co-authored-by: Carl <carl@solana.com>
2020-04-21 12:54:45 -07:00
anatoly yakovenko 77fb4230d6
Calculate distance between u64 without overflow (#9592)
* fix overflow

* fixed num_live_peers overflow
2020-04-19 23:05:26 -07:00