Commit Graph

164 Commits

Author SHA1 Message Date
behzad nouri 348fe9ebe2
verifies shred slot and parent in fetch stage (#26225)
Shred slot and parent are not verified until window-service where
resources are already wasted to sig-verify and deserialize shreds.
This commit moves above verification to earlier in the pipeline in fetch
stage.
2022-06-28 12:45:50 +00:00
behzad nouri 39ca788b95
discards shreds in sigverify if the slot leader is the node itself (#26229)
Shreds are dropped in window-service if the slot leader is the node
itself:
https://github.com/solana-labs/solana/blob/cd2878acf/core/src/window_service.rs#L181-L185

However this is done after wasting resources verifying signature on
these shreds, and requires a redundant 2nd lookup of the slot leader.

This commit instead discards such shreds in sigverify stage where we
already know the leader for the slot.
2022-06-27 20:12:23 +00:00
behzad nouri 67936aaa74
moves Shred::seed to ShredId and adds test coverage (#26251)
Following commits will skip shreds deserializaton before retransmit, and
so we will only have a ShredId and not a fully deserialized shred to
obtain the shuffling seed from.
2022-06-27 17:58:43 +00:00
behzad nouri 30d2b112e4
bypasses rayon thread-pool for small retransmit shred batches (#26222)
In order to preserve current behavior, the threshold is set to the
current value of the argument to IndexedParallelIterator::with_min_len.
Follow up commits will recalibrate this threshold to optimize
performance on mainnet-beta.
2022-06-25 21:15:42 +00:00
behzad nouri f1b82ec44d
factors out common retransmit work for shreds of the same slot (#26218)
Shreds arriving at a node for retransmit tend to belong to the same slot
(or a just a couple of different slots). Slot leader and cluster nodes
are common for the shreds of the same slot, and so the common work to
look up these values can be factored out.
This commit first group-bys shreds by slot to factor out that common
lookup work.
2022-06-25 15:49:05 +00:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies shred-version earlier in the pipeline in fetch stage.
2022-06-22 12:17:37 +00:00
behzad nouri 75425521b4
moves slot updates notifications after shreds retransmit (#26094)
RetransmitSlotStats can already be utilized to track when the first
shred for a slot was received; therefore
    first_shreds_received: &Mutex<BTreeSet<Slot>>

is redundant. Sending update notifications after shreds retransmit will
also bypass the need for a mutex.
2022-06-21 17:19:40 -04:00
behzad nouri d2afa6b418
moves packet-hasher out of the mutex (#26091)
Packet-hasher is not mutated across threads and does not need to be
wrapped in a mutex.
2022-06-21 16:29:27 +00:00
behzad nouri b3d1f8d1ac
tracks number of shreds sent and received at different distances from the root (#25989) 2022-06-17 21:33:23 +00:00
behzad nouri eff59193db
enforces that LAST_SHRED_IN_SLOT is also DATA_COMPLETE_SHRED (#24892)
A data shred cannot be LAST_SHRED_IN_SLOT if not also DATA_COMPLETE_SHRED.
So LAST_SHRED_IN_SLOT should also imply DATA_COMPLETE_SHRED:
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shredder.rs#L116-L117
https://github.com/solana-labs/solana/blob/74b586ae7/core/src/broadcast_stage/standard_broadcast_run.rs#L80-L81

However current shred constructs allow specifying a shred which is
LAST_SHRED_IN_SLOT but not DATA_COMPLETE_SHRED:
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shred.rs#L117-L118
https://github.com/solana-labs/solana/blob/74b586ae7/ledger/src/shred.rs#L272-L273

The commit updates ShredFlags so that if a shred is not
DATA_COMPLETE_SHRED it cannot be LAST_SHRED_IN_SLOT either.
2022-05-02 23:33:53 +00:00
behzad nouri 0f60665100
replaces Shred::new_empty_coding with Shred::new_from_parity_shard (#24749)
Removing implementation details of shreds and payload offsets from
shredder, so that shredder does not need to mutate payload:
https://github.com/solana-labs/solana/blob/71ad12128/ledger/src/shred.rs#L968-L977

Also, Shred::new_from_data can simply obtain a slice as opposed to
Option<&[u8]>:
https://github.com/solana-labs/solana/blob/71ad12128/ledger/src/shred.rs#L268-L278
2022-04-27 18:04:10 +00:00
behzad nouri 895f76a93c
hides implementation details of shred from its public interface (#24563)
Working towards embedding versioning into shreds binary, so that a new
variant of shred struct can include merkle tree hashes of the erasure
set.
2022-04-25 12:43:22 +00:00
Justin Starry c544742091
Local cluster test cleanup and refactoring (#24559)
* remove FixedSchedule.start_epoch

* use duration for timing

* Rename to partition bool to turbine_disabled

* simplify partition config
2022-04-22 12:14:07 +08:00
behzad nouri 1d50832389
replaces counters with datapoints in gossip metrics (#24451) 2022-04-18 23:14:59 +00:00
behzad nouri 2282571493
removes outdated and flaky test_skip_repair from retransmit-stage (#24121)
test_skip_repair in retransmit-stage is no longer relevant because
following: https://github.com/solana-labs/solana/pull/19233
repair packets are filtered out earlier in window-service and so
retransmit stage does not know if a shred is repaired or not.
Also, following turbine peer shuffle changes:
https://github.com/solana-labs/solana/pull/24080
the test has become flaky since it does not take into account how peers
are shuffled for each shred.
2022-04-05 16:02:53 +00:00
HaoranYi fedf4e984f
typo (#23910) 2022-03-24 15:21:59 -05:00
Michael Vines 390dc24608 Create leader schedule before processing blockstore 2022-03-14 15:29:58 -07:00
Michael Vines 115f376465 Factor out bank_forks_utils::load_bank_forks() 2022-03-14 15:29:58 -07:00
Michael Vines c2ce152be8 Inline do_process_blockstore_from_root 2022-03-14 15:29:58 -07:00
Jeff Biseda c69e3b73ff
bench get_retransmit_peers (#23292) 2022-03-01 19:10:29 -08:00
Jeff Biseda 8b66625c95
convert std::sync::mpsc to crossbeam_channel (#22264) 2022-01-11 02:44:46 -08:00
behzad nouri 01a096adc8 adds bitflags to Packet.Meta
Instead of a separate bool type for each flag, all the flags can be
encoded in a type-safe bitflags encoded in a single u8:
https://github.com/solana-labs/solana/blob/d6ec103be/sdk/src/packet.rs#L19-L31
2022-01-04 13:53:40 +00:00
carllin 7f6fb6937a
Ensure AncestorHashesSerice selects an open port (#21919) 2021-12-18 00:44:01 -05:00
behzad nouri 4ceb2689f5
adds ShredId uniquely identifying each shred (#21820) 2021-12-14 17:34:02 +00:00
Justin Starry 254ef3e7b6
Rename Packets to PacketBatch (#21794) 2021-12-11 09:44:15 -05:00
behzad nouri cd17f63d81
adds back position field to coding-shred-header (#21600)
https://github.com/solana-labs/solana/pull/17004
removed position field from coding-shred-header because as it stands the
field is redundant and unused.
However, with the upcoming changes to erasure coding schema this field
will no longer be redundant and needs to be populated.
2021-12-05 14:42:09 +00:00
Michael Vines b8837c04ec Reformat imports to a consistent style for imports
rustfmt.toml configuration:
  imports_granularity = "One"
  group_imports = "One"
2021-12-03 09:19:13 -08:00
behzad nouri 57057f8d39 uses enum for shred type
Current code is using u8 which does not have any type-safety and can
contain invalid values:
https://github.com/solana-labs/solana/blob/66fa062f1/ledger/src/shred.rs#L167

Checks for invalid shred-types are scattered through the code:
https://github.com/solana-labs/solana/blob/66fa062f1/ledger/src/blockstore.rs#L849-L851
https://github.com/solana-labs/solana/blob/66fa062f1/ledger/src/shred.rs#L346-L348

The commit uses enum for shred type with #[repr(u8)]. Backward
compatibility is maintained by implementing Serialize and Deserialize
compatible with u8, and adding a test to assert that.
2021-11-19 14:16:39 +00:00
behzad nouri 5e1cf39c74
adds metrics for number of outgoing shreds in retransmit stage (#20882) 2021-10-24 13:12:27 +00:00
behzad nouri 0c0384ec32
revises turbine peers shuffling order (#20480)
Turbine randomly shuffles cluster nodes on a broadcast tree for each
shred. This requires knowing the stakes and nodes' contact-infos (from
gossip).

However gossip is subject to partitioning and propogation delays.
Additionally unstaked nodes may join and leave the cluster at any
moment, changing the cluster view from one node to another.

This commit:
* Always arranges the unstaked nodes at the bottom of turbine broadcast
  tree.
* Staked nodes are always included regardless of if their contact-info
  is available in gossip or not.
* Uses the unbiased WeightedShuffle construct for shuffling nodes.
2021-10-14 15:09:36 +00:00
Lijun Wang fe97cb2ddf
AccountsDb plugin framework (#20047)
Summary of Changes

Create a plugin mechanism in the accounts update path so that accounts data can be streamed out to external data stores (be it Kafka or Postgres). The plugin mechanism allows

Data stores of connection strings/credentials to be configured,
Accounts with patterns to be streamed
PostgreSQL implementation of the streaming for different destination stores to be plugged in.

The code comprises 4 major parts:

accountsdb-plugin-intf: defines the plugin interface which concrete plugin should implement.
accountsdb-plugin-manager: manages the load/unload of plugins and provide interfaces which the validator can notify of accounts update to plugins.
accountsdb-plugin-postgres: the concrete plugin implementation for PostgreSQL
The validator integrations: updated streamed right after snapshot restore and after account update from transaction processing or other real updates.
The plugin is optionally loaded on demand by new validator CLI argument -- there is no impact if the plugin is not loaded.
2021-09-30 14:26:17 -07:00
Brooks Prumo a0552e5b46
Make startup aware of Incremental Snapshots (#19600) 2021-09-07 20:43:43 +00:00
behzad nouri 01a7ec8198
uses rayon thread-pool for retransmit-stage parallelization (#19486) 2021-09-07 15:15:01 +00:00
Brooks Prumo e9374d32a3
Revert "Make startup aware of Incremental Snapshots (#19550)" (#19599)
This reverts commit d45ced0a5d.
2021-09-02 19:14:41 -05:00
Brooks Prumo d45ced0a5d
Make startup aware of Incremental Snapshots (#19550) 2021-09-02 19:05:15 -05:00
behzad nouri 6d9818b8e4
skips retransmit for shreds with unknown slot leader (#19472)
Shreds' signatures should be verified before they reach retransmit
stage, and if the leader is unknown they should fail signature check.
Therefore retransmit-stage can as well expect to know who the slot
leader is and otherwise just skip the shred.

Blockstore checking signature of recovered shreds before sending them to
retransmit stage:
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/blockstore.rs#L884-L930

Shred signature verifier:
https://github.com/solana-labs/solana/blob/4305d4b7b/core/src/sigverify_shreds.rs#L41-L57
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/sigverify_shreds.rs#L105
2021-09-01 15:44:26 +00:00
behzad nouri 7a8807b8bb retransmits shreds recovered from erasure codes
Shreds recovered from erasure codes have not been received from turbine
and have not been retransmitted to other nodes downstream. This results
in more repairs across the cluster which is slower.

This commit channels through recovered shreds to retransmit stage in
order to further broadcast the shreds to downstream nodes in the tree.
2021-08-17 13:44:10 +00:00
behzad nouri 3efccbffab sends shreds (instead of packets) to retransmit stage
Working towards channelling through shreds recovered from erasure codes
to retransmit stage.
2021-08-17 13:44:10 +00:00
behzad nouri 6e413331b5 removes erroneous uses of Arc<...> from retransmit stage 2021-08-17 13:44:10 +00:00
behzad nouri bf437b0336 removes packet-count metrics from retransmit stage
Working towards sending shreds (instead of packets) to retransmit stage
so that shreds recovered from erasure codes are as well retransmitted.

Following commit will add these metrics back to window-service, earlier
in the pipeline.
2021-08-17 13:44:10 +00:00
behzad nouri b64eeb7729 removes erroneous uses of &Arc<...> from window-service 2021-08-13 17:26:31 +00:00
behzad nouri e4be00fece falls back on working-bank if root-bank::epoch-staked-nodes is none
bank.get_leader_schedule_epoch(shred_slot)
is one epoch after epoch_schedule.get_epoch(shred_slot).

At epoch boundaries, shred is already one epoch after the root-slot. So
we need epoch-stakes 2 epochs ahead of the root. But the root bank only
has epoch-stakes for one epoch ahead, and as a result looking up epoch
staked-nodes from the root-bank fails.

To be backward compatible with the current master code, this commit
implements a fallback on working-bank if epoch staked-nodes obtained
from the root-bank is none.
2021-08-05 21:47:33 +00:00
behzad nouri 50d0e830c9 unifies cluster-nodes computation & caching across turbine stages
Broadcast-stage is using epoch_staked_nodes based on the same slot that
shreds belong to:
https://github.com/solana-labs/solana/blob/049fb0417/core/src/broadcast_stage/standard_broadcast_run.rs#L208-L228
https://github.com/solana-labs/solana/blob/0cf52e206/core/src/broadcast_stage.rs#L342-L349

But retransmit-stage is using bank-epoch of the working-bank:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/retransmit_stage.rs#L272-L289

So the two are not consistent at epoch boundaries where some nodes may
have a working bank (or similarly a root bank) lagging other nodes. As a
result the node which obtains a packet may construct turbine broadcast
tree inconsistently with its parent node in the tree and so some packets
may fail to reach all nodes in the tree.
2021-08-05 21:47:33 +00:00
behzad nouri 30bec3921e uses cluster-nodes cache in retransmit stage
The new cluster-nodes cache will:
  * ensure cluster-nodes are recalculated if the epoch (and so the epoch
    staked nodes) changes.
  * encapsulate time-to-live eviction policy.
2021-08-05 21:47:33 +00:00
Ryo Onodera da480bdb5f
Fix unstable retransmit-num_nodes (#18970) 2021-07-29 17:32:32 +00:00
behzad nouri d06dc6c8a6
shares cluster-nodes between retransmit threads (#18947)
cluster_nodes and last_peer_update are not shared between retransmit
threads, as each thread have its own value:
https://github.com/solana-labs/solana/blob/65ccfed86/core/src/retransmit_stage.rs#L476-L477

Additionally, with shared references, this code:
https://github.com/solana-labs/solana/blob/0167daa11/core/src/retransmit_stage.rs#L315-L328
has a concurrency bug where the thread which does compare_and_swap,
updates cluster_nodes much later after other threads have run with
outdated cluster_nodes for a while. In particular, the write-lock there
may block.
2021-07-29 16:20:15 +00:00
sakridge 84e78316b1
Write helper for multithread update (#18808) 2021-07-29 03:16:36 +02:00
carllin c0704d4ec9
Plumb signal from replay to ancestor hashes service (#18880) 2021-07-26 20:59:00 -07:00
carllin 1ee64afb12
Introduce AncestorHashesService (#18812) 2021-07-23 16:54:47 -07:00
behzad nouri d2d5f36a3c
adds validator flag to allow private ip addresses (#18850) 2021-07-23 15:25:03 +00:00