Commit Graph

26 Commits

Author SHA1 Message Date
behzad nouri 37296caee8
removes set_{slot,index,last_in_slot} implementations for merkle shreds (#27689)
These methods are only used in tests, but when invoked on a merkle
shred they will always invalidate the shred because the merkle proof
will no longer verify. As a result the shred will not sanitize and
blockstore will avoid inserting it. Their use in tests results in
spurious test coverage because the shreds will not be ingested.

The commit removes the implementation of these methods for merkle
shreds. Follow-up commits will remove these methods from the shred API
entirely. The sketch below illustrates why such setters are invalid on
merkle shreds.
2022-09-09 17:58:04 +00:00
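A minimal sketch of the failure mode described above: the toy digest
below stands in for the real merkle root, and the layout is
illustrative, not the crate's actual API.

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Toy stand-in for the merkle root: any mutation of the payload
    // changes the committed digest.
    fn toy_root(payload: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        payload.hash(&mut hasher);
        hasher.finish()
    }

    fn main() {
        let mut payload = vec![0u8; 32];
        let root = toy_root(&payload); // proof covers these bytes
        // A set_slot-style mutation of the signed region:
        payload[..8].copy_from_slice(&42u64.to_le_bytes());
        // The recomputed root no longer matches, so the shred fails
        // sanitization and blockstore will not insert it.
        assert_ne!(toy_root(&payload), root);
    }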
Jeff Biseda 269eb519dd
track time to coalesce entries in recv_slot_entries (#27525) 2022-09-06 16:07:17 -07:00
Brennan Watt dfdb422fb1
Minor shred constant cleanup (#27472)
* Minor shred constant cleanup to eliminate a magic number
2022-08-30 18:53:05 -07:00
Jeff Biseda d1522fc790
coalesce entries in recv_slot_entries to target byte count (#27321) 2022-08-25 13:51:55 -07:00
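A hedged sketch of byte-budgeted coalescing as named in the two
commits above; the names, timeout, and channel shape are illustrative.

    use std::sync::mpsc::{Receiver, RecvTimeoutError};
    use std::time::{Duration, Instant};

    // Drain the channel until the batch reaches a target byte count
    // or the deadline passes, so downstream shredding sees fewer,
    // fuller batches. The elapsed time is what the tracking commit
    // above reports as a metric.
    fn coalesce_entries(
        receiver: &Receiver<Vec<u8>>,
        target_bytes: usize,
    ) -> (Vec<Vec<u8>>, Duration) {
        let start = Instant::now();
        let deadline = start + Duration::from_millis(50);
        let (mut batch, mut num_bytes) = (Vec::new(), 0);
        while num_bytes < target_bytes {
            let timeout = deadline.saturating_duration_since(Instant::now());
            match receiver.recv_timeout(timeout) {
                Ok(entry) => {
                    num_bytes += entry.len();
                    batch.push(entry);
                }
                Err(RecvTimeoutError::Timeout | RecvTimeoutError::Disconnected) => break,
            }
        }
        (batch, start.elapsed())
    }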
Brennan Watt e4a7d01e10
Rust v1.63 (#27303)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

* Update quinn-udp crate

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-22 18:01:03 -07:00
behzad nouri c0b63351ae
recovers merkle shreds from erasure codes (#27136)
The commit
* identifies Merkle shreds when recovering from erasure codes and
  dispatches specialized code to reconstruct them;
* adds coding shred headers to recovered erasure shards;
* reconstructs the Merkle tree for the erasure batch and adds it to
  recovered shreds;
* attaches the common signature (for the root of the Merkle tree) to
  all recovered shreds.
2022-08-19 21:07:32 +00:00
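Roughly, the dispatch might look like the sketch below; the types are
abridged and hypothetical, and the merkle arm performs the steps
listed above.

    enum ShredVariant { Legacy, Merkle }
    struct Shred { variant: ShredVariant }

    fn recover(shreds: Vec<Shred>) -> Result<Vec<Shred>, &'static str> {
        match shreds.first().map(|shred| &shred.variant) {
            None => Err("empty erasure batch"),
            Some(ShredVariant::Legacy) => recover_legacy(shreds),
            // Merkle path: reconstruct the erasure shards, re-attach
            // coding shred headers, rebuild the merkle tree, and copy
            // the common signature onto all recovered shreds.
            Some(ShredVariant::Merkle) => recover_merkle(shreds),
        }
    }

    // Stand-ins for the specialized recovery routines.
    fn recover_legacy(shreds: Vec<Shred>) -> Result<Vec<Shred>, &'static str> {
        Ok(shreds)
    }
    fn recover_merkle(shreds: Vec<Shred>) -> Result<Vec<Shred>, &'static str> {
        Ok(shreds)
    }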
Brennan Watt 7573000d87
Revert "Rust v1.63.0 (#27148)" (#27245)
This reverts commit a2e7bdf50a.
2022-08-19 09:19:44 +01:00
Brennan Watt a2e7bdf50a
Rust v1.63.0 (#27148)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-17 15:48:33 -07:00
behzad nouri c813d75944
renames size_of_erasure_encoded_slice to ShredCode::capacity (#27157)
Maintain symmetry between code and data shreds:
* ShredData::capacity -> data buffer capacity
* ShredCode::capacity -> erasure code capacity
2022-08-16 12:30:38 +00:00
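The symmetry might be sketched as below; the constants are
illustrative, not the real layout values.

    const SIZE_OF_PAYLOAD: usize = 1228; // illustrative
    const SIZE_OF_DATA_SHRED_HEADERS: usize = 88; // illustrative
    const SIZE_OF_CODING_SHRED_HEADERS: usize = 89; // illustrative

    struct ShredData;
    struct ShredCode;

    impl ShredData {
        // Bytes of ledger data a data shred can carry.
        fn capacity() -> usize {
            SIZE_OF_PAYLOAD - SIZE_OF_DATA_SHRED_HEADERS
        }
    }

    impl ShredCode {
        // Bytes of erasure code a coding shred can carry (previously
        // size_of_erasure_encoded_slice).
        fn capacity() -> usize {
            SIZE_OF_PAYLOAD - SIZE_OF_CODING_SHRED_HEADERS
        }
    }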
behzad nouri 0e30609394
adds Shred{Code,Data}::SIZE_OF_HEADERS trait constants (#27144) 2022-08-15 19:02:32 +00:00
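Presumably trait constants along these lines (a sketch; header sizes
are illustrative):

    trait ShredTrait {
        const SIZE_OF_HEADERS: usize;
    }

    struct ShredData;
    struct ShredCode;

    impl ShredTrait for ShredData {
        const SIZE_OF_HEADERS: usize = 88; // illustrative
    }
    impl ShredTrait for ShredCode {
        const SIZE_OF_HEADERS: usize = 89; // illustrative
    }

    // Shared code can compute offsets generically per shred type.
    fn capacity<T: ShredTrait>(payload_size: usize) -> usize {
        payload_size - T::SIZE_OF_HEADERS
    }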
behzad nouri b3b57a0f07
adjusts max coding shreds per slot (#27083)
As a consequence of removing buffering when generating coding shreds:
https://github.com/solana-labs/solana/pull/25807
more coding shreds are generated than data shreds, and so
MAX_CODE_SHREDS_PER_SLOT needs to be adjusted accordingly.

The respective value is tied to ERASURE_BATCH_SIZE.
2022-08-12 18:02:01 +00:00
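A sketch of the sizing argument; the bound below is an assumption for
illustration, not the constant the PR actually lands.

    // With buffering removed, every erasure batch carries at least
    // as many coding shreds as data shreds, so the per-slot cap on
    // coding shreds must exceed the cap on data shreds.
    const MAX_DATA_SHREDS_PER_SLOT: usize = 32_768;
    // Headroom for the surplus coding shreds of a small final batch
    // (illustrative bound, tied to ERASURE_BATCH_SIZE).
    const MAX_CODE_SHREDS_PER_SLOT: usize = MAX_DATA_SHREDS_PER_SLOT + 17;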
behzad nouri ac91cdab74
removes buffering when generating coding shreds in broadcast (#25807)
Given the 32:32 erasure recovery schema, the current implementation
requires exactly 32 data shreds to generate coding shreds for a batch
(except for the final erasure batch in each slot).
As a result, when serializing ledger entries to data shreds, if the
number of data shreds is not a multiple of 32, the coding shreds for the
last batch cannot be generated until there are more data shreds to
complete the batch to 32 data shreds. This adds latency in generating
and broadcasting coding shreds.

In addition, with Merkle variants for shreds, data shreds cannot be
signed and broadcasted until coding shreds are also generated. As a
result *both* code and data shreds will be delayed before broadcast if
we still require exactly 32 data shreds for each batch.

This commit instead always generates and broadcasts coding shreds as
soon as any number of data shreds is available. When serializing
entries to shreds:
* if the number of resulting data shreds is less than 32, then more
  coding shreds will be generated so that the resulting erasure batch
  has the same recovery probabilities as a 32:32 batch.
* if the number of data shreds is more than 32, then the data shreds are
  split uniformly into erasure batches with _at least_ 32 data shreds in
  each batch. Each erasure batch will have the same number of code and
  data shreds.

For example:
* If there are 19 data shreds, 27 coding shreds are generated. The
  resulting 19(data):27(code) erasure batch has the same recovery
  probabilities as a 32:32 batch.
* If there are 107 data shreds, they are split into 3 batches of 36:36,
  36:36 and 35:35 data:code shreds each.

A consequence of this change is that code and data shred indices will
no longer align, as there will be more coding shreds than data shreds
(not only in the last batch of each slot but also in intermediate
ones). A sketch of the splitting rule follows this entry.
2022-08-11 12:44:27 +00:00
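A sketch of the splitting rule, with a PR-era ERASURE_BATCH_SIZE table
(total shreds per batch, indexed by data shred count; treat the values
as illustrative). The test mirrors the examples above.

    const DATA_SHREDS_PER_FULL_BATCH: usize = 32;
    const ERASURE_BATCH_SIZE: [usize; 33] = [
        0, 18, 20, 22, 23, 25, 27, 28, 30, 32, 33, 35, 36, 38, 39, 41,
        42, 43, 45, 46, 48, 49, 51, 52, 53, 55, 56, 58, 59, 60, 62, 63,
        64,
    ];

    /// Returns (num_data, num_code) for each erasure batch.
    fn split_batches(num_data: usize) -> Vec<(usize, usize)> {
        if num_data < DATA_SHREDS_PER_FULL_BATCH {
            // e.g. 19 data shreds -> 46 total -> 27 coding shreds.
            let total = ERASURE_BATCH_SIZE[num_data];
            return vec![(num_data, total - num_data)];
        }
        // Split uniformly with at least 32 data shreds per batch;
        // full batches carry equal numbers of code and data shreds.
        let num_batches = num_data / DATA_SHREDS_PER_FULL_BATCH;
        let base = num_data / num_batches;
        let rem = num_data % num_batches;
        (0..num_batches)
            .map(|i| {
                let n = base + usize::from(i < rem);
                (n, n)
            })
            .collect()
    }

    #[test]
    fn matches_commit_examples() {
        assert_eq!(split_batches(19), vec![(19, 27)]);
        assert_eq!(split_batches(107), vec![(36, 36), (36, 36), (35, 35)]);
    }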
behzad nouri e2a2d271f2
adds number of coding shreds to broadcast metrics (#27006) 2022-08-09 13:59:40 +00:00
behzad nouri 403b2e4841
records num data shreds obtained from serializing entries (#26888) 2022-08-03 17:07:40 +00:00
Jeff Biseda 857be1e237
sign repair requests (#26833) 2022-07-31 15:48:51 -07:00
behzad nouri 348fe9ebe2
verifies shred slot and parent in fetch stage (#26225)
Shred slot and parent are not verified until window-service, by which
point resources have already been wasted sig-verifying and
deserializing the shreds. This commit moves that verification earlier
in the pipeline, into fetch stage.
2022-06-28 12:45:50 +00:00
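The check can be sketched as reading slot and parent straight from the
packet bytes before paying for sig-verify or full deserialization; the
offsets and the root bound below are layout assumptions.

    // Assumed layout: slot is a little-endian u64 at bytes 65..73 of
    // the payload and parent_offset a u16 at bytes 83..85.
    fn verify_shred_slots(payload: &[u8], root: u64) -> bool {
        let slot = match payload.get(65..73) {
            Some(bytes) => u64::from_le_bytes(bytes.try_into().unwrap()),
            None => return false,
        };
        let parent_offset = match payload.get(83..85) {
            Some(bytes) => u16::from_le_bytes(bytes.try_into().unwrap()),
            None => return false,
        };
        let parent = match slot.checked_sub(u64::from(parent_offset)) {
            Some(parent) => parent,
            None => return false,
        };
        // Parent must precede the slot and be no older than the root.
        parent < slot && parent >= root
    }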
Michael Vines f3639b76ce Remove some clippy lints 2022-06-22 09:23:22 -07:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service, by which point
resources have already been wasted sig-verifying and deserializing the
shreds. The commit verifies shred-version earlier in the pipeline, in
fetch stage.
2022-06-22 12:17:37 +00:00
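Same idea for the shred-version check: compare the u16 in the common
header (the offset below is a layout assumption) against the cluster's
shred version and drop mismatches in fetch stage.

    fn verify_shred_version(payload: &[u8], shred_version: u16) -> bool {
        // Assumed layout: version is a little-endian u16 at bytes
        // 77..79 of the payload.
        match payload.get(77..79) {
            Some(bytes) => u16::from_le_bytes([bytes[0], bytes[1]]) == shred_version,
            None => false,
        }
    }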
behzad nouri 31b3e0e15a
adds metric tracking wasted data buffer in shreds (#25972) 2022-06-16 16:14:00 +00:00
behzad nouri 5f04512d3a
adds a new shred variant embedding merkle tree hashes of the erasure batch (#25237)
Coding shreds can only be signed once the erasure codes are already
generated. Therefore coding shreds recovered from erasure codes lack
the slot leader's signature and so cannot be retransmitted to the rest
of the cluster.

shred/merkle.rs implements a new shred variant where we generate a
merkle tree for each erasure encoded batch, and each shred includes:
* root of the merkle tree (Hash truncated to 20 bytes).
* slot leader's signature of the root of the merkle tree.
* merkle tree nodes along the branch the shred belongs to, where hashes
  are trimmed to 20 bytes during tree construction.

This schema results in the same signature for all shreds within an
erasure batch.

When recovering shreds from erasure codes, we can reconstruct the
merkle tree for the batch and, for each recovered shred, the
respective merkle tree branch; then snap the slot leader's signature
from any of the shreds received from turbine and retransmit all
recovered code or data shreds.

Backward compatibility is achieved by encoding the shred variant at
byte 65 of the payload (previously the shred-type at this position):
* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.
* 0b0100_???? indicates a merkle coding shred with merkle branch size
  indicated by the last 4 bits.
* 0b1000_???? indicates a merkle data shred with merkle branch size
  indicated by the last 4 bits.

Merkle root and branch are encoded at the end of the shred payload.
2022-06-07 22:41:03 +00:00
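Branch verification can be sketched generically; hash20 below is a
stand-in (an XOR fold) for the real cryptographic hash truncated to 20
bytes.

    type Node = [u8; 20];

    // Stand-in for the real hash-then-trim-to-20-bytes step.
    fn hash20(left: &Node, right: &Node) -> Node {
        let mut out = [0u8; 20];
        for (i, &byte) in left.iter().chain(right).enumerate() {
            out[i % 20] ^= byte;
        }
        out
    }

    // `index` is the shred's position in the erasure batch; `branch`
    // holds the sibling node at each level up to the signed root.
    fn verify_branch(
        leaf: Node,
        mut index: usize,
        branch: &[Node],
        root: &Node,
    ) -> bool {
        let mut node = leaf;
        for sibling in branch {
            node = if index % 2 == 0 {
                hash20(&node, sibling)
            } else {
                hash20(sibling, &node)
            };
            index /= 2;
        }
        node == *root
    }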
behzad nouri 6c9f2eac78
removes fec_set_offset from UnfinishedSlotInfo (#25815)
If the blockstore has shreds for a slot, it should not recreate the
slot:
https://github.com/solana-labs/solana/blob/ff68bf6c2/ledger/src/leader_schedule_cache.rs#L142-L146
https://github.com/solana-labs/solana/pull/15849/files#r596657314

Therefore in broadcast stage if UnfinishedSlotInfo is None, then
fec_set_offset will be zero:
https://github.com/solana-labs/solana/blob/ff68bf6c2/core/src/broadcast_stage/standard_broadcast_run.rs#L111-L120

As a result fec_set_offset will always be zero, and so it is redundant
and can be removed.
2022-06-07 22:17:37 +00:00
behzad nouri 81231a89b9 adds support for different variants of ShredCode and ShredData
The commit implements two new types:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
    }

Following commits will extend these types by adding merkle variants:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
        Merkle(merkle::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
        Merkle(merkle::ShredData),
    }
2022-06-02 18:55:50 +00:00
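Shared accessors then dispatch over the enum; a minimal sketch with
abridged fields:

    mod legacy {
        pub struct ShredData {
            pub slot: u64,
        }
    }

    enum ShredData {
        Legacy(legacy::ShredData),
    }

    impl ShredData {
        fn slot(&self) -> u64 {
            match self {
                // A Merkle arm slots in here in the follow-up commits.
                Self::Legacy(shred) => shred.slot,
            }
        }
    }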
behzad nouri a913068512 embeds versioning into shred binary
In preparation for
https://github.com/solana-labs/solana/pull/25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into the shred binary by encoding a new ShredVariant
type at byte 65 of the payload, replacing the previous ShredType at
this offset.
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
    }

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type:
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }
2022-06-02 18:55:50 +00:00
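Decoding byte 65 can be sketched with the high nibble as the tag and
the low nibble as the merkle proof size, consistent with the bit
patterns above (the error type and details are illustrative):

    #[derive(Debug, PartialEq, Eq)]
    enum ShredVariant {
        LegacyCode,                     // 0b0101_1010
        LegacyData,                     // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }

    impl TryFrom<u8> for ShredVariant {
        type Error = &'static str;
        fn try_from(byte: u8) -> Result<Self, Self::Error> {
            match byte {
                0b0101_1010 => Ok(Self::LegacyCode),
                0b1010_0101 => Ok(Self::LegacyData),
                _ => match byte & 0xF0 {
                    0x40 => Ok(Self::MerkleCode(byte & 0x0F)),
                    0x80 => Ok(Self::MerkleData(byte & 0x0F)),
                    _ => Err("invalid shred variant"),
                },
            }
        }
    }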
behzad nouri 29cfa04c05
records number of residual data shreds which don't make a full batch (#25693)
Data shreds are batched into MAX_DATA_SHREDS_PER_FEC_BLOCK shreds for
each erasure batch. If there are residual shreds not making a full
batch, then we cannot generate coding shreds and need to buffer shreds
until there is a full batch; this may add latency to coding shred
generation and broadcast.
In order to evaluate upcoming changes removing this buffering logic,
this commit adds metrics tracking residual number of data shreds which
don't make a full batch.
2022-06-02 00:32:32 +00:00
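The residual count itself is simple arithmetic (the function name is
illustrative):

    const MAX_DATA_SHREDS_PER_FEC_BLOCK: usize = 32;

    // Data shreds left over after filling complete 32-shred erasure
    // batches; a nonzero residue means buffering, and hence latency,
    // under the pre-#25807 logic.
    fn num_residual_data_shreds(num_data_shreds: usize) -> usize {
        num_data_shreds % MAX_DATA_SHREDS_PER_FEC_BLOCK
    }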
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
behzad nouri e2bbc3913d separates out data vs code shreds at the type level
Working towards revising the shred struct to embed versioning so that
a new variant can contain merkle tree hashes of the erasure batch. To
ease the migration, the commit adds more type-safety by distinguishing
data vs code shreds at the type level.

Additionally, having both data and coding headers in each shred is
redundant, as only one is relevant for any given shred. The revised
shred type in this commit has only one type-specific header.
https://github.com/solana-labs/solana/blob/c785f1ffc/ledger/src/shred.rs#L198-L203
2022-05-18 21:56:22 +00:00
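The revised shapes might look as follows (fields abridged; one
type-specific header each):

    struct ShredCommonHeader {
        slot: u64,
        index: u32,
    }
    struct DataShredHeader {
        parent_offset: u16,
        flags: u8,
    }
    struct CodingShredHeader {
        num_data_shreds: u16,
        num_coding_shreds: u16,
        position: u16,
    }

    // Each revised type carries the common header plus only the
    // header relevant to it, instead of both.
    struct ShredData {
        common_header: ShredCommonHeader,
        data_header: DataShredHeader,
        payload: Vec<u8>,
    }
    struct ShredCode {
        common_header: ShredCommonHeader,
        coding_header: CodingShredHeader,
        payload: Vec<u8>,
    }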