Commit Graph

9 Commits

Author SHA1 Message Date
behzad nouri b9849179c9
bypasses merkle proof verification for recovered merkle shreds (#28076)
Merkle proof for shreds recovered from erasure codes are generated
locally, and it is superfluous to verify them when sanitizing recovered
shreds:
https://github.com/solana-labs/solana/blob/a0f49c2e4/ledger/src/shred/merkle.rs#L727-L760
2022-09-26 21:28:15 +00:00
behzad nouri f49beb0cbc
caches reed-solomon encoder/decoder instance (#27510)
ReedSolomon::new(...) initializes a matrix and a data-decode-matrix cache:
https://github.com/rust-rse/reed-solomon-erasure/blob/273ebbced/src/core.rs#L460-L466

In order to cache this computation, this commit caches the reed-solomon
encoder/decoder instance for each (data_shards, parity_shards) pair.
2022-09-25 18:09:47 +00:00
behzad nouri 45e26574f3
removes redundant shred.sanitize() from blockstore (#28016)
Shreds received from other nodes over the socket are sanitized when the
payload is deserialized:
https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/legacy.rs#L137
https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/legacy.rs#L77
https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/merkle.rs#L355
https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/merkle.rs#L439

Similarly, shreds recovered from erasure codes are also sanitized at
deserialization:
https://github.com/solana-labs/solana/blob/f02fe9c7e/ledger/src/shredder.rs#L330
or explicitly so for Merkle shreds:
https://github.com/solana-labs/solana/blob/f02fe9c7e/ledger/src/shred/merkle.rs#L753

Shreds generated locally by the node itself during its leader slots do
not need to be sanitized.

So sanitizing shreds in blockstore is redundant and wasteful. In
particular this becomes more wasteful with Merkle shreds because
sanitizing shreds would require verifying Merkle proof.
As such the commit removes redundant shred.sanitize() from blockstore.
2022-09-24 16:31:50 +00:00
behzad nouri 7d3f3b2f7d generates merkle shreds from ledger entries
The commit adds methods to convert &[Entry] to vector of Merkle shreds.
2022-09-23 16:45:18 +00:00
behzad nouri 37296caee8
removes set_{slot,index,last_in_slot} implementations for merkle shreds (#27689)
These methods are only used in tests but invoked on a merkle shred they
will always invalidate the shred because the merkle proof will no longer
verify. As a result the shred will not sanitize and blockstore will
avoid inserting them. Their use in tests will result in spurious test
coverage because the shreds will not be ingested.

The commit removes implementation of these methods for merkle shreds.
Follow up commits will entirely remove these methods from shreds api.
2022-09-09 17:58:04 +00:00
behzad nouri c0b63351ae
recovers merkle shreds from erasure codes (#27136)
The commit
* Identifies Merkle shreds when recovering from erasure codes and
  dispatches specialized code to reconstruct shreds.
* Coding shred headers are added to recovered erasure shards.
* Merkle tree is reconstructed for the erasure batch and added to
  recovered shreds.
* The common signature (for the root of Merkle tree) is attached to all
  recovered shreds.
2022-08-19 21:07:32 +00:00
behzad nouri c813d75944
renames size_of_erasure_encoded_slice to ShredCode::capacity (#27157)
Maintain symmetry between code and data shreds:
* ShredData::capacity -> data buffer capacity
* ShredCode::capacity -> erasure code capacity
2022-08-16 12:30:38 +00:00
behzad nouri 0e30609394
adds Shred{Code,Data}::SIZE_OF_HEADERS trait constants (#27144) 2022-08-15 19:02:32 +00:00
behzad nouri 5f04512d3a
adds a new shred variant embedding merkle tree hashes of the erasure batch (#25237)
Coding shreds can only be signed once erasure codings are already
generated. Therefore coding shreds recovered from erasure codings lack
slot leader's signature and so cannot be retransmitted to the rest of
the cluster.

shred/merkle.rs implements a new shred variant where we generate merkle
tree for each erasure encoded batch and each shred includes:
* root of the merkle tree (Hash truncated to 20 bytes).
* slot leader's signature of the root of the merkle tree.
* merkle tree nodes along the branch the shred belongs to, where hashes
  are trimmed to 20 bytes during tree construction.

This schema results in the same signature for all shreds within an
erasure batch.

When recovering shreds from erasure codes, we can reconstruct merkle
tree for the batch and for each recovered shred also recover respective
merkle tree branch; then snap the slot leader's signature from any of
the shreds received from turbine and retransmit all recovered code or
data shreds.

Backward compatibility is achieved by encoding shred variant at byte 65
of payload (previously shred-type at this position):
* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.
* 0b0100_???? indicates a merkle coding shred with merkle branch size
  indicated by the last 4 bits.
* 0b1000_???? indicates a merkle data shred with merkle branch size
  indicated by the last 4 bits.

Merkle root and branch are encoded at the end of the shred payload.
2022-06-07 22:41:03 +00:00