Commit Graph

141 Commits

Author SHA1 Message Date
behzad nouri 528a03f32a
removes outdated matches crate from dependencies (#33172)
removes outdated matches crate from the dependencies

std::matches has been stable since rust 1.42.0.
Other use-cases are covered by the assert_matches crate.
2023-09-07 12:52:57 +00:00
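For context, a minimal sketch of the std replacement referenced above (values illustrative):
```
fn main() {
    // std's matches! macro (stable since Rust 1.42.0) covers the common case
    // the removed crate provided; assert_matches remains useful in tests.
    let reply: Result<u32, &str> = Ok(7);
    assert!(matches!(reply, Ok(n) if n > 0));
}
```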
behzad nouri 1431275328
removes outdated check for merkle shreds (#33088) 2023-08-31 20:01:54 +00:00
Jon Cinque 0fe902ced7
Bump rand to 0.8, rand_chacha to 0.3, getrandom to 0.2 (#32871)
* sdk: Add concurrent support for rand 0.7 and 0.8

* Update rand, rand_chacha, and getrandom versions

* Run command to replace `gen_range`

Run `git grep -l gen_range | xargs sed -i'' -e 's/gen_range(\(\S*\), /gen_range(\1../'`

* sdk: Fix users of older `gen_range`

* Replace `hash::new_rand` with `hash::new_with_thread_rng`

Run:
```
git grep -l hash::new_rand | xargs sed -i'' -e 's/hash::new_rand([^)]*/hash::new_with_thread_rng(/'
```

* perf: Use `Keypair::new()` instead of `generate`

* Use older rand version in zk-token-sdk

* program-runtime: Inline random key generation

* bloom: Fix clippy warnings in tests

* streamer: Scope rng usage correctly

* perf: Fix clippy warning

* accounts-db: Map to char to generate a random string

* Remove `from_secret_key_bytes`, it's just `keypair_from_seed`

* ledger: Generate keypairs by hand

* ed25519-tests: Use new rand

* runtime: Use new rand in all tests

* gossip: Clean up clippy and inline keypair generators

* core: Inline keypair generation for tests

* Push sbf lockfile change

* sdk: Sort dependencies correctly

* Remove `hash::new_with_thread_rng`, use `Hash::new_unique()`

* Use Keypair::new where chacha isn't used

* sdk: Fix build by marking rand 0.7 optional

* Hardcode secret key length, add static assertion

* Unify `getrandom` crate usage to fix linking errors

* bloom: Fix tests that require a random hash

* Remove some dependencies, try to unify others

* Remove unnecessary uses of rand and rand_core

* Update lockfiles

* Add back some dependencies to reduce rebuilds

* Increase max rebuilds from 14 to 15

* frozen-abi: Remove `getrandom`

* Bump rebuilds to 17

* Remove getrandom from zk-token-proof
2023-08-21 19:11:21 +02:00
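For illustration, a minimal sketch of the `gen_range` call-site change that the sed command above automates (function name hypothetical):
```
use rand::Rng;

fn pick_index(len: usize) -> usize {
    let mut rng = rand::thread_rng();
    // rand 0.7 took two arguments: rng.gen_range(0, len)
    // rand 0.8 takes a range instead:
    rng.gen_range(0..len)
}
```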
dependabot[bot] c2d605b884
Bump bitflags from 1.3.2 to 2.3.3 (#32438)
* Bump bitflags from 1.3.2 to 2.3.3

Bumps [bitflags](https://github.com/bitflags/bitflags) from 1.3.2 to 2.3.3.
- [Release notes](https://github.com/bitflags/bitflags/releases)
- [Changelog](https://github.com/bitflags/bitflags/blob/main/CHANGELOG.md)
- [Commits](https://github.com/bitflags/bitflags/compare/1.3.2...2.3.3)

---
updated-dependencies:
- dependency-name: bitflags
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* updates programs/sbf/Cargo.lock

* derives necessary traits

* replaces from_bits_unchecked with from_bits_retain

* patches test

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: behzad nouri <behzadnouri@gmail.com>
2023-07-24 12:56:55 +00:00
behzad nouri 7f2f0136bd
enables merkle shreds for devnet and development clusters (#32533) 2023-07-20 12:47:28 +00:00
behzad nouri cfb028819a
deprecates Signature::new in favor of Signature::{try_,}from (#32481) 2023-07-14 22:51:12 +00:00
behzad nouri d54b6204be
removes instances of clippy::manual_let_else (#32417) 2023-07-09 21:41:36 +00:00
behzad nouri 5d9aba5548
increases retransmit-stage deduper capacity and reset-cycle (#30758)
For duplicate block detection, for each (slot, shred-index, shred-type)
we need to allow 2 different shreds to be retransmitted.
The commit implements this using two bloom-filter dedupers:
* Shreds are deduplicated using the 1st deduper.
* If a shred is not a duplicate, then we check if:
      (slot, shred-index, shred-type, k)
  is not a duplicate for either k = 0  or k = 1 using the 2nd deduper,
  and if so then the shred is retransmitted.

This allows achieving a larger capacity compared to the current LRU-cache.
2023-03-20 20:32:23 +00:00
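A minimal sketch of the two-deduper check described above, using a HashSet as a stand-in for the bloom-filter dedupers (all names hypothetical):
```
use std::collections::HashSet;
use std::hash::Hash;

/// Stand-in for a bloom-filter deduper: dedup() returns true if the key was
/// already seen (i.e. it is a duplicate), inserting it otherwise.
struct Deduper<T: Eq + Hash>(HashSet<T>);

impl<T: Eq + Hash> Deduper<T> {
    fn dedup(&mut self, key: T) -> bool {
        !self.0.insert(key)
    }
}

type ShredKey = (u64 /*slot*/, u32 /*index*/, u8 /*shred type*/);

/// Retransmit a shred only if its payload is new, and let at most two
/// distinct payloads per (slot, index, type) through (k = 0 or k = 1).
fn should_retransmit(
    payload: Vec<u8>,
    key: ShredKey,
    payload_deduper: &mut Deduper<Vec<u8>>,
    shred_id_deduper: &mut Deduper<(ShredKey, u8)>,
) -> bool {
    if payload_deduper.dedup(payload) {
        return false; // exact duplicate payload
    }
    (0..2u8).any(|k| !shred_id_deduper.dedup((key, k)))
}
```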
sakridge 7a8563f2c8
Panic when shred index exceeds the max per slot (#30555)
Assert when shred index exceeds the max per slot
2023-03-04 02:49:23 +01:00
behzad nouri cf0a149add
removes the merkle root from shreds binary (#29427)
https://github.com/solana-labs/solana/pull/29445
makes it unnecessary to embed merkle roots into shreds binary. This
commit removes the merkle root from shreds binary.

This adds 20 bytes to shred capacity to store more data.
Additionally, since we no longer need to truncate the merkle root, the
signature would be on the full 32 bytes of the hash as opposed to the
truncated one.

Also, signature verification would now effectively verify the merkle proof
as well, so we no longer need to verify the merkle proof in the sanitize
implementation.
2023-02-15 21:17:24 +00:00
Jeff Biseda 180273b97d
defer HighestShred repairs during shred propagation threshold (#30142) 2023-02-09 14:57:55 -08:00
behzad nouri 80a39bd6a5
adds feature to (temporarily) drop merkle shreds from testnet (#29711) 2023-01-15 15:41:58 +00:00
behzad nouri 5b5a3ebce8
adds metrics for num merkle shreds on the receiving end (#29710) 2023-01-14 23:07:42 +00:00
behzad nouri 9db25655f7
recovers merkle roots from shreds binary in {verify,sign}_shreds_gpu (#29445)
{verify,sign}_shreds_gpu need to point to offsets within the packets for
the signed data. For merkle shreds this signed data is the merkle root
of the erasure batch, and this would necessitate embedding the merkle
roots in the shreds payload.
However, this is wasteful and reduces shred capacity to store data
because the merkle root can already be recovered from the encoded merkle
proof.

Instead of pointing to offsets within the shreds payload, this commit
recovers merkle roots from the merkle proofs and stores them in an
allocated buffer. {verify,sign}_shreds_gpu would then point to offsets
within this new buffer for the respective signed data.

This would unblock us from removing merkle roots from shreds payload
which would save capacity to send more data with each shred.
2023-01-04 14:20:05 +00:00
behzad nouri 754ecf467b
generalizes the return type of Shred::get_signed_data (#29446)
The commit adds an associated SignedData type to the Shred trait so that
merkle and legacy shreds can return different types from the signed_data
method.
This would allow legacy shreds to point to a section of the shred
payload, whereas merkle shreds would compute and return the merkle root.
Ultimately this would allow removing the merkle root from the shreds
binary.
2022-12-31 17:08:25 +00:00
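A minimal sketch of the shape this gives the trait (names simplified; the actual trait and types in the repo differ in detail):
```
/// Sketch only: an associated SignedData type lets each shred kind choose
/// what it signs over, as long as it can be viewed as bytes.
trait Shred {
    type SignedData: AsRef<[u8]>;
    fn signed_data(&self) -> Self::SignedData;
}

struct LegacyShred { payload: Vec<u8> }
struct MerkleShred { merkle_root: [u8; 32] }

impl Shred for LegacyShred {
    // Stand-in for pointing at a section of the payload.
    type SignedData = Vec<u8>;
    fn signed_data(&self) -> Self::SignedData { self.payload.clone() }
}

impl Shred for MerkleShred {
    // The (re)computed merkle root of the erasure batch.
    type SignedData = [u8; 32];
    fn signed_data(&self) -> Self::SignedData { self.merkle_root }
}
```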
behzad nouri 50afb80f52
adds shred::layout::get_signed_data (#29438)
Working towards removing the merkle root from the shreds payload, the
commit implements an api to obtain signed data from the shreds binary.
2022-12-30 18:52:10 +00:00
behzad nouri db3d926633
implements shred::layout::get_merkle_root (#29437)
In preparation of removing merkle roots from the shreds binary, the commit
adds an api to recover the root from the merkle proof embedded in the
shreds payload.
2022-12-29 23:32:58 +00:00
behzad nouri 92ca725c6b
patches shred::merkle::make_shreds_from_data when data is empty (#29295)
If data is empty, make_shreds_from_data will now return one data shred
with empty data. This preserves invariants verified in tests regardless
of data size.
2022-12-16 18:56:21 +00:00
Jon Cinque b1340d77a2
sdk: Make Packet::meta private, use accessor functions (#29092)
sdk: Make packet meta private
2022-12-06 12:54:49 +01:00
behzad nouri f49beb0cbc
caches reed-solomon encoder/decoder instance (#27510)
ReedSolomon::new(...) initializes a matrix and a data-decode-matrix cache:
https://github.com/rust-rse/reed-solomon-erasure/blob/273ebbced/src/core.rs#L460-L466

In order to cache this computation, this commit caches the reed-solomon
encoder/decoder instance for each (data_shards, parity_shards) pair.
2022-09-25 18:09:47 +00:00
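A sketch of the caching idea, assuming the reed_solomon_erasure crate's galois_8::ReedSolomon type; the actual cache in the repo is bounded and shared across threads:
```
use std::{collections::HashMap, sync::Arc};
use reed_solomon_erasure::galois_8::ReedSolomon;

/// Reuse one ReedSolomon instance per (data_shards, parity_shards) pair so
/// the matrix setup cost is paid once per pair.
#[derive(Default)]
struct ReedSolomonCache {
    cache: HashMap<(usize, usize), Arc<ReedSolomon>>,
}

impl ReedSolomonCache {
    fn get(
        &mut self,
        data_shards: usize,
        parity_shards: usize,
    ) -> Result<Arc<ReedSolomon>, reed_solomon_erasure::Error> {
        if let Some(entry) = self.cache.get(&(data_shards, parity_shards)) {
            return Ok(entry.clone());
        }
        let entry = Arc::new(ReedSolomon::new(data_shards, parity_shards)?);
        self.cache.insert((data_shards, parity_shards), entry.clone());
        Ok(entry)
    }
}
```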
behzad nouri 7d3f3b2f7d generates merkle shreds from ledger entries
The commit adds methods to convert &[Entry] to a vector of Merkle shreds.
2022-09-23 16:45:18 +00:00
behzad nouri b183e00dcf
patches range check in shred::layout::get_signed_message_range (#27822) 2022-09-16 00:08:10 +00:00
Jeff Biseda d1522fc790
coalesce entries in recv_slot_entries to target byte count (#27321) 2022-08-25 13:51:55 -07:00
Brennan Watt e4a7d01e10
Rust v1.63 (#27303)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

* Update quinn-udp crate

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-22 18:01:03 -07:00
behzad nouri c0b63351ae
recovers merkle shreds from erasure codes (#27136)
The commit
* Identifies Merkle shreds when recovering from erasure codes and
  dispatches specialized code to reconstruct shreds.
* Coding shred headers are added to recovered erasure shards.
* Merkle tree is reconstructed for the erasure batch and added to
  recovered shreds.
* The common signature (for the root of Merkle tree) is attached to all
  recovered shreds.
2022-08-19 21:07:32 +00:00
Brennan Watt 7573000d87
Revert "Rust v1.63.0 (#27148)" (#27245)
This reverts commit a2e7bdf50a.
2022-08-19 09:19:44 +01:00
Brennan Watt a2e7bdf50a
Rust v1.63.0 (#27148)
* Upgrade to Rust v1.63.0

* Add nightly_clippy_allows

* Resolve some new clippy nightly lints

* Increase QUIC packets completion timeout

Co-authored-by: Michael Vines <mvines@gmail.com>
2022-08-17 15:48:33 -07:00
behzad nouri c813d75944
renames size_of_erasure_encoded_slice to ShredCode::capacity (#27157)
Maintain symmetry between code and data shreds:
* ShredData::capacity -> data buffer capacity
* ShredCode::capacity -> erasure code capacity
2022-08-16 12:30:38 +00:00
behzad nouri b3b57a0f07
adjusts max coding shreds per slot (#27083)
As a consequence of removing buffering when generating coding shreds:
https://github.com/solana-labs/solana/pull/25807
more coding shreds are generated than data shreds, and so
MAX_CODE_SHREDS_PER_SLOT needs to be adjusted accordingly.

The respective value is tied to ERASURE_BATCH_SIZE.
2022-08-12 18:02:01 +00:00
behzad nouri ac91cdab74
removes buffering when generating coding shreds in broadcast (#25807)
Given the 32:32 erasure recovery schema, the current implementation requires
exactly 32 data shreds to generate coding shreds for the batch (except
for the final erasure batch in each slot).
As a result, when serializing ledger entries to data shreds, if the
number of data shreds is not a multiple of 32, the coding shreds for the
last batch cannot be generated until there are more data shreds to
complete the batch to 32 data shreds. This adds latency in generating
and broadcasting coding shreds.

In addition, with Merkle variants for shreds, data shreds cannot be
signed and broadcasted until coding shreds are also generated. As a
result *both* code and data shreds will be delayed before broadcast if
we still require exactly 32 data shreds for each batch.

This commit instead always generates and broadcasts coding shreds as soon
as any number of data shreds is available. When serializing entries
to shreds:
* if the number of resulting data shreds is less than 32, then more
  coding shreds will be generated so that the resulting erasure batch
  has the same recovery probabilities as a 32:32 batch.
* if the number of data shreds is more than 32, then the data shreds are
  split uniformly into erasure batches with _at least_ 32 data shreds in
  each batch. Each erasure batch will have the same number of code and
  data shreds.

For example:
* If there are 19 data shreds, 27 coding shreds are generated. The
  resulting 19(data):27(code) erasure batch has the same recovery
  probabilities as a 32:32 batch.
* If there are 107 data shreds, they are split into 3 batches of 36:36,
  36:36 and 35:35 data:code shreds each.

A consequence of this change is that code and data shred indices will
no longer align, as there will be more coding shreds than data shreds
(not only in the last batch in each slot but also in the intermediate
ones).
2022-08-11 12:44:27 +00:00
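A small sketch of the uniform split arithmetic for the more-than-32 case described above (the under-32 case instead adds extra coding shreds):
```
/// Split num_data_shreds uniformly into num_data_shreds / 32 batches, each
/// with at least 32 data shreds; each batch gets the same number of coding
/// shreds as data shreds. Sketch only, not the repo's actual function.
fn split_into_batches(num_data_shreds: usize) -> Vec<usize> {
    assert!(num_data_shreds >= 32);
    let num_batches = num_data_shreds / 32;        // e.g. 107 / 32 == 3
    let base = num_data_shreds / num_batches;      // 107 / 3  == 35
    let remainder = num_data_shreds % num_batches; // 107 % 3  == 2
    // The first `remainder` batches get one extra shred: [36, 36, 35].
    (0..num_batches)
        .map(|i| base + usize::from(i < remainder))
        .collect()
}
```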
behzad nouri d3a14f5b30
simplifies packet/shred sanity checks (#26356) 2022-07-05 21:41:19 +00:00
steviez 54cd31e9e2
Reduce repeated discard_shreds checks (#26329) 2022-06-30 14:45:10 -05:00
behzad nouri 88599fd760
skips shreds deserialization before retransmit (#26230)
Fully deserializing shreds in window-service before sending them to the
retransmit stage adds latency to shred propagation.
This commit instead channels through the payload and relies on only
partial deserialization of a few required fields: slot, shred-index,
shred-type.
2022-06-30 12:13:00 +00:00
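An illustrative sketch of the partial-deserialization idea; the field offsets follow the layout described elsewhere in this log (64-byte signature, shred variant at byte 65) but should be treated as assumptions, not the canonical wire format:
```
/// Read only the fields retransmit needs from a raw shred payload.
/// Offsets and endianness here are illustrative assumptions.
fn partial_fields(payload: &[u8]) -> Option<(u8, u64, u32)> {
    let shred_variant = *payload.get(64)?; // shred type/variant byte
    let slot = u64::from_le_bytes(payload.get(65..73)?.try_into().ok()?);
    let index = u32::from_le_bytes(payload.get(73..77)?.try_into().ok()?);
    Some((shred_variant, slot, index))
}
```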
behzad nouri 348fe9ebe2
verifies shred slot and parent in fetch stage (#26225)
Shred slot and parent are not verified until window-service where
resources are already wasted to sig-verify and deserialize shreds.
This commit moves the above verification earlier in the pipeline, to the
fetch stage.
2022-06-28 12:45:50 +00:00
behzad nouri 67936aaa74
moves Shred::seed to ShredId and adds test coverage (#26251)
Following commits will skip shred deserialization before retransmit, and
so we will only have a ShredId and not a fully deserialized shred to
obtain the shuffling seed from.
2022-06-27 17:58:43 +00:00
behzad nouri 1f0f5dc03e verifies shred-version in fetch stage
Shred versions are not verified until window-service where resources are
already wasted to sig-verify and deserialize shreds.
The commit verifies the shred-version earlier in the pipeline, in the fetch stage.
2022-06-22 12:17:37 +00:00
behzad nouri 47e62add5b
removes feature gate code adding shred-type to shred seed (#25963)
The feature is already activated on all clusters, and does not impact
processing of ledger/snapshots.
2022-06-20 14:39:24 +00:00
behzad nouri 5f04512d3a
adds a new shred variant embedding merkle tree hashes of the erasure batch (#25237)
Coding shreds can only be signed once erasure codings are already
generated. Therefore coding shreds recovered from erasure codings lack
the slot leader's signature and so cannot be retransmitted to the rest of
the cluster.

shred/merkle.rs implements a new shred variant where we generate merkle
tree for each erasure encoded batch and each shred includes:
* root of the merkle tree (Hash truncated to 20 bytes).
* slot leader's signature of the root of the merkle tree.
* merkle tree nodes along the branch the shred belongs to, where hashes
  are trimmed to 20 bytes during tree construction.

This schema results in the same signature for all shreds within an
erasure batch.

When recovering shreds from erasure codes, we can reconstruct merkle
tree for the batch and for each recovered shred also recover respective
merkle tree branch; then snap the slot leader's signature from any of
the shreds received from turbine and retransmit all recovered code or
data shreds.

Backward compatibility is achieved by encoding the shred variant at byte 65
of the payload (previously the shred-type at this position):
* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.
* 0b0100_???? indicates a merkle coding shred with merkle branch size
  indicated by the last 4 bits.
* 0b1000_???? indicates a merkle data shred with merkle branch size
  indicated by the last 4 bits.

Merkle root and branch are encoded at the end of the shred payload.
2022-06-07 22:41:03 +00:00
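A sketch of decoding that variant byte per the bit patterns listed above (the actual parsing code in the repo differs in detail):
```
enum ShredVariant {
    LegacyCode,                     // 0b0101_1010
    LegacyData,                     // 0b1010_0101
    MerkleCode(/*proof_size:*/ u8), // 0b0100_????
    MerkleData(/*proof_size:*/ u8), // 0b1000_????
}

fn parse_shred_variant(byte: u8) -> Option<ShredVariant> {
    match byte {
        0b0101_1010 => Some(ShredVariant::LegacyCode),
        0b1010_0101 => Some(ShredVariant::LegacyData),
        // For merkle variants, the low 4 bits carry the merkle branch size.
        _ => match byte >> 4 {
            0b0100 => Some(ShredVariant::MerkleCode(byte & 0x0f)),
            0b1000 => Some(ShredVariant::MerkleData(byte & 0x0f)),
            _ => None,
        },
    }
}
```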
behzad nouri 5dbf7d8f91
removes raw indexing into packet data (#25554)
Packets are at the boundary of the system where, the vast majority of the
time, they are received from an untrusted source. Raw indexing into the
data buffer can open attack vectors if the offsets are invalid.
Validating offsets beforehand is verbose and error-prone.

The commit updates the Packet::data() api to take a SliceIndex and always
return an Option. Call-sites are thus forced to explicitly handle the
case where the offsets are invalid.
2022-06-03 01:05:06 +00:00
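A usage sketch of the Option-returning accessor, assuming solana_sdk's Packet type; the offset is illustrative:
```
use solana_sdk::packet::Packet;

/// Read a single byte from the packet; invalid offsets yield None instead of
/// panicking. A range index works too, e.g. packet.data(..64).
fn read_byte_at(packet: &Packet, offset: usize) -> Option<u8> {
    packet.data(offset).copied()
}
```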
behzad nouri 81231a89b9 adds support for different variants of ShredCode and ShredData
The commit implements two new types:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
    }

Following commits will extend these types by adding merkle variants:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
        Merkle(merkle::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
        Merkle(merkle::ShredData),
    }
2022-06-02 18:55:50 +00:00
behzad nouri a913068512 embeds versioning into shred binary
In preparation of
https://github.com/solana-labs/solana/pull/25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into the shred binary by encoding a new ShredVariant type
at byte 65 of the payload, replacing the previous ShredType at this offset.
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
    }

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type:
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }
2022-06-02 18:55:50 +00:00
behzad nouri de612c25b3
removes shred wire layout specs from sigverify (#25520)
sigverify_shreds relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
https://github.com/solana-labs/solana/pull/25237
which adds a new shred variant with different layout and signed message,
this commit removes the shred layout specification from sigverify and
instead encapsulates it in the shred module.
2022-05-26 13:06:27 +00:00
behzad nouri cafa85bfbb
includes shred-type when computing turbine broadcast seed (#25556)
Indices for code and data shreds of the same slot overlap, and so they
will have the same random number generator seed when shuffling cluster
nodes for turbine broadcast.

This results in the same propagation path for code and data shreds of
the same index and an effectively smaller sample size for re-transmitter
nodes. For example, a 32:32 batch (32 code + 32 data shreds) is
retransmitted through _at most_ 32 unique nodes, whereas ideally we want
~64 unique re-transmitters.

This commit adds the shred-type to the seed function so that code and data
shreds of the same (slot, index) will (most likely) have different
propagation paths.
2022-05-25 20:31:53 +00:00
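A sketch of folding the shred type into the shuffle seed; the field order and exact inputs are illustrative assumptions, not the repo's layout:
```
use solana_sdk::hash::hashv;

/// Derive a turbine shuffle seed that differs for code vs data shreds of
/// the same (slot, index). Sketch only.
fn shuffle_seed(slot: u64, index: u32, shred_type: u8, leader: &[u8; 32]) -> [u8; 32] {
    hashv(&[
        &slot.to_le_bytes(),
        &[shred_type], // the new ingredient added by this commit
        &index.to_le_bytes(),
        leader,
    ])
    .to_bytes()
}
```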
behzad nouri 880684565c
limits read access into Packet data to Packet.meta.size (#25484)
Bytes past Packet.meta.size are not valid to read from.

The commit makes the buffer field private and instead provides two
methods:
* Packet::data() which returns an immutable reference to the underlying
  buffer up to Packet.meta.size. The rest of the buffer is not valid to
  read from.
* Packet::buffer_mut() which returns a mutable reference to the entirety
  of the underlying buffer to write into. The caller is responsible for
  updating Packet.meta.size after writing to the buffer.
2022-05-25 16:52:54 +00:00
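A write-then-read usage sketch implied by the new accessors, assuming solana_sdk's Packet with a public meta field as described above:
```
use solana_sdk::packet::Packet;

/// Write `bytes` into the packet and record how much of the buffer is valid.
/// Only the first meta.size bytes are then valid to read back.
fn fill_packet(packet: &mut Packet, bytes: &[u8]) {
    packet.buffer_mut()[..bytes.len()].copy_from_slice(bytes);
    packet.meta.size = bytes.len();
}
```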
Justin Starry cad1c41ce2 Add Packet::deserialize_slice convenience method 2022-05-24 17:31:14 +08:00
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
behzad nouri be1d606dea adds sanity checks to Shred::reference_tick_from_data
Shred::reference_tick_from_data should check if payload is indeed a data
shred and has valid size.
2022-05-18 21:56:22 +00:00
behzad nouri e2bbc3913d separates out data vs code shreds at the type level
Working towards revising the shred struct to embed versioning so that a new
variant can contain merkle tree hashes of the erasure batch. To ease the
migration, the commit adds more type-safety by distinguishing data vs
code shreds at the type level.

Additionally having both data and coding headers in each shred is
redundant as only one is relevant for each shred. The revised shred type
in this commit will only have one type-specific header.
https://github.com/solana-labs/solana/blob/c785f1ffc/ledger/src/shred.rs#L198-L203
2022-05-18 21:56:22 +00:00
behzad nouri 9b13b1b712
adds const_assert_eq for shred constants (#25288)
Adding const_assert_eq:
* Documents explicitly what the constants are equal to.
* Prevents introducing bugs by silently changing the constants as the
  code is updated.
2022-05-17 22:44:35 +00:00
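A tiny sketch of the pattern, assuming the static_assertions crate; the constant name and value here are illustrative:
```
use static_assertions::const_assert_eq;

// Spell out what the derived constant is expected to equal; the build fails
// if a later edit silently changes it.
const SIZE_OF_CODING_SHRED_HEADERS: usize = 83 + 6;
const_assert_eq!(SIZE_OF_CODING_SHRED_HEADERS, 89);
```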
behzad nouri 9587c8537f
limits pre-allocation size when deserializing shreds (#24921)
Though the current Shred struct is not vulnerable to this, adding with_limit
limits the pre-allocation size to protect against memory-exhaustion
attacks:
https://github.com/bincode-org/bincode/blob/2d3f42034/readme.md
2022-05-03 12:13:45 +00:00
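A sketch of the with_limit usage via bincode 1.x's Options API; the encoding options and limit value are assumptions:
```
use bincode::Options;

/// Deserialize with a cap on pre-allocation, guarding against
/// memory-exhaustion from attacker-controlled length prefixes.
fn deserialize_limited<T: serde::de::DeserializeOwned>(
    bytes: &[u8],
    limit: u64,
) -> bincode::Result<T> {
    bincode::options()
        .with_limit(limit)
        .with_fixint_encoding()
        .allow_trailing_bytes()
        .deserialize(bytes)
}
```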