Commit Graph

3079 Commits

Author SHA1 Message Date
behzad nouri eacb9183d4
patches bug where the 1st coding shred is not inserted into blockstore (#25916)
StandardBroadcastRun::insert skips 1st shred with index zero because
the 1st *data* shred is inserted synchronously:
https://github.com/solana-labs/solana/blob/53695ecd2/core/src/broadcast_stage/standard_broadcast_run.rs#L239-L246
https://github.com/solana-labs/solana/blob/53695ecd2/core/src/broadcast_stage/standard_broadcast_run.rs#L334-L339

https://github.com/solana-labs/solana/pull/7481
which added this code was not inserting coding shreds into blockstore.
Starting with
https://github.com/solana-labs/solana/pull/8899
coding shreds are inserted into blockstore as well as data shreds, but
the insert logic erroneously skips the first coding shred because it does
not check whether the shred is code or data.
2022-06-16 13:59:15 +00:00
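
The bug described in #25916 can be illustrated with a minimal sketch. The types and function names below are hypothetical simplifications, not the real blockstore or broadcast-stage API; the only point is that the skip condition has to look at the shred type as well as the index.

```rust
// Hypothetical, simplified stand-ins for the real shred types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ShredType {
    Data,
    Code,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Shred {
    shred_type: ShredType,
    index: u32,
}

/// Buggy filter: drops every shred with index 0, whether data or code.
fn shreds_to_insert_buggy(shreds: &[Shred]) -> Vec<Shred> {
    shreds.iter().copied().filter(|s| s.index != 0).collect()
}

/// Fixed filter: only the first *data* shred was already inserted
/// synchronously, so only that one is skipped.
fn shreds_to_insert_fixed(shreds: &[Shred]) -> Vec<Shred> {
    shreds
        .iter()
        .copied()
        .filter(|s| !(s.shred_type == ShredType::Data && s.index == 0))
        .collect()
}
```

With one batch holding data and coding shreds at indices 0 and 1, the buggy filter keeps two shreds while the fixed filter keeps three, including the coding shred with index zero.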
behzad nouri fe3c1d3d49
removes erroneous uses of &Arc<...> from broadcast-stage (#25962) 2022-06-15 13:44:24 +00:00
Brian Anderson db9004bd0f
Fix doc warnings (#25953) 2022-06-14 21:55:08 -06:00
Tao Zhu c96d9d127a
Include forwarding counters in leader slot metrics (#25874)
* To include forwarding counters in leader slot metrics

* Capture slot_end_detected time when checking leader slots, to be used in reporting later

* Simplify banking stage loop to report leader slot metrics

Co-authored-by: carllin <carl@solana.com>
2022-06-13 17:03:34 -05:00
Michael Vines ace24a7c82 A default tower is no longer considered to contain a stray last vote 2022-06-10 14:17:26 -07:00
Lijun Wang 29b597cea5
Connection pool support in connection cache and QUIC connection reliability improvement (#25793)
* Connection pool in connection cache and handle connection errors

1. The connection cache now has a pool of connections per address; the pool size is configurable and defaults to 4
2. The connections per address share a lazily initialized endpoint
3. Handle connection issues better and avoid race conditions
4. Various log improvements to help debug connection issues
2022-06-10 09:25:24 -07:00
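
A rough sketch of the pooling idea from #25793, assuming hypothetical types throughout: `Endpoint`, `Connection`, `ConnectionPool` and this `ConnectionCache` are placeholders, not the real client API. The round-robin selection is also an assumption; the commit message does not say how a connection is picked from the pool.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;

// Default pool size taken from the commit message.
const DEFAULT_CONNECTION_POOL_SIZE: usize = 4;

// Placeholder for a QUIC endpoint (hypothetical).
struct Endpoint;

// Placeholder for one connection to a peer (hypothetical).
struct Connection {
    addr: SocketAddr,
    endpoint: Arc<Endpoint>,
}

struct ConnectionPool {
    // Created on first use and shared by all connections in this pool.
    endpoint: Option<Arc<Endpoint>>,
    connections: Vec<Arc<Connection>>,
    next: usize,
}

struct ConnectionCache {
    pool_size: usize,
    pools: HashMap<SocketAddr, ConnectionPool>,
}

impl ConnectionCache {
    fn new(pool_size: usize) -> Self {
        Self { pool_size: pool_size.max(1), pools: HashMap::new() }
    }

    /// Returns a connection for `addr`, growing the pool up to `pool_size`
    /// and cycling through the existing connections after that.
    fn get_connection(&mut self, addr: SocketAddr) -> Arc<Connection> {
        let pool_size = self.pool_size;
        let pool = self.pools.entry(addr).or_insert_with(|| ConnectionPool {
            endpoint: None,
            connections: Vec::new(),
            next: 0,
        });
        if pool.connections.len() < pool_size {
            let endpoint: &Arc<Endpoint> =
                pool.endpoint.get_or_insert_with(|| Arc::new(Endpoint));
            let endpoint = Arc::clone(endpoint);
            let connection = Arc::new(Connection { addr, endpoint });
            pool.connections.push(Arc::clone(&connection));
            return connection;
        }
        let connection = Arc::clone(&pool.connections[pool.next % pool.connections.len()]);
        pool.next = pool.next.wrapping_add(1);
        connection
    }
}
```

Requesting six connections for the same address from a cache built with DEFAULT_CONNECTION_POOL_SIZE creates four distinct connections that share one endpoint; the fifth and sixth requests reuse the first and second.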
Yueh-Hsuan Chiang ee4469c882
Skip compaction in backup_and_clear_blockstore (#25810)
#### Problem
blockstore clean and compact is quite slow with wait-for-supermajority purge and can take 20-30 minutes
as described in #25710.

#### Summary of Changes
This PR removes the compaction logic in backup_and_clear_blockstore, as the
actual restoration from a bad fork is handled by `blockstore.purge_slots`
(which issues a rocksdb range-delete that makes the bad fork
unavailable).

Compaction is irrelevant to the shred version, as its main job in this context
is to reclaim disk storage from the deleted slots, which we can leave to the
rocksdb automatic background compaction.

Fixes #25710
2022-06-09 17:11:50 +08:00
carllin bf8faa8a30
Report banking stage tracer metrics (#25620) 2022-06-09 00:25:37 -05:00
Jon Cinque 79a8ecd0ac
client: Remove static connection cache, plumb it instead (#25667)
* client: Remove static connection cache, plumb it instead

* Add TpuClient::new_with_connection_cache to not break downstream

* Refactor get_connection and RwLock into ConnectionCache

* Fix merge conflicts from new async TpuClient

* Remove `ConnectionCache::set_use_quic`

* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
2022-06-08 13:57:12 +02:00
behzad nouri 6c9f2eac78
removes fec_set_offset from UnfinishedSlotInfo (#25815)
If the blockstore has shreds for a slot, it should not recreate the
slot:
https://github.com/solana-labs/solana/blob/ff68bf6c2/ledger/src/leader_schedule_cache.rs#L142-L146
https://github.com/solana-labs/solana/pull/15849/files#r596657314

Therefore in broadcast stage if UnfinishedSlotInfo is None, then
fec_set_offset will be zero:
https://github.com/solana-labs/solana/blob/ff68bf6c2/core/src/broadcast_stage/standard_broadcast_run.rs#L111-L120

As a result, fec_set_offset will always be zero, so it is redundant and
can be removed.
2022-06-07 22:17:37 +00:00
Brennan Watt ba04063956
Add CPU metrics (#25802)
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
2022-06-07 11:34:25 -07:00
apfitzge e6c21a3036
Convert Measure::this to measure! and remove Measure::this (#25776)
* Remove the args param from Measure::this since we don't ever use it

* banking_stage.rs: convert to measure!

* poh_recorder.rs: convert to measure!

* cost_update_service.rs: convert to measure!

* poh_service.rs: convert to measure!

* bank.rs: convert to measure!

* measure.rs: Remove Measure::this now that all have been converted to measure!
2022-06-06 20:21:05 -05:00
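
For readers unfamiliar with the pattern in #25776, a timing macro of this kind might look roughly like the sketch below. This is a hypothetical simplification, not the real measure! definition: it returns the expression's value together with a std::time::Duration, and the real macro may return a different timing type.

```rust
// Hypothetical, simplified timing macro: evaluates the expression once
// and returns (value, elapsed wall-clock time).
macro_rules! measure {
    ($expr:expr) => {{
        let start = std::time::Instant::now();
        let result = $expr;
        (result, start.elapsed())
    }};
}

/// Example call site: times a small computation and returns only the value.
fn sum_to_ten() -> u64 {
    let (sum, _elapsed) = measure!((1..=10u64).sum::<u64>());
    sum
}
```

Because the macro wraps an arbitrary expression, a call site needs no closure and no separate argument parameter, which is consistent with the first bullet of the commit.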
sakridge 447a3239e7
Add new replay metrics for replay blockstore_into_bank and complete (#25717) 2022-06-03 19:45:27 +02:00
behzad nouri 5dbf7d8f91
removes raw indexing into packet data (#25554)
Packets are at the boundary of the system where, the vast majority of the
time, they are received from an untrusted source. Raw indexing into the
data buffer can open attack vectors if the offsets are invalid.
Validating offsets beforehand is verbose and error prone.

The commit updates the Packet::data() API to take a SliceIndex and to always
return an Option. The call sites are thus forced to explicitly handle the
case where the offsets are invalid.
2022-06-03 01:05:06 +00:00
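
The API shape described in #25554 can be sketched as follows. The `Packet` struct here is a hypothetical simplification (a plain `size` field stands in for Packet.meta.size, and `from_bytes` is an illustrative helper); only the idea of a SliceIndex parameter with an Option return comes from the commit message.

```rust
use std::slice::SliceIndex;

// Assumed buffer capacity for this sketch.
const PACKET_DATA_SIZE: usize = 1232;

struct Packet {
    buffer: [u8; PACKET_DATA_SIZE],
    // Number of valid bytes; stands in for Packet.meta.size.
    size: usize,
}

impl Packet {
    /// Illustrative constructor; returns None if `bytes` does not fit.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        let mut buffer = [0u8; PACKET_DATA_SIZE];
        buffer.get_mut(..bytes.len())?.copy_from_slice(bytes);
        Some(Self { buffer, size: bytes.len() })
    }

    /// Checked access into the valid part of the buffer. Any index or
    /// range that falls outside `..size` yields None instead of panicking.
    fn data<I>(&self, index: I) -> Option<&<I as SliceIndex<[u8]>>::Output>
    where
        I: SliceIndex<[u8]>,
    {
        self.buffer.get(..self.size)?.get(index)
    }
}
```

A caller reading, say, bytes 2..10 of a four-byte packet gets None and has to handle it, where raw indexing would have panicked or, with a larger valid buffer, silently read stale bytes.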
behzad nouri 81231a89b9 adds support for different variants of ShredCode and ShredData
The commit implements two new types:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
    }

Following commits will extend these types by adding merkle variants:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
        Merkle(merkle::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
        Merkle(merkle::ShredData),
    }
2022-06-02 18:55:50 +00:00
Pankaj Garg 1c2ae470c5
Fix forwarding of transactions over QUIC (#25674)
* Spawn QUIC server to receive forwarded txs

* Update validator port range

* forward votes using UDP

* no forwarding from unstaked nodes

* forwarding stats in banking stage

* fix test builds

* fix lifetime of forward sender
2022-06-02 11:14:58 -07:00
HaoranYi d3ac4e941b
Bench: preshrink + sigverify (#25480)
* double shrinking

* add bench

* rename

* aggregate timing

* remove pre/post shrink time

* update api after merge
2022-06-02 09:19:01 -05:00
Tao Zhu 51ac599915
Add user requested CU (eg. compute_budget.compute_unit_limit) to immutable_deserialized_packet, to be used in cost model and prioritized forwarding (#25695) 2022-06-01 22:43:48 +00:00
Ryo Onodera aedcb05dc8
Record solana-validator ver to metrics at startup (#25635)
* Record solana-validator ver to metrics at startup

* Update Cargo.lock
2022-06-01 13:37:50 +09:00
Christian Kamm 02b26ddd82
SigVerify: Fix num_valid_packets metric (#25643)
It used to report the number of packets with successful signature
validations but was accidentally changed to count packets passed into
the verifier by e4409a87fe.

This restores the previous meaning.
2022-05-31 18:51:20 +10:00
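
The difference between the two meanings of the metric in #25643 can be sketched like this. The `Packet` struct and its `discard` flag are assumptions made for illustration; the sketch only contrasts counting everything passed to the verifier with counting what survived verification.

```rust
// Hypothetical packet with a flag set when signature verification fails.
struct Packet {
    discard: bool,
}

/// Counts every packet handed to the verifier (the accidental meaning).
fn num_packets_received(batches: &[Vec<Packet>]) -> usize {
    batches.iter().map(Vec::len).sum()
}

/// Counts only packets that were not discarded (the restored meaning).
fn num_valid_packets(batches: &[Vec<Packet>]) -> usize {
    batches.iter().flatten().filter(|p| !p.discard).count()
}
```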
carllin 90a3315b69
Detect tracer key in sigverify (#25579)
* Mark the tracer transaction

* simplify tracer check
2022-05-30 18:41:54 -05:00
Justin Starry e4409a87fe
Add pre shrink pass before sigverify batch (#25136) 2022-05-28 01:51:55 +10:00
Yueh-Hsuan Chiang 5b67960c76
(Refactor) Move blockstore options related stuff to blockstore_options.rs (#25509)
#### Problem
blockstore_db.rs has a mutual dependency with blockstore_metrics.rs.

#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.

By doing this, we address the mutual dependency and also make the code cleaner.
2022-05-26 16:59:26 -07:00
dependabot[bot] 7f4128947b
chore: bump lru from 0.7.5 to 0.7.6 (#25572)
* chore: bump lru from 0.7.5 to 0.7.6

Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.7.5 to 0.7.6.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.7.5...0.7.6)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-26 19:05:02 +00:00
ryleung-solana 1ca5c3a7bd
Switch to using enum-dispatch to switch between UDP and Quic (#24713) 2022-05-26 11:21:16 -04:00
behzad nouri de612c25b3
removes shred wire layout specs from sigverify (#25520)
sigverify_shreds relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation for
https://github.com/solana-labs/solana/pull/25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulates that in shred module.
2022-05-26 13:06:27 +00:00
Christian Kamm 0efb7478cd
FindPacketSenderStake: Remove parallelism to improve performance (#25562)
* FindPacketSenderStake: Remove parallelism to improve performance

The work unit sizes were so small that using the thread pool
slowed down this stage significantly.

* fix checks

Co-authored-by: Justin Starry <justin@solana.com>
2022-05-26 21:17:52 +10:00
behzad nouri cafa85bfbb
includes shred-type when computing turbine broadcast seed (#25556)
Indices for code and data shreds of the same slot overlap; and so they
will have the same random number generator seed when shuffling cluster
nodes for turbine broadcast.

This results in the same propagation path for code and data shreds of
the same index and an effectively smaller sample size for re-transmitter
nodes. For example, a 32:32 batch (32 code + 32 data shreds) is
retransmitted through _at most_ 32 unique nodes, whereas ideally we want
~64 unique re-transmitters.

This commit adds shred-type to the seed function so that code and data
shreds of the same (slot, index) will (most likely) have different
propagation paths.
2022-05-25 20:31:53 +00:00
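
The seed collision described in #25556 can be demonstrated with a small sketch. The function names are hypothetical, and std's DefaultHasher is used purely for illustration; the actual seed derivation used for turbine is not shown in the commit message.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
enum ShredType {
    Data,
    Code,
}

/// Old behaviour: the seed depends only on (slot, index), so a code and a
/// data shred with the same index shuffle the cluster nodes identically.
fn seed_without_type(slot: u64, index: u32) -> u64 {
    let mut hasher = DefaultHasher::new();
    (slot, index).hash(&mut hasher);
    hasher.finish()
}

/// New behaviour: the shred type is mixed into the seed as well.
fn seed_with_type(slot: u64, index: u32, shred_type: ShredType) -> u64 {
    let mut hasher = DefaultHasher::new();
    (slot, index, shred_type).hash(&mut hasher);
    hasher.finish()
}
```

With the type included, a data shred and a code shred at the same (slot, index) get different seeds barring a hash collision, which matches the "(most likely)" wording in the commit message.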
behzad nouri 880684565c
limits read access into Packet data to Packet.meta.size (#25484)
Bytes past Packet.meta.size are not valid to read from.

The commit makes the buffer field private and instead provides two
methods:
* Packet::data() which returns an immutable reference to the underlying
  buffer up to Packet.meta.size. The rest of the buffer is not valid to
  read from.
* Packet::buffer_mut() which returns a mutable reference to the entirety
  of the underlying buffer to write into. The caller is responsible for
  updating Packet.meta.size after writing to the buffer.
2022-05-25 16:52:54 +00:00
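
The accessor pair introduced in #25484 might look roughly like the following. This is a sketch with hypothetical, simplified `Packet` and `Meta` structs; the buffer capacity and the clamping of `meta.size` are assumptions made so the example cannot panic.

```rust
// Assumed buffer capacity for this sketch.
const PACKET_DATA_SIZE: usize = 1232;

#[derive(Default)]
struct Meta {
    size: usize,
}

struct Packet {
    // Private: readers go through data(), writers through buffer_mut().
    buffer: [u8; PACKET_DATA_SIZE],
    meta: Meta,
}

impl Packet {
    fn new() -> Self {
        Self { buffer: [0u8; PACKET_DATA_SIZE], meta: Meta::default() }
    }

    /// Immutable view of the valid bytes only, that is, up to meta.size.
    fn data(&self) -> &[u8] {
        let size = self.meta.size.min(PACKET_DATA_SIZE);
        &self.buffer[..size]
    }

    /// Mutable view of the whole buffer. The caller must update meta.size
    /// after writing.
    fn buffer_mut(&mut self) -> &mut [u8] {
        &mut self.buffer[..]
    }
}
```

Bytes written through buffer_mut() stay invisible to data() until meta.size is updated, which is the contract the commit message describes.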
carllin 9651cdad99
Refactor Sigverify trait (#25359) 2022-05-24 16:01:41 -05:00
Jeff Biseda 61c5a471e8
preserve optimistic_slot in blockstore (#25311) 2022-05-24 12:03:28 -07:00
Justin Starry e66ea7cb6a Clean up Bank::commit_transactions parameters 2022-05-24 20:24:42 +08:00
Justin Starry cad1c41ce2 Add Packet::deserialize_slice convenience method 2022-05-24 17:31:14 +08:00
Tyera Eulberg 514f73f4b1
Remove retain_mut dep (#25494) 2022-05-23 21:45:49 +00:00
steviez ec7ca411dd
Make PacketBatch packets vector non-public (#25413)
Upcoming changes to PacketBatch to support variable sized packets will
modify the internals of PacketBatch. So, this change removes usage of
the internal packet struct and instead uses accessors (which are
currently just wrappers of Vector functions but will change down the
road).
2022-05-23 15:30:15 -05:00
dependabot[bot] e75569e85a
chore: bump systemstat from 0.1.10 to 0.1.11 (#25471)
Bumps [systemstat](https://github.com/unrelentingtech/systemstat) from 0.1.10 to 0.1.11.
- [Release notes](https://github.com/unrelentingtech/systemstat/releases)
- [Commits](https://github.com/unrelentingtech/systemstat/compare/v0.1.10...v0.1.11)

---
updated-dependencies:
- dependency-name: systemstat
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-23 20:08:47 +00:00
Christian Kamm 6429aff13b
findpacketsenderstake: add discard after receive (#25458)
This mimics a similar change in sigverify, see #25388
2022-05-23 21:27:20 +02:00
behzad nouri c248fb3f51
renames Packet Meta::{,set_}addr methods to {,set_}socket_addr (#25478)
This distinguishes the Meta.addr field, which is an IpAddr, from
the methods, which refer to a SocketAddr.
2022-05-23 15:48:59 +00:00
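
The naming distinction in #25478 can be sketched as below. The `Meta` struct is a hypothetical simplification: the commit message only states that Meta.addr is an IpAddr, and the `port` field is an assumption added so that a SocketAddr can be formed.

```rust
use std::net::{IpAddr, SocketAddr};

struct Meta {
    // IP address only, per the commit message.
    addr: IpAddr,
    // Assumed companion field for this sketch.
    port: u16,
}

impl Meta {
    /// Combines the IP address and port into a SocketAddr.
    fn socket_addr(&self) -> SocketAddr {
        SocketAddr::new(self.addr, self.port)
    }

    /// Splits a SocketAddr back into the stored IP address and port.
    fn set_socket_addr(&mut self, socket_addr: &SocketAddr) {
        self.addr = socket_addr.ip();
        self.port = socket_addr.port();
    }
}
```

With the old names, `meta.addr` (a field) and `meta.addr()` (a method) would have returned different types; the `socket_addr` names make the difference visible at the call site.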
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
sakridge e22be02d3a
sigverify: add discard before dedup (#25388) 2022-05-23 03:40:33 +02:00
Brooks Prumo f8842032c6
clippy: fix "this let-binding has unit value" warnings (#25429) 2022-05-22 12:17:59 -04:00
Pankaj Garg 7fb0ef1fa5
Use async send for forwarding transactions (#25435) 2022-05-20 21:20:47 -07:00
Jeff Biseda e263be2000
handle start_http failure in rpc_service (#25400) 2022-05-20 17:59:23 -07:00
Brennan Watt e025376719
Fix packet accounting after dedup (#25357)
* Fix packet accounting after dedup
* Rename function to better represent intent
2022-05-20 17:00:13 -07:00
Brennan Watt 2fdc850176
Use Shared IP to Stake Map (#25377)
* Find packet sender stake stage uses a shared IP-to-stake map
2022-05-20 12:51:07 -07:00
Michael Vines c54e06355f
voteSubscribe pubsub notification now includes the vote transaction signature (#25291) 2022-05-19 18:28:46 -07:00
Michael Vines 97efbdc303
Defer tower saving until push_vote(), there's no need to do it sooner (#25374) 2022-05-19 18:27:58 -07:00
buffalu 971748b335
fix banking stage starvation (#25245) 2022-05-18 22:37:47 +02:00
dependabot[bot] 7402878628
chore: bump raptorq from 1.6.5 to 1.7.0 (#25330)
Bumps [raptorq](https://github.com/cberner/raptorq) from 1.6.5 to 1.7.0.
- [Release notes](https://github.com/cberner/raptorq/releases)
- [Commits](https://github.com/cberner/raptorq/compare/v1.6.5...v1.7.0)

---
updated-dependencies:
- dependency-name: raptorq
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-18 17:56:38 +00:00
dependabot[bot] 542bd0ec3c
chore: bump rayon from 1.5.2 to 1.5.3 (#25242)
* chore: bump rayon from 1.5.2 to 1.5.3

Bumps [rayon](https://github.com/rayon-rs/rayon) from 1.5.2 to 1.5.3.
- [Release notes](https://github.com/rayon-rs/rayon/releases)
- [Changelog](https://github.com/rayon-rs/rayon/blob/master/RELEASES.md)
- [Commits](https://github.com/rayon-rs/rayon/compare/v1.5.2...v1.5.3)

---
updated-dependencies:
- dependency-name: rayon
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
2022-05-18 09:39:57 -06:00