RetransmitSlotStats can already be utilized to track when the first
shred for a slot was received; therefore
first_shreds_received: &Mutex<BTreeSet<Slot>>
is redundant. Sending update notifications after shreds retransmit will
also bypass the need for a mutex.
* To include forwarding counters in leader slot metrics
* Capture slot_end_detected time when checking leader slots, to be used in reporting later
* Simplify banking stage loop to report leader slot metrics
Co-authored-by: carllin <carl@solana.com>
* Connection pool in connection cache and handle connection errors
1. The connection not has a pool of connections per address, configurable, default 4
2. The connections per address share a lazy initialized endpoint
3. Handle connection issues better, avoid race conditions
4. Various log improvement for help debug connection issues
#### Problem
blockstore clean and compact is quite slow with wait-for-supermajority purge and can take 20-30 minutes
as described in #25710.
#### Summary of Changes
This PR removes the compaction logic in backup_and_clear_blockstore as the
actual the restoration from a bad fork is handled by `blockstore.purge_slots`
(which is done by issuing rocksdb range-delete that makes the bad fork
unavailable.)
Compaction is irreverent to the shred version, as its main job in this context
is to reclaim disk storage from the deleted slots, which we can let the rocksdb
automatic background compaction to handle it.
Fixes#25710
* client: Remove static connection cache, plumb it instead
* Add TpuClient::new_with_connection_cache to not break downstream
* Refactor get_connection and RwLock into ConnectionCache
* Fix merge conflicts from new async TpuClient
* Remove `ConnectionCache::set_use_quic`
* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
* Remove the args param from Measure::this since we don't ever use it
* banking_stage.rs: convert to measure!
* poh_recorder.rs: convert to measure!
* cost_update_service.rs: convert to measure!
* poh_service.rs: convert to measure!
* bank.rs: convert to measure!
* measure.rs: Remove Measure::this now that all have been converted to measure!
Packets are at the boundary of the system where, vast majority of the
time, they are received from an untrusted source. Raw indexing into the
data buffer can open attack vectors if the offsets are invalid.
Validating offsets beforehand is verbose and error prone.
The commit updates Packet::data() api to take a SliceIndex and always to
return an Option. The call-sites are so forced to explicitly handle the
case where the offsets are invalid.
* Spawn QUIC server to receive forwarded txs
* Update validator port range
* forward votes using UDP
* no forwarding from unstaked nodes
* forwarding stats in banking stage
* fix test builds
* fix lifetime of forward sender
It used to report the number of packets with successful signature
validations but was accidentally changed to count packets passed into
the verifier by e4409a87fe.
This restores the previous meaning.
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.
#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.
By doing this, we address the mutual dependency and also make the code cleaner.
* FindPacketSenderStake: Remove parallelism to improve performance
The work unit sizes were so small that using the thread pool
slowed down this stage significantly.
* fix checks
Co-authored-by: Justin Starry <justin@solana.com>
Indices for code and data shreds of the same slot overlap; and so they
will have the same random number generator seed when shuffling cluster
nodes for turbine broadcast.
This results in the same propagation path for code and data shreds of
the same index and effectively smaller sample size for re-transmitter
nodes. For example a 32:32 batch (32 code + 32 data shreds), is
retransmitted through _at most_ 32 unique nodes, whereas ideally we want
~64 unique re-transmitters.
This commit adds shred-type to seed function so that code and data
sherds of the same (slot, index) will (most likely) have different
propagation paths.
Bytes past Packet.meta.size are not valid to read from.
The commit makes the buffer field private and instead provides two
methods:
* Packet::data() which returns an immutable reference to the underlying
buffer up to Packet.meta.size. The rest of the buffer is not valid to
read from.
* Packet::buffer_mut() which returns a mutable reference to the entirety
of the underlying buffer to write into. The caller is responsible to
update Packet.meta.size after writing to the buffer.
Upcoming changes to PacketBatch to support variable sized packets will
modify the internals of PacketBatch. So, this change removes usage of
the internal packet struct and instead uses accessors (which are
currently just wrappers of Vector functions but will change down the
road).