solana

Commit Graph

Author	SHA1	Message	Date
behzad nouri	d54b6204be	removes instances of clippy::manual_let_else (#32417 )	2023-07-09 21:41:36 +00:00
steviez	5feebd2dc8	ledger-tool: Manually walk optimistic slots ancestors (#32362 ) If a slot is marked as optimistically confirmed, it is probable but not guaranteed that its ancestors will also be marked as optimistically confirmed in the Blockstore. Given the importance of examining optimistically confirmed slots around cluster restarts, manually walk an AncestorIterator to avoid the chance of a slot improperly being ignored in cluster restart scenarios.	2023-07-05 10:18:15 -04:00
steviez	d5ad29d837	Make Blockstore::scan_and_fix_roots() take optional start/stop slots (#32289 ) The optional args allow reuse by the ledger-tool repair roots command. Also, hold cleanup lock for duration of Blockstore::scan_and_fix_roots(). This prevents a scenario where scan_and_fix_roots() could identify a slot as needing to be marked root, that slot getting cleaned by LedgerCleanupService, and then scan_and_fix_roots() marking the slot as root on the now purged slot.	2023-06-28 22:32:03 -05:00
steviez	6b013f46eb	Add comments and unit test for Blockstore::scan_and_fix_roots() (#32283 )	2023-06-27 10:39:23 -05:00
Alexander Meißner	ee2c2ef6c7	Cleanup - require_static_program_ids_in_transaction (#31767 ) require_static_program_ids_in_transaction	2023-06-07 17:12:41 +02:00
Ashwin Sekar	3e8f5bad81	refactor: highest_cluster_confirmed_root -> highest_super_majority_root (#31619 )	2023-05-14 00:42:03 -07:00
Ashwin Sekar	ef75f1cb4e	Add ancestor hashes to state machine (#31627 ) * Notify replay of pruned duplicate confirmed slots * Ingest replay signal and run ancestor hashes for pruned * Forward PDC to ancestor hashes and ingest pruned dumps from ancestor hashes service * Add local-cluster test	2023-05-13 02:05:44 -07:00
steviez	0eec1ad57f	chore: Update Blockstore::get_slots_since() doc comments (#31134 ) Additionally, change the function definition to use Slot instead of u64. Slot is defined as an alias of u64 so these are functionally equivalent, but using Slot over u64 is more expressive of the intent.	2023-04-11 02:39:31 -05:00
steviez	814de50f2a	chore: Variable rename `height` ==> `slot` in blockstore function (#31132 ) Slots refer to time windows where a block could be produced whereas height refers to how many blocks are actually in a fork. This function is operating on a list of slots, so the use of "height" is incorrect and misleading.	2023-04-11 02:38:20 -05:00
Illia Bobyr	a1149ecafe	make_slot_entries(): Use unique hashes (#30627 )	2023-03-30 19:36:30 -07:00
steviez	8db35eb7e5	ledger-tool: Add flag to find non-vote optimistic slots (#30580 ) In cluster restart scenarios, it is desirable to know the latest optimistic slot(s), which the subcommand latest-optimistic-slots will return. However, it is also useful to know whether slots contain only votes or if they contain votes and user transactions. This PR adds an extra column of output to show whether an optimistically confirmed slot is vote only (contains zero non-vote transactions). Additionally, a flag has been added to enable filtering output to exclude vote only slots.	2023-03-23 14:13:41 -07:00
steviez	bc933c63ce	Fix SlotMeta connected tracking (#28069 ) Fix SlotMeta is_connected tracking The tracking of connected status was previously based upon an assumption that would be practically false for all validators. The connected status of slots played into whether Blockstore would signal ReplayStage that it had new shreds ready to be replayed. Prior to the change, we would never signal and ReplayStage would always wait the entire duration of a 100ms timeout before restarting its main processing loop. This commit introduces a change where we mark snapshot slots as connected. A validator may not have a path all the way back to genesis itself; however, snapshots are taken at known roots so we extend the connected status to these slots. Once a node has been bootstrapped once to have is connected, the logic persists in Blockstore such that all children on the main fork also get their connected status updated properly.	2023-03-21 20:17:58 +08:00
behzad nouri	5d9aba5548	increases retransmit-stage deduper capacity and reset-cycle (#30758 ) For duplicate block detection, for each (slot, shred-index, shred-type) we need to allow 2 different shreds to be retransmitted. The commit implements this using two bloom-filter dedupers: * Shreds are deduplicated using the 1st deduper. * If a shred is not a duplicate, then we check if: (slot, shred-index, shred-type, k) is not a duplicate for either k = 0 or k = 1 using the 2nd deduper, and if so then the shred is retransmitted. This achieves larger capacity than the current LRU-cache.	2023-03-20 20:32:23 +00:00
Jeff Biseda	20614fa746	restore timestamp() in find_missing_indexes (#30345 )	2023-02-15 16:12:36 -08:00
steviez	328b674edc	Remove recursive read lock that could deadlock Blockstore (#30203 ) This deadlock could only occur on nodes that call Blockstore::get_rooted_block(). Regular validators don't call this function; RPC nodes and nodes that have BigTableUploadService enabled do. Blockstore::get_rooted_block() grabs a read lock on lowest_cleanup_slot right at the start to check if the block has been cleaned up, and to ensure it doesn't get cleaned up during execution. As part of the callstack of get_rooted_block(), Blockstore::get_completed_ranges() will get called, which also grabs a read lock on lowest_cleanup_slot. If LedgerCleanupService attempts to grab a write lock between the read lock calls, we could hit a deadlock if priority is given to the write lock request in this scenario. This change removes the call to get the read lock in get_completed_ranges(). The lock is only held for the scope of this function, which is a single rocksdb read and thus not needed. This does mean that a different error will be returned if the requested slot was below lowest_cleanup_slot. Previously, a BlockstoreError::SlotCleanedUp would have been thrown; the rocksdb error will be bubbled up now. Note that callers of get_rooted_block() will still get the SlotCleanedUp error when appropriate because get_rooted_block() grabs the lock. If the slot is unavailable, it will return immediately. If the slot is available, get_rooted_block() holding the lock means the slot will remain available.	2023-02-13 17:33:24 -06:00
Ryo Onodera	3e6162e69e	Add address lookup tables to minimized snapshot (#30158 ) * Add address lookup tables to minimized snapshot * Add comment for future posterity * Add reference to the issue * Adjust comment a bit Co-authored-by: Andrew Fitzgerald <apfitzge@gmail.com> --------- Co-authored-by: Andrew Fitzgerald <apfitzge@gmail.com>	2023-02-10 14:46:02 +09:00
Jeff Biseda	180273b97d	defer HighestShred repairs during shred propagation threshold (#30142 )	2023-02-09 14:57:55 -08:00
steviez	d3dab24bbe	chore: Use `i` over `ix` variable name when naming worker threads (#30206 )	2023-02-09 01:24:57 +00:00
Illia Bobyr	6bc566d802	doc: ledger: Document `CompletedDataSetInfo` (#29998 )	2023-02-07 21:19:07 -08:00
Ryo Onodera	40bbf99c74	Add fully-reproducible online tracer for banking (#29196 ) * Add fully-reproducible online tracer for banking * Don't use eprintln!()... * Update programs/sbf/Cargo.lock... * Remove meaningless assert_eq * Group test-only code under aptly named mod * Remove needless overflow handling in receive_until * Delay stat aggregation as it's possible now * Use Cow to avoid needless heap allocs * Properly consume metrics action as soon as hold * Trace UnprocessedTransactionStorage::len() instead * Loosen joining api over type safety for replaystage * Introduce hash event to override these when simulating * Use serde_with/serde_as instead of hacky workaround * Update another Cargo.lock... * Add detailed comment for Packet::buffer serialize * Rename sender_overhead_minimized_receiver_loop() * Use type inference for TraceError * Another minor rename * Retire now useless ForEach to simplify code * Use type alias as much as possible * Properly translate and propagate tracing errors * Clarify --enable-banking-trace with better naming * Consider unclean (signal-based) node restarts.. * Tweak logging and cli * Remove Bank events as it's not needed anymore * Make tpu own banking tracer thread * Reduce diff a bit.. * Use latest serde_with * Finally use the published rolling-file crate * Make test code change more consistent * Revive dead and non-terminating test code path... * Dispose batches early now that possible * Split off thread handle very early at ::new() * Tweak message for TooSmallDirByteLimitl * Remove too much of indirection * Remove needless pub from ::channel() * Clarify test comments * Avoid needless event creation if tracer is disabled * Write tests around file rotation and spill-over * Remove unneeded PathBuf::clone()s... * Introduce inner struct instead of tuple... * Remove unused enum BankStatus... * Avoid .unwrap() for the case of disabled tracer...	2023-01-25 21:54:38 +09:00
behzad nouri	272e667cb2	deprecates Pubkey::new in favor of Pubkey::{,try_}from (#29805 ) The commit deprecates Pubkey::new which lacks type-safety and instead implements TryFrom<&[u8]> and TryFrom<Vec<u8>> for Pubkey.	2023-01-21 18:06:27 +00:00
Brennan Watt	aa40c2b712	Increase turbine propagation const (#29742 ) * Increase turbine propagation const Value is used as a delay threshold for issuing shred repairs and analysis is showing we are overly aggressive in requesting repairs. Shreds show up via turbine before the repair completes the vast majority of the time * Use Duration type for MAX_TURBINE_PROPAGATION	2023-01-17 15:01:00 -08:00
steviez	2b88401ef7	chore: Cleanup and document a Blockstore chaining test (#29705 )	2023-01-14 03:04:53 -06:00
Illia Bobyr	59fde130d6	ledger/blockstore: PerfSampleV2: num_non_vote_transactions (#29404 ) Store non-vote transaction counts that are now recorded by the banks into the `blockstore`. `SamplePerformanceService` now populates `PerfSampleV2` with counts from the banks.	2023-01-12 19:14:04 -08:00
behzad nouri	283a2b1540	removes #[allow(clippy::same_item_push)] (#29543 )	2023-01-06 17:32:26 +00:00
behzad nouri	d87128e02c	fixes errors from clippy::needless_borrow (#29535 ) https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow	2023-01-05 18:21:56 +00:00
behzad nouri	5c9beef498	fixes errors from clippy::useless_conversion (#29534 ) https://rust-lang.github.io/rust-clippy/master/index.html#useless_conversion	2023-01-05 18:05:32 +00:00
steviez	ff8bb5362c	Remove repetitive logic in SlotMeta first insert detection logic (#29153 )	2022-12-15 17:38:27 -06:00
behzad nouri	9524c9dbff	patches errors from clippy::uninlined_format_args https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args	2022-12-06 19:32:15 +00:00
steviez	01cd55a27a	Change SlotMeta is_connected bool to bitflags (#29001 ) We currently use the is_connected field to be able to signal to ReplayStage that a slot has replayable updates. It was discovered that this functionality is effectively broken, and that is_connected is never true. In order to convey this information to ReplayStage more effectively, we need extra state information so this PR changes the existing bool to bitflags with two bits. From a compatibility standpoint, the is_connected bool was already occupying one byte in the serialized SlotMeta in blockstore. Thus, the change from a bool to bitflags still "fits" in that one byte allotment. In consideration of a case where a client may wish to downgrade software and use the same ledger, deserializing the bitflags into a bool could fail if the new bit is set. As such, this PR introduces the second bit field, but does not set it anywhere. Once clusters have mass adopted a software version with this PR, a subsequent change to actually set and use the new field can be introduced.	2022-12-01 14:42:35 -06:00
steviez	3c42c87098	Remove obsoleted return value from Blockstore insert shred method (#28992 )	2022-12-01 11:17:46 -06:00
steviez	b6dce6cf3b	Move BlockstoreInsertionMetrics field update to blockstore.rs (#28991 ) The num_repair field is the only blockstore insertion metric being updated outside of the Blockstore::insert() call chain; move the update to insert() with the rest of the fields in the BlockstoreInsertionMetrics struct.	2022-11-30 11:46:35 -06:00
Brennan Watt	9a6ab5e7fe	Distinguish turbine vs repair insertion metrics (#28980 )	2022-11-30 09:03:53 -08:00
Brooks Prumo	d1ba42180d	clippy for rust 1.65.0 (#28765 )	2022-11-09 19:39:38 +00:00
steviez	2272fd807e	Remove Blockstore manual compaction code (#28409 ) The manual Blockstore compaction that was being initiated from LedgerCleanupService has been disabled for quite some time in favor of several optimizations. Co-authored-by: Ryo Onodera <ryoqun@gmail.com>	2022-10-28 10:39:00 +02:00
Justin Starry	2d8665d307	Record inner instruction stack height (#28430 ) * Record inner instruction stack height * fix sbf tests * feedback	2022-10-26 10:37:44 +08:00
steviez	60f6e24b76	Make Blockstore::get_entries_in_data_block() use multi_get() (#28245 )	2022-10-09 15:34:03 -04:00
steviez	49dbae7e53	Use VecDeque as a queue instead of Vec (#28190 )	2022-10-03 22:55:59 -05:00
Yueh-Hsuan Chiang	6b17bee5a8	Remove the const default for RocksFifo (#27965 ) #### Summary of Changes Removes the constant default for ShredStorageType::RocksFifo as the shred storage size is either user-specified or derived from --limit-ledger-size in #27459.	2022-10-01 15:10:54 -07:00
steviez	f38ed1c266	Use more descriptive variable names in blockstore chaining tests (#28131 )	2022-09-29 10:24:09 -05:00
behzad nouri	72537e7e07	bypasses rayon thread-pool for single entry batches (#28077 ) With no parallelization, thread-pool only adds overhead.	2022-09-26 21:32:58 +00:00
behzad nouri	f49beb0cbc	caches reed-solomon encoder/decoder instance (#27510 ) ReedSolomon::new(...) initializes a matrix and a data-decode-matrix cache: https://github.com/rust-rse/reed-solomon-erasure/blob/273ebbced/src/core.rs#L460-L466 In order to cache this computation, this commit caches the reed-solomon encoder/decoder instance for each (data_shards, parity_shards) pair.	2022-09-25 18:09:47 +00:00
behzad nouri	45e26574f3	removes redundant shred.sanitize() from blockstore (#28016 ) Shreds received from other nodes over the socket are sanitized when the payload is deserialized: https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/legacy.rs#L137 https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/legacy.rs#L77 https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/merkle.rs#L355 https://github.com/solana-labs/solana/blob/315707504/ledger/src/shred/merkle.rs#L439 Similarly, shreds recovered from erasure codes are also sanitized at deserialization: https://github.com/solana-labs/solana/blob/f02fe9c7e/ledger/src/shredder.rs#L330 or explicitly so for Merkle shreds: https://github.com/solana-labs/solana/blob/f02fe9c7e/ledger/src/shred/merkle.rs#L753 Shreds generated locally by the node itself during its leader slots do not need to be sanitized. So sanitizing shreds in blockstore is redundant and wasteful. In particular this becomes more wasteful with Merkle shreds because sanitizing shreds would require verifying Merkle proof. As such the commit removes redundant shred.sanitize() from blockstore.	2022-09-24 16:31:50 +00:00
behzad nouri	97c9af4c6b	plumbs through flag to generate merkle variant of shreds	2022-09-23 16:45:18 +00:00
steviez	e4affb9fea	Add Blockstore::highest_slot() method (#27981 )	2022-09-23 04:53:43 -05:00
steviez	eaa4787201	Cleanup blockstore test (#27999 )	2022-09-22 23:37:47 +00:00
behzad nouri	9a57c64f21	patches clippy errors from new rust nightly release (#27996 )	2022-09-22 22:23:03 +00:00
Yueh-Hsuan Chiang	cccade42b3	Optimize get_slots_since() using the batched version of multi_get() (#27686 ) #### Problem The current implementation of get_slots_since() invokes multiple rocksdb::get(). As a result, each get() operation may end up requiring one disk read. This leads to poor performance of get_slots_since described in #24878. #### Summary of Changes This PR makes get_slots_since() use the batched version of multi_get() instead, which allows multiple get operations to be processed in batch so that they can be answered with fewer disk reads.	2022-09-19 21:52:13 -07:00
Yueh-Hsuan Chiang	ba3d9cd325	Add LedgerColumn::multi_get() (#26354 ) #### Problem Blockstore operations such as get_slots_since() issue multiple rocksdb::get() calls at once, which is not optimal for performance. #### Summary of Changes This PR adds LedgerColumn::multi_get() based on rocksdb::batched_multi_get(), the optimized version of multi_get() where get requests are processed in batch to minimize read I/O.	2022-09-12 15:01:22 -07:00
Yueh-Hsuan Chiang	ed00365101	Add ledger tool command print-file-metadata (#26790 ) Add ledger-tool command print-file-metadata #### Summary of Changes This PR adds a ledger tool subcommand print-file-metadata. ``` USAGE: solana-ledger-tool print-file-metadata [FLAGS] [OPTIONS] [SST_FILE_NAME] Prints the metadata of the specified ledger-store file. If no file name is specified, then it will print the metadata of all ledger files ```	2022-09-06 21:46:35 -07:00

341 Commits