It is crucial that VersionedCrdsValue::insert_timestamp does not go
backward in time:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds.rs#L67-L79
Otherwise methods such as get_votes and get_epoch_slots_since will
break, which in turn breaks their downstream consumers, including the
vote-listener and optimistic confirmation:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298
For that, Crds::new_versioned is intended to be called "atomically"
with Crds::insert_versioned (as the comment already says):
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds.rs#L126-L129
However, currently this is violated in the code. For example,
filter_pull_responses creates VersionedCrdsValues (with the current
timestamp), then acquires an exclusive lock on gossip, then
process_pull_responses writes those values to the crds table:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L2375-L2392
Depending on the workload and lock contention, the insert_timestamps
may well be in the past by the time these values are finally inserted
into gossip.
To avoid such scenarios, this commit:
* removes Crds::new_versioned and Crds::insert_versioned.
* makes VersionedCrdsValue constructor private, only invoked in
Crds::insert, so that insert_timestamp is populated right before
insert.
This will improve insert_timestamp monotonicity as long as Crds::insert
is not called with a stale timestamp. Follow-up commits may further
improve this by calling timestamp() inside Crds::insert, and/or by
switching to std::time::Instant, which guarantees monotonicity.
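A minimal toy sketch of the intended shape (simplified, stand-in types;
not the actual crds.rs code), where the VersionedCrdsValue constructor
is private and only Crds::insert calls it, so insert_timestamp is
populated immediately before the table write:

    use std::collections::HashMap;

    #[derive(Clone)]
    struct CrdsValue {
        label: String, // stand-in for the real CrdsValueLabel key
    }

    struct VersionedCrdsValue {
        value: CrdsValue,
        insert_timestamp: u64, // wallclock millis; must not go backward
    }

    impl VersionedCrdsValue {
        // Private to this module: callers cannot pre-build a value with
        // an old timestamp and insert it much later.
        fn new(value: CrdsValue, now: u64) -> Self {
            Self { value, insert_timestamp: now }
        }
    }

    #[derive(Default)]
    struct Crds {
        table: HashMap<String, VersionedCrdsValue>,
    }

    impl Crds {
        // `now` is supplied by the caller holding the write lock; a
        // follow-up could take the timestamp here instead.
        fn insert(&mut self, value: CrdsValue, now: u64) {
            let key = value.label.clone();
            let versioned = VersionedCrdsValue::new(value, now);
            self.table.insert(key, versioned);
        }
    }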
The number of parity coding shreds is always less than the number of
data shreds in FEC blocks:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719
Data shreds are batched in chunks of 32 shreds each:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714
However, the very last batch of data shreds in a slot can be small, in
which case the loss rate can be exacerbated.
This commit expands the number of coding shreds in the last FEC block
of a slot to 64 minus the number of data shreds, so that FEC blocks
always total 64 data and parity coding shreds each.
As a consequence, the last FEC block has more parity coding shreds
than data shreds, so for some shred indices there will be a coding
shred but no corresponding data shred. This should not cause any kind
of overlapping
FEC blocks as in:
https://github.com/solana-labs/solana/pull/10095
since this is done only for the very last batch in a slot, and the next
slot will reset the shred index.
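A toy sketch of the parity count for the last batch (the constant name
is assumed from ledger/src/shred.rs; this is not the actual code):

    const MAX_DATA_SHREDS_PER_FEC_BLOCK: u32 = 32;

    // Number of parity coding shreds generated for the last FEC block
    // of a slot: pad the final, possibly short, batch so that
    // data + coding == 64.
    fn last_block_num_coding_shreds(num_data: u32) -> u32 {
        debug_assert!(num_data <= MAX_DATA_SHREDS_PER_FEC_BLOCK);
        // e.g. a final batch of 5 data shreds gets 59 parity shreds.
        2 * MAX_DATA_SHREDS_PER_FEC_BLOCK - num_data
    }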
* Track transaction check time separately from account loads
* banking packet process metrics
* Remove signature clone in status cache lookup
* Reduce allocations when converting packets to transactions
* Add blake3 hash of transaction messages in status cache
* Bug fixes
* fix tests and run fmt
* Address feedback
* fix simd tx entry verification
* Fix rebase
* Feedback
* clean up
* Add tests
* Remove feature switch and fall back to signature check
* Bump programs/bpf Cargo.lock
* clippy
* nudge benches
* Bump `BankSlotDelta` frozen ABI hash
* Add blake3 to sdk/programs/Cargo.lock
* nudge bpf tests
* short circuit status cache checks
Co-authored-by: Trent Nelson <trent@solana.com>
In several places in the gossip code, the entire crds table is scanned
only to extract nodes' contact infos. Currently on mainnet the crds
table holds ~70k values while there are only ~470 nodes, so the full
table scan is inefficient. Instead, we can maintain an index of only
the nodes' contact infos.
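A toy sketch of such an index (simplified, stand-in types; not the
actual crds.rs code): inserts keep a secondary set of keys whose values
are contact infos, so iterating over nodes is O(#nodes) rather than a
scan of the full ~70k-entry table:

    use std::collections::{HashMap, HashSet};

    enum CrdsData {
        ContactInfo(String), // stand-in payloads
        Other(Vec<u8>),
    }

    #[derive(Default)]
    struct Crds {
        table: HashMap<String, CrdsData>,
        nodes: HashSet<String>, // keys of ContactInfo values only
    }

    impl Crds {
        fn insert(&mut self, key: String, data: CrdsData) {
            if matches!(data, CrdsData::ContactInfo(_)) {
                self.nodes.insert(key.clone());
            }
            self.table.insert(key, data);
        }

        // Visits only the ~470 node entries, not the whole table.
        fn contact_infos(&self) -> impl Iterator<Item = &CrdsData> + '_ {
            self.nodes.iter().filter_map(move |key| self.table.get(key))
        }
    }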
Based on run-time profiles, the majority of new_pull_requests' time is
spent building bloom filters, in hashing and bit-vec ops.
This commit builds crds filters in parallel using rayon constructs. The
added benchmark shows ~5x speedup (4-core machine, 8 threads).
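A minimal sketch of the pattern (toy data, assuming the rayon crate;
not the actual crds_gossip_pull.rs code): values are hashed into
per-thread partial filter sets which are then merged:

    use rayon::prelude::*;

    // Stand-in for adding crds value hashes to a set of bloom filters:
    // each hash goes to one filter, chosen here by a simple modulus.
    fn build_filters_parallel(hashes: &[u64], num_filters: usize) -> Vec<Vec<u64>> {
        hashes
            .par_iter()
            .fold(
                || vec![Vec::new(); num_filters], // per-thread partials
                |mut filters, hash| {
                    let index = (*hash as usize) % num_filters;
                    filters[index].push(*hash); // stand-in for a bloom add
                    filters
                },
            )
            .reduce(
                || vec![Vec::new(); num_filters],
                |mut acc, partial| {
                    for (dst, src) in acc.iter_mut().zip(partial) {
                        dst.extend(src); // stand-in for a bloom union
                    }
                    acc
                },
            )
    }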
* Save/restore Tower
* Avoid unwrap()
* Rebase cleanups
* Forcibly pass test
* Correct reconciliation of votes after validator resume
* d b g
* Add more tests
* fsync and fix test
* Add test
* Fix fmt
* Debug
* Fix tests...
* save
* Clarify error message and code cleaning around it
* Move most of code out of tower save hot codepath
* Proper comment for the lack of fsync on tower
* Clean up
* Clean up
* Simpler type alias
* Manage tower-restored ancestor slots without banks
* Add comment
* Extract long code blocks...
* Add comment
* Simplify returned tuple...
* Tweak too aggressive log
* Fix typo...
* Add test
* Update comment
* Improve test to require non-empty stray restored slots
* Measure tower save and dump all tower contents
* Log adjust and add threshold related assertions
* cleanup adjust
* Properly lower stray restored slots priority...
* Rust fmt
* Fix test....
* Clarify comments a bit and add TowerError::TooNew
* Further clean-up around TowerError
* Truly create ancestors by excluding last vote slot
* Add comment for stray_restored_slots
* Add comment for stray_restored_slots
* Use BTreeSet
* Consider root_slot into post-replay adjustment
* Tweak logging
* Add test for stray_restored_ancestors
* Reorder some code
* Better names for unit tests
* Add frozen_abi to SavedTower
* Fold long lines
* Tweak stray ancestors and too old slot history
* Re-adjust error condition of too old slot history
* Test normal ancestors is checked before stray ones
* Fix conflict, update tests, adjust behavior a bit
* Fix test
* Address review comments
* Last touch!
* Immediately after creating cleaning pr
* Revert stray slots
* Revert comment...
* Report error as metrics
* Revert not to panic! and ignore unfixable test...
* Normalize lockouts.root_slot more strictly
* Add comments for panic! and more assertions
* Proper initialize root without vote account
* Clarify code and comments based on review feedback
* Fix rebase
* Further simplify based on assured tower root
* Reorder code for more readability
Co-authored-by: Michael Vines <mvines@gmail.com>
filter_crds_values checks every crds filter against every hash value:
https://github.com/solana-labs/solana/blob/ee646aa7/core/src/crds_gossip_pull.rs#L432
which can be inefficient if the filter's bit-mask only matches a small
portion of the entire crds table.
This commit shards crds values into separate tables based on the first
shard_bits bits of their hash. Given a (mask, mask_bits) filter,
filtering crds can be done by inspecting only relevant shards.
If CrdsFilter.mask_bits <= shard_bits, then precisely the crds values
which match the (mask, mask_bits) bit pattern are traversed, and no
others.
If CrdsFilter.mask_bits > shard_bits, then approximately only
1/2^shard_bits of crds values are inspected.
Benchmarking on a GCE cluster of 20 nodes, I see ~10% improvement in
the generate_pull_responses metric, but with larger clusters the crds
table and 2^mask_bits are both larger, so the impact should be more
significant.
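A toy sketch of the scheme (SHARD_BITS and the function names here are
illustrative, not the actual crds_shards.rs code):

    const SHARD_BITS: u32 = 12; // shard values by their top 12 hash bits

    fn shard_index(hash: u64) -> usize {
        (hash >> (64 - SHARD_BITS)) as usize
    }

    // Which shards can contain hashes matching (mask, mask_bits)?
    fn select_shards(mask: u64, mask_bits: u32) -> std::ops::Range<usize> {
        if mask_bits >= SHARD_BITS {
            // Filter is at least as selective as a shard: exactly one
            // shard may contain matching hashes (~1/2^SHARD_BITS of the
            // table), and it still has to be filtered within.
            let index = (mask >> (64 - SHARD_BITS)) as usize;
            index..index + 1
        } else {
            // Filter spans 2^(SHARD_BITS - mask_bits) consecutive
            // shards, whose union is exactly the matching values.
            let count = 1usize << (SHARD_BITS - mask_bits);
            let first = (mask >> (64 - SHARD_BITS)) as usize & !(count - 1);
            first..first + count
        }
    }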
* Make Message::new_with_payer the default constructor
* Remove Transaction::new_[un]signed_instructions
These guess the fee-payer instead of stating it explicitly
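A hedged example of the intended call pattern (assuming the
Message::new_with_payer(instructions, payer) signature of this era's
solana-sdk; treat as a sketch): the fee-payer is stated explicitly when
building the Message rather than guessed from the instructions:

    use solana_sdk::{
        message::Message,
        pubkey::Pubkey,
        signature::{Keypair, Signer},
        system_instruction,
        transaction::Transaction,
    };

    fn build_transfer(payer: &Keypair, to: &Pubkey, lamports: u64) -> Transaction {
        let ix = system_instruction::transfer(&payer.pubkey(), to, lamports);
        // The fee-payer is explicit here, not inferred by the constructor.
        let message = Message::new_with_payer(&[ix], Some(&payer.pubkey()));
        Transaction::new_unsigned(message)
    }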