solana

Commit Graph

Author	SHA1	Message	Date
behzad nouri	e7073ecab1	adds gossip metrics for number of staked nodes (#17330)	2021-05-19 19:25:21 +00:00
Tao Zhu	0781fe1b4f	Upgrade Rust to 1.52.0 (#17096) * Upgrade Rust to 1.52.0; update nightly_version to newly pushed docker image; fix clippy lint errors; 1.52 comes with grcov 0.8.0, include this version to script * upgrade to Rust 1.52.1 * disabling Serum from downstream projects until it is upgraded to Rust 1.52.1	2021-05-19 09:31:47 -05:00
Tyera Eulberg	827355a6b1	Create solana-rpc crate and move subscriptions (#17320) * Move non_circulating_supply to runtime * Add solana-rpc crate and move max_slots * Move subscriptions to solana-rpc * Single use statements	2021-05-19 00:54:28 -06:00
behzad nouri	f7b0184f81	patches flaky test_new_mark_creation_time (#17288)	2021-05-18 13:39:35 +00:00
Trent Nelson	67e6a3106f	rpc: plumb shred_version through RpcContactInfo	2021-05-14 08:36:08 +00:00
Tyera Eulberg	27004f1b76	Return error for excluded secondary-index keys (#17193) * Add runtime helpers to check secondary indexes for key * Add custom rpc error * Check secondary-index key inclusion in rpc * Clone complete AccountSecondaryIndexes into rpc to avoid bank query	2021-05-13 21:04:21 +00:00
behzad nouri	0e646d10bb	prunes received-cache only once per unique owner's key (#17039)	2021-05-13 13:50:16 +00:00
behzad nouri	0aa7824884	retains one node-instance per pubkey (#17187) crds table retains up to 32 node-instance values per pubkey. This is because if there are multiple running instances of the same node, then we want gossip to propagate node-instance values associated with both instances; therefore the corresponding label/key includes the randomly generated token in addition to the pubkey: https://github.com/solana-labs/solana/blob/9c42a89a4/core/src/crds_value.rs#L448 https://github.com/solana-labs/solana/pull/14037 As a result, the number of such values per pubkey is effectively unbounded, requiring custom mitigations implemented in: https://github.com/solana-labs/solana/pull/14467 but still taking redundant extra memory and bandwidth. This commit instead retains only one node-instance per pubkey by extending crds values override logic. If a crds value is of type node-instance, it will always override an existing one with the same key if it has a more recent starting timestamp (not wallclock). As a result, gossip will always propagate the node-instance with the more recent timestamp. Since the check_duplicate logic will stop the node with the older timestamp, this change should preserve existing functionality.	2021-05-13 13:35:46 +00:00
Lijun Wang	9c42a89a43	Issue #17008: make the number of snapshot archives to retain configurable. (#17158) * purge_old_snapshot_archives is changed to take an extra argument 'maximum_snapshots_to_retain' to control the max number of latest snapshot archives to retain. Note the oldest snapshot is always retained as before and is not subject to this new option. * The validator and ledger-tool executables are modified with a CLI argument --maximum-snapshots-to-retain, and the option is propagated down the call chains. Their corresponding shell scripts were changed accordingly. * SnapshotConfig is modified to have an extra field for the maximum_snapshots_to_retain * Unit tests are added to cover purge_old_snapshot_archives	2021-05-12 10:32:27 -07:00
Tyera Eulberg	6e9deaf1bd	Move block-time caching earlier (#17109) * Require that blockstore block-time only be recognized slot, instead of root * Move cache_block_time to after Bank freeze * Single use statement * Pass transaction_status_sender by reference * Remove unnecessary slot-existence check before caching block time altogether * Move block-time existence check into Blockstore::cache_block_time, Blockstore no longer needed in blockstore_processor helper	2021-05-10 13:14:56 -06:00
Jeff Washington (jwash)	f39dda00e0	type AccountSecondaryIndexes = HashSet (#17108)	2021-05-10 14:22:48 +00:00
behzad nouri	81ad795d46	removes position field in coding-shred-header CodingShredHeader.position is equal to ShredCommonHeader.index - ShredCommonHeader.fec_set_index and so is redundant. The extra position field can add bugs if not consistent with index and fec_set_index.	2021-05-10 13:20:56 +00:00
behzad nouri	22c02b917e	reads gossip push messages off crds ordinal index Having an ordinal index on crds values based on insert order allows values to be filtered efficiently using a cursor. In particular, the CrdsGossipPush::push_messages hash-map can be replaced with a cursor, saving on bookkeeping, purging, etc.	2021-05-09 22:40:41 +00:00
behzad nouri	dfa3e7a61c	indexes crds values by their insert order	2021-05-09 22:40:41 +00:00
Michael Vines	d6c076f1b6	getBlockProduction now correctly reports block production	2021-05-07 19:04:51 -07:00
behzad nouri	fa86a335b0	implements cursor for gossip crds table queries (#16952) VersionedCrdsValue.insert_timestamp is used for fetching crds values inserted since last query: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215 https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298 So it is crucial that insert_timestamp does not go backward in time when new values are inserted into the table. However, std::time::SystemTime is not monotonic, and due to workload, lock contention, thread scheduling, etc., new values may be inserted with a stale timestamp well in the past. Additionally, reading system time for the above purpose is inefficient and unnecessary. This commit adds an ordinal index to crds values indicating their insert order. Additionally, it implements a new Cursor type for fetching values inserted since last query.	2021-05-06 14:04:17 +00:00
Michael Vines	9ba2c53b85	Add --tower argument to specify where tower files are persisted	2021-05-05 12:20:39 -07:00
Trent Nelson	f17b80236f	test-validator: Plumb --limit-ledger-size	2021-05-04 08:45:24 +00:00
carllin	bc7e741514	Integrate gossip votes into switching threshold (#16973)	2021-05-04 00:51:42 -07:00
publish-docs.sh	6318705607	Add keys	2021-05-03 17:18:54 -07:00
publish-docs.sh	b948a18841	Key rotation	2021-05-03 17:18:54 -07:00
publish-docs.sh	b2778f34f5	Rotate keys	2021-05-03 17:18:54 -07:00
behzad nouri	7cea2c4466	validates gossip addresses before sending pull-requests IP addresses need to be validated before sending packets to them. This commit sends a ping packet to nodes before any pull requests. Pull requests are then only sent to the nodes which have responded with the correct hash of their respective ping packet.	2021-05-03 18:21:06 +00:00
behzad nouri	2231017b35	uses Mutex instead of RwLock for ping_cache	2021-05-03 18:21:06 +00:00
behzad nouri	a698e34744	patches local pending push messages processing (#16833) process_push_messages writes local pending push messages to the crds table, but it discards the return value: https://github.com/solana-labs/solana/blob/cf779c63c/core/src/crds_gossip.rs#L96-L102 In order to exclude outdated values from the next pull-request, we need to record the hash of values purged/overridden by the local push messages, otherwise pull-responses will return outdated values back to the node: https://github.com/solana-labs/solana/blob/c1829dd00/core/src/crds_gossip_pull.rs#L447-L452 Additionally, gossip packets arrive and are processed out of order. So, local pending push messages should be flushed before generating bloom filters for pull-requests, preventing pull-responses returning the same values back to the node itself. This requires flipping order of generating pull and push messages: https://github.com/solana-labs/solana/blob/cf779c63c/core/src/cluster_info.rs#L1757-L1762 Both above bugs cause redundant traffic and bandwidth waste in gossip pull-responses.	2021-05-03 16:00:17 +00:00
Jeff Washington (jwash)	541aa5ad85	tests: lamports -> lamports() (#16982)	2021-05-03 10:45:54 -05:00
Justin Starry	8e561354d5	Improve readability of vote lockout processing (#16987) * Improve readability of vote lockout processing * clippy * simplify comment * feedback	2021-05-02 08:36:06 +00:00
carllin	5981399612	Distinguish max replayed and max observed vote (#16936)	2021-04-29 14:43:28 -07:00
Michael Vines	542d88929f	Add getBlockProduction RPC method	2021-04-28 20:02:54 -07:00
carllin	b5d30846d6	Retry latest vote if expired (#16735)	2021-04-28 11:46:16 -07:00
behzad nouri	25054bfd35	retains peer's contact-info when making pull requests (#16715) ClusterInfo::new_pull_requests has to look up contact-infos: https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/cluster_info.rs#L1663-L1673 when they were already available when making pull requests: https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/crds_gossip_pull.rs#L232	2021-04-28 13:19:12 +00:00
behzad nouri	1ac2a8cfa5	removes delayed crds inserts when upserting gossip table (#16806) It is crucial that VersionedCrdsValue::insert_timestamp does not go backward in time: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds.rs#L67-L79 Otherwise methods such as get_votes and get_epoch_slots_since will break, which will break their downstream flow, including vote-listener and optimistic confirmation: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215 https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298 For that, Crds::new_versioned is intended to be called "atomically" with Crds::insert_versioned (as the comment already says): https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds.rs#L126-L129 However, currently this is violated in the code. For example, filter_pull_responses creates VersionedCrdsValues (with the current timestamp), then acquires an exclusive lock on gossip, then process_pull_responses writes those values to the crds table: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L2375-L2392 Depending on the workload and lock contention, the insert_timestamps may well be in the past when these values finally are inserted into gossip. To avoid such scenarios, this commit: * removes Crds::new_versioned and Crds::insert_versioned. * makes VersionedCrdsValue constructor private, only invoked in Crds::insert, so that insert_timestamp is populated right before insert. This will improve insert_timestamp monotonicity as long as Crds::insert is not called with a stale timestamp. Following commits may further improve this by calling timestamp() inside Crds::insert, and/or switching to std::time::Instant which guarantees monotonicity.	2021-04-28 11:56:13 +00:00
behzad nouri	b17d5eeaee	moves cluster-info metrics to a separate module (#16883)	2021-04-28 02:04:49 +00:00
behzad nouri	b468ead1b1	uses current timestamp when flushing local pending push queue (#16808) local_message_pending_push_queue records timestamps at the time the value is created, and uses them when the pending values are flushed: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L321 https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds_gossip.rs#L96-L102 which is then used as the insert_timestamp when inserting values in the crds table: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/crds_gossip_push.rs#L183 The flushing may happen 100ms after the values are created (or even later if there is lock contention). This will cause non-monotone insert_timestamps in the crds table (where time goes backward), hindering the usability of insert_timestamps for other computations. For example, both ClusterInfo::get_votes and get_epoch_slots_since rely on monotone insert_timestamps when values are inserted into the table: https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215 https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298 This commit removes timestamps from local_message_pending_push_queue and uses the current timestamp when flushing the queue.	2021-04-28 00:15:11 +00:00
steviez	bc31378797	Trim extra shred bytes in blockstore (#16602) Strip the zero-padding off of data shreds before insertion into blockstore Co-authored-by: Stephen Akridge <sakridge@gmail.com> Co-authored-by: Nathan Hawkins <utsl@utsl.org>	2021-04-27 17:40:41 -05:00
behzad nouri	3b8d6b59fb	records hash of values purged by expired pull-responses (#16800) process_pull_responses should record hash of values purged by expired responses (as well as unexpired ones): https://github.com/solana-labs/solana/blob/c1829dd00/core/src/crds_gossip_pull.rs#L385-L387 otherwise, these values are not excluded from following pull-requests (from likely different nodes): https://github.com/solana-labs/solana/blob/c1829dd00/core/src/crds_gossip_pull.rs#L447-L452 and would waste bandwidth should they be included in subsequent pull-responses.	2021-04-27 12:06:49 +00:00
behzad nouri	0f3ac51cf1	limits to data_header.size when combining shreds' payloads (#16708) Shredder::deshred is ignoring data_header.size when combining shreds' payloads: https://github.com/solana-labs/solana/blob/37b8587d4/ledger/src/shred.rs#L940-L961 Also adding more sanity checks on the alignment of data shreds indices.	2021-04-27 12:04:44 +00:00
Michael Vines	59fc33635a	Add getVoteAccounts RPC method parameter to restrict results to a single vote account	2021-04-27 04:27:15 +00:00
behzad nouri	9706512115	removes old runtime feature gates in gossip and turbine (#16633)	2021-04-26 17:12:02 +00:00
Jeff Washington (jwash)	ca14c18998	owner -> owner() (#16782)	2021-04-23 22:49:47 +00:00
Michael Vines	63436cc2bf	Disable flaky test_poh_service (#16772)	2021-04-23 12:14:11 -05:00
behzad nouri	2c82f2154d	retains crds values if the origin is still active (#16576) Local timestamps are updated for records associated with a pubkey if the origin is still active: https://github.com/solana-labs/solana/blob/c8ed14c64/core/src/crds.rs#L301-L311 However this is done inconsistently on some gossip paths (pull requests and pull responses) but not all (e.g. push messages). Additionally update_record_timestamp is inefficient since there can be ~800 values associated with each pubkey. This commit updates record timestamps only on contact-infos and instead uses the origin's timestamp when purging old values.	2021-04-23 15:14:49 +00:00
behzad nouri	03194145c0	removes first_coding_index from erasure recovery code (#16646) first_coding_index is the same as the set_index and so is redundant: https://github.com/solana-labs/solana/blob/37b8587d4/ledger/src/blockstore_meta.rs#L49-L60	2021-04-23 12:00:37 +00:00
Justin Starry	75b8434b76	Add TPU client for sending txs to the current leader tpu port (#16736) * Add TPU client for sending txs to the current leader tpu port * Update tpu_client.rs	2021-04-23 09:35:12 +08:00
Tyera Eulberg	636b5987af	Update getLeaderSchedule options (#16749)	2021-04-22 19:27:30 +00:00
Michael Vines	6004c0abf5	getLeaderSchedule now supports filtered results based on validator identity	2021-04-21 17:59:26 -07:00
Michael Vines	91b6888e15	verify_pubkey() now takes a ref	2021-04-21 14:43:49 -07:00
carllin	4c94f8933f	Ingest votes from gossip into fork choice (#16560)	2021-04-21 14:40:35 -07:00
Michael Vines	a1ef2bd74d	Ignore flaky test_pull_request_time_pruning	2021-04-21 12:07:36 -07:00
behzad nouri	37b8587d4e	expands number of erasure coding shreds in the last batch in slots (#16484) Number of parity coding shreds is always less than the number of data shreds in FEC blocks: https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719 Data shreds are batched in chunks of 32 shreds each: https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714 However the very last batch of data shreds in a slot can be small, in which case the loss rate can be exacerbated. This commit expands the number of coding shreds in the last FEC block in slots to: 64 - number of data shreds; so that FEC blocks are always 64 data and parity coding shreds each. As a consequence of this, the last FEC block has more parity coding shreds than data shreds. So for some shred indices we will have a coding shred but no data shreds. This should not cause any kind of overlapping FEC blocks as in: https://github.com/solana-labs/solana/pull/10095 since this is done only for the very last batch in a slot, and the next slot will reset the shred index.	2021-04-21 12:47:50 +00:00
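
The sizing rule described in commit 37b8587d4e (the last row above) can be sketched as a small Rust function. This is only an illustration of the arithmetic in the commit message, not the code in ledger/src/shred.rs: the names `DATA_SHREDS_PER_BATCH` and `num_coding_shreds` are hypothetical, and the rule used for batches other than the last one (one coding shred per data shred) is an assumption made for the sketch.

```rust
// Illustrative sketch only. Names are hypothetical and are not taken
// from ledger/src/shred.rs.
const DATA_SHREDS_PER_BATCH: usize = 32;

/// Number of parity coding shreds generated for one FEC block.
fn num_coding_shreds(num_data_shreds: usize, is_last_in_slot: bool) -> usize {
    assert!((1..=DATA_SHREDS_PER_BATCH).contains(&num_data_shreds));
    if is_last_in_slot {
        // Last batch in the slot: pad with coding shreds so that
        // data + coding shreds total 64, as the commit message states.
        2 * DATA_SHREDS_PER_BATCH - num_data_shreds
    } else {
        // Assumption for this sketch: one coding shred per data shred.
        num_data_shreds
    }
}

fn main() {
    for (num_data, last) in [(32, false), (5, true), (32, true)] {
        let num_coding = num_coding_shreds(num_data, last);
        println!("data={} last={} coding={}", num_data, last, num_coding);
    }
}
```

With a short final batch of 5 data shreds, the sketch yields 59 coding shreds, so the block still totals 64 shreds. That is why some shred indices in the last block carry a coding shred but no data shred.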


2299 Commits