Commit Graph

50 Commits

Author SHA1 Message Date
behzad nouri 5a9896706c
indexes epoch slots in crds table (#15459)
ClusterInfo::get_epoch_slots_since scans the entire crds table to obtain
epoch-slots inserted since a timestamp:
https://github.com/solana-labs/solana/blob/013daa8f4/core/src/cluster_info.rs#L1245-L1262
The alternative is to index epoch-slots in crds table ordered by their
insert timestamp.
2021-02-26 14:12:04 +00:00
behzad nouri d1df9da7d3
fixes test_filter_current flakiness (#14816) 2021-01-25 15:57:46 +00:00
behzad nouri 491b059755
broadcasts duplicate shreds through gossip (#14699) 2021-01-24 15:47:43 +00:00
Michael Vines cbffab7850 Upgrade to Rust v1.49.0 2021-01-23 19:16:36 -08:00
behzad nouri e4da6761a7
fixes test_filter_current flakiness (#14749) 2021-01-21 21:53:10 +00:00
behzad nouri 8e581601d6
patches crds vote-index assignment bug (#14438)
If tower is full, old votes are evicted from the front of the deque:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L367-L373
whereas recent votes if expire are evicted from the back:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L529-L537

As a result, from a single tower_index scalar, we cannot infer which crds-vote
should be overwritten:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L576

In addition there is an off by one bug in the existing code. tower_index is
bounded by MAX_LOCKOUT_HISTORY - 1:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/consensus.rs#L382
So, it is at most 30, whereas MAX_VOTES is 32:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L29
Which means that this branch is never taken:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L590-L593
so crds table alwasys keeps 29 **oldest** votes by wallclock, and then
only overrides the 30st one each time. (i.e a tally of only two most
recent votes).
2021-01-21 13:08:07 +00:00
behzad nouri 766195dded
limits number of crds values associated with a pubkey (#14467) 2021-01-08 18:54:40 +00:00
behzad nouri 2fd38d9912
indexes votes in crds table (#14272) 2020-12-27 13:31:05 +00:00
behzad nouri 6a3797e164
adds crds-value for broadcasting duplicate shreds through gossip (#14133)
In gossip, the header overhead we get from:
https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/cluster_info.rs#L434-L435
https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/crds_value.rs#L31-L36
https://github.com/solana-labs/solana/blob/de9ac43eb/core/src/crds_value.rs#L73
already exceeds SIZE_OF_NONCE in shreds. We also need aditional
meta-data (wallclock, source pubkey, ...). Which means that given the
SHRED_PAYLOAD_SIZE, we cannot fit all these in PACKET_DATA_SIZE:
https://github.com/solana-labs/solana/blob/de9ac43eb/ledger/src/shred.rs#L80

On top of that, we need 2 shred payloads as the proof of duplicate. So
each DuplicateShred crds value includes only a chunk of the payload,
along with the meta-data to reconstruct the full payload from the chunks
on the receiving end.
2020-12-18 14:32:43 +00:00
behzad nouri c2b7115031
indexes crds values associated with a pubkey (#14088)
record_labels returns all the possible labels for a record identified by
a pubkey, used in updating timestamp of crds values:
https://github.com/solana-labs/solana/blob/1792100e2/core/src/crds_value.rs#L560-L577
https://github.com/solana-labs/solana/blob/1792100e2/core/src/crds.rs#L240-L251
The code relies on CrdsValueLabel to be limited to a small deterministic
set of possible values for a fixed pubkey. As we expand crds values to
include duplicate shreds, this limits what the duplicate proofs can be
keyed by in the table.
In addition the computation of these labels is inefficient and will
become more so as duplicate shreds and more types of crds values are
added. An alternative is to maintain an index of all crds values
associated with a pubkey.
2020-12-15 01:49:22 +00:00
behzad nouri 409fe3bca1
adds the instance token to crds-labels for node-instance crds-values (#14037)
If a node "a" receives instance-info from node "b1" it will override any
instance-info associated with "b1" pubkey in its crds table. This makes
it less likely that when "b1" receives crds values from "a" (either
through pull or push), it sees other instances of itself (because node
"a" discarded them when it received "b1" instance info).

In order for the crds table to contain all instance-info associated with
the same pubkey at the same time, we need to add the instance tokens to
the keys in the crds table (i.e. the CrdsValueLabel).
2020-12-10 17:01:55 +00:00
behzad nouri 895d7d6a65 removes RwLock on ClusterInfo.instance 2020-12-09 10:24:23 -08:00
behzad nouri 8cd5eb9863 checks for duplicate validator instances using gossip 2020-12-09 10:24:23 -08:00
behzad nouri 26bf2b7e45
processes pull-request callers only once per unique caller (#13750)
process_pull_requests acquires a write lock on crds table to update
records timestamp for each of the pull-request callers:
https://github.com/solana-labs/solana/blob/3087c9049/core/src/crds_gossip_pull.rs#L287-L300
However, pull-requests overlap a lot in callers and this function ends
up doing a lot of redundant duplicate work.

This commit obtains unique callers before acquiring an exclusive lock on
crds table.
2020-11-22 17:51:14 +00:00
behzad nouri 1ffab5de77
breaks prunes data into chunks to fit into packets (#13613)
Validator logs show that prune messages are dropped because they exceed
packet data size:
https://github.com/solana-labs/solana/blob/f25c969ad/perf/src/packet.rs#L90-L92
This can exacerbate gossip traffic by redundantly increasing push
messages across network. The workaround is to break prunes into smaller
chunks and send over in multiple messages.
2020-11-19 16:38:01 +00:00
behzad nouri cbea9ebc34
indexes nodes' contact infos in crds table (#13553)
In several places in gossip code, the entire crds table is scanned only
to filter out nodes' contact infos. Currently on mainnet, crds table is
of size ~70k, while there are only ~470 nodes. So the full table scan is
inefficient. Instead we may maintain an index of only nodes' contact
infos.
2020-11-15 16:38:04 +00:00
Michael Vines 7bc073defe Run `codemod --extensions rs Pubkey::new_rand solana_sdk::pubkey::new_rand` 2020-10-21 19:08:13 -07:00
behzad nouri 57ed4e4657
patches bug in Crds::find_old_labels with pubkey specific timeout (#12528)
Current code only returns values which are expired based on the default
timeout. Example from the added unit test:
  - value inserted at time 0
  - pubkey specific timeout = 1
  - default timeout = 3
Then at now = 2, the value is expired, but the function fails to return
the value because it compares with the default timeout.
2020-09-29 09:04:40 +00:00
Michael Vines 35f5f9fc7b Add feature set identifier to gossiped version information 2020-09-25 11:40:36 -07:00
Ryo Onodera 39b3ac6a8d
Introduce automatic ABI maintenance mechanism (2/2; rollout) (#8012)
* Introduce automatic ABI maintenance mechanism (2/2; rollout)

* Fix stable clippy

* Change to symlink

* Freeze abi of Tower

* fmt...

* Improve dev-experience!

* Update BankSlotDelta

$ diff -u /tmp/abi8/*7dg6BreYxTuxiVz6aLvk3p2Z7GQk2cJqfGvC9h4FAoSj* /tmp/abi8/*9chBcbXVJ4fK7uGgydQzam5aHipaAKFw6V4LDFpjbE4w*
--- /tmp/abi8/bank__BankSlotDelta_frozen_abi__test_abi_digest_7dg6BreYxTuxiVz6aLvk3p2Z7GQk2cJqfGvC9h4FAoSj      2020-06-18 18:01:22.831228087 +0900
+++ /tmp/abi8/bank__BankSlotDelta_frozen_abi__test_abi_digest_9chBcbXVJ4fK7uGgydQzam5aHipaAKFw6V4LDFpjbE4w      2020-07-03 15:59:58.430695244 +0900
@@ -140,7 +140,7 @@
                                                         field u8
                                                             primitive u8
                                                         field solana_sdk::instruction::InstructionError
-                                                            enum InstructionError (variants = 34)
+                                                            enum InstructionError (variants = 35)
                                                                 variant(0) GenericError (unit)
                                                                 variant(1) InvalidArgument (unit)
                                                                 variant(2) InvalidInstructionData (unit)
@@ -176,6 +176,7 @@
                                                                 variant(31) CallDepth (unit)
                                                                 variant(32) MissingAccount (unit)
                                                                 variant(33) ReentrancyNotAllowed (unit)
+                                                                variant(34) MaxSeedLengthExceeded (unit)
                                                     variant(9) CallChainTooDeep (unit)
                                                     variant(10) MissingSignatureForFee (unit)
                                                     variant(11) InvalidAccountIndex (unit)

* Fix some merge conflicts...
2020-07-06 20:22:23 +09:00
Kristofer Peterson e23340d89e
Clippy cleanup for all targets and nighly rust (also support 1.44.0) (#10445)
* address warnings from 'rustup run beta cargo clippy --workspace'

minor refactoring in:
- cli/src/cli.rs
- cli/src/offline/blockhash_query.rs
- logger/src/lib.rs
- runtime/src/accounts_db.rs

expect some performance improvement AccountsDB::clean_accounts()

* address warnings from 'rustup run beta cargo clippy --workspace --tests'

* address warnings from 'rustup run nightly cargo clippy --workspace --all-targets'

* rustfmt

* fix warning stragglers

* properly fix clippy warnings test_vote_subscribe()
replace ref-to-arc with ref parameters where arc not cloned

* Remove lock around JsonRpcRequestProcessor (#10417)

automerge

* make ancestors parameter optional to avoid forcing construction of empty hash maps

Co-authored-by: Greg Fitzgerald <greg@solana.com>
2020-06-09 09:38:14 +09:00
Kristofer Peterson 58ef02f02b
9951 clippy errors in the test suite (#10030)
automerge
2020-05-15 09:35:43 -07:00
Michael Vines 2521f75c18
Advertise node software version in gossip (#9981)
* Advertise node version in gossip

* Remove solana_clap_utils::version! macro
2020-05-11 15:02:01 -07:00
anatoly yakovenko a0514eb2ae
thiserror, docs, remove general Failure case (#9741)
automerge
2020-04-29 18:12:51 -07:00
anatoly yakovenko 193dbb1794
sanitize lowest slots (#9747) 2020-04-27 20:22:30 -07:00
anatoly yakovenko 8ef097bf6f
Input values are not sanitized after they are deserialized, making it far too easy for Leo to earn SOL (#9706)
* sanitize gossip protocol messages
* sanitize transactions
* crds protocol sanitize
2020-04-27 11:06:00 -07:00
Michael Vines 8b14eb9020
Place AccountsHashes in same enum ordinal position as the v1.0 version (#9251)
automerge
2020-04-01 18:47:50 -07:00
sakridge dc347dd3d7
Add Accounts hash consistency halting (#8772)
* Accounts hash consistency halting

* Add option to inject account hash faults for testing.

Enable option in local cluster test to see that node halts.
2020-03-16 08:37:31 -07:00
anatoly yakovenko f64ab49307
Cluster has no way to know which slots are available (#8732)
automerge
2020-03-11 21:31:50 -07:00
Tyera Eulberg ab361a8073
Rename KeypairUtil to Signer (#8360)
automerge
2020-02-20 13:28:55 -08:00
sakridge 1720fe6a46
Snapshot hash gossip changes (#8358) 2020-02-20 11:46:13 -08:00
Pankaj Garg e50bc0d34b
Do not compress small incomplete slot list (#8355)
automerge
2020-02-20 09:48:39 -08:00
Pankaj Garg ea8d9d1aea
Bitwise compress incomplete epoch slots (#8341) 2020-02-19 20:24:09 -08:00
Michael Vines 7305a1f407 Reformatting 2020-02-18 21:08:43 -07:00
Pankaj Garg 0d5c1239c6
Update epoch slots to include all missing slots (#8276)
* Update epoch slots to include all missing slots

* new test for compress/decompress

* address review comments

* limit cache based on size, instead of comparing roots
2020-02-17 12:39:30 -08:00
Sagar Dhawan a95d37ea25
Fix repair slowness when most peers are unable to serve requests (#7287)
* Fix repair when most peers are incapable of serving requests

* Add a test for getting the lowest slot in blocktree

* Replace some more u64s with Slot
2019-12-05 11:25:13 -08:00
anatoly yakovenko 67f636545a Refactor sigverify to stage for signing shreds on the GPU (#6635)
automerge
2019-11-06 10:52:30 -08:00
anatoly yakovenko f54cfcdb8f
Store and persists full stack of tower votes in gossip (#6695)
* vote array

wip

wip

wip

update

gossip index should match tower index

tests build

clippy

test index after expired vote

test

bank specific last vote sync time

* verify

* we are likely to see many more warnings about old votes now
2019-11-04 16:19:54 -08:00
Sagar Dhawan 568475e2db
Fix incorrectly signed CrdsValues (#6696) 2019-11-03 10:07:51 -08:00
Pankaj Garg 4798e7fa73
Integrate data shreds (#5541)
* Insert data shreds in blocktree and database

* Integrate data shreds with rest of the code base

* address review comments, and some clippy fixes

* Fixes to some tests

* more test fixes

* ignore some local cluster tests

* ignore replicator local cluster tests
2019-08-20 17:16:06 -07:00
Sagar Dhawan d7a2b790dc
Limit the size of gossip push and gossip pull response (#5348)
* Limit the size of gossip push and gossip pull response

* Remove Default::default

* Rename var
2019-07-30 15:43:17 -07:00
carllin 8c1b9a0b67
Data plane verification (#4639)
* Add signature to blob

* Change Signable trait to support returning references to signable data

* Add signing to broadcast

* Verify signatures in window_service

* Add testing for signatures to erasure

* Add RPC for getting current slot, consume RPC call in test_repairman_catchup for more deterministic results
2019-06-12 16:43:05 -07:00
carllin 6a9e0bc593 Change EpochSlots to use BtreeSet so that serialization/deserialization returns the same order (#4404)
automerge
2019-05-23 03:50:41 -07:00
carllin b8fd51e97d
Add new gossip structure for supporting repairs (#4205)
* Add Epoch Slots to gossip

* Add new gossip structure to support Repair

* remove unnecessary clones

* Setup dummy fast repair in repair_service

* PR comments
2019-05-08 13:50:32 -07:00
Sagar Dhawan 61f950a60c Sign Gossip Vote Messages 2019-03-19 19:56:17 -07:00
Greg Fitzgerald c1eec0290e
Rename userdata to data (#3282)
* Rename userdata to data

Instead of saying "userdata", which is ambiguous and imprecise,
say "instruction data" or "account data".

Also, add `ProgramError::InvalidInstructionData`

Fixes #2761
2019-03-14 10:48:27 -06:00
Rob Walker 195a880576
pass Pubkeys as refs, copy only where values needed (#3213)
* pass Pubkeys as refs, copy only where values needed

* Pubkey is pervasive

* fixup
2019-03-09 19:28:43 -08:00
Michael Vines 79b2542ca4 Remove CrdsValue::LeaderId 2019-03-08 19:41:51 -08:00
Sagar Dhawan c8c85ff93b
Fix propagation of incorrectly signed messages in Gossip (#3201) 2019-03-08 18:08:24 -08:00
Michael Vines 5f5d779ee1 Move src/ into core/src. Top-level crate is now called solana-workspace 2019-03-02 09:52:18 -08:00