Commit Graph

296 Commits

Author SHA1 Message Date
Jeff Washington (jwash) f39dda00e0
type AccountSecondaryIndexes = HashSet (#17108) 2021-05-10 14:22:48 +00:00
behzad nouri 22c02b917e reads gossip push messages off crds ordinal index
Having an ordinal index on crds values based on insert order allows to
efficiently filter values using a cursor. In particular
CrdsGossipPush::push_messages hash-map can be replaced with a cursor,
saving on the bookkeepings, purging, etc
2021-05-09 22:40:41 +00:00
behzad nouri fa86a335b0
implements cursor for gossip crds table queries (#16952)
VersionedCrdsValue.insert_timestamp is used for fetching crds values
inserted since last query:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298

So it is crucial that insert_timestamp does not go backward in time when
new values are inserted into the table. However std::time::SystemTime is
not monotonic, or due to workload, lock contention, thread scheduling,
etc, ... new values may be inserted with a stalled timestamp way in the
past. Additionally, reading system time for the above purpose is
inefficient/unnecessary.

This commit adds an ordinal index to crds values indicating their insert
order. Additionally, it implements a new Cursor type for fetching values
inserted since last query.
2021-05-06 14:04:17 +00:00
behzad nouri 7cea2c4466 validates gossip addresses before sending pull-requests
IP addresses need to be validated before sending packets to them.
This commit, sends a ping packet to nodes before any pull requests.
Pull requests are then only sent to the nodes which have responded with
the correct hash of their respective ping packet.
2021-05-03 18:21:06 +00:00
behzad nouri 25054bfd35
retains peer's contact-info when making pull requests (#16715)
ClusterInfo::new_pull_requests has to lookup contact-infos:
https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/cluster_info.rs#L1663-L1673

when it was already available when making pull requests:
https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/crds_gossip_pull.rs#L232
2021-04-28 13:19:12 +00:00
Justin Starry 75b8434b76
Add TPU client for sending txs to the current leader tpu port (#16736)
* Add TPU client for sending txs to the current leader tpu port

* Update tpu_client.rs
2021-04-23 09:35:12 +08:00
Justin Starry a7e65c0034
RPC: use finalized as default pubsub commitment level (#16659)
* RPC: use finalized as default pubsub commitment level

* update docs

* Fix tests
2021-04-20 08:19:54 +00:00
Tyera Eulberg 7dfb51c0b4
Cli: move airdrop to rpc requests (#16557)
* Add recent_blockhash to requestAirdrop

* Move tx confirmation to separate method

* Add RpcClient airdrop methods

* Request cli airdrop via RpcClient

* Pass optional faucet_addr into TestValidator and fix tests

* Update client/src/rpc_client.rs

Co-authored-by: Michael Vines <mvines@gmail.com>

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-15 06:25:23 +00:00
Tyera Eulberg 70f3f7e679
Move obsolete rpc endpoints to separate api for removal (#16500)
* Move obsolete rpc methods to separate api for removal

* Remove obsolete method from docs

* Fix test using obs method
2021-04-12 20:33:40 -06:00
behzad nouri 570fd3f810
makes turbine peer computation consistent between broadcast and retransmit (#14910)
get_broadcast_peers is using tvu_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/broadcast_stage.rs#L362-L370
which is potentially inconsistent with retransmit_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1332-L1345

Also, the leader does not include its own contact-info when broadcasting
shreds:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1324
but on the retransmit side, slot leader is removed only _after_ neighbors and
children are computed:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/retransmit_stage.rs#L383-L384
So the turbine broadcast tree is different between the two stages.

This commit:
* Removes retransmit_peers. Broadcast and retransmit stages will use tvu_peers
  consistently.
* Retransmit stage removes slot leader _before_ computing children and
  neighbors.
2021-03-24 13:34:48 +00:00
behzad nouri f2865dfd63
requires stakes for propagating crds values through gossip (#15561) 2021-03-12 15:50:14 +00:00
Justin Starry 918d04e3f0
Add more slot update notifications (#15734)
* Add more slot update notifications

* fix merge

* Address feedback and add integration test

* switch to datapoint

* remove unused shred method

* fix clippy

* new thread for rpc completed slots

* remove extra constant

* fixes

* rely on channel closing

* fix check
2021-03-12 21:44:06 +08:00
Michael Vines 5df36aec7d Pacify clippy 2021-02-19 20:08:41 -08:00
Trent Nelson 7f7370c306 Re-allow clippy::integer_arithmetic at crate-level 2021-02-17 13:55:08 -07:00
sakridge b24cb9840e
Speedup ledger cleanup test (#15304)
Just clone to produce shreds and use a separate insert thread.
2021-02-17 08:59:25 -08:00
Jeff Washington (jwash) ba02452d75
add validator flag no-accounts-db-index-hashing (#15350)
* add validator flag no_accounts_db_index_hashing

* add validator flag no_accounts_db_index_hashing
2021-02-16 21:13:48 +00:00
sakridge 5b8f046c67
More configurable rocksdb compaction (#15213)
rocksdb compaction can cause long stalls, so
make it more configurable to try and reduce those stalls
and also to coordinate between multiple nodes to not induce
stall at the same time.
2021-02-14 10:16:30 -08:00
carllin 990bb426a9
Fix flaky test test_concurrent_snapshot_packaging (#15252) 2021-02-11 16:03:51 -08:00
Jeff Washington (jwash) fabecdc86c
use thread pool for non-index hash calculations (#15149) 2021-02-05 19:48:55 +00:00
behzad nouri 6fd5ec0e4c
caches descendants in bank forks (#15107) 2021-02-05 18:00:45 +00:00
Tyera Eulberg d1563f0ccd
Bump tonic, prost, tarpc, tokio (#15013)
* Update tonic & prost, and regenerate proto

* Reignore doc code

* Revert pull #14367, but pin tokio to v0.2 for jsonrpc

* Bump backoff and goauth -> and therefore tokio

* Bump tokio in faucet, net-utils

* Bump remaining tokio, plus tarpc
2021-02-05 00:21:53 -07:00
Jeff Washington (jwash) 600ff0d915
calculate hash from store instead of index (#15034)
* calculate hash from store instead of index

* restore update hash in abs
2021-02-04 09:00:33 -06:00
behzad nouri 0ad063f4e9
adds flag to disable duplicate instance check (#15006) 2021-02-03 16:26:17 +00:00
Tyera Eulberg 98aa1fa4ea
Upgrade jsonrpc crates to v17.0.0 (#15018)
* Upgrade to jsonrpc 17.0.0

* Fix test

* tree

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-02-02 19:53:08 -07:00
Tom Parker-Shemilt 01230a0105
Remove serial_test_derive dependency (#14891) 2021-01-28 22:35:31 -07:00
Tyera Eulberg ffa5c7dcc8
Deprecate commitment variants (#14797)
* Deprecate commitment variants

* Add new CommitmentConfig builders

* Add helpers to avoid allowing deprecated variants

* Remove deprecated transaction-status code

* Include new commitment variants in runtime commitment; allow deprecated as long as old variants persist

* Remove deprecated banks code

* Remove deprecated variants in core; allow deprecated in rpc/rpc-subscriptions for now

* Heavier hand with rpc/rpc-subscription commitment

* Remove deprecated variants from local-cluster

* Remove deprecated variants from various tools

* Remove deprecated variants from validator

* Update docs

* Remove deprecated client code

* Add new variants to cli; remove deprecated variants as possible

* Don't send new commitment variants to old clusters

* Retain deprecated method in test_validator_saves_tower

* Fix clippy matches! suggestion for BPF solana-sdk legacy compile test

* Refactor node version check to handle commitment variants and transaction encoding

* Hide deprecated variants from cli help

* Add cli App comments
2021-01-26 19:23:07 +00:00
behzad nouri e1021d9f83
removes redundant epoch stakes cache in retransmit (#14781)
Following d6d76219b, staked nodes computed from vote accounts are
already cached in runtime::Stakes, so the caching in retransmit_stage is
redundant.
2021-01-24 21:15:09 +00:00
Michael Vines bf1943e489 Add solana-test-validator --warp-slot argument 2021-01-22 21:17:02 -08:00
behzad nouri 8e581601d6
patches crds vote-index assignment bug (#14438)
If tower is full, old votes are evicted from the front of the deque:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L367-L373
whereas recent votes if expire are evicted from the back:
https://github.com/solana-labs/solana/blob/2074e407c/programs/vote/src/vote_state/mod.rs#L529-L537

As a result, from a single tower_index scalar, we cannot infer which crds-vote
should be overwritten:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L576

In addition there is an off by one bug in the existing code. tower_index is
bounded by MAX_LOCKOUT_HISTORY - 1:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/consensus.rs#L382
So, it is at most 30, whereas MAX_VOTES is 32:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L29
Which means that this branch is never taken:
https://github.com/solana-labs/solana/blob/2074e407c/core/src/crds_value.rs#L590-L593
so crds table alwasys keeps 29 **oldest** votes by wallclock, and then
only overrides the 30st one each time. (i.e a tally of only two most
recent votes).
2021-01-21 13:08:07 +00:00
behzad nouri b5fd0ed859
rewrites turbine retransmit peers computation (#14584) 2021-01-19 04:18:47 +00:00
carllin 6dfad0652f
Cache account stores, flush from AccountsBackgroundService (#13140) 2021-01-11 17:00:23 -08:00
Michael Vines a95675a7ce
Avoid tmp snapshot backlog in SnapshotPackagerService under high load (#14516) 2021-01-11 10:21:15 -08:00
Michael Vines 7be6770808 Rename CompressionType to ArchiveFormat 2021-01-09 09:07:49 -08:00
carllin 5affd8aa72
Add secondary indexes (#14212) 2020-12-31 18:06:03 -08:00
behzad nouri 691031fefd
limits number of crds values returned when responding to pull requests (#13739)
Crds values buffered when responding to pull-requests can be very large taking a lot of memory.
Added a limit for number of buffered crds values based on outbound data budget.
2020-12-18 18:45:12 +00:00
Michael Vines 7143aaa89b Clippy 2020-12-14 08:03:29 -08:00
carllin 55fc963595
Move slot cleanup to AccountsBackgroundService (#13911)
* Move bank drop to AccountsBackgroundService

* Send to ABS on drop instead, protects against other places banks are dropped

* Fix Abi

* test

Co-authored-by: Carl Lin <carl@solana.com>
2020-12-13 01:22:34 +00:00
Michael Vines bbad3fe501 TestValidator now implements Drop, no need to close() it 2020-12-11 04:17:38 +00:00
Michael Vines 0a9ff1dc9d Initial solana-test-validator command-line program 2020-12-11 04:17:38 +00:00
Michael Vines 73111b005f Reduce the number of snapshots 2020-12-01 11:13:37 -08:00
Michael Vines 43b82b31e5 More TestValidator cleanup 2020-11-26 08:56:25 +00:00
Michael Vines b5f7e39be8 TestValidator public interface cleanup 2020-11-25 17:04:37 -08:00
behzad nouri b58f69297f
makes crds fields private (#13703)
Crds fields should maintain several invariants between themselves, so
exposing them as public fields can be bug prone. In addition these
invariants are asserted on every write:
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L138-L154
https://github.com/solana-labs/solana/blob/9668dd85d/core/src/crds.rs#L239-L262
which adds extra instructions and is not optimal. Should these fields be
private the asserts will be redundant.
2020-11-19 20:57:40 +00:00
Tyera Eulberg ef99689592
Improve TestValidator instantiation (#13627)
* Add TestValidator::new_with_fees constructor, and warning for low bootstrap_validator_lamports

* Add logging to solana-tokens integration test to help catch low bootstrap_validator_lamports in the future

* Reasonable TestValidator mint_lamports
2020-11-16 23:27:36 -07:00
behzad nouri 8f0796436a
shares the lock on gossip when processing prune messages (#13339)
Processing prune messages acquires an exclusive lock on gossip:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/cluster_info.rs#L1824-L1825
This can be reduced to a shared lock if active-sets are changed to use
atomic bloom filters:
https://github.com/solana-labs/solana/blob/55b0428ff/core/src/crds_gossip_push.rs#L50
2020-11-05 15:42:00 +00:00
behzad nouri ae91270961
implements ping-pong packets between nodes (#12794)
https://hackerone.com/reports/991106

> It’s possible to use UDP gossip protocol to amplify DDoS attacks. An attacker
> can spoof IP address in UDP packet when sending PullRequest to the node.
> There's no any validation if provided source IP address is not spoofed and
> the node can send much larger PullResponse to victim's IP. As I checked,
> PullRequest is about 290 bytes, while PullResponse is about 10 kB. It means
> that amplification is about 34x. This way an attacker can easily perform DDoS
> attack both on Solana node and third-party server.
>
> To prevent it, need for example to implement ping-pong mechanism similar as
> in Ethereum: Before accepting requests from remote client needs to validate
> his IP. Local node sends Ping packet to the remote node and it needs to reply
> with Pong packet that contains hash of matching Ping packet. Content of Ping
> packet is unpredictable. If hash from Pong packet matches, local node can
> remember IP where Ping packet was sent as correct and allow further
> communication.
>
> More info:
> https://github.com/ethereum/devp2p/blob/master/discv4.md#endpoint-proof
> https://github.com/ethereum/devp2p/blob/master/discv4.md#wire-protocol

The commit adds a PingCache, which maintains records of remote nodes
which have returned a valid response to a ping message, and on-the-fly
ping messages pending a pong response from the remote node.

When handling pull-requests, those from addresses which have not passed
the ping-pong check are filtered out, and additionally ping packets are
added for addresses which need to be (re)verified.
2020-10-28 17:03:02 +00:00
behzad nouri 37c8842bcb
scans crds table in parallel for finding old labels (#13073)
From runtime profiles, the majority time of ClusterInfo::handle_purge
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1605-L1626
is spent scanning crds table finding old labels:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/crds.rs#L175-L197

This can be done in parallel given that gossip thread-pool:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1637-L1641
is idle when handle_purge is invoked:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1681
2020-10-23 14:17:37 +00:00
Michael Vines 959880db60 Remove unused pubkey::Pubkey imports 2020-10-21 19:08:13 -07:00
Michael Vines 7bc073defe Run `codemod --extensions rs Pubkey::new_rand solana_sdk::pubkey::new_rand` 2020-10-21 19:08:13 -07:00
Ryo Onodera c0675968b1
Support Debug Bank (#13017) 2020-10-21 01:05:45 +09:00