Commit Graph

320 Commits

Author SHA1 Message Date
Brooks Prumo b05fb87f22
Add test_bank_forks_incremental_snapshot() (#18565)
This commit builds on PR #18504 by adding a test to core/tests/snapshot.rs for Incremental Snapshots. The test adds banks to bank forks in a loop and takes both full snapshots and incremental snapshots at intervals, and validates they are rebuild-able.

For background info about Incremental Snapshots, see #17088.

Fixes #18829 and #18972
2021-07-29 16:46:54 -05:00
Tyera Eulberg 8596db8f53
Bump jsonrpc crates and remove old tokio (#18779)
* Bump jsonrpc crates and replace old tokio

* Bump tokio

* getBlockTime

* getBlocks

* getBlocksWithLimit, getInflationReward

* getBlock

* getFirstAvailableBlock

* getTransaction

* getSignaturesForAddress

* getSignatureStatuses

* Remove superfluous runtime
2021-07-26 12:32:17 -06:00
behzad nouri d2d5f36a3c
adds validator flag to allow private ip addresses (#18850) 2021-07-23 15:25:03 +00:00
Brooks Prumo d1debcd971
Add incremental snapshot utils (#18504)
This commit adds high-level functions for creating and loading-from
incremental snapshots, plus all low-level functions required to perform
those tasks.  This commit **does not** add taking incremental snapshots
as part of a running validator, nor starting up a node with an
incremental snapshot; just laying ground work.

Additionally, `snapshot_utils` and `serde_snapshot` have been
refactored to use a common code paths for the different snapshots.

Also of note, some renaming has happened:
  1. Snapshots are now either `full_` or `incremental_` throughout the
     codebase.  If not specified, the code applies to both.
  2. Bank snapshots now are called "bank snapshots"
     (before they were called "slot snapshots", "bank snapshots", or
      just "snapshots").  The one exception is within `Bank`, where they
     are still just "snapshots", because they are already "bank
     snapshots".
  3. Snapshot archives now have `_archive` in the code.  This
     should clear up an ambiguity between bank snapshots and snapshot
     archives.
2021-07-22 14:40:37 -05:00
Jeff Washington (jwash) d092fa1f03
add ledger-tool verify verify-accounts-index option (#18375)
* add ledger-tool verify verify-accounts-index option

* comment, merge, respond to feedback, cleanup
2021-07-13 11:06:18 -05:00
Brooks Prumo 45d54b1fc6
Add SnapshotArchiveInfo and refactor functions in snapshot_utils (#18232) 2021-07-01 12:20:56 -05:00
Brooks Prumo 89a3e4f91e
Move SnapshotConfig into its own module (#18331)
Also move ArchiveFormat to snapshot_utils, and do not
reexport SnapshotVersion.
2021-07-01 08:55:26 -05:00
Alexander Meißner 6514096a67 chore: cargo +nightly clippy --fix -Z unstable-options 2021-06-18 10:42:46 -07:00
Jeff Washington (jwash) dbd4dc04b0
ledger tool limit_load_slot_count_from_snapshot avoids assert failures (#17974) 2021-06-15 15:39:22 -05:00
Jeff Washington (jwash) f558b9b6bf
verify bank hash on startup with ledger tool option (#17939) 2021-06-15 11:52:12 -05:00
Jeff Washington (jwash) 471b34132e
add metrics for startup (#17913)
* add metrics for startup

* roll timings up higher

* fix test

* fix duplicate
2021-06-14 17:46:49 -05:00
Jeff Washington (jwash) e6bbd4b3f0
add metrics to handle_snapshot_requests (#17937) 2021-06-14 15:46:19 -05:00
Lijun Wang 269d995832
Make account shrink configurable #17544 (#17778)
1. Added both options for measuring space usage using total accounts usage and for individual store shrink ratio using an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions added
4. The default implementations is kept at 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true

Fixes #17544
2021-06-09 21:21:32 -07:00
Ryo Onodera 1f97b2365f
Avoid full-range compactions with periodic filtered b.g. ones (#16697)
* Update rocksdb to v0.16.0

* Promote the infrequent and important log to info!

* Force background compaction by ttl without manual compaction

* Fix test

* Support no compaction mode in test_ledger_cleanup_compaction

* Fix comment

* Make compaction_interval customizable

* Avoid major compaction with periodic filtering...

* Adress lazy_static, special cfs and range check

* Clean up a bit and add comment

* Add comment

* More comments...

* Config code cleanup

* Add comment

* Use .conflicts_with()

* Nullify unneeded delete_range ops for special CFs

* Some clean ups

* Clarify the locking intention

* Ensure special CFs' consistency with PurgeType::CompactionFilter

* Fix comment

* Fix bad copy paste

* Fix various types...

* Don't use tuples

* Add a unit test for compaction_filter

* Fix typo...

* Remove flag and just use new behavior always

* Fix wrong condition negation...

* Doc. about no set_last_purged_slot in purge_slots

* Write a test and fix off-by-one bug....

* Apply suggestions from code review

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Follow up to github review suggestions

* Fix line-wrapping

* Fix conflict

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-05-28 16:42:56 +09:00
Jeff Washington (jwash) 6b9d8d41a3
add --limit_load_slot_count_from_snapshot to ledger-tool (#17417) 2021-05-26 10:36:12 -05:00
Tyera Eulberg 9a5330b7eb
Move gossip modules into solana-gossip crate (#17352)
* Move gossip modules to solana-gossip

* Update Protocol abi digest due to move

* Move gossip benches and hook up CI

* Remove unneeded Result entries

* Single use statements
2021-05-26 09:15:46 -06:00
behzad nouri e867d7f3b8
removes Crds::lookup and lookup_versioned (#17438) 2021-05-24 18:21:54 +00:00
behzad nouri 9d112cf41f
encapsulates purged values bookkeeping into crds module (#17265)
For all code paths (gossip push, pull, purge, etc) that remove or
override a crds value, it is necessary to record hash of values purged
from crds table, in order to exclude them from subsequent pull-requests;
otherwise the next pull request will likely return outdated values,
wasting bandwidth:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491

Currently this is done all over the place in multiple modules, and this
has caused bugs in the past where purged values were not recorded.

This commit encapsulated this bookkeeping into crds module, so that any
code path which removes or overrides a crds value, also records the hash
of purged value in-place.
2021-05-24 13:47:21 +00:00
behzad nouri 71de021177 adds metric for turbine retransmit tree mismatch
In order to remove port-based forwarding logic in turbine, we need to
first track how often the turbine retransmit/broadcast trees mismatch
across nodes.
One consistency condition is that if the node is on the critical path
(i.e. the first node in each neighborhood), then we expect that the
packet arrives at tvu socket as opposed to tvu-forwards.

This commit adds a metric to track how often above condition is not met.
2021-05-21 17:10:56 +00:00
behzad nouri 2adce67260
extends crds values timeouts if stakes are unknown (#17261)
If stakes are unknown, then timeouts will be short, resulting in values
being purged from the crds table, and consequently higher pull-response
load when they are obtained again from gossip. In particular, this slows
down validator start where almost all values obtained from entrypoint
are immediately discarded.
2021-05-21 15:55:22 +00:00
Tao Zhu 0781fe1b4f
Upgrade Rust to 1.52.0 (#17096)
* Upgrade Rust to 1.52.0
update nightly_version to newly pushed docker image
fix clippy lint errors
1.52 comes with grcov 0.8.0, include this version to script

* upgrade to Rust 1.52.1

* disabling Serum from downstream projects until it is upgraded to Rust 1.52.1
2021-05-19 09:31:47 -05:00
Tyera Eulberg 827355a6b1
Create solana-rpc crate and move subscriptions (#17320)
* Move non_circulating_supply to runtime

* Add solana-rpc crate and move max_slots

* Move subscriptions to solana-rpc

* Single use statements
2021-05-19 00:54:28 -06:00
behzad nouri 0e646d10bb
prunes received-cache only once per unique owner's key (#17039) 2021-05-13 13:50:16 +00:00
Lijun Wang 9c42a89a43
Issue #17008 -- make snapshot archives to hold on to configurable. (#17158)
* purge_old_snapshot_archives is changed to take an extra argument 'maximum_snapshots_to_retain' to control the max number of latest snapshot archives to retain. Note the oldest snapshot is always retained as before and is not subjected to this new options.
* The validator and ledger-tool executables are modified with a CLI argument --maximum-snapshots-to-retain. And the options are propagated down the call chains. Their corresponding shell scripts were changed accordingly.
* SnapshotConfig is modified to have an extra field for the maximum_snapshots_to_retain
* Unit tests are developed to cover purge_old_snapshot_archives
2021-05-12 10:32:27 -07:00
Jeff Washington (jwash) f39dda00e0
type AccountSecondaryIndexes = HashSet (#17108) 2021-05-10 14:22:48 +00:00
behzad nouri 22c02b917e reads gossip push messages off crds ordinal index
Having an ordinal index on crds values based on insert order allows to
efficiently filter values using a cursor. In particular
CrdsGossipPush::push_messages hash-map can be replaced with a cursor,
saving on the bookkeepings, purging, etc
2021-05-09 22:40:41 +00:00
behzad nouri fa86a335b0
implements cursor for gossip crds table queries (#16952)
VersionedCrdsValue.insert_timestamp is used for fetching crds values
inserted since last query:
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1197-L1215
https://github.com/solana-labs/solana/blob/ec37a843a/core/src/cluster_info.rs#L1274-L1298

So it is crucial that insert_timestamp does not go backward in time when
new values are inserted into the table. However std::time::SystemTime is
not monotonic, or due to workload, lock contention, thread scheduling,
etc, ... new values may be inserted with a stalled timestamp way in the
past. Additionally, reading system time for the above purpose is
inefficient/unnecessary.

This commit adds an ordinal index to crds values indicating their insert
order. Additionally, it implements a new Cursor type for fetching values
inserted since last query.
2021-05-06 14:04:17 +00:00
behzad nouri 7cea2c4466 validates gossip addresses before sending pull-requests
IP addresses need to be validated before sending packets to them.
This commit, sends a ping packet to nodes before any pull requests.
Pull requests are then only sent to the nodes which have responded with
the correct hash of their respective ping packet.
2021-05-03 18:21:06 +00:00
behzad nouri 25054bfd35
retains peer's contact-info when making pull requests (#16715)
ClusterInfo::new_pull_requests has to lookup contact-infos:
https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/cluster_info.rs#L1663-L1673

when it was already available when making pull requests:
https://github.com/solana-labs/solana/blob/a1ef2bd74/core/src/crds_gossip_pull.rs#L232
2021-04-28 13:19:12 +00:00
Justin Starry 75b8434b76
Add TPU client for sending txs to the current leader tpu port (#16736)
* Add TPU client for sending txs to the current leader tpu port

* Update tpu_client.rs
2021-04-23 09:35:12 +08:00
Justin Starry a7e65c0034
RPC: use finalized as default pubsub commitment level (#16659)
* RPC: use finalized as default pubsub commitment level

* update docs

* Fix tests
2021-04-20 08:19:54 +00:00
Tyera Eulberg 7dfb51c0b4
Cli: move airdrop to rpc requests (#16557)
* Add recent_blockhash to requestAirdrop

* Move tx confirmation to separate method

* Add RpcClient airdrop methods

* Request cli airdrop via RpcClient

* Pass optional faucet_addr into TestValidator and fix tests

* Update client/src/rpc_client.rs

Co-authored-by: Michael Vines <mvines@gmail.com>

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-04-15 06:25:23 +00:00
Tyera Eulberg 70f3f7e679
Move obsolete rpc endpoints to separate api for removal (#16500)
* Move obsolete rpc methods to separate api for removal

* Remove obsolete method from docs

* Fix test using obs method
2021-04-12 20:33:40 -06:00
behzad nouri 570fd3f810
makes turbine peer computation consistent between broadcast and retransmit (#14910)
get_broadcast_peers is using tvu_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/broadcast_stage.rs#L362-L370
which is potentially inconsistent with retransmit_peers:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1332-L1345

Also, the leader does not include its own contact-info when broadcasting
shreds:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/cluster_info.rs#L1324
but on the retransmit side, slot leader is removed only _after_ neighbors and
children are computed:
https://github.com/solana-labs/solana/blob/84e52b606/core/src/retransmit_stage.rs#L383-L384
So the turbine broadcast tree is different between the two stages.

This commit:
* Removes retransmit_peers. Broadcast and retransmit stages will use tvu_peers
  consistently.
* Retransmit stage removes slot leader _before_ computing children and
  neighbors.
2021-03-24 13:34:48 +00:00
behzad nouri f2865dfd63
requires stakes for propagating crds values through gossip (#15561) 2021-03-12 15:50:14 +00:00
Justin Starry 918d04e3f0
Add more slot update notifications (#15734)
* Add more slot update notifications

* fix merge

* Address feedback and add integration test

* switch to datapoint

* remove unused shred method

* fix clippy

* new thread for rpc completed slots

* remove extra constant

* fixes

* rely on channel closing

* fix check
2021-03-12 21:44:06 +08:00
Michael Vines 5df36aec7d Pacify clippy 2021-02-19 20:08:41 -08:00
Trent Nelson 7f7370c306 Re-allow clippy::integer_arithmetic at crate-level 2021-02-17 13:55:08 -07:00
sakridge b24cb9840e
Speedup ledger cleanup test (#15304)
Just clone to produce shreds and use a separate insert thread.
2021-02-17 08:59:25 -08:00
Jeff Washington (jwash) ba02452d75
add validator flag no-accounts-db-index-hashing (#15350)
* add validator flag no_accounts_db_index_hashing

* add validator flag no_accounts_db_index_hashing
2021-02-16 21:13:48 +00:00
sakridge 5b8f046c67
More configurable rocksdb compaction (#15213)
rocksdb compaction can cause long stalls, so
make it more configurable to try and reduce those stalls
and also to coordinate between multiple nodes to not induce
stall at the same time.
2021-02-14 10:16:30 -08:00
carllin 990bb426a9
Fix flaky test test_concurrent_snapshot_packaging (#15252) 2021-02-11 16:03:51 -08:00
Jeff Washington (jwash) fabecdc86c
use thread pool for non-index hash calculations (#15149) 2021-02-05 19:48:55 +00:00
behzad nouri 6fd5ec0e4c
caches descendants in bank forks (#15107) 2021-02-05 18:00:45 +00:00
Tyera Eulberg d1563f0ccd
Bump tonic, prost, tarpc, tokio (#15013)
* Update tonic & prost, and regenerate proto

* Reignore doc code

* Revert pull #14367, but pin tokio to v0.2 for jsonrpc

* Bump backoff and goauth -> and therefore tokio

* Bump tokio in faucet, net-utils

* Bump remaining tokio, plus tarpc
2021-02-05 00:21:53 -07:00
Jeff Washington (jwash) 600ff0d915
calculate hash from store instead of index (#15034)
* calculate hash from store instead of index

* restore update hash in abs
2021-02-04 09:00:33 -06:00
behzad nouri 0ad063f4e9
adds flag to disable duplicate instance check (#15006) 2021-02-03 16:26:17 +00:00
Tyera Eulberg 98aa1fa4ea
Upgrade jsonrpc crates to v17.0.0 (#15018)
* Upgrade to jsonrpc 17.0.0

* Fix test

* tree

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-02-02 19:53:08 -07:00
Tom Parker-Shemilt 01230a0105
Remove serial_test_derive dependency (#14891) 2021-01-28 22:35:31 -07:00
Tyera Eulberg ffa5c7dcc8
Deprecate commitment variants (#14797)
* Deprecate commitment variants

* Add new CommitmentConfig builders

* Add helpers to avoid allowing deprecated variants

* Remove deprecated transaction-status code

* Include new commitment variants in runtime commitment; allow deprecated as long as old variants persist

* Remove deprecated banks code

* Remove deprecated variants in core; allow deprecated in rpc/rpc-subscriptions for now

* Heavier hand with rpc/rpc-subscription commitment

* Remove deprecated variants from local-cluster

* Remove deprecated variants from various tools

* Remove deprecated variants from validator

* Update docs

* Remove deprecated client code

* Add new variants to cli; remove deprecated variants as possible

* Don't send new commitment variants to old clusters

* Retain deprecated method in test_validator_saves_tower

* Fix clippy matches! suggestion for BPF solana-sdk legacy compile test

* Refactor node version check to handle commitment variants and transaction encoding

* Hide deprecated variants from cli help

* Add cli App comments
2021-01-26 19:23:07 +00:00