Commit Graph

2214 Commits

Author SHA1 Message Date
Michael Vines 7027d56064 Resolve nightly-2021-10-05 clippy complaints 2021-10-06 10:37:58 -07:00
Tao Zhu 03913f6661
add tx count and thread id to stats, each stat reports and resets when slot changes (#20451) 2021-10-06 00:09:19 -05:00
Justin Starry 129716f3f0
Optimize stakes cache and rewards at epoch boundaries (#20432)
* Optimize stakes cache and rewards at epoch boundaries

* Fetch from accounts db

* Add cli flag for disabling epoch boundary optimization
2021-10-06 00:53:26 -04:00
Tao Zhu 6ff508c643
add transaction cost histogram metrics (#20350) 2021-10-05 08:57:39 -05:00
Brooks Prumo 4cd50f5d45
Don't gossip more snapshot hashes than what we retain (#20379) 2021-10-01 15:59:45 -05:00
Lijun Wang fe97cb2ddf
AccountsDb plugin framework (#20047)
Summary of Changes

Create a plugin mechanism in the accounts update path so that accounts data can be streamed out to external data stores (be it Kafka or Postgres). The plugin mechanism allows

Data stores of connection strings/credentials to be configured,
Accounts with patterns to be streamed
PostgreSQL implementation of the streaming for different destination stores to be plugged in.

The code comprises 4 major parts:

accountsdb-plugin-intf: defines the plugin interface which concrete plugin should implement.
accountsdb-plugin-manager: manages the load/unload of plugins and provide interfaces which the validator can notify of accounts update to plugins.
accountsdb-plugin-postgres: the concrete plugin implementation for PostgreSQL
The validator integrations: updated streamed right after snapshot restore and after account update from transaction processing or other real updates.
The plugin is optionally loaded on demand by new validator CLI argument -- there is no impact if the plugin is not loaded.
2021-09-30 14:26:17 -07:00
Jeff Biseda 3854cfaa00
Use batch_send in forward_buffered_packets (#20330) 2021-09-29 20:49:43 -07:00
sakridge 94668c95c2
Prune sigverify queue (#20331) 2021-09-30 05:41:05 +02:00
Brooks Prumo 3ea6a01254
Only gossip snapshot hashes for full snapshots (#20271) 2021-09-27 19:29:08 -05:00
Jeff Biseda 640e93187c
periodically report sigverify_stage stats (#19674) 2021-09-21 10:37:58 -07:00
sakridge 013e1d9d49
Limit transaction forwarding from banking_stage (#19940) 2021-09-21 08:49:41 -07:00
carllin e6b4dd3866
Add bank to banking stage regardless of if there is a working bank (#19855) 2021-09-17 16:55:53 -07:00
Pavel Strakhov 65227f44dc
Optimize RPC pubsub for multiple clients with the same subscription (#18943)
* reimplement rpc pubsub with a broadcast queue

* update tests for new pubsub implementation

* fix: fix review suggestions

* chore(rpc): add additional pubsub metrics

* integrate max subscriptions check into SubscriptionTracker to reduce locking

* separate subscription control from tracker

* limit memory usage of items in pubsub broadcast queue, improve error handling

* add more pubsub metrics

* add final count metrics to pubsub

* add metric for total number of subscriptions

* fix small review suggestions

* remove by_params from SubscriptionTracker and add node_progress_watchers map instead

* add subscription tracker tests

* add metrics for number of pubsub notifications as a counter

* ignore clippy lint in TokenCounter

* fix underflow in token counter

* reduce queue capacity in pubsub tests

* fix(rpc): fix test timeouts

* fix race in account subscription test

* Add RpcSubscriptions::new_for_tests

Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
2021-09-17 13:40:14 -06:00
sakridge dc69cc1ae4
Only allow votes when root distance gets too high (#19917) 2021-09-16 15:12:26 +02:00
Justin Starry ca3f147670
Add banking metrics for buffered and dropped packets (#19902) 2021-09-15 15:53:55 -05:00
Tao Zhu 67fa9945e1
Add few more metrics data points (#19624)
* Add slot, count and accumulated-units to per-program-timings for determining transaction cost elements

* correct the stats naming; fixes the dirty bit resetting
2021-09-15 09:49:49 -05:00
Justin Starry 34c1a9ac85
Report consumed_buffered_packets_count stat to metrics (#19900) 2021-09-15 14:19:39 +00:00
Tyera Eulberg c91519961c
Use f64 for stake math in get_stake_percent_in_gossip (#19895) 2021-09-14 23:36:30 -06:00
Michael 4ff50519ff
Add an info log to indicate the node has reached supermajority and print the active stake percentage (#19893) 2021-09-14 21:48:15 -06:00
Jeff Washington (jwash) b57e86abf2
cache account hash info (#19426)
* cache account hash info

* ledger_path -> accounts_hash_cache_path
2021-09-13 20:39:26 -05:00
carllin 87a7f00926
Track reset bank in PohRecorder (#19810) 2021-09-13 16:55:35 -07:00
Brooks Prumo 62c8bcf565
Add default() to SnapshotConfig (#19776) 2021-09-12 13:44:27 -05:00
Brooks Prumo 7aa5f6b833
Add CLI args for incremental snapshots (#19694)
Add `--incremental-snapshots` flag to enable incremental snapshots.
This will allow setting `--full-snapshot-interval-slots` and
`--incremental-snapshot-interval-slots`.

Also added `--maximum-incremental-snapshots-to-retain`.

Co-authored-by: Michael Vines <mvines@gmail.com>
2021-09-10 15:59:26 -05:00
Michael Vines 4386e09710 Reduce wait for supermajority threshold back to 80% 2021-09-09 21:17:35 -07:00
sakridge 3a8c678f62
Remove some copying (#19691) 2021-09-08 18:32:38 +02:00
Jeff Washington (jwash) 456bf15012
AccountsIndexConfig -> AccountsDbConfig (#19687) 2021-09-08 04:30:38 +00:00
Jeff Washington (jwash) d3f938f0cf
Remove Copy from AccountsIndexConfig. Not all types will support it (#19686) 2021-09-07 20:09:40 -05:00
Brooks Prumo a0552e5b46
Make startup aware of Incremental Snapshots (#19600) 2021-09-07 20:43:43 +00:00
behzad nouri 01a7ec8198
uses rayon thread-pool for retransmit-stage parallelization (#19486) 2021-09-07 15:15:01 +00:00
Brooks Prumo fe8ba81ce6
Rename to is_valid instead of is_invalid (#19670) 2021-09-07 09:31:54 -05:00
Brooks Prumo 9d9482b9d8
Plumb `maximum_incremental_snapshot_archives_to_retain` (#19640) 2021-09-06 18:01:56 -05:00
Sean Young d461a9ac10 verify_precompiles needs FeatureSet
Rather than pass in individual features, pass in the entire feature set
so that we can add the ed25519 program feature in a later commit.
2021-09-05 18:59:37 +01:00
Tyera Eulberg decec3cd8b
Demote write locks on transaction program ids (#19593)
* Add feature

* Demote write lock on program ids

* Fixup bpf tests

* Update MappedMessage::is_writable

* Comma nit

* Review comments
2021-09-04 03:05:30 +00:00
Brooks Prumo 1828579580
Pass SnapshotConfig to SnapshotPackagerService (#19616) 2021-09-03 21:42:32 +00:00
Brooks Prumo 5e25ee5ebe
Add maximum_incremental_snapshot_archives_to_retain to SnapshotConfig (#19612) 2021-09-03 20:21:32 +00:00
Brooks Prumo 7ab0aec61f
Rename maximum_full_snapshot_archives_to_retain (#19610)
To prepare for adding maximum_incremental_snapshot_archives_to_retain,
rename the current field in SnapshotConfig.
2021-09-03 11:28:10 -05:00
Brooks Prumo e9374d32a3
Revert "Make startup aware of Incremental Snapshots (#19550)" (#19599)
This reverts commit d45ced0a5d.
2021-09-02 19:14:41 -05:00
Brooks Prumo d45ced0a5d
Make startup aware of Incremental Snapshots (#19550) 2021-09-02 19:05:15 -05:00
Jeff Biseda 7a8eba10b2
add synchronization comment to handle_new_root (#19571) 2021-09-02 13:52:14 -07:00
Lijun Wang 8378e8790f
Accountsdb replication installment 2 (#19325)
This is the 2nd installment for the AccountsDb replication.

Summary of Changes

The basic google protocol buffer protocol for replicating updated slots and accounts. tonic/tokio is used for transporting the messages.

The basic framework of the client and server for replicating slots and accounts -- the persisting of accounts in the replica-side will be done at the next PR -- right now -- the accounts are streamed to the replica-node and dumped. Replication for information about Bank is also not done in this PR -- to be addressed in the next PR to limit the change size.

Functionality used by both the client and server side are encapsulated in the replica-lib crate.

There is no impact to the existing validator by default.

Tests:

Observe the confirmed slots replicated to the replica-node.
Observe the accounts for the confirmed slot are received at the replica-node side.
2021-09-01 14:10:16 -07:00
behzad nouri 6d9818b8e4
skips retransmit for shreds with unknown slot leader (#19472)
Shreds' signatures should be verified before they reach retransmit
stage, and if the leader is unknown they should fail signature check.
Therefore retransmit-stage can as well expect to know who the slot
leader is and otherwise just skip the shred.

Blockstore checking signature of recovered shreds before sending them to
retransmit stage:
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/blockstore.rs#L884-L930

Shred signature verifier:
https://github.com/solana-labs/solana/blob/4305d4b7b/core/src/sigverify_shreds.rs#L41-L57
https://github.com/solana-labs/solana/blob/4305d4b7b/ledger/src/sigverify_shreds.rs#L105
2021-09-01 15:44:26 +00:00
Brooks Prumo 1d5a8ebc6a
Revert "Add LastFullSnapshotSlot to SnapshotConfig (#19341)" (#19529)
This reverts commit 4d361af976.
2021-08-31 22:03:19 -05:00
Brooks Prumo fe9ee9134a
Make background services aware of incremental snapshots (#19401)
AccountsBackgroundService now knows about incremental snapshots.  It is
now also in charge of deciding if an AccountsPackage is destined to be a
SnapshotPackage or not (or just used by AccountsHashVerifier).

!!! New behavior changes !!!

Taking snapshots (both bank and archive) **MUST** succeed.

This is required because of how the last full snapshot slot is
calculated, which is used by AccountsBackgroundService when calling
`clean_accounts()`.

File system calls are now unwrapped and will result in a crash. As Trent told me:

>Well I think if a snapshot fails due to some IO error, it's very likely that the operator is going to have to intervene before it works.  We should exit error in this case, otherwise the validator might happily spin for several more hours, never successfully writing a complete snapshot, before something else brings it down.  This would leave the validator's last local snapshot many more slots behind than it would be had we exited outright and potentially force the operator to abandon ledger continuity in favor of a quick catchup

Other errors will set the `exit` flag to `true`, and the node will gracefully shutdown.

Fixes #19167 
Fixes #19168
2021-08-31 18:33:27 -05:00
Tyera Eulberg a3bef2e537
Fix shreds-to-hours/days estimations (#19477) 2021-08-30 13:16:06 -06:00
behzad nouri 8ad52fa095
implements copy-on-write for vote-accounts (#19362)
Bank::vote_accounts redundantly clones vote-accounts HashMap even though
an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/95c998a19/runtime/src/bank.rs#L5174-L5186

This commit implements copy-on-write semantics for vote-accounts by
wrapping the underlying HashMap in Arc<...>.
2021-08-30 15:54:01 +00:00
carllin 84db04ce6c
Fix duplicate broadcast test (#19365) 2021-08-27 17:53:24 -07:00
Justin Starry 2d7f036afd
Add solana-program-runtime crate (#19438) 2021-08-27 00:30:36 +00:00
Brooks Prumo 6d939811e9
Name snapshots consistently (#19346)
#### Problem

Snapshot names are overloaded, and there are multiple terms that mean the same thing. This is confusing. Here's a list of ones in the codebase that I've found:

```
- snapshot_dir
- snapshots_dir
- snapshot_path
- snapshot_output_dir
- snapshot_package_output_path
- snapshot_archives_dir
```

#### Summary of Changes

For all the ones that are about the directory where snapshot archives are stored, ensure they are `snapshot_archives_dir`. For the ones about the (bank) snapshots directory, set to `bank_snapshots_dir`.


Co-authored-by: Michael Vines <mvines@gmail.com>
2021-08-21 15:41:03 -05:00
Brooks Prumo 4d361af976
Add LastFullSnapshotSlot to SnapshotConfig (#19341) 2021-08-20 17:06:53 +00:00
behzad nouri 1deb4add81
removes Slot from TransmitShreds (#19327)
An earlier version of the code was funneling through stakes along with
shreds to broadcast:
https://github.com/solana-labs/solana/blob/b67ffab37/core/src/broadcast_stage.rs#L127

This was changed to only slots as stakes computation was pushed further
down the pipeline in:
https://github.com/solana-labs/solana/pull/18971

However shreds themselves embody which slot they belong to. So pairing
them with slot is redundant and adds rooms for bugs should they become
inconsistent.
2021-08-20 13:48:33 +00:00