Commit Graph

448 Commits

Author SHA1 Message Date
Brennan Watt f3b760dd91
Add IO metrics (#26804)
* Add Disk IO metrics
2022-08-02 14:29:53 -07:00
behzad nouri ec36f0c5df removes redundant Option<&Arc<...>> wrapper for Blockstore in serve-repair 2022-08-02 15:30:53 +00:00
behzad nouri 6423da0218 removes redundant Arc<RwLock<...>> wrapper off ServeRepair 2022-08-02 15:30:53 +00:00
Jeff Biseda 857be1e237
sign repair requests (#26833) 2022-07-31 15:48:51 -07:00
Pankaj Garg fb922f613c
Compute maximum parallel QUIC streams using client stake (#26802)
* Compute maximum parallel QUIC streams using client stake

* clippy fixes

* Add unit test
2022-07-29 08:44:24 -07:00
Trent Nelson a603c8b0bc Enable QUIC servers by default 2022-07-22 15:45:10 -06:00
Trent Nelson 2ee19f536a Revert "Revert "core: disable quic servers on mainnet-beta" (#26216)"
This reverts commit 4a7fb2a808.
2022-07-22 15:45:10 -06:00
Brennan Watt 502f249904
Add proc net dev metrics to net stats (#26603)
* Add proc net dev metrics to net stats
2022-07-20 11:44:36 -07:00
sakridge 4a7fb2a808
Revert "core: disable quic servers on mainnet-beta" (#26216)
Enable QUIC server
2022-07-20 20:37:24 +02:00
Pankaj Garg ea7448c568
Use client certs in QUIC to get peer's stake (#26477)
* Use client certs in QUIC to get peer's stake

* fixes to cert processing

* integrate the code

* clippy

* more cleanup

* sort cargo deps

* test fixes

* info -> debug
2022-07-11 18:06:40 +00:00
Nicholas Clarke ee0a40937e
Add validator argument log_messages_bytes_limit to change log truncation limit.
Add new cli argument log_messages_bytes_limit to solana-validator to control how long program logs can be before truncation
2022-07-11 10:53:18 -05:00
Nick Rempel 7e4a5de99c
Refactor ConnectionCache::use_quic (#26235)
* Remove UseQuic type

Move to storing the UdpSocket on ConnectionCache and accepting a bool

* Remove use_quic from ConnectionCache constructor

Replace with separate with_udp constructor to force callers to choose
2022-07-05 10:49:42 -07:00
behzad nouri 61f0a7d9c3
replaces Mutex<PohRecorder> with RwLock<PohRecorder> (#26370)
Mutex causes superfluous lock contention when a read-only reference suffices.
2022-07-05 14:29:44 +00:00
Jeff Washington (jwash) 557bf6e656
allow initial hash calc to occur in bg (#26271)
* allow initial hash calc to occur in bg

* validator_initialized -> startup_verification_complete

* add infos for leader and vote

* rework snapshot for startup verification

* change to assert
2022-06-29 16:48:33 -05:00
behzad nouri 348fe9ebe2
verifies shred slot and parent in fetch stage (#26225)
Shred slot and parent are not verified until window-service where
resources are already wasted to sig-verify and deserialize shreds.
This commit moves above verification to earlier in the pipeline in fetch
stage.
2022-06-28 12:45:50 +00:00
Brooks Prumo 662818ef0d
Use `VoteAccount::node_pubkey()` (#26207) 2022-06-27 09:09:06 -05:00
Ryo Onodera cd2878acf9
Avoid to miss to root for local slots before the hard fork (#19912)
* Make sure to root local slots even with hard fork

* Address review comments

* Cleanup a bit

* Further clean up

* Further clean up a bit

* Add comment

* Tweak hard fork reconciliation code placement
2022-06-26 15:14:17 +09:00
Jeff Biseda bafdb7dd62
Revert handle start_http failure in rpc_service (#25400) (#26130)
* revert e263be2000
2022-06-22 10:52:27 -07:00
HaoranYi b5d0c7b468
Revert "tvu and tpu timeout on joining its microservices (#24111)" (#26132)
This reverts commit e105547c14.
2022-06-22 10:57:46 -05:00
Pankaj Garg 43ff65ece9
Use single send socket in UdpTpuConnection (#26105) 2022-06-21 14:56:21 -07:00
Boqin Qin(秦 伯钦) 611d2ec73c
core: fix double-readlock in validator (#26053) 2022-06-20 15:07:00 +00:00
Trent Nelson a5f290a66f core: disable quic servers on mainnet-beta 2022-06-17 20:04:05 -06:00
Lijun Wang 29b597cea5
Connection pool support in connection cache and QUIC connection reliability improvement (#25793)
* Connection pool in connection cache and handle connection errors

1. The connection not has a pool of connections per address, configurable, default 4
2. The connections per address share a lazy initialized endpoint
3. Handle connection issues better, avoid race conditions
4. Various log improvement for help debug connection issues
2022-06-10 09:25:24 -07:00
Yueh-Hsuan Chiang ee4469c882
Skip compaction in backup_and_clear_blockstore (#25810)
#### Problem
blockstore clean and compact is quite slow with wait-for-supermajority purge and can take 20-30 minutes
as described in #25710.

#### Summary of Changes
This PR removes the compaction logic in backup_and_clear_blockstore as the
actual the restoration from a bad fork is handled by `blockstore.purge_slots`
(which is done by issuing rocksdb range-delete that makes the bad fork
unavailable.)

Compaction is irreverent to the shred version, as its main job in this context
is to reclaim disk storage from the deleted slots, which we can let the rocksdb
automatic background compaction to handle it.

Fixes #25710
2022-06-09 17:11:50 +08:00
Jon Cinque 79a8ecd0ac
client: Remove static connection cache, plumb it instead (#25667)
* client: Remove static connection cache, plumb it instead

* Add TpuClient::new_with_connection_cache to not break downstream

* Refactor get_connection and RwLock into ConnectionCache

* Fix merge conflicts from new async TpuClient

* Remove `ConnectionCache::set_use_quic`

* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
2022-06-08 13:57:12 +02:00
Brennan Watt ba04063956
Add CPUmetrics (#25802)
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
2022-06-07 11:34:25 -07:00
Pankaj Garg 1c2ae470c5
Fix forwarding of transactions over QUIC (#25674)
* Spawn QUIC server to receive forwarded txs

* Update validator port range

* forward votes using UDP

* no forwarding from unstaked nodes

* forwarding stats in banking stage

* fix test builds

* fix lifetime of forward sender
2022-06-02 11:14:58 -07:00
Ryo Onodera aedcb05dc8
Record solana-validator ver to metrics at startup (#25635)
* Record solana-validator ver to metrics at startup

* Update Cargo.lock
2022-06-01 13:37:50 +09:00
Yueh-Hsuan Chiang 5b67960c76
(Refactor) Move blocktore options related stuff to blockstore_options.rs (#25509)
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.

#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.

By doing this, we address the mutual dependency and also make the code cleaner.
2022-05-26 16:59:26 -07:00
Michael Vines b05c7d91ed Fix derive_partial_eq_without_eq clippy lint 2022-05-22 22:22:21 -07:00
Jeff Biseda e263be2000
handle start_http failure in rpc_service (#25400) 2022-05-20 17:59:23 -07:00
Jeff Washington (jwash) 3a4f0d3397
println -> info (#25163) 2022-05-12 11:07:13 -05:00
HaoranYi 41d34d45e0
pass exit by ref (#25120) 2022-05-11 09:17:21 -05:00
DimAn 2fa9bc3e70
Add options to store full and/or incremental snapshots in separate locations (#24247) 2022-05-10 16:37:41 -04:00
Pankaj Garg 88c16c0176
Check if quic is enabled before warming up quic connections (#24821)
* Check if quic is enabled before warming up quic connections

* fix after rebase

* don't start warmup service if quic not enabled

* fix test
2022-05-01 03:52:38 +00:00
Michael Vines d0a8a16a57 ReplayStage no longer relies on Validator to reset the poh recorder at start 2022-04-22 21:17:49 -07:00
Michael Vines 84e3342612 Process blockstore after starting the TVU 2022-04-22 21:17:49 -07:00
Michael Vines 83e041299a Run real snapshot packager while processing blockstore at validator startup 2022-04-22 21:17:49 -07:00
Justin Starry c544742091
Local cluster test cleanup and refactoring (#24559)
* remove FixedSchedule.start_epoch

* use duration for timing

* Rename to partition bool to turbine_disabled

* simplify partition config
2022-04-22 12:14:07 +08:00
Michael Vines 05f32f287c solana-validator monitor now reports slot-level progress while loading blockstore 2022-04-19 22:09:48 -07:00
Michael Vines 9e4999ef6a Remove halt_at_slot from RuntimeConfig, it's not a runtime concern 2022-04-19 19:23:58 -07:00
Michael Vines 988210908c Move verify_udp_stats_access out of the way 2022-04-19 19:23:58 -07:00
Michael Vines c6f3da4879 blockstore_processor now accepts an Arc<Rwlock<BankForks>> 2022-04-19 19:23:58 -07:00
Michael Vines 0e2e0c8b7d Extract most storage-related services from the Tvu abstraction 2022-04-19 19:23:58 -07:00
Michael Vines 268a2109de Relocate hard forks info log 2022-04-19 19:23:58 -07:00
Michael Vines dd766042df Remove LedgerMetricReportService from TVU 2022-04-19 19:23:58 -07:00
sakridge 7a4a6597c0
Don't enforce ulimit for validator test config (#24272) 2022-04-12 22:06:37 +02:00
Jon Cinque 9b8850f99e
test-validator: Add `--max-compute-units` flag (#24130)
* test-validator: Add `--max-compute-units` flag

* Add `RuntimeConfig` for tweaking runtime behavior

* Actually add the file

* Move RuntimeConfig to runtime
2022-04-12 02:28:10 +02:00
Giorgio Gambino 60b2155bd3
Add accounts-filler-size command line option (#23896) 2022-04-11 13:10:09 -05:00
HaoranYi e105547c14
tvu and tpu timeout on joining its microservices (#24111)
* panic when test timeout

* nonblocking send when when droping banks

* debug log

* timeout for tvu

* unused varaible

* timeout for tpu

* Revert "debug log"

This reverts commit da780a3301a51d7c496141a85fcd35014fe6dff5.

* add timeout const

* fix typo

* Revert "nonblocking send when when droping banks".
I will create another pull request for this.

This reverts commit 088c98ec0facf825b5eca058fb860deba6d28888.

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tpu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/tvu.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

* Update core/src/validator.rs

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>

Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
2022-04-07 20:20:13 -05:00