* Update cost model to use requested_cu instead of estimated cu #27608
* remove CostUpdate and CostModel from replay/tvu
* revive cost update service to send cost tracker stats
* CostModel is now static
* remove unused package
Co-authored-by: Tao Zhu <tao@solana.com>
* Move ConnectionCache back to solana-client, and duplicate ThinClient, TpuClient there
* Dedupe thin_client modules
* Dedupe tpu_client modules
* Move TpuClient to TpuConnectionCache
* Move ThinClient to TpuConnectionCache
* Move TpuConnection and quic/udp trait implementations back to solana-client
* Remove enum_dispatch from solana-tpu-client
* Move udp-client to its own crate
* Move quic-client to its own crate
The manual Blockstore compaction that was being initiated from
LedgerCleanupService has been disabled for quite some time in favor of
several optimizations.
Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
--tpu-enable-udp is introduced. And when this is on, the transaction receive and transaction forward is enabled using udp.
Except for a few tests which was hard-coded sending transactions using udp, most tests are being done with udp based tpu disabled.
* Plumb priority_fee_cache into rpc
* Add PrioritizationFeeCache api
* Add getRecentPrioritizationFees rpc endpoint
* Use MAX_TX_ACCOUNT_LOCKS to limit input keys
* Remove unused cache apis
* Map fee data by slot, and make rpc account inputs optional
* Add priority_fee_cache to rpc test framework, and add test
* Add endpoint to jsonrpc docs
* Update docs/src/developing/clients/jsonrpc-api.md
* Update docs/src/developing/clients/jsonrpc-api.md
In kin-sim, we found that bounded channel causes halt for account
background services. As the number of accounts grows, the time for
pruning and cleaning increases, which would leads to longer intervals
between the pruning of deaded bank slots. With 1.7B accounts, we will
exceed the 10K bounded channel threshold that causes halt of account
back ground services. Without pruning, the node will eventually run out
of memory.
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
* Create a new function cleanup_accounts_paths, a trivial change
* Remove account files asynchronously
* Update and simplify the implementation after the validator test runs.
* Fixes after testing on the dev device
* Discard tokio. Use thread instead
* Fix comments format
* Fix config type to pass the github test
* Fix failed tests. Handle the case of non-existing path
* Final cleanup, addressing the review comments
Avoided OsString.
Made the function more generic with "impl AsRef<Path>"
Co-authored-by: Jeff Washington <jeff.washington@solana.com>
Prior to this change, long running commands like `solana-ledger-tool
verify` would OOM due to AccountsDb cleanup not happening.
Co-authored-by: Michael Vines <mvines@gmail.com>
* Use client certs in QUIC to get peer's stake
* fixes to cert processing
* integrate the code
* clippy
* more cleanup
* sort cargo deps
* test fixes
* info -> debug
* Remove UseQuic type
Move to storing the UdpSocket on ConnectionCache and accepting a bool
* Remove use_quic from ConnectionCache constructor
Replace with separate with_udp constructor to force callers to choose
* allow initial hash calc to occur in bg
* validator_initialized -> startup_verification_complete
* add infos for leader and vote
* rework snapshot for startup verification
* change to assert
Shred slot and parent are not verified until window-service where
resources are already wasted to sig-verify and deserialize shreds.
This commit moves above verification to earlier in the pipeline in fetch
stage.
* Make sure to root local slots even with hard fork
* Address review comments
* Cleanup a bit
* Further clean up
* Further clean up a bit
* Add comment
* Tweak hard fork reconciliation code placement
* Connection pool in connection cache and handle connection errors
1. The connection not has a pool of connections per address, configurable, default 4
2. The connections per address share a lazy initialized endpoint
3. Handle connection issues better, avoid race conditions
4. Various log improvement for help debug connection issues
#### Problem
blockstore clean and compact is quite slow with wait-for-supermajority purge and can take 20-30 minutes
as described in #25710.
#### Summary of Changes
This PR removes the compaction logic in backup_and_clear_blockstore as the
actual the restoration from a bad fork is handled by `blockstore.purge_slots`
(which is done by issuing rocksdb range-delete that makes the bad fork
unavailable.)
Compaction is irreverent to the shred version, as its main job in this context
is to reclaim disk storage from the deleted slots, which we can let the rocksdb
automatic background compaction to handle it.
Fixes#25710
* client: Remove static connection cache, plumb it instead
* Add TpuClient::new_with_connection_cache to not break downstream
* Refactor get_connection and RwLock into ConnectionCache
* Fix merge conflicts from new async TpuClient
* Remove `ConnectionCache::set_use_quic`
* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
* Spawn QUIC server to receive forwarded txs
* Update validator port range
* forward votes using UDP
* no forwarding from unstaked nodes
* forwarding stats in banking stage
* fix test builds
* fix lifetime of forward sender
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.
#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.
By doing this, we address the mutual dependency and also make the code cleaner.
* panic when test timeout
* nonblocking send when when droping banks
* debug log
* timeout for tvu
* unused varaible
* timeout for tpu
* Revert "debug log"
This reverts commit da780a3301a51d7c496141a85fcd35014fe6dff5.
* add timeout const
* fix typo
* Revert "nonblocking send when when droping banks".
I will create another pull request for this.
This reverts commit 088c98ec0facf825b5eca058fb860deba6d28888.
* Update core/src/tpu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/tpu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/tvu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/tvu.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* Update core/src/validator.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* initial work for poh timing report service
* add poh_timing_report_service to validator
* fix comments
* clippy
* imrove test coverage
* delete record when complete
* rename shred full to slot full.
* debug logging
* fix slot full
* remove debug comments
* adding fmt trait
* derive default
* default for poh timing reporter
* better comments
* remove commented code
* fix test
* more test fixes
* delete timestamps for slot that are older than root_slot
* debug log
* record poh start end in bank reset
* report full to start time instead
* fix poh slot offset
* report poh start for normal ticks
* fix typo
* refactor out poh point report fn
* rename
* optimize delete - delete only when last_root changed
* change log level to trace
* convert if to match
* remove redudant check
* fix SlotPohTiming comments
* review feedback on poh timing reporter
* review feedback on poh_recorder
* add test case for out-of-order arrival of timing points and incomplete timing points
* refactor poh_timing_points into its own mod
* remove option for poh_timing_report service
* move poh_timing_point_sender to constructor
* clippy
* better comments
* more clippy
* more clippy
* add slot poh timing point macro
* clippy
* assert in test
* comments and display fmt
* fix check
* assert format
* revise comments
* refactor
* extrac send fn
* revert reporting_poh_timing_point
* align loggin
* small refactor
* move type declaration to the top of the module
* replace macro with constructor
* clippy: remove redundant closure
* review comments
* simplify poh timing point creation
Co-authored-by: Haoran Yi <hyi@Haorans-MacBook-Air.local>