### Problem
When FIFO compaction is used while --rocksdb_fifo_shred_storage_size
is unspecified, the FIFO shred storage size is set to a const default based
on the default `--limit-ledger-size`.
### Summary of the Change
When --rocksdb_fifo_shred_storage_size is unspecified, it is now
derived from `--limit-ledger-size` by reserving 1500 bytes for each
shred.
--tpu-enable-udp is introduced. And when this is on, the transaction receive and transaction forward is enabled using udp.
Except for a few tests which was hard-coded sending transactions using udp, most tests are being done with udp based tpu disabled.
Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
* Restrict the usable port range of the validator such that adding QUIC_PORT_OFFSET never gets us an invalid port. Also validate this for incoming ContactInfos
* Require the proper port range in ContactInfo::valid_client_facing_addr
* Use asserts instead of panics, and enforce USABLE_PORT_RANGE for all the ports in ContactInfo
* Fix typo
* Make the quic client return errors on the quinn endpoint connect() call,
not just the result of awaiting the connect() call, as the connect()
call can itself fail realistically (e.g. due to expected/invalid IPs, etc)
* Update USABLE_PORT_RANGE to a better range and use port_range_validator to validate dynamic-port-range rather than a panic
* Fall back on UDP when the remote peer's tpu port is too large to have QUIC_PORT_OFFSET added to it
* Get rid of tpu port sanitization in ContactInfo
* Turn USABLE_PORT_RANGE into a Range and make connnection_cache fall back on UDP when the tpu port is out of range
* Fix build
* Dummy commit
* Reert dummy commit
* dummy commit
* revert dummy commit
* Fix typo
* Fix range validation
* Fix formatting
* Fix USABLE_PORT_RANGE
* Remove USABLE_PORT_RANGE
* Avoid creating a QuicLazyInitializedEndpoint when forcing the use of UDP
* Implement test for connection cache overflow
* Add ability to use a non-default app profile id in bigtable requests
* Only run subcommand once when getting global configs
* Remove unneded scoping on option type
* Connection pool in connection cache and handle connection errors
1. The connection not has a pool of connections per address, configurable, default 4
2. The connections per address share a lazy initialized endpoint
3. Handle connection issues better, avoid race conditions
4. Various log improvement for help debug connection issues
* client: Remove static connection cache, plumb it instead
* Add TpuClient::new_with_connection_cache to not break downstream
* Refactor get_connection and RwLock into ConnectionCache
* Fix merge conflicts from new async TpuClient
* Remove `ConnectionCache::set_use_quic`
* Move DEFAULT_TPU_USE_QUIC to client, use ConnectionCache::default()
#### Problem
Currently, the creation of ShredStorageType::RocksFifo is hard coded in validator/src/main.rs.
But this common code will also need to be used in other places like ledger-tool.
#### Summary of Changes
This PR creates a helper functionShredStorageType::rocks_fifo that takes a total shred_storage_size
and equally allocates to data-shred and coding-shred storage.
Add in some CPU utilization metrics such as: number of vCPUs, clock frequency, average load across different time intervals, and number of total threads
#### Problem
When FIFO compaction is used, the size ratio between data shred and coding
shred is set to 1:1 based on the `--rocksdb_fifo_shred_storage_size` arg.
However, BlockstoreRocksFifoOptions::default() uses a slightly optimized
5:4 ratio instead, and the default() function is only used in benchmarks.
#### Summary of Changes
This PR makes both validator argument and BlockstoreRocksFifoOptions::default()
to use 1:1 ratio between data and coding shred size.
* Spawn QUIC server to receive forwarded txs
* Update validator port range
* forward votes using UDP
* no forwarding from unstaked nodes
* forwarding stats in banking stage
* fix test builds
* fix lifetime of forward sender
#### Problem
blockstore_db.rs has a mutual dependency between blockstore_metrics.rs.
#### Summary of Changes
This PR removes the mutual dependency by moving the option-related stuff
out from blockstore_db.rs to its new home --- blockstore_options.rs.
By doing this, we address the mutual dependency and also make the code cleaner.
#### Problem
LedgerColumnOptions contain two fields, perf_read_counter and perf_write_counter,
that are not really options but internal counters.
#### Summary of Changes
This PR introduces BlockstoreRocksDbPerfSamplingStatus, a struct that holds internal
status for RocksDB perf sampling and moves perf_read_counter and perf_write_counter
out from LedgerColumnOptions.
#### Summary of Changes
This PR replaces the use of thread_rng in RocksDB perf metric samples by
AtomicU32 with Ordering::Relaxed to improve the performance of determining
whether to sample the current RocksDB's read/write perf metric.
#### Problem
Currently, the number of RocksDB perf samples is controlled by an env arg
which is later handled using a lazy_static variable. However, there is a known
performance overhead of using lazy_static as mentioned in
https://github.com/solana-labs/solana/pull/6472.
#### Summary of Changes
Instead, this PR uses a hidden validator argument, --rocksdb-perf-sample-interval,
for controlling how often RocksDB read/write performance sample is collected.
Introduced flag --tpu-do-batch2.
Introduced flag to control the batch size-- by default 100
The default batch timeout is 200ms -- configurable. If either it time out or the batch size is filled, a new batch is sent
The batch honor the retry rate on the transaction already sent before.
Introduced two threads in STS: one for receiving new transactions and doing batch send and one for retrying old transactions and doing batch.6.
Fixes #
* transaction-status: Add return data to meta
* Add return data to simulation results
* Use pretty-hex for printing return data
* Update arg name, make TransactionRecord struct
* Rename TransactionRecord -> ExecutionRecord
This PR adds `--rocksdb-ledger-compression` as a hidden argument to the validator
for specifying the compression algorithm for TransactionStatus. Available compression
algorithms include `lz4`, `snappy`, `zlib`. The default value is `none`.
Experimental results show that with lz4 compression, we can achieve ~37% size-reduction
on the TransactionStatus column family, or ~8% size-reduction of the ledger store size.
This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will
pass-down this struct to LedgerColumn to allow it to perform metric reporting.
#### Summary of Changes
This PR further enables group by operation on storage type in blockstore_rocksdb_cfs metrics.
Such group-by allows us to further compare the performance metrics between rocks-level and
rocks-fifo.
To make things extensible, this PR introduces BlockstoreAdvancedOptions and move shred_storage_type.
All fields in BlockstoreAdvancedOptions will support group-by operation in blockstore_rocksdb_cfs.
Dependency: #23580
#### Summary of Changes
This PR adds two hidden arguments to the validator that allow users to use RocksDB's FIFO compaction for storing shreds.
--shred-storage <SHRED_STORAGE>
EXPERIMENTAL: Controls how RocksDB compacts shreds. *WARNING*: You will lose your ledger data
when you switch between options. Possible values are: 'level': stores shreds using RocksDB's default (level)
compaction. 'fifo': stores shreds under RocksDB's FIFO compaction. This option is more efficient on
disk-write-bytes of the ledger store. [default: level] [possible values: level, fifo]
--shred-storage-size <SHRED_STORAGE_SIZE_BYTES>
The shred storage size in bytes. The suggested value is 50% of your ledger storage size in bytes. [default:
268435456000]
* AcctIdx: env var "SOLANA_TEST_ACCOUNTS_INDEX_MEMORY_LIMIT_MB"
* ignore env var when starting as validator
* Update runtime/src/bucket_map_holder.rs
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
Co-authored-by: Trent Nelson <trent.a.b.nelson@gmail.com>
* use correct operation name
* require enable_rpc_transaction_history flag when enabling block_subscription
Co-authored-by: Zano <segfaultdoctor@protonmail.com>
This introduces the `--clone-from-file` option for
solana-test-validator. It allows specifying any number of files
(without extension) containing account info and data, which will be
loaded at genesis. This is similar to `--bpf-program` for programs
loading.
The files will be searched for in the CWD or in `tests/fixtures`.
Example: `solana-test-validator --clone-from-file SRM_token USD_token`
Now that bootstrap with incremental snapshots enabled has a fallback
mechanism in place, this no-incremental-snapshot-fetch flag is no longer
required.
Fixes#21127
When retaining trusted peer snapshot hashes, and the peer's full
snapshot hashes match a trusted snapshot hash, but the peer doesn't have
incremental snapshot hashes, that's fine; the peer's hashes should be
retained, not discarded.
* Move test-validator to own module to reduce core dependencies
* Fix a few TestValidator paths
* Use solana_test_validator crate for solana_test_validator bin
* Move client int tests to separate crate
Co-authored-by: Tyera Eulberg <tyera@solana.com>
* add filler accounts to bloat validator and predict failure
* assert no accounts match filler
* cleanup magic numbers
* panic if can't load from snapshot with filler accounts specified
* some renames
* renames
* into_par_iter
* clean filler accts, too
Support using connection pooling and use multiple threads to do Postgres db operations. The performance is improved from 1500 RPS to 40,000 RPS measured during validator start.
Support multiple plugins at the same time.
* Add separate vote processing tpu port
* Add feature to send to tpu vote port
* Add vote rejecting sigverify mode
* use packet.meta.is_simple_vote_tx in place of deserialization
* consolidate code that identifies vote tx atcommon path for cpu and gpu
* new key for feature set
* banking forward tpu vote
* add tpu vote port to dockerfile and other review changes
* Simplify thread id compare
* fix a test; updated cluster_info ABI change
Co-authored-by: Tao Zhu <tao@solana.com>
Co-authored-by: sakridge <sakridge@gmail.com>
* add --ignore-delinquency flag to validator exit and wait-for-restart-window sub commands
* Fix a merge issue
* Add missing variable declaration
* Remove empty line to help CI checks pass
* run rustfmt
* Change argument wording for clarity and verbosity
* Change --ignore-delinquent-stake to --max-delinquent-stake
* cargo fmtgit add validator/src/main.rsgit add validator/src/main.rs
* Adjust per mvines
* Formatting
* Improve input validation
* Please automate cargo fmt somehow
Summary of Changes
Create a plugin mechanism in the accounts update path so that accounts data can be streamed out to external data stores (be it Kafka or Postgres). The plugin mechanism allows
Data stores of connection strings/credentials to be configured,
Accounts with patterns to be streamed
PostgreSQL implementation of the streaming for different destination stores to be plugged in.
The code comprises 4 major parts:
accountsdb-plugin-intf: defines the plugin interface which concrete plugin should implement.
accountsdb-plugin-manager: manages the load/unload of plugins and provide interfaces which the validator can notify of accounts update to plugins.
accountsdb-plugin-postgres: the concrete plugin implementation for PostgreSQL
The validator integrations: updated streamed right after snapshot restore and after account update from transaction processing or other real updates.
The plugin is optionally loaded on demand by new validator CLI argument -- there is no impact if the plugin is not loaded.
* windows: Make solana-test-validator work
The important changes to get this going on Windows:
* ledger lock needs to be done on a file instead of the directory
* IPC service needs to use the Windows pipe naming scheme
* always disable the JIT
* file logging not possible yet because we can't redirect stderr,
but this will change once env_logger fixes the pipe output target!
* Integrate review feedback
* reimplement rpc pubsub with a broadcast queue
* update tests for new pubsub implementation
* fix: fix review suggestions
* chore(rpc): add additional pubsub metrics
* integrate max subscriptions check into SubscriptionTracker to reduce locking
* separate subscription control from tracker
* limit memory usage of items in pubsub broadcast queue, improve error handling
* add more pubsub metrics
* add final count metrics to pubsub
* add metric for total number of subscriptions
* fix small review suggestions
* remove by_params from SubscriptionTracker and add node_progress_watchers map instead
* add subscription tracker tests
* add metrics for number of pubsub notifications as a counter
* ignore clippy lint in TokenCounter
* fix underflow in token counter
* reduce queue capacity in pubsub tests
* fix(rpc): fix test timeouts
* fix race in account subscription test
* Add RpcSubscriptions::new_for_tests
Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
This is a bugfix.
When `load_frozen_forks()` called `snapshot_bank()`, the accounts hash
was not computed. So then when a snapshot archive was eventually
created, its hash would be all 1s, which is clearly wrong. The fix is
to update the bank's accounts hash before taking the bank snapshot.
Add `--incremental-snapshots` flag to enable incremental snapshots.
This will allow setting `--full-snapshot-interval-slots` and
`--incremental-snapshot-interval-slots`.
Also added `--maximum-incremental-snapshots-to-retain`.
Co-authored-by: Michael Vines <mvines@gmail.com>
When the validator is waiting to restart, the snapshot slot is checked.
With Incremental Snapshots, the full snapshot interval is going to be
_much_ higher, which would make this function wait for a looooong time.
So, if there's an incremental snapshot slot when checking, use that
slot, otherwise fall back to the full snapshot slot.
#### Problem
There's no way to get incremental snapshot information from RPC.
#### Summary of Changes
- Add new RPC method, `getHighestSnapshotSlot` that returns a `SnapshotSlotInfo`, which contains both the highest full snapshot slot, and the highest incremental snapshot slot _based on_ the full snapshot.
- Deprecate old RPC method, `getSnapshotSlot`
- Update API docs
Fixes#19579
This is the 2nd installment for the AccountsDb replication.
Summary of Changes
The basic google protocol buffer protocol for replicating updated slots and accounts. tonic/tokio is used for transporting the messages.
The basic framework of the client and server for replicating slots and accounts -- the persisting of accounts in the replica-side will be done at the next PR -- right now -- the accounts are streamed to the replica-node and dumped. Replication for information about Bank is also not done in this PR -- to be addressed in the next PR to limit the change size.
Functionality used by both the client and server side are encapsulated in the replica-lib crate.
There is no impact to the existing validator by default.
Tests:
Observe the confirmed slots replicated to the replica-node.
Observe the accounts for the confirmed slot are received at the replica-node side.
#### Problem
Snapshot names are overloaded, and there are multiple terms that mean the same thing. This is confusing. Here's a list of ones in the codebase that I've found:
```
- snapshot_dir
- snapshots_dir
- snapshot_path
- snapshot_output_dir
- snapshot_package_output_path
- snapshot_archives_dir
```
#### Summary of Changes
For all the ones that are about the directory where snapshot archives are stored, ensure they are `snapshot_archives_dir`. For the ones about the (bank) snapshots directory, set to `bank_snapshots_dir`.
Co-authored-by: Michael Vines <mvines@gmail.com>