* - cache calculated transaction cost to allow sharing;
- atomic cost-tracking op;
- only lock accounts for transactions eligible for the current block;
- moved qos service and stats reporting to its own module;
- add cost_weight, defaulting to neutral (1); votes have zero weight;
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
* Update core/src/qos_service.rs
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
* Update core/src/qos_service.rs
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
Add additional checks to should_retransmit_and_persist()
- Check invalid shred index
- Update cases that check if node was leader
- Some comments and variable rename for clarity
* Move test-validator to own module to reduce core dependencies
* Fix a few TestValidator paths
* Use solana_test_validator crate for solana_test_validator bin
* Move client int tests to separate crate
Co-authored-by: Tyera Eulberg <tyera@solana.com>
- decouple cost_model from cost_tracker, allowing one cost_model
instance to be shared within a validator;
- update cost_model api to calculate_cost(&self...)->transaction_cost (see the sketch below)
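A minimal sketch of that decoupled shape, with illustrative stand-in types and cost numbers rather than the actual cost_model.rs definitions: `calculate_cost` takes `&self` and does no internal mutation, so a single instance can be shared across threads while per-block accumulation stays in the cost tracker.
```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Illustrative stand-ins for the real types.
#[derive(Debug, Default)]
struct TransactionCost {
    signature_cost: u64,
    account_access_cost: u64,
    execution_cost: u64,
}

struct CostModel {
    instruction_costs: HashMap<String, u64>, // program id placeholder -> cost
}

impl CostModel {
    // `&self` only: no internal mutation, so one instance can be shared.
    fn calculate_cost(&self, num_signatures: u64, program_ids: &[String]) -> TransactionCost {
        TransactionCost {
            signature_cost: num_signatures * 100,
            account_access_cost: 0,
            execution_cost: program_ids
                .iter()
                .map(|id| self.instruction_costs.get(id).copied().unwrap_or(200))
                .sum(),
        }
    }
}

fn main() {
    let cost_model = Arc::new(CostModel { instruction_costs: HashMap::new() });
    // The same instance is shared by several threads, mirroring banking threads.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let cost_model = Arc::clone(&cost_model);
            thread::spawn(move || cost_model.calculate_cost(1, &["some_program".to_string()]))
        })
        .collect();
    for handle in handles {
        println!("{:?}", handle.join().unwrap());
    }
}
```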
Turbine randomly shuffles cluster nodes on a broadcast tree for each
shred. This requires knowing the stakes and nodes' contact-infos (from
gossip).
However, gossip is subject to partitioning and propagation delays.
Additionally, unstaked nodes may join and leave the cluster at any
moment, changing the cluster view from one node to another.
This commit:
* Always arranges the unstaked nodes at the bottom of the turbine
broadcast tree.
* Always includes staked nodes, regardless of whether their contact-info
is available in gossip.
* Uses the unbiased WeightedShuffle construct for shuffling nodes (see the
sketch below).
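A rough sketch of that arrangement, assuming the `rand` crate and simplified node types (this is not the validator's WeightedShuffle implementation): staked nodes get an unbiased weighted shuffle via Efraimidis-Spirakis keys, and unstaked nodes are shuffled uniformly and always appended at the bottom.
```rust
use rand::seq::SliceRandom;
use rand::Rng;

#[derive(Debug)]
struct Node {
    id: u64,    // placeholder for a pubkey
    stake: u64, // 0 for unstaked nodes
}

fn arrange_broadcast_tree(nodes: Vec<Node>) -> Vec<Node> {
    let mut rng = rand::thread_rng();
    let (staked, mut unstaked): (Vec<_>, Vec<_>) =
        nodes.into_iter().partition(|node| node.stake > 0);
    // Unbiased weighted shuffle of staked nodes: larger stake gives a
    // statistically earlier position in the broadcast tree.
    let mut keyed: Vec<(f64, Node)> = staked
        .into_iter()
        .map(|node| (rng.gen::<f64>().powf(1.0 / node.stake as f64), node))
        .collect();
    keyed.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    // Unstaked nodes are shuffled uniformly and always placed at the bottom.
    unstaked.shuffle(&mut rng);
    keyed.into_iter().map(|(_, node)| node).chain(unstaked).collect()
}

fn main() {
    let nodes = vec![
        Node { id: 1, stake: 1_000 },
        Node { id: 2, stake: 0 },
        Node { id: 3, stake: 5_000 },
    ];
    println!("{:?}", arrange_broadcast_tree(nodes));
}
```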
* add filler accounts to bloat validator and predict failure
* assert no accounts match filler
* cleanup magic numbers
* panic if can't load from snapshot with filler accounts specified
* some renames
* renames
* into_par_iter
* clean filler accts, too
Support connection pooling and use multiple threads to do Postgres db operations. Measured during validator start, performance improved from 1,500 RPS to 40,000 RPS.
Support multiple plugins at the same time.
Now that CRDS supports incremental snapshot hashes,
SnapshotPackagerService needs to push 'em!
This commit does two main things:
1. SnapshotPackagerService now knows about incremental snapshot hashes,
and will push SnapshotPackage::IncrementalSnapshot hashes to CRDS.
2. At startup, when loading from a full + incremental snapshot, the
hashes need to be passed all the way to SnapshotPackagerService so it
can push these starting hashes to CRDS. Those values have been piped
through.
Fixes #20441 and #20423
* Add separate vote processing tpu port
* Add feature to send to tpu vote port
* Add vote rejecting sigverify mode
* use packet.meta.is_simple_vote_tx in place of deserialization
* consolidate code that identifies vote tx at a common path for cpu and gpu
* new key for feature set
* banking forward tpu vote
* add tpu vote port to dockerfile and other review changes
* Simplify thread id compare
* fix a test; update cluster_info ABI digest for the change
Co-authored-by: Tao Zhu <tao@solana.com>
Co-authored-by: sakridge <sakridge@gmail.com>
Summary of Changes
Create a plugin mechanism in the accounts update path so that accounts data can be streamed out to external data stores (be it Kafka or Postgres). The plugin mechanism allows:
Data stores' connection strings/credentials to be configured,
Accounts matching patterns to be streamed,
Implementations of the streaming for different destination stores (such as PostgreSQL) to be plugged in.
The code comprises 4 major parts:
accountsdb-plugin-intf: defines the plugin interface which concrete plugins should implement.
accountsdb-plugin-manager: manages loading/unloading of plugins and provides interfaces through which the validator can notify plugins of account updates.
accountsdb-plugin-postgres: the concrete plugin implementation for PostgreSQL.
The validator integration: accounts are streamed right after snapshot restore and after account updates from transaction processing or other real updates.
The plugin is optionally loaded on demand via a new validator CLI argument -- there is no impact if the plugin is not loaded (see the sketch below).
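Purely as an illustration of the plugin shape described above (the real accountsdb-plugin-intf trait, type names, and signatures differ), a hypothetical account-update plugin could look like this:
```rust
use std::error::Error;

// Hypothetical stand-in for the account data handed to a plugin.
struct ReplicaAccountInfo<'a> {
    pubkey: &'a [u8],
    lamports: u64,
    data: &'a [u8],
}

// Hypothetical plugin interface: loaded from a config file, then notified of
// account updates both during snapshot restore (is_startup) and afterwards.
trait AccountsDbPlugin: Send + Sync {
    fn name(&self) -> &'static str;
    fn on_load(&mut self, config_file: &str) -> Result<(), Box<dyn Error>>;
    fn update_account(
        &mut self,
        account: &ReplicaAccountInfo,
        slot: u64,
        is_startup: bool,
    ) -> Result<(), Box<dyn Error>>;
    fn on_unload(&mut self) {}
}

// A trivial plugin that just logs updates; a real one would write to Kafka,
// PostgreSQL, etc.
struct LoggingPlugin;

impl AccountsDbPlugin for LoggingPlugin {
    fn name(&self) -> &'static str {
        "logging-plugin"
    }
    fn on_load(&mut self, config_file: &str) -> Result<(), Box<dyn Error>> {
        println!("{} loaded with config {config_file}", self.name());
        Ok(())
    }
    fn update_account(
        &mut self,
        account: &ReplicaAccountInfo,
        slot: u64,
        is_startup: bool,
    ) -> Result<(), Box<dyn Error>> {
        println!(
            "slot {slot} startup={is_startup}: {}-byte pubkey, {} lamports, {} data bytes",
            account.pubkey.len(),
            account.lamports,
            account.data.len(),
        );
        Ok(())
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    let mut plugin = LoggingPlugin;
    plugin.on_load("accountsdb_plugin_config.json")?;
    let account = ReplicaAccountInfo { pubkey: &[0u8; 32], lamports: 1, data: &[1, 2, 3] };
    plugin.update_account(&account, 42, false)?;
    plugin.on_unload();
    Ok(())
}
```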
* windows: Make solana-test-validator work
The important changes to get this going on Windows:
* ledger lock needs to be done on a file instead of the directory
* IPC service needs to use the Windows pipe naming scheme
* always disable the JIT
* file logging not possible yet because we can't redirect stderr,
but this will change once env_logger fixes the pipe output target!
* Integrate review feedback
* reimplement rpc pubsub with a broadcast queue
* update tests for new pubsub implementation
* fix: fix review suggestions
* chore(rpc): add additional pubsub metrics
* integrate max subscriptions check into SubscriptionTracker to reduce locking
* separate subscription control from tracker
* limit memory usage of items in pubsub broadcast queue, improve error handling
* add more pubsub metrics
* add final count metrics to pubsub
* add metric for total number of subscriptions
* fix small review suggestions
* remove by_params from SubscriptionTracker and add node_progress_watchers map instead
* add subscription tracker tests
* add metrics for number of pubsub notifications as a counter
* ignore clippy lint in TokenCounter
* fix underflow in token counter
* reduce queue capacity in pubsub tests
* fix(rpc): fix test timeouts
* fix race in account subscription test
* Add RpcSubscriptions::new_for_tests
Co-authored-by: Pavel Strakhov <p.strakhov@iconic.vc>
Co-authored-by: Nikita Podoliako <n.podoliako@zubr.io>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
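A minimal sketch of the broadcast-queue pattern behind the pubsub rework above, assuming the tokio runtime; the names and types here are illustrative, not the actual solana-rpc pubsub types. A bounded `tokio::sync::broadcast` channel fans notifications out to subscribers, so a slow subscriber lags and drops messages instead of growing memory without bound.
```rust
use std::time::Duration;
use tokio::sync::broadcast;

#[derive(Clone, Debug)]
struct Notification {
    subscription_id: u64,
    payload: String,
}

#[tokio::main]
async fn main() {
    // Bounded capacity limits memory used by the queue; lagging receivers get
    // a RecvError::Lagged instead of unbounded buffering.
    let (sender, _receiver) = broadcast::channel::<Notification>(16);
    let mut subscriber = sender.subscribe();

    let handle = tokio::spawn(async move {
        while let Ok(notification) = subscriber.recv().await {
            println!("subscriber got {:?}", notification);
        }
    });

    sender
        .send(Notification { subscription_id: 1, payload: "slot update".into() })
        .expect("at least one receiver is subscribed");

    // Give the subscriber a moment to drain, then drop the sender to end it.
    tokio::time::sleep(Duration::from_millis(10)).await;
    drop(sender);
    let _ = handle.await;
}
```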
* Add slot, count and accumulated-units to per-program-timings for determining transaction cost elements
* correct the stats naming; fix the dirty-bit resetting
Add `--incremental-snapshots` flag to enable incremental snapshots.
This will allow setting `--full-snapshot-interval-slots` and
`--incremental-snapshot-interval-slots`.
Also added `--maximum-incremental-snapshots-to-retain`.
Co-authored-by: Michael Vines <mvines@gmail.com>
This is the 2nd installment for the AccountsDb replication.
Summary of Changes
The basic Google protocol buffer protocol for replicating updated slots and accounts. tonic/tokio is used for transporting the messages.
The basic framework of the client and server for replicating slots and accounts. Persisting accounts on the replica side will be done in the next PR; right now, the accounts are streamed to the replica-node and dumped. Replication of information about Bank is also not done in this PR, to limit the change size; it will be addressed in the next PR.
Functionality used by both the client and server sides is encapsulated in the replica-lib crate.
There is no impact to the existing validator by default.
Tests:
Observe the confirmed slots replicated to the replica-node.
Observe the accounts for the confirmed slot are received at the replica-node side.
AccountsBackgroundService now knows about incremental snapshots. It is
now also in charge of deciding if an AccountsPackage is destined to be a
SnapshotPackage or not (or just used by AccountsHashVerifier).
!!! New behavior changes !!!
Taking snapshots (both bank and archive) **MUST** succeed.
This is required because of how the last full snapshot slot is
calculated, which is used by AccountsBackgroundService when calling
`clean_accounts()`.
File system calls are now unwrapped, so failures will result in a crash. As Trent told me:
>Well I think if a snapshot fails due to some IO error, it's very likely that the operator is going to have to intervene before it works. We should exit error in this case, otherwise the validator might happily spin for several more hours, never successfully writing a complete snapshot, before something else brings it down. This would leave the validator's last local snapshot many more slots behind than it would be had we exited outright and potentially force the operator to abandon ledger continuity in favor of a quick catchup
Other errors will set the `exit` flag to `true`, and the node will gracefully shutdown.
Fixes #19167 and #19168
#### Problem
Snapshot names are overloaded, and there are multiple terms that mean the same thing. This is confusing. Here's a list of ones in the codebase that I've found:
```
- snapshot_dir
- snapshots_dir
- snapshot_path
- snapshot_output_dir
- snapshot_package_output_path
- snapshot_archives_dir
```
#### Summary of Changes
For all the ones that are about the directory where snapshot archives are stored, ensure they are `snapshot_archives_dir`. For the ones about the (bank) snapshots directory, set to `bank_snapshots_dir`.
Co-authored-by: Michael Vines <mvines@gmail.com>
Shreds recovered from erasure codes have not been received from turbine
and have not been retransmitted to other nodes downstream. This results
in more repairs across the cluster, which is slower.
This commit channels through recovered shreds to retransmit stage in
order to further broadcast the shreds to downstream nodes in the tree.
Working towards sending shreds (instead of packets) to retransmit stage
so that shreds recovered from erasure codes are as well retransmitted.
Following commit will add these metrics back to window-service, earlier
in the pipeline.
Cross-cluster gossip contamination is causing the cluster-slots hash map to
contain a lot of bogus values and consume too much memory:
https://github.com/solana-labs/solana/issues/17789
If a node is using the same identity key across clusters, then these
erroneous values might not be filtered out by shred-versions check,
because one of the variants of the contact-info will have matching
shred-version:
https://github.com/solana-labs/solana/issues/17789#issuecomment-896304969
The cluster-slots hash-map is bounded and trimmed at the lower end by
the current root. This commit also discards slots that are epochs ahead of the
root.
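A small sketch of that trimming rule with stand-in types; the bound of two epochs past the root is illustrative, not the actual constant.
```rust
use std::collections::BTreeMap;

type Slot = u64;

// The value is a placeholder for the per-slot supporting-stake info kept by
// cluster-slots.
fn trim_cluster_slots(cluster_slots: &mut BTreeMap<Slot, u64>, root: Slot, slots_per_epoch: Slot) {
    // Slots at or below the root are already finalized; slots implausibly far
    // ahead of the root are likely cross-cluster contamination.
    let upper_bound = root.saturating_add(2 * slots_per_epoch);
    cluster_slots.retain(|slot, _| *slot > root && *slot <= upper_bound);
}

fn main() {
    let mut cluster_slots: BTreeMap<Slot, u64> =
        [(10, 1), (100, 2), (10_000_000, 3)].into_iter().collect();
    trim_cluster_slots(&mut cluster_slots, 50, 432_000);
    assert_eq!(cluster_slots.len(), 1); // only slot 100 survives
    println!("{:?}", cluster_slots);
}
```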
Renaming these types to better communicate their usages, which will
further diverge as incremental snapshot support is added.
With the new names, AccountsPackage now refers to the type between
AccountsBackgroundService and AccountsHashVerifier, and SnapshotPackage
refers to the type between AccountsHashVerifier and
SnapshotPackagerService.
* added realtime cost-checking logic to reject a block that would exceed the max limit (see the sketch after this list):
- defines max limits in block_cost_limits.rs
- right after each batch's execution, accumulate its cost and check against the
limit; return an error if the limit is exceeded
* update abi that changed due to adding an additional TransactionError
* To avoid counting stats multiple times, only accumulate execute-timing when a bank is completed
* gate it by a feature
* move cost const definitions into block_cost_limits.rs
* redefine the cost for signature and account access; removed the signer part as it is not well defined for now
* check per_program_timings of execute_timings before sending
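A hedged sketch of the batch-level check described in the first bullet; the limit constant, names, and error variant are illustrative, not the actual values in block_cost_limits.rs.
```rust
// Illustrative limit; the real constants live in block_cost_limits.rs.
const MAX_BLOCK_UNITS: u64 = 48_000_000;

#[derive(Debug, PartialEq, Eq)]
enum BlockCostError {
    WouldExceedMaxBlockCostLimit,
}

#[derive(Default)]
struct BlockCostTracker {
    accumulated: u64,
}

impl BlockCostTracker {
    // Called right after a batch executes: accumulate its cost, or reject the
    // batch if the block limit would be exceeded.
    fn add_batch_cost(&mut self, batch_cost: u64) -> Result<(), BlockCostError> {
        let new_total = self.accumulated.saturating_add(batch_cost);
        if new_total > MAX_BLOCK_UNITS {
            return Err(BlockCostError::WouldExceedMaxBlockCostLimit);
        }
        self.accumulated = new_total;
        Ok(())
    }
}

fn main() {
    let mut tracker = BlockCostTracker::default();
    assert!(tracker.add_batch_cost(40_000_000).is_ok());
    assert_eq!(
        tracker.add_batch_cost(10_000_000),
        Err(BlockCostError::WouldExceedMaxBlockCostLimit)
    );
    println!("second batch rejected: block cost limit would be exceeded");
}
```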
Add a test for snapshots that spins up AccountsBackgroundService,
AccountsHashVerifier, and SnapshotPackagerService.
Currently there is no test for snapshots that spins up the background
services fully, which means there is no existing test that I can
use when adding incremental snapshot support to these three services.
Fixes#19014
Filtering out storages for incremental snapshots will be needed by the
background services for incremental snapshot support, but there is no
Bank at that point. Since the filtering doesn't apply only to Bank, but
more to snapshots, move the functionality into snapshot_utils.
AccountsHashVerifier will need access to both the full and incremental
snapshot archive interval slots config values, which is in the
SnapshotConfig.
Also, clean up some `Option<>` params and their references.
While reviewing PR #18565, an issue was brought up about refactoring some code
around verifying the bank after rebuilding from snapshots. A new
top-level function has been added to get the latest snapshot archives,
load the bank, and then verify it. Additionally, new tests have been
written and existing tests have been updated to use this new function.
Fixes#18973
While resolving the issue, it became clear there was some additional
low-hanging fruit this change enabled. Specifically, the functions
`bank_to_xxx_snapshot_archive()` now return their respective
`SnapshotArchiveInfo`. And on the flip side,
`bank_from_snapshot_archives()` now takes `SnapshotArchiveInfo`s instead
of separate paths and archive formats. This bundling simplifies bank
rebuilding.
bank.get_leader_schedule_epoch(shred_slot)
is one epoch after epoch_schedule.get_epoch(shred_slot).
At epoch boundaries, the shred is already one epoch after the root-slot. So
we need epoch-stakes 2 epochs ahead of the root. But the root bank only
has epoch-stakes for one epoch ahead, and as a result looking up epoch
staked-nodes from the root-bank fails.
To be backward compatible with the current master code, this commit
implements a fallback on the working-bank if the epoch staked-nodes obtained
from the root-bank are none.
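A simplified sketch of that fallback, with stand-in Bank and stake types:
```rust
use std::collections::HashMap;
use std::sync::Arc;

type Epoch = u64;
type Stakes = HashMap<u64, u64>; // pubkey placeholder -> stake

struct Bank {
    epoch_staked_nodes: HashMap<Epoch, Arc<Stakes>>,
}

impl Bank {
    fn epoch_staked_nodes(&self, epoch: Epoch) -> Option<Arc<Stakes>> {
        self.epoch_staked_nodes.get(&epoch).cloned()
    }
}

// Prefer stakes from the root bank; fall back to the working bank when the
// root bank does not yet have stakes for the shred's leader-schedule epoch.
fn get_epoch_stakes(root_bank: &Bank, working_bank: &Bank, epoch: Epoch) -> Option<Arc<Stakes>> {
    root_bank
        .epoch_staked_nodes(epoch)
        .or_else(|| working_bank.epoch_staked_nodes(epoch))
}

fn main() {
    let root = Bank { epoch_staked_nodes: HashMap::new() };
    let mut working = Bank { epoch_staked_nodes: HashMap::new() };
    working
        .epoch_staked_nodes
        .insert(3, Arc::new([(42, 1_000)].into_iter().collect()));
    assert!(get_epoch_stakes(&root, &working, 3).is_some());
    println!("found stakes for epoch 3 via the working-bank fallback");
}
```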
If two threads simultaneously call into ClusterNodesCache::get for the
same epoch, and the cache entry is outdated, then both threads recompute
cluster-nodes for the epoch and redundantly overwrite each other.
This commit wraps ClusterNodesCache entries in Arc<Mutex<...>>, so that
when needed only one thread does the computations to update the entry.
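A condensed sketch of that pattern (types simplified): the outer lock only guards the map, while each entry's own mutex ensures a single thread recomputes an outdated entry and the others reuse the result.
```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};
use std::time::{Duration, Instant};

type Epoch = u64;
type ClusterNodes = Vec<String>; // placeholder for the computed broadcast tree

const CACHE_TTL: Duration = Duration::from_secs(5);

#[derive(Default)]
struct ClusterNodesCache {
    // Epoch -> (time of last update, cached cluster nodes).
    entries: RwLock<HashMap<Epoch, Arc<Mutex<Option<(Instant, Arc<ClusterNodes>)>>>>>,
}

impl ClusterNodesCache {
    fn get(&self, epoch: Epoch, compute: impl FnOnce() -> ClusterNodes) -> Arc<ClusterNodes> {
        // Clone the entry's Arc out of the map so the outer lock is released
        // before any (potentially slow) recomputation happens.
        let entry = {
            let mut entries = self.entries.write().unwrap();
            entries.entry(epoch).or_default().clone()
        };
        let mut guard = entry.lock().unwrap();
        if let Some((updated, nodes)) = guard.as_ref() {
            if updated.elapsed() < CACHE_TTL {
                return nodes.clone();
            }
        }
        // Only the thread holding this entry's mutex recomputes; others block
        // on the lock above and then take the fresh value.
        let nodes = Arc::new(compute());
        *guard = Some((Instant::now(), nodes.clone()));
        nodes
    }
}

fn main() {
    let cache = ClusterNodesCache::default();
    let nodes = cache.get(7, || vec!["node-a".into(), "node-b".into()]);
    println!("epoch 7 -> {:?}", nodes);
}
```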
* The current caching mechanism does not update cluster-nodes when the epoch
(and so the epoch staked nodes) changes:
https://github.com/solana-labs/solana/blob/19bd30262/core/src/broadcast_stage/standard_broadcast_run.rs#L332-L344
* Additionally, the cache update has a concurrency bug in which the
thread which does compare_and_swap may be blocked when it tries to
obtain the write-lock on cache, while other threads will keep running
ahead with the outdated cache (since the atomic timestamp is already
updated).
In the new ClusterNodesCache, entries are keyed by epoch, and so if
epoch changes cluster-nodes will be recalculated. The time-to-live
eviction policy is also encapsulated and rigidly enforced.
The new cluster-nodes cache will:
* ensure cluster-nodes are recalculated if the epoch (and so the epoch
staked nodes) changes.
* encapsulate time-to-live eviction policy.
Cluster nodes are cached keyed by the epoch from which stakes
are obtained, so if the epoch changes, cluster-nodes will be recomputed.
A time-to-live eviction policy is enforced to refresh entries in case
gossip contact-infos are updated.
This commit also renames `snapshot_interval_slots` to
`full_snapshot_archive_interval_slots`, updates the comments on the
fields, and makes appropriate updates where SnapshotConfig is used.
This PR solves #18815. Note that I had to make the snapshot prefix
constants inside `snapshot_utils.rs` public at the crate level in order
to make this work. I'm not sure whether introducing this
dependency is entirely good; either way, the `snapshot_utils.rs` file
needs a lot of rework, so things will move around. I believe this does
the work in the meantime. Any feedback will be greatly appreciated.
* wip - cost_update_service to log both microseconds (us) and compute units (cu) for each instruction to determine a possible ratio
* replace microsecond with compute_unit as cost unit
This commit builds on PR #18504 by adding a test to core/tests/snapshot.rs for Incremental Snapshots. The test adds banks to bank forks in a loop, takes both full snapshots and incremental snapshots at intervals, and validates they are rebuildable.
For background info about Incremental Snapshots, see #17088.
Fixes #18829 and #18972
This commit adds high-level functions for creating and loading-from
incremental snapshots, plus all low-level functions required to perform
those tasks. This commit **does not** add taking incremental snapshots
as part of a running validator, nor starting up a node with an
incremental snapshot; just laying ground work.
Additionally, `snapshot_utils` and `serde_snapshot` have been
refactored to use common code paths for the different snapshots.
Also of note, some renaming has happened:
1. Snapshots are now either `full_` or `incremental_` throughout the
codebase. If not specified, the code applies to both.
2. Bank snapshots now are called "bank snapshots"
(before they were called "slot snapshots", "bank snapshots", or
just "snapshots"). The one exception is within `Bank`, where they
are still just "snapshots", because they are already "bank
snapshots".
3. Snapshot archives now have `_archive` in the code. This
should clear up an ambiguity between bank snapshots and snapshot
archives.
This commit adds CrdsEntry trait which allows generic lookups into crds
table. For example to get ContactInfo or LowestSlot associated with a
Pubkey, the lookup code would be respectively:
crds.get::<&ContactInfo>(pubkey)
crds.get::<&LowestSlot>(pubkey)
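A self-contained sketch of how such a trait can dispatch on the requested return type, using simplified stand-in types rather than the real crds table:
```rust
use std::collections::HashMap;

type Pubkey = u64; // placeholder

#[derive(Debug)]
struct ContactInfo {}
#[derive(Debug)]
struct LowestSlot {}

#[derive(Default)]
struct Crds {
    contact_infos: HashMap<Pubkey, ContactInfo>,
    lowest_slots: HashMap<Pubkey, LowestSlot>,
}

// Each entry type knows how to look itself up in the table.
trait CrdsEntry<'a>: Sized {
    fn get(crds: &'a Crds, pubkey: Pubkey) -> Option<Self>;
}

impl<'a> CrdsEntry<'a> for &'a ContactInfo {
    fn get(crds: &'a Crds, pubkey: Pubkey) -> Option<Self> {
        crds.contact_infos.get(&pubkey)
    }
}

impl<'a> CrdsEntry<'a> for &'a LowestSlot {
    fn get(crds: &'a Crds, pubkey: Pubkey) -> Option<Self> {
        crds.lowest_slots.get(&pubkey)
    }
}

impl Crds {
    // Generic lookup: the return type selects which map is consulted.
    fn get<'a, E: CrdsEntry<'a>>(&'a self, pubkey: Pubkey) -> Option<E> {
        E::get(self, pubkey)
    }
}

fn main() {
    let crds = Crds::default();
    let contact_info = crds.get::<&ContactInfo>(1);
    let lowest_slot = crds.get::<&LowestSlot>(1);
    println!("{:?} {:?}", contact_info, lowest_slot);
}
```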
* Move transaction sanitization earlier in the pipeline
* Renamed HashedTransaction to SanitizedTransaction
* Implement deref for sanitized transaction
* bring back process_transactions test method
* Use sanitized transactions for cost model calculation
push_lowest_slot cannot sign the new crds-value unless the id (pubkey)
argument passed-in is the same pubkey as in ClusterInfo::keypair(), in
which case the id argument is redundant:
https://github.com/solana-labs/solana/blob/bb41cf346/gossip/src/cluster_info.rs#L824-L845
Additionally, the lookup is done with self.id(), but insert is done with
the id argument, which is logically a bug.
1. check that the transaction has a valid program_id before using it, to avoid a possible panic;
2. change the calculate_cost function signature to return a Result;
3. add a CostModelError enum; update the return type from Result<_, str> to Result<_, CostModelError> (see the sketch below)
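A small sketch of the new error-returning shape; the variant and cost numbers are illustrative, not the actual CostModelError definition.
```rust
#[derive(Debug, PartialEq, Eq)]
enum CostModelError {
    // The transaction references a program id that cannot be resolved.
    InvalidTransaction,
}

struct CostModel;

impl CostModel {
    // Returns Err instead of panicking when a program id is missing.
    fn calculate_cost(&self, program_ids: &[Option<u32>]) -> Result<u64, CostModelError> {
        let mut cost = 0u64;
        for program_id in program_ids {
            match program_id {
                Some(_) => cost = cost.saturating_add(200),
                None => return Err(CostModelError::InvalidTransaction),
            }
        }
        Ok(cost)
    }
}

fn main() {
    let cost_model = CostModel;
    assert_eq!(cost_model.calculate_cost(&[Some(1), Some(2)]), Ok(400));
    assert_eq!(
        cost_model.calculate_cost(&[None]),
        Err(CostModelError::InvalidTransaction)
    );
    println!("cost model returns Result instead of panicking");
}
```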
* update ledger-tool to restore the cost model from blockstore when running compute-slot-cost
* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool
* refactor and simplify a test
The repair-peers cache is reset each time the repair-service loop runs,
and so is recomputed repeatedly for the same slots:
https://github.com/solana-labs/solana/blob/d2b07dca9/core/src/repair_service.rs#L275
This commit uses an LRU cache to persist repair-peers for each slot.
In addition to LRU eviction rules, in order to avoid re-using outdated
data, each entry also has a 10-second TTL.
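A hedged sketch of a per-slot repair-peers cache with a TTL, using a plain HashMap here instead of the LRU implementation used by the validator:
```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

type Slot = u64;
type Peer = String; // placeholder for a repair peer's contact info

const REPAIR_PEERS_CACHE_TTL: Duration = Duration::from_secs(10);

struct RepairPeersCache {
    entries: HashMap<Slot, (Instant, Vec<Peer>)>,
}

impl RepairPeersCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    // Returns cached peers for the slot, recomputing only when the entry is
    // missing or older than the TTL.
    fn get_or_compute<F>(&mut self, slot: Slot, compute: F) -> &[Peer]
    where
        F: FnOnce() -> Vec<Peer>,
    {
        let now = Instant::now();
        let needs_refresh = match self.entries.get(&slot) {
            Some((created, _)) => now.duration_since(*created) > REPAIR_PEERS_CACHE_TTL,
            None => true,
        };
        if needs_refresh {
            self.entries.insert(slot, (now, compute()));
        }
        &self.entries[&slot].1
    }
}

fn main() {
    let mut cache = RepairPeersCache::new();
    let peers = cache.get_or_compute(42, || vec!["peer-a".to_string(), "peer-b".to_string()]);
    println!("{:?}", peers);
}
```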
Broadcast stage and retransmit stage should arrange nodes on the turbine
broadcast tree in exactly the same order. Additionally, any changes to this
ordering (e.g. updating how unstaked nodes are handled) require feature
gating to keep the cluster in sync.
The current implementation is scattered across several public methods and
exposes too many implementation details (e.g. usize indices into the
peers vector), which makes code changes and checking for feature
activations more difficult.
This commit encapsulates turbine peer computations into a new struct,
and only exposes two public methods, get_broadcast_peer and
get_retransmit_peers, for call-sites.
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to the compaction-excluding list alongside TransactionStatusIndex in one place: `excludes_from_compaction()`
* Write the cost table to blockstore after `replay_stage` has replayed active banks; add stats to measure persist time
* Delete programs from `ProgramCosts` in blockstore when they are removed from the in-memory cost_table
* Only try to persist to blockstore when the cost_table has changed.
* Restore cost table during validator startup
* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table
* Change mutex on cost_tracker to RwLock
* removed cloning of cost_tracker for local use, as the metrics show the clone is very expensive.
* acquire and hold locks for a block of TXs, instead of acquiring and releasing per transaction;
* remove redundant would_fit check from cost_tracker update execution path
* refactor cost checking with less frequent lock acquiring
* avoid many TransactionCost heap allocations when calculating cost, which
is in the hot path - executed per transaction.
* create the hashmap with a preset capacity to reduce runtime heap reallocation.
* code review changes: categorize stats, replace explicit drop calls, concisely initialize to default
* address potential deadlock by acquiring locks one at time
When starting a validator, the node initially joins gossip with
shred_version = 0, until it adopts the entrypoint's shred-version:
https://github.com/solana-labs/solana/blob/9b182f408/validator/src/main.rs#L417
Depending on the load on the entrypoint, adopting the entrypoint's
shred-version through gossip sometimes becomes very slow, and causes
several problems in gossip because we have to partially support
shred_version == 0, which is a source of leaking crds values from one
cluster to another. e.g. see
https://github.com/solana-labs/solana/pull/17899
and the other linked issues there.
In order to remove shred_version == 0 from gossip, this commit adds
shred-version to ip-echo-server response. Once the entrypoints are
updated, on validator start-up, if --expected_shred_version is not
specified, we will obtain the shred-version from the entrypoint using
ip-echo-server.
1. Added options for measuring space usage either by total accounts usage or by individual store shrink ratio, selected via an enum. Validator CLI options: --accounts-shrink-optimize-total-space and --accounts-shrink-ratio
2. Added code for selecting candidates based on total usage in a separate function select_candidates_by_total_usage
3. Added unit tests for the new functions
4. The default implementation is kept at a 0.8 shrink ratio with --accounts-shrink-optimize-total-space set to true
Fixes#17544
* replay stage feeds back realtime per-program execution cost to the cost model;
* the program execution cost table is initialized as an empty table, no longer populated with hardcoded numbers;
* changed the cost unit to microseconds, using values collected from mainnet;
* add ExecuteCostTable with fixed capacity for security concerns; when its limit is reached, programs with old age AND fewer occurrences are pushed out to make room for new programs (see the sketch below).
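A hedged sketch of a fixed-capacity execute-cost table as described in the last bullet; the capacity, eviction fraction, and cost averaging are illustrative.
```rust
use std::collections::HashMap;

struct ExecuteCostTable {
    capacity: usize,
    costs: HashMap<u64, u64>,              // program id placeholder -> average cost
    occurrences: HashMap<u64, (u64, u64)>, // program id -> (count, last_seen_tick)
    tick: u64,
}

impl ExecuteCostTable {
    fn new(capacity: usize) -> Self {
        Self { capacity, costs: HashMap::new(), occurrences: HashMap::new(), tick: 0 }
    }

    fn upsert(&mut self, program_id: u64, cost: u64) {
        if self.costs.len() >= self.capacity && !self.costs.contains_key(&program_id) {
            self.prune();
        }
        self.tick += 1;
        let occurrence = self.occurrences.entry(program_id).or_insert((0, self.tick));
        occurrence.0 += 1;
        occurrence.1 = self.tick;
        // Keep a simple running average of the observed execution cost.
        let average = self.costs.entry(program_id).or_insert(cost);
        *average = (*average + cost) / 2;
    }

    // Evict entries that are both least frequently and least recently seen.
    fn prune(&mut self) {
        let mut victims: Vec<(u64, (u64, u64))> =
            self.occurrences.iter().map(|(k, v)| (*k, *v)).collect();
        victims.sort_by_key(|&(_, (count, last_seen))| (count, last_seen));
        let evict = self.capacity / 10 + 1;
        for (program_id, _) in victims.into_iter().take(evict) {
            self.costs.remove(&program_id);
            self.occurrences.remove(&program_id);
        }
    }
}

fn main() {
    let mut table = ExecuteCostTable::new(2);
    table.upsert(1, 100);
    table.upsert(2, 300);
    table.upsert(3, 50); // table is full: the coldest entry is pruned first
    println!("{} programs tracked", table.costs.len());
}
```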
* Create solana-poh crate
* Move BigTableUploadService to solana-ledger
* Add solana-rpc to workspace
* Move dependencies to solana-rpc
* Move remaining rpc modules to solana-rpc
* Single use statement solana-poh
* Single use statement solana-rpc
* Accounts dumping logic
* Add test for interaction between cache flush and remove_unrooted_slot()
* Update comments
* Rename
* renaming
* Add more comments
* Renaming
* Fixup test and bad check
* * Add the following to banking_stage:
1. CostModel as an immutable ref shared between threads, to provide estimated cost for transactions.
2. CostTracker, which is shared between threads and tracks transaction costs for each block.
* replace hard-coded program IDs with id() calls
* Add account access cost as part of TransactionCost. Account access costs are weighted differently between read and write, signed and non-signed.
* Establish instruction_execution_cost_table and add a function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs into the table.
* add a test for the cost_tracker atomic try_add operation, serving as a safety guard for future changes
* check cost against a local copy of cost_tracker; return transactions that would exceed the limit as unprocessed transactions to be buffered; only apply the cost of bank-processed transactions to the tracker;
* bencher for the new banking_stage with a max cost limit to allow the cost model to be hit consistently during bench iterations
* Update rocksdb to v0.16.0
* Promote the infrequent and important log to info!
* Force background compaction by ttl without manual compaction
* Fix test
* Support no compaction mode in test_ledger_cleanup_compaction
* Fix comment
* Make compaction_interval customizable
* Avoid major compaction with periodic filtering...
* Address lazy_static, special CFs and range check
* Clean up a bit and add comment
* Add comment
* More comments...
* Config code cleanup
* Add comment
* Use .conflicts_with()
* Nullify unneeded delete_range ops for special CFs
* Some clean ups
* Clarify the locking intention
* Ensure special CFs' consistency with PurgeType::CompactionFilter
* Fix comment
* Fix bad copy paste
* Fix various types...
* Don't use tuples
* Add a unit test for compaction_filter
* Fix typo...
* Remove flag and just use new behavior always
* Fix wrong condition negation...
* Doc. about no set_last_purged_slot in purge_slots
* Write a test and fix off-by-one bug....
* Apply suggestions from code review
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
* Follow up to github review suggestions
* Fix line-wrapping
* Fix conflict
Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
* Add BlockHeight CF to blockstore
* Rename CacheBlockTimeService to be more general
* Cache block-height using service
* Fixup previous proto mishandling
* Add block_height to block structs
* Add block-height to solana block
* Fallback to BankForks if block time or block height are not yet written to Blockstore
* Add docs
* Review comments
* Move gossip modules to solana-gossip
* Update Protocol abi digest due to move
* Move gossip benches and hook up CI
* Remove unneeded Result entries
* Single use statements
* Add blockstore-root-scan for api nodes on boot
* Ensure cluster-confirmed root and parents are set as root in blockstore in load_frozen_forks()
* Plumb rpc-scan-and-fix-roots validator flag
For all code paths (gossip push, pull, purge, etc) that remove or
override a crds value, it is necessary to record the hash of values purged
from the crds table, in order to exclude them from subsequent pull-requests;
otherwise the next pull request will likely return outdated values,
wasting bandwidth:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491
Currently this is done all over the place in multiple modules, and this
has caused bugs in the past where purged values were not recorded.
This commit encapsulates this bookkeeping in the crds module, so that any
code path which removes or overrides a crds value also records the hash
of the purged value in place.
In order to remove port-based forwarding logic in turbine, we need to
first track how often the turbine retransmit/broadcast trees mismatch
across nodes.
One consistency condition is that if the node is on the critical path
(i.e. the first node in each neighborhood), then we expect that the
packet arrives at the tvu socket as opposed to tvu-forwards.
This commit adds a metric to track how often the above condition is not met.
If stakes are unknown, then timeouts will be short, resulting in values
being purged from the crds table, and consequently higher pull-response
load when they are obtained again from gossip. In particular, this slows
down validator start, where almost all values obtained from the entrypoint
are immediately discarded.
On the receiving end, the outdated values are discarded, and they will
only waste bandwidth:
https://github.com/solana-labs/solana/blob/3f0480d06/core/src/crds_gossip_pull.rs#L385-L400
This is also exacerbating validator start, since the entrypoint is
returning old values in pull responses, and the validator immediately
discards those, resulting in a huge delay until the validator obtains the
entrypoint's contact-info and is able to adopt the shred-version and
fully start.
When a validator starts, it has an (almost) empty crds table and it only
sends one pull-request to the entrypoint. The bloom filter in the
pull-request targets 10% false rate given the number of items. So, if
the `num_items` is very wrong, it makes a very small bloom filter with a
very high false rate:
https://github.com/solana-labs/solana/blob/2ae57c172/runtime/src/bloom.rs#L70-L80
https://github.com/solana-labs/solana/blob/2ae57c172/core/src/crds_gossip_pull.rs#L48
As a result, it is very unlikely that the validator obtains entrypoint's
contact-info in response. This exacerbates how long the validator will
loop on:
> Waiting to adopt entrypoint shred version
https://github.com/solana-labs/solana/blob/ed51cde37/validator/src/main.rs#L390-L412
This commit increases the min number of bloom items when making gossip
pull requests. Effectively this will break the entrypoint crds table
into 64 shards, with one pull-request and a larger bloom filter for
each shard, and increase the chances that the response will include the
entrypoint's contact-info, which is needed for adopting the shred version
and starting the validator.
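To see why a badly underestimated `num_items` is so damaging, here is a back-of-the-envelope calculation using the standard bloom-filter formulas; the item counts are made up for illustration.
```rust
// Standard bloom filter sizing: m = -n * ln(p) / (ln 2)^2 bits for n items at
// target false-positive rate p, with k = -log2(p) hash functions.
fn num_bits(num_items: f64, false_rate: f64) -> f64 {
    let ln2 = std::f64::consts::LN_2;
    (-num_items * false_rate.ln() / (ln2 * ln2)).ceil()
}

fn false_positive_rate(num_bits: f64, num_items: f64, num_hashes: f64) -> f64 {
    (1.0 - (-num_hashes * num_items / num_bits).exp()).powf(num_hashes)
}

fn main() {
    let target_rate = 0.1;
    let num_hashes = -(target_rate.ln()) / std::f64::consts::LN_2; // ~3.3

    // Filter sized for only 100 items (num_items badly underestimated)...
    let bits = num_bits(100.0, target_rate);
    // ...while the entrypoint's crds table actually holds tens of thousands of
    // values (made-up figure for illustration).
    let actual = false_positive_rate(bits, 50_000.0, num_hashes);
    println!("{bits} bits => effective false-positive rate {actual:.3}");
}
```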
The current implementations use only the id and disregard other fields,
in particular wallclock. This can lead to bugs where an outdated
contact-info shadows or overrides a current one because they compare
equal.
* Upgrade Rust to 1.52.0
update nightly_version to the newly pushed docker image
fix clippy lint errors
1.52 comes with grcov 0.8.0; include this version in the script
* upgrade to Rust 1.52.1
* disabling Serum from downstream projects until it is upgraded to Rust 1.52.1
The crds table retains up to 32 node-instance values per pubkey. This
is because, if there are multiple running instances of the same node,
we want gossip to propagate the node-instance values associated with
both instances; therefore the corresponding label/key includes the
randomly generated token in addition to the pubkey:
https://github.com/solana-labs/solana/blob/9c42a89a4/core/src/crds_value.rs#L448
https://github.com/solana-labs/solana/pull/14037
As a result, the number of such values per pubkey is effectively
unbounded, requiring custom mitigations implemented in:
https://github.com/solana-labs/solana/pull/14467
but still taking redundant extra memory and bandwidth.
This commit instead retains only one node-instance per pubkey by
extending the crds value override logic. If a crds value is of type
node-instance, it will always override an existing one with the same key
if it has a more recent start timestamp (not wallclock). As a result,
gossip will always propagate the node-instance with the more recent
timestamp. Since the check_duplicate logic will stop the node with the older
timestamp, this change should preserve existing functionality.
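A hedged sketch of that override rule with simplified fields: the value with the more recent start timestamp wins, and the random token only breaks ties.
```rust
#[derive(Debug, Clone)]
struct NodeInstance {
    pubkey: u64,    // placeholder for the node's pubkey
    timestamp: u64, // when this instance of the node was started
    token: u64,     // random token distinguishing concurrent instances
}

fn overrides(new: &NodeInstance, old: &NodeInstance) -> bool {
    assert_eq!(new.pubkey, old.pubkey);
    if new.timestamp != old.timestamp {
        new.timestamp > old.timestamp
    } else {
        // Break ties on the random token so exactly one instance wins.
        new.token > old.token
    }
}

fn main() {
    let old = NodeInstance { pubkey: 7, timestamp: 100, token: 5 };
    let new = NodeInstance { pubkey: 7, timestamp: 200, token: 1 };
    assert!(overrides(&new, &old));
    println!("the more recently started instance wins");
}
```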
* purge_old_snapshot_archives is changed to take an extra argument 'maximum_snapshots_to_retain' to control the max number of latest snapshot archives to retain. Note the oldest snapshot is always retained as before and is not subject to this new option (see the sketch below).
* The validator and ledger-tool executables are modified with a CLI argument --maximum-snapshots-to-retain, and the option is propagated down the call chains. Their corresponding shell scripts were changed accordingly.
* SnapshotConfig is modified to have an extra field for the maximum_snapshots_to_retain
* Unit tests are developed to cover purge_old_snapshot_archives
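A hedged sketch of the retention policy described in the first bullet: keep the newest `maximum_snapshots_to_retain` archives, but never purge the oldest one.
```rust
// Returns the archive slots that survive a purge, given all existing archive
// slots. Names and the slot-only representation are illustrative.
fn retained_archive_slots(mut slots: Vec<u64>, maximum_snapshots_to_retain: usize) -> Vec<u64> {
    slots.sort_unstable();
    let oldest = match slots.first() {
        Some(&slot) => slot,
        None => return slots,
    };
    let split = slots.len().saturating_sub(maximum_snapshots_to_retain);
    let mut retained = slots.split_off(split); // the newest N archives
    if !retained.contains(&oldest) {
        retained.insert(0, oldest); // the oldest archive is always retained
    }
    retained
}

fn main() {
    let archive_slots = vec![10, 20, 30, 40, 50];
    assert_eq!(retained_archive_slots(archive_slots, 2), vec![10, 40, 50]);
    println!("oldest plus the two newest snapshot archives survive the purge");
}
```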