* Push and aggregate RestartLastVotedForkSlots.
* Fix API and lint errors.
* Reduce clutter.
* Put my own LastVotedForkSlots into the aggregate.
* Write LastVotedForkSlots aggregate progress into local file.
* Fix typo and name constants.
* Fix flaky test.
* Clarify the comments.
* Use constant for wait_for_supermajority.
* Avoid waiting after first shred when repair is in wen_restart.
* Fix delay_after_first_shred and remove loop in wen_restart.
* Read wen_restart slots inside the loop instead.
* Discard turbine shreds while in wen_restart in window insert rather than
shred_fetch_stage.
* Use the new Gossip API.
* Rename slots_to_repair_for_wen_restart and a few others.
* Rename a few more and list all states.
* Pipe exit down to aggregate loop so we can exit early.
* Fix import of RestartLastVotedForkSlots.
* Use the new method to generate test bank.
* Make linter happy.
* Use new bank constructor for tests.
* Fix a bad merge.
* Add new const for wen_restart.
* Fix the test to cover more cases.
* Add generate_repairs_for_slot_not_throtted_by_tick and
generate_repairs_for_slot_throtted_by_tick to make it readable.
* Add initialize and put the main logic into a loop.
* Change aggregate interface and other fixes.
* Add failure tests and tests for state transition.
* Add more tests and add ability to recover from written records in
last_voted_fork_slots_aggregate.
* Various name changes.
* We don't really care what type of error is returned.
* Wait on expected progress message in proto file instead of sleep.
* Code reorganization and cleanup.
* Make linter happy.
* Add WenRestartError.
* Split WenRestartErrors into separate errors per state.
* Revert "Split WenRestartErrors into separate errors per state."
This reverts commit 4c920cb8f8d492707560441912351cca779129f6.
* Use individual functions when testing for failures.
* Move initialization errors into initialize().
* Use anyhow instead of thiserror to generate backtrace for error.
* Add missing Cargo.lock.
* Add error log when last_vote is missing in the tower storage.
* Change error log info.
* Change test to match exact error.
#### Problem
In TieredAccountMeta, RENT_EXEMPT_RENT_EPOCH will be used when
its optional field rent_epoch is None. However, for legacy reasons, 0
should be used for zero-lamport accounts.
#### Summary of Changes
Return 0 for TieredAccountMeta::rent_epoch() for zero-lamport accounts.
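A minimal sketch of the intended behavior, not the actual TieredAccountMeta code. The struct AccountMetaSketch is hypothetical, RENT_EXEMPT_RENT_EPOCH is assumed to be u64::MAX, and the zero-lamport rule is assumed to apply only when the optional field is None:

```rust
type Epoch = u64;

/// Assumption: the rent-exempt sentinel is the maximum epoch value.
const RENT_EXEMPT_RENT_EPOCH: Epoch = u64::MAX;

/// Hypothetical, simplified stand-in for a tiered account meta.
struct AccountMetaSketch {
    lamports: u64,
    /// Optional field; `None` means no rent epoch was stored.
    rent_epoch: Option<Epoch>,
}

impl AccountMetaSketch {
    /// Returns the stored epoch if present. Otherwise returns 0 for
    /// zero-lamport accounts and the rent-exempt sentinel for the rest.
    fn rent_epoch(&self) -> Epoch {
        match self.rent_epoch {
            Some(epoch) => epoch,
            None if self.lamports == 0 => 0,
            None => RENT_EXEMPT_RENT_EPOCH,
        }
    }
}
```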
#### Test Plan
accounts_db::tests::test_clean_zero_lamport_and_dead_slot
* deprecate ThinClient
* switch localcluster bench test to use tpuclient
* add back in command line args for thinclient
* add thin-client deprecation README
* refactor TpuClient connection
* remove thin-client from net/
* change 2.0.0 to 1.19.0
The name was previously hard-coded to solReceiver. The use of the same
name makes it hard to figure out which thread is which when these
threads are handling many services (Gossip, Tvu, etc).
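A rough illustration of the idea, using only the standard library. The helper spawn_named_receiver and the thread name passed to it are hypothetical and are not the names used in the codebase:

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical helper: spawn a receiver thread under a caller-supplied
/// name instead of a single hard-coded one. The thread body here only
/// reports its own name so the effect is observable.
fn spawn_named_receiver(thread_name: String) -> std::io::Result<JoinHandle<Option<String>>> {
    thread::Builder::new()
        .name(thread_name)
        .spawn(|| thread::current().name().map(str::to_owned))
}
```

Each service would pass its own name, e.g. spawn_named_receiver("solRcvrGossip".to_string()), so the threads can be told apart in a debugger or in top.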
ledger-tool: verify: add --verify-slots and --verify-slots-details
This adds:
    --record-slots <FILENAME>
        Write the slot hashes to this file.
    --record-slots-config hash-only|accounts
        Store the bank (=accounts) json file, or not.
    --verify-slots <FILENAME>
        Verify slot hashes against this file.
The first case can be used to dump a list of (slot, hash) to a json file
during a replay. The second case can be used to check slot hashes against
previously recorded values.
This is useful for debugging consensus failures, e.g.:
    # on good commit/branch
    ledger-tool verify --record-slots good.json --record-slots-config=accounts
    # on bad commit or potentially consensus breaking branch
    ledger-tool verify --verify-slots good.json
On a hash mismatch an error will be logged with the expected hash vs the
computed hash.
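The verification step can be pictured roughly as below. This is a sketch under assumptions, not the ledger-tool implementation: the function find_hash_mismatches is hypothetical, hashes are plain strings, the recorded file is assumed to be already parsed into a map, and slots missing from the recording are assumed to be skipped:

```rust
use std::collections::BTreeMap;

type Slot = u64;

/// Compare computed slot hashes against previously recorded ones.
/// Returns (slot, expected, computed) for every slot present in both
/// maps whose hashes differ; slots absent from `recorded` are skipped.
fn find_hash_mismatches(
    recorded: &BTreeMap<Slot, String>,
    computed: &BTreeMap<Slot, String>,
) -> Vec<(Slot, String, String)> {
    computed
        .iter()
        .filter_map(|(slot, hash)| {
            let expected = recorded.get(slot)?;
            (expected != hash).then(|| (*slot, expected.clone(), hash.clone()))
        })
        .collect()
}
```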
* add bench for ed25519 instruction
* add bench for secp256k1 instruction
* Apply suggestions from code review
Co-authored-by: Andrew Fitzgerald <apfitzge@gmail.com>
* prepare unique txs for benching
* use iter::Cycle for endless loop
---------
Co-authored-by: Andrew Fitzgerald <apfitzge@gmail.com>
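For reference, the iter::Cycle pattern from the list above looks roughly like this. The helper take_cycled is hypothetical, and plain values stand in for the prepared transactions:

```rust
/// Hypothetical helper: draw `n` items from a fixed, pre-built batch,
/// wrapping around to the start whenever the batch is exhausted.
/// An empty batch yields an empty result.
fn take_cycled<T: Clone>(prepared: &[T], n: usize) -> Vec<T> {
    prepared.iter().cycle().cloned().take(n).collect()
}
```

A bench loop can keep calling next() on the cycled iterator without rebuilding the batch on every iteration.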
* Forbids all program replacements except for reloads and builtins.
* Adds test_assign_program_failure() and test_assign_program_success().
* Explicitly disallows LoadedProgramType::DelayVisibility to be inserted in the global cache.
There are several cases for fetching entries from the Blockstore:
- Fetching entries for block replay
- Fetching entries for CompletedDataSetService
- Fetching entries to service RPC getBlock requests
Each of these operations occurs in a different calling thread. However,
the current implementation uses a shared thread pool within the
Blockstore function. There are several problems with this:
- The thread pool is shared between all of the listed cases, despite
block replay being the most critical. The other services shouldn't
be able to interfere with block replay.
- The thread pool is overprovisioned for the average use; thread
utilization on both regular validators and RPC nodes shows that many
of the threads see very little activity. Yet the mere existence of
these threads introduces "accounting" overhead.
- rocksdb exposes an API to fetch multiple items at once, potentially
with some parallelization under the hood. Parallelizing in both
our API and the underlying rocksdb is overkill and does more
harm than good.
This change removes that thread pool completely, and instead fetches
all of the desired entries in a single call. This has been observed
to cause a minor degradation in the time spent within the Blockstore
get_slot_entries_with_shred_info() function. Namely, some buffer
copying and deserialization that previously occurred in parallel now
occurs serially.
However, the metric that tracks the amount of time spent replaying
blocks (inclusive of fetch) is unchanged. Thus, despite spending
marginally more time to fetch/copy/deserialize with only a single
thread, the gains from not thrashing everything else with the pool
keep us at parity.
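The shape of the new fetch path can be sketched as follows. This is illustrative only: KvStoreSketch is a hypothetical in-memory stand-in rather than the rocksdb API, and the "deserialization" is a toy fixed-width decode instead of the real entry decoding:

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a key-value store with a batched read.
struct KvStoreSketch {
    data: HashMap<u64, Vec<u8>>,
}

impl KvStoreSketch {
    /// Batched lookup: one call returns one result per requested key, in order.
    fn multi_get(&self, keys: &[u64]) -> Vec<Option<&[u8]>> {
        keys.iter()
            .map(|k| self.data.get(k).map(Vec::as_slice))
            .collect()
    }
}

/// Toy "deserialization": interpret exactly 8 little-endian bytes as a u64.
fn deserialize(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != 8 {
        return None;
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_le_bytes(buf))
}

/// Fetch everything in a single batched call, then decode serially on the
/// calling thread (no thread pool). Returns None if any key is missing or
/// any value fails to decode.
fn fetch_entries(store: &KvStoreSketch, keys: &[u64]) -> Option<Vec<u64>> {
    store
        .multi_get(keys)
        .into_iter()
        .map(|maybe_bytes| deserialize(maybe_bytes?))
        .collect()
}
```

The decode loop is now serial, which matches the small regression noted above, while the caller no longer competes with other services for pool threads.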