Commit Graph

1388 Commits

Author SHA1 Message Date
Pankaj Garg 403225f112 Remove public visibility of program cache from bank (#279) 2024-03-20 16:24:48 -05:00
Wen 9e394bd0e5 Wen_restart: check block full using blockstore (#250)
* Switch to blockstore.is_full() check because replay thread isn't active.

* Use make_chaining_slot_entries and add first_parent to the method.
Small style fixes.

* Switch to blockstore.is_full() check because replay thread isn't active.
2024-03-15 22:25:14 -05:00
steviez 55408093cd Make ReplayStage own the threadpool for tx replay (#190)
The threadpool used to replay multiple transactions in parallel is
currently global state via a lazy_static definition. Making this pool
owned by ReplayStage will enable subsequent work to make the pool
size configurable on the CLI.

This makes `ReplayStage` create and hold the threadpool which is passed
down to blockstore_processor::confirm_slot().

blockstore_processor::process_blockstore_from_root() now creates its'
own threadpool as well; however, this pool is only alive while for
the scope of that function and does not persist the lifetime of the
process.
2024-03-15 22:22:45 -05:00
Brooks 096a1f4e5c Removes holding storages in AccountsHashVerifier for fastboot (#120) 2024-03-15 22:22:45 -05:00
Lucas Steuernagel 2c55819e7b Gather recording booleans in a data structure (#134) 2024-03-09 13:28:08 -06:00
steviez 9decf2a320 blockstore: Remove unnecessary function and threadpool (#122)
In a previous change, we removed the threadpool used to fetch entries
in parallel in favor of combining all fetches into a single rocksdb
multi_get() call.

This change does the same thing, except for a threadpool that was used
to fetch entries when we needed them to purge the transaction status
and address signatures columns.
2024-03-09 13:28:07 -06:00
Dmitri Makarov 264f4dfdd0 [SVM] Move RuntimeConfig to program-runtime (#96)
RuntimeConfig doesn't use anything SVM specific and logically belongs
in program runtime rather than SVM.  This change moves the definition
of RuntimeConfig struct from the SVM crate to program-runtime and
adjusts `use` statements accordingly.
2024-03-09 13:27:11 -06:00
Will Hickey df57657985 Revert "[anza migration] rename crates (#10)"
This reverts commit 3f9a7a52ea.
2024-03-05 10:18:50 -06:00
Ashwin Sekar 94ed7a31e9 blockstore_purge: fix inspect -> inspect_err (#66) 2024-03-05 09:43:25 -06:00
Brooks f94752d514 Adds StartingSnapshotStorages to AccountsHashVerifier (#58) 2024-03-05 09:43:25 -06:00
Yihau Chen 14cb9cff92 [anza migration] rename crates (#10)
* rename geyser-plugin-interface

* rename cargo registry

* rename watchtower

* rename ledger tool

* rename validator

* rename install

* rename geyser plugin interface when patch
2024-03-05 09:43:25 -06:00
Ashwin Sekar cc4072bce8
blockstore: atomize slot clearing, relax parent slot meta check (#35124)
* blockstore: atomize slot clearing, relax parent slot meta check

clear_unconfirmed_slot can leave blockstore in an irrecoverable state
if it panics in the middle. write batch this function, so that any
errors can be recovered after restart.

additionally relax the constraint that the parent slot meta must exist,
as it could have been cleaned up if outdated.

* pr feedback: use PurgeType, don't pass slot_meta

* pr feedback: add unit test

* pr feedback: refactor into separate function

* pr feedback: add special columns to helper, err msg, comments

* pr feedback: reword comments and write batch error message

* pr feedback: bubble write_batch error to caller

* pr feedback: reword comments

Co-authored-by: steviez <stevecz@umich.edu>

---------

Co-authored-by: steviez <stevecz@umich.edu>
2024-03-02 23:23:55 -05:00
Brooks 245530b28e
Uses purge_all_bank_snapshots() (#35380) 2024-03-01 07:11:38 -05:00
Sean Young 9bb59aa30f
ledger-tool: verify: add --record-slots and --verify-slots (#34246)
ledger-tool: verify: add --verify-slots and --verify-slots-details

This adds:

    --record-slots <FILENAME>
	Write the slot hashes to this file.

    --record-slots-config hash-only|accounts
	Store the bank (=accounts) json file, or not.

    --verify-slots <FILENAME>
        Verify slot hashes against this file.

The first case can be used to dump a list of (slot, hash) to a json file
during a replay. The second case can be used to check slot hashes against
previously recorded values.

This is useful for debugging consensus failures, eg:

    # on good commit/branch
    ledger-tool verify --record-slots good.json --record-slots-config=accounts

    # on bad commit or potentially consensus breaking branch
    ledger-tool verify --verify-slots good.json

On a hash mismatch an error will be logged with the expected hash vs the
computed hash.
2024-03-01 08:39:30 +00:00
Ashwin Sekar e8c87e86ef
local-cluster: fix flaky optimistic_confirmation tests (#35356)
* local-cluster: fix flaky optimistic_confirmation tests

* pr feedback: latest_vote -> newest_vote, reword some comments
2024-02-29 12:05:20 -08:00
Brooks bdc5cceb18
Purges all bank snapshots after fastboot (#35350) 2024-02-29 14:31:13 -05:00
behzad nouri a7a41e7631
adds Merkle shred variant with retransmitter's signature (#35293)
Moving towards locking down Turbine propagation path, the commit
reserves a buffer within shred payload for retransmitter's signature.
2024-02-28 20:31:40 +00:00
steviez 09925a11eb
Remove the Blockstore thread pool used for fetching Entries (#34768)
There are several cases for fetching entries from the Blockstore:
- Fetching entries for block replay
- Fetching entries for CompletedDataSetService
- Fetching entries to service RPC getBlock requests

All of these operations occur in a different calling thread. However,
the currently implementation utilizes a shared thread-pool within the
Blockstore function. There are several problems with this:
- The thread pool is shared between all of the listed cases, despite
  block replay being the most critical. These other services shouldn't
  be able to interfere with block replay
- The thread pool is overprovisioned for the average use; thread
  utilization on both regular validators and RPC nodes shows that many
  of the thread see very little activity. But, these thread existing
  introduce "accounting" overhead
- rocksdb exposes an API to fetch multiple items at once, potentially
  with some parallelization under the hood. Using parallelization in
  our API and the underlying rocksdb is overkill and we're doing more
  damage than good.

This change removes that threadpool completely, and instead fetches
all of the desired entries in a single call. This has been observed
to have a minor degradation on the time spent within the Blockstore
get_slot_entries_with_shred_info() function. Namely, some buffer
copying and deserialization that previously occurred in parallel now
occur serially.

However, the metric that tracks the amount of time spent replaying
blocks (inclusive of fetch) is unchanged. Thus, despite spending
marginally more time to fetch/copy/deserialize with only a single
thread, the gains from not thrashing everything else with the pool
keep us at parity.
2024-02-26 20:27:03 -06:00
behzad nouri 0ab425b43b
splits test_shred_variant_compat into separate test-cases (#35306) 2024-02-26 17:32:47 +00:00
behzad nouri c8ee4f59ad
uses struct instead of tuple for Merkle shreds variant (#35303)
Working towards adding a new Merkle shred variant with retransmitter's
signature, the commit uses struct instead of tuple to describe Merkle shred
variant.
2024-02-26 15:58:40 +00:00
Brooks 7da8d82aa1
Adds snapshot_utils::purge_all_bank_snapshots() (#35291) 2024-02-23 11:15:10 -05:00
steviez 4905076fb6
Remove channel that sends roots to BlockstoreCleanupService (#35211)
Currently, ReplayStage sends new roots to BlockstoreCleanupService, and
BlockstoreCleanupService decides when to clean based on advancement of
the latest root. This is totally unnecessary as the latest root is
cached by the Blockstore, and this value can simply be fetched.

This change removes the channel completely, and instead just fetches
the latest root from Blockstore directly. Moreso, some logic is added
to check the latest root less frequently, based on the set purge
interval.

All in all, we went from sending > 100 slots/min across a crossbeam
channel to reading an atomic roughly 3 times/min, while also removing
the need for an additional thread that read from the channel.
2024-02-21 10:16:16 -06:00
Dmitri Makarov 0acee67891
SVM: move transaction_results from accounts-db to SVM (#35183)
SVM: Remove accounts-db deps in accounts_loader tests
2024-02-20 12:54:56 -08:00
sakridge e4064023bf
Set COPYFILE_DISABLE for mac os so it doesn't generate ._ files (#35213) 2024-02-16 21:58:06 +01:00
behzad nouri 0cfb06f745
adds rollout path for chained Merkle shreds (#35076)
The commit adds should_chain_merkle_shreds to incrementally roll out
chained Merkle shreds to clusters.
2024-02-08 23:06:00 +00:00
Dmitri Makarov b9ee3b475b
SVM: Move RentDebits from accounts-db to Solana SDK (#35135) 2024-02-07 15:10:17 -08:00
Pankaj Garg 46b9586630
SVM: Move SVM code to its own crate folder (#35119) 2024-02-06 16:06:32 -08:00
steviez fddfc8431e
Reorder fields in shred_insert_is_full datapoint (#35117)
Put the slot as the first field to make grep'ing for datapoints for a
specific slot in logs easier. This does not effect the datapoints
submission / presentation in metrics database
2024-02-06 16:38:05 -06:00
behzad nouri 8d0ca9db78
chains Merkle shreds in broadcast fake shreds (#35061)
The commit migrates
    turbine/src/broadcast_stage/broadcast_fake_shreds_run.rs
to use chained Merkle shreds variant.
2024-02-06 20:02:38 +00:00
Pankaj Garg 3cf5dd2afb
SVM: Move RuntimeConfig to svm folder (#35085) 2024-02-05 13:49:36 -08:00
Brooks daa2449ad4
Removes RwLock on AccountsDb::shrink_paths (#35027) 2024-02-01 09:35:34 -05:00
behzad nouri 79bbe4381a
adds chained_merkle_root to shredder arguments (#34952)
Working towards chaining Merkle root of erasure batches, the commit adds
chained_merkle_root to shredder arguments.
2024-01-27 15:04:31 +00:00
behzad nouri d4fdcd940a
adds feature to enable chained Merkle shreds (#34916)
During a cluster upgrade when only half of the cluster can ingest the new shred
variant, sending shreds of the new variant can cause nodes to diverge.
The commit adds a feature to enable chained Merkle shreds explicitly.
2024-01-27 15:03:16 +00:00
steviez 9122193e17
blockstore: Make is_orphan() a method of SlotMeta (#34889)
The old function's only input is a SlotMeta, so makes sense to move
it a member function of SlotMeta
2024-01-22 19:14:51 -06:00
behzad nouri 9a520fd5b4
adds chained merkle shreds variant (#34787)
With the new chained variants, each Merkle shred will also embed the Merkle
root of the previous erasure batch.
2024-01-20 16:08:16 +00:00
steviez 3bccdaff7f
blockstore: Adjust the error message for missing shreds (#34833)
The log statement is currently a bit misleading, and could be
interpretted as saying this routine deleted a shred.

Adjust the log statement to state that this routine is looking for the
shred but couldn't find it. Also, elevate the log to error level as
inconsistent state across columns should not be happening.
2024-01-18 12:17:49 -06:00
behzad nouri 586c794c8a
adds get_proof_offset for Merkle shreds (#34798)
In preparation of adding chained Merkle shreds variant, the commit
reworks api for proof-offset within the shred binary.
2024-01-17 20:53:56 +00:00
Andrew Fitzgerald 257ba2f0b1
Add benchmark for execute_batch (#34717) 2024-01-13 09:09:04 -08:00
Justin Starry 5f74fc4f16
Update genesis processing to have a fallback collector id for tests (#34135)
* Update genesis processing to have a fallback collector id for tests

* DCOU-ify the collector id for tests parameter (#1902)

* wrap test_collector_id in DCOU

* rename param to collector_id_for_tests

* fix program test

* fix dcou

---------

Co-authored-by: Brooks <brooks@prumo.org>
2024-01-10 08:34:41 +08:00
Brooks abe699b7b4
Adds newline to fastboot's CLI help (#34712) 2024-01-09 15:28:39 -05:00
steviez dce3ce3734
Adjust blockstore open logs to say blockstore instead of database (#34672)
Additionally, make the log before/after the open more similar so it is
more clear while skimming logs that they correspond to each other.
2024-01-05 21:23:39 -06:00
Ashwin Sekar 19088411ff
blockstore: populate duplicate shred proofs for merkle root conflicts (#34270)
* blockstore: populate duplicate shred proofs for merkle root conflicts

* pr feedback: check test case

* pr feedback: comment

* pr feedback: match statement, shred_id, comment

* add feature flag

* pr feedback: rename ff var and perform_merkle_check

* pr feedback: move panic to callers in get_shred_from_just_inserted_or_db

* avoid unecessary write if proof is already present
2024-01-03 12:15:52 -05:00
Nick Frostbutter fc2a8794be
[docs] updated readme and fix links (#34565)
* feat: updated readme

* fix: updated links

* fix: proposal links

* fix: more links

* fix: json-rpc links

* fix: more links

* fix: zk links

* fix: managing forks

* fix: links for deprecated methods
2024-01-03 09:06:06 -05:00
Ashwin Sekar cc584a0c19
blockstore: write only dirty erasure meta and merkle root metas (#34269)
* blockstore: write only dirty erasure meta and merkle root metas

* pr feedback: use enum to distinguish clean/dirty

* pr feedback: comments, rename

* pr feedback: use AsRef
2023-12-22 16:26:50 -05:00
GoodDaisy 03386cc7b9
Fix typos (#34459)
* Fix typos

* Fix typos

* fix typo
2023-12-21 13:06:00 -07:00
Ryo Onodera d2b5afc410
Finish unified scheduler plumbing with min impl (#34300)
* Finalize unified scheduler plumbing with min impl

* Fix comment

* Rename leftover type name...

* Make logging text less ambiguous

* Make PhantomData simplyer without already used S

* Make TaskHandler stateless again

* Introduce HandlerContext to simplify TaskHandler

* Add comment for coexistence of Pool::{new,new_dyn}

* Fix grammar

* Remove confusing const for upcoming changes

* Demote InstalledScheduler::context() into dcou

* Delay drop of context up to return_to_pool()-ing

* Revert "Demote InstalledScheduler::context() into dcou"

This reverts commit 049a126c905df0ba8ad975c5cb1007ae90a21050.

* Revert "Delay drop of context up to return_to_pool()-ing"

This reverts commit 60b1bd2511a714690b0b2331e49bc3d0c72e3475.

* Make context handling really type-safe

* Update comment

* Fix grammar...

* Refine type aliases for boxed traits

* Swap the tuple order for readability & semantics

* Simplify PooledScheduler::result_with_timings type

* Restore .in_sequence()

* Use where for aesthetics

* Simplify if...

* Fix typo...

* Polish ::schedule_execution() a bit

* Fix rebase conflicts..

* Make test more readable

* Fix test failures after rebase...
2023-12-19 09:50:41 +09:00
behzad nouri 750023530c
makes last erasure batch size >= 64 shreds (#34330) 2023-12-13 06:48:00 +00:00
steviez 70cab76495
Remove deletion of TransactionStatusIndex entries (#34023)
These entries are legacy code at this point; however, older release
branches require these entries to be present. Also, while it would be
nice to clean up these entries immediately, they only occupy a small
amount of space so having them linger a little longer isn't a big deal.
2023-12-12 22:53:51 -06:00
behzad nouri d5eee01950
adds feature gated code to drop legacy shreds (#34328) 2023-12-06 22:47:46 +00:00
Lucas Steuernagel 1877fdb273
Use BankForks on tests - Part 4 (#34271)
* Use BankForks on tests - Part 4

* Ensure the correct slot is set
2023-12-06 13:32:04 -03:00