Put the slot as the first field to make grep'ing for datapoints for a
specific slot in logs easier. This does not effect the datapoints
submission / presentation in metrics database
During a cluster upgrade when only half of the cluster can ingest the new shred
variant, sending shreds of the new variant can cause nodes to diverge.
The commit adds a feature to enable chained Merkle shreds explicitly.
The log statement is currently a bit misleading, and could be
interpretted as saying this routine deleted a shred.
Adjust the log statement to state that this routine is looking for the
shred but couldn't find it. Also, elevate the log to error level as
inconsistent state across columns should not be happening.
* Update genesis processing to have a fallback collector id for tests
* DCOU-ify the collector id for tests parameter (#1902)
* wrap test_collector_id in DCOU
* rename param to collector_id_for_tests
* fix program test
* fix dcou
---------
Co-authored-by: Brooks <brooks@prumo.org>
* blockstore: write only dirty erasure meta and merkle root metas
* pr feedback: use enum to distinguish clean/dirty
* pr feedback: comments, rename
* pr feedback: use AsRef
* Finalize unified scheduler plumbing with min impl
* Fix comment
* Rename leftover type name...
* Make logging text less ambiguous
* Make PhantomData simplyer without already used S
* Make TaskHandler stateless again
* Introduce HandlerContext to simplify TaskHandler
* Add comment for coexistence of Pool::{new,new_dyn}
* Fix grammar
* Remove confusing const for upcoming changes
* Demote InstalledScheduler::context() into dcou
* Delay drop of context up to return_to_pool()-ing
* Revert "Demote InstalledScheduler::context() into dcou"
This reverts commit 049a126c905df0ba8ad975c5cb1007ae90a21050.
* Revert "Delay drop of context up to return_to_pool()-ing"
This reverts commit 60b1bd2511a714690b0b2331e49bc3d0c72e3475.
* Make context handling really type-safe
* Update comment
* Fix grammar...
* Refine type aliases for boxed traits
* Swap the tuple order for readability & semantics
* Simplify PooledScheduler::result_with_timings type
* Restore .in_sequence()
* Use where for aesthetics
* Simplify if...
* Fix typo...
* Polish ::schedule_execution() a bit
* Fix rebase conflicts..
* Make test more readable
* Fix test failures after rebase...
These entries are legacy code at this point; however, older release
branches require these entries to be present. Also, while it would be
nice to clean up these entries immediately, they only occupy a small
amount of space so having them linger a little longer isn't a big deal.
There are operations in bank_fork_utils that may fail; we explicitly
call std::process::exit() on several of these. Granted we may end up
exiting the process higher up the callstack, bubbling the errors up
allow a caller that could handle the error to do so.
As we develop new features or modifications, we occassionally need to
introduce new columns to the Blockstore. Adding a new column introduces
a compatibility break given that opening the database in Primary mode
(R/W access) requires opening all columns. Reverting to an old software
version that is unaware of the new column is obviously problematic.
In the past, we have addressed by backporting minimal "stub" PR's to
older versions. This is annoying, and only allow compatibility for the
single version or two that we backport to.
This PR adds a change to automatically detect all columns, and create
default column descriptors for columns we were unaware of. As a result,
older software versions can open a Blockstore that was modified by a
newer software version, even if that new version added columns that the
old version is unaware of.
* Add entries table to bt init
* Add entries to storage-proto
* Use new Blockstore method in bigtable_upload
* Add LedgerStorage::upload_confirmed_block_with_entries and use in bigtable_upload
* Upload entries to bigtable
This helper simply called std::mem::size_of<Self::Index>(). However, all
of the underlying functions that create keys manually copy fields into a
byte array. The fields are copied in end-to-end whereas size_of() might
include alignment bytes.
For example, a (u64, u32) only has 12 bytes of "data", but it would
have size 16 due to the 4 alignment padding bytes that would be
added to get the u32 (size 4) aligned with the u64 (size 8).
The function matches the access type and calls a different RocksDB
function depending on whether we have primary or secondary access. But,
lots of the code is the same for these two paths so de-duplicate the
repeated sections.
Currently, the RPC API that touch the Blockstore emit a datapoint for
each call. For an RPC node serving many requests, these datapoints
could get quite noisy, both in logs as well as traffic to the metrics
agent.
So, instead of submitting a datapoint for every call, accumulate the
number of calls in a struct and report that entire struct periodically.
The Blockstore currently maintains a RwLock<Slot> of the maximum root
it has seen inserted. The value is initialized during
Blockstore::open() and updated during calls to Blockstore::set_roots().
The max root is queried fairly often for several use cases, and caching
the value is cheaper than constructing an iterator to look it up every
time.
However, the access patterns of these RwLock match that of an atomic.
That is, there is no critical section of code that is run while the
lock is head. Rather, read/write locks are acquired in order to read/
update, respectively. So, change the RwLock<u64> to an AtomicU64.
The test_duplicate_with_pruned_ancestor test needs to get around a
limitation where the shreds with a parent older than the latest root are
discarded. The previous approach manually adjusted the root value in the
blockstore; this is not ideal in that it is fiddling with the inner
working of Blockstore.
So, use the is_trusted argument in Blockstore::insert_shreds(); setting
is_trusted=true bypasses the sanity checks (including the parent >=
latest root check).
JsonRpcRequestProcessor::check_blockstore_root() contained some logic
that performed duplicate sanity checking on a Blockstore fetch result.
The checking involved creating rocksdb iterators, which has non-trivial
overhead.
This PR removes the duplicate checking, and also adds comments to help
reason about how JsonRpcRequestProcessor interprets the Blockstore
result.
These services currently live in core/; however, they operate on the
ledger. Mores so, these two services operate on the blockstore only,
and not necessarily the entire ledger. So, it makes sense to move these
services out of core and into ledger. We've recently been doing similar
changes with breaking things out into individual crates in order to
reduce the scope of core.
So, this change moves the services from core/ to ledger/, and replaces
ledger with blockstore.
Instead of returning Result<Option<UnixTimestamp>>, return
Result<UnixTimestamp> and map None to an error. This makes the return
type similar to that of Blockstore::get_rooted_block().
* Remove RWLock from EntryNotifier because it causes perf degradation when entry notifications are enabled on geyser
* remove unused RWLock
* Remove RWLock