Instead of sending accounts individually, send batches of accounts for
background hashing.
Before this change we sent accounts individually, and the
background thread did:
    loop {
        let account = receiver.recv();
        account.hash();
        // go back to sleep in recv()
    }
Because most accounts are small and hashing them is very fast, the
background thread slept a lot, and the sender had to issue many
syscalls to wake it up.
Batching reduces the number of syscalls.
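
As a rough illustration of the batching idea, the sketch below sends whole batches over a std::sync::mpsc channel, so the receiver gets one message per batch instead of one per account. The function names, the Vec<u8> account representation, and the use of DefaultHasher are assumptions made for this example, not the crate's actual types.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::mpsc;
use std::thread;

/// Stand-in for the real account hash (assumption: any cheap hash will do here).
fn hash_account(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Sends `accounts` to a background thread in batches of `batch_size`
/// (which must be non-zero) and returns how many messages the thread
/// received, together with the hashes in the original order.
fn hash_in_batches(accounts: Vec<Vec<u8>>, batch_size: usize) -> (usize, Vec<u64>) {
    let (sender, receiver) = mpsc::channel::<Vec<Vec<u8>>>();

    let worker = thread::spawn(move || {
        let mut messages = 0;
        let mut hashes = Vec::new();
        // One blocking receive per batch instead of one per account.
        for batch in receiver {
            messages += 1;
            hashes.extend(batch.iter().map(|account| hash_account(account)));
        }
        (messages, hashes)
    });

    for batch in accounts.chunks(batch_size) {
        sender.send(batch.to_vec()).expect("worker hung up");
    }
    // Closing the channel lets the worker's loop finish.
    drop(sender);

    worker.join().expect("worker panicked")
}
```

With ten accounts and a batch size of four, the worker in this sketch is handed three messages rather than ten.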
* add hot storage test for test_partial_clean
* fix test_partial_clean for hot storage
* reviews
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
* add full_clean_refcount tests for both account storage formats
* keep code comments
* fix test to match with code comments
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
* add macro for accounts-db panic test
convert double remove test for both account file providers
* refactor to share accounts_db_test and accounts_db_panic_test macro
* rebase
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
* add define_accounts_db_test macro and refactor tests of accounts db for different AccountFileProvider
* use test macro for test_alive_bytes
* implement review feedback
* typo
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
std::num::Saturating allows us to create integers that will override
the standard arithmetic operators to use saturating math. This removes
the need for a custom macro as well as reduces mental load as someone
only needs to remember that they want saturating math once.
This PR introduces std::num::Saturating integers to replace all
use of saturating_add_assign!() in the accounts-db crate
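
A minimal sketch of the difference, assuming a Rust toolchain where std::num::Saturating is stable (1.74 or later). The function name and the u64 counter are illustrative and not taken from the crate.

```rust
use std::num::Saturating;

/// Sums `amounts` with saturating arithmetic. The choice of saturating
/// math is made once, in the type of `total`; after that the ordinary
/// `+=` operator is used.
fn saturating_total(amounts: &[u64]) -> u64 {
    let mut total = Saturating(0u64);
    for &amount in amounts {
        // Clamps at u64::MAX instead of overflowing.
        total += Saturating(amount);
    }
    total.0
}
```

Compared with calling a macro at every addition site, the wrapper type makes it harder to accidentally fall back to plain `+=` on a raw integer.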
* add scan_index for improving index generation
* pr feedback
* rework some stuff from pr feedback
* get rid of redundant if
* deal with rent correctly
* disable read cache while populating stakes cache on load
* use struct with drop as api
* use LoadHint
* remove disable_read_cache_updates_count
* add comment
* fmt
#### Problem
TieredStorage::drop() currently panics when it fails to delete the
underlying file, in order to raise awareness of possible storage
resource leakage. This includes io::ErrorKind::NotFound. But sometimes
a TieredStorage (or AccountsFile in general) instance is created
and then dropped without any file being created. This causes
false alarms, including in unit tests.
#### Summary of Changes
This PR excludes NotFound in reporting storage leakage on
TieredStorage::drop().
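
The check can be sketched roughly as follows. StorageHandle and is_possible_leak are hypothetical names used only for this illustration; the real TieredStorage::drop() may be structured differently.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Returns true when a failed delete should be reported as a possible
/// storage leak. A missing file is not a leak: nothing was ever created.
fn is_possible_leak(result: &io::Result<()>) -> bool {
    match result {
        Ok(()) => false,
        Err(err) => err.kind() != io::ErrorKind::NotFound,
    }
}

/// Hypothetical stand-in for a storage object that owns a file path.
struct StorageHandle {
    path: PathBuf,
}

impl Drop for StorageHandle {
    fn drop(&mut self) {
        let result = fs::remove_file(&self.path);
        if is_possible_leak(&result) {
            panic!("failed to remove {}: {:?}", self.path.display(), result);
        }
    }
}
```

Dropping a StorageHandle whose file was never created now completes quietly, while any other delete failure still panics.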
#### Problem
When #72 introduced AccountsFile::TieredStorage, it also added a
file-type check when opening an accounts-file to determine whether
it is a tiered-storage or an append-vec. But until tiered-storage is
enabled, this check on open is unnecessary.
#### Summary of Changes
Remove the accounts-file type check code and simply assume everything
is append-vec on AccountsFile::new_from_file().
#### Problem
AccountsFile currently doesn't have an implementation for TieredStorage.
To enable AccountsDB tests for the TieredStorage, we need AccountsFile
to support TieredStorage.
#### Summary of Changes
This PR implements AccountsFile::TieredStorage, a thin wrapper between
AccountsFile and TieredStorage.
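
A thin wrapper of this kind is commonly written as an enum variant that forwards each call to the wrapped type. The sketch below assumes that shape; the struct fields and the len()/is_empty() methods are placeholders, not the real definitions.

```rust
/// Placeholder storage types; the real ones hold files, not a bare length.
struct AppendVec {
    len: usize,
}

struct TieredStorage {
    len: usize,
}

impl AppendVec {
    fn len(&self) -> usize {
        self.len
    }
}

impl TieredStorage {
    fn len(&self) -> usize {
        self.len
    }
}

/// One variant per storage format; each method forwards to the variant.
enum AccountsFile {
    AppendVec(AppendVec),
    TieredStorage(TieredStorage),
}

impl AccountsFile {
    fn len(&self) -> usize {
        match self {
            Self::AppendVec(av) => av.len(),
            Self::TieredStorage(ts) => ts.len(),
        }
    }

    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}
```

Callers, including AccountsDB tests, can then work with AccountsFile without knowing which format is underneath.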
#### Problem
The TieredStorage has not yet implemented the AccountsFile::capacity()
API.
#### Summary of Changes
Implement the capacity() API for TieredStorage and limit the file size
to 16GB, the same as the append-vec file.
#### Problem
TieredStorage::file_size() essentially supports AccountsFile::len(),
but its API is inconsistent with AccountsFile's.
#### Summary of Changes
Rename TieredStorage::file_size() to TieredStorage::len() so that it
shares the same API as AccountsFile.
#### Test Plan
Build
Existing unit-tests.
#### Problem
The current implementation of TieredStorage::file_size() requires
a syscall to obtain the file size.
#### Summary of Changes
Add a len() API to TieredStorageReader, and have HotStorageReader
implement the API using Mmap::len().
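
To illustrate the difference, the sketch below contrasts asking the OS for the file size with answering from a buffer the reader already holds. InMemoryReader uses a Vec<u8> purely as a stand-in for the memory map so that the example needs only the standard library; it is not the HotStorageReader implementation.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Asks the OS for the size on every call (a metadata syscall).
fn file_size_via_syscall(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.len())
}

/// Hypothetical reader that keeps the file contents in memory,
/// standing in for a reader backed by a memory map.
struct InMemoryReader {
    bytes: Vec<u8>,
}

impl InMemoryReader {
    fn open(path: &Path) -> io::Result<Self> {
        Ok(Self {
            bytes: fs::read(path)?,
        })
    }

    /// Answers from the buffer already held; no further syscall.
    fn len(&self) -> usize {
        self.bytes.len()
    }
}
```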
#### Test Plan
Update existing unit-test to also verify HotStorageReader::len().
#### Problem
The current AppendVecId actually refers to an accounts file id.
#### Summary of Changes
Rename AppendVecId to AccountsFileId.
#### Test Plan
Build
#### Problem
The TieredStorageFooter has the min_account_address and
max_account_address fields to describe the account address
range in its file. But the current implementation hasn't updated
the fields yet.
#### Summary of Changes
This PR enables the TieredStorage to persist address range
information into its footer via min_account_address and
max_account_address.
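
One way to compute such a range is to track the smallest and largest address while the accounts are being written. The sketch below assumes 32-byte addresses compared lexicographically and returns None for an empty input; the AddressRange type and how the real footer treats an empty file are assumptions.

```rust
/// Assumption: addresses are 32-byte arrays ordered lexicographically.
type Address = [u8; 32];

#[derive(Debug, PartialEq)]
struct AddressRange {
    min: Address,
    max: Address,
}

/// Returns the smallest and largest address seen, or None if there are none.
fn address_range<'a>(addresses: impl IntoIterator<Item = &'a Address>) -> Option<AddressRange> {
    let mut iter = addresses.into_iter();
    let first = *iter.next()?;
    let mut range = AddressRange {
        min: first,
        max: first,
    };
    for address in iter {
        if *address < range.min {
            range.min = *address;
        }
        if *address > range.max {
            range.max = *address;
        }
    }
    Some(range)
}
```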
#### Test Plan
Updated tiered-storage test to verify persisted account address range.
#### Problem
The TieredStorage::new_readonly() function currently has the following
problems:
* It opens the file without checking the magic number before checking and loading the footer.
* It opens the file twice: first to load the footer, then again in the reader.
#### Summary of Changes
This PR refactors TieredStorage::new_readonly() so that it first performs all
checks inside the constructor of TieredReadableFile. The TieredReadableFile
instance is then passed to the proper reader (currently HotStorageReader)
once all checks have passed.
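
The "check in the constructor, then hand the same handle to the reader" pattern might look roughly like this. The magic number value, its position in the last 8 bytes, the little-endian encoding, and the CheckedFile and OpenError names are all assumptions for the sake of the example; only the MagicNumberMismatch name comes from the test plan.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical value; the real constant is not given in this description.
const HYPOTHETICAL_MAGIC_NUMBER: u64 = 0x1122_3344_5566_7788;

#[derive(Debug)]
enum OpenError {
    Io(io::Error),
    MagicNumberMismatch { expected: u64, found: u64 },
}

impl From<io::Error> for OpenError {
    fn from(err: io::Error) -> Self {
        OpenError::Io(err)
    }
}

/// A handle that can only be constructed once the checks have passed.
struct CheckedFile<R> {
    inner: R,
}

impl<R: Read + Seek> CheckedFile<R> {
    /// Assumes the magic number is stored little-endian in the last 8 bytes.
    fn new(mut inner: R) -> Result<Self, OpenError> {
        inner.seek(SeekFrom::End(-8))?;
        let mut buf = [0u8; 8];
        inner.read_exact(&mut buf)?;
        let found = u64::from_le_bytes(buf);
        if found != HYPOTHETICAL_MAGIC_NUMBER {
            return Err(OpenError::MagicNumberMismatch {
                expected: HYPOTHETICAL_MAGIC_NUMBER,
                found,
            });
        }
        // Rewind so the reader starts from the beginning of the same handle.
        inner.seek(SeekFrom::Start(0))?;
        Ok(Self { inner })
    }

    /// Hands the already-open, already-checked handle to a reader.
    fn into_inner(self) -> R {
        self.inner
    }
}
```

Because the reader receives the checked handle, the file is opened only once, and a reader can never be built on a file that failed validation.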
#### Test Plan
* Added a new test to check MagicNumberMismatch.
* Existing tiered-storage tests
* accounts-db: unpack_archive: avoid extra iteration on each path
We used to do an iterator.clone().any(...) followed by
iterator.collect(). Merge the two to avoid an extra iteration and
re-parsing of the path.
* accounts-db: unpack_archive: unpack accounts straight into their final destination
We used to unpack accounts into account_path/accounts/<account> then
rename to account_path/<account>. We now unpack them into their final
destination directly and avoid the rename syscall.
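
For the first change above, one idiomatic way to merge a validity check with a collect is to collect into an Option, which stops at the first rejected element. The sketch below is an assumption about the general shape and is not the actual unpack_archive code; the function name and the rule "only normal components are allowed" are made up for the example.

```rust
use std::path::{Component, Path};

/// Validates and collects the path components in a single pass.
/// Returns None as soon as a component is not a plain UTF-8 name
/// (for example `..` or a root), instead of scanning the path twice.
fn normal_components(path: &Path) -> Option<Vec<&str>> {
    path.components()
        .map(|component| match component {
            Component::Normal(part) => part.to_str(),
            _ => None,
        })
        .collect()
}
```

Collecting an iterator of Option values into Option<Vec<_>> short-circuits, so the path is parsed only once and nothing needs to be cloned.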
#### Problem
TieredWritableFile currently uses File instead of BufWriter.
This results in more syscalls when writing to the file.
#### Summary of Changes
This PR makes TieredWritableFile use BufWriter so that small
writes are coalesced, reducing the number of syscalls.
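
The effect of buffering can be sketched with a sink that counts how often write() reaches it. CountingSink is a hypothetical stand-in for a File, where each call would be a write syscall; the record size and buffer capacity are arbitrary choices for the example.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that counts calls to `write`, standing in for the
/// write syscalls a `File` would issue.
struct CountingSink {
    writes: usize,
    bytes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.bytes += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `records` 8-byte values directly; returns (write calls, bytes).
fn write_unbuffered(records: usize) -> io::Result<(usize, usize)> {
    let mut sink = CountingSink { writes: 0, bytes: 0 };
    for i in 0..records {
        sink.write_all(&(i as u64).to_le_bytes())?;
    }
    Ok((sink.writes, sink.bytes))
}

/// Same data through a BufWriter; returns (write calls, bytes).
fn write_buffered(records: usize) -> io::Result<(usize, usize)> {
    let sink = CountingSink { writes: 0, bytes: 0 };
    let mut writer = BufWriter::with_capacity(4096, sink);
    for i in 0..records {
        writer.write_all(&(i as u64).to_le_bytes())?;
    }
    // Push any buffered bytes to the sink before reading the counters.
    writer.flush()?;
    let sink = writer.get_ref();
    Ok((sink.writes, sink.bytes))
}
```

Both functions deliver the same bytes, but the buffered version reaches the sink far less often when the records are small.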
#### Test Plan
Existing tiered-storage test.
Will run experiments to verify its performance improvement.
#### Dependency
https://github.com/anza-xyz/agave/pull/260
#### Problem
The TieredStorageFile struct currently offers new_readonly() and
new_writable() so that both read and write workloads can share the same
struct. However, we need the writer to use BufWriter to improve
performance and to enable Hasher on writes, so TieredStorageFile has to
be split into separate read-only and writable types.
#### Summary of Changes
Refactor TieredStorageFile to TieredReadonlyFile and TieredWritableFile.
#### Test Plan
Existing tiered-storage tests.
#### Problem
tiered_storage/writer.rs was added when we planned to support multiple
tiers in the tiered-storage (i.e., at least hot and cold). However, since
we changed our plan to handle cold accounts as state-compressed accounts,
we don't need a general-purpose tiered-storage writer at this moment.
#### Summary of Changes
Remove tiered_storage/writer.rs as we currently don't have plans to develop cold storage.
#### Test Plan
Existing tiered-storage tests.
#### Problem
As we further optimize the HotStorageMeta in #146, there is a need
for a HotAccount struct that contains all the hot account information.
Meanwhile, we currently don't have plans to develop a cold account
format. This makes it desirable to repurpose TieredReadableAccount
as HotAccount.
#### Summary of Changes
Repurpose TieredReadableAccount to HotAccount.
#### Test Plan
Existing tiered-storage tests.