* add scan_index for improving index generation
* pr feedback
* rework some stuff from pr feedback
* get rid of redundant if
* deal with rent correctly
* disable read cache while populating stakes cache on load
* use struct with drop as api
* use LoadHint
* remove disable_read_cache_updates_count
* add comment
* fmt
#### Problem
TieredStorage::drop() currently panics when it fails to delete the
underlying file, to raise awareness of possible storage resource
leakage. This includes io::ErrorKind::NotFound. But sometimes a
TieredStorage (or AccountsFile in general) instance is created
and then dropped without any file being created. This causes
false alarms, including in unit-tests.
#### Summary of Changes
This PR excludes NotFound in reporting storage leakage on
TieredStorage::drop().
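A minimal sketch of the idea, assuming the drop path boils down to a single file removal. The helper name `remove_file_ignoring_not_found` is hypothetical and not taken from the PR; only `std::fs::remove_file` and `io::ErrorKind::NotFound` are real.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Removes `path`, treating a missing file as success.
/// Any other I/O error is returned so the caller can still report a possible leak.
/// (Hypothetical helper, not the actual TieredStorage code.)
fn remove_file_ignoring_not_found(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        // The storage was created but never written, so there is nothing to leak.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```

A drop implementation could call this helper and panic (or log) only on the remaining error kinds.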
#### Problem
#72 introduced AccountsFile::TieredStorage and, with it, a file-type
check when opening an accounts-file to determine whether it is a
tiered-storage or an append-vec. But before tiered-storage is
enabled, this opening check is unnecessary.
#### Summary of Changes
Remove the accounts-file type check code and simply assume everything
is append-vec on AccountsFile::new_from_file().
#### Problem
AccountsFile currently doesn't have an implementation for TieredStorage.
To enable AccountsDB tests for the TieredStorage, we need AccountsFile
to support TieredStorage.
#### Summary of Changes
This PR implements AccountsFile::TieredStorage, a thin wrapper between
AccountsFile and TieredStorage.
#### Problem
The TieredStorage has not yet implemented the AccountsFile::capacity()
API.
#### Summary of Changes
Implement capacity() API for TieredStorage and limit file size to 16GB,
same as the append-vec file.
#### Problem
TieredStorage::file_size() essentially supports AccountsFile::len(),
but its API is inconsistent with AccountsFile's.
#### Summary of Changes
Refactor TieredStorage::file_size() to ::len() and share the same API
as AccountsFile's.
#### Test Plan
Build
Existing unit-tests.
#### Problem
The current implementation of TieredStorage::file_size() requires
a sys-call to provide the file size.
#### Summary of Changes
Add len() API to TieredStorageReader, and have HotStorageReader()
implement the API using Mmap::len().
#### Test Plan
Update existing unit-test to also verify HotStorageReader::len().
#### Problem
The current AppendVecId actually refers to an accounts file id.
#### Summary of Changes
Rename AppendVecId to AccountsFileId.
#### Test Plan
Build
#### Problem
The TieredStorageFooter has the min_account_address and
max_account_address fields to describe the account address
range in its file. But the current implementation hasn't updated
the fields yet.
#### Summary of Changes
This PR enables the TieredStorage to persist address range
information into its footer via min_account_address and
max_account_address.
#### Test Plan
Updated tiered-storage test to verify persisted account address range.
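A rough sketch of how an address range could be tracked while accounts are written, so that the footer fields can be filled in at the end. The `Address` alias and the `AddressRange` type are illustrative assumptions; the real code works with Pubkey and its own footer types.

```rust
/// Stand-in for a 32-byte account address (hypothetical; the real code uses Pubkey).
type Address = [u8; 32];

/// Tracks the smallest and largest address seen so far.
#[derive(Debug, Clone, PartialEq, Eq)]
struct AddressRange {
    min: Address,
    max: Address,
}

impl Default for AddressRange {
    fn default() -> Self {
        // Start with an inverted range so the first update replaces both ends.
        Self { min: [0xFF; 32], max: [0x00; 32] }
    }
}

impl AddressRange {
    /// Widens the range to include `address` (byte-wise lexicographic order).
    fn update(&mut self, address: &Address) {
        if *address < self.min {
            self.min = *address;
        }
        if *address > self.max {
            self.max = *address;
        }
    }
}
```

After the last account is written, `min` and `max` would be copied into min_account_address and max_account_address.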
#### Problem
The TieredStorage::new_readonly() function currently has the following
problems:
* It opens the file without checking the magic number before checking and loading the footer.
* It opens the file twice: first to load the footer, then open again by the reader.
#### Summary of Changes
This PR refactors TieredStorage::new_readonly() so that it first performs all
checks inside the constructor of TieredReadableFile. The TieredReadableFile
instance is then passed to the proper reader (currently HotStorageReader)
when all checks are passed.
#### Test Plan
* Added a new test to check MagicNumberMismatch.
* Existing tiered-storage tests
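The "check the magic number first" step can be sketched as below. The magic value, its position (last 8 bytes, little-endian), and the `OpenError` type are all assumptions made for illustration; this log does not state the real layout.

```rust
/// Hypothetical magic number; the real value is not given in this log.
const MAGIC: u64 = 0x5449_4552_4544_5354;

#[derive(Debug, PartialEq, Eq)]
enum OpenError {
    TooSmall,
    MagicNumberMismatch { expected: u64, found: u64 },
}

/// Assumes the magic number occupies the last 8 bytes of the file, little-endian.
/// The check runs before any footer parsing is attempted.
fn check_magic_number(file_bytes: &[u8]) -> Result<(), OpenError> {
    let start = file_bytes.len().checked_sub(8).ok_or(OpenError::TooSmall)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&file_bytes[start..]);
    let found = u64::from_le_bytes(raw);
    if found != MAGIC {
        return Err(OpenError::MagicNumberMismatch { expected: MAGIC, found });
    }
    Ok(())
}
```

Doing all validation in one constructor means the validated handle can be passed on to the reader, so the file is opened only once.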
* accounts-db: unpack_archive: avoid extra iteration on each path
We used to do an iterator.clone().any(...) followed by
iterator.collect(). Merge the two and avoid an extra iteration and
re-parsing of the path.
* accounts-db: unpack_archive: unpack accounts straight into their final destination
We used to unpack accounts into account_path/accounts/<account> then
rename to account_path/<account>. We now unpack them into their final
destination directly and avoid the rename syscall.
#### Problem
TieredWritableFile currently uses File instead of BufWriter.
This results in more syscalls when doing file writes.
#### Summary of Changes
This PR makes TieredWritableFile use BufWriter so that
writes are batched, reducing the number of syscalls.
#### Test Plan
Existing tiered-storage test.
Will run experiments to verify its performance improvement.
#### Dependency
https://github.com/anza-xyz/agave/pull/260
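To illustrate why BufWriter helps, here is a small sketch that counts how often the underlying sink is written to. `CountingSink`, `write_records`, and `buffered_write_calls` are hypothetical names for this example; each call to the sink stands in for one write syscall on a real File.

```rust
use std::io::{self, BufWriter, Write};

/// Counts how many times the underlying sink is written to.
#[derive(Default)]
struct CountingSink {
    write_calls: usize,
    bytes: Vec<u8>,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.write_calls += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `count` 8-byte records, one `write_all` per record.
fn write_records<W: Write>(writer: &mut W, count: u64) -> io::Result<()> {
    for i in 0..count {
        writer.write_all(&i.to_le_bytes())?;
    }
    Ok(())
}

/// Returns the number of writes that reached the sink when going through BufWriter.
fn buffered_write_calls(count: u64) -> io::Result<usize> {
    let mut sink = CountingSink::default();
    {
        let mut buffered = BufWriter::new(&mut sink);
        write_records(&mut buffered, count)?;
        // Push any bytes still held in the buffer down to the sink.
        buffered.flush()?;
    }
    Ok(sink.write_calls)
}
```

Writing the same records directly to `CountingSink` reaches the sink once per record, while the buffered path reaches it only when the buffer fills or is flushed.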
#### Problem
The TieredStorageFile struct currently offers new_readonly() and new_writable()
to allow both read and write workloads to share the same struct. However,
the writer needs to use BufWriter to improve performance and to
enable Hasher on writes, so TieredStorageFile needs to be split into
separate read-only and writable types.
#### Summary of Changes
Refactor TieredStorageFile to TieredReadonlyFile and TieredWritableFile.
#### Test Plan
Existing tiered-storage tests.
#### Problem
tiered_storage/writer.rs was added when we planned to support multiple
tiers in the tiered-storage (i.e., at least hot and cold). However, as we
changed our plan to handle cold accounts as state-compressed accounts,
we don't need a general-purpose tiered-storage writer at this moment.
#### Summary of Changes
Remove tiered_storage/writer.rs as we currently don't have plans to develop cold storage.
#### Test Plan
Existing tiered-storage tests.
#### Problem
As we further optimize the HotStorageMeta in #146, there is a need
for a HotAccount struct that contains all the hot account information.
Meanwhile, we currently don't have plans to develop a cold account
format at this moment. As a result, this makes it desirable to repurpose
TieredReadableAccount to HotAccount.
#### Summary of Changes
Repurpose TieredReadableAccount to HotAccount.
#### Test Plan
Existing tiered-storage tests.
#### Problem
TieredStorage stores account hash as an optional field inside its HotStorage.
However, the field isn't used and we have already decided to deprecate
the account hash.
#### Summary of Changes
Remove account-hash from the tiered-storage.
#### Test Plan
Existing tiered-storage tests.
Running validators w/ tiered-storage in mainnet-beta w/o storing account-hash.
#### Problem
In TieredAccountMeta, RENT_EXEMPT_RENT_EPOCH will be used when
its optional field rent_epoch is None. However, for legacy reasons, 0
should be used for zero-lamport accounts.
#### Summary of Changes
Return 0 for TieredAccountMeta::rent_epoch() for zero-lamport accounts.
#### Test Plan
accounts_db::tests::test_clean_zero_lamport_and_dead_slot
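The fallback rule can be sketched as follows. `AccountMeta` is a simplified stand-in for TieredAccountMeta, the constant is assumed to equal u64::MAX, and giving a stored value precedence over the zero-lamport rule is an assumption of this sketch rather than something the PR states.

```rust
/// Assumed to equal u64::MAX (hypothetical local definition for this sketch).
const RENT_EXEMPT_RENT_EPOCH: u64 = u64::MAX;

/// Simplified stand-in for an account meta with an optional rent_epoch field.
struct AccountMeta {
    lamports: u64,
    rent_epoch: Option<u64>,
}

impl AccountMeta {
    fn rent_epoch(&self) -> u64 {
        match self.rent_epoch {
            // A stored value is returned as-is (assumption of this sketch).
            Some(epoch) => epoch,
            // Legacy behavior: zero-lamport accounts report epoch 0.
            None if self.lamports == 0 => 0,
            // Otherwise a missing field means the account is rent-exempt.
            None => RENT_EXEMPT_RENT_EPOCH,
        }
    }
}
```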
#### Problem
While accounts-db might not invoke appends_account twice
for the same AccountsFile, TieredStorage::write_accounts()
itself isn't thread-safe, and it depends on the above accounts-db
assumption.
#### Summary of Changes
This PR makes TieredStorage::write_accounts() thread-safe.
So only the first thread that successfully updates the already_written
flag can proceed and write the input accounts. All subsequent
calls to write_accounts() will be a no-op and return AttemptToUpdateReadOnly
Error.
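A minimal sketch of the write-once guard described above, assuming the flag is an AtomicBool. `WriteOnceStorage` and its simplified `write_accounts` signature are hypothetical; only the flag name and the error name come from this description.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq, Eq)]
enum StorageError {
    AttemptToUpdateReadOnly,
}

/// Hypothetical stand-in for a storage that may be written exactly once.
#[derive(Default)]
struct WriteOnceStorage {
    already_written: AtomicBool,
}

impl WriteOnceStorage {
    /// Returns the number of accounts written, or an error if a write already happened.
    fn write_accounts(&self, accounts: &[u64]) -> Result<usize, StorageError> {
        // Only the caller that flips the flag from false to true may proceed.
        self.already_written
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Relaxed)
            .map_err(|_| StorageError::AttemptToUpdateReadOnly)?;
        // A real writer would persist the accounts here.
        Ok(accounts.len())
    }
}
```

Because the compare-and-exchange is atomic, at most one of several racing callers can observe `false`, so the guard does not rely on accounts-db calling it only once.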
#### Problem
Some test functions are used by several different
modules in TieredStorage. It is better to have a single place for
all tiered-storage related test functions.
#### Summary of Changes
Create test_utils.rs under /tiered_storage and move test-related
functions into it.
#### Test Plan
Existing tests.
#### Problem
While the implementation of the hot-storage reader and writer
is mostly done, it is not yet connected to TieredStorage.
#### Summary of Changes
This PR enables hot-storage in TieredStorage::write_accounts().
#### Test Plan
Completes the existing tests in TieredStorage to directly
write and read from a TieredStorage with the hot storage format.
#### Problem
append_accounts() only appends (len - skip) accounts.
However, AppendVec::append_accounts() reserves `len`
instead of `(len - skip)` for its vectors.
#### Summary of Changes
Use (len - skip) as the initial size of the Vectors.
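The sizing change amounts to the pattern below. `collect_appended` is a hypothetical helper used only to show the capacity arithmetic; it is not the AppendVec code.

```rust
/// Collects the items that will actually be appended.
/// The first `skip` items are not appended, so only `len - skip` slots are reserved.
fn collect_appended<T: Clone>(accounts: &[T], skip: usize) -> Vec<T> {
    // saturating_sub guards against skip > len in this sketch.
    let to_append = accounts.len().saturating_sub(skip);
    let mut appended = Vec::with_capacity(to_append);
    appended.extend(accounts.iter().skip(skip).cloned());
    appended
}
```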
#### Problem
HotStorageReader and TieredStorageReader haven't implemented
accounts(), which is required by AccountsFile.
#### Summary of Changes
This PR implements accounts() for both HotStorageReader
and TieredStorageReader
#### Test Plan
Extend the existing test to cover accounts().
#### Problem
TieredStorageMeta and the TieredStorageReader::get_account API use
u32 to represent IndexOffset. However, within the TieredStorage scope,
IndexOffset should be used directly. The u32 representation of the offset
is only needed when working with the AccountsFile API.
#### Summary of Changes
Make TieredStorageMeta and TieredStorageReader use IndexOffset.
#### Test Plan
Existing unit-tests.
#### Problem
To allow hot-storage to use HotStorageWriter::write_account() to
implement AccountsFile::append_accounts(), it is required to
provide a Vector of StoredAccountInfo to allow AccountsDB to
properly prepare the entry for each account.
#### Summary of Changes
This PR enables HotStorageWriter::write_account() to return
Vec<StoredAccountInfo>.
#### Test Plan
Extend existing tests for HotStorageWriter to verify the correctness
of the returned Vec<StoredAccountInfo>.
#### Problem
TieredStorageReader is a wrapper enum that works for
both Hot and Cold storage readers, but its get_account()
and account_matches_owner() API are missing.
#### Summary of Changes
Add get_account() and account_matches_owner() to
TieredStorageReader.
#### Test Plan
hot.rs offers similar coverage for HotStorageReader.
#### Problem
In HotStorageReader, the account_matches_owners takes
&[&Pubkey] as the address candidates. However, it should
be &[Pubkey] as defined in the accounts_file API.
#### Summary of Changes
Correct HotStorageReader::account_matches_owners() to
take &[Pubkey] instead.
#### Test Plan
Existing unit-tests
#### Problem
Using non-reference type of AccountHash in
AccountMetaOptionalFields causes an unnecessary copy
as mentioned in #34948.
#### Summary of Changes
Uses &AccountHash in AccountMetaOptionalFields to
avoid copying.
#### Test Plan
Existing unit tests.
Fixes #34948
#### Problem
So far the current HotStorageWriter::write_accounts() only writes
accounts blocks and index block.
#### Summary of Changes
The PR further writes owners block in HotStorageWriter::write_accounts().
#### Test Plan
Extended existing test for HotStorageWriter to cover the owners block.
#### Problem
In HotStorageWriter::write_accounts, it skips storing rent-epoch when
the rent-epoch equals Epoch::MAX. While the value is correct, it is
more suitable to use RENT_EXEMPT_RENT_EPOCH instead as the
goal here is to save bytes for rent-exempt accounts.
#### Summary of Changes
Replace Epoch::MAX by RENT_EXEMPT_RENT_EPOCH when checking
whether to skip storing rent-epoch in HotStorageWriter.
#### Problem
The implementation of write_accounts() for HotAccountStorage is missing.
It consists of the writing of account blocks, index block, and owners block.
#### Summary of Changes
This PR completes part of the HotStorageWriter::write_accounts().
Specifically, it finishes the writing of account blocks and index block.
#### Test Plan
A new unit-test is added to verify the correctness of the work-in-progress
HotStorageWriter::write_accounts().
#### Problem
To write the owners-block, it requires an in-memory struct that maintains
a set of unique owner addresses while providing a look-up function to
obtain the OwnerOffset with the specified owner address.
#### Summary of Changes
This PR adds OwnersTable, the in-memory struct that maintains
a set of unique owner addresses while providing a look-up function to
obtain the OwnerOffset with the specified owner address.
#### Test Plan
A new unit-test is added.
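One way such a table could look, as a sketch only. The `Address` alias, the field layout, and the use of a HashMap next to a Vec are assumptions for illustration; the actual OwnersTable may be built differently.

```rust
use std::collections::HashMap;

/// Stand-in for a 32-byte owner address (hypothetical; the real code uses Pubkey).
type Address = [u8; 32];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct OwnerOffset(u32);

/// Keeps unique owners in insertion order and maps each one to its offset.
#[derive(Default)]
struct OwnersTable {
    owners: Vec<Address>,
    offsets: HashMap<Address, OwnerOffset>,
}

impl OwnersTable {
    /// Returns the offset of `owner`, inserting it if it has not been seen before.
    fn insert(&mut self, owner: &Address) -> OwnerOffset {
        if let Some(offset) = self.offsets.get(owner) {
            return *offset;
        }
        let offset = OwnerOffset(self.owners.len() as u32);
        self.owners.push(*owner);
        self.offsets.insert(*owner, offset);
        offset
    }

    /// Number of unique owners stored so far.
    fn len(&self) -> usize {
        self.owners.len()
    }
}
```

The Vec preserves the order in which owners would be written to the owners block, and the map gives constant-time lookup of an existing owner's offset.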
#### Problem
The OwnersBlockFormat is currently defined inside footer.rs
instead of inside owners.rs. In addition, the implementation of
OwnersBlock doesn't honor OwnersBlockFormat.
#### Summary of Changes
This PR moves OwnersBlockFormat from footer.rs to owners.rs
and repurposes OwnersBlock as OwnersBlockFormat (just like
the IndexBlockFormat inside index.rs).
#### Test Plan
Existing unit-tests.
The function open_genesis_config() performs several operations that
could fail. If any of these fail, the process exits immediately.
Instead of exiting immediately, bubble up the error and let the caller
decide the appropriate action. solana-validator and solana-ledger-tool
will functionally be unchanged, but this consolidates startup failures
for both of these processes.
#### Problem
Before we have fully switched to the new way to determine whether
an account is executable, we still need a bit for the executable flag at
this moment in the TieredStorage as well as for backward compatibility
in case we want to revert it back.
#### Summary of Changes
This PR adds the executable flag into AccountMetaFlags.
#### Test Plan
Updated existing tests for AccountMetaFlags to cover executable flag.
#### Problem
The implementation of HotAccountWriter is missing.
#### Summary of Changes
This PR kicks off the implementation of HotStorageWriter by
adding HotStorageWriter::new().
#### Test Plan
Add a new unit-test that verifies the correctness of HotStorageWriter
writing zero accounts using HotStorageReader.
#### Problem
After the previous refactoring, IndexBlockFormat::AddressAndBlockOffsetOnly
can now work independently of the AccountMetaFormat. As a result,
the naming of this format should be updated to reflect this.
#### Summary of Changes
As the format first persists the list of addresses followed by the list of offsets,
this PR renames AddressAndBlockOffsetOnly to AddressesThenOffsets.
#### Problem
All structs that implement AccountsFile are required to support get_account,
but the implementation for HotAccounts is currently missing.
#### Summary of Changes
This PR implements HotAccountsReader::get_account_from_index_offset.
This will allow it to support AccountsFile::get_account API.
#### Test Plan
Add new unit-test.
#### Problem
There are some typos in the comments and assert messages in index.rs, as mentioned in #34529.
#### Summary of Changes
Fix the typos (only in assert message and comments).
#### Problem
Each AccountsFile is required to implement account_matches_owners(),
a public API that checks whether the account located at the specified
offset matches any input owners. However, the implementation of
account_matches_owners() for HotAccountsStorage is missing.
#### Summary of Changes
This PR implements HotStorageReader::account_matches_owners().
#### Test Plan
A new unit-test is added to this PR.
#### Problem
TieredStorage doesn't perform boundary check in get_account_offset
when the input IndexOffset isn't valid.
#### Summary of Changes
This PR adds two checks. First, it checks whether the IndexOffset exceeds
the boundary of the index block. Second, when an index format that has the
same index entries as account entries is used, it also checks whether IndexOffset
is smaller than account_entry_count.
#### Test Plan
Two new tests are added to this PR.
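The two checks can be sketched as below. Every parameter here is a hypothetical stand-in (entry size, block bounds, and the error type are not taken from the PR); the sketch only shows the order and shape of the checks.

```rust
#[derive(Debug, PartialEq, Eq)]
enum IndexError {
    /// The entry would lie outside the index block.
    OffsetOutOfBounds,
    /// The index offset does not correspond to an existing account entry.
    EntryOutOfBounds,
}

/// Returns the byte position of the entry for `index_offset`.
/// `entry_size` is the byte size of one offset entry, and
/// `index_block_start`/`index_block_end` bound the index block within the file.
fn check_index_offset(
    index_offset: u32,
    entry_size: usize,
    index_block_start: usize,
    index_block_end: usize,
    account_entry_count: u32,
) -> Result<usize, IndexError> {
    // Check 1: the entry must fit inside the index block.
    let position = (index_offset as usize)
        .checked_mul(entry_size)
        .and_then(|bytes| bytes.checked_add(index_block_start))
        .ok_or(IndexError::OffsetOutOfBounds)?;
    let entry_end = position
        .checked_add(entry_size)
        .ok_or(IndexError::OffsetOutOfBounds)?;
    if entry_end > index_block_end {
        return Err(IndexError::OffsetOutOfBounds);
    }
    // Check 2: with one index entry per account, the offset must name a real entry.
    if index_offset >= account_entry_count {
        return Err(IndexError::EntryOutOfBounds);
    }
    Ok(position)
}
```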
#### Problem
get_account_address() does not check whether IndexOffset is valid.
#### Summary of Changes
This PR adds two checks. First, it checks whether the IndexOffset exceeds
the boundary of the index block. Second, when an index format that has the
same index entries as account entries is used, it also checks whether IndexOffset
is smaller than account_entry_count.
#### Test Plan
New unit-test is added.
#### Problem
Hot accounts are stored in accounts blocks, whose offset is smaller than
the index block offset. However, the current code doesn't perform
any boundary checks when accessing hot account meta.
#### Summary of Changes
Adds boundary check when accessing hot account meta.
#### Problem
HotAccountOffset::new() might return Err for invalid offset, and this
part needs some test coverage.
#### Summary of Changes
Add unit-tests for checking invalid HotAccountOffset.
#### Problem
The current naming and the code comments for HOT_ACCOUNT_OFFSET_ALIGNMENT
aren't really reflecting its role as pointed out in #34335.
#### Summary of Changes
This PR renames HOT_ACCOUNT_OFFSET_ALIGNMENT to HOT_ACCOUNT_ALIGNMENT,
as it is the hot account, not the hot account offset, that needs to be aligned.
In addition, it improves the comment block for HOT_ACCOUNT_ALIGNMENT.
#### Problem
Hot and cold accounts storage have different implementations of
their offsets. As a result, a single struct AccountOffset isn't suitable
to describe the offsets used by hot and cold accounts storage.
#### Summary of Changes
This PR makes AccountOffset a trait. On top of that, introduces
HotAccountOffset that implements AccountOffset.
#### Test Plan
Updated existing unit-tests.
* tune ancient append vec size to 130M
* fix a test and get rid of the assert since it is covered in the test
* use 128M
* assert max append vec size for ancient append vec
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
* ancient append vec append creates big enough ancient append vecs
* assert we write the correct # of bytes
* fix tests
* u64 align accounts bytes written
* fix build
* add tests
* we don't need to align the store to u64.
When we call `remaining_bytes` to calculate, for the last element, u64_align!(len) could be greater than capacity. However, because saturating_sub is used, we still get the correct answer of 0. Therefore, no need to align the storage to u64.
* add comments
* add shrink test for min_size
* remove dead code
* fix build
* reviews
* add test for squash ancient vec overflow too much to fit into ideal av
* add test for get_bytes on AccountToStore
* clippy
* modify test to assert 0 size of opposite
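The remaining_bytes reasoning in the list above can be checked with a small sketch. The `u64_align` function is an assumed equivalent of the u64_align! macro (round up to a multiple of 8), and the free-function form of `remaining_bytes` is a simplification for illustration.

```rust
/// Rounds `len` up to the next multiple of 8 (assumed behavior of u64_align!).
fn u64_align(len: u64) -> u64 {
    len.saturating_add(7) & !7
}

/// Bytes still available once `len` bytes (after alignment) are in use.
/// saturating_sub clamps at 0 when the aligned length exceeds the capacity.
fn remaining_bytes(capacity: u64, len: u64) -> u64 {
    capacity.saturating_sub(u64_align(len))
}
```

With a capacity of 10 and a length of 9, the aligned length is 16, and the result clamps to 0 rather than underflowing.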
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
#### Problem
Now that AccountOffset is defined as u32, the address space within
one HotAccountsFile is limited to 4GB. However, since the accounts blocks in a
HotAccountsFile are 8-byte aligned, the file can address more data by
internally multiplying the AccountOffset by 8.
#### Summary of Changes
This PR allows a HotAccountsFile to store up to 32GB of accounts data by using
8 x AccountOffset as its actual offset.
#### Test Plan
Updated existing unit-tests.
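The scaling trick can be sketched as follows. `ScaledAccountOffset` and `OffsetError` are hypothetical names for this example; only the u32 storage and the factor of 8 come from the description above.

```rust
/// Accounts are 8-byte aligned, so offsets are stored in units of 8 bytes.
const ALIGNMENT: u64 = 8;

#[derive(Debug, PartialEq, Eq)]
enum OffsetError {
    Unaligned(u64),
    TooLarge(u64),
}

/// Stores byte_offset / 8 in a u32 (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ScaledAccountOffset(u32);

impl ScaledAccountOffset {
    fn new(byte_offset: u64) -> Result<Self, OffsetError> {
        if byte_offset % ALIGNMENT != 0 {
            return Err(OffsetError::Unaligned(byte_offset));
        }
        let units = byte_offset / ALIGNMENT;
        if units > u32::MAX as u64 {
            return Err(OffsetError::TooLarge(byte_offset));
        }
        Ok(Self(units as u32))
    }

    /// Recovers the actual byte offset within the file.
    fn byte_offset(&self) -> u64 {
        self.0 as u64 * ALIGNMENT
    }
}
```

Since a u32 can hold 2^32 distinct values and each unit is 8 bytes, the addressable range is 2^32 x 8 = 2^35 bytes, which is the 32GB mentioned above.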
#### Problem
AccountOffset currently uses `usize`, whose size is platform dependent.
We want a fixed-size type that is consistent with what we persist in the tiered-storage file.
#### Summary of Changes
This PR makes AccountOffset use u32.
#### Problem
MatchAccountOwnerError currently belongs to append_vec.
However, it is a public error type that is also required by other AccountsFile
implementations such as TieredStorageFile.
#### Summary of Changes
Move MatchAccountOwnerError from append_vec to accounts_file.
* remaining_bytes aligns len since all writes will align first
* use remaining_bytes() to check for whether the new account can fit into the av storage
* Add test coverage for av remaining bytes alignment
* use greater-than-or-equal to check space available
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
#### Problem
IndexOffset currently uses `usize`, whose size is platform dependent.
We want a fixed-size type that is consistent with what we persist in the tiered-storage file.
#### Summary of Changes
This PR makes IndexOffset use u32.
#### Problem
In TieredStorage, we want to make AccountOffset and IndexOffset u32 instead
of usize just like OwnerOffset. However, we need to first change what we persist
in the storage. Currently, IndexBlock persists offsets as u64 instead of u32.
#### Summary of Changes
This PR makes IndexBlock persist u32 offsets.
#### Test Plan
Existing test cases.
#### Problem
TieredAccountMeta currently uses `u32` for owner_offset, while
we already have a dedicated type for that: OwnerOffset.
#### Summary of Changes
This PR makes TieredAccountMeta use OwnerOffset for its
owner_offset field.
#### Test Plan
Existing unit-tests.
#### Problem
The current OwnerOffset is defined as usize, whose actual size could be that of u32 or u64.
On the other hand, the OwnerOffset is used to access the ith owner in one
TieredStorageFile. As a result, u32 is more than enough to represent the number of
owners inside one TieredStorageFile, while u16 might not always be enough.
#### Summary of Changes
This PR defines OwnerOffset as u32.
#### Test Plan
Existing test cases.
* add a feature to disable rent collection
* fix a test
* fix a test
* rekey
* should collect rent
* Update runtime/src/bank/fee_distribution.rs
Co-authored-by: Brooks <brooks@prumo.org>
* expand tests to cover both rent collection disabled and enabled
* feedback
* reviews - move should collect rent check out of rent collector into bank
* enforce rent_epoch to u64::MAX when rent collection is disabled
* review feedback and fix a test
When rent fee collection is disabled, we won't collect rent for any account. If there are any rent-paying accounts, their `rent_epoch` won't change either.
* revise comments
* update rent_epoch for rent exempted account
* rebase
* set rent_epoch in rent collection for rent exempted account
* revert test change
* don't assert
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
Co-authored-by: Brooks <brooks@prumo.org>
#### Problem
The HotStorageReader does not have an API to obtain owners_address.
#### Summary of Changes
This PR adds HotStorageReader::get_owner_address().
#### Test Plan
A new unit-test is added to this PR.
#### Problem
#34032 mentioned a nice-to-have formatting improvement that
would improve readability.
#### Summary of Changes
Improve the code formatting in test_hot_storage_get_account_offset_and_address().
#### Problem
A TieredStorage file has three types of block: accounts block, index block,
and owner block, but implementation of the TieredStorage OwnersBlock
is missing in the current master.
#### Summary of Changes
This PR implements OwnersBlock which stores a set of owners' addresses
in a compact but efficient way.
#### Test Plan
A new unit-test is included in this PR.
#### Problem
HotStorageReader does not yet have an API to obtain account_address.
#### Summary of Changes
This PR adds HotStorageReader::get_account_address() which returns
the Pubkey of the account associated with the specified IndexOffset.
#### Test Plan
Augmented an existing unit-test to cover get_account_address() case.
#### Problem
TieredStorageFooter::owners_offset refers to the offset to the
owners block, which looks very similar to OwnerOffset that
has a completely different concept -- the offset to access the
address of an owner in the owners-block.
#### Summary of Changes
Similar to the previous renaming index_offset to index_block_offset,
this PR renames owners_offset to owners_block_offset to
avoid confusion.
#### Problem
HotStorageReader does not yet have an API to obtain account_offset.
#### Summary of Changes
This PR adds HotStorageReader::get_account_offset() which takes
IndexOffset and returns AccountOffset.
#### Test Plan
A new unit-test is included in this PR.
#### Problem
In IndexBlockFormat, both `IndexOffset` and `AccountOffset` parameters
are named `offset` in some functions which could be confusing.
#### Summary of Changes
Renamed `offset` to `index_offset` and `account_offset` to improve
readability.
#### Problem
#33927 introduced a new type AccountOffset, but HotStorageReader
still uses `usize` to access accounts.
#### Summary of Changes
This PR makes HotStorageReader use the new AccountOffset type.
#### Problem
TieredStorage conceptually has different offsets. However, the current code directly
uses the same primitive type for accessing offsets, which is error-prone as one could
easily use one offset to access data that should be accessed with a different offset
type.
#### Summary of Changes
This PR adds the IndexOffset type, a struct for addressing the ith entry inside the
index-block in order to obtain an account's offset and address.
#### Problem
TieredStorage conceptually has different offsets. However, the current code directly
uses the same primitive type for accessing offsets, which is error-prone as one could
easily use one offset to access data that should be accessed with a different offset
type.
#### Summary of Changes
This PR introduces the AccountOffset type, which allows static checks on the different
types of TieredStorage offsets.
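The static-check benefit comes from the newtype pattern, sketched below. The wrapped integer type and the `IndexBlock` struct are assumptions for illustration; they are not the actual TieredStorage definitions.

```rust
/// Position of an entry inside the index block (hypothetical definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct IndexOffset(usize);

/// Position of an account inside the accounts blocks (hypothetical definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AccountOffset(usize);

/// Simplified in-memory index block: entry i holds the offset of account i.
struct IndexBlock {
    account_offsets: Vec<AccountOffset>,
}

impl IndexBlock {
    /// Accepts only an IndexOffset. Passing an AccountOffset here is a
    /// compile-time type error instead of a silent out-of-place read.
    fn get_account_offset(&self, index_offset: IndexOffset) -> Option<AccountOffset> {
        self.account_offsets.get(index_offset.0).copied()
    }
}
```

With plain integers, both kinds of offset would type-check anywhere, so the mistake described in the Problem section would only show up at runtime.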
#### Problem
The current tiered-storage code uses "account-index" to refer to the index-block.
This could lead to confusion, especially as we start giving each offset/position/index a specific type.
#### Summary of Changes
This PR renames all structs/variables that use account-index to refer to index-block.
* allow test feature to skip rewrites
* hook up cli arg for test skip rewrites, update tests
* fix sanity checker
* add account hash to abi to fix a test
* reviews
* use hashmap to collect skip_rewrites. exclude skip_rewrites from dirty
pubkey set
* accumulate skipped_rewrite in reduce
* mutex
* fmt
* skip hash verify for this test flag
* add skipped rewrites num stat
* skip bank hash verify not account hash verify
* reviews
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
* Split compute budget instruction processing from ComputeBudget struct itself, so CB instructions can be processed elsewhere without involving ComputeBudget
* updated tests
* avoid building ComputeBudget from dated ComputeBudgetLimits in this refactoring PR
* Clean-up program-runtime/src/compute_budget_processor.rs
* Add test for a corner case that deprecated instruction is used to request units greater than max limit;
* Update code to handle the corner case.
* buffer accounts field for hash
* use smallvec to allocate hash buffer on stack
* sort deps
* more opt
* clippy
---------
Co-authored-by: HaoranYi <haoran.yi@solana.com>
Revert "Split compute budget instructions process from struct itself (#33513)"
This reverts commit c73bebe984. This
was found to be a consensus breaking change.
* Split compute budget instruction processing from ComputeBudget struct itself, allow compute_budget_instructions be processed elsewhere without having to instantiate ComputeBudget
* updated tests
#### Problem
HotStorageReader currently only implements get_footer(). It does not
have a function to obtain the account meta.
#### Summary of Changes
This PR implements HotStorageReader::get_account_meta_from_offset(),
a function that returns the account meta located at the specified offset.
It will later serve as the helper used to obtain the account
meta when the offset is available from the index block of a hot storage file.
#### Test Plan
A new test is included in this PR.
#### Problem
HotStorageReader currently doesn't have a test that covers its footer.
#### Summary of Changes
This PR includes a test for HotStorageReader that verifies the footer.