Commit Graph

168 Commits

Author SHA1 Message Date
behzad nouri 7a8807b8bb retransmits shreds recovered from erasure codes
Shreds recovered from erasure codes have not been received from turbine
and have not been retransmitted to other nodes downstream. This results
in more repairs across the cluster which is slower.

This commit channels through recovered shreds to retransmit stage in
order to further broadcast the shreds to downstream nodes in the tree.
2021-08-17 13:44:10 +00:00
behzad nouri 3c71670bd9 returns completed-data-set-info from insert_data_shred
instead of opaque (u32, u32) which are then converted to
CompletedDataSetInfo at the call-site.
2021-08-17 13:44:10 +00:00
behzad nouri b64eeb7729 removes erroneous uses of &Arc<...> from window-service 2021-08-13 17:26:31 +00:00
Jeff Washington (jwash) 14361906ca
for all tests, bank::new -> bank::new_for_tests (#19064) 2021-08-05 08:42:38 -05:00
carllin 03353d500f
Actively manage dead slots in AncestorHashesService (#18912) 2021-08-02 14:33:28 -07:00
carllin 588c0464b8
Add sampling logic and DuplicateSlotRepairStatus module (#18721) 2021-07-21 11:15:08 -07:00
sakridge 7f2254225e
Move entry/poh to own crate to speed up poh bench build (#18225) 2021-07-14 14:16:29 +02:00
carllin a1c0f144f4
Add blockstore column for frozen hashes and duplicate confirmed (#18533) 2021-07-12 20:59:16 -07:00
Michael Vines 4098af3b5b Record vote account commission with voting/staking rewards and surface in RPC 2021-07-12 15:09:44 -07:00
carllin 0eca92de18
Make set roots an iterator (#18357) 2021-07-01 20:02:40 -07:00
Tao Zhu 5e424826ba
Persist cost table to blockstore (#18123)
* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.
2021-07-01 11:32:41 -05:00
Lijun Wang a67d26a1e8
Fixed an issue doing the set_roots repeatedly for the same set. Instead doing the per chunks. (#18314) 2021-06-30 13:29:16 -07:00
sakridge 8d9a6deda4
Add repair number per slot (#18082) 2021-06-30 18:20:07 +02:00
Michael Vines 84b9de8c18 Shredder no longer holds a keypair 2021-06-21 21:29:52 -07:00
Alexander Meißner 6514096a67 chore: cargo +nightly clippy --fix -Z unstable-options 2021-06-18 10:42:46 -07:00
Tyera Eulberg d0511de9a6
chore: bump trees from 0.2.1 to 0.4.2 (#18052)
* chore: bump trees from 0.2.1 to 0.4.2 (#18041)

Bumps [trees](https://github.com/oooutlk/trees) from 0.2.1 to 0.4.2.
- [Release notes](https://github.com/oooutlk/trees/releases)
- [Commits](https://github.com/oooutlk/trees/commits)

---
updated-dependencies:
- dependency-name: trees
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Accommodate field & type changes

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-17 22:45:09 +00:00
steviez d52632f895
Add extra checks to verify_index_integrity (#17851) 2021-06-14 13:23:31 -05:00
steviez 4677e6c132
Combine repeated logic for checking if slot has been cleaned up (#17713) 2021-06-04 17:40:27 -05:00
carllin 96ba2edfeb
Switch EpochSlots to be frozen slots, not completed slots (#17168) 2021-06-03 00:20:00 +00:00
Ryo Onodera 1f97b2365f
Avoid full-range compactions with periodic filtered b.g. ones (#16697)
* Update rocksdb to v0.16.0

* Promote the infrequent and important log to info!

* Force background compaction by ttl without manual compaction

* Fix test

* Support no compaction mode in test_ledger_cleanup_compaction

* Fix comment

* Make compaction_interval customizable

* Avoid major compaction with periodic filtering...

* Adress lazy_static, special cfs and range check

* Clean up a bit and add comment

* Add comment

* More comments...

* Config code cleanup

* Add comment

* Use .conflicts_with()

* Nullify unneeded delete_range ops for special CFs

* Some clean ups

* Clarify the locking intention

* Ensure special CFs' consistency with PurgeType::CompactionFilter

* Fix comment

* Fix bad copy paste

* Fix various types...

* Don't use tuples

* Add a unit test for compaction_filter

* Fix typo...

* Remove flag and just use new behavior always

* Fix wrong condition negation...

* Doc. about no set_last_purged_slot in purge_slots

* Write a test and fix off-by-one bug....

* Apply suggestions from code review

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>

* Follow up to github review suggestions

* Fix line-wrapping

* Fix conflict

Co-authored-by: Tyera Eulberg <teulberg@gmail.com>
2021-05-28 16:42:56 +09:00
Tyera Eulberg ab581dafc2
Add block height to ConfirmedBlock structs (#17523)
* Add BlockHeight CF to blockstore

* Rename CacheBlockTimeService to be more general

* Cache block-height using service

* Fixup previous proto mishandling

* Add block_height to block structs

* Add block-height to solana block

* Fallback to BankForks if block time or block height are not yet written to Blockstore

* Add docs

* Review comments
2021-05-26 22:16:16 -06:00
Michael Vines 9541411c15 Plumb transaction-level rewards (aka "rent debits") into the `getTransaction` RPC method 2021-05-27 03:05:05 +00:00
carllin 52dccc656a
Purge slots greater than new last index (#16071) 2021-05-26 16:12:57 -07:00
carllin 3dfe87973b
Propagate dead slots up to replay (#17227) 2021-05-25 13:43:47 -07:00
Tyera Eulberg 41ec1c8d50
Add blockstore-root-scan for api nodes on boot (#17402)
* Add blockstore-root-scan for api nodes on boot

* Ensure cluster-confirmed root and parents are set as root in blockstore in load_frozen_forks()

* Plumb rpc-scan-and-fix-roots validator flag
2021-05-24 13:24:47 -06:00
sakridge a8dca3976b
Refactor genesis download/load/check functions (#17276)
* Refactor genesis ingest functions

* Consolidate genesis.bin/genesis.tar.bz2 references
2021-05-24 16:45:36 +02:00
Tao Zhu 0781fe1b4f
Upgrade Rust to 1.52.0 (#17096)
* Upgrade Rust to 1.52.0
update nightly_version to newly pushed docker image
fix clippy lint errors
1.52 comes with grcov 0.8.0, include this version to script

* upgrade to Rust 1.52.1

* disabling Serum from downstream projects until it is upgraded to Rust 1.52.1
2021-05-19 09:31:47 -05:00
Tyera Eulberg 6e9deaf1bd
Move block-time caching earlier (#17109)
* Require that blockstore block-time only be recognized slot, instead of root

* Move cache_block_time to after Bank freeze

* Single use statement

* Pass transaction_status_sender by reference

* Remove unnecessary slot-existence check before caching block time altogether

* Move block-time existence check into Blockstore::cache_block_time, Blockstore no longer needed in blockstore_processor helper
2021-05-10 13:14:56 -06:00
behzad nouri 81ad795d46 removes position field in coding-shred-header
CodingShredHeader.position is equal to
  ShredCommonHeader.index - ShredCommonHeader.fec_set_index
and is so redundant. The extra position field can add bugs if not
consistent with index and fec_set_index.
2021-05-10 13:20:56 +00:00
carllin bc7e741514
Integrate gossip votes into switching threshold (#16973) 2021-05-04 00:51:42 -07:00
steviez 475b00c42f
Add test to ensure data shreds with empty data would be inserted (#16955)
* Add test to ensure data shreds with empty data would be inserted

* Add error log statements for malformed shreds
2021-04-30 10:38:15 -05:00
steviez bc31378797
Trim extra shred bytes in blockstore (#16602)
Strip the zero-padding off of data shreds before insertion into blockstore

Co-authored-by: Stephen Akridge <sakridge@gmail.com>
Co-authored-by: Nathan Hawkins <utsl@utsl.org>
2021-04-27 17:40:41 -05:00
behzad nouri 03194145c0
removes first_coding_index from erasure recovery code (#16646)
first_coding_index is the same as the set_index and is so redundant:
https://github.com/solana-labs/solana/blob/37b8587d4/ledger/src/blockstore_meta.rs#L49-L60
2021-04-23 12:00:37 +00:00
behzad nouri 37b8587d4e
expands number of erasure coding shreds in the last batch in slots (#16484)
Number of parity coding shreds is always less than the number of data
shreds in FEC blocks:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L719

Data shreds are batched in chunks of 32 shreds each:
https://github.com/solana-labs/solana/blob/6907a2366/ledger/src/shred.rs#L714

However the very last batch of data shreds in a slot can be small, in
which case the loss rate can be exacerbated.

This commit expands the number of coding shreds in the last FEC block in
slots to: 64 - number of data shreds; so that FEC blocks are always 64
data and parity coding shreds each.

As a consequence of this, the last FEC block has more parity coding
shreds than data shreds. So for some shred indices we will have a coding
shred but no data shreds. This should not cause any kind of overlapping
FEC blocks as in:
https://github.com/solana-labs/solana/pull/10095
since this is done only for the very last batch in a slot, and the next
slot will reset the shred index.
2021-04-21 12:47:50 +00:00
Michael Vines a911ae00ba clippy 2021-04-18 20:55:02 -07:00
François Garillot b08cff9e77
Simplify some pattern-matches (#16402)
When those match an exact combinator on Option / Result.

Tool-aided by [comby-rust](https://github.com/huitseeker/comby-rust).
2021-04-08 12:40:37 -06:00
Tyera Eulberg b8b6777262
Only get Blockstore::last_root once (#16362) 2021-04-05 04:14:02 +00:00
Tyera Eulberg 1a13d22984
Fixup iterator method (#16357) 2021-04-04 23:32:51 +00:00
carllin 4e5ef6bce2
Add cluster state verifier logging (#16330)
* Add cluster state verifier logging

* Add duplicate-slots iterator to ledger tool
2021-04-02 21:48:44 -07:00
Tyera Eulberg da27acabcc
Rpc: enable getConfirmedSignaturesForAddress2 to return confirmed (not yet finalized) data (#16281)
* Update blockstore method to allow return of unfinalized signature

* Support confirmed sigs in getConfirmedSignaturesForAddress2

* Add deprecated comments

* Update docs

* Enable confirmed transaction-history in cli

* Return real confirmation_status; fill in not-yet-finalized block time if possible
2021-04-01 04:35:57 +00:00
Tyera Eulberg 18bd47dbe1
Rpc: fix getConfirmedTransaction slot (#16288)
* Fix transaction blockstore apis

* Update blockstore apis in rpc
2021-03-31 21:04:00 -06:00
Tyera Eulberg 433f1ead1c
Rpc: enable getConfirmedBlock and getConfirmedTransaction to return confirmed (not yet finalized) data (#16142)
* Add Blockstore block and tx apis that allow unrooted responses

* Add TransactionStatusMessage, and send on bank freeze; also refactor TransactionStatusSender

* Track highest slot with tx-status writes complete

* Rename and unpub fn

* Add commitment to GetConfirmed input configs

* Support confirmed blocks in getConfirmedBlock

* Support confirmed txs in getConfirmedTransaction

* Update sigs-for-addr2 comment

* Enable confirmed block in cli

* Enable confirmed transaction in cli

* Review comments

* Rename blockstore method
2021-03-26 16:47:35 -06:00
behzad nouri 4f82b897bc
buffers data shreds to make larger erasure coded sets (#15849)
Broadcast stage batches up to 8 entries:
https://github.com/solana-labs/solana/blob/79280b304/core/src/broadcast_stage/broadcast_utils.rs#L26-L29
which will be serialized into some number of shreds and chunked into FEC
sets of at most 32 shreds each:
https://github.com/solana-labs/solana/blob/79280b304/ledger/src/shred.rs#L576-L597
So depending on the size of entries, FEC sets can be small, which may
aggravate loss rate.
For example 16 FEC sets of 2:2 data/code shreds each have higher loss
rate than one 32:32 set.

This commit broadcasts data shreds immediately, but also buffers them
until it has a batch of 32 data shreds, at which point 32 coding shreds
are generated and broadcasted.
2021-03-23 14:52:38 +00:00
carllin d76ad33597
Handle blockstore insert dup checks (#16051) 2021-03-22 16:18:22 -07:00
Justin Starry 918d04e3f0
Add more slot update notifications (#15734)
* Add more slot update notifications

* fix merge

* Address feedback and add integration test

* switch to datapoint

* remove unused shred method

* fix clippy

* new thread for rpc completed slots

* remove extra constant

* fixes

* rely on channel closing

* fix check
2021-03-12 21:44:06 +08:00
Tyera Eulberg 7e65289729
Convert blockstore TransactionStatus column family to protobufs (#15733)
* Prevent panic if TransactionStatus can't be deserialized

* Convert Blockstore TransactionStatus column to protobuf

* Add compatability test
2021-03-05 09:05:35 -07:00
Michael Vines e92cff3efa create-snapshot subcommad now accepts the ROOT keyword 2021-02-26 16:40:10 -08:00
Michael Vines 5df36aec7d Pacify clippy 2021-02-19 20:08:41 -08:00
Tyera Eulberg 170cb792eb
Return blockstore error if previous_blockhash cannot be determined (#15382)
* Return blockstore error if previous_blockhash cannot be determined

* Add require_previous_blockshash flag
2021-02-18 01:04:52 +00:00
Tyera Eulberg 0812931c38
Log if unsanitary transactions are read from blockstore (#15319) 2021-02-14 06:32:43 +00:00