Based on run-time profiles, the majority time of new_pull_requests is
spent building bloom filters, in hashing and bit-vec ops.
This commit builds crds filters in parallel using rayon constructs. The
added benchmark shows ~5x speedup (4-core machine, 8 threads).
filter_crds_values checks every crds filter against every hash value:
https://github.com/solana-labs/solana/blob/ee646aa7/core/src/crds_gossip_pull.rs#L432
which can be inefficient if the filter's bit-mask only matches small
portion of the entire crds table.
This commit shards crds values into separate tables based on shard_bits
first bits of their hash prefix. Given a (mask, mask_bits) filter,
filtering crds can be done by inspecting only relevant shards.
If CrdsFilter.mask_bits <= shard_bits, then precisely only the crds
values which match (mask, mask_bits) bit pattern are traversed.
If CrdsFilter.mask_bits > shard_bits, then approximately only
1/2^shard_bits of crds values are inspected.
Benchmarking on a gce cluster of 20 nodes, I see ~10% improvement in
generate_pull_responses metric, but with larger clusters, crds table and
2^mask_bits are both larger, so the impact should be more significant.
* Filter push/pulls from spies
* Don't pull from peers with shred version == 0, don't push to people with shred_version == 0
Co-authored-by: Carl <carl@solana.com>
* Gossip benchmark
* Rayon tweaking
* push pulls
* fanout to max nodes
* fixup! fanout to max nodes
* fixup! fixup! fanout to max nodes
* update
* multi vote test
* fixup prune
* fast propagation
* fixups
* compute up to 95%
* test for specific tx
* stats
* stats
* fixed tests
* rename
* track a lagging view of which nodes have the local node in their active set in the local received_cache
* test fixups
* dups are old now
* dont prune your own origin
* send vote to tpu
* tests
* fixed tests
* fixed test
* update
* ignore scale
* lint
* fixup
* fixup
* fixup
* cleanup
Co-authored-by: Stephen Akridge <sakridge@gmail.com>
* Batch process pull responses
* Generate pull requests at 1/2 rate
* Do filtering work of process_pull_response in read lock
Only take write lock to insert if needed.
* filter messages that are likely to be pushed from the response
* tests
* tests
* wait to start filtering responses, and push stats to influx
* wait to start filtering responses, and push stats to influx
* reduce the timers to match the publish self timeout
* fmt
* fmt
* Revert "Untar is called for shred archives that do not exist. (#9565)"
This reverts commit 729cb5eec6.
* Revert "Dont insert shred payload into rocksdb (#9366)"
This reverts commit 5ed39de8c5.
* Add CrdsValue timeout checks on Pull Responses
* Allow older values to enter Crds as long as a ContactInfo exists
* Allow staked contact infos to be inserted into crds if they haven't expired
* Try and handle oveflows
* Fix test
* Some comments
* Fix compile
* fix test deadlock
* Add a test for processing timed out values received via pull response
* Coalesce gossip pull requests and serve them in batches
* batch all filters and immediately respond to messages in gossip
* Fix tests
* make download_from_replicator perform a greedy recv
* fixed bloom filter math
* Add split each pull request into multiple pulls with different filters
* Rework CrdsFilter to generate all possible masks to cover the keyspace
* Limit the bloom sizes such that each pull request is no larger than mtu
* Be able to create bank snapshots
* fix clippy
* load snapshot on start
* regenerate account index from the storage
* Remove rc feature dependency
* cleanup
* save snapshot for slot 0