Commit Graph

18 Commits

Author SHA1 Message Date
behzad nouri 37c8842bcb
scans crds table in parallel for finding old labels (#13073)
From runtime profiles, the majority time of ClusterInfo::handle_purge
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1605-L1626
is spent scanning crds table finding old labels:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/crds.rs#L175-L197

This can be done in parallel given that gossip thread-pool:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1637-L1641
is idle when handle_purge is invoked:
https://github.com/solana-labs/solana/blob/0776fa05c/core/src/cluster_info.rs#L1681
2020-10-23 14:17:37 +00:00
Michael Vines 959880db60 Remove unused pubkey::Pubkey imports 2020-10-21 19:08:13 -07:00
Michael Vines 7bc073defe Run `codemod --extensions rs Pubkey::new_rand solana_sdk::pubkey::new_rand` 2020-10-21 19:08:13 -07:00
behzad nouri 1866521df6
retains hash value of outdated responses received from pull requests (#12513)
pull_response_fail_inserts has been increasing:
https://cdn.discordapp.com/attachments/478692221441409024/759096187587657778/pull_response_fail_insert.png
but for outdated values which fail to insert:
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds_gossip_pull.rs#L332-L344
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds.rs#L104-L108
are not recorded anywhere, and so the next pull request may obtain the
same redundant payload again, unnecessary taking bandwidth.

This commit holds on to the hashes of failed-inserts for a while, similar
to purged_values:
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds_gossip_pull.rs#L380
and filter them out for the next pull request:
https://github.com/solana-labs/solana/blob/a5c3fc14b3/core/src/crds_gossip_pull.rs#L204
2020-10-01 00:39:22 +00:00
behzad nouri 57ed4e4657
patches bug in Crds::find_old_labels with pubkey specific timeout (#12528)
Current code only returns values which are expired based on the default
timeout. Example from the added unit test:
  - value inserted at time 0
  - pubkey specific timeout = 1
  - default timeout = 3
Then at now = 2, the value is expired, but the function fails to return
the value because it compares with the default timeout.
2020-09-29 09:04:40 +00:00
behzad nouri 9b866d79fb
shards crds values based on their hash prefix (#12187)
filter_crds_values checks every crds filter against every hash value:
https://github.com/solana-labs/solana/blob/ee646aa7/core/src/crds_gossip_pull.rs#L432
which can be inefficient if the filter's bit-mask only matches small
portion of the entire crds table.

This commit shards crds values into separate tables based on shard_bits
first bits of their hash prefix. Given a (mask, mask_bits) filter,
filtering crds can be done by inspecting only relevant shards.

If CrdsFilter.mask_bits <= shard_bits, then precisely only the crds
values which match (mask, mask_bits) bit pattern are traversed.
If CrdsFilter.mask_bits > shard_bits, then approximately only
1/2^shard_bits of crds values are inspected.

Benchmarking on a gce cluster of 20 nodes, I see ~10% improvement in
generate_pull_responses metric, but with larger clusters, crds table and
2^mask_bits are both larger, so the impact should be more significant.
2020-09-17 14:05:16 +00:00
sakridge f519fdecc2
generate_pull_response optimization (#11597) 2020-08-12 22:45:19 -07:00
Greg Fitzgerald 0550b893b0
Fix typos (#10675)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-06-17 20:54:52 -07:00
anatoly yakovenko ba83e4ca50
Fix fannout gossip bench (#10509)
* Gossip benchmark

* Rayon tweaking

* push pulls

* fanout to max nodes

* fixup! fanout to max nodes

* fixup! fixup! fanout to max nodes

* update

* multi vote test

* fixup prune

* fast propagation

* fixups

* compute up to 95%

* test for specific tx

* stats

* stats

* fixed tests

* rename

* track a lagging view of which nodes have the local node in their active set in the local received_cache

* test fixups

* dups are old now

* dont prune your own origin

* send vote to tpu

* tests

* fixed tests

* fixed test

* update

* ignore scale

* lint

* fixup

* fixup

* fixup

* cleanup

Co-authored-by: Stephen Akridge <sakridge@gmail.com>
2020-06-13 22:03:38 -07:00
sakridge ecb6959720
Optimize process pull responses (#10460)
* Batch process pull responses

* Generate pull requests at 1/2 rate

* Do filtering work of process_pull_response in read lock

Only take write lock to insert if needed.
2020-06-09 17:08:13 -07:00
Kristofer Peterson 58ef02f02b
9951 clippy errors in the test suite (#10030)
automerge
2020-05-15 09:35:43 -07:00
anatoly yakovenko b150da837a Use epoch as the gossip purge timeout for staked nodes. (#7005)
automerge
2019-11-20 11:25:18 -08:00
Sagar Dhawan 568475e2db
Fix incorrectly signed CrdsValues (#6696) 2019-11-03 10:07:51 -08:00
TristanDebrunner 9e52d11ad0
Remove Backend trait (#6407) 2019-10-17 15:19:27 -06:00
Greg Fitzgerald fcef54d062 Add a constructor to generate random pubkeys 2019-03-31 16:23:18 -06:00
Rob Walker 195a880576
pass Pubkeys as refs, copy only where values needed (#3213)
* pass Pubkeys as refs, copy only where values needed

* Pubkey is pervasive

* fixup
2019-03-09 19:28:43 -08:00
Michael Vines 79b2542ca4 Remove CrdsValue::LeaderId 2019-03-08 19:41:51 -08:00
Michael Vines 5f5d779ee1 Move src/ into core/src. Top-level crate is now called solana-workspace 2019-03-02 09:52:18 -08:00