Tenets:
1. Limit thread names to 15 characters
2. Prefix all Solana-controlled threads with "sol"
3. Use Camel case. It's more character dense than Snake or Kebab case
It used to report the number of packets with successful signature
validations but was accidentally changed to count packets passed into
the verifier by e4409a87fe.
This restores the previous meaning.
Upcoming changes to PacketBatch to support variable sized packets will
modify the internals of PacketBatch. So, this change removes usage of
the internal packet struct and instead uses accessors (which are
currently just wrappers of Vector functions but will change down the
road).
* SigVerify: Add total time metrics for dedup/discard/verify
Previously it was impossible to determine the total time the stage spent
on these activities within a measurement window.
* SigVerify: Add _us postfix to time metrics
As shown by the added benchmark, current code does worse if there is a
spam address plus a lot of unique addresses.
on current master:
test bench_packet_discard_many_senders ... bench: 1,997,960 ns/iter (+/- 103,715)
test bench_packet_discard_mixed_senders ... bench: 14,256,116 ns/iter (+/- 534,865)
test bench_packet_discard_single_sender ... bench: 1,306,809 ns/iter (+/- 61,992)
with this commit:
test bench_packet_discard_many_senders ... bench: 1,644,025 ns/iter (+/- 83,715)
test bench_packet_discard_mixed_senders ... bench: 1,089,789 ns/iter (+/- 86,324)
test bench_packet_discard_single_sender ... bench: 955,234 ns/iter (+/- 55,953)
* Page-pin packet memory for cuda
Bring back recyclers and pin offset buffers
* Add packet recycler to streamer
* Add set_pinnable to sigverify vecs to pin them
* Add packets reset test
* Add test for recycler and reduce the gc lock critical section
* Add comments/tests to cuda_runtime
* Add recycler to recv_blobs path.
* Add trace/names for debug and PacketsRecycler to bench-streamer
* Predict realloc and unpin beforehand.
* Add helper to reserve and pin
* Cap buffered packets length
* Call cuda wrapper functions
* Use Rc to prevent clone of Packets
* Fix min => max in banking_stage threads.
Coalesce packet buffers better since a larger batch will
be faster through banking and sigverify.
Deconstruct batches into banking_stage from sigverify since
sigverify likes to accumulate batches but then a single banking_stage
thread will be stuck with a large batch. Maximize parallelism by
creating more chunks of work for banking_stage.