285 Commits

Author SHA1 Message Date
Alessandro Decina 77ce500494
quic server: disable GSO, reduce reply data allocations 10x (#1647)
quic server: disable GSO

The server only accepts inbound unidirectional streams initiated by
clients, which means that reply data never exceeds one MTU. By disabling
GSO, we make quinn_proto::Connection::poll_transmit allocate only 1 MTU
vs 10 * MTU for _each_ transmit. This reduces allocations 10x.
2024-06-13 01:20:09 +07:00
Lijun Wang f54c120450
Connection rate limiting (#948)
* use rate limit on connectings

use rate limit on connectings; missing file

* Change connection rate limit to 8/min instead of 4/s

* Addressed some feedback from Trent

* removed some comments

* fix test failures which are opening connections more frequently

* moved the flag up

* turn off rate limiting to debug CI

* Fix CI test failures

* differentiate the two throttling cases in stats: across connections or per ip addr

* fmt issues

* Addressed some feedback from Trent

* Added unit tests

Cleanup connection cache rate limiter if exceeding certain threshold

missing files

CONNECITON_RATE_LIMITER_CLEANUP_THRESHOLD to 100_000

clippy issue

clippy issue

sort crates

* revert Cargo.lock changes

* Addressed some feedback from Pankaj
2024-05-14 17:33:43 -07:00
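The per-IP limit described in #948 (8 connections per minute, with old entries cleaned up) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from the commit: `ConnectionRateLimiter`, `is_allowed`, and `retain_recent` are hypothetical names, and the fixed one-minute window is a simplification chosen for the example.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical fixed window; the commit only states a limit of 8 per minute.
const WINDOW: Duration = Duration::from_secs(60);

/// Hypothetical per-IP limiter: counts connection attempts per window.
struct ConnectionRateLimiter {
    limit_per_window: u32,
    windows: HashMap<IpAddr, (Instant, u32)>,
}

impl ConnectionRateLimiter {
    fn new(limit_per_window: u32) -> Self {
        Self {
            limit_per_window,
            windows: HashMap::new(),
        }
    }

    /// Returns true if a new connection from `ip` is within the limit.
    fn is_allowed(&mut self, ip: IpAddr, now: Instant) -> bool {
        let entry = self.windows.entry(ip).or_insert((now, 0));
        // Start a fresh window once the previous one has expired.
        if now.saturating_duration_since(entry.0) >= WINDOW {
            *entry = (now, 0);
        }
        if entry.1 < self.limit_per_window {
            entry.1 += 1;
            true
        } else {
            false
        }
    }

    /// Drops entries whose window has expired, keeping the map bounded.
    fn retain_recent(&mut self, now: Instant) {
        self.windows
            .retain(|_, (start, _)| now.saturating_duration_since(*start) < WINDOW);
    }
}
```

A caller would check `is_allowed` before accepting each incoming connection and run `retain_recent` once the map grows past some threshold, which is the role the commit gives to its cleanup threshold of 100_000.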
Lijun Wang 137a982a1e
sleep instead of drop when stream rate exceeded limit; (#939)
* sleep instead of drop when stream rate exceeded limit;

Consider connection count of staked nodes when calculating allowed PPS

remove rtt from throttle_duration calculation

removed connection count in StreamerCounter -- we do not need it at this point

* remove connection count related changes -- they are unrelated to this PR

* revert unintended changes
2024-04-22 22:16:57 -07:00
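The idea in #939 of sleeping rather than dropping a stream once the rate limit is hit can be sketched as a small helper. The name `throttle_duration` appears in the commit message, but the signature and the window-based logic below are assumptions made for illustration only.

```rust
use std::time::Duration;

/// Returns how long the caller should sleep before handling the next stream.
/// Assumed logic: under the limit there is no wait; at or over the limit the
/// caller waits out the remainder of the current throttling window.
fn throttle_duration(
    streams_in_window: u64,
    max_streams_per_window: u64,
    elapsed_in_window: Duration,
    window: Duration,
) -> Duration {
    if streams_in_window < max_streams_per_window {
        Duration::ZERO
    } else {
        // saturating_sub yields zero if the window has already ended.
        window.saturating_sub(elapsed_in_window)
    }
}
```

With a helper like this, the stream is kept and delayed rather than discarded, and the wait does not depend on the connection's RTT.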
Alessandro Decina 1c28671ecb
quic server: set initial window to PACKET_DATA_SIZE (#905)
Start small so spam doesn't use too many resources. Then we grow the
window once we figure out a connection's stake.
2024-04-22 10:57:12 -07:00
Alessandro Decina 2770424782
quic: delay calling set_max_concurrent_uni_streams/set_receive_window (#904)
* quic: don't call connection.set_max_concurrent_uni_streams if we're going to drop a connection

Avoids taking a mutex and waking a task.

* quic: don't increase the receive window before we've actually accepted a connection
2024-04-22 10:56:20 -07:00
Lijun Wang 0e96ea6ad8
share the stream counter for unstaked connections as well (#962) 2024-04-22 10:36:54 -07:00
Lijun Wang 9d953cb83a
Limit max concurrent connections (#851)
Limit max concurrent connections
2024-04-19 10:31:40 -07:00
Lijun Wang f2aa4f0741
Parameterize max streams per ms (#707)
Make PPS a parameter instead of a hard-coded value
2024-04-15 15:58:10 -07:00
Yihau Chen 0e6d42e613
bump nix to 0.28.0 (#628)
* bump nix to 0.28.0

* enable 'socket' for net-utils

* enable 'signal' for install

* enable 'user' for perf

* enable 'net' for streamer
2024-04-11 12:03:23 +08:00
Alessandro Decina 55ab7fadbc
quic: use smallvec to aggregate chunks, save 1 alloc per packet (#735)
quic: use smallvec, save one allocation per packet

Use smallvec to hold chunks. Streams are packet-sized so we don't expect
them to have many chunks. This saves us an allocation for each packet.
2024-04-11 12:25:35 +10:00
Alessandro Decina 85c6e412e0
quic: do ordered reads, save 1 BTreeMap allocation per packet (#736)
quic: switch to ordered reads

Unordered reads cause a BTreeMap allocation for each packet inside quinn
in Assembler::ensure_ordering.

Most streams will fit in one datagram and will therefore be ordered by
definition. Switch to ordered reads to avoid the allocation.
2024-04-11 12:25:06 +10:00
Lijun Wang 92ebf0f80c
Treat super low staked as unstaked in streamer QOS (#701)
* Treat super low staked with QOS of unstaked

* simplify

* address some comment from Pankaj
2024-04-10 17:13:41 -06:00
Andrew Fitzgerald 1744e9efd7
BankingStage Forwarding Filter (#685)
* add PacketFlags::FROM_STAKED_NODE

* Only forward packets from staked node

* fix local-cluster test forwarding

* review comment

* tpu_votes get marked as from_staked_node
2024-04-09 23:12:26 +00:00
Lijun Wang 592107a907
corrected to not use hardcoded connections count for unstaked (#633)
* corrected to not use hardcoded connections count for unstaked

* Fixed a math problem on max_unstaked_load_in_throttling_window

* Fixed a unit test failure
2024-04-09 15:20:24 -07:00
Brooks 4546e79cbc
Fixes nits in streamer perf tracking (#648) 2024-04-09 16:47:18 -04:00
steviez 923e303acb
Move recv timer to start before crossbeam receiver recv_timeout() (#642)
The timer starting after recv_timeout() means the measured time will NOT
include time spent waiting for the first PacketBatch from the receiver.
2024-04-08 11:28:00 -05:00
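The timing change in #642 can be illustrated with a short sketch. It uses `std::sync::mpsc` in place of crossbeam so the example has no external dependencies, and `recv_timed` is a hypothetical helper, not a function from the streamer crate.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Receives one item and reports how long the call took, including the time
/// spent blocked while waiting for the item to arrive.
fn recv_timed(
    receiver: &Receiver<Vec<u8>>,
    timeout: Duration,
) -> (Result<Vec<u8>, RecvTimeoutError>, Duration) {
    // Start the timer before the blocking call so the wait is measured too.
    let start = Instant::now();
    let result = receiver.recv_timeout(timeout);
    (result, start.elapsed())
}
```

Starting the timer after `recv_timeout()` returns would leave the wait out of the measurement, which is the behavior the commit changes.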
Yihau Chen 01460ef5cc
remove InetAddr from streamer/src/sendmmsg.rs (#557)
* remove InetAddr from streamer/src/sendmmsg.rs

* add ref links

* use SocketAddr conversion directly
2024-04-07 00:35:31 +08:00
Lijun Wang b443cfb0c7
Show staked vs nonstaked packets sent down/throttled (#600)
* Show staked vs nonstaked packets sent down

* add metrics on throttled staked vs non-staked
2024-04-05 13:49:23 -07:00
Yihau Chen 562254ef56
remove InetAddr from streamer/src/recvmmsg.rs (#558)
* remove InetAddr from streamer/src/recvmmsg.rs

* remove 'allow deprecated'

* add ref link
2024-04-05 16:51:30 +08:00
Lijun Wang 2b0391049d
transaction performance tracking -- streamer stage (#257)
* transaction performance tracking -- streamer stage
2024-04-04 13:19:13 -07:00
ryleung-solana 36e97654e3
Make the quic server connection table use an async lock, reducing thrashing (#293)
Make the quic server connection table use an async lock, reducing lock contention
2024-03-18 12:05:00 -07:00
steviez ce34f3f014
Rename and uniquify QUIC thread names (#28)
When viewing in various tools such as gdb and perf, it is not easy to
distinguish which threads are serving which function (TPU or TPU FWD)
2024-03-05 12:09:17 -06:00
steviez 7d6f1d5911
Give streamer::receiver() threads unique names (#35369)
The name was previously hard-coded to solReceiver. The use of the same
name makes it hard to figure out which thread is which when these
threads are handling many services (Gossip, Tvu, etc).
2024-03-01 13:36:08 -06:00
dependabot[bot] 7c59786f10
build(deps): bump indexmap from 2.1.0 to 2.2.2 (#35125)
* build(deps): bump indexmap from 2.1.0 to 2.2.2

Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.1.0 to 2.2.2.
- [Changelog](https://github.com/indexmap-rs/indexmap/blob/master/RELEASES.md)
- [Commits](https://github.com/indexmap-rs/indexmap/compare/2.1.0...2.2.2)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* [auto-commit] Update all Cargo lock files

* call swap_remove_entry directly

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-buildkite <dependabot-buildkite@noreply.solana.com>
Co-authored-by: yihau <yihau.chen@icloud.com>
2024-02-07 19:02:20 +00:00
Lijun Wang 8fde8d26c7
don't sign X.509 certs (#34896)
This gets rid of the third-party component rcgen in the path of private key access to make the code more secure.
2024-01-28 16:17:46 -08:00
Pankaj Garg 9edf65b877
Do not reserve QUIC stream capacity for unstaked client on forward port (#34779) 2024-01-16 16:35:29 -08:00
Pankaj Garg f92275bcaa
Fix determination of staked QUIC connections (#34760)
* Fix determination of staked QUIC connections

* address review comments

* review comments

* treat connections with zero stake as unstaked
2024-01-13 18:38:31 -08:00
Pankaj Garg 22fcffeea8
Move EMA and stream throttling code to a new file (#34759) 2024-01-11 16:54:54 -08:00
Pankaj Garg 904700cc56
Use EMA to compute QUIC streamer load for staked connections (#34586)
* Use EMA to compute QUIC streamer load for staked connections

* change min load to 25% of max load

* reduce max PPS from 500K to 250K

* update ema_function to account for missing intervals

* replace f64 math with u128

* track streams across all connections for a peer

* u128  -> u64

* replace ' as ' type conversion to from and try_from

* add counter for u64 overflow

* reset recent stream load on ema interval

* do not use same counter for unstaked connections from a peer IP
2024-01-11 10:05:38 -08:00
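The EMA-based load tracking described in #34586 can be sketched in integer math as below. This is an illustration under assumptions, not the streamer's implementation: `LoadEma`, `EMA_WINDOW`, and the smoothing factor 2 / (N + 1) with N = 10 are all hypothetical choices for the example.

```rust
/// Hypothetical smoothing window; alpha = 2 / (EMA_WINDOW + 1).
const EMA_WINDOW: u64 = 10;

/// Hypothetical tracker for an exponential moving average of streams per interval.
#[derive(Default)]
struct LoadEma {
    ema: u64,
}

impl LoadEma {
    /// Folds one interval's stream count into the average:
    /// ema = (2 * sample + (N - 1) * ema) / (N + 1).
    fn update(&mut self, streams_in_interval: u64) {
        // Widen to u128 so the intermediate sum cannot overflow.
        let numerator = 2 * u128::from(streams_in_interval)
            + u128::from(EMA_WINDOW - 1) * u128::from(self.ema);
        let next = numerator / u128::from(EMA_WINDOW + 1);
        // Convert back with try_from instead of `as`; saturate on failure.
        self.ema = u64::try_from(next).unwrap_or(u64::MAX);
    }

    /// Accounts for intervals that passed without an update by treating them
    /// as zero load before folding in the new sample.
    fn update_after(&mut self, streams_in_interval: u64, elapsed_intervals: u64) {
        for _ in 1..elapsed_intervals {
            if self.ema == 0 {
                break; // nothing left to decay
            }
            self.update(0);
        }
        self.update(streams_in_interval);
    }
}
```

Integer arithmetic keeps the result deterministic and avoids the f64 math that the commit replaces. The widening and `try_from` steps mirror the bullets about u128, u64, and avoiding `as` conversions.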
Lijun Wang 1a001751dd
add metrics on throttled streams (#34579) 2023-12-22 16:49:01 -08:00
Pankaj Garg 6bbd3661e1
Throttle unstaked quic streams for a given connection (#34562)
* Throttle unstaked quic streams for a given connection

* Fix interval duration check

* move wait to handle_chunk

* set max unistreams to 0

* drop new streams

* cleanup

* some more cleanup

* fix tests

* update test and stop code

* fix bench-tps
2023-12-21 18:47:52 -08:00
ryleung-solana 132c910f81
Quic update identity (#33865)
Update the Quic transport layer keypair and identity when the Validator's identity keypair is updated
2023-12-08 14:53:19 +08:00
Jeff Biseda 3f805ad06d
improve batch_send error handling (#33936) 2023-10-31 23:39:26 -07:00
Alexander Meißner 9e703f85de
Upgrades Rust to 1.72.0 & nightly-2023-08-25 (#32961)
* allow pedantic invalid cast lint

* allow lint with false-positive triggered by `test-case` crate

* nightly `fmt` correction

* adapt to rust layout changes

* remove dubious test

* Use transmute instead of pointer cast and de/ref when check_aligned is false.

* Renames clippy::integer_arithmetic to clippy::arithmetic_side_effects.

* bump rust nightly to 2023-08-25

* Upgrades Rust to 1.72.0

---------

Co-authored-by: Trent Nelson <trent@solana.com>
2023-09-01 07:26:13 +00:00
behzad nouri 4ec5ea6f7b
replaces assert!(matches!(...)) with assert_matches!(...) (#33068)
assert_matches!(...) provides more informative error message when it
fails and it is part of nightly rust:
https://doc.rust-lang.org/std/assert_matches/macro.assert_matches.html
2023-08-30 13:48:27 -04:00
Trent Nelson b8dc5daedb
preliminaries for bumping nightly to 2023-08-25 (#33047)
* remove unnecessary hashes around raw string literals

* remove unnecessary literal `unwrap()`s

* remove panicking `unwrap()`

* remove unnecessary `unwrap()`

* use `[]` instead of `vec![]` where applicable

* remove (more) unnecessary explicit `into_iter()` calls

* remove redundant pattern matching

* don't cast to same type and constness

* do not `cfg(any(...` a single item

* remove needless pass by `&mut`

* prefer `or_default()` to `or_insert_with(T::default())`

* `filter_map()` better written as `filter()`

* incorrect `PartialOrd` impl on `Ord` type

* replace "slow zero-filled `Vec` initializations"

* remove redundant local bindings

* add required lifetime to associated constant
2023-08-29 23:05:35 +00:00
Jon Cinque 0fe902ced7
Bump rand to 0.8, rand_chacha to 0.3, getrandom to 0.2 (#32871)
* sdk: Add concurrent support for rand 0.7 and 0.8

* Update rand, rand_chacha, and getrandom versions

* Run command to replace `gen_range`

Run `git grep -l gen_range | xargs sed -i'' -e 's/gen_range(\(\S*\), /gen_range(\1../'`

* sdk: Fix users of older `gen_range`

* Replace `hash::new_rand` with `hash::new_with_thread_rng`

Run:
```
git grep -l hash::new_rand | xargs sed -i'' -e 's/hash::new_rand([^)]*/hash::new_with_thread_rng(/'
```

* perf: Use `Keypair::new()` instead of `generate`

* Use older rand version in zk-token-sdk

* program-runtime: Inline random key generation

* bloom: Fix clippy warnings in tests

* streamer: Scope rng usage correctly

* perf: Fix clippy warning

* accounts-db: Map to char to generate a random string

* Remove `from_secret_key_bytes`, it's just `keypair_from_seed`

* ledger: Generate keypairs by hand

* ed25519-tests: Use new rand

* runtime: Use new rand in all tests

* gossip: Clean up clippy and inline keypair generators

* core: Inline keypair generation for tests

* Push sbf lockfile change

* sdk: Sort dependencies correctly

* Remove `hash::new_with_thread_rng`, use `Hash::new_unique()`

* Use Keypair::new where chacha isn't used

* sdk: Fix build by marking rand 0.7 optional

* Hardcode secret key length, add static assertion

* Unify `getrandom` crate usage to fix linking errors

* bloom: Fix tests that require a random hash

* Remove some dependencies, try to unify others

* Remove unnecessary uses of rand and rand_core

* Update lockfiles

* Add back some dependencies to reduce rebuilds

* Increase max rebuilds from 14 to 15

* frozen-abi: Remove `getrandom`

* Bump rebuilds to 17

* Remove getrandom from zk-token-proof
2023-08-21 19:11:21 +02:00
Lijun Wang b44c9bca89
Reduce max staked streams count to avoid fragmentations (#32771)
Reduce max staked concurrent streams to 512 from 2048.
2023-08-15 12:02:58 -07:00
steviez 0dd4c208e6
Remove redundant inc_new_counter for num received packets (#32664)
The same value is reported as a field in StreamerReceiveStats in
/streamer/src/streamer.rs
2023-07-31 09:53:17 -06:00
behzad nouri 868e086d75
upgrades quinn and rustls crates (#32499) 2023-07-14 17:30:57 +00:00
behzad nouri d54b6204be
removes instances of clippy::manual_let_else (#32417) 2023-07-09 21:41:36 +00:00
behzad nouri 0da01270ef
removes redundant recycler clones (#32401) 2023-07-06 18:25:20 +00:00
behzad nouri 5a80dc0d73
adds QUIC endpoint specific for turbine connections (#32294)
Working towards separating out turbine QUIC from TPU.
2023-07-03 18:57:18 +00:00
Lijun Wang 689ca503e2
Remove an unnecessary sleep in run server (#32216)
remove sleep; and handle initializing connection as soon as available
2023-06-22 15:18:05 -07:00
ryleung-solana 36222a44d7
Use QUIC Retry packets during handshake (#31802)
Have the Quic server send a Retry packet to verify client control of the source IP
2023-06-06 14:23:23 -07:00
Illia Bobyr 4353ac6797
Pass Arc<AtomicBool> by value, not by reference. (#31916)
`Arc` is already a reference internally, so it does not seem to be
beneficial to pass a reference to it.  Just adds an extra layer of
indirection.

Functions that need to be able to increment `Arc` reference count need
to take `Arc<AtomicBool>`, but those that just want to read the
`AtomicBool` value can accept `&AtomicBool`, making them a bit more
generic.

This change focuses specifically on `Arc<AtomicBool>`.  There are other
uses of `&Arc<T>` in the code base that could be converted in a similar
manner.  But it would make the change even larger.
2023-06-01 17:25:48 -07:00
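The distinction drawn in #31916 can be shown with a small sketch. The function names `should_exit` and `spawn_worker` are hypothetical and only illustrate the two signatures the commit message contrasts.

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};
use std::thread::{self, JoinHandle};

/// Only reads the flag, so a plain `&AtomicBool` is enough. Callers holding
/// an `Arc<AtomicBool>` can pass `&arc` thanks to deref coercion.
fn should_exit(exit: &AtomicBool) -> bool {
    exit.load(Ordering::Relaxed)
}

/// Moves the flag into another thread, so it needs its own reference count
/// and takes the `Arc` by value.
fn spawn_worker(exit: Arc<AtomicBool>) -> JoinHandle<u64> {
    thread::spawn(move || {
        let mut iterations = 0u64;
        while !should_exit(&exit) {
            iterations += 1;
            thread::yield_now();
        }
        iterations
    })
}
```

A caller clones the `Arc` once at the point where ownership is shared (`spawn_worker(exit.clone())`) and passes `&exit` everywhere else.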
Lijun Wang 0426a2d96e
Flaky quic test in check_block_multiple_connections (#31871)
Flaky quic test -- put the test code in else condition
2023-05-30 12:43:07 -07:00
behzad nouri f1ebc5b5c3
separates out quic streamer connection stats from different servers (#31797) 2023-05-25 16:54:24 +00:00
Lijun Wang a8e2b82e38
Expect errors when opening 2nd stream due to connection limit (#31706) 2023-05-19 08:24:52 -07:00
behzad nouri cb65a785bc
makes sockets in LegacyContactInfo private (#31248)
Working towards LegacyContactInfo => ContactInfo migration, the commit
hides some implementation details of LegacyContactInfo and expands API
parity with the new ContactInfo.
2023-04-21 15:39:16 +00:00