* Security: Limit reconnection rate to individual peers
Reconnection Rate
Limit the reconnection rate to each individual peer by applying the
liveness cutoff to the attempt, responded, and failure time fields.
If any field is recent, the peer is skipped.
The new liveness cutoff skips any peers that have recently been attempted
or failed. (Previously, the liveness check was only applied if the peer
was in the `Responded` state, which could lead to repeated retries of
`Failed` peers, particularly in small address books.)
Reconnection Order
Zebra prefers more useful peer states, then the earliest attempted,
failed, and responded times, then the most recent gossiped last seen
times.
Before this change, Zebra took the most recent time in all the peer time
fields, and used that time for liveness and ordering. This led to
confusion between trusted and untrusted data, and success and failure
times.
Unlike the previous order, the new order:
- tries all peers in each state, before re-trying any peer in that state,
and
- only checks the the gossiped untrusted last seen time
if all other times are equal.
* Preserve the later time if changes arrive out of order
* Update CandidateSet::next documentation
* Update CandidateSet state diagram
* Fix variant names in comments
* Explain why timestamps can be left out of MetaAddrChanges
* Add a simple test for the individual peer retry limit
* Only generate valid Arbitrary PeerServices values
* Add an individual peer retry limit AddressBook and CandidateSet test
* Stop deleting recently live addresses from the address book
If we delete recently live addresses from the address book, we can get a
new entry for them, and reconnect too rapidly.
* Rename functions to match similar tokio API
* Fix docs for service sorting
* Clarify a comment
* Cleanup a variable and comments
* Remove blank lines in the CandidateSet state diagram
* Add a multi-peer proptest that checks outbound attempt fairness
* Fix a comment typo
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Simplify time maths in MetaAddr
* Create a Duration32 type to simplify calculations and comparisons
* Rename variables for clarity
* Split a string constant into multiple lines
* Make constants match rustdoc order
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Security: stop gossiping failure and attempt times as last_seen times
Previously, Zebra had a single time field for peer addresses, which was
updated every time a peer was attempted, sent a message, or failed.
This is a security issue, because the `last_seen` time should be
"the last time [a peer] connected to that node", so that
"nodes can use the time field to avoid relaying old 'addr' messages".
So Zebra was sending incorrect peer information to other nodes.
As part of this change, we split the `last_seen` time into the
following fields:
- untrusted_last_seen: gossiped from other peers
- last_response: time we got a response from a directly connected peer
- last_attempt: time we attempted to connect to a peer
- last_failure: time a connection with a peer failed
* Implement Arbitrary and strategies for MetaAddrChange
Also replace the MetaAddr Arbitrary impl with a derive.
* Write proptests for MetaAddr and MetaAddrChange
MetaAddr:
- the only times that get included in serialized MetaAddrs are
the untrusted last seen and responded times
MetaAddrChange:
- the untrusted last seen time is never updated
- the services are only updated if there has been a handshake
When peers ask for peer addresses, add our local listener address to the
set of addresses, sanitize, then truncate. Sanitize shuffles addresses,
so if there are lots of addresses in the address book, our address will
only be sent to some peers.
- stop putting inbound addresses in the address book
- drop address book entries that can't be used for outbound connections
- distinguish between temporary inbound and permanent outbound peer
addresses
- also create variants to handle proxy connections
(but don't use them yet)
- avoid tracking connection state for isolated connections
- document security constraints for the address book and peer set
`sanitize` could be misused in two ways:
* accidentally modifying the addresses in the address book itself
* forgetting to sanitize new fields added to `MetaAddr`
This change prevents accidental modification by taking `&self`, and
explicitly creates a new sanitized `MetaAddr` with all fields listed.
Design:
- Add a `PeerAddrState` to each `MetaAddr`
- Use a single peer set for all peers, regardless of state
- Implement time-based liveness as an `AddressBook` method, rather than
a `PeerAddrState` variant
- Delete `AddressBook.by_state`
Implementation:
- Simplify `AddressBook` changes using `update` and `take` modifier
methods
- Simplify the `AddressBook` iterator implementation, replacing it with
methods that are more obviously correct
- Consistently collect peer set metrics
Documentation:
- Expand and update the peer set documentation
We can optimise later, but for now we want simple code that is more
obviously correct.
* network: move gossiped peer selection logic into address book.
* network: return BoxService from init.
* zebrad: add note on why we truncate thegossiped peer list
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* Remove unused .rustfmt.toml
Many of these options are never actually loaded by our CI because of a channel
mismatch, where they're not applied on stable but only on nightly (see the logs
from a rustfmt job). This means that we can get different settings when
running `cargo fmt` on the nightly and stable channels, which was causing a CI
failure on this PR. Reverting back to the default rustfmt settings avoids this
problem and keeps us in line with upstream rustfmt. There's no loss to us
since we were using the defaults anyways.
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
The previous implementation failed when timestamps were duplicated between
peers, because there was not a 1-1 relationship between timestamps and peers.
The disconnected_peers() function allows us to prevent duplicate
connections without maintaining shared state between the peerset and the
dial-additional-peers task.
This allows us to hide the `TimestampCollector` and to expose only the
address book data required by the inbound request service. It also lets
us have a common data structure (the `AddressBook`) for collecting peer
information that can be used to manage information that other peers
report to us.