Commit Graph

13 Commits

Author SHA1 Message Date
Jane Lusby 685bdaf2df don't require absense of cancel handles
Prior to this change, we required that services that are canceled do not
have a cancel handle in the `cancel_handles` list, based on the
assumption that the handle must have been removed in the process of
canceling this service.

This doesn't holding up though, because it is currently possible for us
to have the same peer connect to us multiple times, the second connect
removes the cancel handle of the original connect and inserts it's own
cancel handle in its place. In this scenario, when the first service is
polled for readiness it will see that it has been canceled and go to
clean itself up, but when it asserts that it doesn't have a cancel
handle it will see the cancel handle of the second connect event, which
uses the same key as the first connect, and fail its debug assertion.

This change removes that debug assert on the assumption that it is okay
for a peer to connect multiple times consecutively, and that the correct
behavior in that case is to just cancel the first connection and
continue as normal.
2020-06-16 13:42:31 -07:00
Jane Lusby 431f194c0f
propagate errors out of zebra_network::init (#435)
Prior to this change, the service returned by `zebra_network::init` would spawn background tasks that could silently fail, causing unexpected errors in the zebra_network service.

This change modifies the `PeerSet` that backs `zebra_network::init` to store all of the `JoinHandle`s for each background task it depends on. The `PeerSet` then checks this set of futures to see if any of them have exited with an error or a panic, and if they have it returns the error as part of `poll_ready`.
2020-06-09 12:24:28 -07:00
Jane Lusby 8c178c3ee4
fix panic in seed subcommand (#401)
Co-authored-by: Jane Lusby <jane@zfnd.org>

Prior to this change, the seed subcommand would consistently encounter a panic in one of the background tasks, but would continue running after the panic. This is indicative of two bugs. 

First, zebrad was not configured to treat panics as non recoverable and instead defaulted to the tokio defaults, which are to catch panics in tasks and return them via the join handle if available, or to print them if the join handle has been discarded. This is likely a poor fit for zebrad as an application, we do not need to maximize uptime or minimize the extent of an outage should one of our tasks / services start encountering panics. Ignoring a panic increases our risk of observing invalid state, causing all sorts of wild and bad bugs. To deal with this we've switched the default panic behavior from `unwind` to `abort`. This makes panics fail immediately and take down the entire application, regardless of where they occur, which is consistent with our treatment of misbehaving connections.

The second bug is the panic itself. This was triggered by a duplicate entry in the initial_peers set. To fix this we've switched the storage for the peers from a `Vec` to a `HashSet`, which has similar properties but guarantees uniqueness of its keys.
2020-05-27 17:40:12 -07:00
Henry de Valence 3ed75cb626 Tweak peer set metrics.
- Add a total peers metric to prevent races between measurements of
  ready/unready peers (which can cause the sum to be wrong).
- Add an outbound request counter.
2020-02-21 06:48:25 -05:00
Henry de Valence 75d3d44fb3 Metrics MVP: add two metrics and export them to Prometheus.
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2020-02-14 20:14:05 -05:00
Henry de Valence 8d58dd804f Note that tracing causes clippy false positives
Thanks @hawkw for pointing this out.
2020-02-05 12:42:32 -08:00
Henry de Valence f04f4f0b98 Apply clippy fixes 2020-02-05 12:42:32 -08:00
Henry de Valence 2965187b91 Upgrade tokio, futures, hyper to released versions. 2019-12-13 17:42:15 -05:00
Henry de Valence c3ec235a5b Suppress unused import warnings. 2019-10-22 19:06:08 -07:00
Henry de Valence ed2ee9d42f Add a PeerConnector wrapper around PeerHandshake 2019-10-22 19:06:08 -07:00
Henry de Valence b1832ce593 Initial work to add a crawl-and-dial task.
This responds to peerset demand by connecting to additional peers.

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
2019-10-22 19:06:08 -07:00
Henry de Valence 5847b490da Move PeerSet setup logic into a peer_set::init() 2019-10-18 16:11:01 -07:00
Henry de Valence ae1a164ff8
Beginning of peerset implementation. (#62)
* Don't expose submodules of zebra_network::peer.

* PeerSet, PeerDiscover stubs.

Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>

* Initial work on PeerSet.

This is adapted from the MIT-licensed tower-balance implementation.

* Use PeerSet in the connect stub.
2019-10-10 18:15:24 -07:00