This makes all tower subcrates have the following lints as warn (rather
than allow): `missing_docs`, `rust_2018_idioms`, `unreachable_pub`, and
`missing_debug_implementations`. In addition, it consistently applies
`deny(warnings)` *only* under CI so that deprecations and macro changes in minor
version bumps in dependencies will never cause `tower` crates to stop
compiling, and so that tests can be run even if not all warnings have been
dealt with. See also https://github.com/rust-unofficial/patterns/blob/master/anti_patterns/deny-warnings.md
Note that `tower-reconnect` has the `missing_docs` lint disabled for now
since it contained _no_ documentation previously. Also note that this
patch does not add documentation to the various `new` methods, as they
are considered self-explanatory. They are instead marked as
`#[allow(missing_docs)]`.
As described in #286, `Balance` had a few problems:
- it was responsible for driving all inner services to readiness, making
its `poll_ready` O(n) and not O(1);
- the `choose` abstraction was a hindrance. If a round-robin balancer
is needed it can be implemented separately without much duplicate
code; and
- endpoint errors were considered fatal to the balancer.
This change replaces `Balance` with `p2c::Balance` and removes the
`choose` module.
Endpoint service failures now cause the service to be removed from the
balancer gracefully.
Endpoint selection is now effectively constant time, though it biases
for availability in the case when random selection does not yield an
available endpoint.
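
For illustration, the selection step can be sketched roughly as follows. This is a std-only sketch, not the code in this change: `Endpoint`, `XorShift`, and `p2c` are hypothetical names, and modelling "not ready" as `None` is an assumption made for the example.

```rust
/// Hypothetical stand-in for an endpoint: `None` means "not ready".
pub struct Endpoint {
    pub load: Option<usize>,
}

/// Tiny xorshift generator so the sketch needs no external crates.
/// The seed must be nonzero.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_below(&mut self, n: usize) -> usize {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % n as u64) as usize
    }
}

/// Sample two distinct endpoints and return the index of the better one.
/// A ready endpoint always beats one that is not ready; between two ready
/// endpoints the lower load wins. Returns `None` if neither sample is ready.
pub fn p2c(endpoints: &[Endpoint], rng: &mut XorShift) -> Option<usize> {
    match endpoints.len() {
        0 => return None,
        1 => return endpoints[0].load.map(|_| 0),
        _ => {}
    }
    let a = rng.next_below(endpoints.len());
    // Pick `b` from the remaining slots so that `a != b`.
    let mut b = rng.next_below(endpoints.len() - 1);
    if b >= a {
        b += 1;
    }
    match (endpoints[a].load, endpoints[b].load) {
        (Some(la), Some(lb)) => Some(if la <= lb { a } else { b }),
        (Some(_), None) => Some(a),
        (None, Some(_)) => Some(b),
        (None, None) => None,
    }
}
```

Only two endpoints are inspected per selection, so the cost does not grow with the number of endpoints.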
`tower-test` had to be updated so that a mocked service could fail after
advertising readiness.
The tower-balance crate includes the `Load` and `Instrument` traits,
which are likely useful outside of balancers and certainly have no
tight coupling with any specific balancer implementation. This change
extracts these protocol-agnostic traits into a dedicated crate.
The `Load` trait includes a latency-aware _PeakEWMA_ load strategy as
well as a simple _PendingRequests_ strategy for latency-agnostic
applications.
The `Instrument` trait is used by both of these strategies to track
in-flight requests without knowing protocol details. It is expected that
protocol-specific crates will provide, for instance, HTTP
time-to-first-byte latency strategies.
A default `NoInstrument` implementation tracks a request until its
response future is satisfied.
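
A rough, std-only sketch of how a pending-request count and an instrumentation hook might fit together is shown below. `RefCount` and `Handle` are hypothetical helper names, and the trait signature here is an assumption for illustration, not necessarily the one in the crate.

```rust
use std::sync::Arc;

/// Shared token; the number of live handles is the number of in-flight requests.
#[derive(Clone, Default)]
pub struct RefCount(Arc<()>);

/// Attached to one request; dropping it marks the request as complete.
pub struct Handle {
    _token: RefCount,
}

impl RefCount {
    /// Start tracking one request.
    pub fn handle(&self) -> Handle {
        Handle { _token: self.clone() }
    }

    /// Number of outstanding handles (the pending-request load).
    pub fn pending(&self) -> usize {
        // Subtract the tracker's own reference.
        Arc::strong_count(&self.0) - 1
    }
}

/// Decides how long a handle stays attached to a response value.
pub trait Instrument<H, V> {
    type Output;
    fn instrument(&self, handle: H, value: V) -> Self::Output;
}

/// Drops the handle as soon as the response value is available.
pub struct NoInstrument;

impl<H, V> Instrument<H, V> for NoInstrument {
    type Output = V;
    fn instrument(&self, handle: H, value: V) -> V {
        drop(handle);
        value
    }
}
```

A protocol-specific implementation could instead move the handle into the response body so that the request stays counted until the body is consumed.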
This crate should only be published once tower-balance is published.
Part of https://github.com/tower-rs/tower/issues/286
In order to implement red-line testing, blue-green deployments, and
other operational use cases, many service discovery and routing schemes
support endpoint weighting.
In this iteration, we provide a decorator type, `WeightedLoad`, that may
be used to wrap load-bearing services to alter their load according to a
weight. The `WithWeighted` type may also be used to wrap `Discover`
implementations, in which case it will wrap all new services with
`WeightedLoad`.
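
As a minimal sketch of the decorator idea, assuming that a weight scales load by division (so a higher weight makes an endpoint look less loaded): the `Load` trait is simplified to return `f64`, and `Weighted` and `Constant` are hypothetical names used only for this example.

```rust
/// Simplified load trait for the sketch: higher means busier.
pub trait Load {
    fn load(&self) -> f64;
}

/// Wraps a load-bearing value and divides its load by a weight.
pub struct Weighted<T> {
    inner: T,
    weight: f64,
}

impl<T> Weighted<T> {
    pub fn new(inner: T, weight: f64) -> Self {
        assert!(weight > 0.0, "weight must be positive");
        Weighted { inner, weight }
    }
}

impl<T: Load> Load for Weighted<T> {
    fn load(&self) -> f64 {
        // Weight 2.0 halves the apparent load, so the endpoint is chosen
        // more often by a least-loaded balancer.
        self.inner.load() / self.weight
    }
}

/// Test double that reports a fixed load.
pub struct Constant(pub f64);

impl Load for Constant {
    fn load(&self) -> f64 {
        self.0
    }
}
```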
This patch adds a new type, `Pool`, which wraps a
`tower_balance::Balance` and a `tower_service::NewService` together so
that new `Service` instances are added when load is high, and removed
again if load is low.
The pool uses an exponentially weighted moving average of successful
calls to `poll_ready` on the underlying `Balance` to estimate whether
there are enough services available. If `poll_ready` frequently returns
`NotReady`, then a new service is produced, whereas if `poll_ready`
pretty much never returns `NotReady`, the most recently added service is
removed from the pool (down to a minimum of 1).
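
The scaling decision can be sketched as below. This is an illustration under assumptions, not the code in this patch: the sketch tracks the fraction of `NotReady` results, and `ReadyEwma`, `Action`, `alpha`, `low`, and `high` are hypothetical names and parameters.

```rust
#[derive(Debug, PartialEq)]
pub enum Action {
    Grow,
    Shrink,
    Hold,
}

/// Moving average of `poll_ready` outcomes: 1.0 = never ready, 0.0 = always ready.
pub struct ReadyEwma {
    ewma: f64,
    alpha: f64,
    low: f64,
    high: f64,
}

impl ReadyEwma {
    pub fn new(alpha: f64, low: f64, high: f64) -> Self {
        // Start halfway between the thresholds so neither fires at once.
        ReadyEwma { ewma: (low + high) / 2.0, alpha, low, high }
    }

    /// Fold in one `poll_ready` result and decide what the pool should do.
    pub fn observe(&mut self, not_ready: bool, services: usize) -> Action {
        let sample = if not_ready { 1.0 } else { 0.0 };
        self.ewma = self.alpha * sample + (1.0 - self.alpha) * self.ewma;
        if self.ewma > self.high {
            Action::Grow
        } else if self.ewma < self.low && services > 1 {
            // Never shrink below a single service.
            Action::Shrink
        } else {
            Action::Hold
        }
    }
}
```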
This updates the `Service` contract requiring `poll_ready` to be called
before `call`. This allows `Service::call` to panic in the event the
user of the service omits `poll_ready` or does not wait until `Ready` is
observed.
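
A simplified, synchronous sketch of what the contract permits follows. `Readiness` stands in for the futures `Async` type, and `OneAtATime` is a hypothetical service invented for this example.

```rust
/// Stand-in for the readiness result of `poll_ready`.
#[derive(Debug, PartialEq)]
pub enum Readiness {
    Ready,
    NotReady,
}

/// Toy service that reserves capacity in `poll_ready` and spends it in `call`.
pub struct OneAtATime {
    polls_until_ready: u32,
    reserved: bool,
}

impl OneAtATime {
    pub fn new(polls_until_ready: u32) -> Self {
        OneAtATime { polls_until_ready, reserved: false }
    }

    pub fn poll_ready(&mut self) -> Readiness {
        if self.polls_until_ready > 0 {
            self.polls_until_ready -= 1;
            return Readiness::NotReady;
        }
        self.reserved = true;
        Readiness::Ready
    }

    /// Panics if the caller skipped `poll_ready` or ignored `NotReady`.
    pub fn call(&mut self, req: u32) -> u32 {
        assert!(self.reserved, "call invoked before poll_ready returned Ready");
        self.reserved = false;
        req + 1
    }
}
```

Because `call` may assume readiness, it does not need a fallback path for the case where no capacity was reserved.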
This patch adds the `DirectService` trait, and related implementations
over it in `tower_balance` and `tower_buffer`. `DirectService` is
similar to a `Service`, but must be "driven" through calls to
`poll_service` for the futures returned by `call` to make progress.
The motivation behind adding this trait is that many current `Service`
implementations spawn long-running futures when the service is created,
which then drive the work necessary to turn requests into responses. A
simple example of this is a service that writes requests over a
`TcpStream` and reads responses over that same `TcpStream`. The
underlying stream must be read from to discover new responses, but there
is no single entity to drive that task. The returned futures would share
access to the stream (and worse yet, may get responses out of order),
and the service itself is not guaranteed to see any more calls to it as
the client is waiting for its requests to finish.
`DirectService` solves this by introducing a new method, `poll_service`,
which must be called to make progress on in-progress futures.
Furthermore, like `Future::poll`, `poll_service` must be called whenever
the associated task is notified so that the service can also respect
time-based operations like heartbeats.
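
The driving requirement can be illustrated with a synchronous analogy. This is only a sketch: tickets take the place of response futures, and `Echo`, `poll_response`, and the one-request-per-poll behaviour are assumptions made for the example.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy service whose responses only appear when `poll_service` is called.
pub struct Echo {
    next_ticket: usize,
    in_flight: VecDeque<(usize, String)>,
    done: HashMap<usize, String>,
}

impl Echo {
    pub fn new() -> Self {
        Echo { next_ticket: 0, in_flight: VecDeque::new(), done: HashMap::new() }
    }

    /// Enqueue a request and return a ticket for its eventual response.
    pub fn call(&mut self, req: &str) -> usize {
        let ticket = self.next_ticket;
        self.next_ticket += 1;
        self.in_flight.push_back((ticket, req.to_string()));
        ticket
    }

    /// Analogue of `poll_service`: process one queued request, in order.
    pub fn poll_service(&mut self) {
        if let Some((ticket, req)) = self.in_flight.pop_front() {
            self.done.insert(ticket, req.to_uppercase());
        }
    }

    /// Analogue of polling a response future; `None` means not ready yet.
    pub fn poll_response(&mut self, ticket: usize) -> Option<String> {
        self.done.remove(&ticket)
    }
}
```

Without calls to `poll_service`, `poll_response` never yields anything, which is the property the trait makes explicit.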
The PR includes changes to both `tower_balance::Balance` and
`tower_buffer::Buffer` to add support for wrapping `DirectService`s. For
`Balance` this is straightforward: if the inner service is a `Service`,
the `Balance` also implements `Service`; if the inner service is a
`DirectService`, the `Balance` is itself also a `DirectService`. For
`Buffer`, this is more involved, as a `Buffer` turns any `DirectService`
*into* a `Service`. The `Buffer`'s `Worker` is spawned, and will
therefore drive the wrapped `DirectService`.
One complication arises in that `Buffer<T>` requires that `T: Service`,
but you can safely construct a `Buffer` over a `DirectService` per the
above. `Buffer` works around this by exposing
```rust
impl<S> Service for HandleTo<S> where S: DirectService {}
```
and giving out `Buffer<HandleTo<S>>` when the `new_directed(s: S)`
constructor is invoked. Since `Buffer` never calls any methods on the
service it wraps, `HandleTo`'s implementation just consists of calls to
`unreachable!()`.
Note that `tower_buffer` now also includes a `DirectedService` type,
which is a wrapper around a `Service` that implements `DirectService`.
In theory, we could do away with this by adding a blanket impl:
```rust
impl<T> DirectService for T where T: Service {}
```
but until we have specialization, this would prevent downstream users
from implementing `DirectService` themselves.
Finally, this also makes `Buffer` use a bounded mpsc channel, which
introduces a new capacity argument to `Buffer::new`.
Fixes #110.
This changes the Service request type to a generic instead of an associated
type. This is more appropriate as requests are inputs to the service.
This change enables a single implementation of `Service` to accept many
kinds of request types. This also enables requests to be references.
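
A reduced sketch of the shape this enables is shown below. The trait is cut down to a single `call` method with a `Response` associated type (the error and future types are omitted), and `Len` is a hypothetical service.

```rust
/// Reduced form of the trait: the request is a type parameter.
pub trait Service<Request> {
    type Response;
    fn call(&mut self, req: Request) -> Self::Response;
}

/// One type that serves two different request types.
pub struct Len;

impl Service<String> for Len {
    type Response = usize;
    fn call(&mut self, req: String) -> usize {
        req.len()
    }
}

// With a generic parameter, the request may also be a borrowed value.
impl<'a> Service<&'a [u8]> for Len {
    type Response = usize;
    fn call(&mut self, req: &'a [u8]) -> usize {
        req.len()
    }
}
```

With an associated `Request` type, `Len` could implement `Service` only once, and a borrowed request would have no place to name its lifetime.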
Fixes #99
The balancer provides an implementation of two load balancing strategies: RoundRobin and
P2C+LeastLoaded. The round-robin strategy is extremely simplistic and not sufficient for
most production systems. P2C+LL is a substantial improvement, but relies exclusively on
instantaneous information.
This change introduces P2C+PeakEWMA strategy. P2C+PE improves over P2C+LL by maintaining
an exponentially-weighted moving average of response latencies for each endpoint so that
the recent history directly factors into load balancing decisions. This technique was
pioneered by Finagle for use at Twitter. [Finagle's P2C+PE implementation][finagle] was
referenced heavily while developing this.
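
The core of the estimator can be sketched as follows. This is a simplified, std-only illustration rather than the implementation in this change: time is passed in explicitly, and `PeakEwma`, `decay_ms`, and the `pending + 1` cost factor are assumptions made for the example.

```rust
/// Latency estimate that jumps to peaks immediately and decays over time.
pub struct PeakEwma {
    estimate_ms: f64,
    decay_ms: f64,
}

impl PeakEwma {
    pub fn new(initial_ms: f64, decay_ms: f64) -> Self {
        PeakEwma { estimate_ms: initial_ms, decay_ms }
    }

    /// Fold in one observed latency, `elapsed_ms` after the previous update.
    pub fn observe(&mut self, rtt_ms: f64, elapsed_ms: f64) {
        if rtt_ms > self.estimate_ms {
            // Peaks are adopted at once so a slow endpoint is penalized quickly.
            self.estimate_ms = rtt_ms;
        } else {
            // Older history counts for less the more time has passed.
            let w = (-elapsed_ms / self.decay_ms).exp();
            self.estimate_ms = self.estimate_ms * w + rtt_ms * (1.0 - w);
        }
    }

    /// Cost of sending one more request: estimate scaled by in-flight requests.
    pub fn load(&self, pending: u32) -> f64 {
        self.estimate_ms * f64::from(pending + 1)
    }
}
```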
The provided demo can be used to illustrate the differences between load balancing
strategies. For example:
```
REQUESTS=50000
CONCURRENCY=50
ENDPOINT_CAPACITY=50
MAX_ENDPOINT_LATENCIES=[1ms, 10ms, 10ms, 10ms, 10ms, 100ms, 100ms, 100ms, 100ms, 1000ms, ]
P2C+PeakEWMA
wall 15s
p50 5ms
p90 56ms
p95 78ms
p99 96ms
p999 105ms
P2C+LeastLoaded
wall 18s
p50 5ms
p90 57ms
p95 80ms
p99 98ms
p999 857ms
RoundRobin
wall 72s
p50 9ms
p90 98ms
p95 496ms
p99 906ms
p999 988ms
```
[numbers]: https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
[finagle]: 9cc08d1521/finagle-core/src/main/scala/com/twitter/finagle/loadbalancer/PeakEwma.scala
Previously, `power_of_two_choices` and `round_robin` constructors were
exposed from the crate scope.
These have been replaced by `Balance::p2c`, `Balance::p2c_from_rng`, and
`Balance::round_robin`.
When debugging load balancer behavior, it's convenient to observe the
individual node selection decisions. To that end, this change requires
that `Load::Metric` implement `fmt::Debug` when used by
`PowerOfTwoChoices`.
In preparation for additional load balancing strategies, the demo is
being updated to allow for richer testing in several important ways:
- Adopt the new `tokio` multithreaded runtime.
- Use `tower-buffer` to drive each simulated endpoint on an independent
task. This fixes a bug where requests appeared active longer than
intended (while waiting for the SendRequests task to process responses).
- A top-level concurrency limit has been added (by wrapping the balancer in
`tower-in-flight-limit`) so that `REQUESTS` futures are no longer all created
immediately, which also caused incorrect load measurements.
- Endpoints are also constrained with `tower-in-flight-limit`. By
default, the limit is that of the load balancer (so endpoints are
effectively unlimited).
- The `demo.rs` script has been reorganized to account for the new
runtime, such that all examples are one task chain.
- New output format:
```
REQUESTS=50000
CONCURRENCY=50
ENDPOINT_CAPACITY=50
MAX_ENDPOINT_LATENCIES=[1ms, 10ms, 10ms, 10ms, 10ms, 100ms, 100ms, 100ms, 100ms, 1000ms, ]
P2C+LeastLoaded
wall 18s
p50 5ms
p90 56ms
p95 80ms
p99 98ms
p999 900ms
RoundRobin
wall 72s
p50 9ms
p90 98ms
p95 488ms
p99 898ms
p999 989ms
```
`PowerOfTwoChoices` requires a Random Number Generator. In order for
this randomization source to be configurable (e.g. for tests),
`PowerOfTwoChoices` is generic over its implementation of `rand::Rng`;
however, this leads to needless boilerplate when building P2C balancers.
Because load balancers do not need a cryptographically strong RNG, we
can use `rand::SmallRng` (which is `Send + Sync`). `PowerOfTwoChoices`
exposes constructors that take a `SmallRng`.
In order to do this, the `tower-balance` crate now requires `rand = "0.5"`.
When an endpoint's state changes in some way, it may need to be rebound to a
new service, and reinserted into the load balancer. This PR changes
`tower-balance` so that, rather than ignoring duplicate `Insert`s, the new
endpoint replaces the old endpoint. The new endpoint is always placed on the
not-ready list; if the replaced endpoint was on the ready list, it is removed
prior to inserting the new endpoint into the not-ready list.
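
The replacement rule can be sketched with plain vectors. This is an illustration only: `Endpoints`, `promote`, and the `Vec`-backed lists are assumptions for the example and say nothing about the data structures the crate actually uses.

```rust
/// Two lists of keyed services, as in the description above.
pub struct Endpoints<K, S> {
    ready: Vec<(K, S)>,
    not_ready: Vec<(K, S)>,
}

impl<K: PartialEq, S> Endpoints<K, S> {
    pub fn new() -> Self {
        Endpoints { ready: Vec::new(), not_ready: Vec::new() }
    }

    /// Insert a service, replacing any existing one with the same key.
    /// The new service always starts on the not-ready list.
    pub fn insert(&mut self, key: K, svc: S) -> Option<S> {
        let mut replaced = None;
        if let Some(i) = self.ready.iter().position(|(k, _)| *k == key) {
            replaced = Some(self.ready.remove(i).1);
        } else if let Some(i) = self.not_ready.iter().position(|(k, _)| *k == key) {
            replaced = Some(self.not_ready.remove(i).1);
        }
        self.not_ready.push((key, svc));
        replaced
    }

    /// Move a service to the ready list; returns false if the key is unknown.
    pub fn promote(&mut self, key: &K) -> bool {
        match self.not_ready.iter().position(|(k, _)| k == key) {
            Some(i) => {
                let entry = self.not_ready.remove(i);
                self.ready.push(entry);
                true
            }
            None => false,
        }
    }

    pub fn num_ready(&self) -> usize {
        self.ready.len()
    }

    pub fn num_not_ready(&self) -> usize {
        self.not_ready.len()
    }
}
```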
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
I've implemented `std::error::Error` for the error types in the `tower-balance`, `tower-buffer`, `tower-in-flight-limit`, and `tower-reconnect` middleware crates.
This is required upstream for runconduit/conduit#442, and also just generally seems like the right thing to do as a library.
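
For reference, the general shape of such an implementation is sketched below. The `Error` enum and its variants are hypothetical and do not mirror any one crate's error type, and the sketch uses the current `source` method, which may differ from the trait methods available when this change was written.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Hypothetical middleware error that may wrap an inner service error.
#[derive(Debug)]
pub enum Error<T> {
    Inner(T),
    NotReady,
}

impl<T: fmt::Display> fmt::Display for Error<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Inner(e) => write!(f, "inner service error: {}", e),
            Error::NotReady => write!(f, "no endpoints ready"),
        }
    }
}

impl<T: StdError + 'static> StdError for Error<T> {
    /// Expose the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        match self {
            Error::Inner(e) => Some(e),
            Error::NotReady => None,
        }
    }
}
```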
Test `Balance::poll_ready` by creating a random number of services, each
of which must be polled a random number of times before becoming ready.
As the balancer is polled, the test ensures that it does not become
ready prematurely and that services are promoted from not_ready to
ready.
`Balance::num_ready()` and `Balance::is_ready()` have been added to
expose the number of ready services, as well as `Balance::num_not_ready()`
to expose the number of pending services.
The new _demo_ example sends a million simulated requests through each
load balancer configuration and records the observed latency
distributions.
Furthermore, this fixes a critical bug in `Balancer`, where we did not
properly iterate through not-ready nodes.
* Use (0..n-1).rev() to iterate from right-to-left
Previously, tower-balance used a fixed round-robin strategy for load
distribution.
This change makes `Balance` generic over its load metric and selection
strategy. The following new traits have been introduced to satisfy this:
- `tower_balance::Load` provides a concrete load metric (i.e. for a service); and
- `tower_balance::Choose` provides a strategy for selecting a node.
There are two load balancing configurations supported out-of-the-box:
- `tower_balance::round_robin` provides a load-ignorant round-robin balancer.
- `tower_balance::power_of_two_choices` uses the Power of Two Choices to
distribute requests to the least-loaded node. This should be used in conjunction
with `tower_balance::load::WithPendingRequests` to decorate a `Discover` instance
so that all services it produces implement `Load`.
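
To make the role of a selection strategy concrete, here is a small sketch of a round-robin chooser. The `Choose` signature below (an index into the ready set) is an assumption for illustration and is not claimed to match the trait introduced here.

```rust
/// Hypothetical selection strategy: pick an index among `ready` endpoints.
pub trait Choose {
    /// `ready` must be nonzero; the result is in `0..ready`.
    fn choose(&mut self, ready: usize) -> usize;
}

/// Ignores load entirely and cycles through the ready endpoints.
#[derive(Default)]
pub struct RoundRobin {
    pos: usize,
}

impl Choose for RoundRobin {
    fn choose(&mut self, ready: usize) -> usize {
        assert!(ready > 0, "no ready endpoints to choose from");
        // Wrap around, which also handles the ready set shrinking.
        let idx = self.pos % ready;
        self.pos = idx + 1;
        idx
    }
}
```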