2017-10-05 13:41:44 -07:00
[package]
name = "tower-balance"
version = "0.1.0"
authors = ["Carl Lerche <me@carllerche.com>"]
2019-04-09 10:59:30 -07:00
license = "MIT"
2017-11-16 09:40:32 -08:00
publish = false
2019-04-08 20:11:09 -07:00
edition = "2018"
2017-10-05 13:41:44 -07:00

[dependencies]
futures = "0.1"
2018-02-23 20:24:22 -08:00
log = "0.4.1"
2019-01-11 10:12:24 -08:00
rand = "0.6"
balance: Implement a Peak-EWMA load metric (#76)
The balancer provides an implementation of two load balancing strategies: RoundRobin and
P2C+LeastLoaded. The round-robin strategy is extremely simplistic and not sufficient for
most production systems. P2C+LL is a substantial improvement, but relies exclusively on
instantaneous information.
This change introduces the P2C+PeakEWMA strategy. P2C+PE improves over P2C+LL by maintaining
an exponentially-weighted moving average of response latencies for each endpoint so that
the recent history directly factors into load balancing decisions. This technique was
pioneered by Finagle for use at Twitter. [Finagle's P2C+PE implementation][finagle] was
referenced heavily while developing this.
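As a rough illustration of the idea described above, here is a minimal sketch of a Peak-EWMA load estimate combined with a two-choice comparison. It is not the code from this change: the type `PeakEwma`, its fields, the methods `start`, `observe`, and `load`, and the helper `pick_less_loaded` are all hypothetical names, time is passed in as plain seconds to keep the example self-contained, and the decay window `tau` is assumed to be greater than zero. The "peak" behaviour assumed here is that a response slower than the current estimate replaces the estimate outright, while faster responses only pull it down gradually.

```rust
use std::time::Duration;

/// Hypothetical per-endpoint state; all names here are illustrative only.
#[derive(Debug, Clone)]
pub struct PeakEwma {
    /// Current latency estimate, in seconds.
    cost: f64,
    /// Decay window, in seconds (assumed to be greater than zero).
    tau: f64,
    /// Time of the last observation, in seconds since an arbitrary epoch.
    last_update: f64,
    /// Requests currently in flight to this endpoint.
    pending: u32,
}

impl PeakEwma {
    pub fn new(initial: Duration, tau: Duration, now: f64) -> Self {
        PeakEwma {
            cost: initial.as_secs_f64(),
            tau: tau.as_secs_f64(),
            last_update: now,
            pending: 0,
        }
    }

    /// Marks a request as dispatched to this endpoint.
    pub fn start(&mut self) {
        self.pending += 1;
    }

    /// Records a completed response that took `rtt`, observed at time `now`.
    pub fn observe(&mut self, rtt: Duration, now: f64) {
        self.pending = self.pending.saturating_sub(1);
        let rtt = rtt.as_secs_f64();
        let elapsed = (now - self.last_update).max(0.0);
        self.last_update = now;
        if rtt > self.cost {
            // Peak: a slower response replaces the estimate immediately.
            self.cost = rtt;
        } else {
            // Otherwise decay toward the new sample; the old estimate's
            // weight shrinks as more time passes since the last update.
            let w = (-elapsed / self.tau).exp();
            self.cost = self.cost * w + rtt * (1.0 - w);
        }
    }

    /// Load metric: the latency estimate scaled by outstanding work.
    pub fn load(&self) -> f64 {
        self.cost * f64::from(self.pending + 1)
    }
}

/// Two-choice step: of two candidate indices, returns the less loaded one.
pub fn pick_less_loaded(endpoints: &[PeakEwma], a: usize, b: usize) -> usize {
    if endpoints[a].load() <= endpoints[b].load() { a } else { b }
}
```

In a balancer, the two candidate indices would be drawn at random for each request; that sampling step is left out here so the sketch needs no external crates.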
The provided demo can be used to illustrate the differences between load balancing
strategies. For example:
```
REQUESTS=50000
CONCURRENCY=50
ENDPOINT_CAPACITY=50
MAX_ENDPOINT_LATENCIES=[1ms, 10ms, 10ms, 10ms, 10ms, 100ms, 100ms, 100ms, 100ms, 1000ms, ]
P2C+PeakEWMA
wall 15s
p50 5ms
p90 56ms
p95 78ms
p99 96ms
p999 105ms
P2C+LeastLoaded
wall 18s
p50 5ms
p90 57ms
p95 80ms
p99 98ms
p999 857ms
RoundRobin
wall 72s
p50 9ms
p90 98ms
p95 496ms
p99 906ms
p999 988ms
```
[numbers]: https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
[finagle]: https://github.com/twitter/finagle/blob/9cc08d15216497bb03a1cafda96b7266cfbbcff1/finagle-core/src/main/scala/com/twitter/finagle/loadbalancer/PeakEwma.scala
2018-06-06 23:16:49 -07:00
tokio-timer = "0.2.4"
2019-03-06 13:38:58 -08:00
tower-service = "0.2.0"
2017-10-05 13:41:44 -07:00
tower-discover = { version = "0.1", path = "../tower-discover" }
2019-03-27 16:34:56 -07:00
tower-util = { version = "0.1", path = "../tower-util" }
2018-03-15 21:33:20 -07:00
indexmap = "1"
2018-01-24 12:18:12 -08:00

[dev-dependencies]
2018-02-23 20:24:22 -08:00
log = "0.4.1"
env_logger = { version = "0.5.3", default-features = false }
2018-01-25 12:51:33 -08:00
hdrsample = "6.0"
2018-02-23 20:24:22 -08:00
quickcheck = { version = "0.6", default-features = false }
2018-06-06 23:16:49 -07:00
tokio = "0.1.7"
tokio-executor = "0.1.2"
2019-03-29 14:24:43 -07:00
tower = { version = "0.1", path = "../tower" }
balance: Update demo (#79)
In preparation for additional load balancing strategies, the demo is
being updated to allow for richer testing in several important ways:
- Adopt the new `tokio` multithreaded runtime.
- Use `tower-buffer` to drive each simulated endpoint on an independent
  task. This fixes a bug where requests appeared active longer than
  intended (while waiting for the SendRequests task to process responses).
- A top-level concurrency limit has been added (by wrapping the balancer
  in `tower-in-flight-limit`) so that `REQUESTS` futures are not all
  created immediately, which also caused incorrect load measurements.
- Endpoints are also constrained with `tower-in-flight-limit`. By
  default, the limit is that of the load balancer (so endpoints are
  effectively unlimited).
- The `demo.rs` script has been reorganized to account for the new
  runtime, such that all examples are one task chain.
- New output format:
```
REQUESTS=50000
CONCURRENCY=50
ENDPOINT_CAPACITY=50
MAX_ENDPOINT_LATENCIES=[1ms, 10ms, 10ms, 10ms, 10ms, 100ms, 100ms, 100ms, 100ms, 1000ms, ]
P2C+LeastLoaded
wall 18s
p50 5ms
p90 56ms
p95 80ms
p99 98ms
p999 900ms
RoundRobin
wall 72s
p50 9ms
p90 98ms
p95 488ms
p99 898ms
p999 989ms
```
2018-06-04 17:54:07 -07:00
tower-buffer = { version = "0.1", path = "../tower-buffer" }
2019-04-05 20:08:43 -07:00
tower-limit = { version = "0.1", path = "../tower-limit" }