Problem: JP's report is unaddressed
Solution: formalize our current thoughts and actions
This commit is contained in:
commit
82d9476283
|
@ -0,0 +1,3 @@
|
|||
[submodule "fastcmp"]
|
||||
path = fastcmp
|
||||
url = https://github.com/saschagrunert/fastcmp
|
|
@ -0,0 +1,137 @@
|
|||
# Aura Consensus Protocol Audit Findings Commentary
|
||||
|
||||
This document is prepared in response to the draft audit report provided
|
||||
by Jean-Philippe Aumasson (08/12/17), SHA256 hash of [aura-audit-report.md](aura-audit-report.md)
|
||||
is 8040fe95d91b2d77f1f0df2af5ba23c85ee487b18dde650dd46cfc82686d0b3d
|
||||
|
||||
## Protocol review
|
||||
|
||||
### Risks of synchronized time
|
||||
|
||||
The report indicates certain risks of [un]synchronized time and related attacks.
|
||||
We believe these concerns are valid and are mostly the consequences of the
|
||||
basics of the protocol (it is based on physical time)
|
||||
|
||||
### Resilience to malicious nodes
|
||||
|
||||
No commentary.
|
||||
|
||||
### Denial-of-service attacks
|
||||
|
||||
No commentary.
|
||||
|
||||
### Finality conditions
|
||||
|
||||
We've identified the same case to cause confusion as well (specification
|
||||
stating `>= 1/2` of the validators, while code requires `> 1/2`).
|
||||
|
||||
With regards to `would_be_finalized` calculation, we currently believe
|
||||
that this code is correct and is expressing the same `> 1/2` requirement,
|
||||
but is "forecasting" finalization for the next signer.
|
||||
|
||||
Recommended course of action:
|
||||
|
||||
- [ ] **P1** Consult with Parity maintainers with regards to the original intent
|
||||
and actual implementation. Which constraint ("greater" or "stricly greater")
|
||||
should in fact be specified and implemented
|
||||
- [ ] **P2** Further analyze the correctness of `would_be_finalized` definition
|
||||
- [ ] **P2.1** Provide meaningful in-code commentary on how `build_ancestry_subchain`
|
||||
operates
|
||||
|
||||
### Finality delay
|
||||
|
||||
No commentary at this moment, but we are further researching this concern.
|
||||
|
||||
## Code review
|
||||
|
||||
### Unsafe code
|
||||
|
||||
The report mentions use of unsafe code in `bigint::hash` where it is used
|
||||
to efficiently compare two byte arrays. As mentioned, this doesn't seem
|
||||
to be a security risk. However, it is still an instance of unsafe code
|
||||
which would always raise at least some concern.
|
||||
|
||||
At the core of this issue there's a question of potential performance gains.
|
||||
In order to address it to best of our ability, we've conducted a few comparative
|
||||
benchmarks. Firstly, we ran simple benchmarks (slice equality vs memcp) both on
|
||||
rust stable (rustc 1.22.1 (05e2e1c41 2017-11-22)) and rust nighty (rustc 1.22.1 (05e2e1c41 2017-11-22))
|
||||
and did not see any gains (the results were very similar in both cases):
|
||||
|
||||
```
|
||||
test cmp ... bench: 4 ns/iter (+/- 1)
|
||||
test cmp_loop ... bench: 4,251 ns/iter (+/- 215)
|
||||
test eq ... bench: 4 ns/iter (+/- 1)
|
||||
test eq_loop ... bench: 4,269 ns/iter (+/- 622)
|
||||
```
|
||||
|
||||
The source code for this benchmark is available at [byte-cmp](byte-cmp)
|
||||
|
||||
To continue this effort, we further sampled some libraries available
|
||||
on [crates.io](https://crates.io).
|
||||
|
||||
[fastcmp](https://github.com/saschagrunert/fastcmp) was seemingly able
|
||||
to show different results:
|
||||
|
||||
```
|
||||
test fast_compare_equal ... bench: 10 ns/iter (+/- 0) = 25600 MB/s
|
||||
test fast_compare_unequal ... bench: 10 ns/iter (+/- 0) = 25600 MB/s
|
||||
test slice_compare_equal ... bench: 21 ns/iter (+/- 1) = 12190 MB/s
|
||||
test slice_compare_unequal ... bench: 21 ns/iter (+/- 0) = 12190 MB/s
|
||||
```
|
||||
|
||||
At the core of its implementation method also lies `memcmp`. The reason for the
|
||||
significant discrepancy in results is being further investigated. At this moment,
|
||||
we can only state that the optimization method used in parity *may* produce
|
||||
much more efficient code so its worth keeping around. However, we'd recommend
|
||||
the following course of action:
|
||||
|
||||
- [ ] **C1** Prepare a patch for bigint::hash that explains the necessity of using unsafe
|
||||
code (and memcmp) in prose to make this code easier to understand for others.
|
||||
|
||||
### Step number cast from 64- to 32-bit
|
||||
|
||||
The report indicates a cast of a 64-bit number to 32-bit one in AuRa code. The concern
|
||||
is valid, however, this behaviour will only be triggered 100-500 years later. That said,
|
||||
the following course of action was taken:
|
||||
|
||||
- [x] **C2** Avoid casting down to 32-bit by dropping the need to use `Duration` type
|
||||
for representing step duration (which is typically single or double digits seconds)
|
||||
https://github.com/oraclesorg/parity/commit/fa2ddce949d8a227b7135b300003f0a5bceddc0f
|
||||
- [ ] **C2.1** Prepare a PR and get it merged into the main repository (or get resolved in
|
||||
any other way)
|
||||
|
||||
### Potential integer overflow
|
||||
|
||||
The report highlights that step increment function might overflow the counter, especially
|
||||
when parity is compiled for a 32-bit system. This event is also quite far in the future,
|
||||
however, the following course of action was taken:
|
||||
|
||||
- [x] **C3** Shutdown parity when such event occurs as there's not much that can be done
|
||||
at that point https://github.com/oraclesorg/parity/commit/fa2ddce949d8a227b7135b300003f0a5bceddc0f
|
||||
- [ ] **C3.1** Prepare a PR and get it merged into the main repository (or get resolved in
|
||||
any other way)
|
||||
|
||||
|
||||
### Potential division by zero
|
||||
|
||||
The report shows that Step calibration function might panic on division by zero. This is a
|
||||
valid concern. If somebody will configure Parity with a step duration of 0 seconds, it'll
|
||||
quickly panic, however, the error won't necessarily be very informative. This case highlights
|
||||
the lack of refinement types in Rust. Following course of action was taken:
|
||||
|
||||
- [x] **C4** Guard step duration parameter to be positive and crash Parity otherwise
|
||||
https://github.com/oraclesorg/parity/commit/fa2ddce949d8a227b7135b300003f0a5bceddc0f
|
||||
- [ ] **C4.1** Prepare a PR and get it merged into the main repository (or get resolved in
|
||||
any other way)
|
||||
|
||||
### Other possible improvements
|
||||
|
||||
The report suggests few other improvements. First one is to improve the measurement of timing.
|
||||
There is no indication at this point that this is a concern. Course of action:
|
||||
|
||||
- [ ] **C5** Try out https://github.com/jedisct1/rust-coarsetime to see how much more
|
||||
performant it is
|
||||
- [ ] **C6** Research the importance of measurement timing
|
||||
|
||||
The other suggested improvement is erroneous as it suggests that the code doesn't check
|
||||
of RNG creation failure, while in fact it does.
|
|
@ -0,0 +1,172 @@
|
|||
% Audit of the Aura consensus protocol
|
||||
% Jean-Philippe Aumasson
|
||||
% 08/12/17
|
||||
|
||||
Authority Round (a.k.a. Aura) is a proof-of-authority consensus engine in the [Parity](https://github.com/paritytech/parity/) Ethereum client, implemented in [/ethcore/src/engines/authority_round](https://github.com/paritytech/parity/tree/master/ethcore/src/engines/authority_round) in files [mod.rs](https://github.com/paritytech/parity/blob/master/ethcore/src/engines/authority_round/mod.rs) and [finality.rs](https://github.com/paritytech/parity/blob/master/ethcore/src/engines/authority_round/finality.rs) (about 850 lines or Rust, including approximately 250 lines of tests).
|
||||
|
||||
The goal of this audit is to assess the security strength of the Aura and its implementation, and to find any security shortcoming, in particular related to finality conditions.
|
||||
The version of the source code reviewed is that at the commit `b1517654e1212588238c989d00dd92128ea040fe`.
|
||||
|
||||
The first part *Protocol review* is about the protocol logic, as understood from the documentation and as implemented.
|
||||
The second part *Code review* is about logic and software bugs in the implementation that are not directly related to the protocol's logic.
|
||||
|
||||
|
||||
# Protocol review
|
||||
|
||||
## Risks of synchronized time
|
||||
|
||||
Aura critically relies on time synchronization of the validators for successful execution of the protocol.
|
||||
This increases two risks:
|
||||
|
||||
1. **Inconsistency** of the blockchain's state, if clocks of validators are accidentally desynchronized, for example due to clock drift. In this case, different validators would compute different a different step index, and therefore a different lead validator.
|
||||
|
||||
2. **Denial-of-service** and possiblity other attacks: Many machines would get time from an NTP server through ntpd, adjusting their local clock to compensate any skew. However,
|
||||
- The NTP server(s) have then to be trusted---which also reduces the decentralization of the protocol
|
||||
- The NTP traffic is unauthenticated by default, thus a network attacker could transmit invalid time to the validators.
|
||||
|
||||
The risk of inconsistency can be minimized by choosing a long enough duration, and by synchronizing with a trusted time server from a local network (for example getting its time from GPS).
|
||||
|
||||
The risk of denial-of-service can be minimized by authenticating any NTP requests, using the pre-shared key authentication mode or the protocol proposed in a [recent Internet Draft](https://tools.ietf.org/html/draft-ietf-ntp-using-nts-for-ntp-10).
|
||||
|
||||
Even when NTP is not used, an attacker may have other means of influencing the local clock skew, for example through [CPU heating](http://sec.cs.ucl.ac.uk/users/smurdoch/papers/ccs06hotornot.pdf).
|
||||
|
||||
|
||||
## Resilience to malicious nodes
|
||||
|
||||
Aura claims to tolerate up to 50% malicious nodes, as per its documentation.
|
||||
The goals of a collusion of malicious nodes would be to seal blocks that should not be sealed, or merely to influence the selection of a validator.
|
||||
|
||||
Based on my understanding of the protocol, I believe these claims are correct. The bound 50% is tight, and any collusion of more than 50% of the validators could disrupt the sound execution of the protocol.
|
||||
|
||||
## Denial-of-service attacks
|
||||
|
||||
Even if connections between validators are encrypted and authenticated, an attacker that blocks from/to one or more validators can prevent the protocol from validating incoming blocks.
|
||||
|
||||
Solutions to this problem are non-trivial, and seem to require a trade-off between resiliency to network disruptions and security against collusions of malicious nodes.
|
||||
|
||||
## Finality conditions
|
||||
|
||||
At the time of writing the [documentation](https://github.com/paritytech/parity/wiki/Aura) defines finality condition as `|SIG_SET(C[K..])| >= n/2`, where `n` is the number of validators, which suggests finalization as soon as 50% of validators have signed a given step.
|
||||
However a strict majority seems to be the desired condition, so this condition should instead be `|SIG_SET(C[K..])| > n/2`.
|
||||
|
||||
The condition implemented in finality.rs seems even stricter in `build_ancestry_subchain()`:
|
||||
|
||||
```
|
||||
let would_be_finalized = (current_signed + 1) * 2 > self.signers.len();
|
||||
```
|
||||
|
||||
However this condition seems inconsistent with that in `push_hash()`, which simply requires a strict majority:
|
||||
|
||||
```
|
||||
while self.sign_count.len() * 2 > self.signers.len() {
|
||||
```
|
||||
|
||||
If this understanding is correct, the finality condition should be consistently implemented and specified.
|
||||
|
||||
|
||||
## Finality delay
|
||||
|
||||
A minimum of `n_v/2 + 1` validations being required, with `n_v` the number of validators.
|
||||
At least `2(n_v/2 + 1) = n_v + 2` message round trips are therefore necessary before a block is finalized by all validators.
|
||||
In the worst case, after exactly `n_v` validations, the delay will instead be of `2n_v + 2`.
|
||||
If multiple blocks are proposed for validation, the voting will add at least `2n_v` round trips.
|
||||
|
||||
The average delay may be estimated empirically, based on the number of validations of a block or series thereof.
|
||||
|
||||
|
||||
# Code review
|
||||
|
||||
This section reports on potential security risks in the implementation of Aura.
|
||||
We mainly reviewed the mod.rs and finality.rs files, and did not comprehensively review their dependencies.
|
||||
|
||||
## Unsafe code
|
||||
|
||||
The files implementing Aura, mod.rs and finality.rs, do not unsafe code blocks such as raw pointer dereferences, nor recursions that would create a risk of stack overflow.
|
||||
|
||||
However, there is an unsafe blocks in the `bigint::hash` module that is executed in the Aura code, namely the implementation of the PartialEq trait:
|
||||
|
||||
```
|
||||
impl PartialEq for $from {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
unsafe { memcmp(self.0.as_ptr() as *const c_void, other.0.as_ptr() as *const c_void, $size) == 0 }
|
||||
}
|
||||
}
|
||||
```
|
||||
This is for example executed in Aura's `is_epoch_end()` function in the line
|
||||
|
||||
```
|
||||
.map(|h| if h == chain_head.hash() {
|
||||
```
|
||||
This does not create a security, however, since the buffer size (32 bytes) is hardcoded and can't be controlled by an attacher.
|
||||
|
||||
|
||||
## Step number cast from 64- to 32-bit
|
||||
|
||||
In `duration_remaining()` a Step's `inner` attribute obtained from `self.load()` is cast from `usize` to `u32`, thus truncating any 64-bit value when running on a 64-bit system (where `usize` is 64-bit). This number is then multiplied to a `Duration` object to estimate the remaining time for this step:
|
||||
|
||||
```
|
||||
fn load(&self) -> usize { self.inner.load(AtomicOrdering::SeqCst) }
|
||||
fn duration_remaining(&self) -> Duration {
|
||||
let now = unix_now();
|
||||
let step_end = self.duration * (self.load() as u32 + 1);
|
||||
```
|
||||
|
||||
Rust does not allow a `Duration` to be multiplied by an `u64`, however. The safest fix seems therefore to check that `inner` is smaller than `u32::MAX`.
|
||||
|
||||
## Potential integer overflow
|
||||
|
||||
The `AtomicUsize::fetch_add()` function will wrap around on overflow, which may occur on 32-bit systems (where `usize` is 32-bit) and lead to unsafe behavior of the protocol if not detected: c
|
||||
```
|
||||
fn increment(&self) {
|
||||
self.inner.fetch_add(1, AtomicOrdering::SeqCst);
|
||||
}
|
||||
```
|
||||
|
||||
A fix is to check that `inner` is smaller than `u32::MAX`.
|
||||
|
||||
|
||||
## Potential division by zero
|
||||
|
||||
If `self.duration` is less than a second, then a division by zero will happen in:
|
||||
|
||||
```
|
||||
if self.calibrate {
|
||||
let new_step = unix_now().as_secs() / self.duration.as_secs();
|
||||
self.inner.store(new_step as usize, AtomicOrdering::SeqCst);
|
||||
}
|
||||
```
|
||||
|
||||
A fix is to ensure that `self.duration` is greater than one second.
|
||||
|
||||
## Other possible improvements
|
||||
|
||||
### Faster timings
|
||||
|
||||
If speed of timing measurements is critical, Aura may consider using [rust-coarsetime](https://github.com/jedisct1/rust-coarsetime), "a partial replacement for the Time and Duration structures from the standard library".
|
||||
The timings obtained are slightly less accurate, though.
|
||||
|
||||
### Safer PRNG
|
||||
|
||||
Aura doesn't rely on a pseudorandom generator (`random()` is only used for testing), yet we observed that in hash.rs the `randomize()` function will not check that an instance of OsRng was successfully created:
|
||||
|
||||
```
|
||||
pub fn randomize(&mut self) {
|
||||
let mut rng = OsRng::new().unwrap();
|
||||
*self= $from::rand(&mut rng);
|
||||
}
|
||||
```
|
||||
|
||||
It would be safer to properly check the OsRng instantiation, by doing the following (as [copied from SO](https://stackoverflow.com/questions/27362523/generating-secure-random-numbers-in-rust))
|
||||
|
||||
```
|
||||
let mut rng = match OsRng::new() {
|
||||
Ok(g) => g,
|
||||
Err(e) => panic!("Failed to obtain OS RNG: {}", e)
|
||||
};
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,4 @@
|
|||
/target/
|
||||
**/*.rs.bk
|
||||
Cargo.lock
|
||||
target
|
|
@ -0,0 +1,14 @@
|
|||
[package]
|
||||
name = "byte-cmp"
|
||||
version = "0.1.0"
|
||||
authors = ["Yurii Rashkovskii <yrashk@gmail.com>"]
|
||||
|
||||
[dependencies]
|
||||
rand = "*"
|
||||
bencher ="*"
|
||||
libc = "*"
|
||||
|
||||
[[bench]]
|
||||
name = "cmp"
|
||||
harness = false
|
||||
|
|
@ -0,0 +1,69 @@
|
|||
extern crate rand;
|
||||
#[macro_use]
|
||||
extern crate bencher;
|
||||
extern crate libc;
|
||||
|
||||
use libc::{c_void, memcmp};
|
||||
|
||||
use bencher::Bencher;
|
||||
use rand::Rng;
|
||||
|
||||
type T = [u8; 32];
|
||||
|
||||
fn eq_loop(b: &mut Bencher) {
|
||||
let mut rng = rand::thread_rng();
|
||||
let mut t1: T = Default::default();
|
||||
let mut t2: T = Default::default();
|
||||
rng.fill_bytes(&mut t1);
|
||||
rng.fill_bytes(&mut t2);
|
||||
b.iter(||
|
||||
for _ in 0..1_000 {
|
||||
assert!(!(t1 == t2))
|
||||
}
|
||||
);
|
||||
}
|
||||
|
||||
fn cmp_loop(b: &mut Bencher) {
|
||||
let mut rng = rand::thread_rng();
|
||||
let mut t1: T = Default::default();
|
||||
let mut t2: T = Default::default();
|
||||
rng.fill_bytes(&mut t1);
|
||||
rng.fill_bytes(&mut t2);
|
||||
let t1ptr = t1.as_ptr() as *const c_void;
|
||||
let t2ptr = t2.as_ptr() as *const c_void;
|
||||
b.iter(||
|
||||
for _ in 0..1_000 {
|
||||
assert!(!(unsafe { memcmp(t1ptr, t2ptr, 32) == 0 }));
|
||||
}
|
||||
);
|
||||
}
|
||||
|
||||
fn eq(b: &mut Bencher) {
|
||||
let mut rng = rand::thread_rng();
|
||||
let mut t1: T = Default::default();
|
||||
let mut t2: T = Default::default();
|
||||
rng.fill_bytes(&mut t1);
|
||||
rng.fill_bytes(&mut t2);
|
||||
b.iter(||
|
||||
assert!(!(t1 == t2))
|
||||
);
|
||||
}
|
||||
|
||||
fn cmp(b: &mut Bencher) {
|
||||
let mut rng = rand::thread_rng();
|
||||
let mut t1: T = Default::default();
|
||||
let mut t2: T = Default::default();
|
||||
rng.fill_bytes(&mut t1);
|
||||
rng.fill_bytes(&mut t2);
|
||||
let t1ptr = t1.as_ptr() as *const c_void;
|
||||
let t2ptr = t2.as_ptr() as *const c_void;
|
||||
b.iter(||
|
||||
assert!(!(unsafe { memcmp(t1ptr, t2ptr, 32) == 0 }))
|
||||
);
|
||||
}
|
||||
|
||||
|
||||
benchmark_group!(loops, eq_loop, cmp_loop);
|
||||
benchmark_group!(comparisons, eq, cmp);
|
||||
|
||||
benchmark_main!(comparisons, loops);
|
|
@ -0,0 +1 @@
|
|||
Subproject commit 82d9d5c960b9d57065ad3cc73ded4105909a45f9
|
Loading…
Reference in New Issue