Problem: JP's report is unaddressed

Solution: formalize our current thoughts and actions
This commit is contained in:
Yurii Rashkovskii 2017-12-12 18:30:47 +07:00
commit 82d9476283
8 changed files with 400 additions and 0 deletions

3
.gitmodules vendored Normal file
View File

@ -0,0 +1,3 @@
[submodule "fastcmp"]
path = fastcmp
url = https://github.com/saschagrunert/fastcmp

137
README.md Normal file
View File

@ -0,0 +1,137 @@
# Aura Consensus Protocol Audit Findings Commentary
This document is prepared in response to the draft audit report provided
by Jean-Philippe Aumasson (08/12/17), SHA256 hash of [aura-audit-report.md](aura-audit-report.md)
is 8040fe95d91b2d77f1f0df2af5ba23c85ee487b18dde650dd46cfc82686d0b3d
## Protocol review
### Risks of synchronized time
The report indicates certain risks of [un]synchronized time and related attacks.
We believe these concerns are valid and are mostly the consequences of the
basics of the protocol (it is based on physical time)
### Resilience to malicious nodes
No commentary.
### Denial-of-service attacks
No commentary.
### Finality conditions
We've identified the same case to cause confusion as well (specification
stating `>= 1/2` of the validators, while code requires `> 1/2`).
With regards to `would_be_finalized` calculation, we currently believe
that this code is correct and is expressing the same `> 1/2` requirement,
but is "forecasting" finalization for the next signer.
Recommended course of action:
- [ ] **P1** Consult with Parity maintainers with regards to the original intent
and actual implementation. Which constraint ("greater" or "stricly greater")
should in fact be specified and implemented
- [ ] **P2** Further analyze the correctness of `would_be_finalized` definition
- [ ] **P2.1** Provide meaningful in-code commentary on how `build_ancestry_subchain`
operates
### Finality delay
No commentary at this moment, but we are further researching this concern.
## Code review
### Unsafe code
The report mentions use of unsafe code in `bigint::hash` where it is used
to efficiently compare two byte arrays. As mentioned, this doesn't seem
to be a security risk. However, it is still an instance of unsafe code
which would always raise at least some concern.
At the core of this issue there's a question of potential performance gains.
In order to address it to best of our ability, we've conducted a few comparative
benchmarks. Firstly, we ran simple benchmarks (slice equality vs memcp) both on
rust stable (rustc 1.22.1 (05e2e1c41 2017-11-22)) and rust nighty (rustc 1.22.1 (05e2e1c41 2017-11-22))
and did not see any gains (the results were very similar in both cases):
```
test cmp ... bench: 4 ns/iter (+/- 1)
test cmp_loop ... bench: 4,251 ns/iter (+/- 215)
test eq ... bench: 4 ns/iter (+/- 1)
test eq_loop ... bench: 4,269 ns/iter (+/- 622)
```
The source code for this benchmark is available at [byte-cmp](byte-cmp)
To continue this effort, we further sampled some libraries available
on [crates.io](https://crates.io).
[fastcmp](https://github.com/saschagrunert/fastcmp) was seemingly able
to show different results:
```
test fast_compare_equal ... bench: 10 ns/iter (+/- 0) = 25600 MB/s
test fast_compare_unequal ... bench: 10 ns/iter (+/- 0) = 25600 MB/s
test slice_compare_equal ... bench: 21 ns/iter (+/- 1) = 12190 MB/s
test slice_compare_unequal ... bench: 21 ns/iter (+/- 0) = 12190 MB/s
```
At the core of its implementation method also lies `memcmp`. The reason for the
significant discrepancy in results is being further investigated. At this moment,
we can only state that the optimization method used in parity *may* produce
much more efficient code so its worth keeping around. However, we'd recommend
the following course of action:
- [ ] **C1** Prepare a patch for bigint::hash that explains the necessity of using unsafe
code (and memcmp) in prose to make this code easier to understand for others.
### Step number cast from 64- to 32-bit
The report indicates a cast of a 64-bit number to 32-bit one in AuRa code. The concern
is valid, however, this behaviour will only be triggered 100-500 years later. That said,
the following course of action was taken:
- [x] **C2** Avoid casting down to 32-bit by dropping the need to use `Duration` type
for representing step duration (which is typically single or double digits seconds)
https://github.com/oraclesorg/parity/commit/fa2ddce949d8a227b7135b300003f0a5bceddc0f
- [ ] **C2.1** Prepare a PR and get it merged into the main repository (or get resolved in
any other way)
### Potential integer overflow
The report highlights that step increment function might overflow the counter, especially
when parity is compiled for a 32-bit system. This event is also quite far in the future,
however, the following course of action was taken:
- [x] **C3** Shutdown parity when such event occurs as there's not much that can be done
at that point https://github.com/oraclesorg/parity/commit/fa2ddce949d8a227b7135b300003f0a5bceddc0f
- [ ] **C3.1** Prepare a PR and get it merged into the main repository (or get resolved in
any other way)
### Potential division by zero
The report shows that Step calibration function might panic on division by zero. This is a
valid concern. If somebody will configure Parity with a step duration of 0 seconds, it'll
quickly panic, however, the error won't necessarily be very informative. This case highlights
the lack of refinement types in Rust. Following course of action was taken:
- [x] **C4** Guard step duration parameter to be positive and crash Parity otherwise
https://github.com/oraclesorg/parity/commit/fa2ddce949d8a227b7135b300003f0a5bceddc0f
- [ ] **C4.1** Prepare a PR and get it merged into the main repository (or get resolved in
any other way)
### Other possible improvements
The report suggests few other improvements. First one is to improve the measurement of timing.
There is no indication at this point that this is a concern. Course of action:
- [ ] **C5** Try out https://github.com/jedisct1/rust-coarsetime to see how much more
performant it is
- [ ] **C6** Research the importance of measurement timing
The other suggested improvement is erroneous as it suggests that the code doesn't check
of RNG creation failure, while in fact it does.

172
aura-audit-report.md Normal file
View File

@ -0,0 +1,172 @@
% Audit of the Aura consensus protocol
% Jean-Philippe Aumasson
% 08/12/17
Authority Round (a.k.a. Aura) is a proof-of-authority consensus engine in the [Parity](https://github.com/paritytech/parity/) Ethereum client, implemented in [/ethcore/src/engines/authority_round](https://github.com/paritytech/parity/tree/master/ethcore/src/engines/authority_round) in files [mod.rs](https://github.com/paritytech/parity/blob/master/ethcore/src/engines/authority_round/mod.rs) and [finality.rs](https://github.com/paritytech/parity/blob/master/ethcore/src/engines/authority_round/finality.rs) (about 850 lines or Rust, including approximately 250 lines of tests).
The goal of this audit is to assess the security strength of the Aura and its implementation, and to find any security shortcoming, in particular related to finality conditions.
The version of the source code reviewed is that at the commit `b1517654e1212588238c989d00dd92128ea040fe`.
The first part *Protocol review* is about the protocol logic, as understood from the documentation and as implemented.
The second part *Code review* is about logic and software bugs in the implementation that are not directly related to the protocol's logic.
# Protocol review
## Risks of synchronized time
Aura critically relies on time synchronization of the validators for successful execution of the protocol.
This increases two risks:
1. **Inconsistency** of the blockchain's state, if clocks of validators are accidentally desynchronized, for example due to clock drift. In this case, different validators would compute different a different step index, and therefore a different lead validator.
2. **Denial-of-service** and possiblity other attacks: Many machines would get time from an NTP server through ntpd, adjusting their local clock to compensate any skew. However,
- The NTP server(s) have then to be trusted---which also reduces the decentralization of the protocol
- The NTP traffic is unauthenticated by default, thus a network attacker could transmit invalid time to the validators.
The risk of inconsistency can be minimized by choosing a long enough duration, and by synchronizing with a trusted time server from a local network (for example getting its time from GPS).
The risk of denial-of-service can be minimized by authenticating any NTP requests, using the pre-shared key authentication mode or the protocol proposed in a [recent Internet Draft](https://tools.ietf.org/html/draft-ietf-ntp-using-nts-for-ntp-10).
Even when NTP is not used, an attacker may have other means of influencing the local clock skew, for example through [CPU heating](http://sec.cs.ucl.ac.uk/users/smurdoch/papers/ccs06hotornot.pdf).
## Resilience to malicious nodes
Aura claims to tolerate up to 50% malicious nodes, as per its documentation.
The goals of a collusion of malicious nodes would be to seal blocks that should not be sealed, or merely to influence the selection of a validator.
Based on my understanding of the protocol, I believe these claims are correct. The bound 50% is tight, and any collusion of more than 50% of the validators could disrupt the sound execution of the protocol.
## Denial-of-service attacks
Even if connections between validators are encrypted and authenticated, an attacker that blocks from/to one or more validators can prevent the protocol from validating incoming blocks.
Solutions to this problem are non-trivial, and seem to require a trade-off between resiliency to network disruptions and security against collusions of malicious nodes.
## Finality conditions
At the time of writing the [documentation](https://github.com/paritytech/parity/wiki/Aura) defines finality condition as `|SIG_SET(C[K..])| >= n/2`, where `n` is the number of validators, which suggests finalization as soon as 50% of validators have signed a given step.
However a strict majority seems to be the desired condition, so this condition should instead be `|SIG_SET(C[K..])| > n/2`.
The condition implemented in finality.rs seems even stricter in `build_ancestry_subchain()`:
```
let would_be_finalized = (current_signed + 1) * 2 > self.signers.len();
```
However this condition seems inconsistent with that in `push_hash()`, which simply requires a strict majority:
```
while self.sign_count.len() * 2 > self.signers.len() {
```
If this understanding is correct, the finality condition should be consistently implemented and specified.
## Finality delay
A minimum of `n_v/2 + 1` validations being required, with `n_v` the number of validators.
At least `2(n_v/2 + 1) = n_v + 2` message round trips are therefore necessary before a block is finalized by all validators.
In the worst case, after exactly `n_v` validations, the delay will instead be of `2n_v + 2`.
If multiple blocks are proposed for validation, the voting will add at least `2n_v` round trips.
The average delay may be estimated empirically, based on the number of validations of a block or series thereof.
# Code review
This section reports on potential security risks in the implementation of Aura.
We mainly reviewed the mod.rs and finality.rs files, and did not comprehensively review their dependencies.
## Unsafe code
The files implementing Aura, mod.rs and finality.rs, do not unsafe code blocks such as raw pointer dereferences, nor recursions that would create a risk of stack overflow.
However, there is an unsafe blocks in the `bigint::hash` module that is executed in the Aura code, namely the implementation of the PartialEq trait:
```
impl PartialEq for $from {
fn eq(&self, other: &Self) -> bool {
unsafe { memcmp(self.0.as_ptr() as *const c_void, other.0.as_ptr() as *const c_void, $size) == 0 }
}
}
```
This is for example executed in Aura's `is_epoch_end()` function in the line
```
.map(|h| if h == chain_head.hash() {
```
This does not create a security, however, since the buffer size (32 bytes) is hardcoded and can't be controlled by an attacher.
## Step number cast from 64- to 32-bit
In `duration_remaining()` a Step's `inner` attribute obtained from `self.load()` is cast from `usize` to `u32`, thus truncating any 64-bit value when running on a 64-bit system (where `usize` is 64-bit). This number is then multiplied to a `Duration` object to estimate the remaining time for this step:
```
fn load(&self) -> usize { self.inner.load(AtomicOrdering::SeqCst) }
fn duration_remaining(&self) -> Duration {
let now = unix_now();
let step_end = self.duration * (self.load() as u32 + 1);
```
Rust does not allow a `Duration` to be multiplied by an `u64`, however. The safest fix seems therefore to check that `inner` is smaller than `u32::MAX`.
## Potential integer overflow
The `AtomicUsize::fetch_add()` function will wrap around on overflow, which may occur on 32-bit systems (where `usize` is 32-bit) and lead to unsafe behavior of the protocol if not detected: c
```
fn increment(&self) {
self.inner.fetch_add(1, AtomicOrdering::SeqCst);
}
```
A fix is to check that `inner` is smaller than `u32::MAX`.
## Potential division by zero
If `self.duration` is less than a second, then a division by zero will happen in:
```
if self.calibrate {
let new_step = unix_now().as_secs() / self.duration.as_secs();
self.inner.store(new_step as usize, AtomicOrdering::SeqCst);
}
```
A fix is to ensure that `self.duration` is greater than one second.
## Other possible improvements
### Faster timings
If speed of timing measurements is critical, Aura may consider using [rust-coarsetime](https://github.com/jedisct1/rust-coarsetime), "a partial replacement for the Time and Duration structures from the standard library".
The timings obtained are slightly less accurate, though.
### Safer PRNG
Aura doesn't rely on a pseudorandom generator (`random()` is only used for testing), yet we observed that in hash.rs the `randomize()` function will not check that an instance of OsRng was successfully created:
```
pub fn randomize(&mut self) {
let mut rng = OsRng::new().unwrap();
*self= $from::rand(&mut rng);
}
```
It would be safer to properly check the OsRng instantiation, by doing the following (as [copied from SO](https://stackoverflow.com/questions/27362523/generating-secure-random-numbers-in-rust))
```
let mut rng = match OsRng::new() {
Ok(g) => g,
Err(e) => panic!("Failed to obtain OS RNG: {}", e)
};
```

4
byte-cmp/.gitignore vendored Normal file
View File

@ -0,0 +1,4 @@
/target/
**/*.rs.bk
Cargo.lock
target

14
byte-cmp/Cargo.toml Normal file
View File

@ -0,0 +1,14 @@
[package]
name = "byte-cmp"
version = "0.1.0"
authors = ["Yurii Rashkovskii <yrashk@gmail.com>"]
[dependencies]
rand = "*"
bencher ="*"
libc = "*"
[[bench]]
name = "cmp"
harness = false

69
byte-cmp/benches/cmp.rs Normal file
View File

@ -0,0 +1,69 @@
extern crate rand;
#[macro_use]
extern crate bencher;
extern crate libc;
use libc::{c_void, memcmp};
use bencher::Bencher;
use rand::Rng;
type T = [u8; 32];
fn eq_loop(b: &mut Bencher) {
let mut rng = rand::thread_rng();
let mut t1: T = Default::default();
let mut t2: T = Default::default();
rng.fill_bytes(&mut t1);
rng.fill_bytes(&mut t2);
b.iter(||
for _ in 0..1_000 {
assert!(!(t1 == t2))
}
);
}
fn cmp_loop(b: &mut Bencher) {
let mut rng = rand::thread_rng();
let mut t1: T = Default::default();
let mut t2: T = Default::default();
rng.fill_bytes(&mut t1);
rng.fill_bytes(&mut t2);
let t1ptr = t1.as_ptr() as *const c_void;
let t2ptr = t2.as_ptr() as *const c_void;
b.iter(||
for _ in 0..1_000 {
assert!(!(unsafe { memcmp(t1ptr, t2ptr, 32) == 0 }));
}
);
}
fn eq(b: &mut Bencher) {
let mut rng = rand::thread_rng();
let mut t1: T = Default::default();
let mut t2: T = Default::default();
rng.fill_bytes(&mut t1);
rng.fill_bytes(&mut t2);
b.iter(||
assert!(!(t1 == t2))
);
}
fn cmp(b: &mut Bencher) {
let mut rng = rand::thread_rng();
let mut t1: T = Default::default();
let mut t2: T = Default::default();
rng.fill_bytes(&mut t1);
rng.fill_bytes(&mut t2);
let t1ptr = t1.as_ptr() as *const c_void;
let t2ptr = t2.as_ptr() as *const c_void;
b.iter(||
assert!(!(unsafe { memcmp(t1ptr, t2ptr, 32) == 0 }))
);
}
benchmark_group!(loops, eq_loop, cmp_loop);
benchmark_group!(comparisons, eq, cmp);
benchmark_main!(comparisons, loops);

0
byte-cmp/src/lib.rs Normal file
View File

1
fastcmp Submodule

@ -0,0 +1 @@
Subproject commit 82d9d5c960b9d57065ad3cc73ded4105909a45f9