Rewrite synchronization chapter (#2156)

* Rewrite synchronization chapter
* Add synchronization terminology
Greg Fitzgerald 2018-12-14 11:06:53 -07:00 committed by GitHub
parent f6e3464ab9
commit 483f6702a6
2 changed files with 95 additions and 85 deletions

@ -1,84 +1,56 @@
# Synchronization
It's possible for a centralized database to process 710,000 transactions per
second on a standard gigabit network if the transactions are, on average, no
more than 176 bytes. A centralized database can also replicate itself and
maintain high availability without significantly compromising that transaction
rate using the distributed system technique known as Optimistic Concurrency
Control [\[H.T.Kung, J.T.Robinson
(1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At
Solana, we're demonstrating that these same theoretical limits apply just as
well to blockchain on an adversarial network. The key ingredient? Finding a way
to share time when nodes can't trust one another. Once nodes can trust time,
suddenly ~40 years of distributed systems research becomes applicable to
blockchain!
Fast, reliable synchronization is the biggest reason Solana is able to achieve
such high throughput. Traditional blockchains synchronize on large chunks of
transactions called blocks. By synchronizing on blocks, a transaction cannot be
processed until a duration called "block time" has passed. In Proof of Work
consensus, these block times need to be very large (~10 minutes) to minimize
the odds of multiple fullnodes producing a new valid block at the same time.
There's no such constraint in Proof of Stake consensus, but without reliable
timestamps, a fullnode cannot determine the order of incoming blocks. The
popular workaround is to tag each block with a [wallclock
timestamp](https://en.bitcoin.it/wiki/Block_timestamp). Because of clock drift
and variance in network latencies, the timestamp is only accurate within an
hour or two. To work around the workaround, these systems lengthen block times
to provide reasonable certainty that the median timestamp on each block is
always increasing.
> Perhaps the most striking difference between algorithms obtained by our
> method and ones based upon timeout is that using timeout produces a
> traditional distributed algorithm in which the processes operate
> asynchronously, while our method produces a globally synchronous one in which
> every process does the same thing at (approximately) the same time. Our
> method seems to contradict the whole purpose of distributed processing, which
> is to permit different processes to operate independently and perform
> different functions. However, if a distributed system is really a single
> system, then the processes must be synchronized in some way. Conceptually,
> the easiest way to synchronize processes is to get them all to do the same
> thing at the same time. Therefore, our method is used to implement a kernel
> that performs the necessary synchronization--for example, making sure that
> two different processes do not try to modify a file at the same time.
> Processes might spend only a small fraction of their time executing the
> synchronizing kernel; the rest of the time, they can operate
> independently--e.g., accessing different files. This is an approach we have
> advocated even when fault-tolerance is not required. The method's basic
> simplicity makes it easier to understand the precise properties of a system,
> which is crucial if one is to know just how fault-tolerant the system is.
> [\[L.Lamport
> (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)
Solana takes a very different approach, which it calls *Proof of History* or
*PoH*. Leader nodes "timestamp" blocks with cryptographic proofs that some
duration of time has passed since the last proof. All data hashed into the
proof must have existed before the proof was generated. The node
then shares the new block with validator nodes, which are able to verify those
proofs. The blocks can arrive at validators in any order, or could even be
replayed years later. With such reliable synchronization guarantees, Solana is
able to break blocks into smaller batches of transactions called *entries*.
Entries are streamed to validators in realtime, before any notion of block
consensus.
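
The hash-chain idea described above can be sketched in a few lines. This is a minimal illustration and not Solana's implementation: the names `Entry`, `next_entry`, and `verify` are hypothetical, and mixing data into the chain by appending it to the current state before hashing is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    num_hashes: int          # sequential hashes performed since the previous entry
    data: Optional[bytes]    # data mixed into the final hash, if any
    hash: bytes              # chain state after this entry


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def next_entry(prev_hash: bytes, num_hashes: int,
               data: Optional[bytes] = None) -> Entry:
    """Extend the chain by num_hashes hashes, mixing data into the last one."""
    if num_hashes < 1:
        raise ValueError("num_hashes must be at least 1")
    state = prev_hash
    for _ in range(num_hashes - 1):
        state = sha256(state)
    # Assumption: data is mixed in by appending it to the state before hashing.
    state = sha256(state + data) if data is not None else sha256(state)
    return Entry(num_hashes, data, state)


def verify(start_hash: bytes, entries: List[Entry]) -> bool:
    """Recompute every entry from its predecessor and compare hashes."""
    state = start_hash
    for entry in entries:
        if next_entry(state, entry.num_hashes, entry.data).hash != entry.hash:
            return False
        state = entry.hash
    return True


seed = sha256(b"genesis")
e1 = next_entry(seed, 1000)                        # time passes, no data
e2 = next_entry(e1.hash, 500, b"alice pays bob")   # data observed, then hashed
entries = [e1, e2]
print(verify(seed, entries))
```

Because each entry depends only on the hash before it, a verifier that already holds all the entries could check them independently of one another; the sequential loop in `verify` is chosen here only for brevity.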
## Verifiable Delay Functions
Solana technically never sends a *block*, but uses the term to describe the
sequence of entries that fullnodes vote on to achieve *confirmation*. In that
way, Solana's confirmation times can be compared apples to apples with those of
block-based systems. The current implementation sets block time to 800ms.
A Verifiable Delay Function is conceptually a water clock whose water marks
can be recorded and later verified, proving that the water most certainly passed
through. Anatoly describes the water clock analogy in detail here:
[water clock
analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)
The same technique has been used in Bitcoin since day one. The Bitcoin feature
is called nLocktime and it can be used to postdate transactions using block
height instead of a timestamp. As a Bitcoin client, you'd use block height
instead of a timestamp if you don't trust the network. Block height turns out
to be an instance of what's being called a Verifiable Delay Function in
cryptography circles. It's a cryptographically secure way to say time has
passed. In Solana, we use a far more granular verifiable delay function, a SHA
256 hash chain, to checkpoint the ledger and coordinate consensus. With it, we
implement Optimistic Concurrency Control and are now well en route towards that
theoretical limit of 710,000 transactions per second.
## Proof of History
[Proof of History
overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)
### Relationship to Consensus Mechanisms
Most confusingly, a Proof of History (PoH) is more similar to a Verifiable
Delay Function (VDF) than a Proof of Work or Proof of Stake consensus
mechanism. The name unfortunately requires some historical context to
understand. Proof of History was developed by Anatoly Yakovenko in November of
2017, roughly 6 months before we saw a [paper using the term
VDF](https://eprint.iacr.org/2018/601.pdf). At that time, it was commonplace to
publish new proofs of some desirable property used to build most any blockchain
component. Some time shortly after, the crypto community began charting out all
the different consensus mechanisms and because most of them started with "Proof
of", the prefix became synonymous with a "consensus" suffix. Proof of History
is not a consensus mechanism, but it is used to improve the performance of
Solana's Proof of Stake consensus. It is also used to improve the performance
of the replication and storage protocols. To minimize confusion, Solana may
rebrand PoH to some flavor of the term VDF.
What's happening under the hood is that entries are streamed to validators as
quickly as a leader node can batch a set of valid transactions into an entry.
Validators process those entries long before it is time to vote on their
validity. By processing the transactions optimistically, there is effectively
no delay between the time the last entry is received and the time when the node
can vote. In the event consensus is **not** achieved, a node simply rolls back
its state. This optimistic processing technique was introduced in 1981 and
called [Optimistic Concurrency
Control](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). It
can be applied to blockchain architecture where a cluster votes on a hash that
represents the full ledger up to some *block height*. In Solana, it is
implemented trivially using the last entry's PoH hash.
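
A minimal sketch of the optimistic processing described above, assuming a toy account model. The `Bank` class, its method names, and the `(sender, receiver, amount)` transaction shape are all hypothetical and chosen only to show the apply-then-roll-back pattern; they are not taken from the Solana codebase.

```python
from typing import Dict, List, Tuple

# Hypothetical transaction shape: (sender, receiver, amount).
Transfer = Tuple[str, str, int]


class Bank:
    def __init__(self, balances: Dict[str, int]) -> None:
        self.balances = dict(balances)
        # State as of the last block that reached consensus.
        self._checkpoint = dict(balances)

    def process_entry(self, entry: List[Transfer]) -> None:
        """Apply transactions as soon as the entry arrives, before any vote."""
        for sender, receiver, amount in entry:
            if self.balances.get(sender, 0) >= amount:
                self.balances[sender] -= amount
                self.balances[receiver] = self.balances.get(receiver, 0) + amount

    def finish_block(self, consensus_reached: bool) -> None:
        """Keep the optimistic state on success, otherwise roll it back."""
        if consensus_reached:
            self._checkpoint = dict(self.balances)
        else:
            self.balances = dict(self._checkpoint)


bank = Bank({"alice": 10, "bob": 0})
bank.process_entry([("alice", "bob", 4)])
bank.finish_block(consensus_reached=True)     # state is kept
bank.process_entry([("alice", "bob", 6)])
bank.finish_block(consensus_reached=False)    # state is rolled back
print(bank.balances)
```

Copying the whole balance map at each checkpoint keeps the sketch short; a real system would more likely track only the accounts touched since the last confirmed block.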
### Relationship to VDFs
The Proof of History technique was first described for use in blockchain by
Solana in November of 2017. In June of the following year, a similar technique
was described at Stanford and called a [verifiable delay
function](https://eprint.iacr.org/2018/601.pdf) or *VDF*.
A desirable property of a VDF is that verification is very fast. With Solana's
approach, the time to verify the delay function is proportional to the time it took to
create it. Split over a 4000 core GPU, it is sufficiently fast for Solana's
@ -90,13 +62,26 @@ just the subset with certain performance characteristics. Until that's
resolved, Solana will likely continue using the term PoH for its
application-specific VDF.
Another difference between PoH and VDFs used only for tracking duration, is
that PoH's hash chain includes hashes of any data the application observed.
That data is a double-edged sword. On one side, the data "proves history" -
that the data most certainly existed before the hashes after it. On the other side, it
means the application can manipulate the hash chain by changing *when* the data
is hashed. The PoH chain therefore does not serve as a good source of
randomness whereas a VDF without that data could. Solana's [leader rotation
algorithm](#leader-rotation), for example, is derived only from the VDF
*height* and not its hash at that height.
Another difference between PoH and VDFs is that a VDF is used only for tracking
duration. PoH's hash chain, on the other hand, includes hashes of any data the
application observed. That data is a double-edged sword. On one side, the data
"proves history" - that the data most certainly existed before the hashes after
it. On the other side, it means the application can manipulate the hash chain by changing
*when* the data is hashed. The PoH chain therefore does not serve as a good
source of randomness whereas a VDF without that data could. Solana's [leader
rotation algorithm](#leader-rotation), for example, is derived only from the
VDF *height* and not its hash at that height.
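
To illustrate why the hash is a poor randomness source while the height is not affected, here is a small sketch. The function `chain_hash` and the way data is mixed in (appended to the state before hashing) are assumptions made for the example, not Solana's actual scheme.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain_hash(seed: bytes, height: int, data: bytes, mix_at: int) -> bytes:
    """Hash `height` times from `seed`, mixing `data` in at step `mix_at` (1-based)."""
    state = seed
    for step in range(1, height + 1):
        # Assumption: data is appended to the state at the chosen step.
        state = sha256(state + data) if step == mix_at else sha256(state)
    return state


seed = sha256(b"genesis")
# Same data, same final height, but the application chose a different moment
# to hash the data into the chain.
early = chain_hash(seed, 100, b"tx", mix_at=10)
late = chain_hash(seed, 100, b"tx", mix_at=90)
print(early != late)
```

Both chains end at height 100, so anything derived from the height alone is the same in either case, whereas anything derived from the hash can be influenced by whoever decides when the data is mixed in.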
### Relationship to Consensus Mechanisms
Proof of History is not a consensus mechanism, but it is used to improve the
performance of Solana's Proof of Stake consensus. It is also used to improve
the performance of the data plane and replication protocols.
### More on Proof of History
* [water clock
analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)
* [Proof of History
overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)

@ -18,9 +18,14 @@ A fraction of a [block](#block); the smallest unit sent between
#### block
A contiguous set of [entries](#entry) on the ledger covered by a [vote](#ledger-vote).
The duration of a block is some number of [ticks](#tick), configured via the
[control plane](#control-plane). Also called [voting period](#voting-period).
A contiguous set of [entries](#entry) on the ledger covered by a
[vote](#ledger-vote). The duration of a block is some cluster-configured
number of [ticks](#tick). Also called [voting period](#voting-period).
#### block height
The number of [blocks](#block) beneath the current block plus one. The [genesis
block](#genesis-block), for example, has block height 1.
#### bootstrap leader
@ -153,6 +158,10 @@ A computer participating in a [cluster](#cluster).
The number of [fullnodes](#fullnode) participating in a [cluster](#cluster).
#### PoH
See [Proof of History](#proof-of-history).
#### program
The code that interprets [instructions](#instruction).
@ -161,6 +170,13 @@ The code that interprets [instructions](#instruction).
The public key of the [account](#account) containing a [program](#program).
#### Proof of History
A stack of proofs, each of which proves that some data existed before the proof
was created and that a precise duration of time passed since the previous
proof. Like a [VDF](#verifiable-delay-function), a Proof of History can be
verified in less time than it took to produce.
#### public key
The public key of a [keypair](#keypair).
@ -224,6 +240,15 @@ A set of [transactions](#transaction) that may be executed in parallel.
The role of a [fullnode](#fullnode) when it is validating the
[leader's](#leader) latest [entries](#entry).
#### VDF
See [verifiable delay function](#verifiable-delay-function).
#### verifiable delay function
A function that takes a fixed amount of time to execute and produces a proof
that it ran, which can then be verified in less time than it took to produce.
#### vote
See [ledger vote](#ledger-vote).