Rewrite synchronization chapter (#2156)
* Rewrite synchronization chapter
* Add synchronization terminology
parent f6e3464ab9
commit 483f6702a6
@ -1,84 +1,56 @@
# Synchronization

It's possible for a centralized database to process 710,000 transactions per
second on a standard gigabit network if the transactions are, on average, no
more than 176 bytes. A centralized database can also replicate itself and
maintain high availability without significantly compromising that transaction
rate using the distributed system technique known as Optimistic Concurrency
Control [\[H.T.Kung, J.T.Robinson
(1981)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). At
Solana, we're demonstrating that these same theoretical limits apply just as
well to blockchain on an adversarial network. The key ingredient? Finding a way
to share time when nodes can't trust one another. Once nodes can trust time,
suddenly ~40 years of distributed systems research becomes applicable to
blockchain!
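
The 710,000 figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes that "gigabit" means 10^9 bits per second and that the link carries nothing but transaction payloads (no protocol overhead); both are simplifying assumptions, not statements from the text.

```python
# Back-of-the-envelope check of the throughput figure quoted above.
# Assumption: "gigabit" means 10**9 bits per second, with no protocol overhead.
LINK_BITS_PER_SECOND = 1_000_000_000
TRANSACTION_BYTES = 176  # average transaction size given in the text


def max_transactions_per_second(link_bits_per_second: int, transaction_bytes: int) -> int:
    """Upper bound on transactions per second if the link carries only transactions."""
    bytes_per_second = link_bits_per_second // 8
    return bytes_per_second // transaction_bytes


tps = max_transactions_per_second(LINK_BITS_PER_SECOND, TRANSACTION_BYTES)
print(tps)  # 710227, i.e. roughly 710,000
```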

Fast, reliable synchronization is the biggest reason Solana is able to achieve
such high throughput. Traditional blockchains synchronize on large chunks of
transactions called blocks. By synchronizing on blocks, a transaction cannot be
processed until a duration called "block time" has passed. In Proof of Work
consensus, these block times need to be very large (~10 minutes) to minimize
the odds of multiple fullnodes producing a new valid block at the same time.
There's no such constraint in Proof of Stake consensus, but without reliable
timestamps, a fullnode cannot determine the order of incoming blocks. The
popular workaround is to tag each block with a [wallclock
timestamp](https://en.bitcoin.it/wiki/Block_timestamp). Because of clock drift
and variance in network latencies, the timestamp is only accurate within an
hour or two. To work around the workaround, these systems lengthen block times
to provide reasonable certainty that the median timestamp on each block is
always increasing.

> Perhaps the most striking difference between algorithms obtained by our
> method and ones based upon timeout is that using timeout produces a
> traditional distributed algorithm in which the processes operate
> asynchronously, while our method produces a globally synchronous one in which
> every process does the same thing at (approximately) the same time. Our
> method seems to contradict the whole purpose of distributed processing, which
> is to permit different processes to operate independently and perform
> different functions. However, if a distributed system is really a single
> system, then the processes must be synchronized in some way. Conceptually,
> the easiest way to synchronize processes is to get them all to do the same
> thing at the same time. Therefore, our method is used to implement a kernel
> that performs the necessary synchronization--for example, making sure that
> two different processes do not try to modify a file at the same time.
> Processes might spend only a small fraction of their time executing the
> synchronizing kernel; the rest of the time, they can operate
> independently--e.g., accessing different files. This is an approach we have
> advocated even when fault-tolerance is not required. The method's basic
> simplicity makes it easier to understand the precise properties of a system,
> which is crucial if one is to know just how fault-tolerant the system is.
> [\[L.Lamport
> (1984)\]](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.1078)

Solana takes a very different approach, which it calls *Proof of History* or
*PoH*. Leader nodes "timestamp" blocks with cryptographic proofs that some
duration of time has passed since the last proof. All data hashed into the
proof most certainly existed before the proof was generated. The node
then shares the new block with validator nodes, which are able to verify those
proofs. The blocks can arrive at validators in any order or could even be
replayed years later. With such reliable synchronization guarantees, Solana is
able to break blocks into smaller batches of transactions called *entries*.
Entries are streamed to validators in real time, before any notion of block
consensus.
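
To make the idea concrete, here is a minimal sketch of a SHA-256 hash chain that mixes data into its state and can be re-verified by anyone holding the starting hash. It is an illustration only, not Solana's implementation: the `Entry` fields, the `next_entry` and `verify` helpers, the seed, and the hash counts are all hypothetical names and values chosen for this example.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass
class Entry:
    """Hypothetical entry layout, for illustration only."""
    num_hashes: int        # sequential hashes performed since the previous entry
    data: Optional[bytes]  # data mixed into the chain at this point, if any
    digest: bytes          # state of the hash chain after this entry


def next_entry(prev_digest: bytes, num_hashes: int, data: Optional[bytes] = None) -> Entry:
    """Extend the chain by num_hashes sequential hashes; the last hash mixes in data."""
    if num_hashes < 1:
        raise ValueError("num_hashes must be at least 1")
    digest = prev_digest
    for _ in range(num_hashes - 1):
        digest = sha256(digest)
    digest = sha256(digest + data) if data is not None else sha256(digest)
    return Entry(num_hashes, data, digest)


def verify(start_digest: bytes, entries: List[Entry]) -> bool:
    """Recompute every entry from its predecessor's digest and compare."""
    prev = start_digest
    for entry in entries:
        if next_entry(prev, entry.num_hashes, entry.data).digest != entry.digest:
            return False
        prev = entry.digest
    return True


# A leader produces entries; some carry data, some only mark elapsed hashing.
seed = sha256(b"hypothetical genesis seed")
entries: List[Entry] = []
prev = seed
for data in (None, b"transaction batch 1", None, b"transaction batch 2"):
    entry = next_entry(prev, 1000, data)
    entries.append(entry)
    prev = entry.digest

print(verify(seed, entries))  # True
```

Because each entry in this sketch is checked using only the previous entry's digest, the entries could be verified independently of one another rather than strictly in sequence.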

## Verifiable Delay Functions

Solana technically never sends a *block*, but uses the term to describe the
sequence of entries that fullnodes vote on to achieve *confirmation*. In that
way, Solana's confirmation times can be compared apples to apples with
block-based systems. The current implementation sets block time to 800ms.

A Verifiable Delay Function is conceptually a water clock where its water marks
can be recorded and later verified that the water most certainly passed
through. Anatoly describes the water clock analogy in detail here:

[water clock
analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)

The same technique has been used in Bitcoin since day one. The Bitcoin feature
is called nLockTime and it can be used to postdate transactions using block
height instead of a timestamp. As a Bitcoin client, you'd use block height
instead of a timestamp if you don't trust the network. Block height turns out
to be an instance of what's being called a Verifiable Delay Function in
cryptography circles. It's a cryptographically secure way to say time has
passed. In Solana, we use a far more granular verifiable delay function, a
SHA-256 hash chain, to checkpoint the ledger and coordinate consensus. With it,
we implement Optimistic Concurrency Control and are now well en route towards
that theoretical limit of 710,000 transactions per second.

## Proof of History

[Proof of History
overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)

### Relationship to Consensus Mechanisms

Most confusingly, a Proof of History (PoH) is more similar to a Verifiable
Delay Function (VDF) than a Proof of Work or Proof of Stake consensus
mechanism. The name unfortunately requires some historical context to
understand. Proof of History was developed by Anatoly Yakovenko in November of
2017, roughly 6 months before we saw a [paper using the term
VDF](https://eprint.iacr.org/2018/601.pdf). At that time, it was commonplace to
publish new proofs of some desirable property used to build most any blockchain
component. Some time shortly after, the crypto community began charting out all
the different consensus mechanisms and because most of them started with "Proof
of", the prefix became synonymous with a "consensus" suffix. Proof of History
is not a consensus mechanism, but it is used to improve the performance of
Solana's Proof of Stake consensus. It is also used to improve the performance
of the replication and storage protocols. To minimize confusion, Solana may
rebrand PoH to some flavor of the term VDF.

What's happening under the hood is that entries are streamed to validators as
quickly as a leader node can batch a set of valid transactions into an entry.
Validators process those entries long before it is time to vote on their
validity. By processing the transactions optimistically, there is effectively
no delay between the time the last entry is received and the time when the node
can vote. In the event consensus is **not** achieved, a node simply rolls back
its state. This optimistic processing technique was introduced in 1981 and
called [Optimistic Concurrency
Control](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.65.4735). It
can be applied to blockchain architecture where a cluster votes on a hash that
represents the full ledger up to some *block height*. In Solana, it is
implemented trivially using the last entry's PoH hash.
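
The process-then-roll-back pattern described above can be sketched in a few lines. This is a toy model under stated assumptions, not Solana's code: the `OptimisticBank` class, the `(source, destination, amount)` transaction shape, and the `on_vote_result` callback are hypothetical names invented for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical transaction shape for this sketch: (source, destination, amount).
Transfer = Tuple[str, str, int]


class OptimisticBank:
    """Applies entries as they arrive and rolls back if the vote fails."""

    def __init__(self, balances: Dict[str, int]) -> None:
        self.confirmed = dict(balances)  # state as of the last confirmed block
        self.working = dict(balances)    # state after optimistic processing

    def process_entry(self, entry: List[Transfer]) -> None:
        """Apply an entry immediately, before any vote has taken place."""
        for source, destination, amount in entry:
            if self.working.get(source, 0) >= amount:
                self.working[source] -= amount
                self.working[destination] = self.working.get(destination, 0) + amount

    def on_vote_result(self, consensus_reached: bool) -> None:
        """Commit the optimistic state, or discard it if consensus was not achieved."""
        if consensus_reached:
            self.confirmed = dict(self.working)
        else:
            self.working = dict(self.confirmed)


bank = OptimisticBank({"alice": 10, "bob": 0})
bank.process_entry([("alice", "bob", 4)])  # processed as soon as it is received
bank.process_entry([("alice", "bob", 3)])
bank.on_vote_result(consensus_reached=False)  # vote failed: roll back
print(bank.working)  # {'alice': 10, 'bob': 0}
```

By the time a vote is due, the working state already reflects every received entry, so voting needs no further processing; the cost of a failed vote is a single rollback to the confirmed state.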

### Relationship to VDFs

The Proof of History technique was first described for use in blockchain by
Solana in November of 2017. In June of the following year, a similar technique
was described at Stanford and called a [verifiable delay
function](https://eprint.iacr.org/2018/601.pdf) or *VDF*.

A desirable property of a VDF is that verification time is very fast. Solana's
approach to verifying its delay function is proportional to the time it took to
create it. Split over a 4000 core GPU, it is sufficiently fast for Solana's

@ -90,13 +62,26 @@ just the subset with certain performance characteristics. Until that's
resolved, Solana will likely continue using the term PoH for its
application-specific VDF.

Another difference between PoH and VDFs used only for tracking duration, is
that PoH's hash chain includes hashes of any data the application observed.
That data is a double-edged sword. On one side, the data "proves history" -
that the data most certainly existed before hashes after it. On the other side,
it means the application can manipulate the hash chain by changing *when* the
data is hashed. The PoH chain therefore does not serve as a good source of
randomness whereas a VDF without that data could. Solana's [leader rotation
algorithm](#leader-rotation), for example, is derived only from the VDF
*height* and not its hash at that height.

Another difference between PoH and VDFs is that a VDF is used only for tracking
duration. PoH's hash chain, on the other hand, includes hashes of any data the
application observed. That data is a double-edged sword. On one side, the data
"proves history" - that the data most certainly existed before hashes after it.
On the other side, it means the application can manipulate the hash chain by
changing *when* the data is hashed. The PoH chain therefore does not serve as a
good source of randomness whereas a VDF without that data could. Solana's
[leader rotation algorithm](#leader-rotation), for example, is derived only
from the VDF *height* and not its hash at that height.

### Relationship to Consensus Mechanisms

Proof of History is not a consensus mechanism, but it is used to improve the
performance of Solana's Proof of Stake consensus. It is also used to improve
the performance of the data plane and replication protocols.

### More on Proof of History

* [water clock
analogy](https://medium.com/solana-labs/proof-of-history-explained-by-a-water-clock-e682183417b8)

* [Proof of History
overview](https://medium.com/solana-labs/proof-of-history-a-clock-for-blockchain-cf47a61a9274)

@ -18,9 +18,14 @@ A fraction of a [block](#block); the smallest unit sent between

#### block

A contiguous set of [entries](#entry) on the ledger covered by a [vote](#ledger-vote).
The duration of a block is some number of [ticks](#tick), configured via the
[control plane](#control-plane). Also called [voting period](#voting-period).

A contiguous set of [entries](#entry) on the ledger covered by a
[vote](#ledger-vote). The duration of a block is some cluster-configured
number of [ticks](#tick). Also called [voting period](#voting-period).

#### block height

The number of [blocks](#block) beneath the current block plus one. The [genesis
block](#genesis-block), for example, has block height 1.

#### bootstrap leader

@ -153,6 +158,10 @@ A computer particpating in a [cluster](#cluster).

The number of [fullnodes](#fullnode) participating in a [cluster](#cluster).

#### PoH

See [Proof of History](#proof-of-history).

#### program

The code that interprets [instructions](#instruction).

@ -161,6 +170,13 @@ The code that interprets [instructions](#instruction).

The public key of the [account](#account) containing a [program](#program).

#### Proof of History

A stack of proofs, each of which proves that some data existed before the proof
was created and that a precise duration of time passed since the previous
proof. Like a [VDF](#verifiable-delay-function), a Proof of History can be
verified in less time than it took to produce.

#### public key

The public key of a [keypair](#keypair).

@ -224,6 +240,15 @@ A set of [transactions](#transaction) that may be executed in parallel.
The role of a [fullnode](#fullnode) when it is validating the
[leader's](#leader) latest [entries](#entry).

#### VDF

See [verifiable delay function](#verifiable-delay-function).

#### verifiable delay function

A function that takes a fixed amount of time to execute that produces a proof
that it ran, which can then be verified in less time than it took to produce.

#### vote

See [ledger vote](#ledger-vote).