From b5a80d3d49aea48c9455f3f88b50cff9b7ebd7cd Mon Sep 17 00:00:00 2001
From: Greg Fitzgerald
Date: Fri, 7 Dec 2018 16:52:36 -0700
Subject: [PATCH] Update ledger replication chapter (#2029)

* ledger block -> ledger segment

The book already defines a *block* to be a slight variation of how
blockchains define it. It's the thing the cluster confirms should be the
next set of transactions on the ledger.

* Boot storage description from the book
---
 book/src/fullnode.md    |   2 +-
 book/src/storage.md     | 112 ----------------------------------------
 book/src/terminology.md |  58 +++++++++++++++++++--
 rfcs/0003-storage.md    |  72 ++++++++++++--------------
 4 files changed, 88 insertions(+), 156 deletions(-)

diff --git a/book/src/fullnode.md b/book/src/fullnode.md
index 066fdc95f..d35e0bc3f 100644
--- a/book/src/fullnode.md
+++ b/book/src/fullnode.md
@@ -1,4 +1,4 @@
-# Fullnode
+# Anatomy of a Fullnode

 Fullnode block diagrams

diff --git a/book/src/storage.md b/book/src/storage.md
index 395d93f29..f45d1bded 100644
--- a/book/src/storage.md
+++ b/book/src/storage.md
@@ -1,114 +1,2 @@
 # Ledger Replication

-## Background
-
-At full capacity on a 1gbps network Solana would generate 4 petabytes of data
-per year. If each fullnode was required to store the full ledger, the cost of
-storage would discourage fullnode participation, thus centralizing the network
-around those that could afford it. Solana aims to keep the cost of a fullnode
-below $5,000 USD to maximize participation. To achieve that, the network needs
-to minimize redundant storage while at the same time ensuring the validity and
-availability of each copy.
-
-To trust storage of ledger segments, Solana has *replicators* periodically
-submit proofs to the network that the data was replicated. Each proof is called
-a Proof of Replication. The basic idea is to encrypt a dataset with a public
-symmetric key and then hash the encrypted dataset. Solana uses [CBC
-encryption](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Cipher_Block_Chaining_(CBC)).
-To prevent a malicious replicator from deleting the data as soon as it's
-hashed, a replicator is required to hash random segments of the dataset.
-Alternatively, Solana could require hashing the reverse of the encrypted data,
-but random sampling is sufficient and much faster. Either solution ensures
-that all the data is present during the generation of the proof and also
-requires the validator to have the entirety of the encrypted data present for
-verification of every proof of every identity. The space required to validate
-is:
-
-``` number_of_proofs * data_size ```
-
-## Optimization with PoH
-
-Solana is not the only distributed systems project using Proof of Replication,
-but it might be the most efficient implementation because of its ability to
-synchronize nodes with its Proof of History. With PoH, Solana is able to record
-a hash of the PoRep samples in the ledger. Thus the blocks stay in the exact
-same order for every PoRep and verification can stream the data and verify all
-the proofs in a single batch. This way Solana can verify multiple proofs
-concurrently, each one on its own GPU core. With the current generation of
-graphics cards our network can support up to 14,000 replication identities or
-symmetric keys. The total space required for verification is:
-
-``` 2 CBC_blocks * number_of_identities ```
-
-with core count equal to (Number of Identities). A CBC block is expected to
-be 1MB in size.
-
-## Network
-
-Validators for PoRep are the same validators that verify transactions. They
-have put up stake as collateral to ensure that their work is honest. If you
-can prove that a validator verified a fake PoRep, then the validator's stake
-is slashed.
-
-Replicators are specialized light clients. They download a part of the ledger,
-store it, and provide proofs of storing the ledger. For each verified proof,
-replicators are rewarded tokens from the mining pool.
-
-## Constraints
-
-Solana's PoRep protocol introduces the following constraints:
-
-* At most 14,000 replication identities can be used, because that is how many
-  GPU cores are currently available to a computer costing under $5,000 USD.
-* Verification requires generating the CBC blocks. That requires space of 2
-  blocks per identity, and 1 GPU core per identity for the same dataset. As
-many identities as possible are batched, and the proofs for those identities
-are verified concurrently for the same dataset.
-
-## Validation and Replication Protocol
-
-1. The network sets a replication target number, let's say 1k. 1k PoRep
-   identities are created from signatures of a PoH hash. They are tied to a
-specific PoH hash. It doesn't matter who creates them; they could simply be
-the last 1k validation signatures seen for the ledger at that count. This may
-be just the initial batch of identities, because identity rotation should be
-staggered.
-2. Any client can use any of these identities to create PoRep proofs.
-   Replicator identities are the CBC encryption keys.
-3. Periodically at a specific PoH count, a replicator that wants to create
-   PoRep proofs signs the PoH hash at that count. That signature is the seed
-used to pick the block and identity to replicate. A block is 1TB of ledger.
-4. Periodically at a specific PoH count, a replicator submits PoRep proofs for
-   its selected block. A signature of the PoH hash at that count is the seed
-used to sample the 1TB encrypted block, and hash it. This is done faster than
-it takes to encrypt the 1TB block with the original identity.
-5. Replicators must submit some number of fake proofs, which they can prove to
-   be fake by providing the seed for the hash result.
-6. Periodically at a specific PoH count, validators sign the hash and use the
-   signature to select the 1TB block that they need to validate. They batch all
-the identities and proofs and submit approval for all the verified ones.
-7. After #6, replicator clients submit the proofs of fake proofs.
-
-For any random seed, Solana requires everyone to use a signature that is
-derived from a PoH hash. Every node uses the same count so that the same PoH
-hash is signed by every participant. The signatures are then each
-cryptographically tied to the keypair, which prevents a leader from grinding on
-the resulting value for more than 1 identity.
-
-Key rotation is *staggered*. Once underway, the next identity is generated by
-hashing itself with a PoH hash.
-
-Since there are many more client identities than encryption identities, the
-reward is split among multiple clients to prevent Sybil attacks from generating
-many clients to acquire the same block of data. To remain BFT, the network
-needs to prevent a single human entity from storing all the replications of a
-single chunk of the ledger.
-
-Solana's solution to this is to require clients to continue using the same
-identity. If the first round is used to acquire the same block for many client
-identities, the second round for the same client identities will require a
-redistribution of the signatures, and therefore PoRep identities and blocks.
-Thus clients are not rewarded for storing the first block. The network
-rewards long-lived client identities more than new ones.
-
diff --git a/book/src/terminology.md b/book/src/terminology.md
index 20bc1b09f..cf4bcfd1a 100644
--- a/book/src/terminology.md
+++ b/book/src/terminology.md
@@ -145,8 +145,9 @@ The public key of a [keypair](#keypair).

 #### replicator

-A type of [client](#client) that stores copies of segments of the
-[ledger](#ledger).
+A type of [client](#client) that stores [ledger](#ledger) segments and
+periodically submits storage proofs to the cluster; not a
+[fullnode](#fullnode).

 #### secret key

@@ -154,8 +155,8 @@ The private key of a [keypair](#keypair).

 #### slot

-The time (i.e. number of [blocks](#block)) for which a [leader](#leader) ingests
-transactions and produces [entries](#entry).
+The time (i.e. number of [blocks](#block)) for which a [leader](#leader)
+ingests transactions and produces [entries](#entry).

 #### sol

@@ -215,13 +216,29 @@ for potential future use.

 A fraction of a [block](#block); the smallest unit sent between
 [fullnodes](#fullnode).

+#### CBC block
+
+The smallest encrypted chunk of ledger; an encrypted ledger segment is made of
+many CBC blocks, `ledger_segment_size / cbc_block_size` to be exact.
+
 #### curio

 A scarce, non-fungible member of a set of curios.

 #### epoch

-The time, i.e. number of [slots](#slot), for which a [leader schedule](#leader-schedule) is valid.
+The time, i.e. number of [slots](#slot), for which a [leader
+schedule](#leader-schedule) is valid.
+
+#### fake storage proof
+
+A proof with the same format as a storage proof, but whose SHA state comes
+from hashing a known ledger value, which the storage client can reveal and
+which the network can easily verify on-chain.
+
+#### ledger segment
+
+A sequence of [blocks](#block).

 #### light client

@@ -237,6 +254,37 @@ Millions of [instructions](#instruction) per second.

 The component of a [fullnode](#fullnode) responsible for [program](#program)
 execution.

+#### storage proof
+
+A set of SHA hash states which is constructed by sampling the encrypted version
+of the stored [ledger segment](#ledger-segment) at certain offsets.
+
+#### storage proof challenge
+
+A [transaction](#transaction) from a [replicator](#replicator) that verifiably
+proves that a [validator](#validator) [confirmed](#storage-proof-confirmation)
+a [fake proof](#fake-storage-proof).
+
+#### storage proof claim
+
+A [transaction](#transaction) from a [validator](#validator), submitted after
+the timeout period following a [storage proof
+confirmation](#storage-proof-confirmation) during which no successful
+[challenges](#storage-proof-challenge) have been observed, which rewards the
+parties of the [storage proofs](#storage-proof) and confirmations.
+
+#### storage proof confirmation
+
+A [transaction](#transaction) from a [validator](#validator) which indicates
+the set of [real](#storage-proof) and [fake proofs](#fake-storage-proof)
+submitted by a [replicator](#replicator). The transaction contains a list of
+proof hash values and a bit for each hash indicating whether it is valid or
+fake.
+
+#### storage validation capacity
+
+The number of keys and samples that a [validator](#validator) can verify each
+storage epoch.
+
 #### thin client

 A type of [client](#client) that trusts it is communicating with a valid
diff --git a/rfcs/0003-storage.md b/rfcs/0003-storage.md
index b0d602c4f..87f561672 100644
--- a/rfcs/0003-storage.md
+++ b/rfcs/0003-storage.md
@@ -1,11 +1,19 @@
-# Storage
+# Ledger Replication

-The goal of this RFC is to define a protocol for storing a very large ledger
-over a p2p network that is verified by solana validators. At full capacity on
-a 1gbps network solana will generate 4 petabytes of data per year. To prevent
-the network from centralizing around full nodes that have to store the full
-data set this protocol proposes a way for mining nodes to provide storage
-capacity for pieces of the network.
+At full capacity on a 1gbps network Solana will generate 4 petabytes of data
+per year. To prevent the network from centralizing around full nodes that have
+to store the full data set, this protocol proposes a way for mining nodes to
+provide storage capacity for pieces of the network.
+
+The basic idea of Proof of Replication is to encrypt a dataset with a public
+symmetric key using CBC encryption, then hash the encrypted dataset. The main
+problem with the naive approach is that a dishonest storage node can stream the
+encryption and delete the data as it's hashed. The simple solution is to force
+the hash to be done on the reverse of the encryption, or perhaps with a random
+order. This ensures that all the data is present during the generation of the
+proof and it also requires the validator to have the entirety of the encrypted
+data present for verification of every proof of every identity. So the space
+required to validate is `number_of_proofs * data_size`.
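+
+As a rough illustration, here is a minimal sketch of the naive
+encrypt-then-hash proof in Rust. The `chacha_cbc_encrypt` helper and the key
+and segment types are hypothetical placeholders, not part of this protocol's
+implementation; only the `sha2` crate's hashing API is assumed:
+
+```rust
+use sha2::{Digest, Sha256};
+
+// Hypothetical stand-in for CBC encryption with the chacha block size
+// described below.
+fn chacha_cbc_encrypt(key: &[u8; 32], segment: &[u8]) -> Vec<u8> {
+    unimplemented!()
+}
+
+// Naive PoRep: encrypt the whole segment, then hash the whole ciphertext.
+// A dishonest node can compute this in one streaming pass and then discard
+// the data, which is why the proof must instead force the full ciphertext
+// to be present.
+fn naive_proof(key: &[u8; 32], segment: &[u8]) -> [u8; 32] {
+    let ciphertext = chacha_cbc_encrypt(key, segment);
+    let mut hasher = Sha256::new();
+    hasher.update(&ciphertext);
+    hasher.finalize().into()
+}
+```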

 ## Definitions

@@ -14,20 +22,20 @@ #### replicator

 Storage mining client, stores some part of the ledger enumerated in blocks and
 submits storage proofs to the chain. Not a full-node.

-#### ledger block
+#### ledger segment

 Portion of the ledger which is downloaded by the replicator where storage proof
 data is derived.

 #### CBC block

-Smallest encrypted chunk of ledger, an encrypted ledger block would be made of
-many CBC blocks. `(size of ledger block) / (size of cbc block)` to be exact.
+The smallest encrypted chunk of ledger; an encrypted ledger segment is made of
+many CBC blocks, `ledger_segment_size / cbc_block_size` to be exact.

 #### storage proof

 A set of sha hash state which is constructed by sampling the encrypted version
-of the stored ledger block at certain offsets.
+of the stored ledger segment at certain offsets.

 #### fake storage proof

@@ -56,28 +64,16 @@ observed which rewards the parties of the storage proofs and confirmations.

 The number of keys and samples that a validator can verify each storage
 epoch.

-## Background
-
-The basic idea to Proof of Replication is encrypting a dataset with a public
-symmetric key using CBC encryption, then hash the encrypted dataset. The main
-problem with the naive approach is that a dishonest storage node can stream the
-encryption and delete the data as its hashed. The simple solution is to force
-the hash to be done on the reverse of the encryption, or perhaps with a random
-order. This ensures that all the data is present during the generation of the
-proof and it also requires the validator to have the entirety of the encrypted
-data present for verification of every proof of every identity. So the space
-required to validate is `(Number of Proofs)*(data size)`
-
 ## Optimization with PoH

-Our improvement on this approach is to randomly sample the encrypted blocks
+Our improvement on this approach is to randomly sample the encrypted segments
 faster than it takes to encrypt, and record the hash of those samples into the
-PoH ledger. Thus the blocks stay in the exact same order for every PoRep and
+PoH ledger. Thus the segments stay in the exact same order for every PoRep and
 verification can stream the data and verify all the proofs in a single batch.
 This way we can verify multiple proofs concurrently, each one on its own CUDA
-core. The total space required for verification is `(1 ledger block) + (2 CBC
-blocks) * (Number of Identities)`, with core count of equal to (Number of
-Identities). We use a 64-byte chacha CBC block size.
+core. The total space required for verification is `1_ledger_segment +
+2_cbc_blocks * number_of_identities` with core count equal to
+`number_of_identities`. We use a 64-byte chacha CBC block size.

 ## Network

@@ -106,8 +102,8 @@ changes to determine what rate it can validate storage proofs.

 ### Constants

-1. NUM\_STORAGE\_ENTRIES: Number of entries in a block of ledger data. The unit
-of storage for a replicator.
+1. NUM\_STORAGE\_ENTRIES: Number of entries in a segment of ledger data. The
+unit of storage for a replicator.
 2. NUM\_KEY\_ROTATION\_TICKS: Number of ticks to save a PoH value and cause a
 key generation for the section of ledger just generated and the rotation of
 another key in the set.
@@ -167,19 +163,19 @@ is:
 2. A replicator obtains the PoH hash corresponding to the last key rotation
 along with its entry\_height.
 3. The replicator signs the PoH hash with its keypair. That signature is the
-seed used to pick the block to replicate and also the encryption key. The
-replicator mods the signature with the entry\_height to get which block to
+seed used to pick the segment to replicate and also the encryption key. The
+replicator mods the signature with the entry\_height to get which segment to
 replicate.
 4. The replicator retrieves the ledger by asking peer validators and
 replicators. See 6.5.
-5. The replicator then encrypts that block with the key with chacha algorithm
+5. The replicator then encrypts that segment with the key using the chacha algorithm
 in CBC mode with NUM\_CHACHA\_ROUNDS of encryption.
 6. The replicator initializes a chacha rng with the signature from step 2 as
 the seed.
 7. The replicator generates NUM\_STORAGE\_SAMPLES samples in the range of the
-entry size and samples the encrypted block with sha256 for 32-bytes at each
+entry size and samples the encrypted segment with sha256 for 32 bytes at each
 offset value. Sampling the state should be faster than generating the encrypted
-block.
+segment.
 8. The replicator sends a PoRep proof transaction which contains its sha state
 at the end of the sampling operation, its seed and the samples it used to the
 current leader and it is put onto the ledger.
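+
+A minimal sketch of the sampling in steps 6 and 7, assuming the `rand`,
+`rand_chacha`, and `sha2` crates as stand-ins for the chacha rng and sha256
+described above; the constant value and function name are illustrative only:
+
+```rust
+use rand::{Rng, SeedableRng};
+use rand_chacha::ChaCha20Rng;
+use sha2::{Digest, Sha256};
+
+const NUM_STORAGE_SAMPLES: usize = 128; // illustrative value
+
+// Step 6: seed a chacha rng with the replicator's signature.
+// Step 7: fold 32-byte windows of the encrypted segment, at rng-chosen
+// offsets, into a single sha256 state.
+fn sample_proof(seed: [u8; 32], encrypted_segment: &[u8]) -> [u8; 32] {
+    let mut rng = ChaCha20Rng::from_seed(seed);
+    let mut hasher = Sha256::new();
+    for _ in 0..NUM_STORAGE_SAMPLES {
+        let offset = rng.gen_range(0..encrypted_segment.len() - 32);
+        hasher.update(&encrypted_segment[offset..offset + 32]);
+    }
+    hasher.finalize().into()
+}
+```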
@@ -198,9 +194,9 @@ frozen.

 ### Finding who has a given block of ledger

 1. Validators monitor the transaction stream for storage mining proofs, and
-keep a mapping of ledger blocks by entry\_height to public keys. When it sees a
-storage mining proof it updates this mapping and provides an RPC interface
-which takes an entry\_height and hands back a list of public keys. The client
+keep a mapping of ledger segments by entry\_height to public keys. When they
+see a storage mining proof they update this mapping and provide an RPC
+interface which takes an entry\_height and hands back a list of public keys.
+The client
 then looks up in its cluster\_info table to see which network address that
 corresponds to and sends a repair request to retrieve the necessary blocks of
 ledger.
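+
+One possible shape for this mapping, sketched in Rust; the type and method
+names are hypothetical, not the node's actual API:
+
+```rust
+use std::collections::HashMap;
+
+type Pubkey = [u8; 32];
+
+#[derive(Default)]
+struct StorageIndex {
+    replicators_by_height: HashMap<u64, Vec<Pubkey>>,
+}
+
+impl StorageIndex {
+    // Called for each storage mining proof seen in the transaction stream.
+    fn record_proof(&mut self, entry_height: u64, replicator: Pubkey) {
+        self.replicators_by_height
+            .entry(entry_height)
+            .or_default()
+            .push(replicator);
+    }
+
+    // Backs the RPC that hands back a list of public keys for a height.
+    fn storage_pubkeys(&self, entry_height: u64) -> &[Pubkey] {
+        self.replicators_by_height
+            .get(&entry_height)
+            .map(Vec::as_slice)
+            .unwrap_or(&[])
+    }
+}
+```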