book: Start adding Pollard design notes

Some of this content may move into the concepts section, or possibly into a dedicated specification area, but for now the design section includes our choices alongside the reasoning.
2020-12-15 20:52:31 +00:00
parent d51299999c
commit 1baedabd86
8 changed files with 262 additions and 0 deletions
--- a/book/src/SUMMARY.md
+++ b/book/src/SUMMARY.md
@@ -9,3 +9,9 @@
  - [Spending notes](user/spending-notes.md)
  - [Integration into an existing chain](user/integration.md)
 - [Design](design.md)
+  - [Actions](design/actions.md)
+  - [Commitments](design/commitments.md)
+  - [Commitment tree](design/commitment-tree.md)
+  - [Nullifiers](design/nullifiers.md)
+  - [Signatures](design/signatures.md)
+  - [Circuit](design/circuit.md)
--- a/book/src/design.md
+++ b/book/src/design.md
@@ -1 +1,60 @@
 # Design
+
+## General design notes
+
+- Write up design review of Pollard
+  - Requirements, rationale
+  - Open design issues
+  - Future work / non-requirements
+
+- TODO: Enumerate the features that Pollard has.
+
+### Requirements
+
+- Keep the design close to Sapling, while eliminating aspects we don't like.
+
+### Non-requirements
+
+- Delegated proving with privacy from the prover.
+  - We know how to do this, but it would require a discrete log equality proof. The most efficient approach would be to produce that proof together with the RedDSA signature, which means more work for, e.g., hardware wallets.
+
+### Open issues
+
+- Should we have one memo per output, or one memo per transaction, or 0..n memos?
+  - A variable number of memos, or a choice between 1 and n memos, is a potential privacy leak.
+  - Need to consider the privacy issue related to light clients requesting individual memos vs being able to fetch all memos.
+
+### Key structure
+
+Provisional proposal: exactly the same key structure as Sapling.
+
+Group hashing uses the isogeny.
+
+- nsk goes away; `nk` is now a field element
+- TODO: ak / nk split enables splitting the security argument, but could consider merging.
+  Merging would help with ivk derivation perf (though as a commitment now it's pretty cheap)
+- TODO: nullifier computation
+
+ZIP 32 integration
+- Use same Sapling design?
+- Simpler "hardened-only" derivation structure?
+- Improve diversifier integration / documentation
+
+### Note structure
+
+- TODO: UDAs: arbitrary vs whitelisted
+
+### Typed variables vs byte encodings
+
+For Sapling, we have encountered multiple places where the specification uses typed
+variables to define the consensus rules, but the C++ implementation in zcashd relied on
+byte encodings to implement them. This resulted in subtly different consensus rules being
+deployed than were intended, for example where a particular type was not round-trip
+encodable.
+
+In Pollard, we avoid this by defining the consensus rules in terms of the byte encodings
+of all variables, and being explicit about any types that are not round-trip encodable.
+This makes consensus compatibility between strongly-typed implementations (such as this
+crate) and byte-oriented implementations easier to achieve.
+
+- TODO: Do we want to take this approach, or try again for "everything non-canonical"?
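
The "not round-trip encodable" problem described in `design.md` can be illustrated with a minimal sketch. It assumes a toy one-byte field with modulus 251; the `FieldElement` type, the constant `P`, and every function name below are hypothetical and are not taken from this crate or from zcashd.

```rust
/// Toy prime modulus (assumption for illustration; real fields are ~255 bits).
const P: u16 = 251;

/// A typed field element; the inner value is always in `0..P`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FieldElement(u16);

impl FieldElement {
    /// Lenient decoding: reduces the byte modulo `P`, so the encodings
    /// 251..=255 silently alias the elements 0..=4.
    pub fn from_byte_reducing(b: u8) -> Self {
        FieldElement(b as u16 % P)
    }

    /// Strict decoding: accepts only the canonical encoding of each element.
    pub fn from_byte_canonical(b: u8) -> Option<Self> {
        if (b as u16) < P {
            Some(FieldElement(b as u16))
        } else {
            None
        }
    }

    /// Canonical encoding of the element.
    pub fn to_byte(self) -> u8 {
        self.0 as u8
    }
}

/// Returns true if decoding leniently and re-encoding reproduces the input byte.
pub fn round_trips(b: u8) -> bool {
    FieldElement::from_byte_reducing(b).to_byte() == b
}
```

In this sketch the byte `255` and the byte `4` decode to the same typed value under the lenient decoder, so a rule stated on typed values and a rule stated on bytes would disagree about whether the two encodings are "equal". Stating the rule on bytes, or rejecting non-canonical bytes outright, removes that ambiguity.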
--- a/book/src/design/actions.md
+++ b/book/src/design/actions.md
@@ -0,0 +1,21 @@
+# Actions
+
+In Sprout, we had a single proof that represented two spent notes and two new notes.
+In Sapling, we had one proof per spent note and one (much simpler) proof per output note.
+
+For Pollard, we take a middle approach:
+
+Each action is both a spend and an output.
+
+TODO: Decide between the two options (see the nullifier design notes):
+
+- Each action is either a spend or an output.
+- Each action is both a spend and an output.
+
+## Memo fields
+
+TODO: One memo per tx vs one memo per output
--- a/book/src/design/circuit.md
+++ b/book/src/design/circuit.md
@@ -0,0 +1,3 @@
+# Circuit
+
+
--- a/book/src/design/commitment-tree.md
+++ b/book/src/design/commitment-tree.md
@@ -0,0 +1,15 @@
+# Commitment tree
+
+One of the things we learned from Sapling is that having a single global commitment tree makes life hard for light client wallets. When a new note is received, the wallet derives its incremental witness from the state of the global tree at the point when the note's commitment is appended; this incremental witness then needs to be updated with every subsequent commitment in the block, in order. It isn't efficient for a server to pre-compute and send the necessary incremental updates for every new note in a block, and a wallet that requested a specific update from the server would reveal which note it had received.
+
+Pollard addresses this by splitting the commitment tree into several sub-trees:
+
+- Bundle tree, that accumulates the commitments within a single bundle (and thus a single transaction).
+- Block tree, that accumulates the bundle tree roots within a single block.
+- Global tree, that accumulates the block tree roots.
+
+Each of these trees has a fixed depth (necessary for being able to create proofs).
+
+Chains that integrate Pollard can decouple the limits on commitments-per-subtree from higher-layer constraints like block size, by allowing their blocks and transactions to be structured internally as a series of Pollard blocks or transactions (e.g. a Zcash block would contain a `Vec<BlockTreeRoot>`, whose elements are appended in order).
+
+Zcash level: we also bind these roots into the FlyClient history leaves, so that light clients can assert they are valid independently of the full block.
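
The bundle / block / global split described in `commitment-tree.md` can be sketched roughly as follows. This is only an illustration under stated assumptions: `Node`, `EMPTY`, `combine`, `fixed_depth_root`, and `global_root` are hypothetical names, one shared depth is used for all three trees, and the standard library's `DefaultHasher` stands in for a real Merkle hash (it is not cryptographic, and its algorithm is not guaranteed to stay the same across Rust releases).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Placeholder node type; a real tree would use field elements.
pub type Node = u64;

/// Placeholder value used to pad unused leaves.
pub const EMPTY: Node = 0;

/// Placeholder two-to-one hash, domain-separated by tree level.
pub fn combine(level: u32, left: Node, right: Node) -> Node {
    let mut h = DefaultHasher::new();
    (level, left, right).hash(&mut h);
    h.finish()
}

/// Root of a tree with exactly `2^depth` leaf slots.
/// Returns `None` if there are more leaves than slots.
pub fn fixed_depth_root(leaves: &[Node], depth: u32) -> Option<Node> {
    let capacity = 1usize << depth;
    if leaves.len() > capacity {
        return None;
    }
    let mut layer: Vec<Node> = leaves.to_vec();
    layer.resize(capacity, EMPTY);
    for level in 0..depth {
        layer = layer
            .chunks(2)
            .map(|pair| combine(level, pair[0], pair[1]))
            .collect();
    }
    Some(layer[0])
}

/// Commitments -> bundle roots -> block roots -> global root.
/// `blocks[i][j]` holds the commitments of bundle `j` in block `i`.
pub fn global_root(blocks: &[Vec<Vec<Node>>], depth: u32) -> Option<Node> {
    let mut block_roots = Vec::new();
    for bundles in blocks {
        let mut bundle_roots = Vec::new();
        for commitments in bundles {
            bundle_roots.push(fixed_depth_root(commitments, depth)?);
        }
        block_roots.push(fixed_depth_root(&bundle_roots, depth)?);
    }
    fixed_depth_root(&block_roots, depth)
}
```

Under this structure a bundle root depends only on the commitments inside that bundle, so later commitments elsewhere in the block only change nodes above the bundle root.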
--- a/book/src/design/commitments.md
+++ b/book/src/design/commitments.md
@@ -0,0 +1,28 @@
+# Commitments
+
+As in Sapling, we require two kinds of commitment schemes in Pollard:
+- $HomomorphicCommit$ is a linearly homomorphic commitment scheme with perfect hiding, and
+  strong binding reducible to DL.
+- $Commit$ and $ShortCommit$ are commitment schemes with perfect hiding, and strong
+  binding reducible to DL.
+
+By "strong binding" we mean that the scheme is collision resistant on the input and
+randomness.
+
+We instantiate $HomomorphicCommit$ with a Pedersen commitment, and use it for value
+commitments:
+
+$$\mathsf{cv} = HomomorphicCommit^{\mathsf{cv}}_{\mathsf{rcv}}(v)$$
+
+We instantiate $Commit$ and $ShortCommit$ with Sinsemilla, and use them for all other
+commitments:
+
+$$\mathsf{ivk} = ShortCommit^{\mathsf{ivk}}_{\mathsf{rivk}}(\mathsf{nk}, \mathsf{ak})$$
+$$\mathsf{cm} = Commit^{\mathsf{cm}}_{\mathsf{rcm}}(\text{rest of note})$$
+
+This is the same split (and rationale) as in Sapling, but using the more PLONK-efficient
+Sinsemilla instead of Bowe-Hopwood Pedersen hashes.
+
+Note that we also deviate from Sapling by using $ShortCommit$ to derive $\mathsf{ivk}$
+instead of a full PRF. This removes an unnecessary (large) PRF primitive from the circuit,
+at the cost of requiring $\mathsf{rivk}$ to be part of the full viewing key.
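
The "linearly homomorphic" property required of $HomomorphicCommit$ in `commitments.md` can be illustrated with a toy Pedersen commitment. This is a sketch under loud assumptions: it uses the order-11 subgroup of the integers modulo 23, written multiplicatively, instead of an elliptic curve, and the constants `P`, `Q`, `G`, `H` and the function names are hypothetical. The parameters are far too small to be hiding or binding in any meaningful sense.

```rust
/// Toy parameters (assumptions): P is prime, Q = (P - 1) / 2 is prime,
/// and G, H are quadratic residues mod P, so both generate the order-Q subgroup.
pub const P: u64 = 23;
pub const Q: u64 = 11;
pub const G: u64 = 4;
pub const H: u64 = 9;

/// Square-and-multiply modular exponentiation.
pub fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1 % modulus;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// Toy Pedersen commitment to value `v` with randomness `r`: G^v * H^r mod P.
pub fn commit(v: u64, r: u64) -> u64 {
    pow_mod(G, v % Q, P) * pow_mod(H, r % Q, P) % P
}

/// Group operation on commitments (multiplication in this toy group,
/// corresponding to point addition on a curve).
pub fn add_commitments(a: u64, b: u64) -> u64 {
    a * b % P
}
```

Combining two commitments yields a commitment to the sum of the values under the sum of the randomness, e.g. `add_commitments(commit(3, 5), commit(4, 9)) == commit(7, 14)`. This is the property that allows value commitments to be summed across a transaction.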
--- a/book/src/design/nullifiers.md
+++ b/book/src/design/nullifiers.md
@@ -0,0 +1,125 @@
+# Nullifiers
+
+The nullifier design we use for Pollard is
+
+$$\mathsf{nf} = [Hash_{\mathsf{nk}}(\rho) + \psi \pmod{p}] \mathcal{G} + \mathsf{cm},$$
+
+where:
+- $Hash$ is a keyed circuit-efficient hash (such as Rescue).
+- $\rho$ is unique to this output. As with $\mathsf{h_{Sig}}$ in Sprout, $\rho$ includes
+  the nullifiers of any Pollard notes being spent.
+  - If spends and outputs are merged / combined, then we always have a nullifier
+    (internally derived from a real or dummy note), and can rely on the nullifier
+    derivation process to prevent an adversary from choosing dummy nullifiers arbitrarily.
+  - If spends and outputs are *not* merged, then $\rho$ should probably also include
+    unique information from other parts of the transaction as well.
+  - TODO: Decide which of the above two cases will be used, and update this.
+- $\psi$ is sender-controlled randomness. It is not required to be unique, and in practice
+  is derived from a sender-selected random value $\mathsf{rseed}$.
+- $\mathcal{G}$ is a fixed independent base.
+
+This gives a note structure of
+
+$$(addr, v, \rho, \psi, \mathsf{rcm}).$$
+
+The nullifier commits to the note value via $\mathsf{cm}$ in order to domain-separate
+nullifiers for zero-valued notes from other notes.
+
+## Security properties
+
+We care about several security properties for our nullifiers:
+
+- **Balance:** can I forge money?
+
+- **Note Privacy:** can I gain information about notes only from the public block chain?
+  - This describes notes sent in-band.
+
+- **Note Privacy (OOB):** can I gain information about notes sent out-of-band, only from
+  the public block chain?
+  - In this case, we assume privacy of the channel over which the note is sent, and that
+    the adversary does not have access to any notes sent to the same address which are
+    then spent (so that the nullifier is on the block chain somewhere).
+
+- **Spend Unlinkability:** given the incoming viewing key for an address, and not the full
+  viewing key, can I (possibly the sender) detect spends of any notes sent to that address?
+  - We're giving $ivk$ to the attacker and allowing it to be the sender in order to make
+    this property as strong as possible: they will have *all* the notes sent to that
+    address.
+
+- **Faerie Resistance:** can I perform a Faerie Gold attack (i.e. cause notes to be
+  accepted that are unspendable)?
+
+We assume (and instantiate elsewhere) the following primitives:
+
+- $GH$ is a cryptographic hash into the group (such as BLAKE2s with simplified SWU), used
+  to derive all fixed independent bases.
+- $E$ is an elliptic curve (such as Pallas).
+- $KDF$ is the note encryption key derivation function.
+
+For our chosen design, our desired security properties rely on the following assumptions:
+
+$$
+\begin{array}{|l|l|}
+\text{Balance} & DL_E \\
+\text{Note Privacy} & HashDH^{KDF}_E \\
+\text{Note Privacy (OOB)} & \text{Near perfect} \ddagger \\
+\text{Spend Unlinkability} & DDH_E^\dagger \vee PRF_{Hash} \\
+\text{Faerie Resistance} & DL_E \\
+\end{array}
+$$
+
+$HashDH^{F}_E$ is computational Diffie-Hellman using $F$ for the key derivation, with
+one-time ephemeral keys. This assumption is heuristically weaker than $DDH_E$ but stronger
+than $DL_E$.
+
+We omit $RO_{GH}$ as a security assumption because we only rely on the random oracle
+applied to fixed inputs defined by the protocol, i.e. to generate the fixed base
+$\mathcal{G}$, not to attacker-specified inputs.
+
+> $\dagger$ We additionally assume that for any input $x$, $\{Hash_{\mathsf{nk}}(x) :
+> \mathsf{nk} \in E\}$ gives a scalar in an adequate range for $DDH_E$. (Otherwise, $Hash$
+> could be trivial, e.g. independent of $\mathsf{nk}$.)
+>
+> $\ddagger$ Statistical distance $< 2^{-167.8}$ from perfect.
+
+## Considered alternatives
+
+$\color{red}{\textsf{⚠ Caution}}$: be skeptical of the claims in this table about what
+problem(s) each security property depends on. They may not be accurate and are definitely
+not fully rigorous.
+
+$$
+\begin{array}{|c|l|c|c|c|c|c|}
+\hline
+\mathsf{nf} & Note & \text{Balance} & \text{Note Privacy} & \text{Note Privacy (OOB)} & \text{Spend Unlinkability} & \text{Faerie Resistance} \\\hline
+[\mathsf{nk}] [\theta] H & (addr, v, H, \theta, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E & RO_{GH} \wedge DL_E \\\hline
+[\mathsf{nk}] H + [\mathsf{rnf}] \mathcal{I} & (addr, v, H, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E & RO_{GH} \wedge DL_E \\\hline
+Hash([\mathsf{nk}] [\theta] H) & (addr, v, H, \theta, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E \vee Pre_{Hash} & Coll_{Hash} \wedge RO_{GH} \wedge DL_E \\\hline
+Hash([\mathsf{nk}] H + [\mathsf{rnf}] \mathcal{I}) & (addr, v, H, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E \vee Pre_{Hash} & Coll_{Hash} \wedge RO_{GH} \wedge DL_E \\\hline
+[Hash_{\mathsf{nk}}(\psi)] [\theta] H & (addr, v, H, \theta, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & RO_{GH} \wedge DL_E \\\hline
+[Hash_{\mathsf{nk}}(\psi)] H + [\mathsf{rnf}] \mathcal{I} & (addr, v, H, \mathsf{rnf}, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & RO_{GH} \wedge DL_E \\\hline
+[Hash_{\mathsf{nk}}(\psi)] \mathcal{G} + [\theta] H & (addr, v, H, \theta, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & RO_{GH} \wedge DL_E \\\hline
+[Hash_{\mathsf{nk}}(\psi)] H + \mathsf{cm} & (addr, v, H, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & PRF_{Hash} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+[Hash_{\mathsf{nk}}(\rho, \psi)] \mathcal{G} + \mathsf{cm} & (addr, v, \rho, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & PRF_{Hash} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+[Hash_{\mathsf{nk}}(\rho)] \mathcal{G} + \mathsf{cm} & (addr, v, \rho, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & PRF_{Hash} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+[Hash_{\mathsf{nk}}(\rho, \psi)] \mathcal{G} + Commit^{\mathsf{nf}}_{\mathsf{rnf}}(v, \rho) & (addr, v, \rho, \mathsf{rnf}, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+[Hash_{\mathsf{nk}}(\rho)] \mathcal{G} + Commit^{\mathsf{nf}}_{\mathsf{rnf}}(v, \rho) & (addr, v, \rho, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+[Hash_{\mathsf{nk}}(\rho, \psi)] \mathcal{G} + [\mathsf{rnf}] \mathcal{I} + \mathsf{cm} & (addr, v, \rho, \mathsf{rnf}, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+[Hash_{\mathsf{nk}}(\rho)] \mathcal{G} + [\mathsf{rnf}] \mathcal{I} + \mathsf{cm} & (addr, v, \rho, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline
+\end{array}
+$$
+
+In the above alternatives:
+- $H$ is calculated by the sender as $H = GH(\rho)$, and would be provided in the action.
+- $\mathcal{I}$ is a fixed independent base, independent of $\mathcal{G}$ and any others
+  returned by $GH$.
+
+For the options that use $H$, when spending a note,
+- if it's a real note, then $H$ is as computed for that note, so it is a unique RO output;
+- if it's a dummy note, we enforce that it is some fixed base independent of other bases.
+
+The $Commit^{\mathsf{nf}}$ variants enabled nullifier domain separation based on note
+value, without directly depending on $\mathsf{cm}$ (which in its native type is a base
+field element, not a group element). We decided instead to follow Sapling by defining an
+intermediate representation of $\mathsf{cm}$ as a group element, that is only used in
+nullifier computation.
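
The shape of the chosen nullifier derivation in `nullifiers.md`, $\mathsf{nf} = [Hash_{\mathsf{nk}}(\rho) + \psi \pmod{p}] \mathcal{G} + \mathsf{cm}$, can be sketched with stand-in primitives. Everything below is an assumption made for illustration: the "group" is the integers modulo the prime `M` under addition (where discrete logarithms are trivial, so it has none of the security properties discussed on that page), `keyed_hash` uses the standard library's non-cryptographic `DefaultHasher` in place of a circuit-efficient keyed hash, and all names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy group order and scalar modulus (assumption): the prime 2^31 - 1.
pub const M: u64 = 2_147_483_647;

/// Stand-in for the fixed base; in this toy group a "point" is an integer mod M.
pub const G: u64 = 7;

/// Stand-in for Hash_nk(rho): a keyed hash reduced to a scalar.
pub fn keyed_hash(nk: u64, rho: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (nk, rho).hash(&mut h);
    h.finish() % M
}

/// Stand-in for scalar multiplication [k]P (both operands are below 2^31 after
/// reduction, so the product cannot overflow a u64).
pub fn scalar_mul(k: u64, point: u64) -> u64 {
    (k % M) * (point % M) % M
}

/// nf = [Hash_nk(rho) + psi mod p] G + cm, evaluated in the toy group.
pub fn nullifier(nk: u64, rho: u64, psi: u64, cm: u64) -> u64 {
    let scalar = (keyed_hash(nk, rho) + psi % M) % M;
    (scalar_mul(scalar, G) + cm % M) % M
}
```

The sketch makes two structural points visible: $\psi$ enters additively in the scalar, so changing it shifts the nullifier by a multiple of the base, and $\mathsf{cm}$ is added as a group element, so two notes with the same $\mathsf{nk}$, $\rho$ and $\psi$ but different commitments get different nullifiers.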
--- a/book/src/design/signatures.md
+++ b/book/src/design/signatures.md
@@ -0,0 +1,5 @@
+# Signatures
+
+TODO:
+- Should it be possible to sign partial transactions?
+  - If we're going to merge down all the signatures into a single one, and also want this, we need to ensure there's a feasible MPC.