diff --git a/.github/workflows/book.yml b/.github/workflows/book.yml index 69c7b07c..5f20c520 100644 --- a/.github/workflows/book.yml +++ b/.github/workflows/book.yml @@ -16,6 +16,12 @@ jobs: with: mdbook-version: '0.4.4' + - name: Install mdbook-katex + uses: actions-rs/cargo@v1 + with: + command: install + args: mdbook-katex + - name: Build Orchard book run: mdbook build book/ diff --git a/book/book.toml b/book/book.toml index 9c221ca0..e86492cb 100644 --- a/book/book.toml +++ b/book/book.toml @@ -4,3 +4,5 @@ language = "en" multilingual = false src = "src" title = "The Orchard Book" + +[preprocessor.katex] diff --git a/book/src/SUMMARY.md b/book/src/SUMMARY.md index a9d7bc4c..485b3119 100644 --- a/book/src/SUMMARY.md +++ b/book/src/SUMMARY.md @@ -9,3 +9,9 @@ - [Spending notes](user/spending-notes.md) - [Integration into an existing chain](user/integration.md) - [Design](design.md) + - [Actions](design/actions.md) + - [Commitments](design/commitments.md) + - [Commitment tree](design/commitment-tree.md) + - [Nullifiers](design/nullifiers.md) + - [Signatures](design/signatures.md) + - [Circuit](design/circuit.md) diff --git a/book/src/design.md b/book/src/design.md index 3d14cb7c..1c927c26 100644 --- a/book/src/design.md +++ b/book/src/design.md @@ -1 +1,54 @@ # Design + +## General design notes + +### Requirements + +- Keep the design close to Sapling, while eliminating aspects we don't like. + +### Non-requirements + +- Delegated proving with privacy from the prover. + - We know how to do this, but it would require a discrete log equality proof, and the + most efficient way to do this would be to do RedDSA and this at the same time, which + means more work for e.g. hardware wallets. + +### Open issues + +- Should we have one memo per output, or one memo per transaction, or 0..n memos? + - Variable, or (1 or n), is a potential privacy leak. 
+ - Need to consider the privacy issue related to light clients requesting individual + memos vs being able to fetch all memos. + +### Key structure + +Provisional proposal: exactly the same key structure as Sapling. + +Group hashing uses the isogeny. + +- nsk goes away; `nk` is now a field element +- TODO: ak / nk split enables splitting the security argument, but could consider merging. + Merging would help with ivk derivation perf (though as a commitment now it's pretty cheap) +- TODO: nullifier computation + +ZIP 32 integration +- Use same Sapling design? +- Simpler "hardened-only" derivation structure? +- Improve diversifier integration / documentation + +### Note structure + +- TODO: UDAs: arbitrary vs whitelisted + +### Typed variables vs byte encodings + +For Sapling, we have encountered multiple places where the specification uses typed +variables to define the consensus rules, but the C++ implementation in zcashd relied on +byte encodings to implement them. This resulted in subtly-different consensus rules being +deployed than were intended, for example where a particular type was not round-trip +encodable. + +In Orchard, we avoid this by defining the consensus rules in terms of the byte encodings +of all variables, and being explicit about any types that are not round-trip encodable. +This makes consensus compatibility between strongly-typed implementations (such as this +crate) and byte-oriented implementations easier to achieve. diff --git a/book/src/design/actions.md b/book/src/design/actions.md new file mode 100644 index 00000000..645aa88d --- /dev/null +++ b/book/src/design/actions.md @@ -0,0 +1,27 @@ +# Actions + +In Sprout, we had a single proof that represented two spent notes and two new notes. 
This +was necessary in order to facilitate spending multiple notes in a single transaction (to +balance value, an output of one JoinSplit could be spent in the next one), but also +provided a minimal level of arity-hiding: single-JoinSplit transactions all looked like +2-in 2-out transactions, and in multi-JoinSplit transactions each JoinSplit looked like a +1-in 1-out. + +In Sapling, we switched to using value commitments to balance the transaction, removing +the min-2 arity requirement. We opted for one proof per spent note and one (much simpler) +proof per output note, which greatly improved the performance of generating outputs, but +removed any arity-hiding from the proofs (instead having the transaction builder pad +transactions to 1-in, 2-out). + +For Orchard, we take a combined approach: we define an Orchard transaction as containing a +bundle of actions, where each action is both a spend and an output. This provides the same +inherent arity-hiding as multi-JoinSplit Sprout, but using Sapling value commitments to +balance the transaction without doubling its size. + +TODO: Depending on the circuit cost, we _may_ switch to having an action internally +represent either a spend or an output. Externally spends and outputs would still be +indistinguishable, but the transaction would be larger. + +## Memo fields + +TODO: One memo per tx vs one memo per output diff --git a/book/src/design/circuit.md b/book/src/design/circuit.md new file mode 100644 index 00000000..f66800a3 --- /dev/null +++ b/book/src/design/circuit.md @@ -0,0 +1,3 @@ +# Circuit + + diff --git a/book/src/design/commitment-tree.md b/book/src/design/commitment-tree.md new file mode 100644 index 00000000..82f66ea1 --- /dev/null +++ b/book/src/design/commitment-tree.md @@ -0,0 +1,31 @@ +# Commitment tree + +One of the things we learned from Sapling is that having a single global commitment tree +makes life hard for light client wallets. 
When a new note is received, the wallet derives +its incremental witness from the state of the global tree at the point when the note's +commitment is appended; this incremental state then needs to be updated with every +subsequent commitment in the block in-order. It isn't efficient for a server to +pre-compute and send over the necessary incremental updates for every new note in a block, +and if a wallet requested a specific update from the server it would leak the specific +note that was received. + +Orchard addresses this by splitting the commitment tree into several sub-trees: + +- Bundle tree, that accumulates the commitments within a single bundle (and thus a single + transaction). +- Block tree, that accumulates the bundle tree roots within a single block. +- Global tree, that accumulates the block tree roots. + +Each of these trees has a fixed depth (necessary for being able to create proofs). + +Chains that integrate Orchard can decouple the limits on commitments-per-subtree from +higher-layer constraints like block size, by enabling their blocks and transactions to be +structured internally as a series of Orchard blocks or txs (e.g. a Zcash block would +contain a `Vec`, that each get appended in-order). + +Zcash level: we also bind these roots into the FlyClient history leaves, so that light +clients can assert they are valid independently of the full block. + +TODO: Sean is pretty sure we can just improve the Incremental Merkle Tree implementation +to work around this, without domain-separating the tree. If we can do that instead, it may +be simpler. diff --git a/book/src/design/commitments.md b/book/src/design/commitments.md new file mode 100644 index 00000000..cc838123 --- /dev/null +++ b/book/src/design/commitments.md @@ -0,0 +1,28 @@ +# Commitments + +As in Sapling, we require two kinds of commitment schemes in Orchard: +- $HomomorphicCommit$ is a linearly homomorphic commitment scheme with perfect hiding, and + strong binding reducible to DL. 
+- $Commit$ and $ShortCommit$ are commitment schemes with perfect hiding, and strong + binding reducible to DL. + +By "strong binding" we mean that the scheme is collision resistant on the input and +randomness. + +We instantiate $HomomorphicCommit$ with a Pedersen commitment, and use it for value +commitments: + +$$\mathsf{cv} = HomomorphicCommit^{\mathsf{cv}}_{\mathsf{rcv}}(v)$$ + +We instantiate $Commit$ and $ShortCommit$ with Sinsemilla, and use them for all other +commitments: + +$$\mathsf{ivk} = ShortCommit^{\mathsf{ivk}}_{\mathsf{rivk}}(\mathsf{nk}, \mathsf{ak})$$ +$$\mathsf{cm} = Commit^{\mathsf{cm}}_{\mathsf{rcm}}(\text{rest of note})$$ + +This is the same split (and rationale) as in Sapling, but using the more PLONK-efficient +Sinsemilla instead of Bowe-Hopwood Pedersen hashes. + +Note that we also deviate from Sapling by using $ShortCommit$ to derive $\mathsf{ivk}$ +instead of a full PRF. This removes an unnecessary (large) PRF primitive from the circuit, +at the cost of requiring $\mathsf{rivk}$ to be part of the full viewing key. diff --git a/book/src/design/nullifiers.md b/book/src/design/nullifiers.md new file mode 100644 index 00000000..6fdf51a8 --- /dev/null +++ b/book/src/design/nullifiers.md @@ -0,0 +1,125 @@ +# Nullifiers + +The nullifier design we use for Orchard is + +$$\mathsf{nf} = [Hash_{\mathsf{nk}}(\rho) + \psi \pmod{p}] \mathcal{G} + \mathsf{cm},$$ + +where: +- $Hash$ is a keyed circuit-efficient hash (such as Rescue). +- $\rho$ is unique to this output. As with $\mathsf{h_{Sig}}$ in Sprout, $\rho$ includes + the nullifiers of any Orchard notes being spent. + - If spends and outputs are merged / combined, then we always have a nullifier + (internally derived from a real or dummy note), and can rely on the nullifier + derivation process to prevent an adversary from choosing dummy nullifiers arbitrarily. 
+ - If spends and outputs are *not* merged, then $\rho$ should probably also include + unique information from other parts of the transaction as well. + - TODO: Decide which of the above two cases will be used, and update this. +- $\psi$ is sender-controlled randomness. It is not required to be unique, and in practice + is derived from a sender-selected random value $\mathsf{rseed}$. +- $\mathcal{G}$ is a fixed independent base. + +This gives a note structure of + +$$(addr, v, \rho, \psi, \mathsf{rcm}).$$ + +The nullifier commits to the note value via $\mathsf{cm}$ in order to domain-separate +nullifiers for zero-valued notes from other notes. + +## Security properties + +We care about several security properties for our nullifiers: + +- **Balance:** can I forge money? + +- **Note Privacy:** can I gain information about notes only from the public block chain? + - This describes notes sent in-band. + +- **Note Privacy (OOB):** can I gain information about notes sent out-of-band, only from + the public block chain? + - In this case, we assume privacy of the channel over which the note is sent, and that + the adversary does not have access to any notes sent to the same address which are + then spent (so that the nullifier is on the block chain somewhere). + +- **Spend Unlinkability:** given the incoming viewing key for an address, and not the full + viewing key, can I (possibly the sender) detect spends of any notes sent to that address? + - We're giving $ivk$ to the attacker and allowing it to be the sender in order to make + this property as strong as possible: they will have *all* the notes sent to that + address. + +- **Faerie Resistance:** can I perform a Faerie Gold attack (i.e. cause notes to be + accepted that are unspendable)? + +We assume (and instantiate elsewhere) the following primitives: + +- $GH$ is a cryptographic hash into the group (such as BLAKE2s with simplified SWU), used + to derive all fixed independent bases. 
+- $E$ is an elliptic curve (such as Pallas). +- $KDF$ is the note encryption key derivation function. + +For our chosen design, our desired security properties rely on the following assumptions: + +$$ +\begin{array}{|l|l|} +\text{Balance} & DL_E \\ +\text{Note Privacy} & HashDH^{KDF}_E \\ +\text{Note Privacy (OOB)} & \text{Near perfect} \ddagger \\ +\text{Spend Unlinkability} & DDH_E^\dagger \vee PRF_{Hash} \\ +\text{Faerie Resistance} & DL_E \\ +\end{array} +$$ + +$HashDH^{F}_E$ is computational Diffie-Hellman using $F$ for the key derivation, with +one-time ephemeral keys. This assumption is heuristically weaker than $DDH_E$ but stronger +than $DL_E$. + +We omit $RO_{GH}$ as a security assumption because we only rely on the random oracle +applied to fixed inputs defined by the protocol, i.e. to generate the fixed base +$\mathcal{G}$, not to attacker-specified inputs. + +> $\dagger$ We additionally assume that for any input $x$, $\{Hash_{\mathsf{nk}}(x) : +> \mathsf{nk} \in E\}$ gives a scalar in an adequate range for $DDH_E$. (Otherwise, $Hash$ +> could be trivial, e.g. independent of $\mathsf{nk}$.) +> +> $\ddagger$ Statistical distance $< 2^{-167.8}$ from perfect. + +## Considered alternatives + +$\color{red}{\textsf{⚠ Caution}}$: be skeptical of the claims in this table about what +problem(s) each security property depends on. They may not be accurate and are definitely +not fully rigorous. 
+ +$$ +\begin{array}{|c|l|c|c|c|c|c|} +\hline +\mathsf{nf} & Note & \text{Balance} & \text{Note Privacy} & \text{Note Privacy (OOB)} & \text{Spend Unlinkability} & \text{Faerie Resistance} \\\hline +[\mathsf{nk}] [\theta] H & (addr, v, H, \theta, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E & RO_{GH} \wedge DL_E \\\hline +[\mathsf{nk}] H + [\mathsf{rnf}] \mathcal{I} & (addr, v, H, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E & RO_{GH} \wedge DL_E \\\hline +Hash([\mathsf{nk}] [\theta] H) & (addr, v, H, \theta, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E \vee Pre_{Hash} & Coll_{Hash} \wedge RO_{GH} \wedge DL_E \\\hline +Hash([\mathsf{nk}] H + [\mathsf{rnf}] \mathcal{I}) & (addr, v, H, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E \vee Pre_{Hash} & Coll_{Hash} \wedge RO_{GH} \wedge DL_E \\\hline +[Hash_{\mathsf{nk}}(\psi)] [\theta] H & (addr, v, H, \theta, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & RO_{GH} \wedge DL_E \\\hline +[Hash_{\mathsf{nk}}(\psi)] H + [\mathsf{rnf}] \mathcal{I} & (addr, v, H, \mathsf{rnf}, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & RO_{GH} \wedge DL_E \\\hline +[Hash_{\mathsf{nk}}(\psi)] \mathcal{G} + [\theta] H & (addr, v, H, \theta, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & RO_{GH} \wedge DL_E \\\hline +[Hash_{\mathsf{nk}}(\psi)] H + \mathsf{cm} & (addr, v, H, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & PRF_{Hash} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline +[Hash_{\mathsf{nk}}(\rho, \psi)] \mathcal{G} + \mathsf{cm} & (addr, v, \rho, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & PRF_{Hash} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline +[Hash_{\mathsf{nk}}(\rho)] \mathcal{G} + \mathsf{cm} & (addr, v, \rho, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & PRF_{Hash} & DDH_E^\dagger \vee 
PRF_{Hash} & DL_E \\\hline +[Hash_{\mathsf{nk}}(\rho, \psi)] \mathcal{G} + Commit^{\mathsf{nf}}_{\mathsf{rnf}}(v, \rho) & (addr, v, \rho, \mathsf{rnf}, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline +[Hash_{\mathsf{nk}}(\rho)] \mathcal{G} + Commit^{\mathsf{nf}}_{\mathsf{rnf}}(v, \rho) & (addr, v, \rho, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline +[Hash_{\mathsf{nk}}(\rho, \psi)] \mathcal{G} + [\mathsf{rnf}] \mathcal{I} + \mathsf{cm} & (addr, v, \rho, \mathsf{rnf}, \psi, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline +[Hash_{\mathsf{nk}}(\rho)] \mathcal{G} + [\mathsf{rnf}] \mathcal{I} + \mathsf{cm} & (addr, v, \rho, \mathsf{rnf}, \mathsf{rcm}) & DL_E & HashDH^{KDF}_E & \text{Perfect} & DDH_E^\dagger \vee PRF_{Hash} & DL_E \\\hline +\end{array} +$$ + +In the above alternatives: +- $H$ is calculated by the sender as $H = GH(\rho)$, and would be provided in the action. +- $\mathcal{I}$ is a fixed independent base, independent of $\mathcal{G}$ and any others + returned by $GH$. + +For the options that use $H$, when spending a note, +- if it's a real note, then $H$ is as computed for that note, so it is a unique RO output; +- if it's a dummy note, we enforce that it is some fixed base independent of other bases. + +The $Commit^{\mathsf{nf}}$ variants enabled nullifier domain separation based on note +value, without directly depending on $\mathsf{cm}$ (which in its native type is a base +field element, not a group element). We decided instead to follow Sapling by defining an +intermediate representation of $\mathsf{cm}$ as a group element, that is only used in +nullifier computation. 
diff --git a/book/src/design/signatures.md b/book/src/design/signatures.md new file mode 100644 index 00000000..4f0922fc --- /dev/null +++ b/book/src/design/signatures.md @@ -0,0 +1,7 @@ +# Signatures + +Orchard signatures are an instantiation of RedDSA with a cofactor of 1. + +TODO: +- Should it be possible to sign partial transactions? + - If we're going to merge down all the signatures into a single one, and also want this, we need to ensure there's a feasible MPC.
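Illustrative note on `design/commitment-tree.md`: the bundle / block / global layering described there could be sketched roughly as below. This is only a toy model under stated assumptions, not the crate's implementation. Commitments are modelled as `u64` values, the `combine` function is a stand-in built on the standard library's `DefaultHasher` (not Sinsemilla or any hash the design names), unused leaf positions are padded with `0`, and the three depth constants are hypothetical values picked for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical fixed depths, chosen only for this example.
const BUNDLE_DEPTH: u8 = 2;
const BLOCK_DEPTH: u8 = 2;
const GLOBAL_DEPTH: u8 = 3;

/// Toy stand-in for the Merkle hash (an assumption; NOT the real primitive).
fn combine(level: u8, left: u64, right: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (level, left, right).hash(&mut hasher);
    hasher.finish()
}

/// Root of a fixed-depth tree over `leaves`; unused positions are padded with 0.
fn root(depth: u8, leaves: &[u64]) -> u64 {
    assert!(leaves.len() <= 1usize << depth, "too many leaves for this depth");
    let mut layer: Vec<u64> = leaves.to_vec();
    layer.resize(1usize << depth, 0);
    for level in 0..depth {
        // Hash adjacent pairs to build the next layer up.
        layer = layer
            .chunks(2)
            .map(|pair| combine(level, pair[0], pair[1]))
            .collect();
    }
    layer[0]
}

/// Bundle tree: accumulates the commitments within a single bundle.
fn bundle_root(commitments: &[u64]) -> u64 {
    root(BUNDLE_DEPTH, commitments)
}

/// Block tree: accumulates the bundle tree roots within a single block.
fn block_root(bundles: &[Vec<u64>]) -> u64 {
    let roots: Vec<u64> = bundles.iter().map(|b| bundle_root(b)).collect();
    root(BLOCK_DEPTH, &roots)
}

/// Global tree: accumulates the block tree roots, in order.
fn global_root(blocks: &[Vec<Vec<u64>>]) -> u64 {
    let roots: Vec<u64> = blocks.iter().map(|b| block_root(b)).collect();
    root(GLOBAL_DEPTH, &roots)
}
```

In this sketch a wallet that receives a note only needs the other leaves of that note's bundle tree plus the bundle and block roots above it, rather than every individual commitment in the block, which is the motivation the design note gives for the split.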