orchard

IMPORTANT: This library is being actively developed and should not be used in production software.

Requires Rust 1.56.1+.

Documentation

License

Copyright 2020-2021 The Electric Coin Company.

You may use this package under the Bootstrap Open Source Licence, version 1.0, or at your option, any later version. See the file COPYING for more details, and LICENSE-BOSL for the terms of the Bootstrap Open Source Licence, version 1.0.

The purpose of the BOSL is to allow commercial improvements to the package while ensuring that all improvements are open source. See here for why the BOSL exists.

Concepts

Preliminaries

User Documentation

Creating keys and addresses

Creating notes

Spending notes

Integration into an existing chain

Design

General design notes

Requirements

  • Keep the design close to Sapling, while eliminating aspects we don't like.

Non-requirements

  • Delegated proving with privacy from the prover.
    • We know how to do this, but it would require a discrete-log equality proof; the most efficient way to do this would be to combine it with the RedDSA signature, which means more work for e.g. hardware wallets.

Open issues

  • Should we have one memo per output, or one memo per transaction, or 0..n memos?
    • Variable, or (1 or n), is a potential privacy leak.
    • Need to consider the privacy issue related to light clients requesting individual memos vs being able to fetch all memos.

Note structure

  • TODO: UDAs: arbitrary vs whitelisted

Typed variables vs byte encodings

For Sapling, we have encountered multiple places where the specification uses typed variables to define the consensus rules, but the C++ implementation in zcashd relied on byte encodings to implement them. This resulted in subtly-different consensus rules being deployed than were intended, for example where a particular type was not round-trip encodable.

In Orchard, we avoid this by defining the consensus rules in terms of the byte encodings of all variables, and being explicit about any types that are not round-trip encodable. This makes consensus compatibility between strongly-typed implementations (such as this crate) and byte-oriented implementations easier to achieve.
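As a toy illustration of the byte-oriented approach (the `Fp` type and modulus below are hypothetical stand-ins, not this crate's API), a consensus rule over byte encodings can be phrased as a parse/serialize round trip:

// Sketch: a consensus rule phrased over byte encodings. A type that is not
// round-trip encodable admits multiple byte strings for one abstract value;
// checking the round trip makes the byte-level rule explicit.

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Fp(u64); // toy "field element"; the modulus is a stand-in, not Pallas

const MODULUS: u64 = 0xffff_ffff_0000_0001;

impl Fp {
    // Parsing reduces modulo MODULUS, so distinct byte strings can parse
    // to the same element; only one of them is the canonical encoding.
    fn from_bytes(b: [u8; 8]) -> Fp {
        Fp(u64::from_le_bytes(b) % MODULUS)
    }
    fn to_bytes(self) -> [u8; 8] {
        self.0.to_le_bytes()
    }
}

// Byte-oriented consensus rule: accept only canonical encodings.
fn is_canonical(b: [u8; 8]) -> bool {
    Fp::from_bytes(b).to_bytes() == b
}

fn main() {
    let canonical = 5u64.to_le_bytes();
    let non_canonical = (MODULUS + 5).to_le_bytes(); // same element, other bytes
    assert!(is_canonical(canonical));
    assert!(!is_canonical(non_canonical));
}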

Keys and addresses

Orchard keys and payment addresses are structurally similar to Sapling. The main change is that Orchard keys use the Pallas curve instead of Jubjub, in order to enable the future use of the Pallas-Vesta curve cycle in the Orchard protocol. (We already use Vesta as the curve on which Halo 2 proofs are computed, but this doesn't yet require a cycle.)

Using the Pallas curve and making the most efficient use of the Halo 2 proof system involves corresponding changes to the key derivation process, such as using Sinsemilla for Pallas-efficient commitments. We also take the opportunity to remove all uses of expensive general-purpose hashes (such as BLAKE2s) from the circuit.

We make several structural changes, building on the lessons learned from Sapling:

  • The nullifier private key $\mathsf{nsk}$ is removed. Its purpose in Sapling was as defense-in-depth, in case RedDSA was found to have weaknesses; an adversary who could recover $\mathsf{ask}$ would not be able to spend funds. In practice it has not been feasible to manage $\mathsf{nsk}$ much more securely than a full viewing key, as the computational power required to generate Sapling proofs has made it necessary to perform this step on the same device that is creating the overall transaction (rather than on a more constrained device like a hardware wallet). We are also more confident in RedDSA now.

  • $\mathsf{nk}$ is now a field element instead of a curve point, making it more efficient to generate nullifiers.

  • $\mathsf{ovk}$ is now derived from $\mathsf{fvk}$, instead of being derived in parallel. This places it in a similar position within the key structure to $\mathsf{ivk}$, and also removes an issue where two full viewing keys could be constructed that have the same $\mathsf{ivk}$ but different $\mathsf{ovk}$s. Users still have control over whether $\mathsf{ovk}$ is used when constructing a transaction.

  • All diversifiers now result in valid payment addresses, due to group hashing into Pallas being specified to be infallible. This removes significant complexity from the use cases for diversified addresses.

  • The fact that Pallas is a prime-order curve simplifies the protocol and removes the need for cofactor multiplication in key agreement. Unlike Sapling, we define public (including ephemeral) and private keys used for note encryption to exclude the zero point and the zero scalar. Without this change, the implementation of the Orchard Action circuit would need special cases for the zero point, since Pallas is a short Weierstrass rather than an Edwards curve. This also has the advantage of ensuring that the key agreement has "contributory behaviour" — that is, if either party contributes a random scalar, then the shared secret will be random to an observer who does not know that scalar and cannot break Diffie–Hellman.

Other than the above, Orchard retains the same design rationale for its keys and addresses as Sapling. For example, diversifiers remain at 11 bytes, so that a raw Orchard address is the same length as a raw Sapling address.

Orchard payment addresses do not have a stand-alone string encoding. Instead, we define "unified addresses" that can bundle together addresses of different types, including Orchard. Unified addresses have a Human-Readable Part of "u" on Mainnet, i.e. they will have the prefix "u1". For specifications of this and other formats (e.g. for Orchard viewing and spending keys), see section 5.6.4 of the NU5 protocol specification [#NU5-orchardencodings].

Hierarchical deterministic wallets

When designing Sapling, we defined a BIP 32-like mechanism for generating hierarchical deterministic wallets in ZIP 32. We decided at the time to stick closely to the design of BIP 32, on the assumption that there were Bitcoin use cases that used both hardened and non-hardened derivation that we might not be aware of. This decision created significant complexity for Sapling: we needed to handle derivation separately for each component of the expanded spending key and full viewing key (whereas for transparent addresses there is only a single component in the spending key).

Non-hardened derivation enables creating a multi-level path of child addresses below some parent address, without involving the parent spending key. The primary use case for this is HD wallets for transparent addresses, which use the following structure defined in BIP 44:

  • (H) BIP 44
    • (H) Coin type: Zcash
      • (H) Account 0
        • (N) Normal addresses
          • (N) Address 0
          • (N) Address 1...
        • (N) Change addresses
          • (N) Change address 0
          • (N) Change address 1...
      • (H) Account 1...

Shielded accounts do not require separating change addresses from normal addresses, because addresses are not revealed in transactions. Similarly, there is no need to generate a fresh spending key for every transaction; in fact, doing so would cause a linear slow-down in wallet scanning. But users who do want to generate multiple addresses per account can use the following structure, which does not use non-hardened derivation:

  • (H) ZIP 32
    • (H) Coin type: Zcash
      • (H) Account 0
        • Diversified address 0
        • Diversified address 1...
      • (H) Account 1...

Non-hardened derivation is therefore only required for use-cases that require the ability to derive more than one child layer of addresses. However, in the years since Sapling was deployed, we have not seen any such use cases appear.

Therefore, for Orchard we only define hardened derivation, and do so with a much simpler design than ZIP 32. All derivations produce an opaque binary spending key, from which the keys and addresses are then derived. As a side benefit, this makes key formats shorter. (The formats that will actually be used in practice for Orchard will correspond to the simpler Sapling formats in the protocol specification, rather than the longer and more complicated "extended" ones defined by ZIP 32.)
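As an illustration of this simpler structure (a hypothetical sketch, not this crate's API; the purpose constant comes from ZIP 32 and the coin type from SLIP 44):

// Sketch: an Orchard derivation path is always m/32'/coin_type'/account',
// so it can be modeled as a fixed-shape value rather than a general path.

const PURPOSE: u32 = 32; // ZIP 32
const COIN_TYPE_ZCASH: u32 = 133; // SLIP 44

#[derive(Clone, Copy, Debug)]
struct OrchardPath {
    coin_type: u32,
    account: u32,
}

impl OrchardPath {
    // All three levels are hardened (index + 2^31); there is no
    // non-hardened derivation at any level.
    fn hardened_indices(self) -> [u32; 3] {
        const H: u32 = 1 << 31;
        [PURPOSE + H, self.coin_type + H, self.account + H]
    }
}

fn main() {
    let path = OrchardPath { coin_type: COIN_TYPE_ZCASH, account: 0 };
    println!("m/32'/133'/0' -> {:?}", path.hardened_indices());
}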

Actions

In Sprout, we had a single proof that represented two spent notes and two new notes. This was necessary in order to facilitate spending multiple notes in a single transaction (to balance value, an output of one JoinSplit could be spent in the next one), but also provided a minimal level of arity-hiding: single-JoinSplit transactions all looked like 2-in 2-out transactions, and in multi-JoinSplit transactions each JoinSplit looked like a 1-in 1-out.

In Sapling, we switched to using value commitments to balance the transaction, removing the min-2 arity requirement. We opted for one proof per spent note and one (much simpler) proof per output note, which greatly improved the performance of generating outputs, but removed any arity-hiding from the proofs (instead having the transaction builder pad transactions to 1-in, 2-out).

For Orchard, we take a combined approach: we define an Orchard transaction as containing a bundle of actions, where each action is both a spend and an output. This provides the same inherent arity-hiding as multi-JoinSplit Sprout, but using Sapling value commitments to balance the transaction without doubling its size.
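A minimal sketch of this shape (struct and field names here are illustrative, not the crate's actual definitions):

// Sketch: each action carries both a spend half and an output half, so a
// bundle of n actions reveals nothing about the true spend/output split.

struct Action {
    nullifier: [u8; 32],        // spend half: nullifier of the spent note
    note_commitment: [u8; 32],  // output half: commitment to the new note
    value_commitment: [u8; 32], // homomorphic commitment to (in - out) value
}

struct Bundle {
    actions: Vec<Action>,
    // As in Sapling, the per-action value commitments must balance against
    // the net value flowing into or out of the shielded pool.
    value_balance: i64,
}

fn main() {
    let bundle = Bundle { actions: Vec::new(), value_balance: 0 };
    println!("{} actions", bundle.actions.len());
}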

TODO: Depending on the circuit cost, we may switch to having an action internally represent either a spend or an output. Externally spends and outputs would still be indistinguishable, but the transaction would be larger.

Memo fields

TODO: One memo per tx vs one memo per output

Commitments

As in Sapling, we require two kinds of commitment schemes in Orchard:

  • $\mathsf{HomomorphicCommit}$ is a linearly homomorphic commitment scheme with perfect hiding, and strong binding reducible to DL.
  • $\mathsf{Commit}$ and $\mathsf{ShortCommit}$ are commitment schemes with perfect hiding, and strong binding reducible to DL.

By "strong binding" we mean that the scheme is collision resistant on the input and randomness.

We instantiate $\mathsf{HomomorphicCommit}$ with a Pedersen commitment, and use it for value commitments.

We instantiate $\mathsf{Commit}$ and $\mathsf{ShortCommit}$ with Sinsemilla, and use them for all other commitments.

This is the same split (and rationale) as in Sapling, but using the more PLONK-efficient Sinsemilla instead of Bowe--Hopwood Pedersen hashes.

Note that for $\mathsf{Commit}^{\mathsf{ivk}}$, we also deviate from Sapling in two ways:

  • We use $\mathsf{Commit}^{\mathsf{ivk}}$ to derive $\mathsf{ivk}$ instead of a full PRF. This removes an unnecessary (large) PRF primitive from the circuit, at the cost of requiring $\mathsf{rivk}$ to be part of the full viewing key.
  • We define $\mathsf{ivk}$ as an integer in $[1, q_{\mathbb{P}})$; that is, we exclude $\mathsf{ivk} = 0$. For Sapling, we relied on BLAKE2s to make $\mathsf{ivk} = 0$ infeasible to produce, but it was still technically possible. For Orchard, we get this by construction:
    • $0$ is not a valid x-coordinate for any Pallas point.
    • $\mathsf{Extract}_{\mathbb{P}}$ internally maps points to field elements by replacing the identity (which has no affine coordinates) with $0$. But $\mathsf{Commit}^{\mathsf{ivk}}$ is defined using incomplete addition, and thus will never produce the identity.

Commitment tree

The commitment tree structure for Orchard is identical to Sapling:

  • A single global commitment tree of fixed depth 32.
  • Note commitments are appended to the tree in-order from the block.
  • Valid Orchard anchors correspond to the global tree state at block boundaries (after all commitments from a block have been appended, and before any commitments from the next block have been appended).

The only difference is that we instantiate $\mathsf{MerkleCRH}^{\mathsf{Orchard}}$ with Sinsemilla (whereas $\mathsf{MerkleCRH}^{\mathsf{Sapling}}$ used a Bowe--Hopwood Pedersen hash).

Uncommitted leaves

The fixed-depth incremental Merkle trees that we use (in Sprout and Sapling, and again in Orchard) require specifying an "empty" or "uncommitted" leaf - a value that will never be appended to the tree as a regular leaf.

  • For Sprout (and trees composed of the outputs of bit-twiddling hash functions), we use the all-zeroes array; the probability of a real note having a colliding note commitment is cryptographically negligible.
  • For Sapling, where leaves are $u$-coordinates of Jubjub points, we use the value $1$, which is not the $u$-coordinate of any Jubjub point.

Orchard note commitments are the $x$-coordinates of Pallas points; thus we take the same approach as Sapling, using a value that is not the $x$-coordinate of any Pallas point as the uncommitted leaf value. We use the value $2$ for both Pallas and Vesta, because $2^3 + 5 = 13$ is not a square in either $\mathbb{F}_p$ or $\mathbb{F}_q$:

sage: p = 0x40000000000000000000000000000000224698fc094cf91b992d30ed00000001
sage: q = 0x40000000000000000000000000000000224698fc0994a8dd8c46eb2100000001
sage: EllipticCurve(GF(p), [0, 5]).count_points() == q
True
sage: EllipticCurve(GF(q), [0, 5]).count_points() == p
True
sage: Mod(13, p).is_square()
False
sage: Mod(13, q).is_square()
False

Note: There are also no Pallas points with $x$-coordinate $0$, but we map the identity to $(0, 0)$ within the circuit. Although $\mathsf{MerkleCRH}^{\mathsf{Orchard}}$ cannot return the identity (the incomplete addition would return $\perp$ instead), it would arguably be confusing to rely on that.

Considered alternatives

We considered splitting the commitment tree into several sub-trees:

  • Bundle tree, that accumulates the commitments within a single bundle (and thus a single transaction).
  • Block tree, that accumulates the bundle tree roots within a single block.
  • Global tree, that accumulates the block tree roots.

Each of these trees would have had a fixed depth (necessary for being able to create proofs). Chains that integrated Orchard could have decoupled the limits on commitments-per-subtree from higher-layer constraints like block size, by enabling their blocks and transactions to be structured internally as a series of Orchard blocks or txs (e.g. a Zcash block would have contained a Vec<BlockTreeRoot>, each of whose entries would be appended in order).

The motivation for considering this change was to improve the lives of light client wallet developers. When a new note is received, the wallet derives its incremental witness from the state of the global tree at the point when the note's commitment is appended; this incremental state then needs to be updated with every subsequent commitment in the block in-order. Wallets can't get help from the server to create these for new notes without leaking the specific note that was received.

We decided that this was too large a change from Sapling, and that it should be possible to improve the Incremental Merkle Tree implementation to work around the efficiency issues without domain-separating the tree.

Nullifiers

The nullifier design we use for Orchard is

$$\mathsf{nf} = \mathsf{Extract}_{\mathbb{P}}\big([(\mathsf{PRF}^{\mathsf{nf}}_{\mathsf{nk}}(\rho) + \psi) \bmod p]\, \mathcal{K} + \mathsf{cm}\big)$$

where:

  • $\mathsf{PRF}^{\mathsf{nf}}$ is a keyed circuit-efficient PRF (such as Rescue or Poseidon).
  • $\rho$ is unique to this output. As with $h_{\mathsf{Sig}}$ in Sprout, $\rho$ includes the nullifiers of any Orchard notes being spent in the same action. Given that an action consists of a single spend and a single output, we set $\rho$ to be the nullifier of the spent note.
  • $\psi$ is sender-controlled randomness. It is not required to be unique, and in practice is derived from both $\rho$ and a sender-selected random value $\mathsf{rseed}$: $\psi = \mathsf{KDF}^{\psi}(\rho, \mathsf{rseed})$.
  • $\mathcal{K}$ is a fixed independent base.
  • $\mathsf{Extract}_{\mathbb{P}}$ extracts the $x$-coordinate of a Pallas curve point.

This gives a note structure of $(\mathsf{addr}, v, \rho, \psi, \mathsf{rcm})$.

The note plaintext includes $\mathsf{rseed}$ in place of $\psi$ and $\mathsf{rcm}$, and omits $\rho$ (which is a public part of the action).

Security properties

We care about several security properties for our nullifiers:

  • Balance: can I forge money?

  • Note Privacy: can I gain information about notes only from the public block chain?

    • This describes notes sent in-band.
  • Note Privacy (OOB): can I gain information about notes sent out-of-band, only from the public block chain?

    • In this case, we assume privacy of the channel over which the note is sent, and that the adversary does not have access to any notes sent to the same address which are then spent (so that the nullifier is on the block chain somewhere).
  • Spend Unlinkability: given the incoming viewing key for an address, and not the full viewing key, can I (possibly the sender) detect spends of any notes sent to that address?

    • We're giving to the attacker and allowing it to be the sender in order to make this property as strong as possible: they will have all the notes sent to that address.
  • Faerie Resistance: can I perform a Faerie Gold attack (i.e. cause notes to be accepted that are unspendable)?

    • We're giving the full viewing key to the attacker and allowing it to be the sender in order to make this property as strong as possible: they will have all the notes sent to that address, and be able to derive every nullifier.

We assume (and instantiate elsewhere) the following primitives:

  • is a cryptographic hash into the group (such as BLAKE2s with simplified SWU), used to derive all fixed independent bases.
  • is an elliptic curve (such as Pallas).
  • is the note encryption key derivation function.

For our chosen design, our desired security properties rely on the following assumptions:

is computational Diffie-Hellman using for the key derivation, with one-time ephemeral keys. This assumption is heuristically weaker than but stronger than .

We omit as a security assumption because we only rely on the random oracle applied to fixed inputs defined by the protocol, i.e. to generate the fixed base , not to attacker-specified inputs.

We additionally assume that for any input , gives a scalar in an adequate range for . (Otherwise, could be trivial, e.g. independent of .)

Statistical distance from perfect.

Considered alternatives

Caution: be skeptical of the claims in this table about what problem(s) each security property depends on. They may not be accurate and are definitely not fully rigorous.

The entries in this table omit the application of , which is an optimization to halve the nullifier length. That optimization requires its own security analysis, but because it is a deterministic mapping, only Faerie Resistance could be affected by it.

In the above alternatives:

  • is a keyed circuit-efficient hash (such as Rescue).

  • is a fixed independent base, independent of and any others returned by .

  • is a pair of fixed independent bases (independent of all others), where the specific choice of base depends on whether the note has zero value.

  • is a base unique to this output.

    • For non-zero-valued notes, . As with in Sprout, includes the nullifiers of any Orchard notes being spent in the same action.
    • For zero-valued notes, is constrained by the circuit to a fixed base independent of and any others returned by .

Rationale

In order to satisfy the Balance security property, we require that the circuit must be able to enforce that only one nullifier is accepted for a given note. As in Sprout and Sapling, we achieve this by ensuring that the nullifier deterministically depends only on values committed to (directly or indirectly) by the note commitment. As in Sapling, this involves arguing that:

  • There can be only one for a given . This is true because the circuit checks that , and the mapping is an injection for any . ( is in the base field of , which must be smaller than its scalar field, as is the case for Pallas.)
  • There can be only one for a given . This is true because the circuit checks that where is binding (see Commitments).

Use of

Faerie Resistance requires that nullifiers be unique. This is primarily achieved by taking a unique value (checked for uniqueness by the public consensus rules) as an input to the nullifier. However, it is also necessary to ensure that the transformations applied to this value preserve its uniqueness. Meanwhile, to achieve Spend Unlinkability, we require that the nullifier does not reveal any information about the unique value it is derived from.

The design alternatives fall into two categories in terms of how they balance these requirements:

  • Publish a unique value at note creation time, and blind that value within the nullifier computation.

    • This is similar to the approach taken in Sprout and Sapling, which both implemented nullifiers as PRF outputs; Sprout uses the compression function from SHA-256, while Sapling uses BLAKE2s.
  • Derive a unique base from some unique value, publish that unique base at note creation time, and then blind the base (either additively or multiplicatively) during nullifier computation.

For Spend Unlinkability, the only value unknown to the adversary is , and the cryptographic assumptions only involve the first term (other terms like or cannot be extracted directly from the observed nullifiers, but can be subtracted from them). We therefore ensure that the first term does not commit directly to the note (to avoid a DL-breaking adversary from immediately breaking SU).

We considered using a design involving , with the goal of eliminating all usages of a PRF inside the circuit, for two reasons:

  • Instantiating with a traditional hash function is expensive in the circuit.
  • We didn't want to solely rely on an algebraic hash function satisfying to achieve Spend Unlinkability.

However, those designs rely on both and for Faerie Resistance, while still requiring for Spend Unlinkability. (There are two designs for which this is not the case, but they rely on for Note Privacy (OOB) which was not acceptable).

By contrast, several designs involving (including the chosen design) have weaker assumptions for Faerie Resistance (only relying on ), and Spend Unlinkability does not require to hold: they can fall back on the same assumption as the designs (along with an additional assumption about the output of which is easily satisfied).

Use of

Most of the designs include either a multiplicative blinding term, or an additive blinding term, in order to achieve perfect Note Privacy (OOB) (to an adversary who does not know the note). The chosen design is effectively using $[\psi] \mathcal{K}$ for this purpose; a DL-breaking adversary only learns $(\mathsf{PRF}^{\mathsf{nf}}_{\mathsf{nk}}(\rho) + \psi) \bmod p$. This reduces Note Privacy (OOB) from perfect to statistical, but given that $\psi$ is drawn from a distribution statistically close to uniform, the loss is negligible. The benefit is that it does not require an additional scalar multiplication, making it more efficient inside the circuit.

's derivation has two motivations:

  • Deriving from a random value enables multiple derived values to be conveyed to the recipient within an action (such as the ephemeral secret , per ZIP 212), while keeping the note plaintext short.
  • Mixing into the derivation ensures that the sender can't repeat across two notes, which could have enabled spend linkability attacks in some designs.

The note that is committed to, and which the circuit takes as input, only includes (i.e. the circuit does not check the derivation from ). However, an adversarial sender is still constrained by this derivation, because the recipient recomputes during note decryption. If an action were created using an arbitrary (for which the adversary did not have a corresponding ), the recipient would derive a note commitment that did not match the action's commitment field, and reject it (as in Sapling).

Use of

The nullifier commits to the note value via for two reasons:

  • It domain-separates nullifiers for zero-valued notes from other notes. This is necessary because we do not require zero-valued notes to exist in the commitment tree.
  • Designs that bind the nullifier to require to achieve Faerie Resistance (and similarly where is applied to a value derived from ). Adding to the nullifier avoids this assumption: all of the bases used to derive are fixed and independent of , and so the nullifier can be viewed as a Pedersen hash where the input includes directly.

The variants were considered to avoid directly depending on $\rho$ (which in its native type is a base field element, not a group element). We decided instead to follow Sapling by defining an intermediate representation of $\rho$ as a group element, that is only used in nullifier computation. The circuit already needs to compute this group element, so this improves performance by removing an additional scalar multiplication.

We also considered variants that used a choice of fixed bases to provide domain separation for zero-valued notes. The most performant design (similar to the chosen design) does not achieve Faerie Resistance for an adversary that knows the recipient's full viewing key ( could be brute-forced to cancel out , causing a collision), and the other variants require assuming as mentioned above.

Signatures

Orchard signatures are an instantiation of RedDSA with a cofactor of 1.

TODO:

  • Should it be possible to sign partial transactions?
    • If we're going to merge down all the signatures into a single one, and also want this, we need to ensure there's a feasible MPC.

Circuit

Gadgets

We will use formulae for curve arithmetic using affine coordinates on short Weierstrass curves, derived from section 4.1 of Hüseyin Hışıl's thesis.

Incomplete addition

  • Inputs: $P = (x_p, y_p)$, $Q = (x_q, y_q)$
  • Output: $R = P + Q = (x_r, y_r)$

The formulae from Hışıl's thesis are:

Rename:

  • $(x_1, y_1)$ to $(x_q, y_q)$,
  • $(x_2, y_2)$ to $(x_p, y_p)$,
  • $(x_3, y_3)$ to $(x_r, y_r)$.

Let $\lambda = \dfrac{y_q - y_p}{x_q - x_p}$, which we implement as

$$\lambda \cdot (x_q - x_p) = y_q - y_p$$

Also,

$$x_r = \lambda^2 - x_q - x_p \qquad\qquad y_r = \lambda \cdot (x_q - x_r) - y_q$$

which is equivalent to

$$y_r + y_q = \lambda \cdot (x_q - x_r)$$

Assuming $x_p \neq x_q$,

Substituting for $\lambda$, we get the constraints:

    • Note that this constraint is unsatisfiable for $P ⸭ (-P)$ (when $x_p = x_q$), and so cannot be used with arbitrary inputs.

Complete addition

Suppose that we represent $\mathcal{O}$ as $(0, 0)$. ($0$ is not an $x$-coordinate of a valid point because we would need $y^2 = 0^3 + 5$, and $5$ is not square in the base field. Also, $0$ is not a $y$-coordinate of a valid point because $-5$ is not a cube in the base field.)

For the doubling case, Hışıl's thesis tells us that $\lambda$ has to instead be computed as $\lambda = \dfrac{3x^2}{2y}$.

Define

Witness where:

Constraints

Max degree: 6

Analysis of constraints

Propositions:

Cases:

Note that we rely on the fact that $0$ is not a valid $x$- or $y$-coordinate of any point on the Pallas curve, other than our representation of $\mathcal{O}$ as $(0, 0)$.

    • Completeness:

    • Soundness: is the only solution to

  • for

    • Completeness:

    • Soundness: is the only solution to

  • for

    • Completeness:

    • Soundness: is the only solution to

  • for

    • Completeness:

    • Soundness: is computed correctly, and is the only solution.

  • for

    • Completeness:

    • Soundness: is the only solution to

  • for and and

    • Completeness:

    • Soundness: is computed correctly, and is the only solution.

Fixed-base scalar multiplication

There are six fixed bases in the Orchard protocol:

  • , used in deriving the nullifier;
  • , used in spend authorization;
  • base for ;
  • and bases for ; and
  • base for .

Decompose scalar

We support fixed-base scalar multiplication with three types of scalars:

Full-width scalar

A $255$-bit scalar from $\mathbb{F}_q$. We decompose a full-width scalar into $85$ three-bit windows:

The scalar multiplication will be computed correctly for the decomposition representing any integer in the range $[0, 2^{255})$.

We range-constrain each $3$-bit word $k_i$ of the scalar decomposition using a polynomial range-check constraint:

$$\prod_{j=0}^{7} (k_i - j) = 0$$
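A sketch of this decomposition over plain bytes ($85$ windows of $3$ bits covering a $255$-bit little-endian scalar, as above; the function name is illustrative):

// Sketch: decompose a 255-bit little-endian scalar into 85 three-bit
// windows k_0..k_84, so that sum(k_i * 8^i) reconstructs the scalar.

fn decompose_3bit(bytes: &[u8; 32]) -> [u8; 85] {
    let mut windows = [0u8; 85];
    for (i, w) in windows.iter_mut().enumerate() {
        let bit = 3 * i;
        let mut k = 0u8;
        for j in 0..3 {
            let b = bit + j; // b <= 3*84 + 2 = 254, within the 255-bit range
            if (bytes[b / 8] >> (b % 8)) & 1 == 1 {
                k |= 1 << j;
            }
        }
        // Each window is range-constrained to [0, 8) in the circuit.
        assert!(k < 8);
        *w = k;
    }
    windows
}

fn main() {
    let mut bytes = [0u8; 32];
    bytes[0] = 0b10_1101; // scalar 45 = 5 + 5*8
    assert_eq!(&decompose_3bit(&bytes)[..2], &[5u8, 5]);
}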

Base field element

We support using a base field element as the scalar in fixed-base multiplication. This occurs, for example, in the scalar multiplication for the nullifier computation of the Action circuit: here, the scalar is the result of a base field addition.

Decompose the base field element into three-bit windows, and range-constrain each window, using the short range decomposition gadget in strict mode, with

If the scalar is witnessed directly then no issue of canonicity arises. However, because the scalar is given as a base field element here, care must be taken to ensure a canonical representation, since the $255$-bit decomposition can encode integers larger than the field modulus. That is, we must check that the decomposition represents an integer less than $p$, where $p$ is the Pallas base field modulus. Note that

To do this, we decompose into three pieces:

We check the correctness of this decomposition by: If the MSB is not set, then However, in the case where , we must check:

  • :
    • ,

To check that we make use of the three-bit running sum decomposition:

  • Firstly, we constrain to be a -bit value by enforcing its high bits to be all-zero. We can get from the decomposition:
  • Then, we constrain bits of to be zeroes; in other words, we constrain the three-bit word We make use of the running sum decomposition to obtain

Define . To check that we use 13 ten-bit lookups, where we constrain the running sum output of the lookup to be if

Short signed scalar

A short signed scalar is witnessed as a magnitude and sign such that

This is used for . We want to compute , where

and are each already constrained to bits (by their use as inputs to ).

Decompose the magnitude into three-bit windows, and range-constrain each window, using the short range decomposition gadget in strict mode, with

We have two additional constraints: where .

Load fixed base

Then, we precompute multiples of the fixed base $B$ for each window. This takes the form of a window table $M$ such that:

  • for the first $W-1$ rows ($0 \leq w \leq W-2$): $M_{w,k} = [(k+2) \cdot 8^w]B$
  • in the last row ($w = W-1$): $M_{W-1,k} = \left[k \cdot 8^{W-1} - \sum_{j=0}^{W-2} 2 \cdot 8^j\right]B$

The additional $+2$ term lets us avoid adding the point at infinity in the case $k = 0$. We offset these accumulated terms by subtracting them in the final window, i.e. we subtract $\sum_{j=0}^{W-2} 2 \cdot 8^j$.

Note: Although an offset of $+1$ would naively suffice, it introduces an edge case in which adjacent windows select related multiples: $[(7+1) \cdot 8^w]B = [(0+1) \cdot 8^{w+1}]B$, so two window table entries can evaluate to the same point.

In fixed-base scalar multiplication, we sum the multiples of $B$ at each window (except the last) using incomplete addition. Since the point doubling case is not handled by incomplete addition, we avoid it by using an offset of $+2$.
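To see how the offsets cancel, here is an integer-level sketch in which a multiple $[m]B$ is modeled by the integer $m$ (scaled down to $4$ windows so the arithmetic fits in machine integers; the $+2$ offset and final-row subtraction are as described above):

// Sketch: verify the windowed table with a +2 offset reconstructs k·B,
// using integers in place of curve points (a point [m]B is modeled by m).
// Scaled-down parameters: 4 windows of 3 bits (the circuit uses 85).

const WINDOWS: usize = 4;

fn table_entry(w: usize, k: u64) -> i64 {
    let base = 8i64.pow(w as u32);
    if w < WINDOWS - 1 {
        (k as i64 + 2) * base // offset avoids the point at infinity for k = 0
    } else {
        // Final row subtracts the accumulated offsets from earlier windows.
        let offset: i64 = (0..WINDOWS - 1).map(|j| 2 * 8i64.pow(j as u32)).sum();
        k as i64 * base - offset
    }
}

fn main() {
    for scalar in 0..(1u64 << (3 * WINDOWS as u32)) {
        let windows: Vec<u64> = (0..WINDOWS).map(|w| (scalar >> (3 * w)) & 7).collect();
        let sum: i64 = windows.iter().enumerate().map(|(w, &k)| table_entry(w, k)).sum();
        assert_eq!(sum, scalar as i64);
        // Every non-final entry is nonzero, so no window contributes O.
        for (w, &k) in windows.iter().enumerate().take(WINDOWS - 1) {
            assert_ne!(table_entry(w, k), 0);
        }
    }
    println!("offset table reconstructs all {} scalars", 1u64 << 12);
}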

For each window of fixed-base multiples :

  • Define a Lagrange interpolation polynomial that maps to the -coordinate of the multiple , i.e.
  • Find a value such that is a square in the field, but the wrong-sign -coordinate does not produce a square.

Repeating this for all windows, we end up with:

  • an table storing coefficients interpolating the coordinate for each window. Each -coordinate interpolation polynomial will be of the form where and 's are the coefficients for each power of ; and
  • a length- array of 's.

We load these precomputed values into fixed columns whenever we do fixed-base scalar multiplication in the circuit.

Fixed-base scalar multiplication

Given a decomposed scalar and a fixed base , we compute as follows:

  1. For each $k_i$ in the scalar decomposition, witness the $x$- and $y$-coordinates of the corresponding window-table multiple.
  2. Check that the witnessed point is on the curve: $y^2 = x^3 + 5$.
  3. Witness such that .
  4. For all windows but the last, use incomplete addition to sum the 's, resulting in .
  5. For the last window, use complete addition and return the final result.

Note: complete addition is required in the final step to correctly map to a representation of the point at infinity, ; and also to handle a corner case for which the last step is a doubling.

Constraints:

where (from the Pallas curve equation).

Signed short exponent

Recall that the signed short exponent is witnessed as a bit magnitude and a sign. Using the above algorithm, we compute the multiplication by the magnitude. Then, to get the final result, we conditionally negate using the sign.

Constraints:

Layout

Note: this doesn't include the last row that uses complete addition. In the implementation this is allocated in a different region.

Variable-base scalar multiplication

In the Orchard circuit we need to check where and the scalar field is with .

We have and , for .

Witness scalar

We're trying to compute for . Set and . Then we can compute

provided that , i.e. which covers the whole range we need because in fact .

Thus, given a scalar , we witness the boolean decomposition of (We use big-endian bit order for convenient input into the variable-base scalar multiplication algorithm.)

Variable-base scalar multiplication

We use an optimized double-and-add algorithm, copied from "Faster variable-base scalar multiplication in zk-SNARK circuits" with some variable name changes:

Acc := [2] T
for i from n-1 down to 0 {
    P := k_{i+1} ? T : −T
    Acc := (Acc + P) + Acc
}
return (k_0 = 0) ? (Acc - T) : Acc

It remains to check that the x-coordinates of each pair of points to be added are distinct.

When adding points in a prime-order group, we can rely on Theorem 3 from Appendix C of the Halo paper, which says that if we have two such points with nonzero indices wrt a given odd-prime order base, where the indices taken in the range are distinct disregarding sign, then they have different x-coordinates. This is helpful, because it is easier to reason about the indices of points occurring in the scalar multiplication algorithm than it is to reason about their x-coordinates directly.

So, the required check is equivalent to saying that the following "indexed version" of the above algorithm never asserts:

acc := 2
for i from n-1 down to 0 {
    p = k_{i+1} ? 1 : −1
    assert acc ≠ ± p
    assert (acc + p) ≠ acc    // X
    acc := (acc + p) + acc
    assert 0 < acc ≤ (q-1)/2
}
if k_0 = 0 {
    assert acc ≠ 1
    acc := acc - 1
}
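The indexed algorithm is cheap to check exhaustively for small parameters; the following sketch transcribes the pseudocode above directly, using a small stand-in odd prime for $q$:

// Sketch: run the "indexed version" of the double-and-add algorithm for
// every scalar of a small bit length n, checking that no assertion fires
// when the group-order bound holds. Here q is a small stand-in odd prime.

fn indexed_run(bits: &[bool], q: i128) -> bool {
    let n = bits.len() - 1; // bits are k_0..k_n; the loop uses k_1..k_n
    let mut acc: i128 = 2;
    for i in (0..n).rev() {
        let p: i128 = if bits[i + 1] { 1 } else { -1 };
        if acc == p || acc == -p { return false; } // assert acc != ±p
        // (acc + p) != acc always holds since p != 0.
        acc = (acc + p) + acc;
        if !(0 < acc && acc <= (q - 1) / 2) { return false; }
    }
    if !bits[0] {
        if acc == 1 { return false; } // assert acc != 1 before subtracting
        acc -= 1;
    }
    let _ = acc;
    true
}

fn main() {
    // With n = 4, max acc = 2^(n+1) + 2^n - 1 = 47 (derived below), so any
    // q with (q-1)/2 >= 47 (e.g. q = 97) admits no exceptional cases.
    let (n, q) = (4usize, 97i128);
    for k in 0..(1u32 << (n + 1)) {
        let bits: Vec<bool> = (0..=n).map(|i| (k >> i) & 1 == 1).collect();
        assert!(indexed_run(&bits, q), "exceptional case at k = {k}");
    }
    println!("no exceptional cases for n = {n}, q = {q}");
}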

The maximum value of acc is:

    <--- n 1s --->
  1011111...111111
= 1100000...000000 - 1

$= 2^{n+1} + 2^n - 1$

The assertion labelled X obviously cannot fail because $p \neq 0$. It is possible to see that acc is monotonically increasing except in the last conditional. It reaches its largest value when $k$ is maximal, i.e. $k = 2^{n+1} - 1$.

So to entirely avoid exceptional cases, we would need $2^{n+1} + 2^n - 1 \leq (q-1)/2$. But we can use a larger $n$ if the last few iterations use complete addition.

The first $n$ for which the algorithm using only incomplete addition fails is going to be $n = 252$, since $2^{253} + 2^{252} - 1 \geq (q-1)/2$. We need $n = 254$ to make the wraparound technique above work.

sage: q = 0x40000000000000000000000000000000224698fc0994a8dd8c46eb2100000001
sage: 2^253 + 2^252 - 1 < (q-1)//2
False
sage: 2^252 + 2^251 - 1 < (q-1)//2
True

So the last three iterations of the loop ($i = 2..0$) need to use complete addition, as does the conditional subtraction at the end. Writing this out using ⸭ for incomplete addition (as we do in the spec), we have:

Acc := [2] T
for i from 253 down to 3 {
    P := k_{i+1} ? T : −T
    Acc := (Acc ⸭ P) ⸭ Acc
}
for i from 2 down to 0 {
    P := k_{i+1} ? T : −T
    Acc := (Acc + P) + Acc  // complete addition
}
return (k_0 = 0) ? (Acc + (-T)) : Acc  // complete addition

Constraint program for optimized double-and-add (incomplete addition)

Define a running sum , where and:

where The helper . After substitution of , and , this becomes:

Here, is assigned to a cell. This is unlike previous 's, which were implicitly derived from , but never actually assigned.

The bits are used in three further steps, using complete addition:

If the least significant bit $k_0 = 1$, we set $B = \mathcal{O}$; otherwise we set $B = -T$. Then we return $\mathit{Acc} + B$ using complete addition.

Let

Output .

(Note that represents .)

Circuit design

We need six advice columns to witness . However, since are the same, we can perform two incomplete additions in a single row, reusing the same . We split the scalar bits used in incomplete addition into and halves and process them in parallel. This means that we effectively have two for loops:

  • the first, covering the half for from down to , with a special case at ; and
  • the second, covering the half for the remaining from down to , with a special case at .

For each and half, we have three sets of gates. Note that is going from ; is NOT indexing the rows.

This gate is only used on the first row (before the for loop). We check that are initialized to values consistent with the initial where

This gate is used on all rows corresponding to the for loop except the last.

where

This gate is used on the final iteration of the for loop, handling the special case where we check that the output has been witnessed correctly. where

Overflow check

cannot overflow for any , because it is a weighted sum of bits only up to , which is smaller than (and also ).

However, can overflow .

Since overflow can only occur in the final step that constrains , we have . It is then sufficient to also check that (so that ) and that . These conditions together imply that as an integer, and so as required.

Note: the bits do not represent a value reduced modulo , but rather a representation of the unreduced .

Optimized check for

Since , we have

We may assume that .

Therefore,

Given , we prove equivalence of and as follows:

  • shift the range by to give ;
  • observe that is guaranteed to be in and therefore cannot overflow or underflow modulo ;
  • using the fact that , observe that .

(We can see in a different way that this is correct by observing that it checks whether , so the upper bound is aligned as we would expect.)

Now, we can continue optimizing from :

Constraining to be all- or not-all- can be implemented almost "for free", as follows.

Recall that , so we have:

So are all exactly when .

Finally, we can merge the -bit decompositions for the and cases by checking that .

Overflow check constraints

Let . The constraints for the overflow check are:

Define

Witness , and decompose as .

Then the needed gates are:

where can be computed by another running sum. Note that the factor of has no effect on the constraint, since the RHS is zero.

Running sum range check

We make use of a -bit lookup range check in the circuit to subtract the low bits of . The range check subtracts the first bits of and right-shifts the result to give

Sinsemilla

Overview

Sinsemilla is a collision-resistant hash function and commitment scheme designed to be efficient in algebraic circuit models that support lookups, such as PLONK or Halo 2.

The security properties of Sinsemilla are similar to Pedersen hashes; it is not designed to be used where a random oracle, PRF, or preimage-resistant hash is required. The only claimed security property of the hash function is collision-resistance for fixed-length inputs.

Sinsemilla is roughly 4 times less efficient than the algebraic hashes Rescue and Poseidon inside a circuit, but around 19 times more efficient than Rescue outside a circuit. Unlike either of these hashes, the collision resistance property of Sinsemilla can be proven based on cryptographic assumptions that have been well-established for at least 20 years. Sinsemilla can also be used as a computationally binding and perfectly hiding commitment scheme.

The general approach is to split the message into -bit pieces, and for each piece, select from a table of bases in our cryptographic group. We combine the selected bases using a double-and-add algorithm. This ends up being provably as secure as a vector Pedersen hash, and makes advantageous use of the lookup facility supported by Halo 2.

Description

This section is an outline of how Sinsemilla works: for the normative specification, refer to §5.4.1.9 Sinsemilla Hash Function in the protocol spec. The incomplete point addition operator, ⸭, that we use below is also defined there.

Let $\mathbb{G}$ be a cryptographic group of prime order $q$. We write $\mathbb{G}$ additively, with identity $\mathcal{O}$, and using $[m] P$ for scalar multiplication of $P$ by $m$.

Let $k$ be an integer chosen based on efficiency considerations (the table size will be $2^k$). Let $n$ be an integer, fixed for each instantiation, such that messages are $kn$ bits. We use zero-padding to the next multiple of $k$ bits if necessary.

Setup: Choose $Q$ and $P[0..2^k - 1]$ as independent, verifiably random generators of $\mathbb{G}$, using a suitable hash into $\mathbb{G}$, such that none of $Q$ or $P[j]$ are $\mathcal{O}$.

In Orchard, we define to be dependent on a domain separator . The protocol specification uses in place of and in place of .

$\mathsf{SinsemillaHashToPoint}(M)$:

  • Split $M$ into $n$ groups of $k$ bits. Interpret each group as a $k$-bit little-endian integer $m_i$.
  • let $\mathit{Acc} := Q$
  • for $i$ from $0$ up to $n-1$:
    • let $\mathit{Acc} := (\mathit{Acc} ⸭ P[m_{i+1}]) ⸭ \mathit{Acc}$
  • return $\mathit{Acc}$

Let $\mathsf{SinsemillaHash}(M)$ be the $x$-coordinate of $\mathsf{SinsemillaHashToPoint}(M)$. (This assumes that $\mathbb{G}$ is a prime-order elliptic curve in short Weierstrass form, as is the case for Pallas and Vesta.)

It is slightly more efficient to express a double-and-add $[2] A + P$ as $(A + P) + A$. We also use incomplete additions: it is shown in the Sinsemilla security argument that in the case where $\mathbb{G}$ is a prime-order short Weierstrass elliptic curve, an exceptional case for addition would lead to finding a discrete logarithm, which can be assumed to occur with negligible probability even for adversarial input.
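The accumulation structure can be sketched over a toy group, modeling each point by its discrete logarithm modulo a small stand-in prime; this ignores the security argument entirely and only illustrates the chunk-select-and-accumulate flow:

// Sketch: Sinsemilla's accumulation with k = 3 (table size 8), modeling
// group elements by their discrete logs modulo a toy prime.
// Acc := Q; for each chunk m_i: Acc := (Acc + P[m_i]) + Acc.

const Q_TOY: u64 = 1_000_003; // toy odd prime "group order"
const K: usize = 3;

fn sinsemilla_toy(q: u64, p: &[u64; 1 << K], chunks: &[u8]) -> u64 {
    let mut acc = q;
    for &m in chunks {
        // (Acc ⸭ P[m]) ⸭ Acc, written as [2]Acc + P[m] in the exponent.
        acc = (2 * acc % Q_TOY + p[m as usize]) % Q_TOY;
    }
    acc
}

fn main() {
    // Stand-ins for the verifiably random generators Q and P[0..8).
    let q = 123_456u64;
    let p: [u64; 8] = [11, 22, 33, 44, 55, 66, 77, 88];
    let digest = sinsemilla_toy(q, &p, &[5, 0, 7, 3]);
    println!("toy Sinsemilla accumulator: {digest}");
}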

Use as a commitment scheme

Choose another generator $H$ independently of $Q$ and $P[0..2^k - 1]$.

The randomness $r$ for a commitment is chosen uniformly on $[0, q)$.

Let $\mathsf{SinsemillaCommit}_r(M) = \mathsf{SinsemillaHashToPoint}(M) + [r] H$.

Let $\mathsf{SinsemillaShortCommit}_r(M)$ be the $x$-coordinate of $\mathsf{SinsemillaCommit}_r(M)$. (This again assumes that $\mathbb{G}$ is a prime-order elliptic curve in short Weierstrass form.)

Note that unlike a simple Pedersen commitment, this commitment scheme ( or ) is not additively homomorphic.

Efficient implementation

The aim of the design is to optimize the number of bits that can be processed for each step of the algorithm (which requires a doubling and addition in $\mathbb{G}$) for a given table size. Using a single table of size $2^k$ group elements, we can process $k$ bits at a time.

Constraint program

Let .

Input: . (The message words are 1-indexed here, as in the protocol spec, but we start the loop from so that corresponds to in the protocol spec.)

Output: .

  • for from up to :

PLONK / Halo 2 constraints

Message decomposition

We have an -bit message . (Note that the message words are 1-indexed as in the protocol spec.)

Initialise the running sum and define . We will end up with

Rearranging gives us an expression for each word of the original message , which we can look up in the table.

In other words, .

For a little-endian decomposition as used here, the running sum is initialized to the scalar and ends at 0. For a big-endian decomposition as used in variable-base scalar multiplication, the running sum would start at 0 and end with recovering the original scalar.

The running sum only applies to message words within a single field element, i.e. if then we will have several disjoint running sums. A longer message can be constructed by splitting the message words across several field elements, and then running several instances of the constraints below. An additional selector is set to for the last step of each element, except for the last element where it is set to .

In order to support chaining multiple field elements without a gap, we will use a slightly more complicated expression for that effectively forces to zero for the last step of each element, as indicated by . This allows the cell that would have been to be used to reinitialize the running sum for the next element.

Generator lookup table

The Sinsemilla circuit makes use of pre-computed random generators. These are loaded into a lookup table:

Layout

Note: is synthesized from and ; it is shown here only for clarity.

, , , etc. would be copied in using equality constraints.

Optimized Sinsemilla gate

The Halo 2 circuit API can automatically substitute , , , and , so we don't need to do that manually.

  • for from up to :

Note that each term of the last constraint is multiplied by relative to the constraint program given earlier. This is a small optimization that avoids divisions by .

By gating the lookup expression on , we avoid the need to fill in unused cells with dummy values to pass the lookup argument. The optimized lookup value (using a default index of ) is:

This increases the degree of the lookup argument to .

MerkleCRH

Message decomposition

is used in the hash function. The input to is:

where:

  • ,
  • ,
  • ,

where $\mathsf{left}$ and $\mathsf{right}$ are allowed to be non-canonical $255$-bit encodings.

We break these inputs into the following MessagePieces:

are constrained by the to be bits, bits, and bits respectively.

In a custom gate, we check this message decomposition by enforcing the following constraints:


  1. , the index-1 running sum output of , is copied into the gate. has been constrained by the to be bits. We recover the subpieces using : is loaded into a fixed column at each layer of the hash. It is used both as a gate selector, and to fix the value of . We check that

Note: The reason for using instead of is that when (hashing two leaves). We cannot have a zero-valued selector, since a constraint gated by a zero-valued selector is never checked.


  1. , the index-1 running sum output of , is copied into the gate. has been constrained by the to be bits. We witness the subpieces outside this gate, and constrain them each to be bits. Inside the gate, we check that We also recover the subpiece using :

We have now derived or witnessed every subpiece, and range-constrained every subpiece:

  • ( bits), derived as ;
  • ( bits), equal to ;
  • ( bits), derived as ;
  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and constrained outside the gate;
  • is constrained to equal , and we use them to reconstruct the original field element inputs:

Circuit components

The Orchard circuit spans $10$ advice columns while the Sinsemilla chip only uses $5$ advice columns. We distribute the path hashing evenly across two Sinsemilla chips to make better use of the available circuit area. Since the output from the previous layer hash is copied into the next layer hash, we maintain continuity even when moving from one chip to the other.

Message decomposition

is used in the function. The input to is:

where , are Pallas base field elements, and

Sinsemilla operates on multiples of 10 bits, so we start by decomposing the message into chunks:

Then we recompose the chunks into message pieces:

Each message piece is constrained by to its stated length. Additionally, and are witnessed as field elements, so we know they are canonical. However, we need additional constraints to enforce that:

  • The chunks are the correct bit lengths (or else they could overlap in the decompositions and allow the prover to witness an arbitrary message).
  • The chunks contain the canonical decompositions of and (or else the prover could witness an input to that is equivalent to and but not identical).

Some of these constraints can be implemented with reusable circuit gadgets. We define a custom gate controlled by the selector to hold the remaining constraints.

Bit length constraints

Chunks and are directly constrained by Sinsemilla. For the remaining chunks, we use the following constraints:

where and is a short lookup range check.

Decomposition constraints

We have now derived or witnessed every subpiece, and range-constrained every subpiece:

  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and boolean-constrained in the gate;
  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and boolean-constrained in the gate.

We can now use them to reconstruct both the (chunked) message pieces, and the original field element inputs:

Canonicity checks

At this point, we have constrained and to be 255-bit values, with top bits and respectively. We have also constrained:

where is the Pallas base field modulus. The remaining constraints will enforce that these are indeed canonically-encoded field elements, i.e.

The Pallas base field modulus has the form $p = 2^{254} + t_p$, where $t_p$ is $126$ bits. We therefore know that if the top bit is not set, then the remaining bits will always comprise a canonical encoding of a field element. Thus the canonicity checks below are enforced if and only if the respective top bit is set to 1.

In the constraints below we use a base-$2^{10}$ variant of the method used in libsnark (originally from [SVPBABW2012, Appendix C.1]) for range constraints:

  • Let be the smallest power of greater than .
  • Enforce .
  • Let .
  • Enforce .

with

In these cases, we check that :

  1. Since we know that and in particular

  2. To check that , we use two constraints:

    a) . This is expressed in the custom gate as where is the index-13 running sum output by

    b) . To check this, we decompose into thirteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

with

In these cases, we check that :

  1. Since we know that and in particular

  2. To check that , we use two constraints:

    a) . is already constrained individually to be a -bit value. is the index-13 running sum output by By constraining we constrain

    b) . To check this, we decompose into fourteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

Region layout

The constraints controlled by the selector are arranged across 9 advice columns, requiring two rows.

NoteCommit

Message decomposition

is used in the function. The input to is:

where:

  • are representations of Pallas curve points, with $255$ bits used for the $x$-coordinate and $1$ bit used for the $y$-coordinate.
  • are Pallas base field elements.
  • is a -bit value.

Sinsemilla operates on multiples of 10 bits, so we start by decomposing the message into chunks:

Then we recompose the chunks into message pieces:

where is 4 zero bits (corresponding to the padding applied by the Sinsemilla function).

Each message piece is constrained by to its stated length. Additionally:

  • and are witnessed and checked to be valid elliptic curve points.
  • is witnessed as a field element, but its decomposition is sufficient to constrain it to be a 64-bit value.
  • and are witnessed as field elements, so we know they are canonical.

However, we need additional constraints to enforce that:

  • The chunks are the correct bit lengths (or else they could overlap in the decompositions and allow the prover to witness an arbitrary message).
  • The chunks contain the canonical decompositions of , , , and (or else the prover could witness multiple equivalent inputs to ).

Some of these constraints are implemented with a reusable circuit gadget, . We define custom gates for the remainder. Since these gates use simple boolean selectors activated on different rows, their selectors are eligible for combining, reducing the overall proof size.

Message piece decomposition

We check the decomposition of each message piece in its own region. There is no need to check the whole pieces:

  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and constrained outside the gate;
  • ( bits) is witnessed and constrained outside the gate;

The following helper gates are defined:

has been constrained to be bits by the Sinsemilla hash.

Region layout

Constraints

Outside this gate, we have constrained:

has been constrained to be bits by the .

Region layout

Constraints

Outside this gate, we have constrained:

  • is equality-constrained to , where the latter is the index-1 running sum output of constrained by the hash to be bits.

has been constrained to be bits by the .

Region layout

Constraints

Outside this gate, we have constrained:

has been constrained to be bits by the .

Region layout

Constraints

Outside this gate, we have constrained:

  • is equality-constrained to , where the latter is the index-1 running sum output of constrained by the hash to be 240 bits.

has been constrained to be bits by the .

Region layout

Constraints

Outside this gate, we have constrained:

Field element checks

All message pieces and subpieces have been range-constrained by the earlier decomposition gates. They are now used to:

  • constrain each field element , , , and to be 255-bit values, with top bits , , , and respectively.
  • constrain where is the Pallas base field modulus.
  • check that these are indeed canonically-encoded field elements, i.e.

The Pallas base field modulus has the form $p = 2^{254} + t_p$, where $t_p$ is $126$ bits. We therefore know that if the top bit is not set, then the remaining bits will always comprise a canonical encoding of a field element. Thus the canonicity checks below are enforced if and only if the corresponding top bit is set to 1.

In the constraints below we use a base-$2^{10}$ variant of the method used in libsnark (originally from [SVPBABW2012, Appendix C.1]) for range constraints:

  • Let be the smallest power of greater than .
  • Enforce .
  • Let .
  • Enforce .

with

Recall that . When the top bit is set, we check that :

  1. Since we know that and in particular

  2. To check that , we use two constraints:

    a) . This is expressed in the custom gate as where is the index-13 running sum output by

    b) . To check this, we decompose into thirteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

Region layout

Constraints

with

Recall that . When the top bit is set, we check that :

  1. To check that we use two constraints:

    a) is already constrained individually to be a -bit value. is the index-13 running sum output by By constraining we constrain

    b) . To check this, we decompose into fourteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

Region layout

Constraints

Region layout

Constraints

with

Recall that . When the top bit is set, we check that :

  1. To check that we use two constraints:

    a) is already constrained individually to be a -bit value. is the index-13 running sum output by By constraining we constrain

    b) . To check this, we decompose into fourteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

Region layout

Constraints

with

Recall that . When the top bit is set, we check that :

  1. Since we know that and in particular

  2. To check that we use two constraints:

    a) is already constrained individually to be a -bit value. is the index-13 running sum output by By constraining we constrain

    b) . To check this, we decompose into thirteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

Region layout

Constraints

$y$-coordinate checks

Note that only the LSB of the $y$-coordinates was input to the hash, while the other bits of each $y$-coordinate were unused. However, we must still check that the witnessed LSB matches the original point's $y$-coordinate. The checks for each $y$-coordinate will follow the same format. For each $y$-coordinate, we witness:

where is for , and for . Let We decompose it into ten-bit words using a strict lookup decomposition. The running sum outputs allow us to substitute

Recall that and were pieces input to the Sinsemilla hash and have already been boolean-constrained. and are constrained outside this gate to and bits respectively. To constrain the remaining chunks, we use the following constraints:

Then, to check that the decomposition was correct:

with

In these cases, we check that :

  1. Since we know that and in particular

  2. To check that , we use two constraints:

    a) . This is expressed in the custom gate as where is the index-13 running sum output by the -bit lookup decomposition of .

    b) . To check this, we decompose into thirteen 10-bit words (little-endian) using a running sum , looking up each word in a -bit lookup table. We then enforce in the custom gate that

Region layout

Constraints

Outside this gate, we have constrained:

This can be checked in exactly the same way as , with replaced by .

Decomposition

Given a field element, these gadgets decompose it into $K$-bit windows, where each window $k_i$ is a $K$-bit value.

This is done using a running sum $z_i$. We initialize the running sum $z_0 = \alpha$, and compute subsequent terms $z_{i+1} = \frac{z_i - k_i}{2^K}$. This gives us:

Strict mode

Strict mode constrains the running sum output to be zero, thus range-constraining the field element to be within $W \cdot K$ bits.

In strict mode, we are also assured that $z_{W-1}$ gives us the last window in the decomposition.
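A sketch of the running sum over plain integers ($K = 3$ as in the circuit; strict mode corresponds to the final zero check):

// Sketch: running-sum decomposition z_{i+1} = (z_i - k_i) / 2^K, where
// each k_i is the low-K-bit window of z_i. Strict mode requires the final
// running sum to be zero, range-constraining the element to W*K bits.

const K: u32 = 3;

fn running_sum(alpha: u64, windows: usize) -> (Vec<u64>, Vec<u64>) {
    let mut z = vec![alpha];
    let mut k = Vec::new();
    for _ in 0..windows {
        let zi = *z.last().unwrap();
        let ki = zi & ((1 << K) - 1); // the K-bit window
        k.push(ki);
        z.push((zi - ki) >> K); // exact division by 2^K
    }
    (z, k)
}

fn main() {
    let (z, k) = running_sum(0b101_110_011, 3);
    assert_eq!(k, vec![0b011, 0b110, 0b101]);
    assert_eq!(*z.last().unwrap(), 0); // strict mode: z_W = 0
}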

Lookup decomposition

This gadget makes use of a $K$-bit lookup table to decompose a field element into $K$-bit words. Each $K$-bit word is range-constrained by a lookup in the table.

The region layout for the lookup decomposition uses a single advice column , and two selectors and

Short range check

Using two $K$-bit lookups, we can range-constrain a field element $\alpha$ to be $n$ bits, where $n \leq K$. To do this:

  1. Constrain $\alpha$ to be within $K$ bits using a $K$-bit lookup.
  2. Constrain $\alpha \cdot 2^{K-n}$ to be within $K$ bits using a $K$-bit lookup.
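Numerically, the two lookups interact as follows (a sketch with $K = 10$; in the circuit the shift is a field multiplication by $2^{K-n}$, and `in_table` stands for the lookup argument):

// Sketch: range-constrain alpha to n bits (n <= K) with two K-bit lookups:
// (1) alpha itself is in the K-bit table; (2) alpha * 2^(K-n) is too.
// If alpha >= 2^n, then alpha * 2^(K-n) >= 2^K and the second lookup fails.

const K: u32 = 10;

fn in_table(x: u64) -> bool {
    x < (1u64 << K) // membership in the K-bit lookup table
}

fn short_range_check(alpha: u64, n: u32) -> bool {
    assert!(n <= K);
    in_table(alpha) && in_table(alpha << (K - n))
}

fn main() {
    assert!(short_range_check(63, 6)); // 63 fits in 6 bits
    assert!(!short_range_check(64, 6)); // 64 does not: 64 << 4 = 2^10
}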

The short variant of the lookup decomposition introduces a selector. The same advice column has here been renamed to for clarity:

where Note that is assigned to a fixed column at keygen, and copied in at proving time. This is used in the gate enabled by the selector to check that was shifted correctly:

Combined lookup expression

Since the lookup decomposition and its short variant both make use of the same lookup table, we combine their lookup input expressions into a single one:

where and are the same cell (but distinguished here for clarity of usage).

Short range decomposition

For a short range (for instance, a $3$-bit word), we can range-constrain each word $w$ using a degree-$2^K$ polynomial constraint instead of a lookup:

$$\prod_{k=0}^{2^K - 1} (w - k) = 0$$
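A sketch evaluating this vanishing polynomial over plain integers ($K = 3$, so the constraint has degree $8$):

// Sketch: the range constraint for a K-bit word w is the vanishing
// polynomial prod_{k=0}^{2^K - 1} (w - k) = 0, here over plain integers.

fn range_poly(w: i64, k_bits: u32) -> i64 {
    (0..(1i64 << k_bits)).map(|k| w - k).product()
}

fn main() {
    for w in 0..8 {
        assert_eq!(range_poly(w, 3), 0); // in range: some factor is zero
    }
    assert_ne!(range_poly(8, 3), 0); // out of range: 8! = 40320
}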