Update book to remove Halo 2 content

This commit is contained in:
Jack Grigg 2021-03-03 22:44:30 +00:00
parent d40ed36d50
commit c713e804fa
42 changed files with 9 additions and 3292 deletions

View File

@ -1,4 +1,4 @@
name: halo2 book
name: Pasta book
on:
push:
@ -22,7 +22,7 @@ jobs:
command: install
args: mdbook-katex
- name: Build halo2 book
- name: Build Pasta book
run: mdbook build book/
- name: Deploy to GitHub Pages

View File

@ -81,7 +81,7 @@ jobs:
uses: peaceiris/actions-mdbook@v1
with:
mdbook-version: '0.4.5'
- name: Test halo2 book
- name: Test Pasta book
run: mdbook test -L target/debug/deps book/
clippy:

View File

@ -8,6 +8,6 @@ authors = [
language = "en"
multilingual = false
src = "src"
title = "The halo2 Book"
title = "The Pasta Book"
[preprocessor.katex]

View File

@ -1,36 +1,6 @@
# The halo2 Book
# The Pasta Book
[halo2](README.md)
- [Concepts](concepts.md)
- [Proof systems](concepts/proofs.md)
- [UltraPLONK Arithmetization](concepts/arithmetization.md)
- [Cores](concepts/cores.md)
- [Chips](concepts/chips.md)
- [Gadgets](concepts/gadgets.md)
- [User Documentation](user.md)
- [A simple example](user/simple-example.md)
- [Lookup tables](user/lookup-tables.md)
- [Gadgets](user/gadgets.md)
- [Tips and tricks](user/tips-and-tricks.md)
[Pasta](README.md)
- [Design](design.md)
- [Proving system](design/proving-system.md)
- [Lookup argument](design/proving-system/lookup.md)
- [Permutation argument](design/proving-system/permutation.md)
- [Circuit commitments](design/proving-system/circuit-commitments.md)
- [Vanishing argument](design/proving-system/vanishing.md)
- [Multipoint opening argument](design/proving-system/multipoint-opening.md)
- [Inner product argument](design/proving-system/inner-product.md)
- [Comparison to other work](design/proving-system/comparison.md)
- [Implementation](design/implementation.md)
- [Fields](design/implementation/fields.md)
- [Gadgets](design/gadgets.md)
- [SHA-256](design/gadgets/sha256.md)
- [16-bit table chip](design/gadgets/sha256/table16.md)
- [Background Material](background.md)
- [Fields](background/fields.md)
- [Polynomials](background/polynomials.md)
- [Cryptographic groups](background/groups.md)
- [Elliptic curves](background/curves.md)
- [UltraPLONK arithmetisation](background/upa.md)
- [Polynomial commitment using inner product argument](background/pc-ipa.md)
- [Recursion](background/recursion.md)

View File

@ -1,7 +0,0 @@
# Background Material
This section covers the background material required to understand the Halo 2 proving
system. It is targeted at an ELI15 (Explain It Like I'm 15) level; if you think anything
could do with additional explanation, [let us know]!
[let us know]: https://github.com/zcash/halo2/issues/new/choose

View File

@ -1,252 +0,0 @@
# Elliptic curves
Elliptic curves constructed over finite fields are another important cryptographic tool.
We use elliptic curves because they provide a cryptographic [group](fields.md#Groups),
i.e. a group in which the discrete logarithm problem (discussed below) is hard.
There are several ways to define the curve equation, but for our purposes, let
$\mathbb{F}_p$ be a large (255-bit) field, and then let the set of solutions $(x, y)$ to
$y^2 = x^3 + b$ for some constant $b$ define the $\mathbb{F}_p$-rational points on an
elliptic curve $E(\mathbb{F}_p)$. These $(x, y)$ coordinates are called "affine
coordinates". Each of the $\mathbb{F}_p$-rational points, together with a "point at
infinity" $\mathcal{O}$ that serves as the group identity, can be interpreted as an
element of a group. By convention, elliptic curve groups are written additively.
![](https://i.imgur.com/JvLS6yE.png)
*"Three points on a line sum to zero, which is the point at infinity."*
The group addition law is simple: to add two points together, find the line that
intersects both points and obtain the third point, and then negate its $y$-coordinate. The
case that a point is being added to itself, called point doubling, requires special
handling: we find the line tangent to the point, and then find the single other point that
intersects this line and then negate. Otherwise, in the event that a point is being
"added" to its negation, the result is the point at infinity.
The ability to add and double points naturally gives us a way to scale them by integers,
called _scalars_. The number of points on the curve is the group order. If this number
is a prime $q$, then the scalars can be considered as elements of a _scalar field_,
$\mathbb{F}_q$.
Elliptic curves, when properly designed, have an important security property. Given two
random elements $G, H \in E(\mathbb{F}_p),$ finding $a$ such that $[a] G = H$, otherwise
known as the discrete log of $H$ with respect to $G$, is considered computationally
infeasible with classical computers. This is called the elliptic curve discrete log
assumption.
If an elliptic curve group $\mathbb{G}$ has prime order $q$ (like the ones used in Halo 2),
then it is a finite cyclic group. Recall from the section on [groups](fields.md#Groups)
that this implies it is isomorphic to $\mathbb{Z}/q\mathbb{Z}$, or equivalently, to the
scalar field $\mathbb{F}_q$. Each possible generator $G$ fixes the isomorphism; then
an element on the scalar side is precisely the discrete log of the corresponding group
element with respect to $G$. In the case of a cryptographically secure elliptic curve,
the isomorphism is hard to compute in the $\mathbb{G} \rightarrow \mathbb{F}_q$ direction
because the elliptic curve discrete log problem is hard.
> It is sometimes helpful to make use of this isomorphism by thinking of group-based
> cryptographic protocols and algorithms in terms of the scalars instead of in terms of
> the group elements. This can make proofs and notation simpler.
>
> For instance, it has become common in papers on proof systems to use the notation $[x]$
> to denote a group element with discrete log $x$, where the generator is implicit.
>
> We also used this idea in the
> "[distinct-x theorem](https://zips.z.cash/protocol/protocol.pdf#thmdistinctx)",
> in order to prove correctness of optimizations
> [for elliptic curve scalar multiplication](https://github.com/zcash/zcash/issues/3924)
> in Sapling, and an endomorphism-based optimization in Appendix C of the original
> [Halo paper](https://eprint.iacr.org/2019/1021.pdf).
## Curve arithmetic
### Point doubling
The simplest situation is doubling a point $(x_0, y_0)$. Continuing with our example
$y^2 = x^3 + b$, this is done first by computing the derivative
$$
\lambda = \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{3x^2}{2y}.
$$
To obtain expressions for $(x_1, y_1) = (x_0, y_0) + (x_0, y_0),$ we consider
$$
\begin{aligned}
\frac{-y_1 - y_0}{x_1 - x_0} = \lambda &\implies -y_1 = \lambda(x_1 - x_0) + y_0 \\
&\implies \boxed{y_1 = \lambda(x_0 - x_1) - y_0}.
\end{aligned}
$$
To get the expression for $x_1,$ we substitute $y = \lambda(x_0 - x) - y_0$ into the
elliptic curve equation:
$$
\begin{aligned}
y^2 = x^3 + b &\implies (\lambda(x_0 - x) - y_0)^2 = x^3 + b \\
&\implies x^3 - \lambda^2 x^2 + \cdots = 0 \leftarrow\text{(rearranging terms)} \\
&= (x - x_0)(x - x_0)(x - x_1) \leftarrow\text{(known roots $x_0, x_0, x_1$)} \\
&= x^3 - (x_0 + x_0 + x_1)x^2 + \cdots.
\end{aligned}
$$
Comparing coefficients for the $x^2$ term gives us
$\lambda^2 = x_0 + x_0 + x_1 \implies \boxed{x_1 = \lambda^2 - 2x_0}.$
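As a sanity check, the doubling formulas above can be run over a toy curve (a sketch assuming the small parameters $p = 11$ and $b = 7$, far too small for real use):

```python
# Toy example (NOT cryptographically sized): double a point on
# y^2 = x^3 + 7 over F_11 using the affine formulas above.
p = 11
b = 7

def double(P):
    x0, y0 = P
    lam = (3 * x0 * x0) * pow(2 * y0, -1, p) % p  # lambda = 3x^2 / 2y
    x1 = (lam * lam - 2 * x0) % p                 # x1 = lambda^2 - 2*x0
    y1 = (lam * (x0 - x1) - y0) % p               # y1 = lambda(x0 - x1) - y0
    return (x1, y1)

P = (2, 2)                                 # on the curve: 2^2 = 2^3 + 7 (mod 11)
assert (P[1]**2 - P[0]**3 - b) % p == 0
Q = double(P)
assert (Q[1]**2 - Q[0]**3 - b) % p == 0    # the double lies on the curve too
```

Note that `pow(x, -1, p)` computes a modular inverse (Python 3.8+); a real implementation would avoid this inversion, as the next section explains.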
### Projective coordinates
This unfortunately requires an expensive inversion of $2y$. We can avoid this by arranging
our equations to "defer" the computation of the inverse, since we often do not need the
actual affine $(x', y')$ coordinate of the resulting point immediately after an individual
curve operation. Let's introduce a third coordinate $Z$ and scale our curve equation by
$Z^3$ like so:
$$
Z^3 y^2 = Z^3 x^3 + Z^3 b
$$
Our original curve is just this curve at the restriction $Z = 1$. If we allow the affine
point $(x, y)$ to be represented by $X = xZ$, $Y = yZ$ and $Z \neq 0$ then we have the
[homogenous projective curve](https://en.wikipedia.org/wiki/Homogeneous_coordinates)
$$
Y^2 Z = X^3 + Z^3 b.
$$
Obtaining $(x, y)$ from $(X, Y, Z)$ is as simple as computing $(X/Z, Y/Z)$ when
$Z \neq 0$. (When $Z = 0,$ we are dealing with the point at infinity $\mathcal{O} := (0:1:0)$.) In
this form, we now have a convenient way to defer the inversion required by doubling a
point. The general strategy is to express $x', y'$ as rational functions using $x = X/Z$
and $y = Y/Z$, rearrange to make their denominators the same, and then take the resulting
point $(X, Y, Z)$ to have $Z$ be the shared denominator and $X = x'Z, Y = y'Z$.
> Projective coordinates are often, but not always, more efficient than affine
> coordinates. There may be exceptions to this when either we have a different way to
> apply Montgomery's trick, or when we're in the circuit setting where multiplications and
> inversions are about equally as expensive (at least in terms of circuit size).
The following shows an example of doubling a point $(X, Y, Z) = (xZ, yZ, Z)$ without an
inversion. Substituting with $X, Y, Z$ gives us
$$
\lambda = \frac{3x^2}{2y} = \frac{3(X/Z)^2}{2(Y/Z)} = \frac{3 X^2}{2YZ}
$$
and gives us
$$
\begin{aligned}
x' &= \lambda^2 - 2x \\
&= \lambda^2 - \frac{2X}{Z} \\
&= \frac{9 X^4}{4Y^2Z^2} - \frac{2X}{Z} \\
&= \frac{9 X^4 - 8XY^2Z}{4Y^2Z^2} \\
&= \frac{18 X^4 Y Z - 16XY^3Z^2}{8Y^3Z^3} \\
\\
y' &= \lambda (x - x') - y \\
&= \lambda (\frac{X}{Z} - \frac{9 X^4 - 8XY^2Z}{4Y^2Z^2}) - \frac{Y}{Z} \\
&= \frac{3 X^2}{2YZ} (\frac{X}{Z} - \frac{9 X^4 - 8XY^2Z}{4Y^2Z^2}) - \frac{Y}{Z} \\
&= \frac{3 X^3}{2YZ^2} - \frac{27 X^6 - 24X^3Y^2Z}{8Y^3Z^3} - \frac{Y}{Z} \\
&= \frac{12 X^3Y^2Z - 8Y^4Z^2 - 27 X^6 + 24X^3Y^2Z}{8Y^3Z^3}
\end{aligned}
$$
Notice how the denominators of $x'$ and $y'$ are the same. Thus, instead of computing
$(x', y')$ directly, we can compute a new projective point $(X', Y', Z')$ with
$Z' = 8Y^3Z^3$ and $X', Y'$ set to the corresponding numerators, such that $X'/Z' = x'$
and $Y'/Z' = y'$. This completely avoids the need to perform an inversion when doubling,
and something analogous to this can be done when adding two distinct points.
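The deferred-inversion doubling can be checked numerically against the affine result, reusing the toy curve $y^2 = x^3 + 7$ over $\mathbb{F}_{11}$ (an illustrative sketch, not an optimized routine):

```python
# Sketch: projective doubling without an inversion, checked against the
# affine result on the toy curve y^2 = x^3 + 7 over F_11.
p = 11

def double_projective(X, Y, Z):
    Xp = (18 * X**4 * Y * Z - 16 * X * Y**3 * Z**2) % p   # numerator of x'
    Yp = (36 * X**3 * Y**2 * Z - 8 * Y**4 * Z**2 - 27 * X**6) % p  # numerator of y'
    Zp = (8 * Y**3 * Z**3) % p                            # the shared denominator
    return (Xp, Yp, Zp)

X, Y, Z = 2, 2, 1                          # affine point (2, 2) lifted to Z = 1
Xp, Yp, Zp = double_projective(X, Y, Z)
# One deferred inversion recovers the affine coordinates:
x1 = Xp * pow(Zp, -1, p) % p
y1 = Yp * pow(Zp, -1, p) % p
assert (x1, y1) == (5, 0)                  # matches the affine doubling formulas
```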
### TODO: Point addition
$$
\begin{aligned}
P + Q &= R\\
(x_p, y_p) + (x_q, y_q) &= (x_r, y_r) \\
\lambda &= \frac{y_q - y_p}{x_q - x_p} \\
x_r &= \lambda^2 - x_p - x_q \\
y_r &= \lambda(x_p - x_r) - y_p
\end{aligned}
$$
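The chord rule above can likewise be sketched over toy parameters (again $y^2 = x^3 + 7$ over $\mathbb{F}_{11}$; this handles only the distinct-point case, not doubling or the point at infinity):

```python
# Add two distinct affine points on y^2 = x^3 + 7 over F_11 (toy parameters).
p = 11

def add(P, Q):
    (xp, yp), (xq, yq) = P, Q
    lam = (yq - yp) * pow(xq - xp, -1, p) % p   # slope of the chord through P, Q
    xr = (lam * lam - xp - xq) % p
    yr = (lam * (xp - xr) - yp) % p
    return (xr, yr)

R = add((2, 2), (5, 0))
assert (R[1]**2 - R[0]**3 - 7) % p == 0         # the sum lies on the curve
```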
----------
Important notes:
* There exist efficient formulae[^complete-formulae] for point addition that do not have
edge cases (so-called "complete" formulae) and that unify the addition and doubling
cases together. The result of adding a point to its negation using those formulae
produces $Z = 0$, which represents the point at infinity.
* In addition, there are other models, like the Jacobian representation
$(X, Y, Z) = (xZ^2, yZ^3, Z)$, where the curve is rescaled by $Z^6$ instead of $Z^3$.
This representation has even more efficient arithmetic, but no unified/complete formulae.
* We can easily compare two curve points $(X_1, Y_1, Z_1)$ and $(X_2, Y_2, Z_2)$ for
equality in the homogenous projective coordinate space by "homogenizing" their
Z-coordinates; the checks become $X_1 Z_2 = X_2 Z_1$ and $Y_1 Z_2 = Y_2 Z_1$.
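The inversion-free equality check from the last bullet is a one-liner (a toy sketch over $\mathbb{F}_{11}$):

```python
# Compare projective points by cross-multiplying Z-coordinates (F_11 toy example).
p = 11

def eq_projective(P, Q):
    X1, Y1, Z1 = P
    X2, Y2, Z2 = Q
    # Equal iff X1/Z1 = X2/Z2 and Y1/Z1 = Y2/Z2, checked without inversions.
    return (X1 * Z2 - X2 * Z1) % p == 0 and (Y1 * Z2 - Y2 * Z1) % p == 0

assert eq_projective((2, 2, 1), (4, 4, 2))      # the same point, rescaled by 2
assert not eq_projective((2, 2, 1), (5, 0, 1))  # genuinely different points
```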
## Curve endomorphisms
Imagine that $\mathbb{F}_p$ has a primitive cube root of unity, or in other words that
$3 | p - 1$ and so an element $\zeta_p$ generates a $3$-order multiplicative subgroup.
Notice that a point $(x, y)$ on our example elliptic curve $y^2 = x^3 + b$ has two cousin
points: $(\zeta_p x, y)$ and $(\zeta_p^2 x, y)$, because the computation $x^3$ effectively
kills the $\zeta_p$ component of the $x$-coordinate (since $\zeta_p^3 = 1$). Applying the
map $(x, y) \mapsto (\zeta_p x, y)$
is an application of an endomorphism over the curve. The exact mechanics involved are
complicated, but when the curve has a prime $q$ number of points (and thus a prime
"order") the effect of the endomorphism is to multiply the point by a scalar in
$\mathbb{F}_q$ which is also a primitive cube root $\zeta_q$ in the scalar field.
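A toy check that the map really lands back on the curve (a sketch with the assumed small prime $p = 13$, which satisfies $3 \mid p - 1$, and an arbitrary $b = 3$):

```python
# Sketch of the endomorphism (x, y) -> (zeta_p * x, y) on a toy curve.
# p = 13 satisfies 3 | p - 1, and zeta_p = 3 is a primitive cube root of
# unity (3^3 = 27 = 1 mod 13). Curve: y^2 = x^3 + 3 over F_13.
p, b = 13, 3
zeta_p = 3
assert pow(zeta_p, 3, p) == 1 and zeta_p != 1

def on_curve(P):
    x, y = P
    return (y * y - x**3 - b) % p == 0

P = (1, 2)
assert on_curve(P)
# (zeta_p * x)^3 = zeta_p^3 * x^3 = x^3, so the image is still on the curve:
assert on_curve((zeta_p * P[0] % p, P[1]))
```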
## Curve point compression
TODO
## Cycles of curves
Let $E_p$ be an elliptic curve over a finite field $\mathbb{F}_p,$ where $p$ is a prime.
We denote this by $E_p/\mathbb{F}_p,$ and we denote the group of points of $E_p$ over
$\mathbb{F}_p$ by $E_p(\mathbb{F}_p),$ which has order $q = \#E_p(\mathbb{F}_p).$ For this
curve, we call $\mathbb{F}_p$ the "base field" and $\mathbb{F}_q$ the "scalar field".
We instantiate our proof system over the elliptic curve $E_p/\mathbb{F}_p$. This allows us
to prove statements about $\mathbb{F}_q$-arithmetic circuit satisfiability.
> **(aside) If our curve $E_p$ is over $\mathbb{F}_p,$ why is the arithmetic circuit instead in $\mathbb{F}_q$?**
> The proof system is basically working on encodings of the scalars in the circuit (or
> more precisely, commitments to polynomials whose coefficients are scalars). The scalars
> are in $\mathbb{F}_q$ when their encodings/commitments are elliptic curve points in
> $E_p/\mathbb{F}_p$.
However, most of the verifier's arithmetic computations are over the base field
$\mathbb{F}_p,$ and are thus efficiently expressed as an $\mathbb{F}_p$-arithmetic
circuit.
> **(aside) Why are the verifier's computations (mainly) over $\mathbb{F}_p$?**
> The Halo 2 verifier actually has to perform group operations using information output by
> the circuit. Group operations like point doubling and addition use arithmetic in
> $\mathbb{F}_p$, because the coordinates of points are in $\mathbb{F}_p.$
This motivates us to construct another curve with scalar field $\mathbb{F}_p$, which has
an $\mathbb{F}_p$-arithmetic circuit that can efficiently verify proofs from the first
curve. As a bonus, if this second curve had base field $\mathbb{F}_q,$ i.e. if it were a
curve $E_q/\mathbb{F}_q,$ it would generate proofs that could be efficiently verified in
the first curve's $\mathbb{F}_q$-arithmetic circuit. In other words, we instantiate a
second proof system over $E_q/\mathbb{F}_q,$ forming a 2-cycle with the first:
![](https://i.imgur.com/bNMyMRu.png)
### TODO: Pallas-Vesta curves
Reference: https://github.com/zcash/pasta
## Hashing to curves
Sometimes it is useful to be able to produce a random point on an elliptic curve
$E_p/\mathbb{F}_p$ corresponding to some input, in such a way that no-one will know its
discrete logarithm (to any other base).
This is described in detail in the [Internet draft on Hashing to Elliptic Curves][cfrg-hash-to-curve].
Several algorithms can be used depending on efficiency and security requirements. The
framework used in the Internet Draft makes use of several functions:
* ``hash_to_field``: takes a byte sequence input and maps it to an element in the base
field $\mathbb{F}_p$.
* ``map_to_curve``: takes an $\mathbb{F}_p$ element and maps it to $E_p$.
[cfrg-hash-to-curve]: https://datatracker.ietf.org/doc/draft-irtf-cfrg-hash-to-curve/?include_text=1
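As a very rough illustration of the ``hash_to_field`` idea only (NOT the Internet Draft's construction, which expands the hash output to avoid modulo bias; the modulus below is just a stand-in 255-bit prime):

```python
# Naive hash-to-field sketch: hash a byte string and reduce mod p.
# The real draft uses expand_message and wide reduction; this skips both.
import hashlib

p = 2**255 - 19   # a convenient 255-bit prime, used here only as an example

def naive_hash_to_field(msg: bytes) -> int:
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest, "big") % p

e = naive_hash_to_field(b"hello")
assert 0 <= e < p
```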
### TODO: Simplified SWU
Reference: https://eprint.iacr.org/2019/403.pdf
## References
[^complete-formulae]: [Renes, J., Costello, C., & Batina, L. (2016, May). "Complete addition formulas for prime order elliptic curves." In Annual International Conference on the Theory and Applications of Cryptographic Techniques (pp. 403-428). Springer, Berlin, Heidelberg.](https://eprint.iacr.org/2015/1060)

View File

@ -1,294 +0,0 @@
# Fields
A fundamental component of many cryptographic protocols is the algebraic structure known
as a [field]. Fields are sets of objects (usually numbers) with two associated binary
operators $+$ and $\times$ such that various [field axioms][field-axioms] hold. The real
numbers $\mathbb{R}$ are an example of a field with an uncountably infinite number of
elements.
[field]: https://en.wikipedia.org/wiki/Field_(mathematics)
[field-axioms]: https://en.wikipedia.org/wiki/Field_(mathematics)#Classic_definition
Halo makes use of _finite fields_ which have a finite number of elements. Finite fields
are fully classified as follows:
- if $\mathbb{F}$ is a finite field, it contains $|\mathbb{F}| = p^k$ elements for some
integer $k \geq 1$ and some prime $p$;
- any two finite fields with the same number of elements are isomorphic. In particular,
all of the arithmetic in a prime field $\mathbb{F}_p$ is isomorphic to addition and
multiplication of integers modulo $p$, i.e. in $\mathbb{Z}_p$. This is why we often
refer to $p$ as the _modulus_.
We'll write a field as $\mathbb{F}_q$ where $q = p^k$. The prime $p$ is called its
_characteristic_. In the cases where $k \gt 1$ the field $\mathbb{F}_q$ is a $k$-degree
extension of the field $\mathbb{F}_p$. (By analogy, the complex numbers
$\mathbb{C} = \mathbb{R}(i)$ are an extension of the real numbers.) However, in Halo we do
not use extension fields. Whenever we write $\mathbb{F}_p$ we are referring to what
we call a _prime field_ which has a prime $p$ number of elements, i.e. $k = 1$.
Important notes:
* There are two special elements in any field: $0$, the additive identity, and
$1$, the multiplicative identity.
* The least significant bit of a field element, when represented as an integer in binary
format, can be interpreted as its "sign" to help distinguish it from its additive
inverse (negation). This is because for some nonzero element $a$ which has a least
significant bit $0$ we have that $-a = p - a$ has a least significant bit $1$, and vice
versa. We could also use whether or not an element is larger than $(p - 1) / 2$ to give
it a "sign."
Finite fields will be useful later for constructing [polynomials](polynomials.md) and
[elliptic curves](curves.md). Elliptic curves are examples of groups, which we discuss
next.
## Groups
Groups are simpler and more limited than fields; they have only one binary operator $\cdot$
and fewer axioms. They also have an identity, which we'll denote as $1$.
[group]: https://en.wikipedia.org/wiki/Group_(mathematics)
[group-axioms]: https://en.wikipedia.org/wiki/Group_(mathematics)#Definition
Any element $a$ in a group has an _inverse_ $b = a^{-1}$,
which is the _unique_ element $b$ such that $a \cdot b = 1$.
For example, the set of nonzero elements of $\mathbb{F}_p$ forms a group, where the
group operation is given by multiplication on the field.
> #### (aside) Additive vs multiplicative notation
> If $\cdot$ is written as $\times$ or omitted (i.e. $a \cdot b$ written as $ab$), the
> identity as $1$, and inversion as $a^{-1}$, as we did above, then we say that the group
> is "written multiplicatively". If $\cdot$ is written as $+$, the identity as $0$ or
> $\mathcal{O}$, and inversion as $-a$, then we say it is "written additively".
>
> It's conventional to use additive notation for elliptic curve groups, and multiplicative
> notation when the elements come from a finite field.
>
> When additive notation is used, we also write
>
> $$[k] A = \underbrace{A + A + \cdots + A}_{k \text{ times}}$$
>
> for nonnegative $k$ and call this "scalar multiplication"; we also often use uppercase
> letters for variables denoting group elements. When multiplicative notation is used, we
> also write
>
> $$a^k = \underbrace{a \times a \times \cdots \times a}_{k \text{ times}}$$
>
> and call this "exponentiation". In either case we call the scalar $k$ such that
> $[k] g = a$ or $g^k = a$ the "discrete logarithm" of $a$ to base $g$. We can extend
> scalars to negative integers by inversion, i.e. $[-k] A + [k] A = \mathcal{O}$ or
> $a^{-k} \times a^k = 1$.
The _order_ of an element $a$ of a finite group is defined as the smallest positive integer
$k$ such that $a^k = 1$ (in multiplicative notation) or $[k] a = \mathcal{O}$ (in additive
notation). The order _of the group_ is the number of elements.
Groups always have a [generating set], which is a set of elements such that we can produce
any element of the group as (in multiplicative terminology) a product of powers of those
elements. So if the generating set is $g_{1..k}$, we can produce any element of the group
as $\prod\limits_{i=1}^{k} g_i^{a_i}$. There can be many different generating sets for a
given group.
[generating set]: https://en.wikipedia.org/wiki/Generating_set_of_a_group
A group is called [cyclic] if it has a (not necessarily unique) generating set with only
a single element — call it $g$. In that case we can say that $g$ generates the group, and
that the order of $g$ is the order of the group.
Any finite cyclic group $\mathbb{G}$ of order $n$ is [isomorphic] to the integers
modulo $n$ (denoted $\mathbb{Z}/n\mathbb{Z}$), such that:
- the operation $\cdot$ in $\mathbb{G}$ corresponds to addition modulo $n$;
- the identity in $\mathbb{G}$ corresponds to $0$;
- some generator $g \in \mathbb{G}$ corresponds to $1$.
Given a generator $g$, the isomorphism is always easy to compute in the
$\mathbb{Z}/n\mathbb{Z} \rightarrow \mathbb{G}$ direction; it is just $a \mapsto g^a$
(or in additive notation, $a \mapsto [a] g$).
It may be difficult in general to compute in the $\mathbb{G} \rightarrow \mathbb{Z}/n\mathbb{Z}$
direction; we'll discuss this further when we come to [elliptic curves](curves.md).
If the order $n$ of a finite group is prime, then the group is cyclic, and every
non-identity element is a generator.
[isomorphic]: https://en.wikipedia.org/wiki/Isomorphism
[cyclic]: https://en.wikipedia.org/wiki/Cyclic_group
### The multiplicative group of a finite field
We use the notation $\mathbb{F}_p^\times$ for the multiplicative group (i.e. the group
operation is multiplication in $\mathbb{F}_p$) over the set $\mathbb{F}_p - \{0\}$.
A quick way of obtaining the inverse in $\mathbb{F}_p^\times$ is $a^{-1} = a^{p - 2}$.
The reason for this stems from [Fermat's little theorem][fermat-little], which states
that $a^p = a \pmod p$ for any integer $a$. If $a$ is nonzero, we can divide by $a$ twice
to get $a^{p-2} = a^{-1}.$
[fermat-little]: https://en.wikipedia.org/wiki/Fermat%27s_little_theorem
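This inverse formula is easy to check exhaustively for a small prime (a toy sketch):

```python
# Check a^(p-2) = a^(-1) in F_p^x for p = 11 (Fermat's little theorem).
p = 11
for a in range(1, p):
    inv = pow(a, p - 2, p)      # fast modular exponentiation
    assert a * inv % p == 1     # inv really is the multiplicative inverse
```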
Let's assume that $\alpha$ is a generator of $\mathbb{F}_p^\times$, so it has order $p-1$
(equal to the number of elements in $\mathbb{F}_p^\times$). Therefore, for any element
$a \in \mathbb{F}_p^\times$ there is a unique integer $i \in \{0..p-2\}$ such that $a = \alpha^i$.
Notice that $a \times b$ where $a, b \in \mathbb{F}_p^\times$ can really be interpreted as
$\alpha^i \times \alpha^j$ where $a = \alpha^i$ and $b = \alpha^j$. Indeed, it holds that
$\alpha^i \times \alpha^j = \alpha^{i + j}$ for all $0 \leq i, j \lt p - 1$. As a result
the multiplication of nonzero field elements can be interpreted as addition modulo $p - 1$
with respect to some fixed generator $\alpha$. The addition just happens "in the exponent."
This is another way to look at where $a^{p - 2}$ comes from for computing inverses in the
field:
$$p - 2 \equiv -1 \pmod{p - 1},$$
so $a^{p - 2} = a^{-1}$.
### Montgomery's Trick
Montgomery's trick, named after Peter Montgomery (RIP), is a way to compute many group
inversions at the same time. It is commonly used to compute inversions in
$\mathbb{F}_p^\times$, which are quite computationally expensive compared to multiplication.
Imagine we need to compute the inverses of three nonzero elements $a, b, c \in \mathbb{F}_p^\times$.
Instead, we'll compute the products $x = ab$ and $y = xc = abc$, and compute the inversion
$$z = y^{p - 2} = \frac{1}{abc}.$$
We can now multiply $z$ by $x$ to obtain $\frac{1}{c}$ and multiply $z$ by $c$ to obtain
$\frac{1}{ab}$, which we can then multiply by $a, b$ to obtain their respective inverses.
This technique generalizes to arbitrary numbers of group elements with just a single
inversion necessary.
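The generalization described above can be sketched as follows (toy modulus $p = 11$; it costs one inversion plus roughly $3n$ multiplications for $n$ elements):

```python
# Montgomery's trick: invert n field elements with a single inversion.
p = 11

def batch_invert(elems):
    # prefix[i] = elems[0] * ... * elems[i-1]  (running products)
    prefix = [1]
    for e in elems:
        prefix.append(prefix[-1] * e % p)
    acc = pow(prefix[-1], p - 2, p)       # the single inversion: 1/(product)
    invs = [0] * len(elems)
    for i in reversed(range(len(elems))):
        invs[i] = acc * prefix[i] % p     # peel off everything before elems[i]
        acc = acc * elems[i] % p          # drop elems[i] from the accumulator
    return invs

elems = [2, 3, 7]
for e, inv in zip(elems, batch_invert(elems)):
    assert e * inv % p == 1
```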
## Multiplicative subgroups
A _subgroup_ of a group $G$ with operation $\cdot$, is a subset of elements of $G$ that
also form a group under $\cdot$.
In the previous section we said that $\alpha$ is a generator of the $(p - 1)$-order
multiplicative group $\mathbb{F}_p^\times$. This group has _composite_ order, and so by
the Chinese remainder theorem[^chinese-remainder] it has strict subgroups. As an example
let's imagine that $p = 11$, and so $p - 1$ factors into $5 \cdot 2$. Thus, there is a
generator $\beta$ of the $5$-order subgroup and a generator $\gamma$ of the $2$-order
subgroup. All elements in $\mathbb{F}_p^\times$, therefore, can be written uniquely as
$\beta^i \cdot \gamma^j$ for some $i$ (modulo $5$) and some $j$ (modulo $2$).
If we have $a = \beta^i \cdot \gamma^j$ notice what happens when we compute
$$
a^5 = (\beta^i \cdot \gamma^j)^5
= \beta^{i \cdot 5} \cdot \gamma^{j \cdot 5}
= \beta^0 \cdot \gamma^{j \cdot 5}
= \gamma^{j \cdot 5};
$$
we have effectively "killed" the $5$-order subgroup component, producing a value in the
$2$-order subgroup.
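This "killing" can be checked exhaustively for $p = 11$: raising any element of $\mathbb{F}_{11}^\times$ to the 5th power lands in the 2-element subgroup $\{1, -1\}$.

```python
# Every a in F_11^x satisfies (a^5)^2 = a^10 = 1, so a^5 is 1 or -1.
p = 11
for a in range(1, p):
    assert pow(a, 5, p) in (1, p - 1)
```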
[Lagrange's theorem (group theory)][lagrange-group] states that the order of any subgroup
$H$ of a finite group $G$ divides the order of $G$. Therefore, the order of any subgroup
of $\mathbb{F}_p^\times$ must divide $p-1.$
[lagrange-group]: https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)
## Square roots
In a field $\mathbb{F}_p$ exactly half of all nonzero elements are squares; the remainder
are non-squares or "quadratic non-residues". In order to see why, consider an $\alpha$
that generates the $2$-order multiplicative subgroup of $\mathbb{F}_p^\times$ (this exists
because $p - 1$ is divisible by $2$ since $p$ is a prime greater than $2$) and $\beta$ that
generates the $t$-order multiplicative subgroup of $\mathbb{F}_p^\times$ where $p - 1 = 2t$.
Then every element $a \in \mathbb{F}_p^\times$ can be written uniquely as
$\alpha^i \cdot \beta^j$ with $i \in \mathbb{Z}_2$ and $j \in \mathbb{Z}_t$. Half of all
elements will have $i = 0$ and the other half will have $i = 1$.
Let's consider the simple case where $p \equiv 3 \pmod{4}$ and so $t$ is odd (if $t$ is
even, then $p - 1$ would be divisible by $4$, which contradicts $p$ being $3 \pmod{4}$).
If $a \in \mathbb{F}_p^\times$ is a square, then there must exist
$b = \alpha^i \cdot \beta^j$ such that $b^2 = a$. But this means that
$$a = (\alpha^i \cdot \beta^j)^2 = \alpha^{2i} \cdot \beta^{2j} = \beta^{2j}.$$
In other words, all squares in this particular field do not generate the $2$-order
multiplicative subgroup, and so since half of the elements generate the $2$-order subgroup
then at most half of the elements are square. In fact exactly half of the elements are
square (since squaring each nonsquare element gives a unique square). This means we can
assume all squares can be written as $\beta^m$ for some $m$, and therefore finding the
square root is a matter of exponentiating by $2^{-1} \pmod{t}$.
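For $p \equiv 3 \pmod{4}$ this whole recipe collapses to a single exponentiation by $(p+1)/4$, since $(p+1)/4 = (t+1)/2 \equiv 2^{-1} \pmod{t}$ acts as the halving "in the exponent". A toy check:

```python
# Square root for p = 3 (mod 4): sqrt(a) = a^((p+1)/4).
p = 11                                   # 11 = 3 (mod 4)
a = 4                                    # a known square mod 11
r = pow(a, (p + 1) // 4, p)
assert r * r % p == a                    # r is indeed a square root of a
```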
In the event that $p \equiv 1 \pmod{4}$ then things get more complicated because
$2^{-1} \pmod{t}$ does not exist. Let's write $p - 1$ as $2^k \cdot t$ with $t$ odd. The
case $k = 0$ is impossible, and the case $k = 1$ is what we already described, so consider
$k \geq 2$. $\alpha$ generates a $2^k$-order multiplicative subgroup and $\beta$ generates
the odd $t$-order multiplicative subgroup. Then every element $a \in \mathbb{F}_p^\times$
can be written as $\alpha^i \cdot \beta^j$ for $i \in \mathbb{Z}_{2^k}$ and
$j \in \mathbb{Z}_t$. If the element is a square, then there exists some $b = \sqrt{a}$
which can be written $b = \alpha^{i'} \cdot \beta^{j'}$ for $i' \in \mathbb{Z}_{2^k}$ and
$j' \in \mathbb{Z}_t$. This means that $a = b^2 = \alpha^{2i'} \cdot \beta^{2j'}$,
therefore we have $i \equiv 2i' \pmod{2^k}$, and $j \equiv 2j' \pmod{t}$. $i$ would have
to be even in this case because otherwise it would be impossible to have
$i \equiv 2i' \pmod{2^k}$ for any $i'$. In the case that $a$ is not a square, then $i$ is
odd, and so half of all elements are squares.
In order to compute the square root, we can first raise the element
$a = \alpha^i \cdot \beta^j$ to the power $t$ to "kill" the $t$-order component, giving
$$a^t = \alpha^{it \bmod 2^k} \cdot \beta^{jt \bmod t} = \alpha^{it \bmod 2^k}$$
and then raise this result to the power $t^{-1} \pmod{2^k}$ to undo the effect of the
original exponentiation on the $2^k$-order component:
$$(\alpha^{it \bmod 2^k})^{t^{-1} \pmod{2^k}} = \alpha^i$$
(since $t$ is relatively prime to $2^k$). This leaves bare the $\alpha^i$ value which we
can trivially handle. We can similarly kill the $2^k$-order component to obtain
$\beta^{j \cdot 2^{-1} \pmod{t}}$, and put the values together to obtain the square root.
It turns out that in the cases $k = 2, 3$ there are simpler algorithms that merge several
of these exponentiations together for efficiency. For other values of $k$, the only known
way is to manually extract $i$ by squaring until you obtain the identity for every single
bit of $i$. This is the essence of the [Tonelli-Shanks square root algorithm][ts-sqrt] and
describes the general strategy. (There is another square root algorithm that uses
quadratic extension fields, but it doesn't pay off in efficiency until the prime becomes
quite large.)
[ts-sqrt]: https://en.wikipedia.org/wiki/Tonelli%E2%80%93Shanks_algorithm
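A compact sketch of Tonelli-Shanks following the strategy above (variable names are ad hoc; the loop extracts the $\alpha^i$ component bit by bit via repeated squaring):

```python
# Tonelli-Shanks square root modulo an odd prime p (sketch).
def sqrt_mod(a, p):
    assert pow(a, (p - 1) // 2, p) == 1           # a must be a square (Euler)
    # Write p - 1 = 2^k * t with t odd.
    t, k = p - 1, 0
    while t % 2 == 0:
        t //= 2
        k += 1
    # Find any non-square z; z^t then generates the 2^k-order subgroup.
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1
    m, c, u, r = k, pow(z, t, p), pow(a, t, p), pow(a, (t + 1) // 2, p)
    while u != 1:
        # Find the least i with u^(2^i) = 1 by repeated squaring.
        i, s = 0, u
        while s != 1:
            s = s * s % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        m, c, u, r = i, b * b % p, u * b * b % p, r * b % p
    return r

r = sqrt_mod(2, 17)                               # 17 - 1 = 2^4, so k = 4
assert r * r % 17 == 2
```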
## Roots of unity
In the previous sections we wrote $p - 1 = 2^k \cdot t$ with $t$ odd, and stated that an
element $\alpha \in \mathbb{F}_p^\times$ generated the $2^k$-order subgroup. For
convenience, let's denote $n := 2^k.$ The elements $\{1, \alpha, \alpha^2, \cdots, \alpha^{n-1}\}$
are known as the $n$th [roots of unity](https://en.wikipedia.org/wiki/Root_of_unity).
A **primitive root of unity**, $\omega,$ is an $n$th root of unity such that
$\omega^i \neq 1$ except when $i \equiv 0 \pmod{n}$.
Important notes:
- If $\alpha$ is an $n$th root of unity, $\alpha$ satisfies $\alpha^n - 1 = 0.$ If
$\alpha \neq 1,$ then
$$1 + \alpha + \alpha^2 + \cdots + \alpha^{n-1} = 0.$$
- Equivalently, the roots of unity are solutions to the equation
$$X^n - 1 = (X - 1)(X - \alpha)(X - \alpha^2) \cdots (X - \alpha^{n-1}).$$
- **$\boxed{\omega^{\frac{n}{2}+i} = -\omega^i}$ ("Negation lemma")**. Proof:
$$
\begin{aligned}
\omega^n = 1 &\implies \omega^n - 1 = 0 \\
&\implies (\omega^{n/2} + 1)(\omega^{n/2} - 1) = 0.
\end{aligned}
$$
Since the order of $\omega$ is $n$, $\omega^{n/2} \neq 1.$ Therefore, $\omega^{n/2} = -1,$ and multiplying both sides by $\omega^i$ gives the lemma.
- **$\boxed{(\omega^{\frac{n}{2}+i})^2 = (\omega^i)^2}$ ("Halving lemma")**. Proof:
$$
(\omega^{\frac{n}{2}+i})^2 = \omega^{n + 2i} = \omega^{n} \cdot \omega^{2i} = \omega^{2i} = (\omega^i)^2.
$$
In other words, if we square each element in the $n$th roots of unity, we would get back
only half the elements, $\{(\omega_n^i)^2\} = \{\omega_{n/2}^i\}$ (i.e. the $\frac{n}{2}$th roots
of unity). There is a two-to-one mapping between the elements and their squares.
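Both lemmas can be verified numerically in a toy field (a sketch in $\mathbb{F}_{17}$, where $p - 1 = 16$ and $\omega = 9$ is a primitive 8th root of unity):

```python
# Check the negation and halving lemmas in F_17 with omega = 9, n = 8.
p, n = 17, 8
omega = 9
assert pow(omega, n, p) == 1 and pow(omega, n // 2, p) == p - 1  # omega^(n/2) = -1
for i in range(n):
    # Negation lemma: omega^(n/2 + i) = -omega^i
    assert pow(omega, n // 2 + i, p) == (p - pow(omega, i, p)) % p
    # Halving lemma: (omega^(n/2 + i))^2 = (omega^i)^2
    assert pow(omega, n + 2 * i, p) == pow(omega, 2 * i, p)
```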
## References
[^chinese-remainder]: [Friedman, R. (n.d.) "Cyclic Groups and Elementary Number Theory II" (p. 5).](http://www.math.columbia.edu/~rf/numbertheory2.pdf)

View File

@ -1,94 +0,0 @@
# Cryptographic groups
In the section [Inverses and groups](fields.md#inverses-and-groups) we introduced the
concept of *groups*. A group has an identity and a group operation. In this section we
will write groups additively, i.e. the identity is $\mathcal{O}$ and the group operation
is $+$.
Some groups can be used as *cryptographic groups*. At the risk of oversimplifying, this
means that the problem of finding a discrete logarithm of a group element $P$ to a given
base $G$, i.e. finding $x$ such that $P = [x] G$, is hard in general.
## Pedersen commitment
The Pedersen commitment [[P99]] is a way to commit to a secret message in a verifiable
way. It uses two random public generators $G, H \in \mathbb{G},$ where $\mathbb{G}$ is a
cryptographic group of prime order $q$. A random secret $r$ is chosen in $\mathbb{Z}_q$, and
the message to commit to, $m,$ is taken from some subset of $\mathbb{Z}_q$. The commitment is
$$c = \text{Commit}(m,r)=[m]G + [r]H.$$
To open the commitment, the committer reveals $m$ and $r,$ thus allowing anyone to verify
that $c$ is indeed a commitment to $m.$
[P99]: https://link.springer.com/content/pdf/10.1007%2F3-540-46766-1_9.pdf#page=3
Notice that the Pedersen commitment scheme is homomorphic:
$$
\begin{aligned}
\text{Commit}(m,r) + \text{Commit}(m',r') &= [m]G + [r]H + [m']G + [r']H \\
&= [m + m']G + [r + r']H \\
&= \text{Commit}(m + m',r + r').
\end{aligned}
$$
Assuming the discrete log assumption holds, Pedersen commitments are also perfectly hiding
and computationally binding:
* **hiding**: the adversary chooses messages $m_0, m_1.$ The committer commits to one of
these messages $c = \text{Commit}(m_b;r), b \in \{0,1\}.$ Given $c,$ the probability of
the adversary guessing the correct $b$ is no more than $\frac{1}{2}$.
* **binding**: the adversary cannot pick two different messages $m_0 \neq m_1,$ and
randomness $r_0, r_1,$ such that $\text{Commit}(m_0,r_0) = \text{Commit}(m_1,r_1).$
### Vector Pedersen commitment
We can use a variant of the Pedersen commitment scheme to commit to multiple messages at
once, $\mathbf{m} = (m_0, \cdots, m_{n-1})$. This time, we'll have to sample a corresponding
number of random public generators $\mathbf{G} = (G_0, \cdots, G_{n-1}),$ along with a
single random generator $H$ as before (for use in hiding). Then, our commitment scheme is:
$$
\begin{aligned}
\text{Commit}(\mathbf{m}; r) &= \text{Commit}((m_0, \cdots, m_{n-1}); r) \\
&= [r]H + [m_0]G_0 + \cdots + [m_{n-1}]G_{n-1} \\
&= [r]H + \sum_{i= 0}^{n-1} [m_i]G_i.
\end{aligned}
$$
> TODO: is this positionally binding?
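As a sanity check of the homomorphic property, here is a toy sketch in the order-$11$ subgroup of $\mathbb{Z}_{23}^\times$, written multiplicatively (so $[m]G$ becomes $G^m$). The fixed generators are for illustration only; in practice $G$ and $H$ must be sampled so that no discrete-log relation between them is known.

```python
# Toy Pedersen commitment in the order-11 subgroup of Z_23^* (the squares mod 23).
P, Q = 23, 11          # group modulus and (prime) group order
G, H = 4, 9            # two generators of the order-11 subgroup (illustrative only)

def commit(m, r):
    return pow(G, m, P) * pow(H, r, P) % P

m0, r0 = 5, 7
m1, r1 = 3, 2
# Homomorphic property: Commit(m,r) * Commit(m',r') = Commit(m+m', r+r')
lhs = commit(m0, r0) * commit(m1, r1) % P
rhs = commit((m0 + m1) % Q, (r0 + r1) % Q)
assert lhs == rhs
```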
## Diffie--Hellman
An example of a protocol that uses cryptographic groups is Diffie--Hellman key agreement
[[DH1976]]. The Diffie--Hellman protocol is a method for two users, Alice and Bob, to
generate a shared private key. It proceeds as follows:
1. Alice and Bob publicly agree on a large prime $p$ and a generator $G,$ where $G$ is a
primitive root $\pmod p.$ (Note that $G$ is a generator of the group
$\mathbb{F}_p^\times.$)
2. Alice chooses a large random number $a$ as her private key. She computes her public key
$A = [a]G \pmod p,$ and sends $A$ to Bob.
3. Similarly, Bob chooses a large random number $b$ as his private key. He computes his
public key $B = [b]G \pmod p,$ and sends $B$ to Alice.
4. Now both Alice and Bob compute their shared key $K = [ab]G \pmod p,$ which Alice
computes as
$$K = [a]B \pmod p = [a]([b]G) \pmod p,$$
and Bob computes as
$$K = [b]A \pmod p = [b]([a]G) \pmod p.$$
[DH1976]: https://ee.stanford.edu/~hellman/publications/24.pdf
A potential eavesdropper would need to derive $K = [ab]G \pmod p$ knowing only
$G, p, A = [a]G,$ and $B = [b]G$: in other words, they would need to either get the
discrete logarithm $a$ from $A = [a]G$ or $b$ from $B = [b]G,$ which we assume to be
computationally infeasible in $\mathbb{F}_p^\times.$
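The four steps above can be sketched as follows, using toy parameters ($p = 467$ with primitive root $G = 2$, and arbitrary small private keys) that are of course far too small for real use.

```python
# Toy Diffie–Hellman over F_p^×, with [a]G written multiplicatively as G^a mod p.
p, G = 467, 2              # illustrative parameters only

a, b = 153, 291            # Alice's and Bob's private keys
A = pow(G, a, p)           # Alice's public key [a]G
B = pow(G, b, p)           # Bob's public key [b]G

K_alice = pow(B, a, p)     # Alice computes [a]([b]G)
K_bob   = pow(A, b, p)     # Bob computes [b]([a]G)
assert K_alice == K_bob == pow(G, a * b, p)
```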
More generally, protocols that use similar ideas to Diffie--Hellman are used throughout
cryptography. One way of instantiating a cryptographic group is as an
[elliptic curve](curves.md). Before we go into detail on elliptic curves, we'll describe
some algorithms that can be used for any group.
## Multiscalar multiplication
### TODO: Pippenger's algorithm
Reference: https://jbootle.github.io/Misc/pippenger.pdf
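Pending the section above, here is a sketch of the bucket method at the heart of Pippenger's algorithm. For simplicity the "group" is modeled as the integers under addition (so $[k]P$ is just $k \cdot P$); a real implementation would substitute curve arithmetic for `+`.

```python
def pippenger(scalars, points, c=4):
    """Multiscalar multiplication sum_i [s_i]P_i via the bucket method.

    The group is modeled as the integers under addition, so "doubling"
    is x + x; swap in elliptic curve arithmetic for real use.
    """
    nbits = max(s.bit_length() for s in scalars)
    windows = (nbits + c - 1) // c
    acc = 0
    for w in reversed(range(windows)):
        for _ in range(c):
            acc = acc + acc                     # shift accumulator by one c-bit window
        buckets = [0] * (1 << c)                # bucket[d] collects points with digit d
        for s, p in zip(scalars, points):
            d = (s >> (w * c)) & ((1 << c) - 1)
            if d:
                buckets[d] = buckets[d] + p
        # sum_d d * bucket[d] using only additions (running suffix sums)
        running = window_sum = 0
        for b in reversed(buckets[1:]):
            running = running + b
            window_sum = window_sum + running
        acc = acc + window_sum
    return acc

scalars = [23, 5, 200, 77]
points  = [3, 14, 15, 92]
assert pippenger(scalars, points) == sum(s * p for s, p in zip(scalars, points))
```

The bucket trick replaces most scalar multiplications with additions: each window's contribution $\sum_d d \cdot \mathrm{bucket}_d$ is computed with two passes of additions rather than per-digit multiplications.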

# Polynomial commitment using inner product argument
We want to commit to some polynomial $p(X) \in \mathbb{F}_p[X]$, and be able to provably
evaluate the committed polynomial at arbitrary points. The naive solution would be for the
prover to simply send the polynomial's coefficients to the verifier: however, this
requires $O(n)$ communication. Our polynomial commitment scheme gets the job done using
$O(\log n)$ communication.
### `Setup`
Given a parameter $d = 2^k,$ we generate the common reference string
$\sigma = (\mathbb{G}, \mathbf{G}, H, \mathbb{F}_p)$ defining certain constants for this
scheme:
* $\mathbb{G}$ is a group of prime order $p;$
* $\mathbf{G} \in \mathbb{G}^d$ is a vector of $d$ random group elements;
* $H \in \mathbb{G}$ is a random group element; and
* $\mathbb{F}_p$ is the finite field of order $p.$
### `Commit`
The Pedersen vector commitment $\text{Commit}$ is defined as
$$\text{Commit}(\sigma, p(X); r) = \langle\mathbf{a}, \mathbf{G}\rangle + [r]H,$$
for some polynomial $p(X) \in \mathbb{F}_p[X]$ and some blinding factor
$r \in \mathbb{F}_p.$ Here, each element of the vector $\mathbf{a}_i \in \mathbb{F}_p$ is
the coefficient for the $i$th degree term of $p(X),$ and $p(X)$ is of maximal degree
$d - 1.$
### `Open` (prover) and `OpenVerify` (verifier)
The modified inner product argument is an argument of knowledge for the relation
$$\boxed{\{((P, x, v); (\mathbf{a}, r)): P = \langle\mathbf{a}, \mathbf{G}\rangle + [r]H, v = \langle\mathbf{a}, \mathbf{b}\rangle\}},$$
where $\mathbf{b} = (1, x, x^2, \cdots, x^{d-1})$ is composed of increasing powers of the
evaluation point $x.$ This allows a prover to demonstrate to a verifier that the
polynomial contained “inside” the commitment $P$ evaluates to $v$ at $x,$ and moreover,
that the committed polynomial has maximum degree $d - 1.$
The inner product argument proceeds in $k = \log_2 d$ rounds. For our purposes, it is
sufficient to know about its final outputs, while merely providing intuition about the
intermediate rounds. (Refer to Section 3 in the [Halo] paper for a full explanation.)
[Halo]: https://eprint.iacr.org/2019/1021.pdf
Before beginning the argument, the verifier selects a random group element $U$ and sends it
to the prover. We initialise the argument at round $k,$ with the vectors
$\mathbf{a}^{(k)} := \mathbf{a},$ $\mathbf{G}^{(k)} := \mathbf{G}$ and
$\mathbf{b}^{(k)} := \mathbf{b}.$ In each round $j = k, k-1, \cdots, 1$:
* the prover computes two values $L_j$ and $R_j$ by taking some inner product of
$\mathbf{a}^{(j)}$ with $\mathbf{G}^{(j)}$ and $\mathbf{b}^{(j)}$. Note that these are in some
sense "cross-terms": the lower half of $\mathbf{a}$ is used with the higher half of
$\mathbf{G}$ and $\mathbf{b}$, and vice versa:
$$
\begin{aligned}
L_j &= \langle\mathbf{a_{lo}^{(j)}}, \mathbf{G_{hi}^{(j)}}\rangle + [l_j]H + [\langle\mathbf{a_{lo}^{(j)}}, \mathbf{b_{hi}^{(j)}}\rangle] U\\
R_j &= \langle\mathbf{a_{hi}^{(j)}}, \mathbf{G_{lo}^{(j)}}\rangle + [r_j]H + [\langle\mathbf{a_{hi}^{(j)}}, \mathbf{b_{lo}^{(j)}}\rangle] U\\
\end{aligned}
$$
* the verifier issues a random challenge $u_j$;
* the prover uses $u_j$ to compress the lower and higher halves of $\mathbf{a}^{(j)}$,
thus producing a new vector of half the original length
$$\mathbf{a}^{(j-1)} = \mathbf{a_{hi}^{(j)}}\cdot u_j^{-1} + \mathbf{a_{lo}^{(j)}}\cdot u_j.$$
The vectors $\mathbf{G}^{(j)}$ and $\mathbf{b}^{(j)}$ are similarly compressed to give
$\mathbf{G}^{(j-1)}$ and $\mathbf{b}^{(j-1)}$.
* $\mathbf{a}^{(j-1)}$, $\mathbf{G}^{(j-1)}$ and $\mathbf{b}^{(j-1)}$ are input to the
next round $j - 1.$
Note that at the end of the last round $j = 1,$ we are left with $a := \mathbf{a}^{(0)}$,
$G := \mathbf{G}^{(0)}$, $b := \mathbf{b}^{(0)},$ each of length 1. The intuition is that
these final scalars, together with the challenges $\{u_j\}$ and "cross-terms"
$\{L_j, R_j\}$ from each round, encode the compression in each round. Since the prover did
not know the challenges $U, \{u_j\}$ in advance, they would have been unable to manipulate
the round compressions. Thus, checking a constraint on these final terms should enforce
that the compression had been performed correctly, and that the original $\mathbf{a}$
satisfied the relation before undergoing compression.
Note that $G, b$ are simply rearrangements of the publicly known $\mathbf{G}, \mathbf{b},$
with the round challenges $\{u_j\}$ mixed in: this means the verifier can compute $G, b$
independently and verify that the prover had provided those same values.
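The bookkeeping in a single round can be checked numerically. The sketch below tracks only the scalar parts over a toy field (the group elements $\mathbf{G}, H, U$ are omitted), and assumes $\mathbf{b}$ is folded with the inverse challenge so that the inner product telescopes against the round's cross-terms.

```python
# Scalar-only sketch of one folding round of the inner product argument over F_101.
p = 101

def ip(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) % p

a = [3, 1, 4, 1]                     # prover's coefficient vector
b = [1, 5, 25, 24]                   # powers of x = 5 mod 101 (5^3 = 125 ≡ 24)
half = len(a) // 2
a_lo, a_hi = a[:half], a[half:]
b_lo, b_hi = b[:half], b[half:]

u = 7                                # verifier's round challenge
u_inv = pow(u, -1, p)

cL = ip(a_lo, b_hi)                  # inner-product part of L_j
cR = ip(a_hi, b_lo)                  # inner-product part of R_j

a_next = [(hi * u_inv + lo * u) % p for lo, hi in zip(a_lo, a_hi)]
b_next = [(hi * u + lo * u_inv) % p for lo, hi in zip(b_lo, b_hi)]

# The folded inner product differs from the original only by the cross-terms:
lhs = ip(a_next, b_next)
rhs = (ip(a, b) + u * u % p * cL + u_inv * u_inv % p * cR) % p
assert lhs == rhs
```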

# Polynomials
Let $A(X)$ be a polynomial over $\mathbb{F}_p$ with formal indeterminate $X$. As an example,
$$
A(X) = a_0 + a_1 X + a_2 X^2 + a_3 X^3
$$
defines a degree-$3$ polynomial. $a_0$ is referred to as the constant term. Polynomials of
degree $n-1$ have $n$ coefficients. We will often want to compute the result of replacing
the formal indeterminate $X$ with some concrete value $x$, which we denote by $A(x)$.
> In mathematics this is commonly referred to as "evaluating $A(X)$ at a point $x$".
> The word "point" here stems from the geometrical usage of polynomials in the form
> $y = A(x)$, where $(x, y)$ is the coordinate of a point in two-dimensional space.
> However, the polynomials we deal with are almost always constrained to equal zero, and
> $x$ will be an [element of some field](fields.md). This should not be confused
> with points on an [elliptic curve](curves.md), which we also make use of, but never in
> the context of polynomial evaluation.
Important notes:
* The degree of a product of polynomials is the sum of the degrees of its factors;
  polynomial division subtracts the degree of the divisor.
$$\deg(A(X)B(X)) = \deg(A(X)) + \deg(B(X)),$$
$$\deg(A(X)/B(X)) = \deg(A(X)) -\deg(B(X)).$$
* Given a polynomial $A(X)$ of degree $n-1$, if we obtain $n$ evaluations of the
polynomial at distinct points then these evaluations perfectly define the polynomial. In
other words, given these evaluations we can obtain a unique polynomial $A(X)$ of degree
$n-1$ via polynomial interpolation.
* $[a_0, a_1, \cdots, a_{n-1}]$ is the **coefficient representation** of the polynomial
$A(X)$. Equivalently, we could use its **evaluation representation**
$$[(x_0, A(x_0)), (x_1, A(x_1)), \cdots, (x_{n-1}, A(x_{n-1}))]$$
at $n$ distinct points. Either representation uniquely specifies the same polynomial.
> #### (aside) Horner's rule
> Horner's rule allows for efficient evaluation of a polynomial of degree $n-1$, using
> only $n-1$ multiplications and $n-1$ additions. It is the following identity:
> $$\begin{aligned}a_0 &+ a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1} \\ &= a_0 + X\bigg( a_1 + X \Big( a_2 + \cdots + X(a_{n-2} + X a_{n-1}) \Big)\!\bigg),\end{aligned}$$
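A minimal implementation of Horner's rule over a prime field (the field $\mathbb{F}_{97}$ here is an arbitrary toy choice):

```python
# Horner's rule: evaluate a_0 + a_1 x + ... + a_{n-1} x^{n-1}
# using only n-1 multiplications and n-1 additions.
def horner(coeffs, x, p):
    acc = 0
    for a in reversed(coeffs):     # work inward from a_{n-1} down to a_0
        acc = (acc * x + a) % p
    return acc

# A(X) = 2 + 3X + X^3 over F_97, evaluated at x = 5: 2 + 15 + 125 = 142 ≡ 45
assert horner([2, 3, 0, 1], 5, 97) == 45
```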
## Fast Fourier Transform (FFT)
The FFT is an efficient way of converting between the coefficient and evaluation
representations of a polynomial. It evaluates the polynomial at the $n$th roots of unity
$\{\omega^0, \omega^1, \cdots, \omega^{n-1}\},$ where $\omega$ is a primitive $n$th root
of unity. By exploiting symmetries in the roots of unity, each round of the FFT reduces
the evaluation into a problem only half the size. Most commonly we use polynomials of
length some power of two, $n = 2^k$, and apply the halving reduction recursively.
### Motivation: Fast polynomial multiplication
In the coefficient representation, it takes $O(n^2)$ operations to multiply two
polynomials $A(X)\cdot B(X) = C(X)$:
$$
\begin{aligned}
A(X) &= a_0 + a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1}, \\
B(X) &= b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1}, \\
C(X) &= a_0\cdot (b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1}) \\
&+ a_1X\cdot (b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1})\\
&+ \cdots \\
&+ a_{n-1}X^{n-1} \cdot (b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1}),
\end{aligned}
$$
where each of the $n$ terms in the first polynomial has to be multiplied by the $n$ terms
of the second polynomial.
In the evaluation representation, however, polynomial multiplication only requires $O(n)$
operations:
$$
\begin{aligned}
A&: \{(x_0, A(x_0)), (x_1, A(x_1)), \cdots, (x_{n-1}, A(x_{n-1}))\}, \\
B&: \{(x_0, B(x_0)), (x_1, B(x_1)), \cdots, (x_{n-1}, B(x_{n-1}))\}, \\
C&: \{(x_0, A(x_0)B(x_0)), (x_1, A(x_1)B(x_1)), \cdots, (x_{n-1}, A(x_{n-1})B(x_{n-1}))\},
\end{aligned}
$$
where each evaluation is multiplied pointwise.
This suggests the following strategy for fast polynomial multiplication:
1. Evaluate polynomials at all $n$ points;
2. Perform fast pointwise multiplication in the evaluation representation ($O(n)$);
3. Convert back to the coefficient representation.
The challenge now is how to **evaluate** and **interpolate** the polynomials efficiently.
Naively, evaluating a polynomial at $n$ points would require $O(n^2)$ operations (we use
the $O(n)$ Horner's rule at each point):
$$
\begin{bmatrix}
A(1) \\
A(\omega) \\
A(\omega^2) \\
\vdots \\
A(\omega^{n-1})
\end{bmatrix} =
\begin{bmatrix}
1&1&1&\dots&1 \\
1&\omega&\omega^2&\dots&\omega^{n-1} \\
1&\omega^2&\omega^{2\cdot2}&\dots&\omega^{2\cdot(n-1)} \\
\vdots&\vdots&\vdots& &\vdots \\
1&\omega^{n-1}&\omega^{2(n-1)}&\cdots&\omega^{(n-1)^2}\\
\end{bmatrix} \cdot
\begin{bmatrix}
a_0 \\
a_1 \\
a_2 \\
\vdots \\
a_{n-1}
\end{bmatrix}.
$$
For convenience, we will denote the matrices above as:
$$\hat{\mathbf{A}} = \mathbf{V}_\omega \cdot \mathbf{A}. $$
($\hat{\mathbf{A}}$ is known as the *Discrete Fourier Transform* of $\mathbf{A}$;
$\mathbf{V}_\omega$ is also called the *Vandermonde matrix*.)
### The (radix-2) Cooley-Tukey algorithm
Our strategy is to divide a DFT of size $n$ into two interleaved DFTs of size $n/2$. Given
the polynomial $A(X) = a_0 + a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1},$ we split it up into
even and odd terms:
$$
\begin{aligned}
A_{\text{even}} &= a_0 + a_2X + \cdots + a_{n-2}X^{\frac{n}{2} - 1}, \\
A_{\text{odd}} &= a_1 + a_3X + \cdots + a_{n-1}X^{\frac{n}{2} - 1}. \\
\end{aligned}
$$
To recover the original polynomial, we do
$A(X) = A_{\text{even}} (X^2) + X A_{\text{odd}}(X^2).$
Trying this out on points $\omega_n^i$ and $\omega_n^{\frac{n}{2} + i}$,
$i \in [0..\frac{n}{2}-1],$ we start to notice some symmetries:
$$
\begin{aligned}
A(\omega_n^i) &= A_{\text{even}} ((\omega_n^i)^2) + \omega_n^i A_{\text{odd}}((\omega_n^i)^2), \\
A(\omega_n^{\frac{n}{2} + i}) &= A_{\text{even}} ((\omega_n^{\frac{n}{2} + i})^2) + \omega_n^{\frac{n}{2} + i} A_{\text{odd}}((\omega_n^{\frac{n}{2} + i})^2) \\
&= A_{\text{even}} ((-\omega_n^i)^2) - \omega_n^i A_{\text{odd}}((-\omega_n^i)^2) \leftarrow\text{(negation lemma)} \\
&= A_{\text{even}} ((\omega_n^i)^2) - \omega_n^i A_{\text{odd}}((\omega_n^i)^2).
\end{aligned}
$$
Notice that we are only evaluating $A_{\text{even}}(X)$ and $A_{\text{odd}}(X)$ over half
the domain $\{(\omega_n^0)^2, (\omega_n^1)^2, \cdots, (\omega_n^{\frac{n}{2} -1})^2\} = \{\omega_{n/2}^i\}, i \in [0..\frac{n}{2}-1]$ (halving lemma).
This gives us all the terms we need to reconstruct $A(X)$ over the full domain
$\{\omega^0, \omega, \cdots, \omega^{n -1}\}$: we have transformed a
length-$n$ DFT into two length-$\frac{n}{2}$ DFTs.
We choose $n = 2^k$ to be a power of two (by zero-padding if needed), and apply this
divide-and-conquer strategy recursively. By the Master Theorem[^master-thm], this gives us
an evaluation algorithm with $O(n\log_2n)$ operations, also known as the Fast Fourier
Transform (FFT).
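The recursion above can be sketched directly. This toy implementation works over $\mathbb{F}_{97}$ with $n = 8$ (any prime $p$ with $n \mid p - 1$ would do), searches for a suitable root of unity, and checks the result against naive evaluation.

```python
# Recursive radix-2 FFT (NTT) over F_p, following the even/odd split above.
p, n = 97, 8

# find a primitive n-th root of unity in F_97 (97 - 1 = 96 is divisible by 8)
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

def fft(coeffs, w):
    if len(coeffs) == 1:
        return coeffs
    evens = fft(coeffs[0::2], w * w % p)   # A_even, over the squared (half) domain
    odds  = fft(coeffs[1::2], w * w % p)   # A_odd
    half = len(coeffs) // 2
    out = [0] * len(coeffs)
    x = 1
    for i in range(half):
        out[i]        = (evens[i] + x * odds[i]) % p   # A(w^i)
        out[i + half] = (evens[i] - x * odds[i]) % p   # A(w^{n/2+i}) = A(-w^i)
        x = x * w % p
    return out

a = [3, 1, 4, 1, 5, 9, 2, 6]
evals = fft(a, omega)

# agrees with naive O(n^2) evaluation at each root of unity
naive = [sum(c * pow(omega, i * j, p) for j, c in enumerate(a)) % p for i in range(n)]
assert evals == naive
```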
### Inverse FFT
So we've evaluated our polynomials and multiplied them pointwise. What remains is to
convert the product from the evaluation representation back to coefficient representation.
To do this, we simply call the FFT on the evaluation representation. However, this time we
also:
- replace $\omega^i$ by $\omega^{-i}$ in the Vandermonde matrix, and
- multiply our final result by a factor of $1/n$.
In other words:
$$\mathbf{A} = \frac{1}{n} \mathbf{V}_{\omega^{-1}} \cdot \hat{\mathbf{A}}. $$
(To understand why the inverse FFT has a similar form to the FFT, refer to Slide 13-1 of
[^ifft]. The below image was also taken from [^ifft].)
![](https://i.imgur.com/lSw30zo.png)
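Putting the pieces together, here is a sketch of the fast multiplication strategy over a toy field $\mathbb{F}_{97}$: evaluate with the FFT, multiply pointwise, then interpolate back with the inverse FFT (same transform with $\omega^{-1}$, scaled by $1/n$).

```python
# Fast polynomial multiplication: FFT, pointwise multiply, inverse FFT, over F_97.
p, n = 97, 8
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

def fft(coeffs, w):
    if len(coeffs) == 1:
        return coeffs
    evens = fft(coeffs[0::2], w * w % p)
    odds  = fft(coeffs[1::2], w * w % p)
    half = len(coeffs) // 2
    out = [0] * len(coeffs)
    x = 1
    for i in range(half):
        out[i], out[i + half] = (evens[i] + x * odds[i]) % p, (evens[i] - x * odds[i]) % p
        x = x * w % p
    return out

def ifft(evals, w):
    # same FFT with w^{-1}, then scale by 1/n
    n_inv = pow(len(evals), -1, p)
    return [v * n_inv % p for v in fft(evals, pow(w, -1, p))]

# Multiply A(X) = 1 + 2X and B(X) = 3 + X, zero-padded to length n:
A = [1, 2, 0, 0, 0, 0, 0, 0]
B = [3, 1, 0, 0, 0, 0, 0, 0]
C = ifft([x * y % p for x, y in zip(fft(A, omega), fft(B, omega))], omega)
assert C == [3, 7, 2, 0, 0, 0, 0, 0]   # (1 + 2X)(3 + X) = 3 + 7X + 2X^2
```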
## The Schwartz-Zippel lemma
The Schwartz-Zippel lemma informally states that "different polynomials are different at
most points." Formally, it can be written as follows:
> Let $p(x_1, x_2, \cdots, x_n)$ be a nonzero polynomial of $n$ variables with degree $d$.
> Let $S$ be a finite set of numbers with at least $d$ elements in it. If we choose random
> $\alpha_1, \alpha_2, \cdots, \alpha_n$ from $S$,
> $$\text{Pr}[p(\alpha_1, \alpha_2, \cdots, \alpha_n) = 0] \leq \frac{d}{|S|}.$$
In the familiar univariate case $p(X)$, this reduces to saying that a nonzero polynomial
of degree $d$ has at most $d$ roots.
The Schwartz-Zippel lemma is used in polynomial equality testing. Given two multi-variate
polynomials $p_1(x_1,\cdots,x_n)$ and $p_2(x_1,\cdots,x_n)$ of degrees $d_1, d_2$
respectively, we can test if
$p_1(\alpha_1, \cdots, \alpha_n) - p_2(\alpha_1, \cdots, \alpha_n) = 0$ for random
$\alpha_1, \cdots, \alpha_n \leftarrow S,$ where the size of $S$ is at least
$|S| \geq (d_1 + d_2).$ If the two polynomials are identical, this will always be true,
whereas if the two polynomials are different then the equality holds with probability at
most $\frac{\max(d_1,d_2)}{|S|}$.
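The univariate case can be checked exhaustively over a small toy field: two distinct polynomials whose difference has degree $d$ agree on at most $d$ points.

```python
# Univariate Schwartz–Zippel over F_97: distinct polynomials whose difference
# has degree d agree on at most d points, so a random evaluation point
# distinguishes them with probability at least 1 - d/|S|.
p = 97

def ev(coeffs, x):
    acc = 0
    for a in reversed(coeffs):     # Horner evaluation
        acc = (acc * x + a) % p
    return acc

p1 = [1, 2, 0, 5]        # 1 + 2X + 5X^3
p2 = [4, 0, 3, 5]        # 4 + 3X^2 + 5X^3; the difference has degree 2
agree = sum(1 for x in range(p) if ev(p1, x) == ev(p2, x))
assert agree <= 2        # p1 - p2 is nonzero of degree 2, hence has at most 2 roots
```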
## Vanishing polynomial
Consider the order-$n$ multiplicative subgroup $\mathcal{H}$ with primitive root of unity
$\omega$. For all $\omega^i \in \mathcal{H}, i \in [0..n-1],$ we have
$(\omega^i)^n = (\omega^n)^i = (\omega^0)^i = 1.$ In other words, every element of
$\mathcal{H}$ fulfils the equation
$$
\begin{aligned}
Z_H(X) &= X^n - 1 \\
&= (X-\omega^0)(X-\omega^1)(X-\omega^2)\cdots(X-\omega^{n-1}),
\end{aligned}
$$
meaning every element is a root of $Z_H(X).$ We call $Z_H(X)$ the **vanishing polynomial**
over $\mathcal{H}$ because it evaluates to zero on all elements of $\mathcal{H}.$
This comes in particularly handy when checking polynomial constraints. For instance, to
check that $A(X) + B(X) = C(X)$ over $\mathcal{H},$ we simply have to check that
$A(X) + B(X) - C(X)$ is some multiple of $Z_H(X)$. In other words, if dividing our
constraint by the vanishing polynomial still yields some polynomial
$\frac{A(X) + B(X) - C(X)}{Z_H(X)} = H(X),$ we are satisfied that $A(X) + B(X) - C(X) = 0$
over $\mathcal{H}.$
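A quick numerical check of the vanishing polynomial over a toy order-$8$ subgroup of $\mathbb{F}_{97}^\times$: $Z_H(X) = X^8 - 1$ vanishes on every element of $\mathcal{H}$, and nowhere else.

```python
# Z_H(X) = X^n - 1 vanishes exactly on the order-n subgroup H of F_97^*.
p, n = 97, 8
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

H = [pow(omega, i, p) for i in range(n)]
assert all((pow(x, n, p) - 1) % p == 0 for x in H)                      # zero on H
assert all((pow(x, n, p) - 1) % p != 0 for x in range(1, p) if x not in H)
```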
## Lagrange basis functions
> TODO: explain what a basis is in general (briefly).
Polynomials are commonly written in the monomial basis (e.g. $X, X^2, ... X^n$). However,
when working over a multiplicative subgroup of order $n$, we find a more natural expression
in the Lagrange basis.
Consider the order-$n$ multiplicative subgroup $\mathcal{H}$ with primitive root of unity
$\omega$. The Lagrange basis corresponding to this subgroup is a set of functions
$\{\mathcal{L}_i\}_{i = 0}^{n-1}$, where
$$
\mathcal{L_i}(\omega^j) = \begin{cases}
1 & \text{if } i = j, \\
0 & \text{otherwise.}
\end{cases}
$$
We can write this more compactly as $\mathcal{L_i}(\omega^j) = \delta_{ij},$ where
$\delta$ is the Kronecker delta function.
Now, we can write our polynomial as a linear combination of Lagrange basis functions,
$$A(X) = \sum_{i = 0}^{n-1} a_i\mathcal{L_i}(X), X \in \mathcal{H},$$
which is equivalent to saying that $A(X)$ evaluates to $a_0$ at $\omega^0$,
to $a_1$ at $\omega^1$, to $a_2$ at $\omega^2, \cdots,$ and so on.
When working over a multiplicative subgroup, the Lagrange basis function has a convenient
sparse representation of the form
$$
\mathcal{L}_i(X) = \frac{c_i\cdot(X^{n} - 1)}{X - \omega^i},
$$
where $c_i$ is the barycentric weight. (To understand how this form was derived, refer to
[^barycentric].) For $i = 0,$ we have
$c_0 = 1/n \implies \mathcal{L}_0(X) = \frac{1}{n} \frac{(X^{n} - 1)}{X - 1}$.
Suppose we are given a set of evaluation points $\{x_0, x_1, \cdots, x_{n-1}\}$.
Since we cannot assume that the $x_i$'s form a multiplicative subgroup, we consider also
the Lagrange polynomials $\mathcal{L}_i$'s in the general case. Then we can construct:
$$
\mathcal{L}_i(X) = \prod_{j\neq i}\frac{X - x_j}{x_i - x_j}, i \in [0..n-1].
$$
Here, every $X = x_j \neq x_i$ will produce a zero numerator term $(x_j - x_j),$ causing
the whole product to evaluate to zero. On the other hand, $X= x_i$ will evaluate to
$\frac{x_i - x_j}{x_i - x_j}$ at every term, resulting in an overall product of one. This
gives the desired Kronecker delta behaviour $\mathcal{L_i}(x_j) = \delta_{ij}$ on the
set $\{x_0, x_1, \cdots, x_{n-1}\}$.
### Lagrange interpolation
Given a polynomial in its evaluation representation
$$A: \{(x_0, A(x_0)), (x_1, A(x_1)), \cdots, (x_{n-1}, A(x_{n-1}))\},$$
we can reconstruct its coefficient form in the Lagrange basis:
$$A(X) = \sum_{i = 0}^{n-1} A(x_i)\mathcal{L_i}(X), $$
where $X \in \{x_0, x_1,\cdots, x_{n-1}\}.$
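A sketch of Lagrange interpolation in its general form, evaluating the interpolating polynomial at a fresh point over a toy field (the polynomial and nodes below are arbitrary illustrative choices):

```python
# Lagrange interpolation over F_97 from an evaluation representation.
p = 97

def interpolate(points, x):
    # evaluate the unique degree < n polynomial through `points` at x
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

# A(X) = 3 + 2X + X^2 sampled at three points uniquely determines it:
pts = [(1, 6), (2, 11), (5, 38)]
assert interpolate(pts, 7) == (3 + 2 * 7 + 7 * 7) % p   # recovers A(7) = 66
```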
## References
[^master-thm]: [Dasgupta, S., Papadimitriou, C. H., & Vazirani, U. V. (2008). "Algorithms" (ch. 2). New York: McGraw-Hill Higher Education.](https://people.eecs.berkeley.edu/~vazirani/algorithms/chap2.pdf)
[^ifft]: [Golin, M. (2016). "The Fast Fourier Transform and Polynomial Multiplication" [lecture notes], COMP 3711H Design and Analysis of Algorithms, Hong Kong University of Science and Technology.](http://www.cs.ust.hk/mjg_lib/Classes/COMP3711H_Fall16/lectures/FFT_Slides.pdf)
[^barycentric]: [Berrut, J. and Trefethen, L. (2004). "Barycentric Lagrange Interpolation."](https://people.maths.ox.ac.uk/trefethen/barycentric.pdf)

## Recursion
> Alternative terms: Induction; Accumulation scheme; Proof-carrying data
The computation of $G$ requires a length-$2^k$ multiexponentiation
$\langle \mathbf{G}, \mathbf{s}\rangle,$ where $\mathbf{s}$ is composed of the round
challenges $u_1, \cdots, u_k$ arranged in a binary counting structure. This is the
linear-time computation that we want to amortise across a batch of proof instances.
Instead of computing $G,$ notice that we can express $G$ as a commitment to a polynomial
$$G = \text{Commit}(\sigma, g(X, u_1, \cdots, u_k)),$$
where $g(X, u_1, \cdots, u_k) := \prod_{i=1}^k (u_i + u_i^{-1}X^{2^{i-1}})$ is a
polynomial with degree $2^k - 1.$
| | |
| -------- | -------- |
| <img src="https://i.imgur.com/vMXKFDV.png" width=1900> | Since $G$ is a commitment, it can be checked in an inner product argument. The verifier circuit witnesses $G$ and brings $G, u_1, \cdots, u_k$ out as public inputs to the proof $\pi.$ The next verifier instance checks $\pi$ using the inner product argument; this includes checking that $G = \text{Commit}(g(X, u_1, \cdots, u_k))$ evaluates at some random point to the expected value for the given challenges $u_1, \cdots, u_k.$ Recall from the [previous section](#Polynomial-commitment-using-inner-product-argument) that this check only requires $\log d$ work. <br><br> At the end of checking $\pi$ and $G,$ the circuit is left with a new $G',$ along with the $u_1', \cdots, u_k'$ challenges sampled for the check. To fully accept $\pi$ as valid, we should perform a linear-time computation of $G' = \langle\mathbf{G}, \mathbf{s}'\rangle$. Once again, we delay this computation by witnessing $G'$ and bringing $G', u_1', \cdots, u_k'$ out as public inputs to the next proof. <br><br> This goes on from one proof instance to the next, until we are satisfied with the size of our batch of proofs. We finally perform a single linear-time computation, thus deciding the validity of the whole batch. |
We recall from the section [Cycles of curves](curves.md#cycles-of-curves) that we can
instantiate this protocol over a two-cycle, where a proof produced by one curve is
efficiently verified in the circuit of the other curve. However, some of these verifier
checks can actually be efficiently performed in the native circuit; these are "deferred"
to the next native circuit (see diagram below) instead of being immediately passed over to
the other curve.
![](https://i.imgur.com/l4HrYgE.png)

# [WIP] UltraPLONK arithmetisation
We call the field over which the circuit is defined $\mathbb{F} = \mathbb{F}_p$.
Let $n = 2^k$, and assume that $\omega$ is a primitive root of unity of order $n$ in
$\mathbb{F}^\times$, so that $\mathbb{F}^\times$ has a multiplicative subgroup
$\mathcal{H} = \{1, \omega, \omega^2, \cdots, \omega^{n-1}\}$. This subgroup admits a
Lagrange basis, with one basis function corresponding to each of its elements.
## Polynomial rules
A polynomial rule defines a constraint that must hold between its specified columns at
every row (i.e. at every element in the multiplicative subgroup).
e.g.
```text
a * sa + b * sb + a * b * sm + c * sc + PI = 0
```
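For instance, with $\mathbb{F}_{97}$ standing in for the circuit's field, the fixed selector values in this rule can be toggled to turn it into either an addition gate or a multiplication gate:

```python
# The standard rule above, with selectors toggling the gate's behaviour over F_97.
p = 97

def gate(a, b, c, sa, sb, sm, sc, pi=0):
    return (a * sa + b * sb + a * b * sm + c * sc + pi) % p

# addition gate: sa = sb = 1, sm = 0, sc = -1 enforces a + b = c
assert gate(3, 4, 7, sa=1, sb=1, sm=0, sc=-1) == 0
# multiplication gate: sm = 1, sc = -1, sa = sb = 0 enforces a * b = c
assert gate(3, 4, 12, sa=0, sb=0, sm=1, sc=-1) == 0
```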
## Columns
- **fixed (i.e. "selector") columns**: fixed for all instances of a particular circuit.
These columns toggle parts of a polynomial rule "on" or "off" to form a "custom gate".
- **advice columns**: variable values assigned in each instance of the circuit.
Corresponds to the prover's secret witness.
- **public input**: like advice columns, but publicly known values.
Each column is a vector of $n$ values, e.g. $\mathbf{a} = [a_0, a_1, \cdots, a_{n-1}]$. We
can think of the vector as the evaluation form of the column polynomial
$a(X), X \in \mathcal{H}.$ To recover the coefficient form, we can use
[Lagrange interpolation](polynomials.md#lagrange-interpolation), such that
$a(\omega^i) = a_i.$
## Copy constraints
- Define permutation between a set of columns, e.g. $\sigma(a, b, c)$
- Copy specific cells between these columns, e.g. $b_1 = c_0$
- Construct permuted columns which should evaluate to same value as original columns
## Permutation grand product
$$Z(\omega^i) := \prod_{0 \leq j < i} \frac{C_k(\omega^j) + \beta\delta^k \omega^j + \gamma}{C_k(\omega^j) + \beta S_k(\omega^j) + \gamma},$$
where $i = 0, \cdots, n-1$ indexes over the size of the multiplicative subgroup, and
$k = 0, \cdots, m-1$ indexes over the advice columns involved in the permutation. This is
a running product, where each term includes the cumulative product of the terms before it.
> TODO: what is $\delta$? keep columns linearly independent
Check the constraints:
1. First term is equal to one
$$\mathcal{L}_0(X) \cdot (1 - Z(X)) = 0$$
2. Running product is well-constructed. For each row, we check that this holds:
$$Z(\omega^i) \cdot{(C(\omega^i) + \beta S_k(\omega^i) + \gamma)} - Z(\omega^{i-1}) \cdot{(C(\omega^i) + \delta^k \beta \omega^i + \gamma)} = 0$$
Rearranging gives
$$Z(\omega^i) = Z(\omega^{i-1}) \frac{C(\omega^i) + \beta\delta^k \omega^i + \gamma}{C(\omega^i) + \beta S_k(\omega^i) + \gamma},$$
which is how we defined the grand product polynomial in the first place.
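A toy single-column check of the grand product over $\mathbb{F}_{97}$ (so $m = 1$ and $\delta^k = 1$): the column repeats a value across the two rows swapped by the permutation, so the full product telescopes back to one.

```python
# Toy single-column permutation grand product over F_97.
p, n = 97, 8
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

c = [5, 8, 2, 8, 1, 4, 7, 6]          # advice column satisfying c[1] = c[3]
sigma = [0, 3, 2, 1, 4, 5, 6, 7]      # permutation enforcing that copy constraint
beta, gamma = 11, 29                  # verifier challenges (arbitrary toy values)

Z = 1
for j in range(n):
    num = (c[j] + beta * pow(omega, j, p) + gamma) % p
    den = (c[j] + beta * pow(omega, sigma[j], p) + gamma) % p
    Z = Z * num * pow(den, -1, p) % p
assert Z == 1                          # the running product returns to one
```

Because the numerator and denominator range over the same multiset of terms whenever the column respects the permutation, the product over all rows is exactly one.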
### Lookup
Reference: [Generic Lookups with PLONK (DRAFT)](/LTPc5f-3S0qNF6MtwD-Tdg?view)
### Vanishing argument
We want to check that the expressions defined by the gate constraints, permutation
constraints and lookup constraints evaluate to zero at all elements in the multiplicative
subgroup. To do this, the prover collapses all the expressions into one polynomial
$$H(X) = \sum_{i=0}^e y^i E_i(X),$$
where $e$ is the number of expressions and $y$ is a random challenge used to keep the
constraints linearly independent. The prover then divides this by the vanishing polynomial
(see section: [Vanishing polynomial](polynomials.md#vanishing-polynomial)) and commits to
the resulting quotient
$$\text{Commit}(Q(X)), \text{where } Q(X) = \frac{H(X)}{Z_H(X)}.$$
The verifier responds with a random evaluation point $x,$ to which the prover replies with
the claimed evaluations $q = Q(x), \{e_i\}_{i=0}^e = \{E_i(x)\}_{i=0}^e.$ Now, all that
remains for the verifier to check is that the evaluations satisfy
$$q \stackrel{?}{=} \frac{\sum_{i=0}^e y^i e_i}{Z_H(x)}.$$
Notice that we have yet to check that the committed polynomials indeed evaluate to the
claimed values at
$x, q \stackrel{?}{=} Q(x), \{e_i\}_{i=0}^e \stackrel{?}{=} \{E_i(x)\}_{i=0}^e.$
This check is handled by the polynomial commitment scheme (described in the next section).

# Concepts
First we'll describe the concepts behind zero-knowledge proof systems; the
*arithmetization* (kind of circuit description) used by Halo 2; and the
abstractions we use to build circuit implementations.

# UltraPLONK Arithmetization
The arithmetization used by Halo 2 comes from [PLONK](https://eprint.iacr.org/2019/953), or
more precisely its extension UltraPLONK that supports custom gates and lookup arguments. We'll
call it ***UPA*** (***UltraPLONK arithmetization***).
> The term UPA and some of the other terms we use to describe it are not used in the PLONK
> paper.
***UPA circuits*** are defined in terms of a rectangular matrix of values. We refer to
***rows***, ***columns***, and ***cells*** of this matrix with the conventional meanings.
A UPA circuit depends on a ***configuration***:
* A finite field $\mathbb{F}$, where cell values (for a given statement and witness) will be
elements of $\mathbb{F}$.
* The number of columns in the matrix, and a specification of each column as being
***fixed***, ***advice***, or ***instance***. Fixed columns are fixed by the circuit;
advice columns correspond to witness values; and instance columns are normally used for
public inputs (technically, they can be used for any elements shared between the prover
and verifier).
* A subset of the columns that can participate in equality constraints.
* A ***polynomial degree bound***.
* A sequence of ***polynomial constraints***. These are multivariate polynomials over
$\mathbb{F}$ that must evaluate to zero *for each row*. The variables in a polynomial
constraint may refer to a cell in a given column of the current row, or a given column of
another row relative to this one (with wrap-around, i.e. taken modulo $n$). The maximum
degree of each polynomial is given by the polynomial degree bound.
* A sequence of ***lookup arguments*** defined over tuples of ***input expressions***
(which are multivariate polynomials as above) and ***table columns***.
A UPA circuit also defines:
* The number of rows $n$ in the matrix. $n$ must correspond to the size of a multiplicative
subgroup of $\mathbb{F}^\times$; typically a power of two.
* A sequence of ***equality constraints***, which specify that two given cells must have equal
values.
* The values of the fixed columns at each row.
From a circuit description we can generate a ***proving key*** and a ***verification key***,
which are needed for the operations of proving and verification for that circuit.
> Note that we specify the ordering of columns, polynomial constraints, lookup arguments, and
> equality constraints, even though these do not affect the meaning of the circuit. This makes
> it easier to define the generation of proving and verification keys as a deterministic
> process.
Typically, a configuration will define polynomial constraints that are switched off and on by
***selectors*** defined in fixed columns. For example, a constraint $q_i \cdot p(...) = 0$ can
be switched off for a particular row $i$ by setting $q_i = 0$. In this case we sometimes refer
to a set of constraints, controlled by a set of selector columns designed to be used
together, as a ***gate***. Typically there will be a ***standard gate*** that supports generic
operations like field multiplication and division, and possibly also ***custom gates*** that
support more specialized operations.

# Chips
In order to combine functionality from several cores, we use a ***chip***. To implement a
chip, we define a set of fixed, advice, and instance columns, and then specify how they
should be distributed between cores.
In the simplest case, each core will use columns disjoint from the other cores. However, it
is allowed to share a column between cores. It is important to optimize the number of advice
columns in particular, because that affects proof size.
The result (possibly after optimization) is a UPA configuration. Our circuit implementation
will be parameterized on a chip, and can use any features of the supported cores via the chip.
Our hope is that less expert users will normally be able to find an existing chip that
supports the operations they need, or only have to make minor modifications to an existing
chip. Expert users will have full control to do the kind of
[circuit optimizations](https://zips.z.cash/protocol/canopy.pdf#circuitdesign)
[that ECC is famous for](https://electriccoin.co/blog/cultivating-sapling-faster-zksnarks/) 🙂.

# Cores
The previous section gives a fairly low-level description of a circuit. When implementing circuits we will
typically use a higher-level API which aims for the desirable characteristics of auditability,
efficiency, modularity, and expressiveness.
Some of the terminology and concepts used in this API are taken from an analogy with
integrated circuit design and layout. [As for integrated circuits](https://opencores.org/),
the above desirable characteristics are easier to obtain by composing ***cores*** that provide
efficient pre-built implementations of particular functionality.
For example, we might have cores that implement particular cryptographic primitives such as a
hash function or cipher, or algorithms like scalar multiplication or pairings.
In UPA, it is possible to build up arbitrary logic just from standard gates that do field
multiplication and addition. However, very significant efficiency gains can be obtained by
using custom gates.
Using our API, we define cores that "know" how to use particular sets of custom gates. This
creates an abstraction layer that isolates the implementation of a high-level circuit from the
complexity of using custom gates directly.
> Even if we sometimes need to "wear two hats", by implementing both a high-level circuit and
> the cores that it uses, the intention is that this separation will result in code that is
> easier to understand, audit, and maintain/reuse. This is partly because some potential
> implementation errors are ruled out by construction.
Gates in UPA refer to cells by ***relative references***, i.e. to the cell in a given column,
and the row at a given offset relative to the one in which the gate's selector is set. We call
this an ***offset reference*** when the offset is nonzero (i.e. offset references are a subset
of relative references).
Relative references contrast with ***absolute references*** used in equality constraints,
which can point to any cell.
The motivation for offset references is to reduce the number of columns needed in the
configuration, which reduces proof size. If we did not have offset references then we would
need a column to hold each value referred to by a custom gate, and we would need to use
equality constraints to copy values from other cells of the circuit into that column. With
offset references, we not only need fewer columns; we also do not need equality constraints to
be supported for all of those columns, which improves efficiency.
In R1CS (another arithmetization which may be more familiar to some readers, but don't worry
if it isn't), a circuit consists of a "sea of gates" with no semantically significant ordering.
Because of offset references, the order of rows in a UPA circuit, on the other hand, *is*
significant. We're going to make some simplifying assumptions and define some abstractions to
tame the resulting complexity: the aim will be that, [at the gadget level](gadgets.md) where
we do most of our circuit construction, we will not have to deal with relative references or
with gate layout explicitly.
We will partition a circuit into ***regions***, where each region contains a disjoint subset
of cells, and relative references only ever point *within* a region. Part of the responsibility
of a core implementation is to ensure that gates that make offset references are laid out in
the correct positions in a region.
Given the set of regions and their ***shapes***, we will use a separate ***floor planner***
to decide where (i.e. at what starting row) each region is placed. There is a default floor
planner that implements a very general algorithm, but you can write your own floor planner if
you need to.
Floor planning will in general leave gaps in the matrix, because the gates in a given row did
not use all available columns. These are filled in —as far as possible— by gates that do
not require offset references, which allows them to be placed on any row.
Cores can also define lookup tables. If more than one table is defined for the same lookup
argument, we can use a ***tag column*** to specify which table is used on each row. It is also
possible to perform a lookup in the union of several tables (limited by the polynomial degree
bound).

# Gadgets
When implementing a circuit, we could use the features of the cores we've selected directly
via the chip. Typically, though, we will use them via ***gadgets***. This indirection is
useful because, for reasons of efficiency and limitations imposed by UPA, the core interfaces
will often be dependent on low-level implementation details. The gadget interface can provide
a more convenient and stable API that abstracts away from extraneous detail.
For example, consider a hash function such as SHA-256. The interface of a core supporting
SHA-256 might be dependent on internals of the hash function design such as the separation
between message schedule and compression function. The corresponding gadget interface can
provide a more convenient and familiar `update`/`finalize` API, and can also handle parts
of the hash function that do not need core support, such as padding. This is similar to how
[accelerated](https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html)
[instructions](https://developer.arm.com/documentation/ddi0514/g/introduction/about-the-cortex-a57-processor-cryptography-engine)
for cryptographic primitives on CPUs are typically accessed via software libraries, rather
than directly.
Gadgets can also provide modular and reusable abstractions for circuit programming
at a higher level, similar to their use in libraries such as
[libsnark](https://github.com/christianlundkvist/libsnark-tutorial) and
[bellman](https://electriccoin.co/blog/bellman-zksnarks-in-rust/). As well as abstracting
*functions*, they can also abstract *types*, such as elliptic curve points or integers of
specific sizes.

# Proof systems
The aim of any ***proof system*** is to be able to prove interesting mathematical or
cryptographic ***statements***.
Typically, in a given protocol we will want to prove families of statements that differ
in their ***public inputs***. The prover will also need to show that they know some
***private inputs*** that make the statement hold.
To do this we write down a ***relation***, $\mathcal{R}$, that specifies which
combinations of public and private inputs are valid.
> The terminology above is intended to be aligned with the
> [ZKProof Community Reference](https://docs.zkproof.org/reference#latest-version).
To be precise, we should distinguish between the relation $\mathcal{R}$, and its
implementation to be used in a proof system. We call the latter a ***circuit***.
The language that we use to express circuits for a particular proof system is called an
***arithmetization***. Usually, an arithmetization will define circuits in terms of
polynomial constraints on variables over a field.
> The _process_ of expressing a particular relation as a circuit is also sometimes called
> "arithmetization", but we'll avoid that usage.
To create a proof of a statement, the prover will need to know the private inputs,
and also intermediate values, called ***advice*** values, that are used by the circuit.
We assume that we can compute advice values efficiently from the private and public inputs.
The particular advice values will depend on how we write the circuit, not only on the
high-level statement.
The private inputs and advice values are collectively called a ***witness***.
> Some authors use "witness" as just a synonym for private inputs. But in our usage,
> a witness includes advice, i.e. it includes all values that the prover supplies to
> the circuit.
For example, suppose that we want to prove knowledge of a preimage $x$ of a
hash function $H$ for a digest $y$:
* The private input would be the preimage $x$.
* The public input would be the digest $y$.
* The relation would be $\{(x, y) : H(x) = y\}$.
* For a particular public input $Y$, the statement would be: $\{(x) : H(x) = Y\}$.
* The advice would be all of the intermediate values in the circuit implementing the
hash function. The witness would be $x$ and the advice.
A ***Non-interactive Argument*** allows a ***prover*** to create a ***proof*** for a
given statement and witness. The proof is data that can be used to convince a ***verifier***
that _there exists_ a witness for which the statement holds. The security property that
such proofs cannot falsely convince a verifier is called ***soundness***.
A ***Non-interactive Argument of Knowledge*** (***NARK***) further convinces the verifier
that the prover _knew_ a witness for which the statement holds. This security property is
called ***knowledge soundness***, and it implies soundness.
In practice knowledge soundness is more useful for cryptographic protocols than soundness:
if we are interested in whether Alice holds a secret key in some protocol, say, we need
Alice to prove that _she knows_ the key, not just that it exists.
Knowledge soundness is formalized by saying that an ***extractor***, which can observe
precisely how the proof is generated, must be able to compute the witness.
> This property is subtle given that proofs can be ***malleable***. That is, depending on the
> proof system it may be possible to take an existing proof (or set of proofs) and, without
> knowing the witness(es), modify it/them to produce a distinct proof of the same or a related
> statement. Higher-level protocols that use malleable proof systems need to take this into
> account.
>
> Even without malleability, proofs can also potentially be ***replayed***. For instance,
> we would not want Alice in our example to be able to present a proof generated by someone
> else, and have that be taken as a demonstration that she knew the key.
If a proof yields no information about the witness (other than that a witness exists and was
known to the prover), then we say that the proof system is ***zero knowledge***.
If a proof system produces short proofs —i.e. of length polylogarithmic in the circuit
size— then we say that it is ***succinct***. A succinct NARK is called a ***SNARK***
(***Succinct Non-Interactive Argument of Knowledge***).
> By this definition, a SNARK need not have verification time polylogarithmic in the circuit
> size. Some papers use the term ***efficient*** to describe a SNARK with that property, but
> we'll avoid that term since it's ambiguous for SNARKs that support amortized or recursive
> verification, which we'll get to later.
A ***zk-SNARK*** is a zero-knowledge SNARK.

# Gadgets
In this section we document some example gadgets and chip designs that are suitable for
Halo 2.
> Neither these gadgets, nor their implementations, have been reviewed, and they should
> not be used in production.

# SHA-256
## Specification
SHA-256 is specified in [NIST FIPS PUB 180-4](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf).
Unlike the specification, we use $\boxplus$ for addition modulo $2^{32}$, and $+$ for
field addition. $\oplus$ is used for XOR.
## Gadget interface
SHA-256 maintains state in eight 32-bit variables. It processes input as 512-bit blocks,
but internally splits these blocks into 32-bit chunks. We therefore designed the SHA-256
gadget to consume input in 32-bit chunks.
## Chip instructions
The SHA-256 gadget requires a chip with the following instructions:
```rust
# extern crate halo2;
# use halo2::plonk::Error;
# use std::fmt;
#
# trait Chip: Sized {}
# trait Layouter<C: Chip> {}
const BLOCK_SIZE: usize = 16;
const DIGEST_SIZE: usize = 8;
pub trait Sha256Instructions: Chip {
/// Variable representing the SHA-256 internal state.
type State: Clone + fmt::Debug;
/// Variable representing a 32-bit word of the input block to the SHA-256 compression
/// function.
type BlockWord: Copy + fmt::Debug;
/// Places the SHA-256 IV in the circuit, returning the initial state variable.
fn initialization_vector(layouter: &mut impl Layouter<Self>) -> Result<Self::State, Error>;
/// Starting from the given initial state, processes a block of input and returns the
/// final state.
fn compress(
layouter: &mut impl Layouter<Self>,
initial_state: &Self::State,
input: [Self::BlockWord; BLOCK_SIZE],
) -> Result<Self::State, Error>;
/// Converts the given state into a message digest.
fn digest(
layouter: &mut impl Layouter<Self>,
state: &Self::State,
) -> Result<[Self::BlockWord; DIGEST_SIZE], Error>;
}
```
TODO: Add instruction for computing padding.
This set of instructions was chosen to strike a balance between the reusability of the
instructions, and the scope for chips to internally optimise them. In particular, we
considered splitting the compression function into its constituent parts (Ch, Maj etc),
and providing a compression function gadget that implemented the round logic. However,
this would prevent chips from using relative references between the various parts of a
compression round. Having an instruction that implements all compression rounds is also
similar to the Intel SHA extensions, which provide an instruction that performs multiple
compression rounds.

# 16-bit table chip for SHA-256
This chip implementation is based around a single 16-bit lookup table. It requires a
minimum of $2^{16}$ circuit rows, and is therefore suitable for use in larger circuits.
We target a maximum constraint degree of $9$. That will allow us to handle constraining
carries and "small pieces" to a range of up to $\{0..7\}$ in one row.
## Compression round
There are $64$ compression rounds. Each round takes 32-bit values $A, B, C, D, E, F, G, H$
as input, and performs the following operations:
$$
\begin{array}{rcl}
Ch(E, F, G) &=& (E \wedge F) \oplus (¬E \wedge G) \\
Maj(A, B, C) &=& (A \wedge B) \oplus (A \wedge C) \oplus (B \wedge C) \\
&=& count(A, B, C) \geq 2 \\
\Sigma_0(A) &=& (A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋙ 22) \\
\Sigma_1(E) &=& (E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋙ 25) \\
H' &=& H + Ch(E, F, G) + \Sigma_1(E) + K_t + W_t \\
E_{new} &=& reduce_6(H' + D) \\
A_{new} &=& reduce_7(H' + Maj(A, B, C) + \Sigma_0(A))
\end{array}
$$
where $reduce_i$ must handle a carry $0 \leq \mathit{carry} < i$.
![The SHA-256 compression function](./compression.png)
Define $\mathtt{spread}$ as a table mapping a $16$-bit input to an output interleaved with
zero bits. We do not require a separate table for range checks because $\mathtt{spread}$
can be used.
### Modular addition
To implement addition modulo $2^{32}$, we note that this is equivalent to adding the
operands using field addition, and then masking away all but the lowest 32 bits of the
result. For example, if we have two operands $a$ and $b$:
$$a \boxplus b = c,$$
we decompose each operand (along with the result) into 16-bit chunks:
$$(a_L : \mathbb{Z}_{2^{16}}, a_H : \mathbb{Z}_{2^{16}}) \boxplus (b_L : \mathbb{Z}_{2^{16}}, b_H : \mathbb{Z}_{2^{16}}) = (c_L : \mathbb{Z}_{2^{16}}, c_H : \mathbb{Z}_{2^{16}}),$$
and then reformulate the constraint using field addition:
$$\mathsf{carry} \cdot 2^{32} + c_H \cdot 2^{16} + c_L = (a_H + b_H) \cdot 2^{16} + a_L + b_L.$$
More generally, any bit-decomposition of the output can be used, not just a decomposition
into 16-bit chunks. Note that this correctly handles the carry from $a_L + b_L$.
This constraint requires that each chunk is correctly range-checked (or else an assignment
could overflow the field).
- The operand and result chunks can be constrained using $\mathtt{spread}$, by looking up
each chunk in the "dense" column within a subset of the table. This way we additionally
get the "spread" form of the output for free; in particular this is true for the output
of the bottom-right $\boxplus$ which becomes $A_{new}$, and the output of the leftmost
$\boxplus$ which becomes $E_{new}$. We will use this below to optimize $Maj$ and $Ch$.
- $\mathsf{carry}$ must be constrained to the precise range of allowed carry values for
the number of operands. We do this with a
[small range constraint](../../../user/tips-and-tricks.md#small-range-constraints).
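As an out-of-circuit sanity check, the chunked addition constraint can be exercised over plain integers. This is an illustrative sketch, not circuit code; the operand values are arbitrary:

```rust
fn main() {
    let (a, b) = (0xdead_beef_u32, 0xcafe_f00d_u32);
    // Decompose each operand into 16-bit chunks.
    let (a_l, a_h) = ((a & 0xffff) as u64, (a >> 16) as u64);
    let (b_l, b_h) = ((b & 0xffff) as u64, (b >> 16) as u64);
    // c = a ⊞ b, the 32-bit modular sum, also decomposed into chunks.
    let c = a.wrapping_add(b);
    let (c_l, c_h) = ((c & 0xffff) as u64, ((c >> 16) & 0xffff) as u64);
    // The "field addition" side of the constraint, and its overflow bits.
    let full = ((a_h + b_h) << 16) + a_l + b_l;
    let carry = full >> 32;
    // The reformulated constraint from the text, checked over the integers.
    assert_eq!((carry << 32) + (c_h << 16) + c_l, full);
    // For two operands the carry is at most 1, so a small range
    // constraint on it is cheap.
    assert!(carry <= 1);
}
```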
### Maj function
$Maj$ can be done in $4$ lookups: $2$ $\mathtt{spread}$ lookups for each of the $2$ $16$-bit chunks.
- As mentioned above, after the first round we already have $A$ in spread form $A'$.
Similarly, $B$ and $C$ are equal to the $A$ and $B$ respectively of the previous round,
and therefore in the steady state we already have them in spread form $B'$ and $C'$. In
fact we can also assume we have them in spread form in the first round, either from the
fixed IV or from the use of $\mathtt{spread}$ to reduce the output of the feedforward in
the previous block.
- Add the spread forms in the field: $M' = A' + B' + C'$;
- We can add them as $32$-bit words or in pieces; it's equivalent
- Witness the compressed even bits $M^{even}_i$ and the compressed odd bits $M^{odd}_i$ for $i = \{0..1\}$;
- Constrain $M' = \mathtt{spread}(M^{even}_0) + 2 \cdot \mathtt{spread}(M^{odd}_0) + 2^{32} \cdot \mathtt{spread}(M^{even}_1) + 2^{33} \cdot \mathtt{spread}(M^{odd}_1)$, where $M^{odd}_i$ is the $Maj$ function output.
> Note: by "even" bits we mean the bits of weight an even-power of $2$, i.e. of weight
> $2^0, 2^2, \ldots$. Similarly by "odd" bits we mean the bits of weight an odd-power of
> $2$.
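The spread trick behind this can be checked over plain integers: each interleaved bit pair of $A' + B' + C'$ holds a count in $\{0..3\}$, so its odd (carry) bit is exactly the majority bit. A hypothetical sketch (the inputs are arbitrary):

```rust
// Interleave the bits of x with zero bits (an integer model of `spread`).
fn spread(x: u32) -> u64 {
    (0..32).fold(0u64, |r, i| r | ((((x >> i) & 1) as u64) << (2 * i)))
}

// Compress the bits at odd positions of an interleaved value.
fn odd_bits(m: u64) -> u32 {
    (0..32).fold(0u32, |r, i| r | ((((m >> (2 * i + 1)) & 1) as u32) << i))
}

fn main() {
    let (a, b, c) = (0xdead_beef_u32, 0x1234_5678_u32, 0x9abc_def0_u32);
    // Add the spread forms as integers; pairs never carry into each other
    // because each pairwise sum fits in 2 bits.
    let m = spread(a) + spread(b) + spread(c);
    let maj = (a & b) ^ (a & c) ^ (b & c);
    assert_eq!(odd_bits(m), maj);
}
```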
### Ch function
> TODO: can probably be optimised to $4$ or $5$ lookups using an additional table.
>
$Ch$ can be done in $8$ lookups: $4$ $\mathtt{spread}$ lookups for each of the $2$ $16$-bit chunks.
- As mentioned above, after the first round we already have $E$ in spread form $E'$.
Similarly, $F$ and $G$ are equal to the $E$ and $F$ respectively of the previous round,
and therefore in the steady state we already have them in spread form $F'$ and $G'$. In
fact we can also assume we have them in spread form in the first round, either from the
fixed IV or from the use of $\mathtt{spread}$ to reduce the output of the feedforward in
the previous block.
- Calculate $P' = E' + F'$ and $Q' = (evens - E') + G'$, where $evens = \mathtt{spread}(2^{32} - 1)$.
- We can add them as $32$-bit words or in pieces; it's equivalent.
- $evens - E'$ works to compute the spread of $¬E$ even though negation and
$\mathtt{spread}$ do not commute in general. It works because each spread bit in $E'$
is subtracted from $1$, so there are no borrows.
- Witness $P^{even}_i, P^{odd}_i, Q^{even}_i, Q^{odd}_i$ such that
$P' = \mathtt{spread}(P^{even}_0) + 2 \cdot \mathtt{spread}(P^{odd}_0) + 2^{32} \cdot \mathtt{spread}(P^{even}_1) + 2^{33} \cdot \mathtt{spread}(P^{odd}_1)$, and similarly for $Q'$.
- $\{P^{odd}_i + Q^{odd}_i\}_{i=0..1}$ is the $Ch$ function output.
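The same integer model verifies the $Ch$ decomposition, including the no-borrow negation via $evens - E'$. The masks $E \wedge F$ and $¬E \wedge G$ are bitwise disjoint (they split on the bits of $E$), so adding the two compressed odd halves reproduces the XOR. A hypothetical sketch:

```rust
// Interleave the bits of x with zero bits (an integer model of `spread`).
fn spread(x: u32) -> u64 {
    (0..32).fold(0u64, |r, i| r | ((((x >> i) & 1) as u64) << (2 * i)))
}

// Compress the bits at odd positions of an interleaved value.
fn odd_bits(m: u64) -> u32 {
    (0..32).fold(0u32, |r, i| r | ((((m >> (2 * i + 1)) & 1) as u32) << i))
}

fn main() {
    let (e, f, g) = (0xdead_beef_u32, 0x1234_5678_u32, 0x9abc_def0_u32);
    let evens = spread(u32::MAX); // spread(2^32 - 1)
    let p = spread(e) + spread(f); // odd bits: E ∧ F
    // Subtracting per-bit from 1 never borrows, so this is spread(¬E).
    let q = (evens - spread(e)) + spread(g); // odd bits: ¬E ∧ G
    let ch = (e & f) ^ (!e & g);
    assert_eq!(odd_bits(p) + odd_bits(q), ch);
}
```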
### Σ_0 function
$\Sigma_0(A)$ can be done in $6$ lookups.
To achieve this we first split $A$ into pieces $(a, b, c, d)$, of lengths $(2, 11, 9, 10)$
bits respectively counting from the little end. At the same time we obtain the spread
forms of these pieces. This can all be done in two PLONK rows, because the $10$ and
$11$-bit pieces can be handled using $\mathtt{spread}$ lookups, and the $9$-bit piece can
be split into $3 * 3$-bit subpieces. The latter and the remaining $2$-bit piece can be
range-checked by polynomial constraints in parallel with the two lookups, two small pieces
in each row. The spread forms of these small pieces are found by interpolation.
Note that the splitting into pieces can be combined with the reduction of $A_{new}$, i.e.
no extra lookups are needed for the latter. In the last round we reduce $A_{new}$ after
adding the feedforward (requiring a carry of up to $7$ which is fine).
$(A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋙ 22)$ is equivalent to
$(A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋘ 10)$:
![](./upp_sigma_0.png)
Then, using $4$ more $\mathtt{spread}$ lookups we obtain the result as the even bits of a
linear combination of the pieces:
$$
\begin{array}{rcccccccl}
& (a &||& d &||& c &||& b) & \oplus \\
& (b &||& a &||& d &||& c) & \oplus \\
& (c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & 4^{30} a &+& 4^{20} d &+& 4^{11} c &+& b\;&+ \\
& 4^{21} b &+& 4^{19} a &+& 4^{ 9} d &+& c\;&+ \\
& 4^{23} c &+& 4^{12} b &+& 4^{10} a &+& d\;&
\end{array}
$$
That is, we witness the compressed even bits $R^{even}_i$ and the compressed odd bits
$R^{odd}_i$, and constrain
$$R' = \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)$$
where $\{R^{even}_i\}_{i=0..1}$ is the $\Sigma_0$ function output.
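Again this can be sanity-checked outside the circuit: summing the spread forms of the three rotations leaves the XOR in the even bits (each pair holds a count in $\{0..3\}$, whose low bit is the parity). A hypothetical sketch:

```rust
// Interleave the bits of x with zero bits (an integer model of `spread`).
fn spread(x: u32) -> u64 {
    (0..32).fold(0u64, |r, i| r | ((((x >> i) & 1) as u64) << (2 * i)))
}

// Compress the bits at even positions of an interleaved value.
fn even_bits(m: u64) -> u32 {
    (0..32).fold(0u32, |r, i| r | ((((m >> (2 * i)) & 1) as u32) << i))
}

fn main() {
    let a = 0xdead_beef_u32;
    // The rewrite used above: (A ⋙ 22) = (A ⋘ 10) for 32-bit words.
    assert_eq!(a.rotate_right(22), a.rotate_left(10));
    // Sum the spread rotations; the even bits are Σ₀(A).
    let r = spread(a.rotate_right(2))
        + spread(a.rotate_right(13))
        + spread(a.rotate_right(22));
    let sigma0 = a.rotate_right(2) ^ a.rotate_right(13) ^ a.rotate_right(22);
    assert_eq!(even_bits(r), sigma0);
}
```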
### Σ_1 function
$\Sigma_1(E)$ can be done in $6$ lookups.
To achieve this we first split $E$ into pieces $(a, b, c, d)$, of lengths $(6, 5, 14, 7)$
bits respectively counting from the little end. At the same time we obtain the spread
forms of these pieces. This can all be done in two PLONK rows, because the $7$ and
$14$-bit pieces can be handled using $\mathtt{spread}$ lookups, the $5$-bit piece can be
split into $3$ and $2$-bit subpieces, and the $6$-bit piece can be split into $2 * 3$-bit
subpieces. The four small pieces can be range-checked by polynomial constraints in
parallel with the two lookups, two small pieces in each row. The spread forms of these
small pieces are found by interpolation.
Note that the splitting into pieces can be combined with the reduction of $E_{new}$, i.e.
no extra lookups are needed for the latter. In the last round we reduce $E_{new}$ after
adding the feedforward (requiring a carry of up to $6$ which is fine).
$(E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋙ 25)$ is equivalent to
$(E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋘ 7)$.
![](./upp_sigma_1.png)
Then, using $4$ more $\mathtt{spread}$ lookups we obtain the result as the even bits of a
linear combination of the pieces, in the same way we did for $\Sigma_0$:
$$
\begin{array}{rcccccccl}
& (a &||& d &||& c &||& b) & \oplus \\
& (b &||& a &||& d &||& c) & \oplus \\
& (c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & 4^{26} a &+& 4^{19} d &+& 4^{ 5} c &+& b\;&+ \\
& 4^{27} b &+& 4^{21} a &+& 4^{14} d &+& c\;&+ \\
& 4^{18} c &+& 4^{13} b &+& 4^{ 7} a &+& d\;&
\end{array}
$$
That is, we witness the compressed even bits $R^{even}_i$ and the compressed odd bits
$R^{odd}_i$, and constrain
$$R' = \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)$$
where $\{R^{even}_i\}_{i=0..1}$ is the $\Sigma_1$ function output.
## Block decomposition
For each block $M \in \{0,1\}^{512}$ of the padded message, $64$ words of $32$ bits each
are constructed as follows:
- The first $16$ are obtained by splitting $M$ into $32$-bit blocks $$M = W_0 || W_1 || \cdots || W_{14} || W_{15};$$
- The remaining $48$ words are constructed using the formula:
$$W_i = \sigma_1(W_{i-2}) \boxplus W_{i-7} \boxplus \sigma_0(W_{i-15}) \boxplus W_{i-16},$$ for $16 \leq i < 64$.
> Note: $0$-based numbering is used for the $W$ word indices.
$$
\begin{array}{ccc}
\sigma_0(X) &=& (X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3) \\
\sigma_1(X) &=& (X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10) \\
\end{array}
$$
> Note: $≫$ is a right-**shift**, not a rotation.
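Out of circuit, the schedule is straightforward to compute, and the rotation rewrites used in the diagrams below are easy to confirm. A hypothetical sketch (the example block words are arbitrary):

```rust
fn small_sigma_0(x: u32) -> u32 {
    x.rotate_right(7) ^ x.rotate_right(18) ^ (x >> 3)
}

fn small_sigma_1(x: u32) -> u32 {
    x.rotate_right(17) ^ x.rotate_right(19) ^ (x >> 10)
}

fn main() {
    // Expand an arbitrary example block into the 64-word schedule.
    let mut w = [0u32; 64];
    for i in 0..16 {
        w[i] = (i as u32).wrapping_mul(0x0101_0101);
    }
    for i in 16..64 {
        w[i] = small_sigma_1(w[i - 2])
            .wrapping_add(w[i - 7])
            .wrapping_add(small_sigma_0(w[i - 15]))
            .wrapping_add(w[i - 16]);
    }
    assert_eq!(w.len(), 64);
    // Rotation-only rewrites of the σ functions, as used in the diagrams.
    let x = 0x1234_5678_u32;
    assert_eq!(small_sigma_0(x), x.rotate_right(7) ^ x.rotate_left(14) ^ (x >> 3));
    assert_eq!(small_sigma_1(x), x.rotate_left(15) ^ x.rotate_left(13) ^ (x >> 10));
}
```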
### σ_0 function
$(X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3)$ is equivalent to
$(X ⋙ 7) \oplus (X ⋘ 14) \oplus (X ≫ 3)$.
![](./low_sigma_0.png)
As above but with pieces $(a, b, c, d)$ of lengths $(3, 4, 11, 14)$ counting from the
little end. Split $b$ into two $2$-bit subpieces.
$$
\begin{array}{rcccccccl}
& (0^{[3]} &||& d &||& c &||& b) & \oplus \\
& (\;\;\;b &||& a &||& d &||& c) & \oplus \\
& (\;\;\;c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & & & 4^{15} d &+& 4^{ 4} c &+& b\;&+ \\
& 4^{28} b &+& 4^{25} a &+& 4^{11} d &+& c\;&+ \\
& 4^{21} c &+& 4^{17} b &+& 4^{14} a &+& d\;&
\end{array}
$$
### σ_1 function
$(X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10)$ is equivalent to
$(X ⋘ 15) \oplus (X ⋘ 13) \oplus (X ≫ 10)$.
![](./low_sigma_1.png)
TODO: this diagram doesn't match the expression on the right. This is just for consistency
with the other diagrams.
As above but with pieces $(a, b, c, d)$ of lengths $(10, 7, 2, 13)$ counting from the
little end. Split $b$ into $(3, 2, 2)$-bit subpieces.
$$
\begin{array}{rcccccccl}
& (0^{[10]}&||& d &||& c &||& b) & \oplus \\
& (\;\;\;b &||& a &||& d &||& c) & \oplus \\
& (\;\;\;c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & & & 4^{ 9} d &+& 4^{ 7} c &+& b\;&+ \\
& 4^{25} b &+& 4^{15} a &+& 4^{ 2} d &+& c\;&+ \\
& 4^{30} c &+& 4^{23} b &+& 4^{13} a &+& d\;&
\end{array}
$$
### Message scheduling
We apply $\sigma_0$ to $W_{1..48}$, and $\sigma_1$ to $W_{14..61}$. In order to avoid
redundant applications of $\mathtt{spread}$, we can merge the splitting into pieces for
$\sigma_0$ and $\sigma_1$ in the case of $W_{14..48}$. Merging the piece lengths
$(3, 4, 11, 14)$ and $(10, 7, 2, 13)$ gives pieces of lengths $(3, 4, 3, 7, 1, 1, 13)$.
![](./bit_reassignment.png)
If we can do the merged split in $3$ rows (as opposed to a total of $4$ rows when
splitting for $\sigma_0$ and $\sigma_1$ separately), we save $35$ rows.
> These might even be doable in $2$ rows; not sure.
> —Daira
We can merge the reduction mod $2^{32}$ of $W_{16..61}$ into their splitting when they are
used to compute subsequent words, similarly to what we did for $A$ and $E$ in the round
function.
We will still need to reduce $W_{62..63}$ since they are not split. (Technically we could
leave them unreduced since they will be reduced later when they are used to compute
$A_{new}$ and $E_{new}$ -- but that would require handling a carry of up to $10$ rather
than $6$, so it's not worth the complexity.)
The resulting message schedule cost is:
- $2$ rows to constrain $W_0$ to $32$ bits
- This is technically optional, but let's do it for robustness, since the rest of the
input is constrained for free.
- $13*2$ rows to split $W_{1..13}$ into $(3, 4, 11, 14)$-bit pieces
- $35*3$ rows to split $W_{14..48}$ into $(3, 4, 3, 7, 1, 1, 13)$-bit pieces (merged with
a reduction for $W_{16..48}$)
- $13*2$ rows to split $W_{49..61}$ into $(10, 7, 2, 13)$-bit pieces (merged with a
reduction)
- $4*48$ rows to extract the results of $\sigma_0$ for $W_{1..48}$
- $4*48$ rows to extract the results of $\sigma_1$ for $W_{14..61}$
- $2*2$ rows to reduce $W_{62..63}$
- $= 547$ rows.
## Overall cost
For each round:
- $8$ rows for $Ch$
- $4$ rows for $Maj$
- $6$ rows for $\Sigma_0$
- $6$ rows for $\Sigma_1$
- $reduce_6$ and $reduce_7$ are always free
- $= 24$ per round
This gives $24*64 = 1792$ rows for all of "step 3", to which we need to add:
- $547$ rows for message scheduling
- $2*8$ rows for $8$ reductions mod $2^{32}$ in "step 4"
giving a total of $2099$ rows.
## Tables
We only require one table $\mathtt{spread}$, with $2^{16}$ rows and $3$ columns. We need a
tag column to allow selecting $(7, 10, 11, 13, 14)$-bit subsets of the table for
$\Sigma_{0..1}$ and $\sigma_{0..1}$.
### `spread` table
| row | tag | table (16b) | spread (32b) |
|--------------|-----|------------------|----------------------------------|
| $0$ | 0 | 0000000000000000 | 00000000000000000000000000000000 |
| $1$ | 0 | 0000000000000001 | 00000000000000000000000000000001 |
| $2$ | 0 | 0000000000000010 | 00000000000000000000000000000100 |
| $3$ | 0 | 0000000000000011 | 00000000000000000000000000000101 |
| ... | 0 | ... | ... |
| $2^{7} - 1$ | 0 | 0000000001111111 | 00000000000000000001010101010101 |
| $2^{7}$ | 1 | 0000000010000000 | 00000000000000000100000000000000 |
| ... | 1 | ... | ... |
| $2^{10} - 1$ | 1 | 0000001111111111 | 00000000000001010101010101010101 |
| ... | 2 | ... | ... |
| $2^{11} - 1$ | 2 | 0000011111111111 | 00000000010101010101010101010101 |
| ... | 3 | ... | ... |
| $2^{13} - 1$ | 3 | 0001111111111111 | 00000001010101010101010101010101 |
| ... | 4 | ... | ... |
| $2^{14} - 1$ | 4 | 0011111111111111 | 00000101010101010101010101010101 |
| ... | 5 | ... | ... |
| $2^{16} - 1$ | 5 | 1111111111111111 | 01010101010101010101010101010101 |
For example, to do an $11$-bit $\mathtt{spread}$ lookup, we polynomial-constrain the tag
to be in $\{0, 1, 2\}$. For the most common case of a $16$-bit lookup, we don't need to
constrain the tag. Note that we can fill any unused rows beyond $2^{16}$ with a duplicate
entry, e.g. all-zeroes.
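A hypothetical generator for this table, with the tag boundaries read off the rows above, can be sketched as:

```rust
// Interleave the bits of a 16-bit value with zero bits.
fn spread(x: u16) -> u32 {
    (0..16).fold(0u32, |r, i| r | ((((x >> i) & 1) as u32) << (2 * i)))
}

// Tag marking the smallest bit-length subset a row belongs to.
fn tag(row: u32) -> u8 {
    match row {
        0..=0x007f => 0,      // fits in 7 bits
        0x0080..=0x03ff => 1, // 10 bits
        0x0400..=0x07ff => 2, // 11 bits
        0x0800..=0x1fff => 3, // 13 bits
        0x2000..=0x3fff => 4, // 14 bits
        _ => 5,               // 16 bits
    }
}

fn main() {
    // Spot-check rows of the table above.
    assert_eq!(spread(0b11), 0b0101);
    assert_eq!(spread(0xffff), 0x5555_5555);
    assert_eq!(tag((1 << 7) - 1), 0);
    assert_eq!(tag(1 << 7), 1);
    assert_eq!(tag((1 << 10) - 1), 1);
    assert_eq!(tag((1 << 16) - 1), 5);
}
```

An $11$-bit lookup then constrains the tag to $\{0, 1, 2\}$, exactly as described above.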
## Gates
### Choice gate
Input from previous operations:
- $E', F', G',$ 64-bit spread forms of 32-bit words $E, F, G$, assumed to be constrained by previous operations
- in practice, we'll have the spread forms of $E', F', G'$ after they've been decomposed into 16-bit subpieces
- $evens$ is defined as $\mathtt{spread}(2^{32} - 1)$
- $evens_0 = evens_1 = \mathtt{spread}(2^{16} - 1)$
#### E ∧ F
|s_ch| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ |
|----|-------------|-------------|-----------------------------|------------------------------------|------------------------------------|
|0 |{0,1,2,3,4,5}|$P_0^{even}$ |$\texttt{spread}(P_0^{even})$| $\mathtt{spread}(E^{lo})$ | $\mathtt{spread}(E^{hi})$ |
|1 |{0,1,2,3,4,5}|$P_0^{odd}$ |$\texttt{spread}(P_0^{odd})$ |$\texttt{spread}(P_1^{odd})$ | |
|0 |{0,1,2,3,4,5}|$P_1^{even}$ |$\texttt{spread}(P_1^{even})$| $\mathtt{spread}(F^{lo})$ | $\mathtt{spread}(F^{hi})$ |
|0 |{0,1,2,3,4,5}|$P_1^{odd}$ |$\texttt{spread}(P_1^{odd})$ | | |
#### ¬E ∧ G
s_ch_neg| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ |
--------|-------------|-------------|-----------------------------|------------------------------------|------------------------------------|------------------------------------|
0 |{0,1,2,3,4,5}|$Q_0^{even}$ |$\texttt{spread}(Q_0^{even})$|$\mathtt{spread}(E_{neg}^{lo})$ | $\mathtt{spread}(E_{neg}^{hi})$ | $\mathtt{spread}(E^{lo})$ |
1 |{0,1,2,3,4,5}|$Q_0^{odd}$ |$\texttt{spread}(Q_0^{odd})$ |$\texttt{spread}(Q_1^{odd})$ | | $\mathtt{spread}(E^{hi})$ |
0 |{0,1,2,3,4,5}|$Q_1^{even}$ |$\texttt{spread}(Q_1^{even})$|$\mathtt{spread}(G^{lo})$ | $\mathtt{spread}(G^{hi})$ | |
0 |{0,1,2,3,4,5}|$Q_1^{odd}$ |$\texttt{spread}(Q_1^{odd})$ | | | |
Constraints:
- `s_ch` (choice): $LHS - RHS = 0$
- $LHS = a_3 \omega^{-1} + a_3 \omega + 2^{32}(a_4 \omega^{-1} + a_4 \omega)$
- $RHS = a_2 \omega^{-1} + 2 \cdot a_2 + 2^{32}(a_2 \omega + 2 \cdot a_3)$
- `s_ch_neg` (negation): `s_ch` with an extra negation check
- $\mathtt{spread}$ lookup on $(a_0, a_1, a_2)$
- permutation between $(a_2, a_3)$
Output: $Ch(E, F, G) = P^{odd} + Q^{odd} = (P_0^{odd} + Q_0^{odd}) + 2^{16} (P_1^{odd} + Q_1^{odd})$
### Majority gate
Input from previous operations:
- $A', B', C',$ 64-bit spread forms of 32-bit words $A, B, C$, assumed to be constrained by previous operations
- in practice, we'll have the spread forms of $A', B', C'$ after they've been decomposed into $16$-bit subpieces
s_maj| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ |
-----|-------------|------------|-----------------------------|----------------------------|--------------------------|--------------------------|
0 |{0,1,2,3,4,5}|$M_0^{even}$|$\texttt{spread}(M_0^{even})$| |$\mathtt{spread}(A^{lo})$ |$\mathtt{spread}(A^{hi})$ |
1 |{0,1,2,3,4,5}|$M_0^{odd}$ |$\texttt{spread}(M_0^{odd})$ |$\texttt{spread}(M_1^{odd})$|$\mathtt{spread}(B^{lo})$ |$\mathtt{spread}(B^{hi})$ |
0 |{0,1,2,3,4,5}|$M_1^{even}$|$\texttt{spread}(M_1^{even})$| |$\mathtt{spread}(C^{lo})$ |$\mathtt{spread}(C^{hi})$ |
0 |{0,1,2,3,4,5}|$M_1^{odd}$ |$\texttt{spread}(M_1^{odd})$ | | | |
Constraints:
- `s_maj` (majority): $LHS - RHS = 0$
- $LHS = \mathtt{spread}(M^{even}_0) + 2 \cdot \mathtt{spread}(M^{odd}_0) + 2^{32} \cdot \mathtt{spread}(M^{even}_1) + 2^{33} \cdot \mathtt{spread}(M^{odd}_1)$
- $RHS = A' + B' + C'$
- $\mathtt{spread}$ lookup on $(a_0, a_1, a_2)$
- permutation between $(a_2, a_3)$
Output: $Maj(A,B,C) = M^{odd} = M_0^{odd} + 2^{16} M_1^{odd}$
### Σ_0 gate
$A$ is a 32-bit word split into $(2,11,9,10)$-bit chunks, starting from the little end. We refer to these chunks as $(a(2), b(11), c(9), d(10))$ respectively, and further split $c(9)$ into three 3-bit chunks $c(9)^{lo}, c(9)^{mid}, c(9)^{hi}$. We witness the spread versions of the small chunks.
$$
\begin{array}{ccc}
\Sigma_0(A) &=& (A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋙ 22) \\
&=& (A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋘ 10)
\end{array}
$$
s_upp_sigma_0| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
-------------|-------------|------------|-----------------------------|------------------------------|----------------------------|------------------------|-----------------------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $c(9)^{lo}$ |$\texttt{spread}(c(9)^{lo})$| $c(9)^{mid}$ |$\texttt{spread}(c(9)^{mid})$|
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(10))$ |$\texttt{spread}(b(11))$| $c(9)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(2)$ |$\texttt{spread}(a(2))$ | $c(9)^{hi}$ |$\texttt{spread}(c(9)^{hi})$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | |
Constraints:
- `s_upp_sigma_0` ($\Sigma_0$ constraint): $LHS - RHS + tag + decompose = 0$
$$
\begin{array}{ccc}
tag &=& constrain_1(a_0\omega^{-1}) + constrain_2(a_0\omega) \\
decompose &=& a(2) + 2^2 b(11) + 2^{13} c(9)^{lo} + 2^{16} c(9)^{mid} + 2^{19} c(9)^{hi} + 2^{22} d(10) - A\\
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccl}
RHS = & 4^{30} \texttt{spread}(a(2)) &+& 4^{20} \texttt{spread}(d(10)) &+& 4^{17} \texttt{spread}(c(9)^{hi}) &+& 4^{14} \texttt{spread}(c(9)^{mid}) &+& 4^{11} \texttt{spread}(c(9)^{lo}) &+& \texttt{spread}(b(11))\;&+ \\
& 4^{21} \texttt{spread}(b(11)) &+& 4^{19} \texttt{spread}(a(2)) &+& 4^{9} \texttt{spread}(d(10)) &+& 4^{6} \texttt{spread}(c(9)^{hi}) &+& 4^{3} \texttt{spread}(c(9)^{mid}) &+& \texttt{spread}(c(9)^{lo}) \;&+ \\
& 4^{29} \texttt{spread}(c(9)^{hi}) &+& 4^{26} \texttt{spread}(c(9)^{mid}) &+& 4^{23} \texttt{spread}(c(9)^{lo}) &+& 4^{12} \texttt{spread}(b(11)) &+& 4^{10} \texttt{spread}(a(2)) &+& \texttt{spread}(d(10))\;&
\end{array}
$$
- $\mathtt{spread}$ lookup on $a_0, a_1, a_2$
- 2-bit range check and 2-bit spread check on $a(2)$
- 3-bit range check and 3-bit spread check on $c(9)^{lo}, c(9)^{mid}, c(9)^{hi}$
(see section [Helper gates](#helper-gates))
Output: $\Sigma_0(A) = R^{even} = R_0^{even} + 2^{16} R_1^{even}$
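The same spread trick recovers the three-way XOR: summing the spread forms of the three rotations and keeping the even interleave bits yields $\Sigma_0$. A small Python sketch (helper names are illustrative):

```python
# XOR of three words = even interleave bits of the sum of their spreads:
# each 2-bit slot counts at most three set bits, so the slot's low bit
# is the parity (the XOR) and slots never overflow into each other.
def spread(x, n=32):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def upper_sigma_0(a):
    total = spread(rotr(a, 2)) + spread(rotr(a, 13)) + spread(rotr(a, 22))
    return even_bits(total)
```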
### Σ_1 gate
$E$ is a 32-bit word split into $(6,5,14,7)$-bit chunks, starting from the little end. We refer to these chunks as $(a(6), b(5), c(14), d(7))$ respectively, and further split $a(6)$ into two 3-bit chunks $a(6)^{lo}, a(6)^{hi}$ and $b(5)$ into $(2,3)$-bit chunks $b(5)^{lo}, b(5)^{hi}$. We witness the spread versions of the small chunks.
$$
\begin{array}{ccc}
\Sigma_1(E) &=& (E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋙ 25) \\
&=& (E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋘ 7)
\end{array}
$$
s_upp_sigma_1| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ |
-------------|-------------|------------|-----------------------------|------------------------------|----------------------------|------------------------|-----------------------------|------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(5)^{lo}$ |$\texttt{spread}(b(5)^{lo})$| $b(5)^{hi}$ |$\texttt{spread}(b(5)^{hi})$ | $b(5)$ |
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(7))$ |$\texttt{spread}(c(14))$| | |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(6)^{lo}$ |$\texttt{spread}(a(6)^{lo})$| $a(6)^{hi}$ |$\texttt{spread}(a(6)^{hi})$ | $a(6)$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | |
Constraints:
- `s_upp_sigma_1` ($\Sigma_1$ constraint): $LHS - RHS + tag + decompose = 0$
$$
\begin{array}{ccc}
tag &=& a_0\omega^{-1} + constrain_4(a_0\omega) \\
decompose &=& a(6)^{lo} + 2^3 a(6)^{hi} + 2^6 b(5)^{lo} + 2^8 b(5)^{hi} + 2^{11} c(14) + 2^{25} d(7) - E \\
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccl}
RHS = & 4^{29} \texttt{spread}(a(6)^{hi}) &+& 4^{26} \texttt{spread}(a(6)^{lo}) &+& 4^{19} \texttt{spread}(d(7)) &+& 4^{ 5} \texttt{spread}(c(14)) &+& 4^{2} \texttt{spread}(b(5)^{hi}) &+& \texttt{spread}(b(5)^{lo})\;&+ \\
& 4^{29} \texttt{spread}(b(5)^{hi}) &+& 4^{27} \texttt{spread}(b(5)^{lo}) &+& 4^{24} \texttt{spread}(a(6)^{hi}) &+& 4^{21} \texttt{spread}(a(6)^{lo}) &+& 4^{14} \texttt{spread}(d(7)) &+& \texttt{spread}(c(14))\;&+ \\
& 4^{18} \texttt{spread}(c(14)) &+& 4^{15} \texttt{spread}(b(5)^{hi}) &+& 4^{13} \texttt{spread}(b(5)^{lo}) &+& 4^{10} \texttt{spread}(a(6)^{hi}) &+& 4^{7} \texttt{spread}(a(6)^{lo}) &+& \texttt{spread}(d(7))\;&
\end{array}
$$
- $\mathtt{spread}$ lookup on $a_0, a_1, a_2$
- 2-bit range check and 2-bit spread check on $b(5)^{lo}$
- 3-bit range check and 3-bit spread check on $a(6)^{lo}, a(6)^{hi}, b(5)^{hi}$
(see section [Helper gates](#helper-gates))
Output: $\Sigma_1(E) = R^{even} = R_0^{even} + 2^{16} R_1^{even}$
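The two displayed forms of $\Sigma_1$ agree because rotating a 32-bit word right by 25 is the same as rotating it left by 7; a quick Python check (helper names are illustrative):

```python
# Verify the rotation identity (E >>> 25) == (E <<< 7) used in the
# second form of Sigma_1 above.
def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def rotl(x, k):
    return ((x << k) | (x >> (32 - k))) & 0xFFFFFFFF

def upper_sigma_1(e):
    return rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
```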
### σ_0 gate
#### v1
v1 of the $\sigma_0$ gate takes in a word that's split into $(3, 4, 11, 14)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(3), b(4), c(11), d(14)).$ $b(4)$ is further split into two 2-bit chunks $b(4)^{lo}, b(4)^{hi}.$ We witness the spread versions of the small chunks. We already have $\texttt{spread}(c(11))$ and $\texttt{spread}(d(14))$ from the message scheduling.
$(X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3)$ is equivalent to
$(X ⋙ 7) \oplus (X ⋘ 14) \oplus (X ≫ 3)$.
s_low_sigma_0| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
-------------|-------------|------------|-----------------------------|-----------------------------|----------------------------|--------------------|----------------------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(4)^{lo}$ |$\texttt{spread}(b(4)^{lo})$| $b(4)^{hi}$ |$\texttt{spread}(b(4)^{hi})$|
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ |$\texttt{spread}(R_1^{odd})$ |$\texttt{spread}(c(11))$ |$\texttt{spread}(d(14))$| $b(4)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $0$ | $0$ | $a(3)$ | $\texttt{spread}(a(3))$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | |
Constraints:
- `s_low_sigma_0` ($\sigma_0$ v1 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rccccccccl}
RHS = & & & 4^{15} \texttt{spread}(d(14)) &+& 4^{ 4} \texttt{spread}(c(11)) &+& 4^2 \texttt{spread}(b(4)^{hi}) &+& \texttt{spread}(b(4)^{lo})\;&+ \\
& 4^{30} \texttt{spread}(b(4)^{hi}) &+& 4^{28} \texttt{spread}(b(4)^{lo}) &+& 4^{25} \texttt{spread}(a(3)) &+& 4^{11} \texttt{spread}(d(14)) &+& \texttt{spread}(c(11))\;&+ \\
& 4^{21} \texttt{spread}(c(11)) &+& 4^{19} \texttt{spread}(b(4)^{hi}) &+& 4^{17} \texttt{spread}(b(4)^{lo}) &+& 4^{14} \texttt{spread}(a(3)) &+& \texttt{spread}(d(14))\;&
\end{array}
$$
- check that $b(4)$ was properly split into its 2-bit subpieces:
- $W^{b(4)lo} + 2^2 W^{b(4)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(4)^{lo}, b(4)^{hi}$
- 3-bit range check and 3-bit spread check on $a(3)$
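The $4^k$ coefficients in the three RHS rows can be validated numerically: recombining the spread chunks with exactly those coefficients and taking the even interleave bits must reproduce $\sigma_0$. A Python sketch (chunk extraction and names are illustrative):

```python
def spread(x, n):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def low_sigma_0_v1(x):
    # (3, 2, 2, 11, 14)-bit chunks from the little end: a, b_lo, b_hi, c, d
    a, b_lo, b_hi = x & 7, (x >> 3) & 3, (x >> 5) & 3
    c, d = (x >> 7) & 0x7FF, (x >> 18) & 0x3FFF
    sa, sbl, sbh = spread(a, 3), spread(b_lo, 2), spread(b_hi, 2)
    sc, sd = spread(c, 11), spread(d, 14)
    # the three spread-space terms, with the coefficients from the RHS rows
    shr3  = 4**15*sd + 4**4*sc + 4**2*sbh + sbl               # X >> 3 (drops a)
    rot7  = 4**30*sbh + 4**28*sbl + 4**25*sa + 4**11*sd + sc  # X >>> 7
    rot18 = 4**21*sc + 4**19*sbh + 4**17*sbl + 4**14*sa + sd  # X >>> 18
    return even_bits(shr3 + rot7 + rot18)

def low_sigma_0_ref(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
```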
#### v2
v2 of the $\sigma_0$ gate takes in a word that's split into $(3, 4, 3, 7, 1, 1, 13)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(3), b(4), c(3), d(7), e(1), f(1), g(13)).$ We already have $\mathtt{spread}(d(7)), \mathtt{spread}(g(13))$ from the message scheduling. The 1-bit $e(1), f(1)$ remain unchanged by the spread operation and can be used directly. We further split $b(4)$ into two 2-bit chunks $b(4)^{lo}, b(4)^{hi}.$ We witness the spread versions of the small chunks.
$(X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3)$ is equivalent to
$(X ⋙ 7) \oplus (X ⋘ 14) \oplus (X ≫ 3)$.
s_low_sigma_0_v2| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ |
----------------|-------------|------------|-----------------------------|-----------------------------|----------------------------|------------------------|----------------------------|------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(4)^{lo}$ |$\texttt{spread}(b(4)^{lo})$| $b(4)^{hi}$ |$\texttt{spread}(b(4)^{hi})$| |
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$| $\texttt{spread}(d(7))$ |$\texttt{spread}(g(13))$| $b(4)$ | $e(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(3)$ |$\texttt{spread}(a(3))$ | $c(3)$ |$\texttt{spread}(c(3))$ | $f(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | |
Constraints:
- `s_low_sigma_0_v2` ($\sigma_0$ v2 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccccl}
RHS = & & & 4^{16} \texttt{spread}(g(13)) &+& 4^{15} f(1) &+& 4^{ 14} e(1) &+& 4^{ 7} \texttt{spread}(d(7)) &+& 4^{ 4} \texttt{spread}(c(3)) &+& 4^2 \texttt{spread}(b(4)^{hi}) &+& \texttt{spread}(b(4)^{lo})\;&+ \\
& 4^{30} \texttt{spread}(b(4)^{hi}) &+& 4^{28} \texttt{spread}(b(4)^{lo}) &+& 4^{25} \texttt{spread}(a(3)) &+& 4^{12} \texttt{spread}(g(13)) &+& 4^{11} f(1) &+& 4^{10} e(1) &+& 4^{3} \texttt{spread}(d(7)) &+& \texttt{spread}(c(3))\;&+ \\
& 4^{31} e(1) &+& 4^{24} \texttt{spread}(d(7)) &+& 4^{21} \texttt{spread}(c(3)) &+& 4^{19} \texttt{spread}(b(4)^{hi}) &+& 4^{17} \texttt{spread}(b(4)^{lo}) &+& 4^{14} \texttt{spread}(a(3)) &+& 4^{1} \texttt{spread}(g(13)) &+& f(1)\;&
\end{array}
$$
- check that $b(4)$ was properly split into its 2-bit subpieces:
- $W^{b(4)lo} + 2^2 W^{b(4)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(4)^{lo}, b(4)^{hi}$
- 3-bit range check and 3-bit spread check on $a(3), c(3)$
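As with v1, the v2 coefficient rows can be checked numerically: recombining the spread chunks (with the 1-bit $e, f$ used directly) and taking the even interleave bits must again give $\sigma_0$. A Python sketch (names are illustrative):

```python
def spread(x, n):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def low_sigma_0_v2(x):
    # (3, 2, 2, 3, 7, 1, 1, 13)-bit chunks from the little end
    a, b_lo, b_hi = x & 7, (x >> 3) & 3, (x >> 5) & 3
    c, d = (x >> 7) & 7, (x >> 10) & 0x7F
    e, f, g = (x >> 17) & 1, (x >> 18) & 1, (x >> 19) & 0x1FFF
    sa, sbl, sbh = spread(a, 3), spread(b_lo, 2), spread(b_hi, 2)
    sc, sd, sg = spread(c, 3), spread(d, 7), spread(g, 13)
    # 1-bit e and f are unchanged by spread
    shr3  = 4**16*sg + 4**15*f + 4**14*e + 4**7*sd + 4**4*sc + 4**2*sbh + sbl
    rot7  = 4**30*sbh + 4**28*sbl + 4**25*sa + 4**12*sg + 4**11*f + 4**10*e + 4**3*sd + sc
    rot18 = 4**31*e + 4**24*sd + 4**21*sc + 4**19*sbh + 4**17*sbl + 4**14*sa + 4**1*sg + f
    return even_bits(shr3 + rot7 + rot18)

def low_sigma_0_ref(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
```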
### σ_1 gate
#### v1
v1 of the $\sigma_1$ gate takes in a word that's split into $(10, 7, 2, 13)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(10), b(7), c(2), d(13)).$ $b(7)$ is further split into $(2, 2, 3)$-bit chunks $b(7)^{lo}, b(7)^{mid}, b(7)^{hi}.$ We witness the spread versions of the small chunks. We already have $\texttt{spread}(a(10))$ and $\texttt{spread}(d(13))$ from the message scheduling.
$(X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10)$ is equivalent to
$(X ⋘ 15) \oplus (X ⋘ 13) \oplus (X ≫ 10)$.
s_low_sigma_1| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
-------------|-------------|------------|-----------------------------|------------------------------|----------------------------|------------------------|-----------------------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(7)^{lo}$ |$\texttt{spread}(b(7)^{lo})$| $b(7)^{mid}$ |$\texttt{spread}(b(7)^{mid})$|
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(a(10))$ |$\texttt{spread}(d(13))$| $b(7)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $c(2)$ |$\texttt{spread}(c(2))$ | $b(7)^{hi}$ |$\texttt{spread}(b(7)^{hi})$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | |
Constraints:
- `s_low_sigma_1` ($\sigma_1$ v1 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccl}
RHS = & & & 4^{ 9} \texttt{spread}(d(13)) &+& 4^{ 7} \texttt{spread}(c(2)) &+& 4^{4} \texttt{spread}(b(7)^{hi}) &+& 4^{2} \texttt{spread}(b(7)^{mid}) &+& \texttt{spread}(b(7)^{lo})\;&+ \\
& 4^{29} \texttt{spread}(b(7)^{hi}) &+& 4^{27} \texttt{spread}(b(7)^{mid}) &+& 4^{25} \texttt{spread}(b(7)^{lo}) &+& 4^{15} \texttt{spread}(a(10)) &+& 4^{ 2} \texttt{spread}(d(13)) &+& \texttt{spread}(c(2))\;&+ \\
& 4^{30} \texttt{spread}(c(2)) &+& 4^{27} \texttt{spread}(b(7)^{hi}) &+& 4^{25} \texttt{spread}(b(7)^{mid}) &+& 4^{23} \texttt{spread}(b(7)^{lo}) &+& 4^{13} \texttt{spread}(a(10)) &+& \texttt{spread}(d(13))\;&
\end{array}
$$
- check that $b(7)$ was properly split into its $(2,2,3)$-bit subpieces:
- $W^{b(7)lo} + 2^2 W^{b(7)mid} + 2^4 W^{b(7)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(7)^{lo}, b(7)^{mid}, c(2)$
- 3-bit range check and 3-bit spread check on $b(7)^{hi}$
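The v1 coefficient rows for $\sigma_1$ admit the same numerical check: recombine the spread chunks with the listed $4^k$ coefficients and take the even interleave bits. A Python sketch (names are illustrative):

```python
def spread(x, n):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def low_sigma_1_v1(x):
    # (10, 2, 2, 3, 2, 13)-bit chunks from the little end
    a = x & 0x3FF
    b_lo, b_mid, b_hi = (x >> 10) & 3, (x >> 12) & 3, (x >> 14) & 7
    c, d = (x >> 17) & 3, (x >> 19) & 0x1FFF
    sa, sbl, sbm = spread(a, 10), spread(b_lo, 2), spread(b_mid, 2)
    sbh, sc, sd = spread(b_hi, 3), spread(c, 2), spread(d, 13)
    shr10 = 4**9*sd + 4**7*sc + 4**4*sbh + 4**2*sbm + sbl            # X >> 10
    rot17 = 4**29*sbh + 4**27*sbm + 4**25*sbl + 4**15*sa + 4**2*sd + sc
    rot19 = 4**30*sc + 4**27*sbh + 4**25*sbm + 4**23*sbl + 4**13*sa + sd
    return even_bits(shr10 + rot17 + rot19)

def low_sigma_1_ref(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)
```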
#### v2
v2 of the $\sigma_1$ gate takes in a word that's split into $(3, 4, 3, 7, 1, 1, 13)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(3), b(4), c(3), d(7), e(1), f(1), g(13)).$ We already have $\mathtt{spread}(d(7)), \mathtt{spread}(g(13))$ from the message scheduling. The 1-bit $e(1), f(1)$ remain unchanged by the spread operation and can be used directly. We further split $b(4)$ into two 2-bit chunks $b(4)^{lo}, b(4)^{hi}.$ We witness the spread versions of the small chunks.
$(X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10)$ is equivalent to
$(X ⋘ 15) \oplus (X ⋘ 13) \oplus (X ≫ 10)$.
s_low_sigma_1_v2| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ |
----------------|-------------|------------|-----------------------------|-----------------------------|----------------------------|-------------------------|----------------------------|------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(4)^{lo}$ |$\texttt{spread}(b(4)^{lo})$| $b(4)^{hi}$ |$\texttt{spread}(b(4)^{hi})$| |
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$| $\texttt{spread}(d(7))$ | $\texttt{spread}(g(13))$| $b(4)$ | $e(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(3)$ |$\texttt{spread}(a(3))$ | $c(3)$ |$\texttt{spread}(c(3))$ | $f(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | |
Constraints:
- `s_low_sigma_1_v2` ($\sigma_1$ v2 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rccccccccccccl}
RHS = & &&&& & & 4^{ 9} \texttt{spread}(g(13)) &+& 4^{ 8} f(1) &+& 4^{ 7} e(1) &+& \texttt{spread}(d(7))\;&+ \\
& 4^{25} \texttt{spread}(d(7)) &+& 4^{22} \texttt{spread}(c(3)) &+& 4^{20} \texttt{spread}(b(4)^{hi}) &+& 4^{18} \texttt{spread}(b(4)^{lo}) &+& 4^{15} \texttt{spread}(a(3)) &+& 4^{ 2} \texttt{spread}(g(13)) &+& 4^{1} f(1) &+& e(1)\;&+ \\
& 4^{31} f(1) &+& 4^{30} e(1) &+& 4^{23} \texttt{spread}(d(7)) &+& 4^{20} \texttt{spread}(c(3)) &+& 4^{18} \texttt{spread}(b(4)^{hi}) &+& 4^{16} \texttt{spread}(b(4)^{lo}) &+& 4^{13} \texttt{spread}(a(3)) &+& \texttt{spread}(g(13))\;&
\end{array}
$$
- check that $b(4)$ was properly split into its 2-bit subpieces:
- $W^{b(4)lo} + 2^2 W^{b(4)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(4)^{lo}, b(4)^{hi}$
- 3-bit range check and 3-bit spread check on $a(3), c(3)$
### Helper gates
#### Small range constraints
Let $constrain_n(x) = \prod_{i=0}^n (x-i)$. Constraining this expression to equal zero enforces that $x$ is in $[0..n].$
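Over the integers (standing in for the field), this is a degree-$(n+1)$ polynomial that vanishes exactly on $\{0, \ldots, n\}$:

```python
# constrain_n(x) = prod_{i=0}^{n} (x - i): zero iff x is in {0, ..., n}.
# Plain integers stand in for field elements in this sketch.
def constrain(n, x):
    p = 1
    for i in range(n + 1):
        p *= x - i
    return p
```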
#### 2-bit range check
$(a - 3)(a - 2)(a - 1)(a) = 0$
sr2| $a_0$ |
---|-------|
1 | a |
#### 2-bit spread
$l_1(a) + 4 \cdot l_2(a) + 5 \cdot l_3(a) - a' = 0$
ss2| $a_0$ | $a_1$
---|-------|------
1 | a | a'
with interpolation polynomials:
- $l_0(a) = \frac{(a - 3)(a - 2)(a - 1)}{(-3)(-2)(-1)}$ ($\mathtt{spread}(00) = 0000$)
- $l_1(a) = \frac{(a - 3)(a - 2)(a)}{(-2)(-1)(1)}$ ($\mathtt{spread}(01) = 0001$)
- $l_2(a) = \frac{(a - 3)(a - 1)(a)}{(-1)(1)(2)}$ ($\mathtt{spread}(10) = 0100$)
- $l_3(a) = \frac{(a - 2)(a - 1)(a)}{(1)(2)(3)}$ ($\mathtt{spread}(11) = 0101$)
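These interpolation polynomials can be verified with exact rational arithmetic (standing in for the field): the combination above must map $a \mapsto \mathtt{spread}(a)$, i.e. $0, 1, 4, 5$ on $\{0, 1, 2, 3\}$. A sketch:

```python
# Check the 2-bit spread gate's Lagrange interpolation over the points
# {0, 1, 2, 3}; Fraction gives exact arithmetic in place of field elements.
from fractions import Fraction

def lagrange_basis(j, a, points=4):
    num, den = 1, 1
    for m in range(points):
        if m != j:
            num *= a - m
            den *= j - m
    return Fraction(num, den)

def spread_2bit(a):
    # coefficients from the gate: spread(1)=1, spread(2)=4, spread(3)=5
    return 1 * lagrange_basis(1, a) + 4 * lagrange_basis(2, a) + 5 * lagrange_basis(3, a)
```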
#### 3-bit range check
$(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)(a) = 0$
sr3| $a_0$ |
---|-------|
1 | a |
#### 3-bit spread
$l_1(a) + 4 \cdot l_2(a) + 5 \cdot l_3(a) + 16 \cdot l_4(a) + 17 \cdot l_5(a) + 20 \cdot l_6(a) + 21 \cdot l_7(a) - a' = 0$
ss3| $a_0$ | $a_1$
---|-------|------
1 | a | a'
with interpolation polynomials:
- $l_0(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)}{(-7)(-6)(-5)(-4)(-3)(-2)(-1)}$ ($\mathtt{spread}(000) = 000000$)
- $l_1(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a)}{(-6)(-5)(-4)(-3)(-2)(-1)(1)}$ ($\mathtt{spread}(001) = 000001$)
- $l_2(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 1)(a)}{(-5)(-4)(-3)(-2)(-1)(1)(2)}$ ($\mathtt{spread}(010) = 000100$)
- $l_3(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 2)(a - 1)(a)}{(-4)(-3)(-2)(-1)(1)(2)(3)}$ ($\mathtt{spread}(011) = 000101$)
- $l_4(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 3)(a - 2)(a - 1)(a)}{(-3)(-2)(-1)(1)(2)(3)(4)}$ ($\mathtt{spread}(100) = 010000$)
- $l_5(a) = \frac{(a - 7)(a - 6)(a - 4)(a - 3)(a - 2)(a - 1)(a)}{(-2)(-1)(1)(2)(3)(4)(5)}$ ($\mathtt{spread}(101) = 010001$)
- $l_6(a) = \frac{(a - 7)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)(a)}{(-1)(1)(2)(3)(4)(5)(6)}$ ($\mathtt{spread}(110) = 010100$)
- $l_7(a) = \frac{(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)(a)}{(1)(2)(3)(4)(5)(6)(7)}$ ($\mathtt{spread}(111) = 010101$)
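The same check applies here: the combination must map $a \mapsto \mathtt{spread}(a)$, i.e. $0, 1, 4, 5, 16, 17, 20, 21$ on $\{0, \ldots, 7\}$. A sketch with exact rationals in place of field elements:

```python
from fractions import Fraction

# spread images of 0..7, matching the gate's coefficients 1,4,5,16,17,20,21
SPREAD3 = [0, 1, 4, 5, 16, 17, 20, 21]

def lagrange_basis(j, a, points=8):
    num, den = 1, 1
    for m in range(points):
        if m != j:
            num *= a - m
            den *= j - m
    return Fraction(num, den)

def spread_3bit(a):
    return sum(SPREAD3[j] * lagrange_basis(j, a) for j in range(8))
```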
#### reduce_6 gate
Addition $\pmod{2^{32}}$ of 6 elements
Input:
- $E$
- $\{e_i^{lo}, e_i^{hi}\}_{i=0}^5$
- $carry$
Check: $E = e_0 + e_1 + e_2 + e_3 + e_4 + e_5 \pmod{2^{32}}$
Assume inputs are constrained to 16 bits.
- Addition gate (sa):
- $a_0 + a_1 + a_2 + a_3 + a_4 + a_5 + a_6 - a_7 = 0$
- Carry gate (sc):
- $2^{16} a_6 + a_6 \omega^{-1} + [(a_6 - 5)(a_6 - 4)(a_6 - 3)(a_6 - 2)(a_6 - 1)(a_6)] = 0$
sa|sc| $a_0$ | $a_1$ |$a_2$ |$a_3$ |$a_4$ |$a_5$ |$a_6$ |$a_7$ |
--|--|----------|----------|----------|----------|----------|----------|---------------|--------|
1 |0 |$e_0^{lo}$|$e_1^{lo}$|$e_2^{lo}$|$e_3^{lo}$|$e_4^{lo}$|$e_5^{lo}$|$-carry*2^{16}$|$E^{lo}$|
1 |1 |$e_0^{hi}$|$e_1^{hi}$|$e_2^{hi}$|$e_3^{hi}$|$e_4^{hi}$|$e_5^{hi}$|$carry$ |$E^{hi}$|
An alternative layout uses only four advice columns, at the cost of two extra rows. Again, assume inputs are constrained to 16 bits.
- Addition gate (sa):
- $a_0 \omega^{-1} + a_1 \omega^{-1} + a_2 \omega^{-1} + a_0 + a_1 + a_2 + a_3 \omega^{-1} - a_3 = 0$
- Carry gate (sc):
- $2^{16} a_3 \omega + a_3 \omega^{-1} = 0$
sa|sc| $a_0$ | $a_1$ |$a_2$ |$a_3$ |
--|--|----------|----------|----------|---------------|
0 |0 |$e_0^{lo}$|$e_1^{lo}$|$e_2^{lo}$|$-carry*2^{16}$|
1 |1 |$e_3^{lo}$|$e_4^{lo}$|$e_5^{lo}$|$E^{lo}$ |
0 |0 |$e_0^{hi}$|$e_1^{hi}$|$e_2^{hi}$|$carry$ |
1 |0 |$e_3^{hi}$|$e_4^{hi}$|$e_5^{hi}$|$E^{hi}$ |
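The lo/hi carry bookkeeping can be mirrored outside the circuit: with six 16-bit low halves the carry is at most 5, matching the range product in the carry gate. A Python sketch (names are illustrative):

```python
def reduce6(es):
    # Add six 32-bit words mod 2^32 via 16-bit halves and a single carry.
    lo = sum(e & 0xFFFF for e in es)
    carry = lo >> 16                 # in [0, 5] for six 16-bit addends
    e_lo = lo - (carry << 16)        # lo row: sum(e_i_lo) - carry*2^16 - E_lo = 0
    e_hi = (sum(e >> 16 for e in es) + carry) & 0xFFFF
    return e_lo + (e_hi << 16)
```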
#### reduce_7 gate
Addition $\pmod{2^{32}}$ of 7 elements
Input:
- $E$
- $\{e_i^{lo}, e_i^{hi}\}_{i=0}^6$
- $carry$
Check: $E = e_0 + e_1 + e_2 + e_3 + e_4 + e_5 + e_6 \pmod{2^{32}}$
Assume inputs are constrained to 16 bits.
- Addition gate (sa):
- $a_0 + a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 - a_8 = 0$
- Carry gate (sc):
- $2^{16} a_7 + a_7 \omega^{-1} + [(a_7 - 6)(a_7 - 5)(a_7 - 4)(a_7 - 3)(a_7 - 2)(a_7 - 1)(a_7)] = 0$
sa|sc| $a_0$ | $a_1$ |$a_2$ |$a_3$ |$a_4$ |$a_5$ |$a_6$ |$a_7$ |$a_8$ |
--|--|----------|----------|----------|----------|----------|----------|----------|---------------|--------|
1 |0 |$e_0^{lo}$|$e_1^{lo}$|$e_2^{lo}$|$e_3^{lo}$|$e_4^{lo}$|$e_5^{lo}$|$e_6^{lo}$|$-carry*2^{16}$|$E^{lo}$|
1 |1 |$e_0^{hi}$|$e_1^{hi}$|$e_2^{hi}$|$e_3^{hi}$|$e_4^{hi}$|$e_5^{hi}$|$e_6^{hi}$|$carry$ |$E^{hi}$|
### Message scheduling region
For each block $M \in \{0,1\}^{512}$ of the padded message, $64$ words of $32$ bits each are constructed as follows:
- the first $16$ are obtained by splitting $M$ into $32$-bit blocks $$M = W_0 || W_1 || \cdots || W_{14} || W_{15};$$
- the remaining $48$ words are constructed using the formula:
$$W_i = \sigma_1(W_{i-2}) \boxplus W_{i-7} \boxplus \sigma_0(W_{i-15}) \boxplus W_{i-16},$$ for $16 \leq i < 64$.
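The recurrence above, with $\sigma_0$ and $\sigma_1$ as defined in the previous sections, can be sketched directly in Python (an out-of-circuit reference, with illustrative names):

```python
def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def sigma_0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def sigma_1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def message_schedule(block_words):
    # block_words: the sixteen 32-bit words W_0..W_15 of one 512-bit block
    w = list(block_words)
    for i in range(16, 64):
        w.append((sigma_1(w[i - 2]) + w[i - 7] + sigma_0(w[i - 15]) + w[i - 16])
                 & 0xFFFFFFFF)
    return w
```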
sw|sd0|sd1|sd2|sd3|ss0|ss0_v2|ss1|ss1_v2| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--|---|---|---|---|---|------|---|------|---------------|------------------|-----------------------------------|------------------------------|----------------------------------|---------------------------------|--------------------------------- |------------------------|----------------|--------------|
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{0}^{lo}$ | $\texttt{spread}(W_{0}^{lo})$ | $W_{0}^{lo}$ | $W_{0}^{hi}$ | $W_{0}$ |$\sigma_0(W_1)^{lo}$ |$\sigma_1(W_{14})^{lo}$ | $W_{9}^{lo}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{0}^{hi}$ | $\texttt{spread}(W_{0}^{hi})$ | | | $W_{16}$ |$\sigma_0(W_1)^{hi}$ |$\sigma_1(W_{14})^{hi}$ | $W_{9}^{hi}$ | $carry_{16}$ |
0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4} | $W_{1}^{d(14)}$ | $\texttt{spread}(W_{1}^{d(14)})$ | $W_{1}^{lo}$ | $W_{1}^{hi}$ | $W_{1}$ |$\sigma_0(W_2)^{lo}$ |$\sigma_1(W_{15})^{lo}$ | $W_{10}^{lo}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} | $W_{1}^{c(11)}$ | $\texttt{spread}(W_{1}^{c(11)})$ | $W_{1}^{a(3)}$ | $W_{1}^{b(4)}$ | $W_{17}$ |$\sigma_0(W_2)^{hi}$ |$\sigma_1(W_{15})^{hi}$ | $W_{10}^{hi}$ | $carry_{17}$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{1}^{b(4)lo}$ |$\texttt{spread}(W_{1}^{b(4)lo})$ | $W_{1}^{b(4)hi}$ |$\texttt{spread}(W_{1}^{b(4)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ |$\texttt{spread}(W_{1}^{c(11)})$ |$\texttt{spread}(W_{1}^{d(14)})$ | $W_{1}^{b(4)}$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $0$ | $0$ | $W_{1}^{a(3)}$ |$\texttt{spread}(W_{1}^{a(3)})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_0 v1 R_0$ | $\sigma_0 v1 R_1$ | $\sigma_0 v1 R_0^{even}$ | $\sigma_0 v1 R_0^{odd}$ | | | |
..|...|...|...|...|...|... |...|... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3} | $W_{14}^{g(13)}$ | $\texttt{spread}(W_{14}^{g(13)})$ | $W_{14}^{a(3)}$ | $W_{14}^{c(3)}$ | | | | | |
0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | $W_{14}^{d(7)}$ | $\texttt{spread}(W_{14}^{d(7)})$ | $W_{14}^{lo}$ | $W_{14}^{hi}$ | $W_{14}$ |$\sigma_0(W_{15})^{lo}$ |$\sigma_1(W_{28})^{lo}$ | $W_{23}^{lo}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $W_{14}^{b(4)}$ | $\texttt{spread}(W_{14}^{b(4)})$ | $W_{14}^{e(1)}$ | $W_{14}^{f(1)}$ | $W_{30}$ |$\sigma_0(W_{15})^{hi}$ |$\sigma_1(W_{28})^{hi}$ | $W_{23}^{hi}$ | $carry_{30}$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{14}^{b(4)lo}$ |$\texttt{spread}(W_{14}^{b(4)lo})$| $W_{14}^{b(4) hi}$ |$\texttt{spread}(W_{14}^{b(4)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ |$\texttt{spread}(W_{14}^{d(7)})$ |$\texttt{spread}(W_{14}^{g(13)})$| $W_{14}^{b(4)}$ | $W_{14}^{e(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $W_{14}^{a(3)}$ |$\texttt{spread}(W_{14}^{a(3)})$ | $W_{14}^{c(3)}$ |$\texttt{spread}(W_{14}^{c(3)})$ | $W_{14}^{f(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_0 v2 R_0$ | $\sigma_0 v2 R_1$ |$\sigma_0 v2 R_0^{even}$ |$\sigma_0 v2 R_0^{odd}$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{14}^{b(4)lo}$ |$\texttt{spread}(W_{14}^{b(4)lo})$| $W_{14}^{b(4) hi}$ |$\texttt{spread}(W_{14}^{b(4)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | {0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(W_{14}^{d(7)})$ | $\texttt{spread}(W_{14}^{g(13)})$ | | $W_{14}^{e(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $W_{14}^{a(3)}$ |$\texttt{spread}(W_{14}^{a(3)})$ | $W_{14}^{c(3)}$ |$\texttt{spread}(W_{14}^{c(3)})$ | $W_{14}^{f(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_1 v2 R_0$ | $\sigma_1 v2 R_1$ |$\sigma_1 v2 R_0^{even}$ |$\sigma_1 v2 R_0^{odd}$ | | | |
..|...|...|...|...|...|... |...|... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | {0,1,2,3} | $W_{49}^{d(13)}$ | $\texttt{spread}(W_{49}^{d(13)})$ | $W_{49}^{lo}$ | $W_{49}^{hi}$ | $W_{49}$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} | $W_{49}^{a(10)}$ | $\texttt{spread}(W_{49}^{a(10)})$ | $W_{49}^{c(2)}$ | $W_{49}^{b(7)}$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{49}^{b(7)lo}$ |$\texttt{spread}(W_{49}^{b(7)lo})$| $W_{49}^{b(7)mid}$ |$\texttt{spread}(W_{49}^{b(7)mid})$| | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |{0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(W_{49}^{a(10)})$ | $\texttt{spread}(W_{49}^{d(13)})$ | $W_{49}^{b(7)}$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $W_{49}^{c(2)}$ |$\texttt{spread}(W_{49}^{c(2)})$ | $W_{49}^{b(7)hi}$ |$\texttt{spread}(W_{49}^{b(7)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_1 v1 R_0$ | $\sigma_1 v1 R_1$ |$\sigma_1 v1 R_0^{even}$ |$\sigma_1 v1 R_0^{odd}$ | | | |
..|...|...|...|...|...|... |...|... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{62}^{lo}$ | $\texttt{spread}(W_{62}^{lo})$ | $W_{62}^{lo}$ | $W_{62}^{hi}$ | $W_{62}$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{62}^{hi}$ | $\texttt{spread}(W_{62}^{hi})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{63}^{lo}$ | $\texttt{spread}(W_{63}^{lo})$ | $W_{63}^{lo}$ | $W_{63}^{hi}$ | $W_{63}$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{63}^{hi}$ | $\texttt{spread}(W_{63}^{hi})$ | | | | | | | |
Constraints:
- `sw`: construct word using $reduce_4$
- `sd0`: decomposition gate for $W_0, W_{62}, W_{63}$
- $W^{lo} + 2^{16} W^{hi} - W = 0$
- `sd1`: decomposition gate for $W_{1..13}$ (split into $(3,4,11,14)$-bit pieces)
- $W^{a(3)} + 2^3 W^{b(4) lo} + 2^5 W^{b(4) hi} + 2^7 W^{c(11)} + 2^{18} W^{d(14)} - W = 0$
- `sd2`: decomposition gate for $W_{14..48}$ (split into $(3,4,3,7,1,1,13)$-bit pieces)
- $W^{a(3)} + 2^3 W^{b(4) lo} + 2^5 W^{b(4) hi} + 2^7 W^{c(3)} + 2^{10} W^{d(7)} + 2^{17} W^{e(1)} + 2^{18} W^{f(1)} + 2^{19} W^{g(13)} - W = 0$
- `sd3`: decomposition gate for $W_{49..61}$ (split into $(10,7,2,13)$-bit pieces)
- $W^{a(10)} + 2^{10} W^{b(7) lo} + 2^{12} W^{b(7) mid} + 2^{15} W^{b(7) hi} + 2^{17} W^{c(2)} + 2^{19} W^{d(13)} - W = 0$
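Each decomposition gate is a weighted recombination of little-endian chunks, where a chunk's weight is $2$ raised to the sum of the preceding widths. A generic Python sketch (names are illustrative):

```python
def split_le(x, widths):
    # Split x into chunks of the given bit-widths, little end first.
    chunks = []
    for w in widths:
        chunks.append(x & ((1 << w) - 1))
        x >>= w
    return chunks

def recombine_le(chunks, widths):
    # Inverse of split_le: the decomposition gate checks this sum equals W.
    x, shift = 0, 0
    for c, w in zip(chunks, widths):
        x += c << shift
        shift += w
    return x
```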
### Compression region
```plaintext
+----------------------------------------------------------+
| |
| decompose E, |
| Σ_1(E) |
| |
| +---------------------------------------+
| | |
| | reduce_5() to get H' |
| | |
+----------------------------------------------------------+
| decompose F, decompose G |
| |
| Ch(E,F,G) |
| |
+----------------------------------------------------------+
| |
| decompose A, |
| Σ_0(A) |
| |
| |
| +---------------------------------------+
| | |
| | reduce_7() to get A_new, |
| | using H' |
| | |
+------------------+---------------------------------------+
| decompose B, decompose C |
| |
| Maj(A,B,C) |
| |
| +---------------------------------------+
| | reduce_6() to get E_new, |
| | using H' |
+------------------+---------------------------------------+
```
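The round structure in the diagram ($H'$ from $\Sigma_1$, $Ch$, $K$, $W$; then $A_{new}$ and $E_{new}$) is easiest to follow against a plain-Python reference implementation of SHA-256; this is an out-of-circuit cross-check of the round arithmetic, not the chip itself:

```python
import hashlib  # used only to cross-check the result
import struct

# FIPS 180-4 round constants and initialization vector.
K = [
    0x428A2F98, 0x71374491, 0xB5C0FBCF, 0xE9B5DBA5, 0x3956C25B, 0x59F111F1,
    0x923F82A4, 0xAB1C5ED5, 0xD807AA98, 0x12835B01, 0x243185BE, 0x550C7DC3,
    0x72BE5D74, 0x80DEB1FE, 0x9BDC06A7, 0xC19BF174, 0xE49B69C1, 0xEFBE4786,
    0x0FC19DC6, 0x240CA1CC, 0x2DE92C6F, 0x4A7484AA, 0x5CB0A9DC, 0x76F988DA,
    0x983E5152, 0xA831C66D, 0xB00327C8, 0xBF597FC7, 0xC6E00BF3, 0xD5A79147,
    0x06CA6351, 0x14292967, 0x27B70A85, 0x2E1B2138, 0x4D2C6DFC, 0x53380D13,
    0x650A7354, 0x766A0ABB, 0x81C2C92E, 0x92722C85, 0xA2BFE8A1, 0xA81A664B,
    0xC24B8B70, 0xC76C51A3, 0xD192E819, 0xD6990624, 0xF40E3585, 0x106AA070,
    0x19A4C116, 0x1E376C08, 0x2748774C, 0x34B0BCB5, 0x391C0CB3, 0x4ED8AA4A,
    0x5B9CCA4F, 0x682E6FF3, 0x748F82EE, 0x78A5636F, 0x84C87814, 0x8CC70208,
    0x90BEFFFA, 0xA4506CEB, 0xBEF9A3F7, 0xC67178F2,
]
IV = [0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
      0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19]
MASK = 0xFFFFFFFF

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & MASK

def sha256_hex(msg):
    ml = 8 * len(msg)
    msg = msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64) + struct.pack(">Q", ml)
    h = IV[:]
    for off in range(0, len(msg), 64):
        w = list(struct.unpack(">16I", msg[off:off + 64]))
        for i in range(16, 64):  # message schedule
            s0 = rotr(w[i-15], 7) ^ rotr(w[i-15], 18) ^ (w[i-15] >> 3)
            s1 = rotr(w[i-2], 17) ^ rotr(w[i-2], 19) ^ (w[i-2] >> 10)
            w.append((w[i-16] + s0 + w[i-7] + s1) & MASK)
        a, b, c, d, e, f, g, hh = h
        for i in range(64):
            # H' = H + Sigma_1(E) + Ch(E, F, G) + K_i + W_i
            hp = (hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
                  + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
            # A_new = Sigma_0(A) + Maj(A, B, C) + H';  E_new = D + H'
            a_new = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
                     + ((a & b) ^ (a & c) ^ (b & c)) + hp) & MASK
            a, b, c, d, e, f, g, hh = a_new, a, b, c, (d + hp) & MASK, e, f, g
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(struct.pack(">I", x) for x in h).hex()
```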
#### Initial round:
s_digest|sd_abcd|sd_efgh|ss0|ss1|s_maj|s_ch_neg|s_ch|s_a_new |s_e_new |s_h_prime| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--------|-------|-------|---|---|-----|--------|----|--------|--------|---------|-------------|------------|-----------------------------|-------------------------------------|-------------------------------------|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|------------------------------------|
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$E_0 d(7)$ |$\texttt{spread}(E_0 d(7)) $ | $E_0 b(5)^{lo}$ | $\texttt{spread}(E_0 b(5)^{lo})$ | $E_0 b(5)^{hi}$ | $\texttt{spread}(E_0 b(5)^{hi}) $ | $E_0^{lo}$ | $\mathtt{spread}(E_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$E_0 c(14)$ |$\texttt{spread}(E_0 c(14))$ | $E_0 a(6)^{lo}$ | $\texttt{spread}(E_0 a(6)^{lo})$ | $E_0 a(6)^{hi}$ | $\texttt{spread}(E_0 a(6)^{hi}) $ | $E_0^{hi}$ | $\mathtt{spread}(E_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(E_0 b(5)^{lo})$ | $\texttt{spread}(E_0 b(5)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(E_0 d(7))$ | $\texttt{spread}(E_0 c(14))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(E_0 a(6)^{lo})$ | $\texttt{spread}(E_0 a(6)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$F_0 d(7)$ |$\texttt{spread}(F_0 d(7)) $ | $F_0 b(5)^{lo}$ | $\texttt{spread}(F_0 b(5)^{lo})$ | $F_0 b(5)^{hi}$ | $\texttt{spread}(F_0 b(5)^{hi}) $ | $F_0^{lo}$ | $\mathtt{spread}(F_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$F_0 c(14)$ |$\texttt{spread}(F_0 c(14))$ | $F_0 a(6)^{lo}$ | $\texttt{spread}(F_0 a(6)^{lo})$ | $F_0 a(6)^{hi}$ | $\texttt{spread}(F_0 a(6)^{hi}) $ | $F_0^{hi}$ | $\mathtt{spread}(F_0^{hi})$ | |
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$G_0 d(7)$ |$\texttt{spread}(G_0 d(7)) $ | $G_0 b(5)^{lo}$ | $\texttt{spread}(G_0 b(5)^{lo})$ | $G_0 b(5)^{hi}$ | $\texttt{spread}(G_0 b(5)^{hi}) $ | $G_0^{lo}$ | $\mathtt{spread}(G_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$G_0 c(14)$ |$\texttt{spread}(G_0 c(14))$ | $G_0 a(6)^{lo}$ | $\texttt{spread}(G_0 a(6)^{lo})$ | $G_0 a(6)^{hi}$ | $\texttt{spread}(G_0 a(6)^{hi}) $ | $G_0^{hi}$ | $\mathtt{spread}(G_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_0^{even}$|$\texttt{spread}(P_0^{even})$| $\mathtt{spread}(E^{lo})$ | $\mathtt{spread}(E^{hi})$ | $Q_0^{odd}$ | $K_0^{lo}$ | $H_0^{lo}$ | $W_0^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |{0,1,2,3,4,5}|$P_0^{odd}$ |$\texttt{spread}(P_0^{odd})$ | $\texttt{spread}(P_1^{odd})$ | $\Sigma_1(E_0)^{lo}$ | $\Sigma_1(E_0)^{hi}$ | $K_0^{hi}$ | $H_0^{hi}$ | $W_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{even}$|$\texttt{spread}(P_1^{even})$| $\mathtt{spread}(F^{lo})$ | $\mathtt{spread}(F^{hi})$ | $Q_1^{odd}$ | $P_1^{odd}$ | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | $Hprime_0 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{odd}$ |$\texttt{spread}(P_1^{odd})$ | | | | | $D_0^{lo}$ | $E_1^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |{0,1,2,3,4,5}|$Q_0^{even}$|$\texttt{spread}(Q_0^{even})$| $\mathtt{spread}(E_{neg}^{lo})$ | $\mathtt{spread}(E_{neg}^{hi})$ | $\mathtt{spread}(E^{lo})$ | | $D_0^{hi}$ | $E_1^{hi}$ | $E_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_0^{odd}$ |$\texttt{spread}(Q_0^{odd})$ | $\texttt{spread}(Q_1^{odd})$ | | $\mathtt{spread}(E^{hi})$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{even}$|$\texttt{spread}(Q_1^{even})$| $\mathtt{spread}(G^{lo})$ | $\mathtt{spread}(G^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{odd}$ |$\texttt{spread}(Q_1^{odd})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$A_0 b(11)$ |$\texttt{spread}(A_0 b(11))$ | $A_0 c(9)^{lo}$ | $\texttt{spread}(A_0 c(9)^{lo})$ | $A_0 c(9)^{mid}$ | $\texttt{spread}(A_0 c(9)^{mid})$ | $A_0^{lo}$ | $\mathtt{spread}(A_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$A_0 d(10)$ |$\texttt{spread}(A_0 d(10))$ | $A_0 a(2)$ | $\texttt{spread}(A_0 a(2))$ | $A_0 c(9)^{hi}$ | $\texttt{spread}(A_0 c(9)^{hi})$ | $A_0^{hi}$ | $\mathtt{spread}(A_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(c(9)^{lo})$ | $\texttt{spread}(c(9)^{mid})$ | | | | | |
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(10))$ | $\texttt{spread}(b(11))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(a(2))$ | $\texttt{spread}(c(9)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$B_0 b(11)$ |$\texttt{spread}(B_0 b(11))$ | $B_0 c(9)^{lo}$ | $\texttt{spread}(B_0 c(9)^{lo})$ | $B_0 c(9)^{mid}$ | $\texttt{spread}(B_0 c(9)^{mid})$ | $B_0^{lo}$ | $\mathtt{spread}(B_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$B_0 d(10)$ |$\texttt{spread}(B_0 d(10))$ | $B_0 a(2)$ | $\texttt{spread}(B_0 a(2))$ | $B_0 c(9)^{hi}$ | $\texttt{spread}(B_0 c(9)^{hi})$ | $B_0^{hi}$ | $\mathtt{spread}(B_0^{hi})$ | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$C_0 b(11)$ |$\texttt{spread}(C_0 b(11))$ | $C_0 c(9)^{lo}$ | $\texttt{spread}(C_0 c(9)^{lo})$ | $C_0 c(9)^{mid}$ | $\texttt{spread}(C_0 c(9)^{mid})$ | $C_0^{lo}$ | $\mathtt{spread}(C_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$C_0 d(10)$ |$\texttt{spread}(C_0 d(10))$ | $C_0 a(2)$ | $\texttt{spread}(C_0 a(2))$ | $C_0 c(9)^{hi}$ | $\texttt{spread}(C_0 c(9)^{hi})$ | $C_0^{hi}$ | $\mathtt{spread}(C_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{even}$|$\texttt{spread}(M_0^{even})$| $M_1^{odd}$ | $\mathtt{spread}(A_0^{lo})$ | $\mathtt{spread}(A_0^{hi})$ | | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{odd}$ |$\texttt{spread}(M_0^{odd})$ | $\texttt{spread}(M_1^{odd})$ | $\mathtt{spread}(B_0^{lo})$ | $\mathtt{spread}(B_0^{hi})$ | $\Sigma_0(A_0)^{lo}$ | | $A_1^{lo}$ | $A_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{even}$|$\texttt{spread}(M_1^{even})$| | $\mathtt{spread}(C_0^{lo})$ | $\mathtt{spread}(C_0^{hi})$ | $\Sigma_0(A_0)^{hi}$ | | $A_1^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{odd}$ |$\texttt{spread}(M_1^{odd})$ | | | | | | | |
#### Steady-state:
s_digest|sd_abcd|sd_efgh|ss0|ss1|s_maj|s_ch_neg|s_ch|s_a_new |s_e_new |s_h_prime| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--------|-------|-------|---|---|-----|--------|----|--------|--------|---------|-------------|------------|-----------------------------|-------------------------------------|-------------------------------------|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|------------------------------------|
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$E_0 d(7)$ |$\texttt{spread}(E_0 d(7)) $ | $E_0 b(5)^{lo}$ | $\texttt{spread}(E_0 b(5)^{lo})$ | $E_0 b(5)^{hi}$ | $\texttt{spread}(E_0 b(5)^{hi}) $ | $E_0^{lo}$ | $\mathtt{spread}(E_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$E_0 c(14)$ |$\texttt{spread}(E_0 c(14))$ | $E_0 a(6)^{lo}$ | $\texttt{spread}(E_0 a(6)^{lo})$ | $E_0 a(6)^{hi}$ | $\texttt{spread}(E_0 a(6)^{hi}) $ | $E_0^{hi}$ | $\mathtt{spread}(E_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(E_0 b(5)^{lo})$ | $\texttt{spread}(E_0 b(5)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(E_0 d(7))$ | $\texttt{spread}(E_0 c(14))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(E_0 a(6)^{lo})$ | $\texttt{spread}(E_0 a(6)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_0^{even}$|$\texttt{spread}(P_0^{even})$| $\mathtt{spread}(E^{lo})$ | $\mathtt{spread}(E^{hi})$ | $Q_0^{odd}$ | $K_0^{lo}$ | $H_0^{lo}$ | $W_0^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |{0,1,2,3,4,5}|$P_0^{odd}$ |$\texttt{spread}(P_0^{odd})$ | $\texttt{spread}(P_1^{odd})$ | $\Sigma_1(E_0)^{lo}$ | $\Sigma_1(E_0)^{hi}$ | $K_0^{hi}$ | $H_0^{hi}$ | $W_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{even}$|$\texttt{spread}(P_1^{even})$| $\mathtt{spread}(F^{lo})$ | $\mathtt{spread}(F^{hi})$ | $Q_1^{odd}$ | $P_1^{odd}$ | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | $Hprime_0 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{odd}$ |$\texttt{spread}(P_1^{odd})$ | | | | | $D_0^{lo}$ | $E_1^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |{0,1,2,3,4,5}|$Q_0^{even}$|$\texttt{spread}(Q_0^{even})$| $\mathtt{spread}(E_{neg}^{lo})$ | $\mathtt{spread}(E_{neg}^{hi})$ | $\mathtt{spread}(E^{lo})$ | | $D_0^{hi}$ | $E_1^{hi}$ | $E_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_0^{odd}$ |$\texttt{spread}(Q_0^{odd})$ | $\texttt{spread}(Q_1^{odd})$ | | $\mathtt{spread}(E^{hi})$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{even}$|$\texttt{spread}(Q_1^{even})$| $\mathtt{spread}(G^{lo})$ | $\mathtt{spread}(G^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{odd}$ |$\texttt{spread}(Q_1^{odd})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$A_0 b(11)$ |$\texttt{spread}(A_0 b(11))$ | $A_0 c(9)^{lo}$ | $\texttt{spread}(A_0 c(9)^{lo})$ | $A_0 c(9)^{mid}$ | $\texttt{spread}(A_0 c(9)^{mid})$ | $A_0^{lo}$ | $\mathtt{spread}(A_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$A_0 d(10)$ |$\texttt{spread}(A_0 d(10))$ | $A_0 a(2)$ | $\texttt{spread}(A_0 a(2))$ | $A_0 c(9)^{hi}$ | $\texttt{spread}(A_0 c(9)^{hi})$ | $A_0^{hi}$ | $\mathtt{spread}(A_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(c(9)^{lo})$ | $\texttt{spread}(c(9)^{mid})$ | | | | | |
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(10))$ | $\texttt{spread}(b(11))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(a(2))$ | $\texttt{spread}(c(9)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{even}$|$\texttt{spread}(M_0^{even})$| $M_1^{odd}$ | $\mathtt{spread}(A_0^{lo})$ | $\mathtt{spread}(A_0^{hi})$ | | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{odd}$ |$\texttt{spread}(M_0^{odd})$ | $\texttt{spread}(M_1^{odd})$ | $\mathtt{spread}(B_0^{lo})$ | $\mathtt{spread}(B_0^{hi})$ | $\Sigma_0(A_0)^{lo}$ | | $A_1^{lo}$ | $A_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{even}$|$\texttt{spread}(M_1^{even})$| | $\mathtt{spread}(C_0^{lo})$ | $\mathtt{spread}(C_0^{hi})$ | $\Sigma_0(A_0)^{hi}$ | | $A_1^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{odd}$ |$\texttt{spread}(M_1^{odd})$ | | | | | | | |
#### Final digest:
s_digest|sd_abcd|sd_efgh|ss0|ss1|s_maj|s_ch_neg|s_ch|s_a_new |s_e_new |s_h_prime| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--------|-------|-------|---|---|-----|--------|----|--------|--------|---------|-------------|-------------|------------------------------|-------------------------------------|-------------------------------------|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|------------------------------------|
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $A_{63}^{lo}$ | $A_{63}^{hi}$ | $A_{63}$ | $B_{63}^{lo}$ | $B_{63}^{hi}$ | $B_{63}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $C_{63}^{lo}$ | $C_{63}^{hi}$ | $C_{63}$ | $D_{63}^{lo}$ | $D_{63}^{hi}$ | $D_{63}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $E_{63}^{lo}$ | $E_{63}^{hi}$ | $E_{63}$ | $G_{63}^{lo}$ | $G_{63}^{hi}$ | $G_{63}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $F_{63}^{lo}$ | $F_{63}^{hi}$ | $F_{63}$ | $H_{63}^{lo}$ | $H_{63}^{hi}$ | $H_{63}$ | |
@ -1,33 +1 @@
# Implementation
## Proofs as opaque byte streams
In proving system implementations like `bellman`, there is a concrete `Proof` struct that
encapsulates the proof data, is returned by a prover, and can be passed to a verifier.
`halo2` does not contain any proof-like structures, for several reasons:
- A `Proof` structure would contain vectors of (vectors of) curve points and scalars.
This complicates serialization/deserialization of proofs because the lengths of these
vectors depend on the configuration of the circuit. However, we didn't want to encode
the lengths of vectors inside of proofs, because at runtime the circuit is fixed, and
thus so are the proof sizes.
- It's easy to accidentally put data into a `Proof` structure that isn't also placed in the
transcript, which is a hazard when developing and implementing a proving system.
- We needed to be able to create multiple PLONK proofs at the same time; such proofs
share many substructures when they are for the same circuit.
Instead, `halo2` treats proof objects as opaque byte streams. Creation and consumption of
these byte streams happens via the transcript:
- The `TranscriptWrite` trait represents something that we can write proof components to
(at proving time).
- The `TranscriptRead` trait represents something that we can read proof components from
(at verifying time).
Crucially, implementations of `TranscriptWrite` are responsible for writing to some
`std::io::Write` buffer at the same time that they hash things into the transcript, and
similarly for `TranscriptRead`/`std::io::Read`.
As a bonus, treating proofs as opaque byte streams ensures that verification accounts for
the cost of deserialization, which isn't negligible due to point compression.
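As a rough illustration of this pattern, the following sketch uses toy names (a
standard-library hasher stands in for the real transcript hash, and `ToyTranscriptWriter`
is not a halo2 type): each proof component is folded into a running hash *and* appended to
the underlying byte stream, while challenges are squeezed from the hash and never
serialized.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, Write};

/// Toy stand-in for `TranscriptWrite`: hashes every proof component into
/// the transcript while also writing it to the proof byte stream.
struct ToyTranscriptWriter<W: Write> {
    writer: W,
    hasher: DefaultHasher,
}

impl<W: Write> ToyTranscriptWriter<W> {
    fn new(writer: W) -> Self {
        Self { writer, hasher: DefaultHasher::new() }
    }

    /// Absorb a proof component: hash it, then append it to the stream.
    fn write_component(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.hasher.write(bytes);
        self.writer.write_all(bytes)
    }

    /// Derive a (toy) challenge from everything absorbed so far.
    fn squeeze_challenge(&mut self) -> u64 {
        let c = self.hasher.finish();
        self.hasher.write_u64(c); // fold the challenge back in
        c
    }
}

fn transcript_demo() -> (Vec<u8>, u64) {
    let mut proof: Vec<u8> = Vec::new();
    let challenge;
    {
        let mut t = ToyTranscriptWriter::new(&mut proof);
        t.write_component(b"commitment A").unwrap();
        challenge = t.squeeze_challenge();
        t.write_component(b"evaluation at x").unwrap();
    }
    (proof, challenge)
}

fn main() {
    let (proof, challenge) = transcript_demo();
    // The proof is just the concatenated byte stream; the challenge is
    // re-derived by the verifier rather than serialized.
    assert_eq!(proof, b"commitment Aevaluation at x".to_vec());
    // Same transcript contents, same challenge (Fiat-Shamir determinism).
    assert_eq!(challenge, transcript_demo().1);
}
```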

View File

@ -1,7 +1,7 @@
# Fields
The [Pasta curves](https://electriccoin.co/blog/the-pasta-curves-for-halo-2-and-beyond/)
that we use in `halo2` are designed to be highly 2-adic, meaning that a large $2^S$
are designed to be highly 2-adic, meaning that a large $2^S$
[multiplicative subgroup](../../background/fields.md#multiplicative-subgroups) exists in
each field. That is, we can write $p - 1 \equiv 2^S \cdot T$ with $T$ odd. For both Pallas
and Vesta, $S = 32$; this helps to simplify the field implementations.
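To make the claim $p - 1 \equiv 2^S \cdot T$ with $S = 32$ concrete, this sketch counts
the trailing zero bits of $p - 1$ directly from the hex encodings of the two moduli (the
hex values are the published Pasta moduli; the helper name is ours):

```rust
/// Count the trailing zero bits of (p - 1), given p as a big-endian hex
/// string whose least-significant hex digit is 1 (true for both moduli,
/// so subtracting 1 clears that digit with no borrow).
fn two_adicity(p_hex: &str) -> u32 {
    let mut digits: Vec<u32> = p_hex
        .chars()
        .map(|c| c.to_digit(16).unwrap())
        .collect();
    let last = digits.len() - 1;
    assert_eq!(digits[last], 1);
    digits[last] = 0; // p - 1
    let mut s = 0;
    for &d in digits.iter().rev() {
        if d == 0 {
            s += 4; // a zero hex digit contributes four zero bits
        } else {
            s += d.trailing_zeros();
            break;
        }
    }
    s
}

fn main() {
    // Pallas base field modulus p and Vesta base field modulus q.
    let p = "40000000000000000000000000000000224698fc094cf91b992d30ed00000001";
    let q = "40000000000000000000000000000000224698fc0994a8dd8c46eb2100000001";
    assert_eq!(two_adicity(p), 32);
    assert_eq!(two_adicity(q), 32);
}
```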
@ -9,8 +9,8 @@ and Vesta, $S = 32$; this helps to simplify the field implementations.
## Sarkar square-root algorithm (table-based variant)
We use a technique from [Sarkar2020](https://eprint.iacr.org/2020/1407.pdf) to compute
[square roots](../../background/fields.md#square-roots) in `halo2`. The intuition behind
the algorithm is that we can split the task into computing square roots in each
[square roots](../../background/fields.md#square-roots) in `pasta_curves`. The intuition
behind the algorithm is that we can split the task into computing square roots in each
multiplicative subgroup.
Suppose we want to find the square root of $u$ modulo one of the Pasta primes $p$, where
@ -1,74 +0,0 @@
# Proving system
The Halo 2 proving system can be broken down into five stages:
1. Commit to polynomials encoding the main components of the circuit:
- Cell assignments.
- Permuted values and products for each lookup argument.
- Equality constraint permutations.
2. Construct the vanishing argument to constrain all circuit relations to zero:
- Standard and custom gates.
- Lookup argument rules.
- Equality constraint permutation rules.
3. Evaluate the above polynomials at all necessary points:
- All relative rotations used by custom gates across all columns.
- Vanishing argument pieces.
4. Construct the multipoint opening argument to check that all evaluations are consistent
with their respective commitments.
5. Run the inner product argument to create a polynomial commitment opening proof for the
multipoint opening argument polynomial.
These stages are presented in turn across this section of the book.
## Example
To aid our explanations, we will at times refer to the following example constraint
system:
- Four advice columns $a, b, c, d$.
- One fixed column $f$.
- Three custom gates:
- $a \cdot b \cdot c_{-1} - d = 0$
- $f_{-1} \cdot c = 0$
- $f \cdot d \cdot a = 0$
## tl;dr
The table below provides a (probably too) succinct description of the Halo 2 protocol.
This description will likely be replaced by the Halo 2 paper and security proof, but for
now serves as a summary of the following sub-sections.
| Prover | | Verifier |
| --------------------------------------------------------------------------- | ------- | ---------------------------------- |
| | $\larr$ | $t(X) = (X^n - 1)$ |
| | $\larr$ | $F = [F_0, F_1, \dots, F_{m - 1}]$ |
| $\mathbf{A} = [A_0, A_1, \dots, A_{m - 1}]$ | $\rarr$ | |
| | $\larr$ | $\theta$ |
| $\mathbf{L} = [(A'_0, S'_0), \dots, (A'_{m - 1}, S'_{m - 1})]$ | $\rarr$ | |
| | $\larr$ | $\beta, \gamma$ |
| $\mathbf{Z_P} = [Z_{P,0}, Z_{P,1}, \ldots]$ | $\rarr$ | |
| $\mathbf{Z_L} = [Z_{L,0}, Z_{L,1}, \ldots]$ | $\rarr$ | |
| | $\larr$ | $y$ |
| $h(X) = \frac{\text{gate}_0(X) + \dots + y^i \cdot \text{gate}_i(X)}{t(X)}$ | | |
| $h(X) = h_0(X) + \dots + X^{n(d-1)} h_{d-1}(X)$ | | |
| $\mathbf{H} = [H_0, H_1, \dots, H_{d-1}]$ | $\rarr$ | |
| | $\larr$ | $x$ |
| $evals = [A_0(x), \dots, H_{d - 1}(x)]$ | $\rarr$ | |
| | | Checks $h(x)$ |
| | $\larr$ | $x_1, x_2$ |
| Constructs $h'(X)$ multipoint opening poly | | |
| $U = \text{Commit}(h'(X))$ | $\rarr$ | |
| | $\larr$ | $x_3$ |
| $\mathbf{q}_\text{evals} = [Q_0(x_3), Q_1(x_3), \dots]$ | $\rarr$ | |
| $u_\text{eval} = U(x_3)$ | $\rarr$ | |
| | $\larr$ | $x_4$ |
Then the prover and verifier:
- Construct $\text{finalPoly}(X)$ as a linear combination of $\mathbf{Q}$ and $U$ using
powers of $x_4$;
- Construct $\text{finalPolyEval}$ as the equivalent linear combination of
$\mathbf{q}_\text{evals}$ and $u_\text{eval}$; and
- Perform $\text{InnerProduct}(\text{finalPoly}(X), x_3, \text{finalPolyEval}).$
> TODO: Write up protocol components that provide zero-knowledge.
@ -1,102 +0,0 @@
# Circuit commitments
## Committing to the circuit assignments
At the start of proof creation, the prover has a table of cell assignments that it claims
satisfy the constraint system. The table has $n = 2^k$ rows, and is broken into advice,
instance, and fixed columns. We define $F_{i,j}$ as the assignment in the $j$th row of
the $i$th fixed column. Without loss of generality, we'll similarly define $A_{i,j}$ to
represent the advice and instance assignments.
> We separate fixed columns here because they are provided by the verifier, whereas the
> advice and instance columns are provided by the prover. In practice, the commitments to
> instance and fixed columns are computed by both the prover and verifier, and only the
> advice commitments are stored in the proof.
To commit to these assignments, we construct Lagrange polynomials of degree $n - 1$ for
each column, over an evaluation domain of size $n$ (where $\omega$ is the $n$th primitive
root of unity):
- $a_i(X)$ interpolates such that $a_i(\omega^j) = A_{i,j}$.
- $f_i(X)$ interpolates such that $f_i(\omega^j) = F_{i,j}$.
We then create a blinding commitment to the polynomial for each column:
$$\mathbf{A} = [\text{Commit}(a_0(X)), \dots, \text{Commit}(a_i(X))]$$
$$\mathbf{F} = [\text{Commit}(f_0(X)), \dots, \text{Commit}(f_i(X))]$$
$\mathbf{F}$ is constructed as part of key generation, using a blinding factor of $1$.
$\mathbf{A}$ is constructed by the prover and sent to the verifier.
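The interpolation step can be sketched over a toy field with a small root of unity
($p = 17$, $\omega = 4$ of order $4$); the real fields and the commitment scheme are of
course much larger, and all names below are ours:

```rust
const P: u128 = 17; // toy field: 4 divides P - 1, so ω = 4 has order 4

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn mul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }
fn pow(mut b: u128, mut e: u128) -> u128 {
    let mut acc = 1;
    while e > 0 { if e & 1 == 1 { acc = mul(acc, b); } b = mul(b, b); e >>= 1; }
    acc
}
fn inv(a: u128) -> u128 { pow(a, P - 2) } // Fermat's little theorem
fn eval(p: &[u128], x: u128) -> u128 {
    p.iter().rev().fold(0, |acc, &c| add(mul(acc, x), c)) // Horner
}

/// Lagrange interpolation: the unique polynomial of degree < n through
/// (xs[j], ys[j]), coefficients returned low degree first.
fn interpolate(xs: &[u128], ys: &[u128]) -> Vec<u128> {
    let n = xs.len();
    let mut out = vec![0u128; n];
    for j in 0..n {
        let mut num = vec![1u128]; // ∏_{k≠j} (X - x_k)
        let mut denom = 1u128;     // ∏_{k≠j} (x_j - x_k)
        for k in 0..n {
            if k == j { continue; }
            let mut next = vec![0u128; num.len() + 1];
            for (i, &coeff) in num.iter().enumerate() {
                next[i + 1] = add(next[i + 1], coeff);
                next[i] = add(next[i], mul(coeff, sub(0, xs[k])));
            }
            num = next;
            denom = mul(denom, sub(xs[j], xs[k]));
        }
        let scale = mul(ys[j], inv(denom));
        for i in 0..n {
            out[i] = add(out[i], mul(scale, num[i]));
        }
    }
    out
}

fn main() {
    let omega: u128 = 4;
    let column = [5u128, 11, 0, 9]; // one column of assignments A_{i,j}
    let xs: Vec<u128> = (0..4).map(|j| pow(omega, j)).collect();
    let a = interpolate(&xs, &column);
    // The committed polynomial satisfies a_i(ω^j) = A_{i,j} in every row.
    for j in 0..4 {
        assert_eq!(eval(&a, xs[j]), column[j]);
    }
}
```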
## Committing to the lookup permutations
The verifier starts by sampling $\theta$, which is used to keep individual columns within
lookups independent. Then, the prover commits to the permutations for each lookup as
follows:
- Given a lookup with input column polynomials $[A_0(X), \dots, A_{m-1}(X)]$ and table
column polynomials $[S_0(X), \dots, S_{m-1}(X)]$, the prover constructs two compressed
polynomials
$$A_\text{compressed}(X) = \theta^{m-1} A_0(X) + \theta^{m-2} A_1(X) + \dots + \theta A_{m-2}(X) + A_{m-1}(X)$$
$$S_\text{compressed}(X) = \theta^{m-1} S_0(X) + \theta^{m-2} S_1(X) + \dots + \theta S_{m-2}(X) + S_{m-1}(X)$$
- The prover then permutes $A_\text{compressed}(X)$ and $S_\text{compressed}(X)$ according
to the [rules of the lookup argument](lookup.md), obtaining $A'(X)$ and $S'(X)$.
Finally, the prover creates blinding commitments for all of the lookups
$$\mathbf{L} = \left[ \left(\text{Commit}(A'(X)), \text{Commit}(S'(X))\right), \dots \right]$$
and sends them to the verifier.
## Committing to the equality constraint permutations
The verifier samples $\beta$ and $\gamma$.
For each equality constraint argument:
- The prover constructs a vector $P$:
$$
P_j = \prod\limits_{i=0}^{m-1} \frac{p_i(\omega^j) + \beta \cdot \delta^i \cdot \omega^j + \gamma}{p_i(\omega^j) + \beta \cdot s_i(\omega^j) + \gamma}
$$
- The prover constructs a polynomial $Z_P$ which has a Lagrange basis representation
corresponding to a running product of $P$, starting at $Z_P(1) = 1$.
See the [Permutation argument](permutation.md#argument-specification) section for more detail.
The prover creates blinding commitments to each $Z_P$ polynomial:
$$\mathbf{Z_P} = \left[\text{Commit}(Z_P(X)), \dots \right]$$
and sends them to the verifier.
## Committing to the lookup permutation product columns
In addition to committing to the individual permuted lookups, for each lookup,
the prover needs to commit to the permutation product column:
- The prover constructs a vector $P$:
$$
P_j = \frac{(A_\text{compressed}(\omega^j) + \beta)(S_\text{compressed}(\omega^j) + \gamma)}{(A'(\omega^j) + \beta)(S'(\omega^j) + \gamma)}
$$
- The prover constructs a polynomial $Z_L$ which has a Lagrange basis representation
corresponding to a running product of $P$, starting at $Z_L(1) = 1$.
$\beta$ and $\gamma$ are used to combine the permutation arguments for $A'(X)$ and $S'(X)$
while keeping them independent. We can reuse $\beta$ and $\gamma$ from the equality
constraint permutation here because they serve the same purpose in both places, and we
aren't trying to combine the lookup and equality constraint permutation arguments. The
important thing here is that the verifier samples $\beta$ and $\gamma$ after the prover
has created $\mathbf{A}$, $\mathbf{F}$, and $\mathbf{L}$ (and thus committed to all the
cell values used in lookup columns, as well as $A'(X)$ and $S'(X)$ for each lookup).
As before, the prover creates blinding commitments to each $Z_L$ polynomial:
$$\mathbf{Z_L} = \left[\text{Commit}(Z_L(X)), \dots \right]$$
and sends them to the verifier.
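The grand-product mechanics can be sketched over a toy prime field (the modulus,
challenges, and column values below are arbitrary stand-ins, with single columns in place
of the compressed columns): because $A'$ and $S'$ are permutations of $A$ and $S$, the
running product of the $P_j$ terms returns to $1$ after the last row.

```rust
const P: u128 = 7919; // toy prime standing in for the real field modulus

fn mul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }
fn inv(a: u128) -> u128 {
    // Fermat's little theorem: a^(P-2) ≡ a^(-1) (mod P).
    let (mut b, mut e, mut acc) = (a % P, P - 2, 1);
    while e > 0 { if e & 1 == 1 { acc = mul(acc, b); } b = mul(b, b); e >>= 1; }
    acc
}

/// Z[0] = 1 and Z[j+1] = Z[j] · P_j, with P_j as in the text.
fn running_product(a: &[u128], s: &[u128], a_p: &[u128], s_p: &[u128],
                   beta: u128, gamma: u128) -> Vec<u128> {
    let mut z = vec![1u128];
    for j in 0..a.len() {
        let num = mul(a[j] + beta, s[j] + gamma);
        let den = mul(a_p[j] + beta, s_p[j] + gamma);
        z.push(mul(z[j], mul(num, inv(den))));
    }
    z
}

fn main() {
    // A' and S' are (arbitrary) permutations of A and S respectively.
    let a = [5u128, 3, 5, 7];
    let s = [3u128, 5, 7, 9];
    let a_p = [3u128, 5, 5, 7];
    let s_p = [9u128, 5, 3, 7];
    let z = running_product(&a, &s, &a_p, &s_p, 11, 13);
    // The numerator and denominator multisets match, so the running
    // product returns to 1 after the final row.
    assert_eq!(*z.last().unwrap(), 1);
}
```

Note that this only demonstrates the product column itself; the lookup rules constraining
*how* $A'$ and $S'$ must be arranged are covered in the lookup argument section.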

View File

@ -1,54 +0,0 @@
# Comparison to other work
## BCMS20 Appendix A.2
Appendix A.2 of [BCMS20] describes a polynomial commitment scheme that is similar to the
one described in [BGH19] (BCMS20 being a generalization of the original Halo paper). Halo
2 builds on both of these works, and thus itself uses a polynomial commitment scheme that
is very similar to the one in BCMS20.
[BGH19]: https://eprint.iacr.org/2019/1021
[BCMS20]: https://eprint.iacr.org/2020/499
The following table provides a mapping between the variable names in BCMS20, and the
equivalent objects in Halo 2 (which builds on the nomenclature from the Halo paper):
| BCMS20 | Halo 2 |
| :------------: | :-----------------: |
| $S$ | $H$ |
| $H$ | $U$ |
| $C$ | `msm` or $P$ |
| $\alpha$ | $\iota$ |
| $\xi_0$ | $z$ |
| $\xi_i$ | `challenge_i` |
| $H'$ | $[z] U$ |
| $\bar{p}$ | `s_poly` |
| $\bar{\omega}$ | `s_poly_blind` |
| $\bar{C}$ | `s_poly_commitment` |
| $h(X)$ | $g(X)$ |
| $\omega'$ | `blind` / $\xi$ |
| $\mathbf{c}$ | $\mathbf{a}$ |
| $c$ | $a = \mathbf{a}_0$ |
| $v'$ | $ab$ |
Halo 2's polynomial commitment scheme differs from Appendix A.2 of BCMS20 in two ways:
1. Step 8 of the $\text{Open}$ algorithm computes a "non-hiding" commitment $C'$ prior to
the inner product argument, which opens to the same value as $C$ but is a commitment to
a randomly-drawn polynomial. The remainder of the protocol involves no blinding. By
contrast, in Halo 2 we blind every single commitment that we make (even for instance
and fixed polynomials, though using a blinding factor of 1 for the fixed polynomials);
this makes the protocol simpler to reason about. As a consequence of this, the verifier
needs to handle the cumulative blinding factor at the end of the protocol, and so there
is no need to derive an equivalent to $C'$ at the start of the protocol.
- $C'$ is also an input to the random oracle for $\xi_0$; in Halo 2 we utilize a
transcript that has already committed to the equivalent components of $C'$ prior to
sampling $z$.
2. The $\text{PC}_\text{DL}.\text{SuccinctCheck}$ subroutine (Figure 2 of BCMS20) computes
the initial group element $C_0$ by adding $[v] H' = [v \epsilon] H$, which requires two
scalar multiplications. Instead, we subtract $[v] G_0$ from the original commitment $P$,
so that we're effectively opening the polynomial at the point to the value zero. The
computation $[v] G_0$ is more efficient in the context of recursion because $G_0$ is a
fixed base (so we can use lookup tables).
@ -1,11 +0,0 @@
# Inner product argument
Halo 2 uses a polynomial commitment scheme for which we can create polynomial commitment
opening proofs, based around the Inner Product Argument.
> TODO: Explain Halo 2's variant of the IPA.
>
> It is very similar to $\text{PC}_\text{DL}.\text{Open}$ from Appendix A.2 of [BCMS20].
> See [this comparison](comparison.md#bcms20-appendix-a2) for details.
>
> [BCMS20]: https://eprint.iacr.org/2020/499
@ -1,111 +0,0 @@
# Lookup argument
halo2 uses the following lookup technique, which allows for lookups in arbitrary sets, and
is arguably simpler than Plookup.
## Note on Language
In addition to the [general notes on language](../design.md#note-on-language):
- We call the $Z(X)$ polynomial (the grand product argument polynomial for the permutation
argument) the "permutation product" column.
## Technique Description
We express lookups in terms of a "subset argument" over a table with $2^k$ rows (numbered
from 0), and columns $A$ and $S$.
The goal of the subset argument is to enforce that every cell in $A$ is equal to _some_
cell in $S$. This means that more than one cell in $A$ can be equal to the _same_ cell in
$S$, and some cells in $S$ don't need to be equal to any of the cells in $A$.
- $S$ might be fixed, but it doesn't need to be. That is, we can support looking up values
in either fixed or variable tables (where the latter includes advice columns).
- $A$ and $S$ can contain duplicates. If the sets represented by $A$ and/or $S$ are not
naturally of size $2^k$, we extend $S$ with duplicates and $A$ with dummy values known
to be in $S$.
- Alternatively we could add a "lookup selector" that controls which elements of the $A$
column participate in lookups. This would modify the occurrence of $A(X)$ in the
permutation rule below to replace $A$ with, say, $S_0$ if a lookup is not selected.
Let $\ell_i$ be the Lagrange basis polynomial that evaluates to $1$ at row $i$, and $0$
otherwise.
We start by allowing the prover to supply permutation columns of $A$ and $S$. Let's call
these $A'$ and $S'$, respectively. We can enforce that they are permutations using a
permutation argument with product column $Z$ with the rules:
$$
Z(X) (A(X) + \beta) (S(X) + \gamma) - Z(\omega^{-1} X) (A'(X) + \beta) (S'(X) + \gamma) = 0
$$$$
\ell_0(X) (Z(X) - 1) = 0
$$
This is a version of the permutation argument which allows $A'$ and $S'$ to be
permutations of $A$ and $S$, respectively, but doesn't specify the exact permutations.
$\beta$ and $\gamma$ are separate challenges so that we can combine these two permutation
arguments into one without worrying that they might interfere with each other.
The goal of these permutations is to allow $A'$ and $S'$ to be arranged by the prover in a
particular way:
1. All the cells of column $A'$ are arranged so that like-valued cells are vertically
adjacent to each other. This could be done by some kind of sorting algorithm, but all
that matters is that like-valued cells are on consecutive rows in column $A'$, and that
$A'$ is a permutation of $A$.
2. The first row in a sequence of like values in $A'$ is the row that has the
corresponding value in $S'.$ Apart from this constraint, $S'$ is any arbitrary
permutation of $S$.
Now, we'll enforce that either $A'_i = S'_i$ or that $A'_i = A'_{i-1}$, using the rule
$$
(A'(X) - S'(X)) \cdot (A'(X) - A'(\omega^{-1} X)) = 0
$$
In addition, we enforce $A'_0 = S'_0$ using the rule
$$
\ell_0(X) \cdot (A'(X) - S'(X)) = 0
$$
Together these constraints effectively force every element in $A'$ (and thus $A$) to equal
at least one element in $S'$ (and thus $S$). Proof: by induction on prefixes of the rows.
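The arrangement rules and row constraints can be sketched with plain integers standing in
for field elements (the function names are ours, and $A$ must only contain values present
in $S$):

```rust
/// Arrange A' and S' according to the two rules above.
fn arrange(a: &[i64], s: &[i64]) -> (Vec<i64>, Vec<i64>) {
    let mut a_p = a.to_vec();
    a_p.sort(); // rule 1: like-valued cells become vertically adjacent
    let mut s_rest = s.to_vec();
    let mut s_p = vec![0i64; a.len()];
    // Rule 2: the first row of each run of equal values in A' holds the
    // corresponding value in S'.
    for i in 0..a_p.len() {
        if i == 0 || a_p[i] != a_p[i - 1] {
            let k = s_rest.iter().position(|&v| v == a_p[i]).unwrap();
            s_p[i] = s_rest.remove(k);
        }
    }
    // The remaining rows of S' hold the unused table values, in any order.
    let mut leftover = s_rest.into_iter();
    for i in 1..s_p.len() {
        if a_p[i] == a_p[i - 1] { s_p[i] = leftover.next().unwrap(); }
    }
    (a_p, s_p)
}

/// Check the two polynomial rules row by row.
fn check(a_p: &[i64], s_p: &[i64]) -> bool {
    // ℓ_0 rule: A'_0 = S'_0.
    if a_p[0] != s_p[0] { return false; }
    // Main rule: (A'_i - S'_i) · (A'_i - A'_{i-1}) = 0 for i > 0.
    (1..a_p.len()).all(|i| (a_p[i] - s_p[i]) * (a_p[i] - a_p[i - 1]) == 0)
}

fn main() {
    let s = [1i64, 2, 4, 8]; // the table (here of size 2^k = 4)
    let a = [2i64, 8, 2, 2]; // every cell of A appears somewhere in S
    let (a_p, s_p) = arrange(&a, &s);
    assert!(check(&a_p, &s_p));
    // Tampering with a cell of A' breaks the constraints.
    let mut bad = a_p.clone();
    bad[2] = 3;
    assert!(!check(&bad, &s_p));
}
```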
## Cost
* There is the original column $A$ and the fixed column $S$.
* There is a permutation product column $Z$.
* There are the two permutations $A'$ and $S'$.
* The gates are all of low degree.
## Generalizations
halo2's lookup argument implementation generalizes the above technique in the following
ways:
- $A$ and $S$ can be extended to multiple columns, combined using a random challenge. $A'$
and $S'$ stay as single columns.
- The commitments to the columns of $S$ can be precomputed, then combined cheaply once
the challenge is known by taking advantage of the homomorphic property of Pedersen
commitments.
- The columns of $A$ can be given as arbitrary polynomial expressions using relative
references. These will be substituted into the product column constraint, subject to
the maximum degree bound. This potentially saves one or more advice columns.
- Then, a lookup argument for an arbitrary-width relation can be implemented in terms of a
subset argument, i.e. to constrain $\mathcal{R}(x, y, ...)$ in each row, consider
$\mathcal{R}$ as a set of tuples $S$ (using the method of the previous point), and check
that $(x, y, ...) \in \mathcal{R}$.
- In the case where $\mathcal{R}$ represents a function, this implicitly also checks
that the inputs are in the domain. This is typically what we want, and often saves an
additional range check.
- We can support multiple tables in the same circuit, by combining them into a single
table that includes a tag column to identify the original table.
- The tag column could be merged with the "lookup selector" mentioned earlier, if this
were implemented.
These generalizations are similar to those in sections 4 and 5 of the
[Plookup paper](https://eprint.iacr.org/2020/315.pdf). That is, the differences from
Plookup are in the subset argument. This argument can then be used in all the same ways;
for instance, the optimized range check technique in section 5 of the Plookup paper can
also be used with this subset argument.
@ -1,93 +0,0 @@
# Multipoint opening argument
Consider the commitments $A, B, C, D$ to polynomials $a(X), b(X), c(X), d(X)$.
Let's say that $a$ and $b$ were queried at the point $x$, while $c$ and $d$
were queried at both points $x$ and $\omega x$. (Here, $\omega$ is the primitive
root of unity in the multiplicative subgroup over which we constructed the
polynomials).
To open these commitments, we could create a polynomial $Q$ for each point that we queried
at (corresponding to each relative rotation used in the circuit). But this would not be
efficient in the circuit; for example, $c(X)$ would appear in multiple polynomials.
Instead, we can group the commitments by the sets of points at which they were queried:
$$
\begin{array}{cccc}
&\{x\}& &\{x, \omega x\}& \\
&A& &C& \\
&B& &D&
\end{array}
$$
For each of these groups, we combine them into a polynomial set, and create a single $Q$
for that set, which we open at each rotation.
## Optimisation steps
The multipoint opening optimisation takes as input:
- A random $x$ sampled by the verifier, at which we evaluate $a(X), b(X), c(X), d(X)$.
- Evaluations of each polynomial at each point of interest, provided by the prover:
$a(x), b(x), c(x), d(x), c(\omega x), d(\omega x)$
These are the outputs of the [vanishing argument](vanishing.md#evaluating-the-polynomials).
The multipoint opening optimisation proceeds as such:
1. Sample random $x_1$, to keep $a, b, c, d$ linearly independent.
2. Accumulate polynomials and their corresponding evaluations according
to the point set at which they were queried:
`q_polys`:
$$
\begin{array}{rccl}
q_1(X) &=& a(X) &+& x_1 b(X) \\
q_2(X) &=& c(X) &+& x_1 d(X)
\end{array}
$$
`q_eval_sets`:
```math
[
[a(x) + x_1 b(x)],
[
c(x) + x_1 d(x),
c(\omega x) + x_1 d(\omega x)
]
]
```
NB: `q_eval_sets` is a vector of sets of evaluations, where the outer vector
goes over the point sets, and the inner vector goes over the points in each set.
3. Interpolate each set of values in `q_eval_sets`:
`r_polys`:
$$
\begin{array}{cccc}
r_1(X) s.t.&&& \\
&r_1(x) &=& a(x) + x_1 b(x) \\
r_2(X) s.t.&&& \\
&r_2(x) &=& c(x) + x_1 d(x) \\
&r_2(\omega x) &=& c(\omega x) + x_1 d(\omega x) \\
\end{array}
$$
4. Construct `f_polys` which check the correctness of `q_polys`:
`f_polys`
$$
\begin{array}{rcl}
f_1(X) &=& \frac{ q_1(X) - r_1(X)}{X - x} \\
f_2(X) &=& \frac{ q_2(X) - r_2(X)}{(X - x)(X - \omega x)} \\
\end{array}
$$
If $q_1(x) = r_1(x)$, then $f_1(X)$ should be a polynomial.
If $q_2(x) = r_2(x)$ and $q_2(\omega x) = r_2(\omega x)$
then $f_2(X)$ should be a polynomial.
5. Sample random $x_2$ to keep the `f_polys` linearly independent.
6. Construct $f(X) = f_1(X) + x_2 f_2(X)$.
7. Sample random $x_3$, at which we evaluate $f(X)$:
$$
\begin{array}{rcccl}
f(x_3) &=& f_1(x_3) &+& x_2 f_2(x_3) \\
&=& \frac{q_1(x_3) - r_1(x_3)}{x_3 - x} &+& x_2\frac{q_2(x_3) - r_2(x_3)}{(x_3 - x)(x_3 - \omega x)}
\end{array}
$$
8. Sample random $x_4$ to keep $f(X)$ and `q_polys` linearly independent.
9. Construct `final_poly`, $$final\_poly(X) = f(X) + x_4 q_1(X) + x_4^2 q_2(X),$$
which is the polynomial we commit to in the inner product argument.
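The optimisation steps can be checked numerically over a toy prime field; all polynomials,
points, and challenges below are arbitrary stand-ins. The exact divisions in step 4
succeed precisely because the $r$ polynomials interpolate the claimed evaluations:

```rust
const P: u128 = 7919; // toy prime standing in for the real field modulus

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn mul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }
fn inv(a: u128) -> u128 {
    let (mut b, mut e, mut acc) = (a % P, P - 2, 1);
    while e > 0 { if e & 1 == 1 { acc = mul(acc, b); } b = mul(b, b); e >>= 1; }
    acc
}
fn eval(p: &[u128], x: u128) -> u128 {
    p.iter().rev().fold(0, |acc, &c| add(mul(acc, x), c))
}
/// p + k·q, coefficient-wise (low degree first).
fn poly_add(p: &[u128], q: &[u128], k: u128) -> Vec<u128> {
    (0..p.len().max(q.len()))
        .map(|i| add(*p.get(i).unwrap_or(&0), mul(k, *q.get(i).unwrap_or(&0))))
        .collect()
}
/// Synthetic division by (X - x); panics unless the remainder is zero.
fn div_linear(p: &[u128], x: u128) -> Vec<u128> {
    let mut q = vec![0u128; p.len() - 1];
    let mut carry = 0u128;
    for i in (1..p.len()).rev() {
        carry = add(p[i], mul(x, carry));
        q[i - 1] = carry;
    }
    assert_eq!(add(p[0], mul(x, carry)), 0, "division is not exact");
    q
}

fn demo() -> bool {
    let (a, b) = (vec![3u128, 1, 4], vec![1u128, 5, 9]);
    let (c, d) = (vec![2u128, 6, 5], vec![3u128, 5, 8]);
    let (x, wx) = (10u128, 20u128); // the two query points x and ωx
    let (x1, x2, x3, x4) = (7u128, 11u128, 13u128, 17u128);
    // Step 2: q_polys.
    let q1 = poly_add(&a, &b, x1);
    let q2 = poly_add(&c, &d, x1);
    // Step 3: r_polys interpolating the evaluations at each point set.
    let r1 = vec![eval(&q1, x)]; // constant through (x, q1(x))
    let (e0, e1) = (eval(&q2, x), eval(&q2, wx));
    let slope = mul(sub(e1, e0), inv(sub(wx, x)));
    let r2 = vec![sub(e0, mul(slope, x)), slope]; // line through both points
    // Step 4: f_polys — the divisions are exact (P - 1 plays the role of -1).
    let f1 = div_linear(&poly_add(&q1, &r1, P - 1), x);
    let f2 = div_linear(&div_linear(&poly_add(&q2, &r2, P - 1), x), wx);
    // Steps 6-7: f(x3) matches the quotient expression.
    let f = poly_add(&f1, &f2, x2);
    let rhs = add(
        mul(sub(eval(&q1, x3), eval(&r1, x3)), inv(sub(x3, x))),
        mul(x2, mul(sub(eval(&q2, x3), eval(&r2, x3)),
                    inv(mul(sub(x3, x), sub(x3, wx))))),
    );
    assert_eq!(eval(&f, x3), rhs);
    // Step 9: final_poly = f + x4·q1 + x4²·q2.
    let final_poly = poly_add(&poly_add(&f, &q1, x4), &q2, mul(x4, x4));
    assert_eq!(
        eval(&final_poly, x3),
        add(eval(&f, x3),
            add(mul(x4, eval(&q1, x3)), mul(mul(x4, x4), eval(&q2, x3))))
    );
    true
}

fn main() { assert!(demo()); }
```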
@ -1,162 +0,0 @@
# Permutation argument
Given that gates in halo2 circuits operate "locally" (on cells in the current row or at
defined relative row offsets), it is common to need to copy a value from some arbitrary cell into
the current row for use in a gate. This is performed with an equality constraint, which
enforces that the source and destination cells contain the same value.
We implement these equality constraints by constructing a permutation that represents the
constraints, and then using a permutation argument within the proof to enforce them.
## Notation
A permutation is a one-to-one and onto mapping of a set onto itself. A permutation can be
factored uniquely into a composition of cycles (up to ordering of cycles, and rotation of
each cycle).
We sometimes use [cycle notation](https://en.wikipedia.org/wiki/Permutation#Cycle_notation)
to write permutations. Let $(a\ b\ c)$ denote a cycle where $a$ maps to $b$, $b$ maps to
$c$, and $c$ maps to $a$ (with the obvious generalisation to arbitrary-sized cycles).
Writing two or more cycles next to each other denotes a composition of the corresponding
permutations. For example, $(a\ b)\ (c\ d)$ denotes the permutation that maps $a$ to $b$,
$b$ to $a$, $c$ to $d$, and $d$ to $c$.
## Constructing the permutation
### Goal
We want to construct a permutation in which the variables within each equality-constraint
set form a cycle. For example, suppose that we have a circuit that defines the following
equality constraints:
- $a \equiv b$
- $a \equiv c$
- $d \equiv e$
From this we have the equality-constraint sets $\{a, b, c\}$ and $\{d, e\}$. We want to
construct the permutation:
$$(a\ b\ c)\ (d\ e)$$
which defines the mapping of $[a, b, c, d, e]$ to $[b, c, a, e, d]$.
### Algorithm
We need to keep track of the set of cycles, which is a
[set of disjoint sets](https://en.wikipedia.org/wiki/Disjoint-set_data_structure).
Efficient data structures for this problem are known; for the sake of simplicity we choose
one that is not asymptotically optimal but is easy to implement.
We represent the current state as:
- an array $\mathsf{mapping}$ for the permutation itself;
- an auxiliary array $\mathsf{aux}$ that keeps track of a distinguished element of each
cycle;
- another array $\mathsf{sizes}$ that keeps track of the size of each cycle.
We have the invariant that for each element $x$ in a given cycle $C$, $\mathsf{aux}(x)$
points to the same element $c \in C$. This allows us to quickly decide whether two given
elements $x$ and $y$ are in the same cycle, by checking whether
$\mathsf{aux}(x) = \mathsf{aux}(y)$. Also, $\mathsf{sizes}(\mathsf{aux}(x))$ gives the
size of the cycle containing $x$. (This is guaranteed only for
$\mathsf{sizes}(\mathsf{aux}(x))$, not for $\mathsf{sizes}(x)$.)
The algorithm starts with a representation of the identity permutation:
for all $x$, we set $\mathsf{mapping}(x) = x$, $\mathsf{aux}(x) = x$, and
$\mathsf{sizes}(x) = 1$.
To add an equality constraint $\mathit{left} \equiv \mathit{right}$:
1. Check whether $\mathit{left}$ and $\mathit{right}$ are already in the same cycle, i.e.
whether $\mathsf{aux}(\mathit{left}) = \mathsf{aux}(\mathit{right})$. If so, there is
nothing to do.
2. Otherwise, $\mathit{left}$ and $\mathit{right}$ belong to different cycles. Make
$\mathit{left}$ the larger cycle and $\mathit{right}$ the smaller one, by swapping them
iff $\mathsf{sizes}(\mathsf{aux}(\mathit{left})) < \mathsf{sizes}(\mathsf{aux}(\mathit{right}))$.
3. Following the mapping around the right (smaller) cycle, for each element $x$ set
$\mathsf{aux}(x) = \mathsf{aux}(\mathit{left})$.
4. Splice the smaller cycle into the larger one by swapping $\mathsf{mapping}(\mathit{left})$
with $\mathsf{mapping}(\mathit{right})$.
For example, given two disjoint cycles $(A\ B\ C\ D)$ and $(E\ F\ G\ H)$:
```plaintext
A +---> B
^ +
| |
+ v
D <---+ C E +---> F
^ +
| |
+ v
H <---+ G
```
After adding constraint $B \equiv E$ the above algorithm produces the cycle:
```plaintext
A +---> B +-------------+
^ |
| |
+ v
D <---+ C <---+ E F
^ +
| |
+ v
H <---+ G
```
### Broken alternatives
If we did not check whether $\mathit{left}$ and $\mathit{right}$ were already in the same
cycle, then we could end up undoing an equality constraint. For example, if we have the
following constraints:
- $a \equiv b$
- $b \equiv c$
- $c \equiv d$
- $b \equiv d$
and we tried to implement adding an equality constraint just using step 4 of the above
algorithm, then we would end up constructing the cycle $(a\ b)\ (c\ d)$, rather than the
correct $(a\ b\ c\ d)$.
## Argument specification
We need to represent permutations over $m$ columns, represented by polynomials $p_0, \ldots, p_{m-1}$.
We first assign a unique element of $\mathbb{F}^\times$ as an "extended domain" element for each cell
that can participate in the permutation argument.
Let $\omega$ be a primitive $2^k$-th root of unity and let $\delta$ be a primitive
$T$-th root of unity, where $T \cdot 2^S + 1 = p$ with $T$ odd and $k \leq S$.
We will use $\delta^i \cdot \omega^j \in \mathbb{F}^\times$ as the extended domain element for the
cell in the $j$th row of the $i$th column of the permutation argument.
If we have a permutation $\sigma(\mathsf{column}: i, \mathsf{row}: j) = (\mathsf{column}: i', \mathsf{row}: j')$,
we can represent it as a vector of $m$ polynomials $s_i(X)$ such that $s_i(\omega^j) = \delta^{i'} \cdot \omega^{j'}$.
Notice that the identity permutation can be represented by the vector of $m$ polynomials
$\mathsf{ID}_i(X)$ such that $\mathsf{ID}_i(X) = \delta^i \cdot X$.
Now given our permutation represented by $s_0, \ldots, s_{m-1}$, over advice columns represented by
$p_0, \ldots, p_{m-1}$, we want to ensure that:
$$
\prod\limits_{i=0}^{m-1} \prod\limits_{j=0}^{n-1} \left(\frac{p_i(\omega^j) + \beta \cdot \delta^i \cdot \omega^j + \gamma}{p_i(\omega^j) + \beta \cdot s_i(\omega^j) + \gamma}\right) = 1
$$
Let $Z_P$ be such that $Z_P(\omega^0) = Z_P(\omega^n) = 1$ and for $0 \leq j < n$:
$$\begin{array}{rl}
Z_P(\omega^{j+1}) &= \prod\limits_{h=0}^{j} \prod\limits_{i=0}^{m-1} \frac{p_i(\omega^h) + \beta \cdot \delta^i \cdot \omega^h + \gamma}{p_i(\omega^h) + \beta \cdot s_i(\omega^h) + \gamma} \\
&= Z_P(\omega^j) \prod\limits_{i=0}^{m-1} \frac{p_i(\omega^j) + \beta \cdot \delta^i \cdot \omega^j + \gamma}{p_i(\omega^j) + \beta \cdot s_i(\omega^j) + \gamma}
\end{array}$$
Then it is sufficient to enforce the constraints:
$$
l_0 \cdot (Z_P(X) - 1) = 0 \\
Z_P(\omega X) \cdot \prod\limits_{i=0}^{m-1} \left(p_i(X) + \beta \cdot s_i(X) + \gamma\right) - Z_P(X) \cdot \prod\limits_{i=0}^{m-1} \left(p_i(X) + \beta \cdot \delta^i \cdot X + \gamma\right) = 0
$$
> The optimization used to obtain the simple representation of the identity permutation was suggested
> by Vitalik Buterin for PLONK, and is described at the end of section 8 of the PLONK paper. Note that
> the $\delta^i$ are all distinct quadratic non-residues.
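The grand product $Z_P$ defined above can be built column by column. A minimal sketch,
over a toy prime field ($p = 7919$ here purely for illustration; the real argument works
over the proof system's scalar field), where `cols` holds the values $p_i(\omega^j)$,
`id` the identity values $\delta^i \cdot \omega^j$, and `perm` the permuted values
$s_i(\omega^j)$:

```rust
const P: u64 = 7919; // toy prime, for illustration only

// Modular exponentiation by squaring.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

// Inverse via Fermat's little theorem (x nonzero mod P).
fn inv(x: u64) -> u64 {
    pow_mod(x, P - 2)
}

// Build the running-product column Z_P(ω^0), ..., Z_P(ω^n), which should
// start and end at 1 when the permutation is satisfied.
fn grand_product(
    cols: &[Vec<u64>],
    id: &[Vec<u64>],
    perm: &[Vec<u64>],
    beta: u64,
    gamma: u64,
) -> Vec<u64> {
    let n = cols[0].len();
    let mut z = vec![1u64; n + 1];
    for j in 0..n {
        let mut factor = 1u64;
        for i in 0..cols.len() {
            let num = (cols[i][j] + beta * id[i][j] + gamma) % P;
            let den = (cols[i][j] + beta * perm[i][j] + gamma) % P;
            factor = factor * num % P * inv(den) % P;
        }
        z[j + 1] = z[j] * factor % P;
    }
    z
}
```

For a genuine permutation in which equality-constrained cells hold equal values, the
numerator and denominator terms pair up across the product, so the final entry returns
to $1$.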
# Vanishing argument
Having committed to the circuit assignments, the prover now needs to demonstrate that the
various circuit relations are satisfied:
- The custom gates, represented by polynomials $\text{gate}_i(X)$.
- The rules of the lookup arguments.
- The rules of the equality constraint permutations.
Each of these relations is represented as a polynomial of degree at most $d$ (the maximum degree
of any of the relations) with respect to the circuit columns. Given that the degree of the
assignment polynomials for each column is $n - 1$, the relation polynomials have degree
$d(n - 1)$ with respect to $X$.
> In our [example](../proving-system.md#example), these would be the gate polynomials, of
> degree $3n - 3$:
>
> - $\text{gate}_0(X) = a_0(X) \cdot a_1(X) \cdot a_2(X \omega^{-1}) - a_3(X)$
> - $\text{gate}_1(X) = f_0(X \omega^{-1}) \cdot a_2(X)$
> - $\text{gate}_2(X) = f_0(X) \cdot a_3(X) \cdot a_0(X)$
A relation is satisfied if its polynomial is equal to zero. One way to demonstrate this is
to divide each polynomial relation by the vanishing polynomial $t(X) = (X^n - 1)$, which
is the lowest-degree monic polynomial that has roots at every $\omega^i$. If a relation's
polynomial is perfectly divisible by $t(X)$, it is equal to zero over the domain (as desired).
This simple construction would require a polynomial commitment per relation. Instead, we
commit to all of the circuit relations simultaneously: the verifier samples $y$, and then
the prover constructs the quotient polynomial
$$h(X) = \frac{\text{gate}_0(X) + y \cdot \text{gate}_1(X) + \dots + y^i \cdot \text{gate}_i(X) + \dots}{t(X)},$$
where the numerator is a random linear combination of the circuit relations (random
because the prover commits to the cell assignments before the verifier samples $y$).
- If the numerator polynomial (in formal indeterminate $X$) is perfectly divisible by
$t(X)$, then with high probability all relations are satisfied.
- Conversely, if at least one relation is not satisfied, then with high probability
$h(x) \cdot t(x)$ will not equal the evaluation of the numerator at $x$. In this case,
the numerator polynomial would not be perfectly divisible by $t(X)$.
## Committing to $h(X)$
$h(X)$ has degree $(d - 1)n - d$ (because the divisor $t(X)$ has degree $n$). However, the
polynomial commitment scheme we use for Halo 2 only supports committing to polynomials of
degree $n - 1$ (which is the maximum degree that the rest of the protocol needs to commit
to). Instead of increasing the cost of the polynomial commitment scheme, the prover splits
$h(X)$ into pieces of degree at most $n - 1$
$$h_0(X) + X^n h_1(X) + \dots + X^{n(d-1)} h_{d-1}(X),$$
and produces blinding commitments to each piece
$$\mathbf{H} = [\text{Commit}(h_0(X)), \text{Commit}(h_1(X)), \dots, \text{Commit}(h_{d-1}(X))].$$
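The splitting step amounts to chunking the coefficient vector of $h(X)$. A sketch (with
`u64` coefficients standing in for field elements, and the commitment step omitted):

```rust
// Split a coefficient vector for h(X) into pieces of at most n coefficients
// each (i.e. degree at most n-1), so that
//   h(X) = h_0(X) + X^n · h_1(X) + X^{2n} · h_2(X) + ...
// The final piece is zero-padded to a full n coefficients.
fn split_quotient(h: &[u64], n: usize) -> Vec<Vec<u64>> {
    h.chunks(n)
        .map(|piece| {
            let mut p = piece.to_vec();
            p.resize(n, 0);
            p
        })
        .collect()
}
```

Each piece would then be committed to separately, producing the vector $\mathbf{H}$.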
## Evaluating the polynomials
At this point, all properties of the circuit have been committed to. The verifier now
wants to see if the prover committed to the correct $h(X)$ polynomial. The verifier
samples $x$, and the prover produces the purported evaluations of the various polynomials
at $x$, for all the relative offsets used in the circuit, as well as the pieces of $h(X)$.
> In our [example](../proving-system.md#example), this would be:
>
> - $a_0(x)$
> - $a_1(x)$
> - $a_2(x)$, $a_2(x \omega^{-1})$
> - $a_3(x)$
> - $f_0(x)$, $f_0(x \omega^{-1})$
> - $h_0(x)$, ..., $h_{d-1}(x)$
The verifier checks that these evaluations satisfy the form of $h(X)$:
$$\frac{\text{gate}_0(x) + \dots + y^i \cdot \text{gate}_i(x) + \dots}{t(x)} = h_0(x) + \dots + x^{n(d-1)} h_{d-1}(x)$$
Now content that the evaluations collectively satisfy the gate constraints, the verifier
needs to check that the evaluations themselves are consistent with the original
[circuit commitments](circuit-commitments.md), as well as $\mathbf{H}$. To implement this
efficiently, we use a [multipoint opening argument](multipoint-opening.md).
# User Documentation
You're probably here because you want to write circuits? Excellent!
This section will guide you through the process of creating circuits with halo2.
# Gadgets
# Lookup tables
In normal programs, you can trade memory for CPU to improve performance, by pre-computing
and storing lookup tables for some part of the computation. We can do the same thing in
halo2 circuits!
A lookup table can be thought of as enforcing a *relation* between variables, where the relation is expressed as a table.
Assuming we have only one lookup argument in our constraint system, the total size of tables is constrained by the size of the circuit:
each table entry costs one row, and it also costs one row to do each lookup.
TODO
# A simple example
Let's start with a simple circuit, to introduce you to the common APIs and how they are
used. The circuit will take a public input $c$, and will prove knowledge of two private
inputs $a$ and $b$ such that
$$a^2 \cdot b^2 = c.$$
## Define instructions
Firstly, we need to define the instructions that our circuit will rely on. Instructions
are the boundary between high-level [gadgets](../concepts/gadgets.md) and the low-level
circuit operations. Instructions may be as coarse or as granular as desired, but in
practice you want to strike a balance between an instruction being large enough to
effectively optimize its implementation, and small enough that it is meaningfully
reusable.
For our circuit, we will use three instructions:
- Load a private number into the circuit.
- Multiply two numbers.
- Expose a number as a public input to the circuit.
We also need a type for a variable representing a number. Instruction interfaces provide
associated types for their inputs and outputs, to allow the implementations to represent
these in a way that makes the most sense for their optimization goals.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:instructions}}
```
## Define a chip implementation
For our circuit, we will build a [chip](../concepts/chips.md) that provides the above
numeric instructions for a finite field.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:chip}}
```
Every chip needs to implement the `Chip` trait. This defines the properties of the chip
that a `Layouter` may rely on when synthesizing a circuit, as well as enabling any initial
state that the chip requires to be loaded into the circuit.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:chip-impl}}
```
## Configure the chip
The chip needs to be configured with the columns, permutations, and gates that will be
required to implement all of the desired instructions.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:chip-config}}
```
## Implement chip traits
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:instructions-impl}}
```
## Build the circuit
Now that we have the instructions we need, and a chip that implements them, we can finally
build our circuit!
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:circuit}}
```
## Testing the circuit
`halo2::dev::MockProver` can be used to test that the circuit is working correctly. The
private and public inputs to the circuit are constructed as we will do to create a proof,
but by passing them to `MockProver::run` we get an object that can test every constraint
in the circuit, and tell us exactly what is failing (if anything).
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:test-circuit}}
```
## Full example
You can find the source code for this example
[here](https://github.com/zcash/halo2/tree/main/examples/simple-example.rs).
# Tips and tricks
This section contains various ideas and snippets that you might find useful while writing
halo2 circuits.
## Small range constraints
A common constraint used in R1CS circuits is the boolean constraint: $b * (1 - b) = 0$.
This constraint can only be satisfied by $b = 0$ or $b = 1$.
In halo2 circuits, you can similarly constrain a cell to have one of a small set of
values. For example, to constrain $a$ to the range $[0..5]$, you would create a gate of
the form:
$$a \cdot (1 - a) \cdot (2 - a) \cdot (3 - a) \cdot (4 - a) = 0$$
while to constrain $c$ to be either 7 or 13, you would use:
$$(7 - c) \cdot (13 - c) = 0$$
> The underlying principle here is that we create a polynomial constraint with roots at
> each value in the set of possible values we want to allow. In R1CS circuits, the maximum
> supported polynomial degree is 2 (due to all constraints being of the form $a * b = c$).
> In halo2 circuits, you can use arbitrary-degree polynomials - with the proviso that
> higher-degree constraints are more expensive to use.
Note that the roots don't have to be constants; for example $(a - x) \cdot (a - y) \cdot (a - z) = 0$ will constrain $a$ to be equal to one of $\{ x, y, z \}$ where the latter can be arbitrary polynomials, as long as the whole expression stays within the maximum degree bound.
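As a plain-Rust sanity check (ordinary integer arithmetic, not halo2 circuit API), the
constraint polynomials above vanish exactly on their intended sets:

```rust
// a · (1 - a) · (2 - a) · (3 - a) · (4 - a): zero iff a ∈ {0, 1, 2, 3, 4}.
fn range_constraint(a: i64) -> i64 {
    a * (1 - a) * (2 - a) * (3 - a) * (4 - a)
}

// (7 - c) · (13 - c): zero iff c ∈ {7, 13}.
fn either_7_or_13(c: i64) -> i64 {
    (7 - c) * (13 - c)
}
```

In a real circuit these expressions would be built from field elements and enforced as
gates, but the root structure is the same.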
## Small set interpolation
We can use Lagrange interpolation to create a polynomial constraint that maps
$f(X) = Y$ for small sets of $X \in \{x_i\}, Y \in \{y_i\}$.
For instance, say we want to map a 2-bit value to a "spread" version interleaved
with zeros. We first precompute the evaluations at each point:
$$
\begin{array}{rcl}
00 \rightarrow 0000 &\implies& 0 \rightarrow 0 \\
01 \rightarrow 0001 &\implies& 1 \rightarrow 1 \\
10 \rightarrow 0100 &\implies& 2 \rightarrow 4 \\
11 \rightarrow 0101 &\implies& 3 \rightarrow 5
\end{array}
$$
Then, we construct the Lagrange basis polynomial for each point using the
identity:
$$l_j(X) = \prod_{0 \leq m < k,\; m \neq j} \frac{X - x_m}{x_j - x_m},$$
where $k$ is the number of data points. ($k = 4$ in our example above.)
Recall that the Lagrange basis polynomial $l_j(X)$ evaluates to $1$ at
$X = x_j$ and $0$ at all other $x_i$, $i \neq j.$
Continuing our example, we get four Lagrange basis polynomials:
$$
\begin{array}{ccc}
l_0(X) &=& \frac{(X - 3)(X - 2)(X - 1)}{(-3)(-2)(-1)} \\[1ex]
l_1(X) &=& \frac{(X - 3)(X - 2)(X)}{(-2)(-1)(1)} \\[1ex]
l_2(X) &=& \frac{(X - 3)(X - 1)(X)}{(-1)(1)(2)} \\[1ex]
l_3(X) &=& \frac{(X - 2)(X - 1)(X)}{(1)(2)(3)}
\end{array}
$$
Our polynomial constraint is then
$$
\begin{array}{cccccccccccl}
&f(0) \cdot l_0(X) &+& f(1) \cdot l_1(X) &+& f(2) \cdot l_2(X) &+& f(3) \cdot l_3(X) &-& f(X) &=& 0 \\
\implies& 0 \cdot l_0(X) &+& 1 \cdot l_1(X) &+& 4 \cdot l_2(X) &+& 5 \cdot l_3(X) &-& f(X) &=& 0. \\
\end{array}
$$
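The interpolation above can be checked numerically. A sketch using `f64` arithmetic (a
real circuit would of course use field elements; this only verifies the algebra of the
spread map):

```rust
// Evaluate the Lagrange interpolation of `points` at `x`:
// sum over j of y_j · l_j(x), with l_j the basis polynomial for point j.
fn lagrange_eval(points: &[(f64, f64)], x: f64) -> f64 {
    points
        .iter()
        .enumerate()
        .map(|(j, &(xj, yj))| {
            // l_j(x) = Π_{m ≠ j} (x - x_m) / (x_j - x_m)
            let lj: f64 = points
                .iter()
                .enumerate()
                .filter(|&(m, _)| m != j)
                .map(|(_, &(xm, _))| (x - xm) / (xj - xm))
                .product();
            yj * lj
        })
        .sum()
}
```

Evaluating at the four interpolation points $0, 1, 2, 3$ recovers the spread values
$0, 1, 4, 5$ from the table above.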