Update book to remove Halo 2 content

This commit is contained in:
Jack Grigg 2021-03-03 22:44:30 +00:00
parent d40ed36d50
commit c713e804fa
42 changed files with 9 additions and 3292 deletions

View File

@ -1,4 +1,4 @@
name: halo2 book
name: Pasta book
on:
push:
@ -22,7 +22,7 @@ jobs:
command: install
args: mdbook-katex
- name: Build halo2 book
- name: Build Pasta book
run: mdbook build book/
- name: Deploy to GitHub Pages

View File

@ -81,7 +81,7 @@ jobs:
uses: peaceiris/actions-mdbook@v1
with:
mdbook-version: '0.4.5'
- name: Test halo2 book
- name: Test Pasta book
run: mdbook test -L target/debug/deps book/
clippy:

View File

@ -8,6 +8,6 @@ authors = [
language = "en"
multilingual = false
src = "src"
title = "The halo2 Book"
title = "The Pasta Book"
[preprocessor.katex]

View File

@ -1,36 +1,6 @@
# The halo2 Book
# The Pasta Book
[halo2](README.md)
- [Concepts](concepts.md)
- [Proof systems](concepts/proofs.md)
- [UltraPLONK Arithmetization](concepts/arithmetization.md)
- [Cores](concepts/cores.md)
- [Chips](concepts/chips.md)
- [Gadgets](concepts/gadgets.md)
- [User Documentation](user.md)
- [A simple example](user/simple-example.md)
- [Lookup tables](user/lookup-tables.md)
- [Gadgets](user/gadgets.md)
- [Tips and tricks](user/tips-and-tricks.md)
[Pasta](README.md)
- [Design](design.md)
- [Proving system](design/proving-system.md)
- [Lookup argument](design/proving-system/lookup.md)
- [Permutation argument](design/proving-system/permutation.md)
- [Circuit commitments](design/proving-system/circuit-commitments.md)
- [Vanishing argument](design/proving-system/vanishing.md)
- [Multipoint opening argument](design/proving-system/multipoint-opening.md)
- [Inner product argument](design/proving-system/inner-product.md)
- [Comparison to other work](design/proving-system/comparison.md)
- [Implementation](design/implementation.md)
- [Fields](design/implementation/fields.md)
- [Gadgets](design/gadgets.md)
- [SHA-256](design/gadgets/sha256.md)
- [16-bit table chip](design/gadgets/sha256/table16.md)
- [Background Material](background.md)
- [Fields](background/fields.md)
- [Polynomials](background/polynomials.md)
- [Cryptographic groups](background/groups.md)
- [Elliptic curves](background/curves.md)
- [UltraPLONK arithmetisation](background/upa.md)
- [Polynomial commitment using inner product argument](background/pc-ipa.md)
- [Recursion](background/recursion.md)

View File

@ -1,7 +0,0 @@
# Background Material
This section covers the background material required to understand the Halo 2 proving
system. It is targeted at an ELI15 (Explain It Like I'm 15) level; if you think anything
could do with additional explanation, [let us know]!
[let us know]: https://github.com/zcash/halo2/issues/new/choose

View File

@ -1,252 +0,0 @@
# Elliptic curves
Elliptic curves constructed over finite fields are another important cryptographic tool.
We use elliptic curves because they provide a cryptographic [group](fields.md#Groups),
i.e. a group in which the discrete logarithm problem (discussed below) is hard.
There are several ways to define the curve equation, but for our purposes, let
$\mathbb{F}_p$ be a large (255-bit) field, and then let the set of solutions $(x, y)$ to
$y^2 = x^3 + b$ for some constant $b$ define the $\mathbb{F}_p$-rational points on an
elliptic curve $E(\mathbb{F}_p)$. These $(x, y)$ coordinates are called "affine
coordinates". Each of the $\mathbb{F}_p$-rational points, together with a "point at
infinity" $\mathcal{O}$ that serves as the group identity, can be interpreted as an
element of a group. By convention, elliptic curve groups are written additively.
![](https://i.imgur.com/JvLS6yE.png)
*"Three points on a line sum to zero, which is the point at infinity."*
The group addition law is simple: to add two points together, find the line that
intersects both points and obtain the third point, and then negate its $y$-coordinate. The
case that a point is being added to itself, called point doubling, requires special
handling: we find the line tangent to the point, and then find the single other point that
intersects this line and then negate. Otherwise, in the event that a point is being
"added" to its negation, the result is the point at infinity.
The ability to add and double points naturally gives us a way to scale them by integers,
called _scalars_. The number of points on the curve is the group order. If this number
is a prime $q$, then the scalars can be considered as elements of a _scalar field_,
$\mathbb{F}_q$.
Elliptic curves, when properly designed, have an important security property. Given two
random elements $G, H \in E(\mathbb{F}_p),$ finding $a$ such that $[a] G = H$, otherwise
known as the discrete log of $H$ with respect to $G$, is considered computationally
infeasible with classical computers. This is called the elliptic curve discrete log
assumption.
If an elliptic curve group $\mathbb{G}$ has prime order $q$ (like the ones used in Halo 2),
then it is a finite cyclic group. Recall from the section on [groups](fields.md#Groups)
that this implies it is isomorphic to $\mathbb{Z}/q\mathbb{Z}$, or equivalently, to the
scalar field $\mathbb{F}_q$. Each possible generator $G$ fixes the isomorphism; then
an element on the scalar side is precisely the discrete log of the corresponding group
element with respect to $G$. In the case of a cryptographically secure elliptic curve,
the isomorphism is hard to compute in the $\mathbb{G} \rightarrow \mathbb{F}_q$ direction
because the elliptic curve discrete log problem is hard.
> It is sometimes helpful to make use of this isomorphism by thinking of group-based
> cryptographic protocols and algorithms in terms of the scalars instead of in terms of
> the group elements. This can make proofs and notation simpler.
>
> For instance, it has become common in papers on proof systems to use the notation $[x]$
> to denote a group element with discrete log $x$, where the generator is implicit.
>
> We also used this idea in the
> "[distinct-x theorem](https://zips.z.cash/protocol/protocol.pdf#thmdistinctx)",
> in order to prove correctness of optimizations
> [for elliptic curve scalar multiplication](https://github.com/zcash/zcash/issues/3924)
> in Sapling, and an endomorphism-based optimization in Appendix C of the original
> [Halo paper](https://eprint.iacr.org/2019/1021.pdf).
## Curve arithmetic
### Point doubling
The simplest situation is doubling a point $(x_0, y_0)$. Continuing with our example
$y^2 = x^3 + b$, this is done first by computing the derivative
$$
\lambda = \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{3x^2}{2y}.
$$
To obtain expressions for $(x_1, y_1) = (x_0, y_0) + (x_0, y_0),$ we consider
$$
\begin{aligned}
\frac{-y_1 - y_0}{x_1 - x_0} = \lambda &\implies -y_1 = \lambda(x_1 - x_0) + y_0 \\
&\implies \boxed{y_1 = \lambda(x_0 - x_1) - y_0}.
\end{aligned}
$$
To get the expression for $x_1,$ we substitute $y = \lambda(x_0 - x) - y_0$ into the
elliptic curve equation:
$$
\begin{aligned}
y^2 = x^3 + b &\implies (\lambda(x_0 - x) - y_0)^2 = x^3 + b \\
&\implies x^3 - \lambda^2 x^2 + \cdots = 0 \leftarrow\text{(rearranging terms)} \\
&= (x - x_0)(x - x_0)(x - x_1) \leftarrow\text{(known roots $x_0, x_0, x_1$)} \\
&= x^3 - (x_0 + x_0 + x_1)x^2 + \cdots.
\end{aligned}
$$
Comparing coefficients for the $x^2$ term gives us
$\lambda^2 = x_0 + x_0 + x_1 \implies \boxed{x_1 = \lambda^2 - 2x_0}.$
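As a sanity check, the doubling formulas above can be run over a toy curve (a sketch assuming the small parameters $p = 11$ and $b = 7$, far too small for real use):

```python
# Toy example (NOT cryptographically sized): double a point on
# y^2 = x^3 + 7 over F_11 using the affine formulas above.
p = 11
b = 7

def double(P):
    x0, y0 = P
    lam = (3 * x0 * x0) * pow(2 * y0, -1, p) % p  # lambda = 3x^2 / 2y
    x1 = (lam * lam - 2 * x0) % p                 # x1 = lambda^2 - 2*x0
    y1 = (lam * (x0 - x1) - y0) % p               # y1 = lambda(x0 - x1) - y0
    return (x1, y1)

P = (2, 2)                                 # on the curve: 2^2 = 2^3 + 7 (mod 11)
assert (P[1]**2 - P[0]**3 - b) % p == 0
Q = double(P)
assert (Q[1]**2 - Q[0]**3 - b) % p == 0    # the double lies on the curve too
```

Note that `pow(x, -1, p)` computes a modular inverse (Python 3.8+); a real implementation would avoid this inversion, as the next section explains.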
### Projective coordinates
This unfortunately requires an expensive inversion of $2y$. We can avoid this by arranging
our equations to "defer" the computation of the inverse, since we often do not need the
actual affine $(x', y')$ coordinate of the resulting point immediately after an individual
curve operation. Let's introduce a third coordinate $Z$ and scale our curve equation by
$Z^3$ like so:
$$
Z^3 y^2 = Z^3 x^3 + Z^3 b
$$
Our original curve is just this curve at the restriction $Z = 1$. If we allow the affine
point $(x, y)$ to be represented by $X = xZ$, $Y = yZ$ and $Z \neq 0$ then we have the
[homogenous projective curve](https://en.wikipedia.org/wiki/Homogeneous_coordinates)
$$
Y^2 Z = X^3 + Z^3 b.
$$
Obtaining $(x, y)$ from $(X, Y, Z)$ is as simple as computing $(X/Z, Y/Z)$ when
$Z \neq 0$. (When $Z = 0,$ we are dealing with the point at infinity $\mathcal{O} := (0:1:0)$.) In
this form, we now have a convenient way to defer the inversion required by doubling a
point. The general strategy is to express $x', y'$ as rational functions using $x = X/Z$
and $y = Y/Z$, rearrange to make their denominators the same, and then take the resulting
point $(X, Y, Z)$ to have $Z$ be the shared denominator and $X = x'Z, Y = y'Z$.
> Projective coordinates are often, but not always, more efficient than affine
> coordinates. There may be exceptions to this when either we have a different way to
> apply Montgomery's trick, or when we're in the circuit setting where multiplications and
> inversions are about equally as expensive (at least in terms of circuit size).
The following shows an example of doubling a point $(X, Y, Z) = (xZ, yZ, Z)$ without an
inversion. Substituting with $X, Y, Z$ gives us
$$
\lambda = \frac{3x^2}{2y} = \frac{3(X/Z)^2}{2(Y/Z)} = \frac{3 X^2}{2YZ}
$$
and gives us
$$
\begin{aligned}
x' &= \lambda^2 - 2x \\
&= \lambda^2 - \frac{2X}{Z} \\
&= \frac{9 X^4}{4Y^2Z^2} - \frac{2X}{Z} \\
&= \frac{9 X^4 - 8XY^2Z}{4Y^2Z^2} \\
&= \frac{18 X^4 Y Z - 16XY^3Z^2}{8Y^3Z^3} \\
\\
y' &= \lambda (x - x') - y \\
&= \lambda (\frac{X}{Z} - \frac{9 X^4 - 8XY^2Z}{4Y^2Z^2}) - \frac{Y}{Z} \\
&= \frac{3 X^2}{2YZ} (\frac{X}{Z} - \frac{9 X^4 - 8XY^2Z}{4Y^2Z^2}) - \frac{Y}{Z} \\
&= \frac{3 X^3}{2YZ^2} - \frac{27 X^6 - 24X^3Y^2Z}{8Y^3Z^3} - \frac{Y}{Z} \\
&= \frac{12 X^3Y^2Z - 8Y^4Z^2 - 27 X^6 + 24X^3Y^2Z}{8Y^3Z^3}
\end{aligned}
$$
Notice how the denominators of $x'$ and $y'$ are the same. Thus, instead of computing
$(x', y')$ directly, we can compute a new projective point $(X', Y', Z')$ with
$Z' = 8Y^3Z^3$ and $X', Y'$ set to the corresponding numerators, such that $X'/Z' = x'$
and $Y'/Z' = y'$. This completely avoids the need to perform an inversion when doubling,
and something analogous to this can be done when adding two distinct points.
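The deferred-inversion doubling can be checked numerically against the affine result, reusing the toy curve $y^2 = x^3 + 7$ over $\mathbb{F}_{11}$ (an illustrative sketch, not an optimized routine):

```python
# Sketch: projective doubling without an inversion, checked against the
# affine result on the toy curve y^2 = x^3 + 7 over F_11.
p = 11

def double_projective(X, Y, Z):
    Xp = (18 * X**4 * Y * Z - 16 * X * Y**3 * Z**2) % p   # numerator of x'
    Yp = (36 * X**3 * Y**2 * Z - 8 * Y**4 * Z**2 - 27 * X**6) % p  # numerator of y'
    Zp = (8 * Y**3 * Z**3) % p                            # the shared denominator
    return (Xp, Yp, Zp)

X, Y, Z = 2, 2, 1                          # affine point (2, 2) lifted to Z = 1
Xp, Yp, Zp = double_projective(X, Y, Z)
# One deferred inversion recovers the affine coordinates:
x1 = Xp * pow(Zp, -1, p) % p
y1 = Yp * pow(Zp, -1, p) % p
assert (x1, y1) == (5, 0)                  # matches the affine doubling formulas
```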
### TODO: Point addition
$$
\begin{aligned}
P + Q &= R\\
(x_p, y_p) + (x_q, y_q) &= (x_r, y_r) \\
\lambda &= \frac{y_q - y_p}{x_q - x_p} \\
x_r &= \lambda^2 - x_p - x_q \\
y_r &= \lambda(x_p - x_r) - y_p
\end{aligned}
$$
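The chord rule above can likewise be sketched over toy parameters (again $y^2 = x^3 + 7$ over $\mathbb{F}_{11}$; this handles only the distinct-point case, not doubling or the point at infinity):

```python
# Add two distinct affine points on y^2 = x^3 + 7 over F_11 (toy parameters).
p = 11

def add(P, Q):
    (xp, yp), (xq, yq) = P, Q
    lam = (yq - yp) * pow(xq - xp, -1, p) % p   # slope of the chord through P, Q
    xr = (lam * lam - xp - xq) % p
    yr = (lam * (xp - xr) - yp) % p
    return (xr, yr)

R = add((2, 2), (5, 0))
assert (R[1]**2 - R[0]**3 - 7) % p == 0         # the sum lies on the curve
```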
----------
Important notes:
* There exist efficient formulae[^complete-formulae] for point addition that do not have
edge cases (so-called "complete" formulae) and that unify the addition and doubling
cases together. The result of adding a point to its negation using those formulae
produces $Z = 0$, which represents the point at infinity.
* In addition, there are other models, like the Jacobian representation
$(X, Y, Z) = (xZ^2, yZ^3, Z)$, where the curve is rescaled by $Z^6$ instead of $Z^3$.
This representation has even more efficient arithmetic, but no unified/complete formulae.
* We can easily compare two curve points $(X_1, Y_1, Z_1)$ and $(X_2, Y_2, Z_2)$ for
equality in the homogenous projective coordinate space by "homogenizing" their
Z-coordinates; the checks become $X_1 Z_2 = X_2 Z_1$ and $Y_1 Z_2 = Y_2 Z_1$.
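The inversion-free equality check from the last bullet is a one-liner (a toy sketch over $\mathbb{F}_{11}$):

```python
# Compare projective points by cross-multiplying Z-coordinates (F_11 toy example).
p = 11

def eq_projective(P, Q):
    X1, Y1, Z1 = P
    X2, Y2, Z2 = Q
    # Equal iff X1/Z1 = X2/Z2 and Y1/Z1 = Y2/Z2, checked without inversions.
    return (X1 * Z2 - X2 * Z1) % p == 0 and (Y1 * Z2 - Y2 * Z1) % p == 0

assert eq_projective((2, 2, 1), (4, 4, 2))      # the same point, rescaled by 2
assert not eq_projective((2, 2, 1), (5, 0, 1))  # genuinely different points
```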
## Curve endomorphisms
Imagine that $\mathbb{F}_p$ has a primitive cube root of unity, or in other words that
$3 | p - 1$ and so an element $\zeta_p$ generates a $3$-order multiplicative subgroup.
Notice that a point $(x, y)$ on our example elliptic curve $y^2 = x^3 + b$ has two cousin
points: $(\zeta_p x, y)$ and $(\zeta_p^2 x, y)$, because the computation $x^3$ effectively
kills the $\zeta_p$ component of the $x$-coordinate (since $\zeta_p^3 = 1$). Applying the
map $(x, y) \mapsto (\zeta_p x, y)$
is an application of an endomorphism over the curve. The exact mechanics involved are
complicated, but when the curve has a prime $q$ number of points (and thus a prime
"order") the effect of the endomorphism is to multiply the point by a scalar in
$\mathbb{F}_q$ which is also a primitive cube root $\zeta_q$ in the scalar field.
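A toy check that the map really lands back on the curve (a sketch with the assumed small prime $p = 13$, which satisfies $3 \mid p - 1$, and an arbitrary $b = 3$):

```python
# Sketch of the endomorphism (x, y) -> (zeta_p * x, y) on a toy curve.
# p = 13 satisfies 3 | p - 1, and zeta_p = 3 is a primitive cube root of
# unity (3^3 = 27 = 1 mod 13). Curve: y^2 = x^3 + 3 over F_13.
p, b = 13, 3
zeta_p = 3
assert pow(zeta_p, 3, p) == 1 and zeta_p != 1

def on_curve(P):
    x, y = P
    return (y * y - x**3 - b) % p == 0

P = (1, 2)
assert on_curve(P)
# (zeta_p * x)^3 = zeta_p^3 * x^3 = x^3, so the image is still on the curve:
assert on_curve((zeta_p * P[0] % p, P[1]))
```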
## Curve point compression
TODO
## Cycles of curves
Let $E_p$ be an elliptic curve over a finite field $\mathbb{F}_p,$ where $p$ is a prime.
We denote this by $E_p/\mathbb{F}_p,$ and we denote the group of points of $E_p$ over
$\mathbb{F}_p$ by $E_p(\mathbb{F}_p),$ which has order $q = \#E_p(\mathbb{F}_p).$ For this
curve, we call $\mathbb{F}_p$ the "base field" and $\mathbb{F}_q$ the "scalar field".
We instantiate our proof system over the elliptic curve $E_p/\mathbb{F}_p$. This allows us
to prove statements about $\mathbb{F}_q$-arithmetic circuit satisfiability.
> **(aside) If our curve $E_p$ is over $\mathbb{F}_p,$ why is the arithmetic circuit instead in $\mathbb{F}_q$?**
> The proof system is basically working on encodings of the scalars in the circuit (or
> more precisely, commitments to polynomials whose coefficients are scalars). The scalars
> are in $\mathbb{F}_q$ when their encodings/commitments are elliptic curve points in
> $E_p/\mathbb{F}_p$.
However, most of the verifier's arithmetic computations are over the base field
$\mathbb{F}_p,$ and are thus efficiently expressed as an $\mathbb{F}_p$-arithmetic
circuit.
> **(aside) Why are the verifier's computations (mainly) over $\mathbb{F}_p$?**
> The Halo 2 verifier actually has to perform group operations using information output by
> the circuit. Group operations like point doubling and addition use arithmetic in
> $\mathbb{F}_p$, because the coordinates of points are in $\mathbb{F}_p.$
This motivates us to construct another curve with scalar field $\mathbb{F}_p$, which has
an $\mathbb{F}_p$-arithmetic circuit that can efficiently verify proofs from the first
curve. As a bonus, if this second curve had base field $\mathbb{F}_q,$ i.e. if it were a
curve $E_q/\mathbb{F}_q,$ it would generate proofs that could be efficiently verified in
the first curve's $\mathbb{F}_q$-arithmetic circuit. In other words, we instantiate a
second proof system over $E_q/\mathbb{F}_q,$ forming a 2-cycle with the first:
![](https://i.imgur.com/bNMyMRu.png)
### TODO: Pallas-Vesta curves
Reference: https://github.com/zcash/pasta
## Hashing to curves
Sometimes it is useful to be able to produce a random point on an elliptic curve
$E_p/\mathbb{F}_p$ corresponding to some input, in such a way that no-one will know its
discrete logarithm (to any other base).
This is described in detail in the [Internet draft on Hashing to Elliptic Curves][cfrg-hash-to-curve].
Several algorithms can be used depending on efficiency and security requirements. The
framework used in the Internet Draft makes use of several functions:
* ``hash_to_field``: takes a byte sequence input and maps it to an element in the base
field $\mathbb{F}_p$.
* ``map_to_curve``: takes an $\mathbb{F}_p$ element and maps it to $E_p$.
[cfrg-hash-to-curve]: https://datatracker.ietf.org/doc/draft-irtf-cfrg-hash-to-curve/?include_text=1
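As a very rough illustration of the ``hash_to_field`` idea only (NOT the Internet Draft's construction, which expands the hash output to avoid modulo bias; the modulus below is just a stand-in 255-bit prime):

```python
# Naive hash-to-field sketch: hash a byte string and reduce mod p.
# The real draft uses expand_message and wide reduction; this skips both.
import hashlib

p = 2**255 - 19   # a convenient 255-bit prime, used here only as an example

def naive_hash_to_field(msg: bytes) -> int:
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest, "big") % p

e = naive_hash_to_field(b"hello")
assert 0 <= e < p
```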
### TODO: Simplified SWU
Reference: https://eprint.iacr.org/2019/403.pdf
## References
[^complete-formulae]: [Renes, J., Costello, C., & Batina, L. (2016, May). "Complete addition formulas for prime order elliptic curves." In Annual International Conference on the Theory and Applications of Cryptographic Techniques (pp. 403-428). Springer, Berlin, Heidelberg.](https://eprint.iacr.org/2015/1060)

View File

@ -1,294 +0,0 @@
# Fields
A fundamental component of many cryptographic protocols is the algebraic structure known
as a [field]. Fields are sets of objects (usually numbers) with two associated binary
operators $+$ and $\times$ such that various [field axioms][field-axioms] hold. The real
numbers $\mathbb{R}$ are an example of a field with an uncountably infinite number of
elements.
[field]: https://en.wikipedia.org/wiki/Field_(mathematics)
[field-axioms]: https://en.wikipedia.org/wiki/Field_(mathematics)#Classic_definition
Halo makes use of _finite fields_ which have a finite number of elements. Finite fields
are fully classified as follows:
- if $\mathbb{F}$ is a finite field, it contains $|\mathbb{F}| = p^k$ elements for some
integer $k \geq 1$ and some prime $p$;
- any two finite fields with the same number of elements are isomorphic. In particular,
all of the arithmetic in a prime field $\mathbb{F}_p$ is isomorphic to addition and
multiplication of integers modulo $p$, i.e. in $\mathbb{Z}_p$. This is why we often
refer to $p$ as the _modulus_.
We'll write a field as $\mathbb{F}_q$ where $q = p^k$. The prime $p$ is called its
_characteristic_. In the cases where $k \gt 1$ the field $\mathbb{F}_q$ is a $k$-degree
extension of the field $\mathbb{F}_p$. (By analogy, the complex numbers
$\mathbb{C} = \mathbb{R}(i)$ are an extension of the real numbers.) However, in Halo we do
not use extension fields. Whenever we write $\mathbb{F}_p$ we are referring to what
we call a _prime field_ which has a prime $p$ number of elements, i.e. $k = 1$.
Important notes:
* There are two special elements in any field: $0$, the additive identity, and
$1$, the multiplicative identity.
* The least significant bit of a field element, when represented as an integer in binary
format, can be interpreted as its "sign" to help distinguish it from its additive
inverse (negation). This is because for some nonzero element $a$ which has a least
significant bit $0$ we have that $-a = p - a$ has a least significant bit $1$, and vice
versa. We could also use whether or not an element is larger than $(p - 1) / 2$ to give
it a "sign."
Finite fields will be useful later for constructing [polynomials](polynomials.md) and
[elliptic curves](curves.md). Elliptic curves are examples of groups, which we discuss
next.
## Groups
Groups are simpler and more limited than fields; they have only one binary operator $\cdot$
and fewer axioms. They also have an identity, which we'll denote as $1$.
[group]: https://en.wikipedia.org/wiki/Group_(mathematics)
[group-axioms]: https://en.wikipedia.org/wiki/Group_(mathematics)#Definition
Any element $a$ in a group has an _inverse_ $b = a^{-1}$,
which is the _unique_ element $b$ such that $a \cdot b = 1$.
For example, the set of nonzero elements of $\mathbb{F}_p$ forms a group, where the
group operation is given by multiplication on the field.
> #### (aside) Additive vs multiplicative notation
> If $\cdot$ is written as $\times$ or omitted (i.e. $a \cdot b$ written as $ab$), the
> identity as $1$, and inversion as $a^{-1}$, as we did above, then we say that the group
> is "written multiplicatively". If $\cdot$ is written as $+$, the identity as $0$ or
> $\mathcal{O}$, and inversion as $-a$, then we say it is "written additively".
>
> It's conventional to use additive notation for elliptic curve groups, and multiplicative
> notation when the elements come from a finite field.
>
> When additive notation is used, we also write
>
> $$[k] A = \underbrace{A + A + \cdots + A}_{k \text{ times}}$$
>
> for nonnegative $k$ and call this "scalar multiplication"; we also often use uppercase
> letters for variables denoting group elements. When multiplicative notation is used, we
> also write
>
> $$a^k = \underbrace{a \times a \times \cdots \times a}_{k \text{ times}}$$
>
> and call this "exponentiation". In either case we call the scalar $k$ such that
> $[k] g = a$ or $g^k = a$ the "discrete logarithm" of $a$ to base $g$. We can extend
> scalars to negative integers by inversion, i.e. $[-k] A + [k] A = \mathcal{O}$ or
> $a^{-k} \times a^k = 1$.
The _order_ of an element $a$ of a finite group is defined as the smallest positive integer
$k$ such that $a^k = 1$ (in multiplicative notation) or $[k] a = \mathcal{O}$ (in additive
notation). The order _of the group_ is the number of elements.
Groups always have a [generating set], which is a set of elements such that we can produce
any element of the group as (in multiplicative terminology) a product of powers of those
elements. So if the generating set is $g_{1..k}$, we can produce any element of the group
as $\prod\limits_{i=1}^{k} g_i^{a_i}$. There can be many different generating sets for a
given group.
[generating set]: https://en.wikipedia.org/wiki/Generating_set_of_a_group
A group is called [cyclic] if it has a (not necessarily unique) generating set with only
a single element — call it $g$. In that case we can say that $g$ generates the group, and
that the order of $g$ is the order of the group.
Any finite cyclic group $\mathbb{G}$ of order $n$ is [isomorphic] to the integers
modulo $n$ (denoted $\mathbb{Z}/n\mathbb{Z}$), such that:
- the operation $\cdot$ in $\mathbb{G}$ corresponds to addition modulo $n$;
- the identity in $\mathbb{G}$ corresponds to $0$;
- some generator $g \in \mathbb{G}$ corresponds to $1$.
Given a generator $g$, the isomorphism is always easy to compute in the
$\mathbb{Z}/n\mathbb{Z} \rightarrow \mathbb{G}$ direction; it is just $a \mapsto g^a$
(or in additive notation, $a \mapsto [a] g$).
It may be difficult in general to compute in the $\mathbb{G} \rightarrow \mathbb{Z}/n\mathbb{Z}$
direction; we'll discuss this further when we come to [elliptic curves](curves.md).
If the order $n$ of a finite group is prime, then the group is cyclic, and every
non-identity element is a generator.
[isomorphic]: https://en.wikipedia.org/wiki/Isomorphism
[cyclic]: https://en.wikipedia.org/wiki/Cyclic_group
### The multiplicative group of a finite field
We use the notation $\mathbb{F}_p^\times$ for the multiplicative group (i.e. the group
operation is multiplication in $\mathbb{F}_p$) over the set $\mathbb{F}_p - \{0\}$.
A quick way of obtaining the inverse in $\mathbb{F}_p^\times$ is $a^{-1} = a^{p - 2}$.
The reason for this stems from [Fermat's little theorem][fermat-little], which states
that $a^p = a \pmod p$ for any integer $a$. If $a$ is nonzero, we can divide by $a$ twice
to get $a^{p-2} = a^{-1}.$
[fermat-little]: https://en.wikipedia.org/wiki/Fermat%27s_little_theorem
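This inverse formula is easy to check exhaustively for a small prime (a toy sketch):

```python
# Check a^(p-2) = a^(-1) in F_p^x for p = 11 (Fermat's little theorem).
p = 11
for a in range(1, p):
    inv = pow(a, p - 2, p)      # fast modular exponentiation
    assert a * inv % p == 1     # inv really is the multiplicative inverse
```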
Let's assume that $\alpha$ is a generator of $\mathbb{F}_p^\times$, so it has order $p-1$
(equal to the number of elements in $\mathbb{F}_p^\times$). Therefore, for any element
$a \in \mathbb{F}_p^\times$ there is a unique integer $i \in \{0..p-2\}$ such that $a = \alpha^i$.
Notice that $a \times b$ where $a, b \in \mathbb{F}_p^\times$ can really be interpreted as
$\alpha^i \times \alpha^j$ where $a = \alpha^i$ and $b = \alpha^j$. Indeed, it holds that
$\alpha^i \times \alpha^j = \alpha^{i + j}$ for all $0 \leq i, j \lt p - 1$. As a result
the multiplication of nonzero field elements can be interpreted as addition modulo $p - 1$
with respect to some fixed generator $\alpha$. The addition just happens "in the exponent."
This is another way to look at where $a^{p - 2}$ comes from for computing inverses in the
field:
$$p - 2 \equiv -1 \pmod{p - 1},$$
so $a^{p - 2} = a^{-1}$.
### Montgomery's Trick
Montgomery's trick, named after Peter Montgomery (RIP), is a way to compute many group
inversions at the same time. It is commonly used to compute inversions in
$\mathbb{F}_p^\times$, which are quite computationally expensive compared to multiplication.
Imagine we need to compute the inverses of three nonzero elements $a, b, c \in \mathbb{F}_p^\times$.
Instead, we'll compute the products $x = ab$ and $y = xc = abc$, and compute the inversion
$$z = y^{p - 2} = \frac{1}{abc}.$$
We can now multiply $z$ by $x$ to obtain $\frac{1}{c}$ and multiply $z$ by $c$ to obtain
$\frac{1}{ab}$, which we can then multiply by $a, b$ to obtain their respective inverses.
This technique generalizes to arbitrary numbers of group elements with just a single
inversion necessary.
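The generalization described above can be sketched as follows (toy modulus $p = 11$; it costs one inversion plus roughly $3n$ multiplications for $n$ elements):

```python
# Montgomery's trick: invert n field elements with a single inversion.
p = 11

def batch_invert(elems):
    # prefix[i] = elems[0] * ... * elems[i-1]  (running products)
    prefix = [1]
    for e in elems:
        prefix.append(prefix[-1] * e % p)
    acc = pow(prefix[-1], p - 2, p)       # the single inversion: 1/(product)
    invs = [0] * len(elems)
    for i in reversed(range(len(elems))):
        invs[i] = acc * prefix[i] % p     # peel off everything before elems[i]
        acc = acc * elems[i] % p          # drop elems[i] from the accumulator
    return invs

elems = [2, 3, 7]
for e, inv in zip(elems, batch_invert(elems)):
    assert e * inv % p == 1
```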
## Multiplicative subgroups
A _subgroup_ of a group $G$ with operation $\cdot$, is a subset of elements of $G$ that
also form a group under $\cdot$.
In the previous section we said that $\alpha$ is a generator of the $(p - 1)$-order
multiplicative group $\mathbb{F}_p^\times$. This group has _composite_ order, and so by
the Chinese remainder theorem[^chinese-remainder] it has strict subgroups. As an example
let's imagine that $p = 11$, and so $p - 1$ factors into $5 \cdot 2$. Thus, there is a
generator $\beta$ of the $5$-order subgroup and a generator $\gamma$ of the $2$-order
subgroup. All elements in $\mathbb{F}_p^\times$, therefore, can be written uniquely as
$\beta^i \cdot \gamma^j$ for some $i$ (modulo $5$) and some $j$ (modulo $2$).
If we have $a = \beta^i \cdot \gamma^j$ notice what happens when we compute
$$
a^5 = (\beta^i \cdot \gamma^j)^5
= \beta^{i \cdot 5} \cdot \gamma^{j \cdot 5}
= \beta^0 \cdot \gamma^{j \cdot 5}
= \gamma^{j \cdot 5};
$$
we have effectively "killed" the $5$-order subgroup component, producing a value in the
$2$-order subgroup.
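This "killing" can be checked exhaustively for $p = 11$: raising any element of $\mathbb{F}_{11}^\times$ to the 5th power lands in the 2-element subgroup $\{1, -1\}$.

```python
# Every a in F_11^x satisfies (a^5)^2 = a^10 = 1, so a^5 is 1 or -1.
p = 11
for a in range(1, p):
    assert pow(a, 5, p) in (1, p - 1)
```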
[Lagrange's theorem (group theory)][lagrange-group] states that the order of any subgroup
$H$ of a finite group $G$ divides the order of $G$. Therefore, the order of any subgroup
of $\mathbb{F}_p^\times$ must divide $p-1.$
[lagrange-group]: https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)
## Square roots
In a field $\mathbb{F}_p$ exactly half of all nonzero elements are squares; the remainder
are non-squares or "quadratic non-residues". In order to see why, consider an $\alpha$
that generates the $2$-order multiplicative subgroup of $\mathbb{F}_p^\times$ (this exists
because $p - 1$ is divisible by $2$ since $p$ is a prime greater than $2$) and $\beta$ that
generates the $t$-order multiplicative subgroup of $\mathbb{F}_p^\times$ where $p - 1 = 2t$.
Then every element $a \in \mathbb{F}_p^\times$ can be written uniquely as
$\alpha^i \cdot \beta^j$ with $i \in \mathbb{Z}_2$ and $j \in \mathbb{Z}_t$. Half of all
elements will have $i = 0$ and the other half will have $i = 1$.
Let's consider the simple case where $p \equiv 3 \pmod{4}$ and so $t$ is odd (if $t$ is
even, then $p - 1$ would be divisible by $4$, which contradicts $p$ being $3 \pmod{4}$).
If $a \in \mathbb{F}_p^\times$ is a square, then there must exist
$b = \alpha^i \cdot \beta^j$ such that $b^2 = a$. But this means that
$$a = (\alpha^i \cdot \beta^j)^2 = \alpha^{2i} \cdot \beta^{2j} = \beta^{2j}.$$
In other words, all squares in this particular field do not generate the $2$-order
multiplicative subgroup, and so since half of the elements generate the $2$-order subgroup
then at most half of the elements are square. In fact exactly half of the elements are
square (since squaring each nonsquare element gives a unique square). This means we can
assume all squares can be written as $\beta^m$ for some $m$, and therefore finding the
square root is a matter of exponentiating by $2^{-1} \pmod{t}$.
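For $p \equiv 3 \pmod{4}$ this whole recipe collapses to a single exponentiation by $(p+1)/4$, since $(p+1)/4 = (t+1)/2 \equiv 2^{-1} \pmod{t}$ acts as the halving "in the exponent". A toy check:

```python
# Square root for p = 3 (mod 4): sqrt(a) = a^((p+1)/4).
p = 11                                   # 11 = 3 (mod 4)
a = 4                                    # a known square mod 11
r = pow(a, (p + 1) // 4, p)
assert r * r % p == a                    # r is indeed a square root of a
```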
In the event that $p \equiv 1 \pmod{4}$ then things get more complicated because
$2^{-1} \pmod{t}$ does not exist. Let's write $p - 1$ as $2^k \cdot t$ with $t$ odd. The
case $k = 0$ is impossible, and the case $k = 1$ is what we already described, so consider
$k \geq 2$. $\alpha$ generates a $2^k$-order multiplicative subgroup and $\beta$ generates
the odd $t$-order multiplicative subgroup. Then every element $a \in \mathbb{F}_p^\times$
can be written as $\alpha^i \cdot \beta^j$ for $i \in \mathbb{Z}_{2^k}$ and
$j \in \mathbb{Z}_t$. If the element is a square, then there exists some $b = \sqrt{a}$
which can be written $b = \alpha^{i'} \cdot \beta^{j'}$ for $i' \in \mathbb{Z}_{2^k}$ and
$j' \in \mathbb{Z}_t$. This means that $a = b^2 = \alpha^{2i'} \cdot \beta^{2j'}$,
therefore we have $i \equiv 2i' \pmod{2^k}$, and $j \equiv 2j' \pmod{t}$. $i$ would have
to be even in this case because otherwise it would be impossible to have
$i \equiv 2i' \pmod{2^k}$ for any $i'$. In the case that $a$ is not a square, then $i$ is
odd, and so half of all elements are squares.
In order to compute the square root, we can first raise the element
$a = \alpha^i \cdot \beta^j$ to the power $t$ to "kill" the $t$-order component, giving
$$a^t = \alpha^{it \bmod 2^k} \cdot \beta^{jt \bmod t} = \alpha^{it \bmod 2^k}$$
and then raise this result to the power $t^{-1} \pmod{2^k}$ to undo the effect of the
original exponentiation on the $2^k$-order component:
$$(\alpha^{it \bmod 2^k})^{t^{-1} \pmod{2^k}} = \alpha^i$$
(since $t$ is relatively prime to $2^k$). This leaves bare the $\alpha^i$ value which we
can trivially handle. We can similarly kill the $2^k$-order component to obtain
$\beta^{j \cdot 2^{-1} \pmod{t}}$, and put the values together to obtain the square root.
It turns out that in the cases $k = 2, 3$ there are simpler algorithms that merge several
of these exponentiations together for efficiency. For other values of $k$, the only known
way is to manually extract $i$ by squaring until you obtain the identity for every single
bit of $i$. This is the essence of the [Tonelli-Shanks square root algorithm][ts-sqrt] and
describes the general strategy. (There is another square root algorithm that uses
quadratic extension fields, but it doesn't pay off in efficiency until the prime becomes
quite large.)
[ts-sqrt]: https://en.wikipedia.org/wiki/Tonelli%E2%80%93Shanks_algorithm
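A compact sketch of Tonelli-Shanks following the strategy above (variable names are ad hoc; the loop extracts the $\alpha^i$ component bit by bit via repeated squaring):

```python
# Tonelli-Shanks square root modulo an odd prime p (sketch).
def sqrt_mod(a, p):
    assert pow(a, (p - 1) // 2, p) == 1           # a must be a square (Euler)
    # Write p - 1 = 2^k * t with t odd.
    t, k = p - 1, 0
    while t % 2 == 0:
        t //= 2
        k += 1
    # Find any non-square z; z^t then generates the 2^k-order subgroup.
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1
    m, c, u, r = k, pow(z, t, p), pow(a, t, p), pow(a, (t + 1) // 2, p)
    while u != 1:
        # Find the least i with u^(2^i) = 1 by repeated squaring.
        i, s = 0, u
        while s != 1:
            s = s * s % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        m, c, u, r = i, b * b % p, u * b * b % p, r * b % p
    return r

r = sqrt_mod(2, 17)                               # 17 - 1 = 2^4, so k = 4
assert r * r % 17 == 2
```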
## Roots of unity
In the previous sections we wrote $p - 1 = 2^k \cdot t$ with $t$ odd, and stated that an
element $\alpha \in \mathbb{F}_p^\times$ generated the $2^k$-order subgroup. For
convenience, let's denote $n := 2^k.$ The elements $\{1, \alpha, \alpha^2, \cdots, \alpha^{n-1}\}$
are known as the $n$th [roots of unity](https://en.wikipedia.org/wiki/Root_of_unity).
A **primitive root of unity**, $\omega,$ is an $n$th root of unity such that
$\omega^i \neq 1$ except when $i \equiv 0 \pmod{n}$.
Important notes:
- If $\alpha$ is an $n$th root of unity, $\alpha$ satisfies $\alpha^n - 1 = 0.$ If
$\alpha \neq 1,$ then
$$1 + \alpha + \alpha^2 + \cdots + \alpha^{n-1} = 0.$$
- Equivalently, the roots of unity are solutions to the equation
$$X^n - 1 = (X - 1)(X - \alpha)(X - \alpha^2) \cdots (X - \alpha^{n-1}).$$
- **$\boxed{\omega^{\frac{n}{2}+i} = -\omega^i}$ ("Negation lemma")**. Proof:
$$
\begin{aligned}
\omega^n = 1 &\implies \omega^n - 1 = 0 \\
&\implies (\omega^{n/2} + 1)(\omega^{n/2} - 1) = 0.
\end{aligned}
$$
Since the order of $\omega$ is $n$, $\omega^{n/2} \neq 1.$ Therefore, $\omega^{n/2} = -1,$ and multiplying both sides by $\omega^i$ gives the lemma.
- **$\boxed{(\omega^{\frac{n}{2}+i})^2 = (\omega^i)^2}$ ("Halving lemma")**. Proof:
$$
(\omega^{\frac{n}{2}+i})^2 = \omega^{n + 2i} = \omega^{n} \cdot \omega^{2i} = \omega^{2i} = (\omega^i)^2.
$$
In other words, if we square each element in the $n$th roots of unity, we would get back
only half the elements, $\{(\omega_n^i)^2\} = \{\omega_{n/2}^i\}$ (i.e. the $\frac{n}{2}$th roots
of unity). There is a two-to-one mapping between the elements and their squares.
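Both lemmas can be verified numerically in a toy field (a sketch in $\mathbb{F}_{17}$, where $p - 1 = 16$ and $\omega = 9$ is a primitive 8th root of unity):

```python
# Check the negation and halving lemmas in F_17 with omega = 9, n = 8.
p, n = 17, 8
omega = 9
assert pow(omega, n, p) == 1 and pow(omega, n // 2, p) == p - 1  # omega^(n/2) = -1
for i in range(n):
    # Negation lemma: omega^(n/2 + i) = -omega^i
    assert pow(omega, n // 2 + i, p) == (p - pow(omega, i, p)) % p
    # Halving lemma: (omega^(n/2 + i))^2 = (omega^i)^2
    assert pow(omega, n + 2 * i, p) == pow(omega, 2 * i, p)
```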
## References
[^chinese-remainder]: [Friedman, R. (n.d.) "Cyclic Groups and Elementary Number Theory II" (p. 5).](http://www.math.columbia.edu/~rf/numbertheory2.pdf)

View File

@ -1,94 +0,0 @@
# Cryptographic groups
In the section [Inverses and groups](fields.md#inverses-and-groups) we introduced the
concept of *groups*. A group has an identity and a group operation. In this section we
will write groups additively, i.e. the identity is $\mathcal{O}$ and the group operation
is $+$.
Some groups can be used as *cryptographic groups*. At the risk of oversimplifying, this
means that the problem of finding a discrete logarithm of a group element $P$ to a given
base $G$, i.e. finding $x$ such that $P = [x] G$, is hard in general.
## Pedersen commitment
The Pedersen commitment [[P99]] is a way to commit to a secret message in a verifiable
way. It uses two random public generators $G, H \in \mathbb{G},$ where $\mathbb{G}$ is a
cryptographic group of prime order $q$. A random secret $r$ is chosen in $\mathbb{Z}_q$, and
the message to commit to, $m,$ is taken from some subset of $\mathbb{Z}_q$. The commitment is
$$c = \text{Commit}(m,r)=[m]G + [r]H.$$
To open the commitment, the committer reveals $m$ and $r,$ thus allowing anyone to verify
that $c$ is indeed a commitment to $m.$
[P99]: https://link.springer.com/content/pdf/10.1007%2F3-540-46766-1_9.pdf#page=3
Notice that the Pedersen commitment scheme is homomorphic:
$$
\begin{aligned}
\text{Commit}(m,r) + \text{Commit}(m',r') &= [m]G + [r]H + [m']G + [r']H \\
&= [m + m']G + [r + r']H \\
&= \text{Commit}(m + m',r + r').
\end{aligned}
$$
Assuming the discrete log assumption holds, Pedersen commitments are also perfectly hiding
and computationally binding:
* **hiding**: the adversary chooses messages $m_0, m_1.$ The committer commits to one of
these messages $c = \text{Commit}(m_b;r), b \in \{0,1\}.$ Given $c,$ the probability of
the adversary guessing the correct $b$ is no more than $\frac{1}{2}$.
* **binding**: the adversary cannot pick two different messages $m_0 \neq m_1,$ and
randomness $r_0, r_1,$ such that $\text{Commit}(m_0,r_0) = \text{Commit}(m_1,r_1).$
### Vector Pedersen commitment
We can use a variant of the Pedersen commitment scheme to commit to multiple messages at
once, $\mathbf{m} = (m_0, \cdots, m_{n-1})$. This time, we'll have to sample a corresponding
number of random public generators $\mathbf{G} = (G_0, \cdots, G_{n-1}),$ along with a
single random generator $H$ as before (for use in hiding). Then, our commitment scheme is:
$$
\begin{aligned}
\text{Commit}(\mathbf{m}; r) &= \text{Commit}((m_0, \cdots, m_{n-1}); r) \\
&= [r]H + [m_0]G_0 + \cdots + [m_{n-1}]G_{n-1} \\
&= [r]H + \sum_{i= 0}^{n-1} [m_i]G_i.
\end{aligned}
$$
> TODO: is this positionally binding?
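As a sanity check of the homomorphic property, here is a toy sketch in the order-$11$ subgroup of $\mathbb{Z}_{23}^\times$, written multiplicatively (so $[m]G$ becomes $G^m$). The fixed generators are for illustration only; in practice $G$ and $H$ must be sampled so that no discrete-log relation between them is known.

```python
# Toy Pedersen commitment in the order-11 subgroup of Z_23^* (the squares mod 23).
P, Q = 23, 11          # group modulus and (prime) group order
G, H = 4, 9            # two generators of the order-11 subgroup (illustrative only)

def commit(m, r):
    return pow(G, m, P) * pow(H, r, P) % P

m0, r0 = 5, 7
m1, r1 = 3, 2
# Homomorphic property: Commit(m,r) * Commit(m',r') = Commit(m+m', r+r')
lhs = commit(m0, r0) * commit(m1, r1) % P
rhs = commit((m0 + m1) % Q, (r0 + r1) % Q)
assert lhs == rhs
```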
## Diffie--Hellman
An example of a protocol that uses cryptographic groups is Diffie--Hellman key agreement
[[DH1976]]. The Diffie--Hellman protocol is a method for two users, Alice and Bob, to
generate a shared private key. It proceeds as follows:
1. Alice and Bob publicly agree on a large prime $p$ and a generator $G,$ where $G$ is a
primitive root $\pmod p.$ (Note that $G$ is a generator of the group
$\mathbb{F}_p^\times.$)
2. Alice chooses a large random number $a$ as her private key. She computes her public key
$A = [a]G \pmod p,$ and sends $A$ to Bob.
3. Similarly, Bob chooses a large random number $b$ as his private key. He computes his
public key $B = [b]G \pmod p,$ and sends $B$ to Alice.
4. Now both Alice and Bob compute their shared key $K = [ab]G \pmod p,$ which Alice
computes as
$$K = [a]B \pmod p = [a]([b]G) \pmod p,$$
and Bob computes as
$$K = [b]A \pmod p = [b]([a]G) \pmod p.$$
[DH1976]: https://ee.stanford.edu/~hellman/publications/24.pdf
A potential eavesdropper would need to derive $K = [ab]G \pmod p$ knowing only
$G, p, A = [a]G,$ and $B = [b]G$: in other words, they would need to either get the
discrete logarithm $a$ from $A = [a]G$ or $b$ from $B = [b]G,$ which we assume to be
computationally infeasible in $\mathbb{F}_p^\times.$
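The four steps above can be sketched as follows, using toy parameters ($p = 467$ with primitive root $G = 2$, and arbitrary small private keys) that are of course far too small for real use.

```python
# Toy Diffie–Hellman over F_p^×, with [a]G written multiplicatively as G^a mod p.
p, G = 467, 2              # illustrative parameters only

a, b = 153, 291            # Alice's and Bob's private keys
A = pow(G, a, p)           # Alice's public key [a]G
B = pow(G, b, p)           # Bob's public key [b]G

K_alice = pow(B, a, p)     # Alice computes [a]([b]G)
K_bob   = pow(A, b, p)     # Bob computes [b]([a]G)
assert K_alice == K_bob == pow(G, a * b, p)
```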
More generally, protocols that use similar ideas to Diffie--Hellman are used throughout
cryptography. One way of instantiating a cryptographic group is as an
[elliptic curve](curves.md). Before we go into detail on elliptic curves, we'll describe
some algorithms that can be used for any group.
## Multiscalar multiplication
### TODO: Pippenger's algorithm
Reference: https://jbootle.github.io/Misc/pippenger.pdf
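Pending the section above, here is a sketch of the bucket method at the heart of Pippenger's algorithm. For simplicity the "group" is modeled as the integers under addition (so $[k]P$ is just $k \cdot P$); a real implementation would substitute curve arithmetic for `+`.

```python
def pippenger(scalars, points, c=4):
    """Multiscalar multiplication sum_i [s_i]P_i via the bucket method.

    The group is modeled as the integers under addition, so "doubling"
    is x + x; swap in elliptic curve arithmetic for real use.
    """
    nbits = max(s.bit_length() for s in scalars)
    windows = (nbits + c - 1) // c
    acc = 0
    for w in reversed(range(windows)):
        for _ in range(c):
            acc = acc + acc                     # shift accumulator by one c-bit window
        buckets = [0] * (1 << c)                # bucket[d] collects points with digit d
        for s, p in zip(scalars, points):
            d = (s >> (w * c)) & ((1 << c) - 1)
            if d:
                buckets[d] = buckets[d] + p
        # sum_d d * bucket[d] using only additions (running suffix sums)
        running = window_sum = 0
        for b in reversed(buckets[1:]):
            running = running + b
            window_sum = window_sum + running
        acc = acc + window_sum
    return acc

scalars = [23, 5, 200, 77]
points  = [3, 14, 15, 92]
assert pippenger(scalars, points) == sum(s * p for s, p in zip(scalars, points))
```

The bucket trick replaces most scalar multiplications with additions: each window's contribution $\sum_d d \cdot \mathrm{bucket}_d$ is computed with two passes of additions rather than per-digit multiplications.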

# Polynomial commitment using inner product argument
We want to commit to some polynomial $p(X) \in \mathbb{F}_p[X]$, and be able to provably
evaluate the committed polynomial at arbitrary points. The naive solution would be for the
prover to simply send the polynomial's coefficients to the verifier: however, this
requires $O(n)$ communication. Our polynomial commitment scheme gets the job done using
$O(\log n)$ communication.
### `Setup`
Given a parameter $d = 2^k,$ we generate the common reference string
$\sigma = (\mathbb{G}, \mathbf{G}, H, \mathbb{F}_p)$ defining certain constants for this
scheme:
* $\mathbb{G}$ is a group of prime order $p;$
* $\mathbf{G} \in \mathbb{G}^d$ is a vector of $d$ random group elements;
* $H \in \mathbb{G}$ is a random group element; and
* $\mathbb{F}_p$ is the finite field of order $p.$
### `Commit`
The Pedersen vector commitment $\text{Commit}$ is defined as
$$\text{Commit}(\sigma, p(X); r) = \langle\mathbf{a}, \mathbf{G}\rangle + [r]H,$$
for some polynomial $p(X) \in \mathbb{F}_p[X]$ and some blinding factor
$r \in \mathbb{F}_p.$ Here, each element of the vector $\mathbf{a}_i \in \mathbb{F}_p$ is
the coefficient for the $i$th degree term of $p(X),$ and $p(X)$ is of maximal degree
$d - 1.$
### `Open` (prover) and `OpenVerify` (verifier)
The modified inner product argument is an argument of knowledge for the relation
$$\boxed{\{((P, x, v); (\mathbf{a}, r)): P = \langle\mathbf{a}, \mathbf{G}\rangle + [r]H, v = \langle\mathbf{a}, \mathbf{b}\rangle\}},$$
where $\mathbf{b} = (1, x, x^2, \cdots, x^{d-1})$ is composed of increasing powers of the
evaluation point $x.$ This allows a prover to demonstrate to a verifier that the
polynomial contained “inside” the commitment $P$ evaluates to $v$ at $x,$ and moreover,
that the committed polynomial has maximum degree $d - 1.$
The inner product argument proceeds in $k = \log_2 d$ rounds. For our purposes, it is
sufficient to know about its final outputs, while merely providing intuition about the
intermediate rounds. (Refer to Section 3 in the [Halo] paper for a full explanation.)
[Halo]: https://eprint.iacr.org/2019/1021.pdf
Before beginning the argument, the verifier selects a random group element $U$ and sends it
to the prover. We initialise the argument at round $k,$ with the vectors
$\mathbf{a}^{(k)} := \mathbf{a},$ $\mathbf{G}^{(k)} := \mathbf{G}$ and
$\mathbf{b}^{(k)} := \mathbf{b}.$ In each round $j = k, k-1, \cdots, 1$:
* the prover computes two values $L_j$ and $R_j$ by taking some inner product of
$\mathbf{a}^{(j)}$ with $\mathbf{G}^{(j)}$ and $\mathbf{b}^{(j)}$. Note that these are in some
sense "cross-terms": the lower half of $\mathbf{a}$ is used with the higher half of
$\mathbf{G}$ and $\mathbf{b}$, and vice versa:
$$
\begin{aligned}
L_j &= \langle\mathbf{a_{lo}^{(j)}}, \mathbf{G_{hi}^{(j)}}\rangle + [l_j]H + [\langle\mathbf{a_{lo}^{(j)}}, \mathbf{b_{hi}^{(j)}}\rangle] U\\
R_j &= \langle\mathbf{a_{hi}^{(j)}}, \mathbf{G_{lo}^{(j)}}\rangle + [r_j]H + [\langle\mathbf{a_{hi}^{(j)}}, \mathbf{b_{lo}^{(j)}}\rangle] U\\
\end{aligned}
$$
* the verifier issues a random challenge $u_j$;
* the prover uses $u_j$ to compress the lower and higher halves of $\mathbf{a}^{(j)}$,
thus producing a new vector of half the original length
$$\mathbf{a}^{(j-1)} = \mathbf{a_{hi}^{(j)}}\cdot u_j^{-1} + \mathbf{a_{lo}^{(j)}}\cdot u_j.$$
The vectors $\mathbf{G}^{(j)}$ and $\mathbf{b}^{(j)}$ are similarly compressed to give
$\mathbf{G}^{(j-1)}$ and $\mathbf{b}^{(j-1)}$.
* $\mathbf{a}^{(j-1)}$, $\mathbf{G}^{(j-1)}$ and $\mathbf{b}^{(j-1)}$ are input to the
next round $j - 1.$
Note that at the end of the last round $j = 1,$ we are left with $a := \mathbf{a}^{(0)}$,
$G := \mathbf{G}^{(0)}$, $b := \mathbf{b}^{(0)},$ each of length 1. The intuition is that
these final scalars, together with the challenges $\{u_j\}$ and "cross-terms"
$\{L_j, R_j\}$ from each round, encode the compression in each round. Since the prover did
not know the challenges $U, \{u_j\}$ in advance, they would have been unable to manipulate
the round compressions. Thus, checking a constraint on these final terms should enforce
that the compression had been performed correctly, and that the original $\mathbf{a}$
satisfied the relation before undergoing compression.
Note that $G, b$ are simply rearrangements of the publicly known $\mathbf{G}, \mathbf{b},$
with the round challenges $\{u_j\}$ mixed in: this means the verifier can compute $G, b$
independently and verify that the prover had provided those same values.
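The bookkeeping in a single round can be checked numerically. The sketch below tracks only the scalar parts over a toy field (the group elements $\mathbf{G}, H, U$ are omitted), and assumes $\mathbf{b}$ is folded with the inverse challenge so that the inner product telescopes against the round's cross-terms.

```python
# Scalar-only sketch of one folding round of the inner product argument over F_101.
p = 101

def ip(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) % p

a = [3, 1, 4, 1]                     # prover's coefficient vector
b = [1, 5, 25, 24]                   # powers of x = 5 mod 101 (5^3 = 125 ≡ 24)
half = len(a) // 2
a_lo, a_hi = a[:half], a[half:]
b_lo, b_hi = b[:half], b[half:]

u = 7                                # verifier's round challenge
u_inv = pow(u, -1, p)

cL = ip(a_lo, b_hi)                  # inner-product part of L_j
cR = ip(a_hi, b_lo)                  # inner-product part of R_j

a_next = [(hi * u_inv + lo * u) % p for lo, hi in zip(a_lo, a_hi)]
b_next = [(hi * u + lo * u_inv) % p for lo, hi in zip(b_lo, b_hi)]

# The folded inner product differs from the original only by the cross-terms:
lhs = ip(a_next, b_next)
rhs = (ip(a, b) + u * u % p * cL + u_inv * u_inv % p * cR) % p
assert lhs == rhs
```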

# Polynomials
Let $A(X)$ be a polynomial over $\mathbb{F}_p$ with formal indeterminate $X$. As an example,
$$
A(X) = a_0 + a_1 X + a_2 X^2 + a_3 X^3
$$
defines a degree-$3$ polynomial. $a_0$ is referred to as the constant term. Polynomials of
degree $n-1$ have $n$ coefficients. We will often want to compute the result of replacing
the formal indeterminate $X$ with some concrete value $x$, which we denote by $A(x)$.
> In mathematics this is commonly referred to as "evaluating $A(X)$ at a point $x$".
> The word "point" here stems from the geometrical usage of polynomials in the form
> $y = A(x)$, where $(x, y)$ is the coordinate of a point in two-dimensional space.
> However, the polynomials we deal with are almost always constrained to equal zero, and
> $x$ will be an [element of some field](fields.md). This should not be confused
> with points on an [elliptic curve](curves.md), which we also make use of, but never in
> the context of polynomial evaluation.
Important notes:
* The degree of a product of polynomials is the sum of the degrees of its factors;
  polynomial division subtracts the degree of the divisor.
$$\deg(A(X)B(X)) = \deg(A(X)) + \deg(B(X)),$$
$$\deg(A(X)/B(X)) = \deg(A(X)) -\deg(B(X)).$$
* Given a polynomial $A(X)$ of degree $n-1$, if we obtain $n$ evaluations of the
polynomial at distinct points then these evaluations perfectly define the polynomial. In
other words, given these evaluations we can obtain a unique polynomial $A(X)$ of degree
$n-1$ via polynomial interpolation.
* $[a_0, a_1, \cdots, a_{n-1}]$ is the **coefficient representation** of the polynomial
$A(X)$. Equivalently, we could use its **evaluation representation**
$$[(x_0, A(x_0)), (x_1, A(x_1)), \cdots, (x_{n-1}, A(x_{n-1}))]$$
at $n$ distinct points. Either representation uniquely specifies the same polynomial.
> #### (aside) Horner's rule
> Horner's rule allows for efficient evaluation of a polynomial of degree $n-1$, using
> only $n-1$ multiplications and $n-1$ additions. It is the following identity:
> $$\begin{aligned}a_0 &+ a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1} \\ &= a_0 + X\bigg( a_1 + X \Big( a_2 + \cdots + X(a_{n-2} + X a_{n-1}) \Big)\!\bigg),\end{aligned}$$
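A minimal implementation of Horner's rule over a prime field (the field $\mathbb{F}_{97}$ here is an arbitrary toy choice):

```python
# Horner's rule: evaluate a_0 + a_1 x + ... + a_{n-1} x^{n-1}
# using only n-1 multiplications and n-1 additions.
def horner(coeffs, x, p):
    acc = 0
    for a in reversed(coeffs):     # work inward from a_{n-1} down to a_0
        acc = (acc * x + a) % p
    return acc

# A(X) = 2 + 3X + X^3 over F_97, evaluated at x = 5: 2 + 15 + 125 = 142 ≡ 45
assert horner([2, 3, 0, 1], 5, 97) == 45
```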
## Fast Fourier Transform (FFT)
The FFT is an efficient way of converting between the coefficient and evaluation
representations of a polynomial. It evaluates the polynomial at the $n$th roots of unity
$\{\omega^0, \omega^1, \cdots, \omega^{n-1}\},$ where $\omega$ is a primitive $n$th root
of unity. By exploiting symmetries in the roots of unity, each round of the FFT reduces
the evaluation into a problem only half the size. Most commonly we use polynomials of
length some power of two, $n = 2^k$, and apply the halving reduction recursively.
### Motivation: Fast polynomial multiplication
In the coefficient representation, it takes $O(n^2)$ operations to multiply two
polynomials $A(X)\cdot B(X) = C(X)$:
$$
\begin{aligned}
A(X) &= a_0 + a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1}, \\
B(X) &= b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1}, \\
C(X) &= a_0\cdot (b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1}) \\
&+ a_1X\cdot (b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1})\\
&+ \cdots \\
&+ a_{n-1}X^{n-1} \cdot (b_0 + b_1X + b_2X^2 + \cdots + b_{n-1}X^{n-1}),
\end{aligned}
$$
where each of the $n$ terms in the first polynomial has to be multiplied by the $n$ terms
of the second polynomial.
In the evaluation representation, however, polynomial multiplication only requires $O(n)$
operations:
$$
\begin{aligned}
A&: \{(x_0, A(x_0)), (x_1, A(x_1)), \cdots, (x_{n-1}, A(x_{n-1}))\}, \\
B&: \{(x_0, B(x_0)), (x_1, B(x_1)), \cdots, (x_{n-1}, B(x_{n-1}))\}, \\
C&: \{(x_0, A(x_0)B(x_0)), (x_1, A(x_1)B(x_1)), \cdots, (x_{n-1}, A(x_{n-1})B(x_{n-1}))\},
\end{aligned}
$$
where each evaluation is multiplied pointwise.
This suggests the following strategy for fast polynomial multiplication:
1. Evaluate polynomials at all $n$ points;
2. Perform fast pointwise multiplication in the evaluation representation ($O(n)$);
3. Convert back to the coefficient representation.
The challenge now is how to **evaluate** and **interpolate** the polynomials efficiently.
Naively, evaluating a polynomial at $n$ points would require $O(n^2)$ operations (we use
the $O(n)$ Horner's rule at each point):
$$
\begin{bmatrix}
A(1) \\
A(\omega) \\
A(\omega^2) \\
\vdots \\
A(\omega^{n-1})
\end{bmatrix} =
\begin{bmatrix}
1&1&1&\dots&1 \\
1&\omega&\omega^2&\dots&\omega^{n-1} \\
1&\omega^2&\omega^{2\cdot2}&\dots&\omega^{2\cdot(n-1)} \\
\vdots&\vdots&\vdots& &\vdots \\
1&\omega^{n-1}&\omega^{2(n-1)}&\cdots&\omega^{(n-1)^2}\\
\end{bmatrix} \cdot
\begin{bmatrix}
a_0 \\
a_1 \\
a_2 \\
\vdots \\
a_{n-1}
\end{bmatrix}.
$$
For convenience, we will denote the matrices above as:
$$\hat{\mathbf{A}} = \mathbf{V}_\omega \cdot \mathbf{A}. $$
($\hat{\mathbf{A}}$ is known as the *Discrete Fourier Transform* of $\mathbf{A}$;
$\mathbf{V}_\omega$ is also called the *Vandermonde matrix*.)
### The (radix-2) Cooley-Tukey algorithm
Our strategy is to divide a DFT of size $n$ into two interleaved DFTs of size $n/2$. Given
the polynomial $A(X) = a_0 + a_1X + a_2X^2 + \cdots + a_{n-1}X^{n-1},$ we split it up into
even and odd terms:
$$
\begin{aligned}
A_{\text{even}} &= a_0 + a_2X + \cdots + a_{n-2}X^{\frac{n}{2} - 1}, \\
A_{\text{odd}} &= a_1 + a_3X + \cdots + a_{n-1}X^{\frac{n}{2} - 1}. \\
\end{aligned}
$$
To recover the original polynomial, we do
$A(X) = A_{\text{even}} (X^2) + X A_{\text{odd}}(X^2).$
Trying this out on points $\omega_n^i$ and $\omega_n^{\frac{n}{2} + i}$,
$i \in [0..\frac{n}{2}-1],$ we start to notice some symmetries:
$$
\begin{aligned}
A(\omega_n^i) &= A_{\text{even}} ((\omega_n^i)^2) + \omega_n^i A_{\text{odd}}((\omega_n^i)^2), \\
A(\omega_n^{\frac{n}{2} + i}) &= A_{\text{even}} ((\omega_n^{\frac{n}{2} + i})^2) + \omega_n^{\frac{n}{2} + i} A_{\text{odd}}((\omega_n^{\frac{n}{2} + i})^2) \\
&= A_{\text{even}} ((-\omega_n^i)^2) - \omega_n^i A_{\text{odd}}((-\omega_n^i)^2) \leftarrow\text{(negation lemma)} \\
&= A_{\text{even}} ((\omega_n^i)^2) - \omega_n^i A_{\text{odd}}((\omega_n^i)^2).
\end{aligned}
$$
Notice that we are only evaluating $A_{\text{even}}(X)$ and $A_{\text{odd}}(X)$ over half
the domain $\{(\omega_n^0)^2, (\omega_n^1)^2, \cdots, (\omega_n^{\frac{n}{2} -1})^2\} = \{\omega_{n/2}^i\}, i \in [0..\frac{n}{2}-1]$ (halving lemma).
This gives us all the terms we need to reconstruct $A(X)$ over the full domain
$\{\omega^0, \omega, \cdots, \omega^{n -1}\}$: we have transformed a
length-$n$ DFT into two length-$\frac{n}{2}$ DFTs.
We choose $n = 2^k$ to be a power of two (by zero-padding if needed), and apply this
divide-and-conquer strategy recursively. By the Master Theorem[^master-thm], this gives us
an evaluation algorithm with $O(n\log_2n)$ operations, also known as the Fast Fourier
Transform (FFT).
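The recursion above can be sketched directly. This toy implementation works over $\mathbb{F}_{97}$ with $n = 8$ (any prime $p$ with $n \mid p - 1$ would do), searches for a suitable root of unity, and checks the result against naive evaluation.

```python
# Recursive radix-2 FFT (NTT) over F_p, following the even/odd split above.
p, n = 97, 8

# find a primitive n-th root of unity in F_97 (97 - 1 = 96 is divisible by 8)
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

def fft(coeffs, w):
    if len(coeffs) == 1:
        return coeffs
    evens = fft(coeffs[0::2], w * w % p)   # A_even, over the squared (half) domain
    odds  = fft(coeffs[1::2], w * w % p)   # A_odd
    half = len(coeffs) // 2
    out = [0] * len(coeffs)
    x = 1
    for i in range(half):
        out[i]        = (evens[i] + x * odds[i]) % p   # A(w^i)
        out[i + half] = (evens[i] - x * odds[i]) % p   # A(w^{n/2+i}) = A(-w^i)
        x = x * w % p
    return out

a = [3, 1, 4, 1, 5, 9, 2, 6]
evals = fft(a, omega)

# agrees with naive O(n^2) evaluation at each root of unity
naive = [sum(c * pow(omega, i * j, p) for j, c in enumerate(a)) % p for i in range(n)]
assert evals == naive
```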
### Inverse FFT
So we've evaluated our polynomials and multiplied them pointwise. What remains is to
convert the product from the evaluation representation back to coefficient representation.
To do this, we simply call the FFT on the evaluation representation. However, this time we
also:
- replace $\omega^i$ by $\omega^{-i}$ in the Vandermonde matrix, and
- multiply our final result by a factor of $1/n$.
In other words:
$$\mathbf{A} = \frac{1}{n} \mathbf{V}_{\omega^{-1}} \cdot \hat{\mathbf{A}}. $$
(To understand why the inverse FFT has a similar form to the FFT, refer to Slide 13-1 of
[^ifft]. The below image was also taken from [^ifft].)
![](https://i.imgur.com/lSw30zo.png)
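Putting the pieces together, here is a sketch of the fast multiplication strategy over a toy field $\mathbb{F}_{97}$: evaluate with the FFT, multiply pointwise, then interpolate back with the inverse FFT (same transform with $\omega^{-1}$, scaled by $1/n$).

```python
# Fast polynomial multiplication: FFT, pointwise multiply, inverse FFT, over F_97.
p, n = 97, 8
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

def fft(coeffs, w):
    if len(coeffs) == 1:
        return coeffs
    evens = fft(coeffs[0::2], w * w % p)
    odds  = fft(coeffs[1::2], w * w % p)
    half = len(coeffs) // 2
    out = [0] * len(coeffs)
    x = 1
    for i in range(half):
        out[i], out[i + half] = (evens[i] + x * odds[i]) % p, (evens[i] - x * odds[i]) % p
        x = x * w % p
    return out

def ifft(evals, w):
    # same FFT with w^{-1}, then scale by 1/n
    n_inv = pow(len(evals), -1, p)
    return [v * n_inv % p for v in fft(evals, pow(w, -1, p))]

# Multiply A(X) = 1 + 2X and B(X) = 3 + X, zero-padded to length n:
A = [1, 2, 0, 0, 0, 0, 0, 0]
B = [3, 1, 0, 0, 0, 0, 0, 0]
C = ifft([x * y % p for x, y in zip(fft(A, omega), fft(B, omega))], omega)
assert C == [3, 7, 2, 0, 0, 0, 0, 0]   # (1 + 2X)(3 + X) = 3 + 7X + 2X^2
```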
## The Schwartz-Zippel lemma
The Schwartz-Zippel lemma informally states that "different polynomials are different at
most points." Formally, it can be written as follows:
> Let $p(x_1, x_2, \cdots, x_n)$ be a nonzero polynomial of $n$ variables with degree $d$.
> Let $S$ be a finite set of numbers with at least $d$ elements in it. If we choose random
> $\alpha_1, \alpha_2, \cdots, \alpha_n$ from $S$,
> $$\text{Pr}[p(\alpha_1, \alpha_2, \cdots, \alpha_n) = 0] \leq \frac{d}{|S|}.$$
In the familiar univariate case $p(X)$, this reduces to saying that a nonzero polynomial
of degree $d$ has at most $d$ roots.
The Schwartz-Zippel lemma is used in polynomial equality testing. Given two multi-variate
polynomials $p_1(x_1,\cdots,x_n)$ and $p_2(x_1,\cdots,x_n)$ of degrees $d_1, d_2$
respectively, we can test if
$p_1(\alpha_1, \cdots, \alpha_n) - p_2(\alpha_1, \cdots, \alpha_n) = 0$ for random
$\alpha_1, \cdots, \alpha_n \leftarrow S,$ where the size of $S$ is at least
$|S| \geq (d_1 + d_2).$ If the two polynomials are identical, this will always be true,
whereas if the two polynomials are different then the equality holds with probability at
most $\frac{\max(d_1,d_2)}{|S|}$.
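The univariate case can be checked exhaustively over a small toy field: two distinct polynomials whose difference has degree $d$ agree on at most $d$ points.

```python
# Univariate Schwartz–Zippel over F_97: distinct polynomials whose difference
# has degree d agree on at most d points, so a random evaluation point
# distinguishes them with probability at least 1 - d/|S|.
p = 97

def ev(coeffs, x):
    acc = 0
    for a in reversed(coeffs):     # Horner evaluation
        acc = (acc * x + a) % p
    return acc

p1 = [1, 2, 0, 5]        # 1 + 2X + 5X^3
p2 = [4, 0, 3, 5]        # 4 + 3X^2 + 5X^3; the difference has degree 2
agree = sum(1 for x in range(p) if ev(p1, x) == ev(p2, x))
assert agree <= 2        # p1 - p2 is nonzero of degree 2, hence has at most 2 roots
```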
## Vanishing polynomial
Consider the order-$n$ multiplicative subgroup $\mathcal{H}$ with primitive root of unity
$\omega$. For all $\omega^i \in \mathcal{H}, i \in [0..n-1],$ we have
$(\omega^i)^n = (\omega^n)^i = (\omega^0)^i = 1.$ In other words, every element of
$\mathcal{H}$ fulfils the equation
$$
\begin{aligned}
Z_H(X) &= X^n - 1 \\
&= (X-\omega^0)(X-\omega^1)(X-\omega^2)\cdots(X-\omega^{n-1}),
\end{aligned}
$$
meaning every element is a root of $Z_H(X).$ We call $Z_H(X)$ the **vanishing polynomial**
over $\mathcal{H}$ because it evaluates to zero on all elements of $\mathcal{H}.$
This comes in particularly handy when checking polynomial constraints. For instance, to
check that $A(X) + B(X) = C(X)$ over $\mathcal{H},$ we simply have to check that
$A(X) + B(X) - C(X)$ is some multiple of $Z_H(X)$. In other words, if dividing our
constraint by the vanishing polynomial still yields some polynomial
$\frac{A(X) + B(X) - C(X)}{Z_H(X)} = H(X),$ we are satisfied that $A(X) + B(X) - C(X) = 0$
over $\mathcal{H}.$
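A quick numerical check of the vanishing polynomial over a toy order-$8$ subgroup of $\mathbb{F}_{97}^\times$: $Z_H(X) = X^8 - 1$ vanishes on every element of $\mathcal{H}$, and nowhere else.

```python
# Z_H(X) = X^n - 1 vanishes exactly on the order-n subgroup H of F_97^*.
p, n = 97, 8
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

H = [pow(omega, i, p) for i in range(n)]
assert all((pow(x, n, p) - 1) % p == 0 for x in H)                      # zero on H
assert all((pow(x, n, p) - 1) % p != 0 for x in range(1, p) if x not in H)
```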
## Lagrange basis functions
> TODO: explain what a basis is in general (briefly).
Polynomials are commonly written in the monomial basis (e.g. $X, X^2, ... X^n$). However,
when working over a multiplicative subgroup of order $n$, we find a more natural expression
in the Lagrange basis.
Consider the order-$n$ multiplicative subgroup $\mathcal{H}$ with primitive root of unity
$\omega$. The Lagrange basis corresponding to this subgroup is a set of functions
$\{\mathcal{L}_i\}_{i = 0}^{n-1}$, where
$$
\mathcal{L_i}(\omega^j) = \begin{cases}
1 & \text{if } i = j, \\
0 & \text{otherwise.}
\end{cases}
$$
We can write this more compactly as $\mathcal{L_i}(\omega^j) = \delta_{ij},$ where
$\delta$ is the Kronecker delta function.
Now, we can write our polynomial as a linear combination of Lagrange basis functions,
$$A(X) = \sum_{i = 0}^{n-1} a_i\mathcal{L_i}(X), X \in \mathcal{H},$$
which is equivalent to saying that $A(X)$ evaluates to $a_0$ at $\omega^0$,
to $a_1$ at $\omega^1$, to $a_2$ at $\omega^2, \cdots,$ and so on.
When working over a multiplicative subgroup, the Lagrange basis function has a convenient
sparse representation of the form
$$
\mathcal{L}_i(X) = \frac{c_i\cdot(X^{n} - 1)}{X - \omega^i},
$$
where $c_i$ is the barycentric weight. (To understand how this form was derived, refer to
[^barycentric].) For $i = 0,$ we have
$c_0 = 1/n \implies \mathcal{L}_0(X) = \frac{1}{n} \frac{(X^{n} - 1)}{X - 1}$.
Suppose we are given a set of evaluation points $\{x_0, x_1, \cdots, x_{n-1}\}$.
Since we cannot assume that the $x_i$'s form a multiplicative subgroup, we consider also
the Lagrange polynomials $\mathcal{L}_i$'s in the general case. Then we can construct:
$$
\mathcal{L}_i(X) = \prod_{j\neq i}\frac{X - x_j}{x_i - x_j}, i \in [0..n-1].
$$
Here, every $X = x_j \neq x_i$ will produce a zero numerator term $(x_j - x_j),$ causing
the whole product to evaluate to zero. On the other hand, $X= x_i$ will evaluate to
$\frac{x_i - x_j}{x_i - x_j}$ at every term, resulting in an overall product of one. This
gives the desired Kronecker delta behaviour $\mathcal{L_i}(x_j) = \delta_{ij}$ on the
set $\{x_0, x_1, \cdots, x_{n-1}\}$.
### Lagrange interpolation
Given a polynomial in its evaluation representation
$$A: \{(x_0, A(x_0)), (x_1, A(x_1)), \cdots, (x_{n-1}, A(x_{n-1}))\},$$
we can reconstruct its coefficient form in the Lagrange basis:
$$A(X) = \sum_{i = 0}^{n-1} A(x_i)\mathcal{L_i}(X), $$
where $X \in \{x_0, x_1,\cdots, x_{n-1}\}.$
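A sketch of Lagrange interpolation in its general form, evaluating the interpolating polynomial at a fresh point over a toy field (the polynomial and nodes below are arbitrary illustrative choices):

```python
# Lagrange interpolation over F_97 from an evaluation representation.
p = 97

def interpolate(points, x):
    # evaluate the unique degree < n polynomial through `points` at x
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

# A(X) = 3 + 2X + X^2 sampled at three points uniquely determines it:
pts = [(1, 6), (2, 11), (5, 38)]
assert interpolate(pts, 7) == (3 + 2 * 7 + 7 * 7) % p   # recovers A(7) = 66
```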
## References
[^master-thm]: [Dasgupta, S., Papadimitriou, C. H., & Vazirani, U. V. (2008). "Algorithms" (ch. 2). New York: McGraw-Hill Higher Education.](https://people.eecs.berkeley.edu/~vazirani/algorithms/chap2.pdf)
[^ifft]: [Golin, M. (2016). "The Fast Fourier Transform and Polynomial Multiplication" [lecture notes], COMP 3711H Design and Analysis of Algorithms, Hong Kong University of Science and Technology.](http://www.cs.ust.hk/mjg_lib/Classes/COMP3711H_Fall16/lectures/FFT_Slides.pdf)
[^barycentric]: [Berrut, J. and Trefethen, L. (2004). "Barycentric Lagrange Interpolation."](https://people.maths.ox.ac.uk/trefethen/barycentric.pdf)

## Recursion
> Alternative terms: Induction; Accumulation scheme; Proof-carrying data
The computation of $G$ requires a length-$2^k$ multiexponentiation
$\langle \mathbf{G}, \mathbf{s}\rangle,$ where $\mathbf{s}$ is composed of the round
challenges $u_1, \cdots, u_k$ arranged in a binary counting structure. This is the
linear-time computation that we want to amortise across a batch of proof instances.
Instead of computing $G,$ notice that we can express $G$ as a commitment to a polynomial
$$G = \text{Commit}(\sigma, g(X, u_1, \cdots, u_k)),$$
where $g(X, u_1, \cdots, u_k) := \prod_{i=1}^k (u_i + u_i^{-1}X^{2^{i-1}})$ is a
polynomial with degree $2^k - 1.$
| | |
| -------- | -------- |
| <img src="https://i.imgur.com/vMXKFDV.png" width=1900> | Since $G$ is a commitment, it can be checked in an inner product argument. The verifier circuit witnesses $G$ and brings $G, u_1, \cdots, u_k$ out as public inputs to the proof $\pi.$ The next verifier instance checks $\pi$ using the inner product argument; this includes checking that $G = \text{Commit}(g(X, u_1, \cdots, u_k))$ evaluates at some random point to the expected value for the given challenges $u_1, \cdots, u_k.$ Recall from the [previous section](#Polynomial-commitment-using-inner-product-argument) that this check only requires $\log d$ work. <br><br> At the end of checking $\pi$ and $G,$ the circuit is left with a new $G',$ along with the $u_1', \cdots, u_k'$ challenges sampled for the check. To fully accept $\pi$ as valid, we should perform a linear-time computation of $G' = \langle\mathbf{G}, \mathbf{s}'\rangle$. Once again, we delay this computation by witnessing $G'$ and bringing $G', u_1', \cdots, u_k'$ out as public inputs to the next proof. <br><br> This goes on from one proof instance to the next, until we are satisfied with the size of our batch of proofs. We finally perform a single linear-time computation, thus deciding the validity of the whole batch. |
We recall from the section [Cycles of curves](curves.md#cycles-of-curves) that we can
instantiate this protocol over a two-cycle, where a proof produced by one curve is
efficiently verified in the circuit of the other curve. However, some of these verifier
checks can actually be efficiently performed in the native circuit; these are "deferred"
to the next native circuit (see diagram below) instead of being immediately passed over to
the other curve.
![](https://i.imgur.com/l4HrYgE.png)

# [WIP] UltraPLONK arithmetisation
We call the field over which the circuit is defined $\mathbb{F} = \mathbb{F}_p$.
Let $n = 2^k$, and assume that $\omega$ is a primitive root of unity of order $n$ in
$\mathbb{F}^\times$, so that $\mathbb{F}^\times$ has a multiplicative subgroup
$\mathcal{H} = \{1, \omega, \omega^2, \cdots, \omega^{n-1}\}$. This subgroup admits a
Lagrange basis, with one basis function corresponding to each of its elements.
## Polynomial rules
A polynomial rule defines a constraint that must hold between its specified columns at
every row (i.e. at every element in the multiplicative subgroup).
e.g.
```text
a * sa + b * sb + a * b * sm + c * sc + PI = 0
```
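For instance, with $\mathbb{F}_{97}$ standing in for the circuit's field, the fixed selector values in this rule can be toggled to turn it into either an addition gate or a multiplication gate:

```python
# The standard rule above, with selectors toggling the gate's behaviour over F_97.
p = 97

def gate(a, b, c, sa, sb, sm, sc, pi=0):
    return (a * sa + b * sb + a * b * sm + c * sc + pi) % p

# addition gate: sa = sb = 1, sm = 0, sc = -1 enforces a + b = c
assert gate(3, 4, 7, sa=1, sb=1, sm=0, sc=-1) == 0
# multiplication gate: sm = 1, sc = -1, sa = sb = 0 enforces a * b = c
assert gate(3, 4, 12, sa=0, sb=0, sm=1, sc=-1) == 0
```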
## Columns
- **fixed (i.e. "selector") columns**: fixed for all instances of a particular circuit.
These columns toggle parts of a polynomial rule "on" or "off" to form a "custom gate".
- **advice columns**: variable values assigned in each instance of the circuit.
Corresponds to the prover's secret witness.
- **public input**: like advice columns, but publicly known values.
Each column is a vector of $n$ values, e.g. $\mathbf{a} = [a_0, a_1, \cdots, a_{n-1}]$. We
can think of the vector as the evaluation form of the column polynomial
$a(X), X \in \mathcal{H}.$ To recover the coefficient form, we can use
[Lagrange interpolation](polynomials.md#lagrange-interpolation), such that
$a(\omega^i) = a_i.$
## Copy constraints
- Define permutation between a set of columns, e.g. $\sigma(a, b, c)$
- Copy specific cells between these columns, e.g. $b_1 = c_0$
- Construct permuted columns which should evaluate to same value as original columns
## Permutation grand product
$$Z(\omega^i) := \prod_{0 \leq j < i} \frac{C_k(\omega^j) + \beta\delta^k \omega^j + \gamma}{C_k(\omega^j) + \beta S_k(\omega^j) + \gamma},$$
where $i = 0, \cdots, n-1$ indexes over the size of the multiplicative subgroup, and
$k = 0, \cdots, m-1$ indexes over the advice columns involved in the permutation. This is
a running product, where each term includes the cumulative product of the terms before it.
> TODO: what is $\delta$? keep columns linearly independent
Check the constraints:
1. First term is equal to one
$$\mathcal{L}_0(X) \cdot (1 - Z(X)) = 0$$
2. Running product is well-constructed. For each row, we check that this holds:
$$Z(\omega^i) \cdot{(C(\omega^i) + \beta S_k(\omega^i) + \gamma)} - Z(\omega^{i-1}) \cdot{(C(\omega^i) + \delta^k \beta \omega^i + \gamma)} = 0$$
Rearranging gives
$$Z(\omega^i) = Z(\omega^{i-1}) \frac{C(\omega^i) + \beta\delta^k \omega^i + \gamma}{C(\omega^i) + \beta S_k(\omega^i) + \gamma},$$
which is how we defined the grand product polynomial in the first place.
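A toy single-column check of the grand product over $\mathbb{F}_{97}$ (so $m = 1$ and $\delta^k = 1$): the column repeats a value across the two rows swapped by the permutation, so the full product telescopes back to one.

```python
# Toy single-column permutation grand product over F_97.
p, n = 97, 8
omega = next(w for w in range(2, p) if pow(w, n, p) == 1 and pow(w, n // 2, p) != 1)

c = [5, 8, 2, 8, 1, 4, 7, 6]          # advice column satisfying c[1] = c[3]
sigma = [0, 3, 2, 1, 4, 5, 6, 7]      # permutation enforcing that copy constraint
beta, gamma = 11, 29                  # verifier challenges (arbitrary toy values)

Z = 1
for j in range(n):
    num = (c[j] + beta * pow(omega, j, p) + gamma) % p
    den = (c[j] + beta * pow(omega, sigma[j], p) + gamma) % p
    Z = Z * num * pow(den, -1, p) % p
assert Z == 1                          # the running product returns to one
```

Because the numerator and denominator range over the same multiset of terms whenever the column respects the permutation, the product over all rows is exactly one.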
### Lookup
Reference: [Generic Lookups with PLONK (DRAFT)](/LTPc5f-3S0qNF6MtwD-Tdg?view)
### Vanishing argument
We want to check that the expressions defined by the gate constraints, permutation
constraints and lookup constraints evaluate to zero at all elements in the multiplicative
subgroup. To do this, the prover collapses all the expressions into one polynomial
$$H(X) = \sum_{i=0}^e y^i E_i(X),$$
where $e$ is the number of expressions and $y$ is a random challenge used to keep the
constraints linearly independent. The prover then divides this by the vanishing polynomial
(see section: [Vanishing polynomial](polynomials.md#vanishing-polynomial)) and commits to
the resulting quotient
$$\text{Commit}(Q(X)), \text{where } Q(X) = \frac{H(X)}{Z_H(X)}.$$
The verifier responds with a random evaluation point $x,$ to which the prover replies with
the claimed evaluations $q = Q(x), \{e_i\}_{i=0}^e = \{E_i(x)\}_{i=0}^e.$ Now, all that
remains for the verifier to check is that the evaluations satisfy
$$q \stackrel{?}{=} \frac{\sum_{i=0}^e y^i e_i}{Z_H(x)}.$$
Notice that we have yet to check that the committed polynomials indeed evaluate to the
claimed values at
$x, q \stackrel{?}{=} Q(x), \{e_i\}_{i=0}^e \stackrel{?}{=} \{E_i(x)\}_{i=0}^e.$
This check is handled by the polynomial commitment scheme (described in the next section).

# Concepts
First we'll describe the concepts behind zero-knowledge proof systems; the
*arithmetization* (kind of circuit description) used by Halo 2; and the
abstractions we use to build circuit implementations.

# UltraPLONK Arithmetization
The arithmetization used by Halo 2 comes from [PLONK](https://eprint.iacr.org/2019/953), or
more precisely its extension UltraPLONK that supports custom gates and lookup arguments. We'll
call it ***UPA*** (***UltraPLONK arithmetization***).
> The term UPA and some of the other terms we use to describe it are not used in the PLONK
> paper.
***UPA circuits*** are defined in terms of a rectangular matrix of values. We refer to
***rows***, ***columns***, and ***cells*** of this matrix with the conventional meanings.
A UPA circuit depends on a ***configuration***:
* A finite field $\mathbb{F}$, where cell values (for a given statement and witness) will be
elements of $\mathbb{F}$.
* The number of columns in the matrix, and a specification of each column as being
***fixed***, ***advice***, or ***instance***. Fixed columns are fixed by the circuit;
advice columns correspond to witness values; and instance columns are normally used for
public inputs (technically, they can be used for any elements shared between the prover
and verifier).
* A subset of the columns that can participate in equality constraints.
* A ***polynomial degree bound***.
* A sequence of ***polynomial constraints***. These are multivariate polynomials over
$\mathbb{F}$ that must evaluate to zero *for each row*. The variables in a polynomial
constraint may refer to a cell in a given column of the current row, or a given column of
another row relative to this one (with wrap-around, i.e. taken modulo $n$). The maximum
degree of each polynomial is given by the polynomial degree bound.
* A sequence of ***lookup arguments*** defined over tuples of ***input expressions***
(which are multivariate polynomials as above) and ***table columns***.
A UPA circuit also defines:
* The number of rows $n$ in the matrix. $n$ must correspond to the size of a multiplicative
subgroup of $\mathbb{F}^\times$; typically a power of two.
* A sequence of ***equality constraints***, which specify that two given cells must have equal
values.
* The values of the fixed columns at each row.
From a circuit description we can generate a ***proving key*** and a ***verification key***,
which are needed for the operations of proving and verification for that circuit.
> Note that we specify the ordering of columns, polynomial constraints, lookup arguments, and
> equality constraints, even though these do not affect the meaning of the circuit. This makes
> it easier to define the generation of proving and verification keys as a deterministic
> process.
Typically, a configuration will define polynomial constraints that are switched off and on by
***selectors*** defined in fixed columns. For example, a constraint $q_i \cdot p(...) = 0$ can
be switched off for a particular row $i$ by setting $q_i = 0$. In this case we sometimes refer
to a set of constraints, controlled by a set of selector columns designed to be used
together, as a ***gate***. Typically there will be a ***standard gate*** that supports generic
operations like field multiplication and division, and possibly also ***custom gates*** that
support more specialized operations.

# Chips
In order to combine functionality from several cores, we use a ***chip***. To implement a
chip, we define a set of fixed, advice, and instance columns, and then specify how they
should be distributed between cores.
In the simplest case, each core will use columns disjoint from the other cores. However, it
is allowed to share a column between cores. It is important to optimize the number of advice
columns in particular, because that affects proof size.
The result (possibly after optimization) is a UPA configuration. Our circuit implementation
will be parameterized on a chip, and can use any features of the supported cores via the chip.
Our hope is that less expert users will normally be able to find an existing chip that
supports the operations they need, or only have to make minor modifications to an existing
chip. Expert users will have full control to do the kind of
[circuit optimizations](https://zips.z.cash/protocol/canopy.pdf#circuitdesign)
[that ECC is famous for](https://electriccoin.co/blog/cultivating-sapling-faster-zksnarks/) 🙂.

# Cores
The previous section gives a fairly low-level description of a circuit. When implementing circuits we will
typically use a higher-level API which aims for the desirable characteristics of auditability,
efficiency, modularity, and expressiveness.
Some of the terminology and concepts used in this API are taken from an analogy with
integrated circuit design and layout. [As for integrated circuits](https://opencores.org/),
the above desirable characteristics are easier to obtain by composing ***cores*** that provide
efficient pre-built implementations of particular functionality.
For example, we might have cores that implement particular cryptographic primitives such as a
hash function or cipher, or algorithms like scalar multiplication or pairings.
In UPA, it is possible to build up arbitrary logic just from standard gates that do field
multiplication and addition. However, very significant efficiency gains can be obtained by
using custom gates.
Using our API, we define cores that "know" how to use particular sets of custom gates. This
creates an abstraction layer that isolates the implementation of a high-level circuit from the
complexity of using custom gates directly.
> Even if we sometimes need to "wear two hats", by implementing both a high-level circuit and
> the cores that it uses, the intention is that this separation will result in code that is
> easier to understand, audit, and maintain/reuse. This is partly because some potential
> implementation errors are ruled out by construction.
Gates in UPA refer to cells by ***relative references***, i.e. to the cell in a given column,
and the row at a given offset relative to the one in which the gate's selector is set. We call
this an ***offset reference*** when the offset is nonzero (i.e. offset references are a subset
of relative references).
Relative references contrast with ***absolute references*** used in equality constraints,
which can point to any cell.
The motivation for offset references is to reduce the number of columns needed in the
configuration, which reduces proof size. If we did not have offset references then we would
need a column to hold each value referred to by a custom gate, and we would need to use
equality constraints to copy values from other cells of the circuit into that column. With
offset references, we not only need fewer columns; we also do not need equality constraints to
be supported for all of those columns, which improves efficiency.
In R1CS (another arithmetization which may be more familiar to some readers, but don't worry
if it isn't), a circuit consists of a "sea of gates" with no semantically significant ordering.
Because of offset references, the order of rows in a UPA circuit, on the other hand, *is*
significant. We're going to make some simplifying assumptions and define some abstractions to
tame the resulting complexity: the aim will be that, [at the gadget level](gadgets.md) where
we do most of our circuit construction, we will not have to deal with relative references or
with gate layout explicitly.
We will partition a circuit into ***regions***, where each region contains a disjoint subset
of cells, and relative references only ever point *within* a region. Part of the responsibility
of a core implementation is to ensure that gates that make offset references are laid out in
the correct positions in a region.
Given the set of regions and their ***shapes***, we will use a separate ***floor planner***
to decide where (i.e. at what starting row) each region is placed. There is a default floor
planner that implements a very general algorithm, but you can write your own floor planner if
you need to.
Floor planning will in general leave gaps in the matrix, because the gates in a given row did
not use all available columns. These are filled in —as far as possible— by gates that do
not require offset references, which allows them to be placed on any row.
Cores can also define lookup tables. If more than one table is defined for the same lookup
argument, we can use a ***tag column*** to specify which table is used on each row. It is also
possible to perform a lookup in the union of several tables (limited by the polynomial degree
bound).

# Gadgets
When implementing a circuit, we could use the features of the cores we've selected directly
via the chip. Typically, though, we will use them via ***gadgets***. This indirection is
useful because, for reasons of efficiency and limitations imposed by UPA, the core interfaces
will often be dependent on low-level implementation details. The gadget interface can provide
a more convenient and stable API that abstracts away from extraneous detail.
For example, consider a hash function such as SHA-256. The interface of a core supporting
SHA-256 might be dependent on internals of the hash function design such as the separation
between message schedule and compression function. The corresponding gadget interface can
provide a more convenient and familiar `update`/`finalize` API, and can also handle parts
of the hash function that do not need core support, such as padding. This is similar to how
[accelerated](https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html)
[instructions](https://developer.arm.com/documentation/ddi0514/g/introduction/about-the-cortex-a57-processor-cryptography-engine)
for cryptographic primitives on CPUs are typically accessed via software libraries, rather
than directly.
Gadgets can also provide modular and reusable abstractions for circuit programming
at a higher level, similar to their use in libraries such as
[libsnark](https://github.com/christianlundkvist/libsnark-tutorial) and
[bellman](https://electriccoin.co/blog/bellman-zksnarks-in-rust/). As well as abstracting
*functions*, they can also abstract *types*, such as elliptic curve points or integers of
specific sizes.

# Proof systems
The aim of any ***proof system*** is to be able to prove interesting mathematical or
cryptographic ***statements***.
Typically, in a given protocol we will want to prove families of statements that differ
in their ***public inputs***. The prover will also need to show that they know some
***private inputs*** that make the statement hold.
To do this we write down a ***relation***, $\mathcal{R}$, that specifies which
combinations of public and private inputs are valid.
> The terminology above is intended to be aligned with the
> [ZKProof Community Reference](https://docs.zkproof.org/reference#latest-version).
To be precise, we should distinguish between the relation $\mathcal{R}$, and its
implementation to be used in a proof system. We call the latter a ***circuit***.
The language that we use to express circuits for a particular proof system is called an
***arithmetization***. Usually, an arithmetization will define circuits in terms of
polynomial constraints on variables over a field.
> The _process_ of expressing a particular relation as a circuit is also sometimes called
> "arithmetization", but we'll avoid that usage.
To create a proof of a statement, the prover will need to know the private inputs,
and also intermediate values, called ***advice*** values, that are used by the circuit.
We assume that we can compute advice values efficiently from the private and public inputs.
The particular advice values will depend on how we write the circuit, not only on the
high-level statement.
The private inputs and advice values are collectively called a ***witness***.
> Some authors use "witness" as just a synonym for private inputs. But in our usage,
> a witness includes advice, i.e. it includes all values that the prover supplies to
> the circuit.
For example, suppose that we want to prove knowledge of a preimage $x$ of a
hash function $H$ for a digest $y$:
* The private input would be the preimage $x$.
* The public input would be the digest $y$.
* The relation would be $\{(x, y) : H(x) = y\}$.
* For a particular public input $Y$, the statement would be: $\{(x) : H(x) = Y\}$.
* The advice would be all of the intermediate values in the circuit implementing the
hash function. The witness would be $x$ and the advice.
A ***Non-interactive Argument*** allows a ***prover*** to create a ***proof*** for a
given statement and witness. The proof is data that can be used to convince a ***verifier***
that _there exists_ a witness for which the statement holds. The security property that
such proofs cannot falsely convince a verifier is called ***soundness***.
A ***Non-interactive Argument of Knowledge*** (***NARK***) further convinces the verifier
that the prover _knew_ a witness for which the statement holds. This security property is
called ***knowledge soundness***, and it implies soundness.
In practice knowledge soundness is more useful for cryptographic protocols than soundness:
if we are interested in whether Alice holds a secret key in some protocol, say, we need
Alice to prove that _she knows_ the key, not just that it exists.
Knowledge soundness is formalized by saying that an ***extractor***, which can observe
precisely how the proof is generated, must be able to compute the witness.
> This property is subtle given that proofs can be ***malleable***. That is, depending on the
> proof system it may be possible to take an existing proof (or set of proofs) and, without
> knowing the witness(es), modify it/them to produce a distinct proof of the same or a related
> statement. Higher-level protocols that use malleable proof systems need to take this into
> account.
>
> Even without malleability, proofs can also potentially be ***replayed***. For instance,
> we would not want Alice in our example to be able to present a proof generated by someone
> else, and have that be taken as a demonstration that she knew the key.
If a proof yields no information about the witness (other than that a witness exists and was
known to the prover), then we say that the proof system is ***zero knowledge***.
If a proof system produces short proofs —i.e. of length polylogarithmic in the circuit
size— then we say that it is ***succinct***. A succinct NARK is called a ***SNARK***
(***Succinct Non-Interactive Argument of Knowledge***).
> By this definition, a SNARK need not have verification time polylogarithmic in the circuit
> size. Some papers use the term ***efficient*** to describe a SNARK with that property, but
> we'll avoid that term since it's ambiguous for SNARKs that support amortized or recursive
> verification, which we'll get to later.
A ***zk-SNARK*** is a zero-knowledge SNARK.

# Gadgets
In this section we document some example gadgets and chip designs that are suitable for
Halo 2.
> Neither these gadgets, nor their implementations, have been reviewed, and they should
> not be used in production.

# SHA-256
## Specification
SHA-256 is specified in [NIST FIPS PUB 180-4](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf).
Unlike the specification, we use $\boxplus$ for addition modulo $2^{32}$, and $+$ for
field addition. $\oplus$ is used for XOR.
## Gadget interface
SHA-256 maintains state in eight 32-bit variables. It processes input as 512-bit blocks,
but internally splits these blocks into 32-bit chunks. We therefore designed the SHA-256
gadget to consume input in 32-bit chunks.
## Chip instructions
The SHA-256 gadget requires a chip with the following instructions:
```rust
# extern crate halo2;
# use halo2::plonk::Error;
# use std::fmt;
#
# trait Chip: Sized {}
# trait Layouter<C: Chip> {}
const BLOCK_SIZE: usize = 16;
const DIGEST_SIZE: usize = 8;
pub trait Sha256Instructions: Chip {
/// Variable representing the SHA-256 internal state.
type State: Clone + fmt::Debug;
/// Variable representing a 32-bit word of the input block to the SHA-256 compression
/// function.
type BlockWord: Copy + fmt::Debug;
/// Places the SHA-256 IV in the circuit, returning the initial state variable.
fn initialization_vector(layouter: &mut impl Layouter<Self>) -> Result<Self::State, Error>;
/// Starting from the given initial state, processes a block of input and returns the
/// final state.
fn compress(
layouter: &mut impl Layouter<Self>,
initial_state: &Self::State,
input: [Self::BlockWord; BLOCK_SIZE],
) -> Result<Self::State, Error>;
/// Converts the given state into a message digest.
fn digest(
layouter: &mut impl Layouter<Self>,
state: &Self::State,
) -> Result<[Self::BlockWord; DIGEST_SIZE], Error>;
}
```
TODO: Add instruction for computing padding.
This set of instructions was chosen to strike a balance between the reusability of the
instructions, and the scope for chips to internally optimise them. In particular, we
considered splitting the compression function into its constituent parts (Ch, Maj etc),
and providing a compression function gadget that implemented the round logic. However,
this would prevent chips from using relative references between the various parts of a
compression round. Having an instruction that implements all compression rounds is also
similar to the Intel SHA extensions, which provide an instruction that performs multiple
compression rounds.

# 16-bit table chip for SHA-256
This chip implementation is based around a single 16-bit lookup table. It requires a
minimum of $2^{16}$ circuit rows, and is therefore suitable for use in larger circuits.
We target a maximum constraint degree of $9$. That will allow us to handle constraining
carries and "small pieces" to a range of up to $\{0..7\}$ in one row.
## Compression round
There are $64$ compression rounds. Each round takes 32-bit values $A, B, C, D, E, F, G, H$
as input, and performs the following operations:
$$
\begin{array}{rcl}
Ch(E, F, G) &=& (E \wedge F) \oplus (¬E \wedge G) \\
Maj(A, B, C) &=& (A \wedge B) \oplus (A \wedge C) \oplus (B \wedge C) \\
&=& count(A, B, C) \geq 2 \\
\Sigma_0(A) &=& (A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋙ 22) \\
\Sigma_1(E) &=& (E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋙ 25) \\
H' &=& H + Ch(E, F, G) + \Sigma_1(E) + K_t + W_t \\
E_{new} &=& reduce_6(H' + D) \\
A_{new} &=& reduce_7(H' + Maj(A, B, C) + \Sigma_0(A))
\end{array}
$$
where $reduce_i$ must handle a carry $0 \leq \mathit{carry} < i$.
![The SHA-256 compression function](./compression.png)
Define $\mathtt{spread}$ as a table mapping a $16$-bit input to an output interleaved with
zero bits. We do not require a separate table for range checks because $\mathtt{spread}$
can be used.
### Modular addition
To implement addition modulo $2^{32}$, we note that this is equivalent to adding the
operands using field addition, and then masking away all but the lowest 32 bits of the
result. For example, if we have two operands $a$ and $b$:
$$a \boxplus b = c,$$
we decompose each operand (along with the result) into 16-bit chunks:
$$(a_L : \mathbb{Z}_{2^{16}}, a_H : \mathbb{Z}_{2^{16}}) \boxplus (b_L : \mathbb{Z}_{2^{16}}, b_H : \mathbb{Z}_{2^{16}}) = (c_L : \mathbb{Z}_{2^{16}}, c_H : \mathbb{Z}_{2^{16}}),$$
and then reformulate the constraint using field addition:
$$\mathsf{carry} \cdot 2^{32} + c_H \cdot 2^{16} + c_L = (a_H + b_H) \cdot 2^{16} + a_L + b_L.$$
More generally, any bit-decomposition of the output can be used, not just a decomposition
into 16-bit chunks. Note that this correctly handles the carry from $a_L + b_L$.
This constraint requires that each chunk is correctly range-checked (or else an assignment
could overflow the field).
- The operand and result chunks can be constrained using $\mathtt{spread}$, by looking up
each chunk in the "dense" column within a subset of the table. This way we additionally
get the "spread" form of the output for free; in particular this is true for the output
of the bottom-right $\boxplus$ which becomes $A_{new}$, and the output of the leftmost
$\boxplus$ which becomes $E_{new}$. We will use this below to optimize $Maj$ and $Ch$.
- $\mathsf{carry}$ must be constrained to the precise range of allowed carry values for
the number of operands. We do this with a
[small range constraint](../../../user/tips-and-tricks.md#small-range-constraints).
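As an out-of-circuit sanity check, the chunked addition constraint can be exercised over plain integers. This is an illustrative sketch, not circuit code; the operand values are arbitrary:

```rust
fn main() {
    let (a, b) = (0xdead_beef_u32, 0xcafe_f00d_u32);
    // Decompose each operand into 16-bit chunks.
    let (a_l, a_h) = ((a & 0xffff) as u64, (a >> 16) as u64);
    let (b_l, b_h) = ((b & 0xffff) as u64, (b >> 16) as u64);
    // c = a ⊞ b, the 32-bit modular sum, also decomposed into chunks.
    let c = a.wrapping_add(b);
    let (c_l, c_h) = ((c & 0xffff) as u64, ((c >> 16) & 0xffff) as u64);
    // The "field addition" side of the constraint, and its overflow bits.
    let full = ((a_h + b_h) << 16) + a_l + b_l;
    let carry = full >> 32;
    // The reformulated constraint from the text, checked over the integers.
    assert_eq!((carry << 32) + (c_h << 16) + c_l, full);
    // For two operands the carry is at most 1, so a small range
    // constraint on it is cheap.
    assert!(carry <= 1);
}
```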
### Maj function
$Maj$ can be done in $4$ lookups: $2$ $\mathtt{spread}$ lookups for each of the $2$ $16$-bit chunks.
- As mentioned above, after the first round we already have $A$ in spread form $A'$.
Similarly, $B$ and $C$ are equal to the $A$ and $B$ respectively of the previous round,
and therefore in the steady state we already have them in spread form $B'$ and $C'$. In
fact we can also assume we have them in spread form in the first round, either from the
fixed IV or from the use of $\mathtt{spread}$ to reduce the output of the feedforward in
the previous block.
- Add the spread forms in the field: $M' = A' + B' + C'$;
- We can add them as $32$-bit words or in pieces; it's equivalent
- Witness the compressed even bits $M^{even}_i$ and the compressed odd bits $M^{odd}_i$ for $i = \{0..1\}$;
- Constrain $M' = \mathtt{spread}(M^{even}_0) + 2 \cdot \mathtt{spread}(M^{odd}_0) + 2^{32} \cdot \mathtt{spread}(M^{even}_1) + 2^{33} \cdot \mathtt{spread}(M^{odd}_1)$, where $M^{odd}_i$ is the $Maj$ function output.
> Note: by "even" bits we mean the bits of weight an even-power of $2$, i.e. of weight
> $2^0, 2^2, \ldots$. Similarly by "odd" bits we mean the bits of weight an odd-power of
> $2$.
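The spread trick behind this can be checked over plain integers: each interleaved bit pair of $A' + B' + C'$ holds a count in $\{0..3\}$, so its odd (carry) bit is exactly the majority bit. A hypothetical sketch (the inputs are arbitrary):

```rust
// Interleave the bits of x with zero bits (an integer model of `spread`).
fn spread(x: u32) -> u64 {
    (0..32).fold(0u64, |r, i| r | ((((x >> i) & 1) as u64) << (2 * i)))
}

// Compress the bits at odd positions of an interleaved value.
fn odd_bits(m: u64) -> u32 {
    (0..32).fold(0u32, |r, i| r | ((((m >> (2 * i + 1)) & 1) as u32) << i))
}

fn main() {
    let (a, b, c) = (0xdead_beef_u32, 0x1234_5678_u32, 0x9abc_def0_u32);
    // Add the spread forms as integers; pairs never carry into each other
    // because each pairwise sum fits in 2 bits.
    let m = spread(a) + spread(b) + spread(c);
    let maj = (a & b) ^ (a & c) ^ (b & c);
    assert_eq!(odd_bits(m), maj);
}
```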
### Ch function
> TODO: can probably be optimised to $4$ or $5$ lookups using an additional table.
>
$Ch$ can be done in $8$ lookups: $4$ $\mathtt{spread}$ lookups for each of the $2$ $16$-bit chunks.
- As mentioned above, after the first round we already have $E$ in spread form $E'$.
Similarly, $F$ and $G$ are equal to the $E$ and $F$ respectively of the previous round,
and therefore in the steady state we already have them in spread form $F'$ and $G'$. In
fact we can also assume we have them in spread form in the first round, either from the
fixed IV or from the use of $\mathtt{spread}$ to reduce the output of the feedforward in
the previous block.
- Calculate $P' = E' + F'$ and $Q' = (evens - E') + G'$, where $evens = \mathtt{spread}(2^{32} - 1)$.
- We can add them as $32$-bit words or in pieces; it's equivalent.
- $evens - E'$ works to compute the spread of $¬E$ even though negation and
$\mathtt{spread}$ do not commute in general. It works because each spread bit in $E'$
is subtracted from $1$, so there are no borrows.
- Witness $P^{even}_i, P^{odd}_i, Q^{even}_i, Q^{odd}_i$ such that
$P' = \mathtt{spread}(P^{even}_0) + 2 \cdot \mathtt{spread}(P^{odd}_0) + 2^{32} \cdot \mathtt{spread}(P^{even}_1) + 2^{33} \cdot \mathtt{spread}(P^{odd}_1)$, and similarly for $Q'$.
- $\{P^{odd}_i + Q^{odd}_i\}_{i=0..1}$ is the $Ch$ function output.
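The same integer model verifies the $Ch$ decomposition, including the no-borrow negation via $evens - E'$. The masks $E \wedge F$ and $¬E \wedge G$ are bitwise disjoint (they split on the bits of $E$), so adding the two compressed odd halves reproduces the XOR. A hypothetical sketch:

```rust
// Interleave the bits of x with zero bits (an integer model of `spread`).
fn spread(x: u32) -> u64 {
    (0..32).fold(0u64, |r, i| r | ((((x >> i) & 1) as u64) << (2 * i)))
}

// Compress the bits at odd positions of an interleaved value.
fn odd_bits(m: u64) -> u32 {
    (0..32).fold(0u32, |r, i| r | ((((m >> (2 * i + 1)) & 1) as u32) << i))
}

fn main() {
    let (e, f, g) = (0xdead_beef_u32, 0x1234_5678_u32, 0x9abc_def0_u32);
    let evens = spread(u32::MAX); // spread(2^32 - 1)
    let p = spread(e) + spread(f); // odd bits: E ∧ F
    // Subtracting per-bit from 1 never borrows, so this is spread(¬E).
    let q = (evens - spread(e)) + spread(g); // odd bits: ¬E ∧ G
    let ch = (e & f) ^ (!e & g);
    assert_eq!(odd_bits(p) + odd_bits(q), ch);
}
```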
### Σ_0 function
$\Sigma_0(A)$ can be done in $6$ lookups.
To achieve this we first split $A$ into pieces $(a, b, c, d)$, of lengths $(2, 11, 9, 10)$
bits respectively counting from the little end. At the same time we obtain the spread
forms of these pieces. This can all be done in two PLONK rows, because the $10$ and
$11$-bit pieces can be handled using $\mathtt{spread}$ lookups, and the $9$-bit piece can
be split into $3 * 3$-bit subpieces. The latter and the remaining $2$-bit piece can be
range-checked by polynomial constraints in parallel with the two lookups, two small pieces
in each row. The spread forms of these small pieces are found by interpolation.
Note that the splitting into pieces can be combined with the reduction of $A_{new}$, i.e.
no extra lookups are needed for the latter. In the last round we reduce $A_{new}$ after
adding the feedforward (requiring a carry of up to $7$ which is fine).
$(A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋙ 22)$ is equivalent to
$(A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋘ 10)$:
![](./upp_sigma_0.png)
Then, using $4$ more $\mathtt{spread}$ lookups we obtain the result as the even bits of a
linear combination of the pieces:
$$
\begin{array}{rcccccccl}
& (a &||& d &||& c &||& b) & \oplus \\
& (b &||& a &||& d &||& c) & \oplus \\
& (c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & 4^{30} a &+& 4^{20} d &+& 4^{11} c &+& b\;&+ \\
& 4^{21} b &+& 4^{19} a &+& 4^{ 9} d &+& c\;&+ \\
& 4^{23} c &+& 4^{12} b &+& 4^{10} a &+& d\;&
\end{array}
$$
That is, we witness the compressed even bits $R^{even}_i$ and the compressed odd bits
$R^{odd}_i$, and constrain
$$R' = \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)$$
where $\{R^{even}_i\}_{i=0..1}$ is the $\Sigma_0$ function output.
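Again this can be sanity-checked outside the circuit: summing the spread forms of the three rotations leaves the XOR in the even bits (each pair holds a count in $\{0..3\}$, whose low bit is the parity). A hypothetical sketch:

```rust
// Interleave the bits of x with zero bits (an integer model of `spread`).
fn spread(x: u32) -> u64 {
    (0..32).fold(0u64, |r, i| r | ((((x >> i) & 1) as u64) << (2 * i)))
}

// Compress the bits at even positions of an interleaved value.
fn even_bits(m: u64) -> u32 {
    (0..32).fold(0u32, |r, i| r | ((((m >> (2 * i)) & 1) as u32) << i))
}

fn main() {
    let a = 0xdead_beef_u32;
    // The rewrite used above: (A ⋙ 22) = (A ⋘ 10) for 32-bit words.
    assert_eq!(a.rotate_right(22), a.rotate_left(10));
    // Sum the spread rotations; the even bits are Σ₀(A).
    let r = spread(a.rotate_right(2))
        + spread(a.rotate_right(13))
        + spread(a.rotate_right(22));
    let sigma0 = a.rotate_right(2) ^ a.rotate_right(13) ^ a.rotate_right(22);
    assert_eq!(even_bits(r), sigma0);
}
```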
### Σ_1 function
$\Sigma_1(E)$ can be done in $6$ lookups.
To achieve this we first split $E$ into pieces $(a, b, c, d)$, of lengths $(6, 5, 14, 7)$
bits respectively counting from the little end. At the same time we obtain the spread
forms of these pieces. This can all be done in two PLONK rows, because the $7$ and
$14$-bit pieces can be handled using $\mathtt{spread}$ lookups, the $5$-bit piece can be
split into $3$ and $2$-bit subpieces, and the $6$-bit piece can be split into $2 * 3$-bit
subpieces. The four small pieces can be range-checked by polynomial constraints in
parallel with the two lookups, two small pieces in each row. The spread forms of these
small pieces are found by interpolation.
Note that the splitting into pieces can be combined with the reduction of $E_{new}$, i.e.
no extra lookups are needed for the latter. In the last round we reduce $E_{new}$ after
adding the feedforward (requiring a carry of up to $6$ which is fine).
$(E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋙ 25)$ is equivalent to
$(E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋘ 7)$.
![](./upp_sigma_1.png)
Then, using $4$ more $\mathtt{spread}$ lookups we obtain the result as the even bits of a
linear combination of the pieces, in the same way we did for $\Sigma_0$:
$$
\begin{array}{rcccccccl}
& (a &||& d &||& c &||& b) & \oplus \\
& (b &||& a &||& d &||& c) & \oplus \\
& (c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & 4^{26} a &+& 4^{19} d &+& 4^{ 5} c &+& b\;&+ \\
& 4^{27} b &+& 4^{21} a &+& 4^{14} d &+& c\;&+ \\
& 4^{18} c &+& 4^{13} b &+& 4^{ 7} a &+& d\;&
\end{array}
$$
That is, we witness the compressed even bits $R^{even}_i$ and the compressed odd bits
$R^{odd}_i$, and constrain
$$R' = \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)$$
where $\{R^{even}_i\}_{i=0..1}$ is the $\Sigma_1$ function output.
## Block decomposition
For each block $M \in \{0,1\}^{512}$ of the padded message, $64$ words of $32$ bits each
are constructed as follows:
- The first $16$ are obtained by splitting $M$ into $32$-bit blocks $$M = W_0 || W_1 || \cdots || W_{14} || W_{15};$$
- The remaining $48$ words are constructed using the formula:
$$W_i = \sigma_1(W_{i-2}) \boxplus W_{i-7} \boxplus \sigma_0(W_{i-15}) \boxplus W_{i-16},$$ for $16 \leq i < 64$.
> Note: $0$-based numbering is used for the $W$ word indices.
$$
\begin{array}{ccc}
\sigma_0(X) &=& (X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3) \\
\sigma_1(X) &=& (X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10) \\
\end{array}
$$
> Note: $≫$ is a right-**shift**, not a rotation.
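Out of circuit, the schedule is straightforward to compute, and the rotation rewrites used in the diagrams below are easy to confirm. A hypothetical sketch (the example block words are arbitrary):

```rust
fn small_sigma_0(x: u32) -> u32 {
    x.rotate_right(7) ^ x.rotate_right(18) ^ (x >> 3)
}

fn small_sigma_1(x: u32) -> u32 {
    x.rotate_right(17) ^ x.rotate_right(19) ^ (x >> 10)
}

fn main() {
    // Expand an arbitrary example block into the 64-word schedule.
    let mut w = [0u32; 64];
    for i in 0..16 {
        w[i] = (i as u32).wrapping_mul(0x0101_0101);
    }
    for i in 16..64 {
        w[i] = small_sigma_1(w[i - 2])
            .wrapping_add(w[i - 7])
            .wrapping_add(small_sigma_0(w[i - 15]))
            .wrapping_add(w[i - 16]);
    }
    assert_eq!(w.len(), 64);
    // Rotation-only rewrites of the σ functions, as used in the diagrams.
    let x = 0x1234_5678_u32;
    assert_eq!(small_sigma_0(x), x.rotate_right(7) ^ x.rotate_left(14) ^ (x >> 3));
    assert_eq!(small_sigma_1(x), x.rotate_left(15) ^ x.rotate_left(13) ^ (x >> 10));
}
```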
### σ_0 function
$(X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3)$ is equivalent to
$(X ⋙ 7) \oplus (X ⋘ 14) \oplus (X ≫ 3)$.
![](./low_sigma_0.png)
As above but with pieces $(a, b, c, d)$ of lengths $(3, 4, 11, 14)$ counting from the
little end. Split $b$ into two $2$-bit subpieces.
$$
\begin{array}{rcccccccl}
& (0^{[3]} &||& d &||& c &||& b) & \oplus \\
& (\;\;\;b &||& a &||& d &||& c) & \oplus \\
& (\;\;\;c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & & & 4^{15} d &+& 4^{ 4} c &+& b\;&+ \\
& 4^{28} b &+& 4^{25} a &+& 4^{11} d &+& c\;&+ \\
& 4^{21} c &+& 4^{17} b &+& 4^{14} a &+& d\;&
\end{array}
$$
### σ_1 function
$(X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10)$ is equivalent to
$(X ⋘ 15) \oplus (X ⋘ 13) \oplus (X ≫ 10)$.
![](./low_sigma_1.png)
TODO: this diagram doesn't match the expression on the right. This is just for consistency
with the other diagrams.
As above but with pieces $(a, b, c, d)$ of lengths $(10, 7, 2, 13)$ counting from the
little end. Split $b$ into $(3, 2, 2)$-bit subpieces.
$$
\begin{array}{rcccccccl}
& (0^{[10]}&||& d &||& c &||& b) & \oplus \\
& (\;\;\;b &||& a &||& d &||& c) & \oplus \\
& (\;\;\;c &||& b &||& a &||& d) & \\
&&&&\Downarrow \\
R' = & & & 4^{ 9} d &+& 4^{ 7} c &+& b\;&+ \\
& 4^{25} b &+& 4^{15} a &+& 4^{ 2} d &+& c\;&+ \\
& 4^{30} c &+& 4^{23} b &+& 4^{13} a &+& d\;&
\end{array}
$$
### Message scheduling
We apply $\sigma_0$ to $W_{1..48}$, and $\sigma_1$ to $W_{14..61}$. In order to avoid
redundant applications of $\mathtt{spread}$, we can merge the splitting into pieces for
$\sigma_0$ and $\sigma_1$ in the case of $W_{14..48}$. Merging the piece lengths
$(3, 4, 11, 14)$ and $(10, 7, 2, 13)$ gives pieces of lengths $(3, 4, 3, 7, 1, 1, 13)$.
![](./bit_reassignment.png)
If we can do the merged split in $3$ rows (as opposed to a total of $4$ rows when
splitting for $\sigma_0$ and $\sigma_1$ separately), we save $35$ rows.
> These might even be doable in $2$ rows; not sure.
> —Daira
We can merge the reduction mod $2^{32}$ of $W_{16..61}$ into their splitting when they are
used to compute subsequent words, similarly to what we did for $A$ and $E$ in the round
function.
We will still need to reduce $W_{62..63}$ since they are not split. (Technically we could
leave them unreduced since they will be reduced later when they are used to compute
$A_{new}$ and $E_{new}$ -- but that would require handling a carry of up to $10$ rather
than $6$, so it's not worth the complexity.)
The resulting message schedule cost is:
- $2$ rows to constrain $W_0$ to $32$ bits
- This is technically optional, but let's do it for robustness, since the rest of the
input is constrained for free.
- $13*2$ rows to split $W_{1..13}$ into $(3, 4, 11, 14)$-bit pieces
- $35*3$ rows to split $W_{14..48}$ into $(3, 4, 3, 7, 1, 1, 13)$-bit pieces (merged with
a reduction for $W_{16..48}$)
- $13*2$ rows to split $W_{49..61}$ into $(10, 7, 2, 13)$-bit pieces (merged with a
reduction)
- $4*48$ rows to extract the results of $\sigma_0$ for $W_{1..48}$
- $4*48$ rows to extract the results of $\sigma_1$ for $W_{14..61}$
- $2*2$ rows to reduce $W_{62..63}$
- $= 547$ rows.
## Overall cost
For each round:
- $8$ rows for $Ch$
- $4$ rows for $Maj$
- $6$ rows for $\Sigma_0$
- $6$ rows for $\Sigma_1$
- $reduce_6$ and $reduce_7$ are always free
- $= 24$ per round
This gives $24*64 = 1792$ rows for all of "step 3", to which we need to add:
- $547$ rows for message scheduling
- $2*8$ rows for $8$ reductions mod $2^{32}$ in "step 4"
giving a total of $2099$ rows.
## Tables
We only require one table $\mathtt{spread}$, with $2^{16}$ rows and $3$ columns. We need a
tag column to allow selecting $(7, 10, 11, 13, 14)$-bit subsets of the table for
$\Sigma_{0..1}$ and $\sigma_{0..1}$.
### `spread` table
| row | tag | table (16b) | spread (32b) |
|--------------|-----|------------------|----------------------------------|
| $0$ | 0 | 0000000000000000 | 00000000000000000000000000000000 |
| $1$ | 0 | 0000000000000001 | 00000000000000000000000000000001 |
| $2$ | 0 | 0000000000000010 | 00000000000000000000000000000100 |
| $3$ | 0 | 0000000000000011 | 00000000000000000000000000000101 |
| ... | 0 | ... | ... |
| $2^{7} - 1$ | 0 | 0000000001111111 | 00000000000000000001010101010101 |
| $2^{7}$ | 1 | 0000000010000000 | 00000000000000000100000000000000 |
| ... | 1 | ... | ... |
| $2^{10} - 1$ | 1 | 0000001111111111 | 00000000000001010101010101010101 |
| ... | 2 | ... | ... |
| $2^{11} - 1$ | 2 | 0000011111111111 | 00000000010101010101010101010101 |
| ... | 3 | ... | ... |
| $2^{13} - 1$ | 3 | 0001111111111111 | 00000001010101010101010101010101 |
| ... | 4 | ... | ... |
| $2^{14} - 1$ | 4 | 0011111111111111 | 00000101010101010101010101010101 |
| ... | 5 | ... | ... |
| $2^{16} - 1$ | 5 | 1111111111111111 | 01010101010101010101010101010101 |
For example, to do an $11$-bit $\mathtt{spread}$ lookup, we polynomial-constrain the tag
to be in $\{0, 1, 2\}$. For the most common case of a $16$-bit lookup, we don't need to
constrain the tag. Note that we can fill any unused rows beyond $2^{16}$ with a duplicate
entry, e.g. all-zeroes.
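A hypothetical generator for this table, with the tag boundaries read off the rows above, can be sketched as:

```rust
// Interleave the bits of a 16-bit value with zero bits.
fn spread(x: u16) -> u32 {
    (0..16).fold(0u32, |r, i| r | ((((x >> i) & 1) as u32) << (2 * i)))
}

// Tag marking the smallest bit-length subset a row belongs to.
fn tag(row: u32) -> u8 {
    match row {
        0..=0x007f => 0,      // fits in 7 bits
        0x0080..=0x03ff => 1, // 10 bits
        0x0400..=0x07ff => 2, // 11 bits
        0x0800..=0x1fff => 3, // 13 bits
        0x2000..=0x3fff => 4, // 14 bits
        _ => 5,               // 16 bits
    }
}

fn main() {
    // Spot-check rows of the table above.
    assert_eq!(spread(0b11), 0b0101);
    assert_eq!(spread(0xffff), 0x5555_5555);
    assert_eq!(tag((1 << 7) - 1), 0);
    assert_eq!(tag(1 << 7), 1);
    assert_eq!(tag((1 << 10) - 1), 1);
    assert_eq!(tag((1 << 16) - 1), 5);
}
```

An $11$-bit lookup then constrains the tag to $\{0, 1, 2\}$, exactly as described above.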
## Gates
### Choice gate
Input from previous operations:
- $E', F', G',$ 64-bit spread forms of 32-bit words $E, F, G$, assumed to be constrained by previous operations
- in practice, we'll have the spread forms of $E', F', G'$ after they've been decomposed into 16-bit subpieces
- $evens$ is defined as $\mathtt{spread}(2^{32} - 1)$
- $evens_0 = evens_1 = \mathtt{spread}(2^{16} - 1)$
#### E ∧ F
|s_ch| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ |
|----|-------------|-------------|-----------------------------|------------------------------------|------------------------------------|
|0 |{0,1,2,3,4,5}|$P_0^{even}$ |$\texttt{spread}(P_0^{even})$| $\mathtt{spread}(E^{lo})$ | $\mathtt{spread}(E^{hi})$ |
|1 |{0,1,2,3,4,5}|$P_0^{odd}$ |$\texttt{spread}(P_0^{odd})$ |$\texttt{spread}(P_1^{odd})$ | |
|0 |{0,1,2,3,4,5}|$P_1^{even}$ |$\texttt{spread}(P_1^{even})$| $\mathtt{spread}(F^{lo})$ | $\mathtt{spread}(F^{hi})$ |
|0 |{0,1,2,3,4,5}|$P_1^{odd}$ |$\texttt{spread}(P_1^{odd})$ | | |
#### ¬E ∧ G
s_ch_neg| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ |
--------|-------------|-------------|-----------------------------|------------------------------------|------------------------------------|------------------------------------|
0 |{0,1,2,3,4,5}|$Q_0^{even}$ |$\texttt{spread}(Q_0^{even})$|$\mathtt{spread}(E_{neg}^{lo})$ | $\mathtt{spread}(E_{neg}^{hi})$ | $\mathtt{spread}(E^{lo})$ |
1 |{0,1,2,3,4,5}|$Q_0^{odd}$ |$\texttt{spread}(Q_0^{odd})$ |$\texttt{spread}(Q_1^{odd})$ | | $\mathtt{spread}(E^{hi})$ |
0 |{0,1,2,3,4,5}|$Q_1^{even}$ |$\texttt{spread}(Q_1^{even})$|$\mathtt{spread}(G^{lo})$ | $\mathtt{spread}(G^{hi})$ | |
0 |{0,1,2,3,4,5}|$Q_1^{odd}$ |$\texttt{spread}(Q_1^{odd})$ | | | |
Constraints:
- `s_ch` (choice): $LHS - RHS = 0$
- $LHS = a_3 \omega^{-1} + a_3 \omega + 2^{32}(a_4 \omega^{-1} + a_4 \omega)$
- $RHS = a_2 \omega^{-1} + 2 \cdot a_2 + 2^{32}(a_2 \omega + 2 \cdot a_3)$
- `s_ch_neg` (negation): `s_ch` with an extra negation check
- $\mathtt{spread}$ lookup on $(a_0, a_1, a_2)$
- permutation between $(a_2, a_3)$
Output: $Ch(E, F, G) = P^{odd} + Q^{odd} = (P_0^{odd} + Q_0^{odd}) + 2^{16} (P_1^{odd} + Q_1^{odd})$
### Majority gate
Input from previous operations:
- $A', B', C',$ 64-bit spread forms of 32-bit words $A, B, C$, assumed to be constrained by previous operations
- in practice, we'll have the spread forms of $A', B', C'$ after they've been decomposed into $16$-bit subpieces
s_maj| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ |
-----|-------------|------------|-----------------------------|----------------------------|--------------------------|--------------------------|
0 |{0,1,2,3,4,5}|$M_0^{even}$|$\texttt{spread}(M_0^{even})$| |$\mathtt{spread}(A^{lo})$ |$\mathtt{spread}(A^{hi})$ |
1 |{0,1,2,3,4,5}|$M_0^{odd}$ |$\texttt{spread}(M_0^{odd})$ |$\texttt{spread}(M_1^{odd})$|$\mathtt{spread}(B^{lo})$ |$\mathtt{spread}(B^{hi})$ |
0 |{0,1,2,3,4,5}|$M_1^{even}$|$\texttt{spread}(M_1^{even})$| |$\mathtt{spread}(C^{lo})$ |$\mathtt{spread}(C^{hi})$ |
0 |{0,1,2,3,4,5}|$M_1^{odd}$ |$\texttt{spread}(M_1^{odd})$ | | | |
Constraints:
- `s_maj` (majority): $LHS - RHS = 0$
- $LHS = \mathtt{spread}(M^{even}_0) + 2 \cdot \mathtt{spread}(M^{odd}_0) + 2^{32} \cdot \mathtt{spread}(M^{even}_1) + 2^{33} \cdot \mathtt{spread}(M^{odd}_1)$
- $RHS = A' + B' + C'$
- $\mathtt{spread}$ lookup on $(a_0, a_1, a_2)$
- permutation between $(a_2, a_3)$
Output: $Maj(A,B,C) = M^{odd} = M_0^{odd} + 2^{16} M_1^{odd}$
### Σ_0 gate
$A$ is a 32-bit word split into $(2,11,9,10)$-bit chunks, starting from the little end. We refer to these chunks as $(a(2), b(11), c(9), d(10))$ respectively, and further split $c(9)$ into three 3-bit chunks $c(9)^{lo}, c(9)^{mid}, c(9)^{hi}$. We witness the spread versions of the small chunks.
$$
\begin{array}{ccc}
\Sigma_0(A) &=& (A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋙ 22) \\
&=& (A ⋙ 2) \oplus (A ⋙ 13) \oplus (A ⋘ 10)
\end{array}
$$
s_upp_sigma_0| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
-------------|-------------|------------|-----------------------------|------------------------------|----------------------------|------------------------|-----------------------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $c(9)^{lo}$ |$\texttt{spread}(c(9)^{lo})$| $c(9)^{mid}$ |$\texttt{spread}(c(9)^{mid})$|
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(10))$ |$\texttt{spread}(b(11))$| $c(9)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(2)$ |$\texttt{spread}(a(2))$ | $c(9)^{hi}$ |$\texttt{spread}(c(9)^{hi})$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | |
Constraints:
- `s_upp_sigma_0` ($\Sigma_0$ constraint): $LHS - RHS + tag + decompose = 0$
$$
\begin{array}{ccc}
tag &=& constrain_1(a_0\omega^{-1}) + constrain_2(a_0\omega) \\
decompose &=& a(2) + 2^2 b(11) + 2^{13} c(9)^{lo} + 2^{16} c(9)^{mid} + 2^{19} c(9)^{hi} + 2^{22} d(10) - A\\
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccl}
RHS = & 4^{30} \texttt{spread}(a(2)) &+& 4^{20} \texttt{spread}(d(10)) &+& 4^{17} \texttt{spread}(c(9)^{hi}) &+& 4^{14} \texttt{spread}(c(9)^{mid}) &+& 4^{11} \texttt{spread}(c(9)^{lo}) &+& \texttt{spread}(b(11))\;&+ \\
& 4^{21} \texttt{spread}(b(11)) &+& 4^{19} \texttt{spread}(a(2)) &+& 4^{9} \texttt{spread}(d(10)) &+& 4^{6} \texttt{spread}(c(9)^{hi}) &+& 4^{3} \texttt{spread}(c(9)^{mid}) &+& \texttt{spread}(c(9)^{lo}) \;&+ \\
& 4^{29} \texttt{spread}(c(9)^{hi}) &+& 4^{26} \texttt{spread}(c(9)^{mid}) &+& 4^{23} \texttt{spread}(c(9)^{lo}) &+& 4^{12} \texttt{spread}(b(11)) &+& 4^{10} \texttt{spread}(a(2)) &+& \texttt{spread}(d(10))\;&
\end{array}
$$
- $\mathtt{spread}$ lookup on $a_0, a_1, a_2$
- 2-bit range check and 2-bit spread check on $a(2)$
- 3-bit range check and 3-bit spread check on $c(9)^{lo}, c(9)^{mid}, c(9)^{hi}$
(see section [Helper gates](#helper-gates))
Output: $\Sigma_0(A) = R^{even} = R_0^{even} + 2^{16} R_1^{even}$
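The same spread trick recovers the three-way XOR: summing the spread forms of the three rotations and keeping the even interleave bits yields $\Sigma_0$. A small Python sketch (helper names are illustrative):

```python
# XOR of three words = even interleave bits of the sum of their spreads:
# each 2-bit slot counts at most three set bits, so the slot's low bit
# is the parity (the XOR) and slots never overflow into each other.
def spread(x, n=32):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def upper_sigma_0(a):
    total = spread(rotr(a, 2)) + spread(rotr(a, 13)) + spread(rotr(a, 22))
    return even_bits(total)
```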
### Σ_1 gate
$E$ is a 32-bit word split into $(6,5,14,7)$-bit chunks, starting from the little end. We refer to these chunks as $(a(6), b(5), c(14), d(7))$ respectively, and further split $a(6)$ into two 3-bit chunks $a(6)^{lo}, a(6)^{hi}$ and $b(5)$ into $(2,3)$-bit chunks $b(5)^{lo}, b(5)^{hi}$. We witness the spread versions of the small chunks.
$$
\begin{array}{ccc}
\Sigma_1(E) &=& (E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋙ 25) \\
&=& (E ⋙ 6) \oplus (E ⋙ 11) \oplus (E ⋘ 7)
\end{array}
$$
s_upp_sigma_1| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ |
-------------|-------------|------------|-----------------------------|------------------------------|----------------------------|------------------------|-----------------------------|------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(5)^{lo}$ |$\texttt{spread}(b(5)^{lo})$| $b(5)^{hi}$ |$\texttt{spread}(b(5)^{hi})$ | $b(5)$ |
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(7))$ |$\texttt{spread}(c(14))$| | |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(6)^{lo}$ |$\texttt{spread}(a(6)^{lo})$| $a(6)^{hi}$ |$\texttt{spread}(a(6)^{hi})$ | $a(6)$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | |
Constraints:
- `s_upp_sigma_1` ($\Sigma_1$ constraint): $LHS - RHS + tag + decompose = 0$
$$
\begin{array}{ccc}
tag &=& a_0\omega^{-1} + constrain_4(a_0\omega) \\
decompose &=& a(6)^{lo} + 2^3 a(6)^{hi} + 2^6 b(5)^{lo} + 2^8 b(5)^{hi} + 2^{11} c(14) + 2^{25} d(7) - E \\
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccl}
RHS = & 4^{29} \texttt{spread}(a(6)^{hi}) &+& 4^{26} \texttt{spread}(a(6)^{lo}) &+& 4^{19} \texttt{spread}(d(7)) &+& 4^{ 5} \texttt{spread}(c(14)) &+& 4^{2} \texttt{spread}(b(5)^{hi}) &+& \texttt{spread}(b(5)^{lo})\;&+ \\
& 4^{29} \texttt{spread}(b(5)^{hi}) &+& 4^{27} \texttt{spread}(b(5)^{lo}) &+& 4^{24} \texttt{spread}(a(6)^{hi}) &+& 4^{21} \texttt{spread}(a(6)^{lo}) &+& 4^{14} \texttt{spread}(d(7)) &+& \texttt{spread}(c(14))\;&+ \\
& 4^{18} \texttt{spread}(c(14)) &+& 4^{15} \texttt{spread}(b(5)^{hi}) &+& 4^{13} \texttt{spread}(b(5)^{lo}) &+& 4^{10} \texttt{spread}(a(6)^{hi}) &+& 4^{7} \texttt{spread}(a(6)^{lo}) &+& \texttt{spread}(d(7))\;&
\end{array}
$$
- $\mathtt{spread}$ lookup on $a_0, a_1, a_2$
- 2-bit range check and 2-bit spread check on $b(5)^{lo}$
- 3-bit range check and 3-bit spread check on $a(6)^{lo}, a(6)^{hi}, b(5)^{hi}$
(see section [Helper gates](#helper-gates))
Output: $\Sigma_1(E) = R^{even} = R_0^{even} + 2^{16} R_1^{even}$
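The two displayed forms of $\Sigma_1$ agree because rotating a 32-bit word right by 25 is the same as rotating it left by 7; a quick Python check (helper names are illustrative):

```python
# Verify the rotation identity (E >>> 25) == (E <<< 7) used in the
# second form of Sigma_1 above.
def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def rotl(x, k):
    return ((x << k) | (x >> (32 - k))) & 0xFFFFFFFF

def upper_sigma_1(e):
    return rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
```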
### σ_0 gate
#### v1
v1 of the $\sigma_0$ gate takes in a word that's split into $(3, 4, 11, 14)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(3), b(4), c(11), d(14)).$ $b(4)$ is further split into two 2-bit chunks $b(4)^{lo}, b(4)^{hi}.$ We witness the spread versions of the small chunks. We already have $\texttt{spread}(c(11))$ and $\texttt{spread}(d(14))$ from the message scheduling.
$(X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3)$ is equivalent to
$(X ⋙ 7) \oplus (X ⋘ 14) \oplus (X ≫ 3)$.
s_low_sigma_0| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
-------------|-------------|------------|-----------------------------|-----------------------------|----------------------------|--------------------|----------------------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(4)^{lo}$ |$\texttt{spread}(b(4)^{lo})$| $b(4)^{hi}$ |$\texttt{spread}(b(4)^{hi})$|
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ |$\texttt{spread}(R_1^{odd})$ |$\texttt{spread}(c(11))$ |$\texttt{spread}(d(14))$| $b(4)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $0$ | $0$ | $a(3)$ | $\texttt{spread}(a(3))$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | |
Constraints:
- `s_low_sigma_0` ($\sigma_0$ v1 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rccccccccl}
RHS = & & & 4^{15} \texttt{spread}(d(14)) &+& 4^{ 4} \texttt{spread}(c(11)) &+& 4^2 \texttt{spread}(b(4)^{hi}) &+& \texttt{spread}(b(4)^{lo})\;&+ \\
& 4^{30} \texttt{spread}(b(4)^{hi}) &+& 4^{28} \texttt{spread}(b(4)^{lo}) &+& 4^{25} \texttt{spread}(a(3)) &+& 4^{11} \texttt{spread}(d(14)) &+& \texttt{spread}(c(11))\;&+ \\
& 4^{21} \texttt{spread}(c(11)) &+& 4^{19} \texttt{spread}(b(4)^{hi}) &+& 4^{17} \texttt{spread}(b(4)^{lo}) &+& 4^{14} \texttt{spread}(a(3)) &+& \texttt{spread}(d(14))\;&
\end{array}
$$
- check that $b(4)$ was properly split into its 2-bit subpieces:
- $W^{b(4)lo} + 2^2 W^{b(4)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(4)^{lo}, b(4)^{hi}$
- 3-bit range check and 3-bit spread check on $a(3)$
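The $4^k$ coefficients in the three RHS rows can be validated numerically: recombining the spread chunks with exactly those coefficients and taking the even interleave bits must reproduce $\sigma_0$. A Python sketch (chunk extraction and names are illustrative):

```python
def spread(x, n):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def low_sigma_0_v1(x):
    # (3, 2, 2, 11, 14)-bit chunks from the little end: a, b_lo, b_hi, c, d
    a, b_lo, b_hi = x & 7, (x >> 3) & 3, (x >> 5) & 3
    c, d = (x >> 7) & 0x7FF, (x >> 18) & 0x3FFF
    sa, sbl, sbh = spread(a, 3), spread(b_lo, 2), spread(b_hi, 2)
    sc, sd = spread(c, 11), spread(d, 14)
    # the three spread-space terms, with the coefficients from the RHS rows
    shr3  = 4**15*sd + 4**4*sc + 4**2*sbh + sbl               # X >> 3 (drops a)
    rot7  = 4**30*sbh + 4**28*sbl + 4**25*sa + 4**11*sd + sc  # X >>> 7
    rot18 = 4**21*sc + 4**19*sbh + 4**17*sbl + 4**14*sa + sd  # X >>> 18
    return even_bits(shr3 + rot7 + rot18)

def low_sigma_0_ref(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
```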
#### v2
v2 of the $\sigma_0$ gate takes in a word that's split into $(3, 4, 3, 7, 1, 1, 13)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(3), b(4), c(3), d(7), e(1), f(1), g(13)).$ We already have $\mathtt{spread}(d(7)), \mathtt{spread}(g(13))$ from the message scheduling. The 1-bit $e(1), f(1)$ remain unchanged by the spread operation and can be used directly. We further split $b(4)$ into two 2-bit chunks $b(4)^{lo}, b(4)^{hi}.$ We witness the spread versions of the small chunks.
$(X ⋙ 7) \oplus (X ⋙ 18) \oplus (X ≫ 3)$ is equivalent to
$(X ⋙ 7) \oplus (X ⋘ 14) \oplus (X ≫ 3)$.
s_low_sigma_0_v2| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ |
----------------|-------------|------------|-----------------------------|-----------------------------|----------------------------|------------------------|----------------------------|------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(4)^{lo}$ |$\texttt{spread}(b(4)^{lo})$| $b(4)^{hi}$ |$\texttt{spread}(b(4)^{hi})$| |
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$| $\texttt{spread}(d(7))$ |$\texttt{spread}(g(13))$| $b(4)$ | $e(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(3)$ |$\texttt{spread}(a(3))$ | $c(3)$ |$\texttt{spread}(c(3))$ | $f(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | |
Constraints:
- `s_low_sigma_0_v2` ($\sigma_0$ v2 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccccl}
RHS = & & & 4^{16} \texttt{spread}(g(13)) &+& 4^{15} f(1) &+& 4^{ 14} e(1) &+& 4^{ 7} \texttt{spread}(d(7)) &+& 4^{ 4} \texttt{spread}(c(3)) &+& 4^2 \texttt{spread}(b(4)^{hi}) &+& \texttt{spread}(b(4)^{lo})\;&+ \\
& 4^{30} \texttt{spread}(b(4)^{hi}) &+& 4^{28} \texttt{spread}(b(4)^{lo}) &+& 4^{25} \texttt{spread}(a(3)) &+& 4^{12} \texttt{spread}(g(13)) &+& 4^{11} f(1) &+& 4^{10} e(1) &+& 4^{3} \texttt{spread}(d(7)) &+& \texttt{spread}(c(3))\;&+ \\
& 4^{31} e(1) &+& 4^{24} \texttt{spread}(d(7)) &+& 4^{21} \texttt{spread}(c(3)) &+& 4^{19} \texttt{spread}(b(4)^{hi}) &+& 4^{17} \texttt{spread}(b(4)^{lo}) &+& 4^{14} \texttt{spread}(a(3)) &+& 4^{1} \texttt{spread}(g(13)) &+& f(1)\;&
\end{array}
$$
- check that $b(4)$ was properly split into its 2-bit subpieces:
- $W^{b(4)lo} + 2^2 W^{b(4)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(4)^{lo}, b(4)^{hi}$
- 3-bit range check and 3-bit spread check on $a(3), c(3)$
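As with v1, the v2 coefficient rows can be checked numerically: recombining the spread chunks (with the 1-bit $e, f$ used directly) and taking the even interleave bits must again give $\sigma_0$. A Python sketch (names are illustrative):

```python
def spread(x, n):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def low_sigma_0_v2(x):
    # (3, 2, 2, 3, 7, 1, 1, 13)-bit chunks from the little end
    a, b_lo, b_hi = x & 7, (x >> 3) & 3, (x >> 5) & 3
    c, d = (x >> 7) & 7, (x >> 10) & 0x7F
    e, f, g = (x >> 17) & 1, (x >> 18) & 1, (x >> 19) & 0x1FFF
    sa, sbl, sbh = spread(a, 3), spread(b_lo, 2), spread(b_hi, 2)
    sc, sd, sg = spread(c, 3), spread(d, 7), spread(g, 13)
    # 1-bit e and f are unchanged by spread
    shr3  = 4**16*sg + 4**15*f + 4**14*e + 4**7*sd + 4**4*sc + 4**2*sbh + sbl
    rot7  = 4**30*sbh + 4**28*sbl + 4**25*sa + 4**12*sg + 4**11*f + 4**10*e + 4**3*sd + sc
    rot18 = 4**31*e + 4**24*sd + 4**21*sc + 4**19*sbh + 4**17*sbl + 4**14*sa + 4**1*sg + f
    return even_bits(shr3 + rot7 + rot18)

def low_sigma_0_ref(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
```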
### σ_1 gate
#### v1
v1 of the $\sigma_1$ gate takes in a word that's split into $(10, 7, 2, 13)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(10), b(7), c(2), d(13)).$ $b(7)$ is further split into $(2, 2, 3)$-bit chunks $b(7)^{lo}, b(7)^{mid}, b(7)^{hi}.$ We witness the spread versions of the small chunks. We already have $\texttt{spread}(a(10))$ and $\texttt{spread}(d(13))$ from the message scheduling.
$(X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10)$ is equivalent to
$(X ⋘ 15) \oplus (X ⋘ 13) \oplus (X ≫ 10)$.
s_low_sigma_1| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
-------------|-------------|------------|-----------------------------|------------------------------|----------------------------|------------------------|-----------------------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(7)^{lo}$ |$\texttt{spread}(b(7)^{lo})$| $b(7)^{mid}$ |$\texttt{spread}(b(7)^{mid})$|
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(a(10))$ |$\texttt{spread}(d(13))$| $b(7)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $c(2)$ |$\texttt{spread}(c(2))$ | $b(7)^{hi}$ |$\texttt{spread}(b(7)^{hi})$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | |
Constraints:
- `s_low_sigma_1` ($\sigma_1$ v1 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rcccccccccl}
RHS = & & & 4^{ 9} \texttt{spread}(d(13)) &+& 4^{ 7} \texttt{spread}(c(2)) &+& 4^{4} \texttt{spread}(b(7)^{hi}) &+& 4^{2} \texttt{spread}(b(7)^{mid}) &+& \texttt{spread}(b(7)^{lo})\;&+ \\
& 4^{29} \texttt{spread}(b(7)^{hi}) &+& 4^{27} \texttt{spread}(b(7)^{mid}) &+& 4^{25} \texttt{spread}(b(7)^{lo}) &+& 4^{15} \texttt{spread}(a(10)) &+& 4^{ 2} \texttt{spread}(d(13)) &+& \texttt{spread}(c(2))\;&+ \\
& 4^{30} \texttt{spread}(c(2)) &+& 4^{27} \texttt{spread}(b(7)^{hi}) &+& 4^{25} \texttt{spread}(b(7)^{mid}) &+& 4^{23} \texttt{spread}(b(7)^{lo}) &+& 4^{13} \texttt{spread}(a(10)) &+& \texttt{spread}(d(13))\;&
\end{array}
$$
- check that $b(7)$ was properly split into its $(2,2,3)$-bit subpieces:
- $W^{b(7)lo} + 2^2 W^{b(7)mid} + 2^4 W^{b(7)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(7)^{lo}, b(7)^{mid}, c(2)$
- 3-bit range check and 3-bit spread check on $b(7)^{hi}$
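The v1 coefficient rows for $\sigma_1$ admit the same numerical check: recombine the spread chunks with the listed $4^k$ coefficients and take the even interleave bits. A Python sketch (names are illustrative):

```python
def spread(x, n):
    return sum(((x >> i) & 1) << (2 * i) for i in range(n))

def even_bits(y, n=32):
    return sum(((y >> (2 * i)) & 1) << i for i in range(n))

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def low_sigma_1_v1(x):
    # (10, 2, 2, 3, 2, 13)-bit chunks from the little end
    a = x & 0x3FF
    b_lo, b_mid, b_hi = (x >> 10) & 3, (x >> 12) & 3, (x >> 14) & 7
    c, d = (x >> 17) & 3, (x >> 19) & 0x1FFF
    sa, sbl, sbm = spread(a, 10), spread(b_lo, 2), spread(b_mid, 2)
    sbh, sc, sd = spread(b_hi, 3), spread(c, 2), spread(d, 13)
    shr10 = 4**9*sd + 4**7*sc + 4**4*sbh + 4**2*sbm + sbl            # X >> 10
    rot17 = 4**29*sbh + 4**27*sbm + 4**25*sbl + 4**15*sa + 4**2*sd + sc
    rot19 = 4**30*sc + 4**27*sbh + 4**25*sbm + 4**23*sbl + 4**13*sa + sd
    return even_bits(shr10 + rot17 + rot19)

def low_sigma_1_ref(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)
```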
#### v2
v2 of the $\sigma_1$ gate takes in a word that's split into $(3, 4, 3, 7, 1, 1, 13)$-bit chunks (already constrained by message scheduling). We refer to these chunks respectively as $(a(3), b(4), c(3), d(7), e(1), f(1), g(13)).$ We already have $\mathtt{spread}(d(7)), \mathtt{spread}(g(13))$ from the message scheduling. The 1-bit $e(1), f(1)$ remain unchanged by the spread operation and can be used directly. We further split $b(4)$ into two 2-bit chunks $b(4)^{lo}, b(4)^{hi}.$ We witness the spread versions of the small chunks.
$(X ⋙ 17) \oplus (X ⋙ 19) \oplus (X ≫ 10)$ is equivalent to
$(X ⋘ 15) \oplus (X ⋘ 13) \oplus (X ≫ 10)$.
s_low_sigma_1_v2| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ |
----------------|-------------|------------|-----------------------------|-----------------------------|----------------------------|-------------------------|----------------------------|------------|
0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $b(4)^{lo}$ |$\texttt{spread}(b(4)^{lo})$| $b(4)^{hi}$ |$\texttt{spread}(b(4)^{hi})$| |
1 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$| $\texttt{spread}(d(7))$ | $\texttt{spread}(g(13))$| $b(4)$ | $e(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $a(3)$ |$\texttt{spread}(a(3))$ | $c(3)$ |$\texttt{spread}(c(3))$ | $f(1)$ |
0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | |
Constraints:
- `s_low_sigma_1_v2` ($\sigma_1$ v2 constraint): $LHS - RHS = 0$
$$
\begin{array}{ccc}
LHS &=& \mathtt{spread}(R^{even}_0) + 2 \cdot \mathtt{spread}(R^{odd}_0) + 2^{32} \cdot \mathtt{spread}(R^{even}_1) + 2^{33} \cdot \mathtt{spread}(R^{odd}_1)
\end{array}
$$
$$
\begin{array}{rccccccccccccl}
RHS = & &&&& & & 4^{ 9} \texttt{spread}(g(13)) &+& 4^{ 8} f(1) &+& 4^{ 7} e(1) &+& \texttt{spread}(d(7))\;&+ \\
& 4^{25} \texttt{spread}(d(7)) &+& 4^{22} \texttt{spread}(c(3)) &+& 4^{20} \texttt{spread}(b(4)^{hi}) &+& 4^{18} \texttt{spread}(b(4)^{lo}) &+& 4^{15} \texttt{spread}(a(3)) &+& 4^{ 2} \texttt{spread}(g(13)) &+& 4^{1} f(1) &+& e(1)\;&+ \\
& 4^{31} f(1) &+& 4^{30} e(1) &+& 4^{23} \texttt{spread}(d(7)) &+& 4^{20} \texttt{spread}(c(3)) &+& 4^{18} \texttt{spread}(b(4)^{hi}) &+& 4^{16} \texttt{spread}(b(4)^{lo}) &+& 4^{13} \texttt{spread}(a(3)) &+& \texttt{spread}(g(13))\;&
\end{array}
$$
- check that $b(4)$ was properly split into its 2-bit subpieces:
- $W^{b(4)lo} + 2^2 W^{b(4)hi} - W = 0$
- 2-bit range check and 2-bit spread check on $b(4)^{lo}, b(4)^{hi}$
- 3-bit range check and 3-bit spread check on $a(3), c(3)$
### Helper gates
#### Small range constraints
Let $constrain_n(x) = \prod_{i=0}^n (x-i)$. Constraining this expression to equal zero enforces that $x$ is in $[0..n].$
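Over the integers (standing in for the field), this is a degree-$(n+1)$ polynomial that vanishes exactly on $\{0, \ldots, n\}$:

```python
# constrain_n(x) = prod_{i=0}^{n} (x - i): zero iff x is in {0, ..., n}.
# Plain integers stand in for field elements in this sketch.
def constrain(n, x):
    p = 1
    for i in range(n + 1):
        p *= x - i
    return p
```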
#### 2-bit range check
$(a - 3)(a - 2)(a - 1)(a) = 0$
sr2| $a_0$ |
---|-------|
1 | a |
#### 2-bit spread
$l_1(a) + 4 \cdot l_2(a) + 5 \cdot l_3(a) - a' = 0$
ss2| $a_0$ | $a_1$
---|-------|------
1 | a | a'
with interpolation polynomials:
- $l_0(a) = \frac{(a - 3)(a - 2)(a - 1)}{(-3)(-2)(-1)}$ ($\mathtt{spread}(00) = 0000$)
- $l_1(a) = \frac{(a - 3)(a - 2)(a)}{(-2)(-1)(1)}$ ($\mathtt{spread}(01) = 0001$)
- $l_2(a) = \frac{(a - 3)(a - 1)(a)}{(-1)(1)(2)}$ ($\mathtt{spread}(10) = 0100$)
- $l_3(a) = \frac{(a - 2)(a - 1)(a)}{(1)(2)(3)}$ ($\mathtt{spread}(11) = 0101$)
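These interpolation polynomials can be verified with exact rational arithmetic (standing in for the field): the combination above must map $a \mapsto \mathtt{spread}(a)$, i.e. $0, 1, 4, 5$ on $\{0, 1, 2, 3\}$. A sketch:

```python
# Check the 2-bit spread gate's Lagrange interpolation over the points
# {0, 1, 2, 3}; Fraction gives exact arithmetic in place of field elements.
from fractions import Fraction

def lagrange_basis(j, a, points=4):
    num, den = 1, 1
    for m in range(points):
        if m != j:
            num *= a - m
            den *= j - m
    return Fraction(num, den)

def spread_2bit(a):
    # coefficients from the gate: spread(1)=1, spread(2)=4, spread(3)=5
    return 1 * lagrange_basis(1, a) + 4 * lagrange_basis(2, a) + 5 * lagrange_basis(3, a)
```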
#### 3-bit range check
$(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)(a) = 0$
sr3| $a_0$ |
---|-------|
1 | a |
#### 3-bit spread
$l_1(a) + 4 \cdot l_2(a) + 5 \cdot l_3(a) + 16 \cdot l_4(a) + 17 \cdot l_5(a) + 20 \cdot l_6(a) + 21 \cdot l_7(a) - a' = 0$
ss3| $a_0$ | $a_1$
---|-------|------
1 | a | a'
with interpolation polynomials:
- $l_0(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)}{(-7)(-6)(-5)(-4)(-3)(-2)(-1)}$ ($\mathtt{spread}(000) = 000000$)
- $l_1(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a)}{(-6)(-5)(-4)(-3)(-2)(-1)(1)}$ ($\mathtt{spread}(001) = 000001$)
- $l_2(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 3)(a - 1)(a)}{(-5)(-4)(-3)(-2)(-1)(1)(2)}$ ($\mathtt{spread}(010) = 000100$)
- $l_3(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 4)(a - 2)(a - 1)(a)}{(-4)(-3)(-2)(-1)(1)(2)(3)}$ ($\mathtt{spread}(011) = 000101$)
- $l_4(a) = \frac{(a - 7)(a - 6)(a - 5)(a - 3)(a - 2)(a - 1)(a)}{(-3)(-2)(-1)(1)(2)(3)(4)}$ ($\mathtt{spread}(100) = 010000$)
- $l_5(a) = \frac{(a - 7)(a - 6)(a - 4)(a - 3)(a - 2)(a - 1)(a)}{(-2)(-1)(1)(2)(3)(4)(5)}$ ($\mathtt{spread}(101) = 010001$)
- $l_6(a) = \frac{(a - 7)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)(a)}{(-1)(1)(2)(3)(4)(5)(6)}$ ($\mathtt{spread}(110) = 010100$)
- $l_7(a) = \frac{(a - 6)(a - 5)(a - 4)(a - 3)(a - 2)(a - 1)(a)}{(1)(2)(3)(4)(5)(6)(7)}$ ($\mathtt{spread}(111) = 010101$)
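The same check applies here: the combination must map $a \mapsto \mathtt{spread}(a)$, i.e. $0, 1, 4, 5, 16, 17, 20, 21$ on $\{0, \ldots, 7\}$. A sketch with exact rationals in place of field elements:

```python
from fractions import Fraction

# spread images of 0..7, matching the gate's coefficients 1,4,5,16,17,20,21
SPREAD3 = [0, 1, 4, 5, 16, 17, 20, 21]

def lagrange_basis(j, a, points=8):
    num, den = 1, 1
    for m in range(points):
        if m != j:
            num *= a - m
            den *= j - m
    return Fraction(num, den)

def spread_3bit(a):
    return sum(SPREAD3[j] * lagrange_basis(j, a) for j in range(8))
```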
#### reduce_6 gate
Addition $\pmod{2^{32}}$ of 6 elements
Input:
- $E$
- $\{e_i^{lo}, e_i^{hi}\}_{i=0}^5$
- $carry$
Check: $E = e_0 + e_1 + e_2 + e_3 + e_4 + e_5 \pmod{2^{32}}$
Assume inputs are constrained to 16 bits.
- Addition gate (sa):
- $a_0 + a_1 + a_2 + a_3 + a_4 + a_5 + a_6 - a_7 = 0$
- Carry gate (sc):
- $2^{16} a_6 + a_6 \omega^{-1} + [(a_6 - 5)(a_6 - 4)(a_6 - 3)(a_6 - 2)(a_6 - 1)(a_6)] = 0$
sa|sc| $a_0$ | $a_1$ |$a_2$ |$a_3$ |$a_4$ |$a_5$ |$a_6$ |$a_7$ |
--|--|----------|----------|----------|----------|----------|----------|---------------|--------|
1 |0 |$e_0^{lo}$|$e_1^{lo}$|$e_2^{lo}$|$e_3^{lo}$|$e_4^{lo}$|$e_5^{lo}$|$-carry*2^{16}$|$E^{lo}$|
1 |1 |$e_0^{hi}$|$e_1^{hi}$|$e_2^{hi}$|$e_3^{hi}$|$e_4^{hi}$|$e_5^{hi}$|$carry$ |$E^{hi}$|
An alternative layout uses only four advice columns, at the cost of two extra rows. Again, assume inputs are constrained to 16 bits.
- Addition gate (sa):
- $a_0 \omega^{-1} + a_1 \omega^{-1} + a_2 \omega^{-1} + a_0 + a_1 + a_2 + a_3 \omega^{-1} - a_3 = 0$
- Carry gate (sc):
- $2^{16} a_3 \omega + a_3 \omega^{-1} = 0$
sa|sc| $a_0$ | $a_1$ |$a_2$ |$a_3$ |
--|--|----------|----------|----------|---------------|
0 |0 |$e_0^{lo}$|$e_1^{lo}$|$e_2^{lo}$|$-carry*2^{16}$|
1 |1 |$e_3^{lo}$|$e_4^{lo}$|$e_5^{lo}$|$E^{lo}$ |
0 |0 |$e_0^{hi}$|$e_1^{hi}$|$e_2^{hi}$|$carry$ |
1 |0 |$e_3^{hi}$|$e_4^{hi}$|$e_5^{hi}$|$E^{hi}$ |
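The lo/hi carry bookkeeping can be mirrored outside the circuit: with six 16-bit low halves the carry is at most 5, matching the range product in the carry gate. A Python sketch (names are illustrative):

```python
def reduce6(es):
    # Add six 32-bit words mod 2^32 via 16-bit halves and a single carry.
    lo = sum(e & 0xFFFF for e in es)
    carry = lo >> 16                 # in [0, 5] for six 16-bit addends
    e_lo = lo - (carry << 16)        # lo row: sum(e_i_lo) - carry*2^16 - E_lo = 0
    e_hi = (sum(e >> 16 for e in es) + carry) & 0xFFFF
    return e_lo + (e_hi << 16)
```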
#### reduce_7 gate
Addition $\pmod{2^{32}}$ of 7 elements
Input:
- $E$
- $\{e_i^{lo}, e_i^{hi}\}_{i=0}^6$
- $carry$
Check: $E = e_0 + e_1 + e_2 + e_3 + e_4 + e_5 + e_6 \pmod{2^{32}}$
Assume inputs are constrained to 16 bits.
- Addition gate (sa):
- $a_0 + a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 - a_8 = 0$
- Carry gate (sc):
- $2^{16} a_7 + a_7 \omega^{-1} + [(a_7 - 6)(a_7 - 5)(a_7 - 4)(a_7 - 3)(a_7 - 2)(a_7 - 1)(a_7)] = 0$
sa|sc| $a_0$ | $a_1$ |$a_2$ |$a_3$ |$a_4$ |$a_5$ |$a_6$ |$a_7$ |$a_8$ |
--|--|----------|----------|----------|----------|----------|----------|----------|---------------|--------|
1 |0 |$e_0^{lo}$|$e_1^{lo}$|$e_2^{lo}$|$e_3^{lo}$|$e_4^{lo}$|$e_5^{lo}$|$e_6^{lo}$|$-carry*2^{16}$|$E^{lo}$|
1 |1 |$e_0^{hi}$|$e_1^{hi}$|$e_2^{hi}$|$e_3^{hi}$|$e_4^{hi}$|$e_5^{hi}$|$e_6^{hi}$|$carry$ |$E^{hi}$|
### Message scheduling region
For each block $M \in \{0,1\}^{512}$ of the padded message, $64$ words of $32$ bits each are constructed as follows:
- the first $16$ are obtained by splitting $M$ into $32$-bit blocks $$M = W_0 || W_1 || \cdots || W_{14} || W_{15};$$
- the remaining $48$ words are constructed using the formula:
$$W_i = \sigma_1(W_{i-2}) \boxplus W_{i-7} \boxplus \sigma_0(W_{i-15}) \boxplus W_{i-16},$$ for $16 \leq i < 64$.
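The recurrence above, with $\sigma_0$ and $\sigma_1$ as defined in the previous sections, can be sketched directly in Python (an out-of-circuit reference, with illustrative names):

```python
def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & 0xFFFFFFFF

def sigma_0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def sigma_1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def message_schedule(block_words):
    # block_words: the sixteen 32-bit words W_0..W_15 of one 512-bit block
    w = list(block_words)
    for i in range(16, 64):
        w.append((sigma_1(w[i - 2]) + w[i - 7] + sigma_0(w[i - 15]) + w[i - 16])
                 & 0xFFFFFFFF)
    return w
```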
sw|sd0|sd1|sd2|sd3|ss0|ss0_v2|ss1|ss1_v2| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--|---|---|---|---|---|------|---|------|---------------|------------------|-----------------------------------|------------------------------|----------------------------------|---------------------------------|--------------------------------- |------------------------|----------------|--------------|
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{0}^{lo}$ | $\texttt{spread}(W_{0}^{lo})$ | $W_{0}^{lo}$ | $W_{0}^{hi}$ | $W_{0}$ |$\sigma_0(W_1)^{lo}$ |$\sigma_1(W_{14})^{lo}$ | $W_{9}^{lo}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{0}^{hi}$ | $\texttt{spread}(W_{0}^{hi})$ | | | $W_{16}$ |$\sigma_0(W_1)^{hi}$ |$\sigma_1(W_{14})^{hi}$ | $W_{9}^{hi}$ | $carry_{16}$ |
0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4} | $W_{1}^{d(14)}$ | $\texttt{spread}(W_{1}^{d(14)})$ | $W_{1}^{lo}$ | $W_{1}^{hi}$ | $W_{1}$ |$\sigma_0(W_2)^{lo}$ |$\sigma_1(W_{15})^{lo}$ | $W_{10}^{lo}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} | $W_{1}^{c(11)}$ | $\texttt{spread}(W_{1}^{c(11)})$ | $W_{1}^{a(3)}$ | $W_{1}^{b(4)}$ | $W_{17}$ |$\sigma_0(W_2)^{hi}$ |$\sigma_1(W_{15})^{hi}$ | $W_{10}^{hi}$ | $carry_{17}$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{1}^{b(4)lo}$ |$\texttt{spread}(W_{1}^{b(4)lo})$ | $W_{1}^{b(4)hi}$ |$\texttt{spread}(W_{1}^{b(4)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ |$\texttt{spread}(W_{1}^{c(11)})$ |$\texttt{spread}(W_{1}^{d(14)})$ | $W_{1}^{b(4)}$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $0$ | $0$ | $W_{1}^{a(3)}$ |$\texttt{spread}(W_{1}^{a(3)})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_0 v1 R_0$ | $\sigma_0 v1 R_1$ | $\sigma_0 v1 R_0^{even}$ | $\sigma_0 v1 R_0^{odd}$ | | | |
..|...|...|...|...|...|... |...|... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3} | $W_{14}^{g(13)}$ | $\texttt{spread}(W_{14}^{g(13)})$ | $W_{14}^{a(3)}$ | $W_{14}^{c(3)}$ | | | | | |
0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | $W_{14}^{d(7)}$ | $\texttt{spread}(W_{14}^{d(7)})$ | $W_{14}^{lo}$ | $W_{14}^{hi}$ | $W_{14}$ |$\sigma_0(W_{15})^{lo}$ |$\sigma_1(W_{28})^{lo}$ | $W_{23}^{lo}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $W_{14}^{b(4)}$ | $\texttt{spread}(W_{14}^{b(4)})$ | $W_{14}^{e(1)}$ | $W_{14}^{f(1)}$ | $W_{30}$ |$\sigma_0(W_{15})^{hi}$ |$\sigma_1(W_{28})^{hi}$ | $W_{23}^{hi}$ | $carry_{30}$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{14}^{b(4)lo}$ |$\texttt{spread}(W_{14}^{b(4)lo})$| $W_{14}^{b(4) hi}$ |$\texttt{spread}(W_{14}^{b(4)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ |$\texttt{spread}(W_{14}^{d(7)})$ |$\texttt{spread}(W_{14}^{g(13)})$| $W_{14}^{b(4)}$ | $W_{14}^{e(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $W_{14}^{a(3)}$ |$\texttt{spread}(W_{14}^{a(3)})$ | $W_{14}^{c(3)}$ |$\texttt{spread}(W_{14}^{c(3)})$ | $W_{14}^{f(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_0 v2 R_0$ | $\sigma_0 v2 R_1$ |$\sigma_0 v2 R_0^{even}$ |$\sigma_0 v2 R_0^{odd}$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{14}^{b(4)lo}$ |$\texttt{spread}(W_{14}^{b(4)lo})$| $W_{14}^{b(4) hi}$ |$\texttt{spread}(W_{14}^{b(4)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | {0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(W_{14}^{d(7)})$ | $\texttt{spread}(W_{14}^{g(13)})$ | | $W_{14}^{e(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $W_{14}^{a(3)}$ |$\texttt{spread}(W_{14}^{a(3)})$ | $W_{14}^{c(3)}$ |$\texttt{spread}(W_{14}^{c(3)})$ | $W_{14}^{f(1)}$ | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_1 v2 R_0$ | $\sigma_1 v2 R_1$ |$\sigma_1 v2 R_0^{even}$ |$\sigma_1 v2 R_0^{odd}$ | | | |
..|...|...|...|...|...|... |...|... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | {0,1,2,3} | $W_{49}^{d(13)}$ | $\texttt{spread}(W_{49}^{d(13)})$ | $W_{49}^{lo}$ | $W_{49}^{hi}$ | $W_{49}$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} | $W_{49}^{a(10)}$ | $\texttt{spread}(W_{49}^{a(10)})$ | $W_{49}^{c(2)}$ | $W_{49}^{b(7)}$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5} | $R_0^{even}$ | $\texttt{spread}(R_0^{even})$ | $W_{49}^{b(7)lo}$ |$\texttt{spread}(W_{49}^{b(7)lo})$| $W_{49}^{b(7)mid}$ |$\texttt{spread}(W_{49}^{b(7)mid})$| | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |{0,1,2,3,4,5} | $R_0^{odd}$ | $\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(W_{49}^{a(10)})$ | $\texttt{spread}(W_{49}^{d(13)})$ | $W_{49}^{b(7)}$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5} | $R_1^{even}$ | $\texttt{spread}(R_1^{even})$ | $W_{49}^{c(2)}$ |$\texttt{spread}(W_{49}^{c(2)})$ | $W_{49}^{b(7)hi}$ |$\texttt{spread}(W_{49}^{b(7)hi})$ | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5} | $R_1^{odd}$ | $\texttt{spread}(R_1^{odd})$ | $\sigma_1 v1 R_0$ | $\sigma_1 v1 R_1$ |$\sigma_1 v1 R_0^{even}$ |$\sigma_1 v1 R_0^{odd}$ | | | |
..|...|...|...|...|...|... |...|... | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{62}^{lo}$ | $\texttt{spread}(W_{62}^{lo})$ | $W_{62}^{lo}$ | $W_{62}^{hi}$ | $W_{62}$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{62}^{hi}$ | $\texttt{spread}(W_{62}^{hi})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{63}^{lo}$ | $\texttt{spread}(W_{63}^{lo})$ | $W_{63}^{lo}$ | $W_{63}^{hi}$ | $W_{63}$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2,3,4,5} | $W_{63}^{hi}$ | $\texttt{spread}(W_{63}^{hi})$ | | | | | | | |
Constraints:
- `sw`: construct word using $reduce_4$
- `sd0`: decomposition gate for $W_0, W_{62}, W_{63}$
- $W^{lo} + 2^{16} W^{hi} - W = 0$
- `sd1`: decomposition gate for $W_{1..13}$ (split into $(3,4,11,14)$-bit pieces)
- $W^{a(3)} + 2^3 W^{b(4) lo} + 2^5 W^{b(4) hi} + 2^7 W^{c(11)} + 2^{18} W^{d(14)} - W = 0$
- `sd2`: decomposition gate for $W_{14..48}$ (split into $(3,4,3,7,1,1,13)$-bit pieces)
- $W^{a(3)} + 2^3 W^{b(4) lo} + 2^5 W^{b(4) hi} + 2^7 W^{c(3)} + 2^{10} W^{d(7)} + 2^{17} W^{e(1)} + 2^{18} W^{f(1)} + 2^{19} W^{g(13)} - W = 0$
- `sd3`: decomposition gate for $W_{49..61}$ (split into $(10,7,2,13)$-bit pieces)
- $W^{a(10)} + 2^{10} W^{b(7) lo} + 2^{12} W^{b(7) mid} + 2^{15} W^{b(7) hi} + 2^{17} W^{c(2)} + 2^{19} W^{d(13)} - W = 0$
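Each decomposition gate is a weighted recombination of little-endian chunks, where a chunk's weight is $2$ raised to the sum of the preceding widths. A generic Python sketch (names are illustrative):

```python
def split_le(x, widths):
    # Split x into chunks of the given bit-widths, little end first.
    chunks = []
    for w in widths:
        chunks.append(x & ((1 << w) - 1))
        x >>= w
    return chunks

def recombine_le(chunks, widths):
    # Inverse of split_le: the decomposition gate checks this sum equals W.
    x, shift = 0, 0
    for c, w in zip(chunks, widths):
        x += c << shift
        shift += w
    return x
```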
### Compression region
```plaintext
+----------------------------------------------------------+
| |
| decompose E, |
| Σ_1(E) |
| |
| +---------------------------------------+
| | |
| | reduce_5() to get H' |
| | |
+----------------------------------------------------------+
| decompose F, decompose G |
| |
| Ch(E,F,G) |
| |
+----------------------------------------------------------+
| |
| decompose A, |
| Σ_0(A) |
| |
| |
| +---------------------------------------+
| | |
| | reduce_7() to get A_new, |
| | using H' |
| | |
+------------------+---------------------------------------+
| decompose B, decompose C |
| |
| Maj(A,B,C) |
| |
| +---------------------------------------+
| | reduce_6() to get E_new, |
| | using H' |
+------------------+---------------------------------------+
```
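The round structure in the diagram ($H'$ from $\Sigma_1$, $Ch$, $K$, $W$; then $A_{new}$ and $E_{new}$) is easiest to follow against a plain-Python reference implementation of SHA-256; this is an out-of-circuit cross-check of the round arithmetic, not the chip itself:

```python
import hashlib  # used only to cross-check the result
import struct

# FIPS 180-4 round constants and initialization vector.
K = [
    0x428A2F98, 0x71374491, 0xB5C0FBCF, 0xE9B5DBA5, 0x3956C25B, 0x59F111F1,
    0x923F82A4, 0xAB1C5ED5, 0xD807AA98, 0x12835B01, 0x243185BE, 0x550C7DC3,
    0x72BE5D74, 0x80DEB1FE, 0x9BDC06A7, 0xC19BF174, 0xE49B69C1, 0xEFBE4786,
    0x0FC19DC6, 0x240CA1CC, 0x2DE92C6F, 0x4A7484AA, 0x5CB0A9DC, 0x76F988DA,
    0x983E5152, 0xA831C66D, 0xB00327C8, 0xBF597FC7, 0xC6E00BF3, 0xD5A79147,
    0x06CA6351, 0x14292967, 0x27B70A85, 0x2E1B2138, 0x4D2C6DFC, 0x53380D13,
    0x650A7354, 0x766A0ABB, 0x81C2C92E, 0x92722C85, 0xA2BFE8A1, 0xA81A664B,
    0xC24B8B70, 0xC76C51A3, 0xD192E819, 0xD6990624, 0xF40E3585, 0x106AA070,
    0x19A4C116, 0x1E376C08, 0x2748774C, 0x34B0BCB5, 0x391C0CB3, 0x4ED8AA4A,
    0x5B9CCA4F, 0x682E6FF3, 0x748F82EE, 0x78A5636F, 0x84C87814, 0x8CC70208,
    0x90BEFFFA, 0xA4506CEB, 0xBEF9A3F7, 0xC67178F2,
]
IV = [0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
      0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19]
MASK = 0xFFFFFFFF

def rotr(x, k):
    return ((x >> k) | (x << (32 - k))) & MASK

def sha256_hex(msg):
    ml = 8 * len(msg)
    msg = msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64) + struct.pack(">Q", ml)
    h = IV[:]
    for off in range(0, len(msg), 64):
        w = list(struct.unpack(">16I", msg[off:off + 64]))
        for i in range(16, 64):  # message schedule
            s0 = rotr(w[i-15], 7) ^ rotr(w[i-15], 18) ^ (w[i-15] >> 3)
            s1 = rotr(w[i-2], 17) ^ rotr(w[i-2], 19) ^ (w[i-2] >> 10)
            w.append((w[i-16] + s0 + w[i-7] + s1) & MASK)
        a, b, c, d, e, f, g, hh = h
        for i in range(64):
            # H' = H + Sigma_1(E) + Ch(E, F, G) + K_i + W_i
            hp = (hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
                  + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
            # A_new = Sigma_0(A) + Maj(A, B, C) + H';  E_new = D + H'
            a_new = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
                     + ((a & b) ^ (a & c) ^ (b & c)) + hp) & MASK
            a, b, c, d, e, f, g, hh = a_new, a, b, c, (d + hp) & MASK, e, f, g
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(struct.pack(">I", x) for x in h).hex()
```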
#### Initial round:
s_digest|sd_abcd|sd_efgh|ss0|ss1|s_maj|s_ch_neg|s_ch|s_a_new |s_e_new |s_h_prime| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--------|-------|-------|---|---|-----|--------|----|--------|--------|---------|-------------|------------|-----------------------------|-------------------------------------|-------------------------------------|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|------------------------------------|
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$E_0 d(7)$ |$\texttt{spread}(E_0 d(7)) $ | $E_0 b(5)^{lo}$ | $\texttt{spread}(E_0 b(5)^{lo})$ | $E_0 b(5)^{hi}$ | $\texttt{spread}(E_0 b(5)^{hi}) $ | $E_0^{lo}$ | $\mathtt{spread}(E_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$E_0 c(14)$ |$\texttt{spread}(E_0 c(14))$ | $E_0 a(6)^{lo}$ | $\texttt{spread}(E_0 a(6)^{lo})$ | $E_0 a(6)^{hi}$ | $\texttt{spread}(E_0 a(6)^{hi}) $ | $E_0^{hi}$ | $\mathtt{spread}(E_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(E_0 b(5)^{lo})$ | $\texttt{spread}(E_0 b(5)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(E_0 d(7))$ | $\texttt{spread}(E_0 c(14))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(E_0 a(6)^{lo})$ | $\texttt{spread}(E_0 a(6)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$F_0 d(7)$ |$\texttt{spread}(F_0 d(7)) $ | $F_0 b(5)^{lo}$ | $\texttt{spread}(F_0 b(5)^{lo})$ | $F_0 b(5)^{hi}$ | $\texttt{spread}(F_0 b(5)^{hi}) $ | $F_0^{lo}$ | $\mathtt{spread}(F_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$F_0 c(14)$ |$\texttt{spread}(F_0 c(14))$ | $F_0 a(6)^{lo}$ | $\texttt{spread}(F_0 a(6)^{lo})$ | $F_0 a(6)^{hi}$ | $\texttt{spread}(F_0 a(6)^{hi}) $ | $F_0^{hi}$ | $\mathtt{spread}(F_0^{hi})$ | |
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$G_0 d(7)$ |$\texttt{spread}(G_0 d(7)) $ | $G_0 b(5)^{lo}$ | $\texttt{spread}(G_0 b(5)^{lo})$ | $G_0 b(5)^{hi}$ | $\texttt{spread}(G_0 b(5)^{hi}) $ | $G_0^{lo}$ | $\mathtt{spread}(G_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$G_0 c(14)$ |$\texttt{spread}(G_0 c(14))$ | $G_0 a(6)^{lo}$ | $\texttt{spread}(G_0 a(6)^{lo})$ | $G_0 a(6)^{hi}$ | $\texttt{spread}(G_0 a(6)^{hi}) $ | $G_0^{hi}$ | $\mathtt{spread}(G_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_0^{even}$|$\texttt{spread}(P_0^{even})$| $\mathtt{spread}(E^{lo})$ | $\mathtt{spread}(E^{hi})$ | $Q_0^{odd}$ | $K_0^{lo}$ | $H_0^{lo}$ | $W_0^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |{0,1,2,3,4,5}|$P_0^{odd}$ |$\texttt{spread}(P_0^{odd})$ | $\texttt{spread}(P_1^{odd})$ | $\Sigma_1(E_0)^{lo}$ | $\Sigma_1(E_0)^{hi}$ | $K_0^{hi}$ | $H_0^{hi}$ | $W_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{even}$|$\texttt{spread}(P_1^{even})$| $\mathtt{spread}(F^{lo})$ | $\mathtt{spread}(F^{hi})$ | $Q_1^{odd}$ | $P_1^{odd}$ | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | $Hprime_0 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{odd}$ |$\texttt{spread}(P_1^{odd})$ | | | | | $D_0^{lo}$ | $E_1^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |{0,1,2,3,4,5}|$Q_0^{even}$|$\texttt{spread}(Q_0^{even})$| $\mathtt{spread}(E_{neg}^{lo})$ | $\mathtt{spread}(E_{neg}^{hi})$ | $\mathtt{spread}(E^{lo})$ | | $D_0^{hi}$ | $E_1^{hi}$ | $E_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_0^{odd}$ |$\texttt{spread}(Q_0^{odd})$ | $\texttt{spread}(Q_1^{odd})$ | | $\mathtt{spread}(E^{hi})$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{even}$|$\texttt{spread}(Q_1^{even})$| $\mathtt{spread}(G^{lo})$ | $\mathtt{spread}(G^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{odd}$ |$\texttt{spread}(Q_1^{odd})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$A_0 b(11)$ |$\texttt{spread}(A_0 b(11))$ | $A_0 c(9)^{lo}$ | $\texttt{spread}(A_0 c(9)^{lo})$ | $A_0 c(9)^{mid}$ | $\texttt{spread}(A_0 c(9)^{mid})$ | $A_0^{lo}$ | $\mathtt{spread}(A_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$A_0 d(10)$ |$\texttt{spread}(A_0 d(10))$ | $A_0 a(2)$ | $\texttt{spread}(A_0 a(2))$ | $A_0 c(9)^{hi}$ | $\texttt{spread}(A_0 c(9)^{hi})$ | $A_0^{hi}$ | $\mathtt{spread}(A_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(c(9)^{lo})$ | $\texttt{spread}(c(9)^{mid})$ | | | | | |
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(10))$ | $\texttt{spread}(b(11))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(a(2))$ | $\texttt{spread}(c(9)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$B_0 b(11)$ |$\texttt{spread}(B_0 b(11))$ | $B_0 c(9)^{lo}$ | $\texttt{spread}(B_0 c(9)^{lo})$ | $B_0 c(9)^{mid}$ | $\texttt{spread}(B_0 c(9)^{mid})$ | $B_0^{lo}$ | $\mathtt{spread}(B_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$B_0 d(10)$ |$\texttt{spread}(B_0 d(10))$ | $B_0 a(2)$ | $\texttt{spread}(B_0 a(2))$ | $B_0 c(9)^{hi}$ | $\texttt{spread}(B_0 c(9)^{hi})$ | $B_0^{hi}$ | $\mathtt{spread}(B_0^{hi})$ | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$C_0 b(11)$ |$\texttt{spread}(C_0 b(11))$ | $C_0 c(9)^{lo}$ | $\texttt{spread}(C_0 c(9)^{lo})$ | $C_0 c(9)^{mid}$ | $\texttt{spread}(C_0 c(9)^{mid})$ | $C_0^{lo}$ | $\mathtt{spread}(C_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$C_0 d(10)$ |$\texttt{spread}(C_0 d(10))$ | $C_0 a(2)$ | $\texttt{spread}(C_0 a(2))$ | $C_0 c(9)^{hi}$ | $\texttt{spread}(C_0 c(9)^{hi})$ | $C_0^{hi}$ | $\mathtt{spread}(C_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{even}$|$\texttt{spread}(M_0^{even})$| $M_1^{odd}$ | $\mathtt{spread}(A_0^{lo})$ | $\mathtt{spread}(A_0^{hi})$ | | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{odd}$ |$\texttt{spread}(M_0^{odd})$ | $\texttt{spread}(M_1^{odd})$ | $\mathtt{spread}(B_0^{lo})$ | $\mathtt{spread}(B_0^{hi})$ | $\Sigma_0(A_0)^{lo}$ | | $A_1^{lo}$ | $A_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{even}$|$\texttt{spread}(M_1^{even})$| | $\mathtt{spread}(C_0^{lo})$ | $\mathtt{spread}(C_0^{hi})$ | $\Sigma_0(A_0)^{hi}$ | | $A_1^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{odd}$ |$\texttt{spread}(M_1^{odd})$ | | | | | | | |
#### Steady-state:
s_digest|sd_abcd|sd_efgh|ss0|ss1|s_maj|s_ch_neg|s_ch|s_a_new |s_e_new |s_h_prime| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--------|-------|-------|---|---|-----|--------|----|--------|--------|---------|-------------|------------|-----------------------------|-------------------------------------|-------------------------------------|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|------------------------------------|
0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$E_0 d(7)$ |$\texttt{spread}(E_0 d(7)) $ | $E_0 b(5)^{lo}$ | $\texttt{spread}(E_0 b(5)^{lo})$ | $E_0 b(5)^{hi}$ | $\texttt{spread}(E_0 b(5)^{hi}) $ | $E_0^{lo}$ | $\mathtt{spread}(E_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$E_0 c(14)$ |$\texttt{spread}(E_0 c(14))$ | $E_0 a(6)^{lo}$ | $\texttt{spread}(E_0 a(6)^{lo})$ | $E_0 a(6)^{hi}$ | $\texttt{spread}(E_0 a(6)^{hi}) $ | $E_0^{hi}$ | $\mathtt{spread}(E_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(E_0 b(5)^{lo})$ | $\texttt{spread}(E_0 b(5)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(E_0 d(7))$ | $\texttt{spread}(E_0 c(14))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(E_0 a(6)^{lo})$ | $\texttt{spread}(E_0 a(6)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_0^{even}$|$\texttt{spread}(P_0^{even})$| $\mathtt{spread}(E^{lo})$ | $\mathtt{spread}(E^{hi})$ | $Q_0^{odd}$ | $K_0^{lo}$ | $H_0^{lo}$ | $W_0^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |{0,1,2,3,4,5}|$P_0^{odd}$ |$\texttt{spread}(P_0^{odd})$ | $\texttt{spread}(P_1^{odd})$ | $\Sigma_1(E_0)^{lo}$ | $\Sigma_1(E_0)^{hi}$ | $K_0^{hi}$ | $H_0^{hi}$ | $W_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{even}$|$\texttt{spread}(P_1^{even})$| $\mathtt{spread}(F^{lo})$ | $\mathtt{spread}(F^{hi})$ | $Q_1^{odd}$ | $P_1^{odd}$ | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | $Hprime_0 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$P_1^{odd}$ |$\texttt{spread}(P_1^{odd})$ | | | | | $D_0^{lo}$ | $E_1^{lo}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |{0,1,2,3,4,5}|$Q_0^{even}$|$\texttt{spread}(Q_0^{even})$| $\mathtt{spread}(E_{neg}^{lo})$ | $\mathtt{spread}(E_{neg}^{hi})$ | $\mathtt{spread}(E^{lo})$ | | $D_0^{hi}$ | $E_1^{hi}$ | $E_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_0^{odd}$ |$\texttt{spread}(Q_0^{odd})$ | $\texttt{spread}(Q_1^{odd})$ | | $\mathtt{spread}(E^{hi})$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{even}$|$\texttt{spread}(Q_1^{even})$| $\mathtt{spread}(G^{lo})$ | $\mathtt{spread}(G^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$Q_1^{odd}$ |$\texttt{spread}(Q_1^{odd})$ | | | | | | | |
0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1,2} |$A_0 b(11)$ |$\texttt{spread}(A_0 b(11))$ | $A_0 c(9)^{lo}$ | $\texttt{spread}(A_0 c(9)^{lo})$ | $A_0 c(9)^{mid}$ | $\texttt{spread}(A_0 c(9)^{mid})$ | $A_0^{lo}$ | $\mathtt{spread}(A_0^{lo})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | {0,1} |$A_0 d(10)$ |$\texttt{spread}(A_0 d(10))$ | $A_0 a(2)$ | $\texttt{spread}(A_0 a(2))$ | $A_0 c(9)^{hi}$ | $\texttt{spread}(A_0 c(9)^{hi})$ | $A_0^{hi}$ | $\mathtt{spread}(A_0^{hi})$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{even}$|$\texttt{spread}(R_0^{even})$| $\texttt{spread}(c(9)^{lo})$ | $\texttt{spread}(c(9)^{mid})$ | | | | | |
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_0^{odd}$ |$\texttt{spread}(R_0^{odd})$ | $\texttt{spread}(R_1^{odd})$ | $\texttt{spread}(d(10))$ | $\texttt{spread}(b(11))$ | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{even}$|$\texttt{spread}(R_1^{even})$| $\texttt{spread}(a(2))$ | $\texttt{spread}(c(9)^{hi})$ | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$R_1^{odd}$ |$\texttt{spread}(R_1^{odd})$ | | | | | | | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{even}$|$\texttt{spread}(M_0^{even})$| $M_1^{odd}$ | $\mathtt{spread}(A_0^{lo})$ | $\mathtt{spread}(A_0^{hi})$ | | $Hprime_0^{lo}$ | $Hprime_0^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |{0,1,2,3,4,5}|$M_0^{odd}$ |$\texttt{spread}(M_0^{odd})$ | $\texttt{spread}(M_1^{odd})$ | $\mathtt{spread}(B_0^{lo})$ | $\mathtt{spread}(B_0^{hi})$ | $\Sigma_0(A_0)^{lo}$ | | $A_1^{lo}$ | $A_1 carry$ |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{even}$|$\texttt{spread}(M_1^{even})$| | $\mathtt{spread}(C_0^{lo})$ | $\mathtt{spread}(C_0^{hi})$ | $\Sigma_0(A_0)^{hi}$ | | $A_1^{hi}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |{0,1,2,3,4,5}|$M_1^{odd}$ |$\texttt{spread}(M_1^{odd})$ | | | | | | | |
#### Final digest:
s_digest|sd_abcd|sd_efgh|ss0|ss1|s_maj|s_ch_neg|s_ch|s_a_new |s_e_new |s_h_prime| $a_0$ | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ | $a_7$ | $a_8$ | $a_9$ |
--------|-------|-------|---|---|-----|--------|----|--------|--------|---------|-------------|-------------|------------------------------|-------------------------------------|-------------------------------------|----------------------------------------|------------------------------------|------------------------------------|------------------------------------|------------------------------------|
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $A_{63}^{lo}$ | $A_{63}^{hi}$ | $A_{63}$ | $B_{63}^{lo}$ | $B_{63}^{hi}$ | $B_{63}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $C_{63}^{lo}$ | $C_{63}^{hi}$ | $C_{63}$ | $D_{63}^{lo}$ | $D_{63}^{hi}$ | $D_{63}$ | |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $E_{63}^{lo}$ | $E_{63}^{hi}$ | $E_{63}$ | $G_{63}^{lo}$ | $G_{63}^{hi}$ | $G_{63}$ | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | $F_{63}^{lo}$ | $F_{63}^{hi}$ | $F_{63}$ | $H_{63}^{lo}$ | $H_{63}^{hi}$ | $H_{63}$ | |
@ -1,33 +1 @@
# Implementation
## Proofs as opaque byte streams
In proving system implementations like `bellman`, there is a concrete `Proof` struct that
encapsulates the proof data, is returned by a prover, and can be passed to a verifier.
`halo2` does not contain any proof-like structures, for several reasons:
- A `Proof` structure would contain vectors of (vectors of) curve points and scalars.
This complicates serialization/deserialization of proofs because the lengths of these
vectors depend on the configuration of the circuit. However, we didn't want to encode
the lengths of vectors inside of proofs, because at runtime the circuit is fixed, and
thus so are the proof sizes.
- It's easy to accidentally put data into a `Proof` structure that isn't also placed in the
transcript, which is a hazard when developing and implementing a proving system.
- We needed to be able to create multiple PLONK proofs at the same time; such proofs
share many substructures when they are for the same circuit.
Instead, `halo2` treats proof objects as opaque byte streams. Creation and consumption of
these byte streams happens via the transcript:
- The `TranscriptWrite` trait represents something that we can write proof components to
(at proving time).
- The `TranscriptRead` trait represents something that we can read proof components from
(at verifying time).
Crucially, implementations of `TranscriptWrite` are responsible for writing to some
`std::io::Write` buffer at the same time that they hash things into the transcript, and
similarly for `TranscriptRead`/`std::io::Read`.
As a bonus, treating proofs as opaque byte streams ensures that verification accounts for
the cost of deserialization, which isn't negligible due to point compression.
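As a rough illustration of this pattern, the following sketch uses toy names (a
standard-library hasher stands in for the real transcript hash, and `ToyTranscriptWriter`
is not a halo2 type): each proof component is folded into a running hash *and* appended to
the underlying byte stream, while challenges are squeezed from the hash and never
serialized.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, Write};

/// Toy stand-in for `TranscriptWrite`: hashes every proof component into
/// the transcript while also writing it to the proof byte stream.
struct ToyTranscriptWriter<W: Write> {
    writer: W,
    hasher: DefaultHasher,
}

impl<W: Write> ToyTranscriptWriter<W> {
    fn new(writer: W) -> Self {
        Self { writer, hasher: DefaultHasher::new() }
    }

    /// Absorb a proof component: hash it, then append it to the stream.
    fn write_component(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.hasher.write(bytes);
        self.writer.write_all(bytes)
    }

    /// Derive a (toy) challenge from everything absorbed so far.
    fn squeeze_challenge(&mut self) -> u64 {
        let c = self.hasher.finish();
        self.hasher.write_u64(c); // fold the challenge back in
        c
    }
}

fn transcript_demo() -> (Vec<u8>, u64) {
    let mut proof: Vec<u8> = Vec::new();
    let challenge;
    {
        let mut t = ToyTranscriptWriter::new(&mut proof);
        t.write_component(b"commitment A").unwrap();
        challenge = t.squeeze_challenge();
        t.write_component(b"evaluation at x").unwrap();
    }
    (proof, challenge)
}

fn main() {
    let (proof, challenge) = transcript_demo();
    // The proof is just the concatenated byte stream; the challenge is
    // re-derived by the verifier rather than serialized.
    assert_eq!(proof, b"commitment Aevaluation at x".to_vec());
    // Same transcript contents, same challenge (Fiat-Shamir determinism).
    assert_eq!(challenge, transcript_demo().1);
}
```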

View File

@ -1,7 +1,7 @@
# Fields
The [Pasta curves](https://electriccoin.co/blog/the-pasta-curves-for-halo-2-and-beyond/)
that we use in `halo2` are designed to be highly 2-adic, meaning that a large $2^S$
are designed to be highly 2-adic, meaning that a large $2^S$
[multiplicative subgroup](../../background/fields.md#multiplicative-subgroups) exists in
each field. That is, we can write $p - 1 \equiv 2^S \cdot T$ with $T$ odd. For both Pallas
and Vesta, $S = 32$; this helps to simplify the field implementations.
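To make the claim $p - 1 \equiv 2^S \cdot T$ with $S = 32$ concrete, this sketch counts
the trailing zero bits of $p - 1$ directly from the hex encodings of the two moduli (the
hex values are the published Pasta moduli; the helper name is ours):

```rust
/// Count the trailing zero bits of (p - 1), given p as a big-endian hex
/// string whose least-significant hex digit is 1 (true for both moduli,
/// so subtracting 1 clears that digit with no borrow).
fn two_adicity(p_hex: &str) -> u32 {
    let mut digits: Vec<u32> = p_hex
        .chars()
        .map(|c| c.to_digit(16).unwrap())
        .collect();
    let last = digits.len() - 1;
    assert_eq!(digits[last], 1);
    digits[last] = 0; // p - 1
    let mut s = 0;
    for &d in digits.iter().rev() {
        if d == 0 {
            s += 4; // a zero hex digit contributes four zero bits
        } else {
            s += d.trailing_zeros();
            break;
        }
    }
    s
}

fn main() {
    // Pallas base field modulus p and Vesta base field modulus q.
    let p = "40000000000000000000000000000000224698fc094cf91b992d30ed00000001";
    let q = "40000000000000000000000000000000224698fc0994a8dd8c46eb2100000001";
    assert_eq!(two_adicity(p), 32);
    assert_eq!(two_adicity(q), 32);
}
```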
@ -9,8 +9,8 @@ and Vesta, $S = 32$; this helps to simplify the field implementations.
## Sarkar square-root algorithm (table-based variant)
We use a technique from [Sarkar2020](https://eprint.iacr.org/2020/1407.pdf) to compute
[square roots](../../background/fields.md#square-roots) in `halo2`. The intuition behind
the algorithm is that we can split the task into computing square roots in each
[square roots](../../background/fields.md#square-roots) in `pasta_curves`. The intuition
behind the algorithm is that we can split the task into computing square roots in each
multiplicative subgroup.
Suppose we want to find the square root of $u$ modulo one of the Pasta primes $p$, where
@ -1,74 +0,0 @@
# Proving system
The Halo 2 proving system can be broken down into five stages:
1. Commit to polynomials encoding the main components of the circuit:
- Cell assignments.
- Permuted values and products for each lookup argument.
- Equality constraint permutations.
2. Construct the vanishing argument to constrain all circuit relations to zero:
- Standard and custom gates.
- Lookup argument rules.
- Equality constraint permutation rules.
3. Evaluate the above polynomials at all necessary points:
- All relative rotations used by custom gates across all columns.
- Vanishing argument pieces.
4. Construct the multipoint opening argument to check that all evaluations are consistent
with their respective commitments.
5. Run the inner product argument to create a polynomial commitment opening proof for the
multipoint opening argument polynomial.
These stages are presented in turn across this section of the book.
## Example
To aid our explanations, we will at times refer to the following example constraint
system:
- Four advice columns $a, b, c, d$.
- One fixed column $f$.
- Three custom gates:
- $a \cdot b \cdot c_{-1} - d = 0$
- $f_{-1} \cdot c = 0$
- $f \cdot d \cdot a = 0$
## tl;dr
The table below provides a (probably too) succinct description of the Halo 2 protocol.
This description will likely be replaced by the Halo 2 paper and security proof, but for
now serves as a summary of the following sub-sections.
| Prover | | Verifier |
| --------------------------------------------------------------------------- | ------- | ---------------------------------- |
| | $\larr$ | $t(X) = (X^n - 1)$ |
| | $\larr$ | $F = [F_0, F_1, \dots, F_{m - 1}]$ |
| $\mathbf{A} = [A_0, A_1, \dots, A_{m - 1}]$ | $\rarr$ | |
| | $\larr$ | $\theta$ |
| $\mathbf{L} = [(A'_0, S'_0), \dots, (A'_{m - 1}, S'_{m - 1})]$ | $\rarr$ | |
| | $\larr$ | $\beta, \gamma$ |
| $\mathbf{Z_P} = [Z_{P,0}, Z_{P,1}, \ldots]$ | $\rarr$ | |
| $\mathbf{Z_L} = [Z_{L,0}, Z_{L,1}, \ldots]$ | $\rarr$ | |
| | $\larr$ | $y$ |
| $h(X) = \frac{\text{gate}_0(X) + \dots + y^i \cdot \text{gate}_i(X)}{t(X)}$ | | |
| $h(X) = h_0(X) + \dots + X^{n(d-1)} h_{d-1}(X)$ | | |
| $\mathbf{H} = [H_0, H_1, \dots, H_{d-1}]$ | $\rarr$ | |
| | $\larr$ | $x$ |
| $evals = [A_0(x), \dots, H_{d - 1}(x)]$ | $\rarr$ | |
| | | Checks $h(x)$ |
| | $\larr$ | $x_1, x_2$ |
| Constructs $h'(X)$ multipoint opening poly | | |
| $U = \text{Commit}(h'(X))$ | $\rarr$ | |
| | $\larr$ | $x_3$ |
| $\mathbf{q}_\text{evals} = [Q_0(x_3), Q_1(x_3), \dots]$ | $\rarr$ | |
| $u_\text{eval} = U(x_3)$ | $\rarr$ | |
| | $\larr$ | $x_4$ |
Then the prover and verifier:
- Construct $\text{finalPoly}(X)$ as a linear combination of $\mathbf{Q}$ and $U$ using
powers of $x_4$;
- Construct $\text{finalPolyEval}$ as the equivalent linear combination of
$\mathbf{q}_\text{evals}$ and $u_\text{eval}$; and
- Perform $\text{InnerProduct}(\text{finalPoly}(X), x_3, \text{finalPolyEval}).$
> TODO: Write up protocol components that provide zero-knowledge.
@ -1,102 +0,0 @@
# Circuit commitments
## Committing to the circuit assignments
At the start of proof creation, the prover has a table of cell assignments that it claims
satisfy the constraint system. The table has $n = 2^k$ rows, and is broken into advice,
instance, and fixed columns. We define $F_{i,j}$ as the assignment in the $j$th row of
the $i$th fixed column. Without loss of generality, we'll similarly define $A_{i,j}$ to
represent the advice and instance assignments.
> We separate fixed columns here because they are provided by the verifier, whereas the
> advice and instance columns are provided by the prover. In practice, the commitments to
> instance and fixed columns are computed by both the prover and verifier, and only the
> advice commitments are stored in the proof.
To commit to these assignments, we construct Lagrange polynomials of degree $n - 1$ for
each column, over an evaluation domain of size $n$ (where $\omega$ is the $n$th primitive
root of unity):
- $a_i(X)$ interpolates such that $a_i(\omega^j) = A_{i,j}$.
- $f_i(X)$ interpolates such that $f_i(\omega^j) = F_{i,j}$.
We then create a blinding commitment to the polynomial for each column:
$$\mathbf{A} = [\text{Commit}(a_0(X)), \dots, \text{Commit}(a_i(X))]$$
$$\mathbf{F} = [\text{Commit}(f_0(X)), \dots, \text{Commit}(f_i(X))]$$
$\mathbf{F}$ is constructed as part of key generation, using a blinding factor of $1$.
$\mathbf{A}$ is constructed by the prover and sent to the verifier.
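The interpolation step can be sketched over a toy field with a small root of unity
($p = 17$, $\omega = 4$ of order $4$); the real fields and the commitment scheme are of
course much larger, and all names below are ours:

```rust
const P: u128 = 17; // toy field: 4 divides P - 1, so ω = 4 has order 4

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn mul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }
fn pow(mut b: u128, mut e: u128) -> u128 {
    let mut acc = 1;
    while e > 0 { if e & 1 == 1 { acc = mul(acc, b); } b = mul(b, b); e >>= 1; }
    acc
}
fn inv(a: u128) -> u128 { pow(a, P - 2) } // Fermat's little theorem
fn eval(p: &[u128], x: u128) -> u128 {
    p.iter().rev().fold(0, |acc, &c| add(mul(acc, x), c)) // Horner
}

/// Lagrange interpolation: the unique polynomial of degree < n through
/// (xs[j], ys[j]), coefficients returned low degree first.
fn interpolate(xs: &[u128], ys: &[u128]) -> Vec<u128> {
    let n = xs.len();
    let mut out = vec![0u128; n];
    for j in 0..n {
        let mut num = vec![1u128]; // ∏_{k≠j} (X - x_k)
        let mut denom = 1u128;     // ∏_{k≠j} (x_j - x_k)
        for k in 0..n {
            if k == j { continue; }
            let mut next = vec![0u128; num.len() + 1];
            for (i, &coeff) in num.iter().enumerate() {
                next[i + 1] = add(next[i + 1], coeff);
                next[i] = add(next[i], mul(coeff, sub(0, xs[k])));
            }
            num = next;
            denom = mul(denom, sub(xs[j], xs[k]));
        }
        let scale = mul(ys[j], inv(denom));
        for i in 0..n {
            out[i] = add(out[i], mul(scale, num[i]));
        }
    }
    out
}

fn main() {
    let omega: u128 = 4;
    let column = [5u128, 11, 0, 9]; // one column of assignments A_{i,j}
    let xs: Vec<u128> = (0..4).map(|j| pow(omega, j)).collect();
    let a = interpolate(&xs, &column);
    // The committed polynomial satisfies a_i(ω^j) = A_{i,j} in every row.
    for j in 0..4 {
        assert_eq!(eval(&a, xs[j]), column[j]);
    }
}
```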
## Committing to the lookup permutations
The verifier starts by sampling $\theta$, which is used to keep individual columns within
lookups independent. Then, the prover commits to the permutations for each lookup as
follows:
- Given a lookup with input column polynomials $[A_0(X), \dots, A_{m-1}(X)]$ and table
column polynomials $[S_0(X), \dots, S_{m-1}(X)]$, the prover constructs two compressed
polynomials
$$A_\text{compressed}(X) = \theta^{m-1} A_0(X) + \theta^{m-2} A_1(X) + \dots + \theta A_{m-2}(X) + A_{m-1}(X)$$
$$S_\text{compressed}(X) = \theta^{m-1} S_0(X) + \theta^{m-2} S_1(X) + \dots + \theta S_{m-2}(X) + S_{m-1}(X)$$
- The prover then permutes $A_\text{compressed}(X)$ and $S_\text{compressed}(X)$ according
to the [rules of the lookup argument](lookup.md), obtaining $A'(X)$ and $S'(X)$.
Finally, the prover creates blinding commitments for all of the lookups
$$\mathbf{L} = \left[ \left(\text{Commit}(A'(X)), \text{Commit}(S'(X))\right), \dots \right]$$
and sends them to the verifier.
## Committing to the equality constraint permutations
The verifier samples $\beta$ and $\gamma$.
For each equality constraint argument:
- The prover constructs a vector $P$:
$$
P_j = \prod\limits_{i=0}^{m-1} \frac{p_i(\omega^j) + \beta \cdot \delta^i \cdot \omega^j + \gamma}{p_i(\omega^j) + \beta \cdot s_i(\omega^j) + \gamma}
$$
- The prover constructs a polynomial $Z_P$ which has a Lagrange basis representation
corresponding to a running product of $P$, starting at $Z_P(1) = 1$.
See the [Permutation argument](permutation.md#argument-specification) section for more detail.
The prover creates blinding commitments to each $Z_P$ polynomial:
$$\mathbf{Z_P} = \left[\text{Commit}(Z_P(X)), \dots \right]$$
and sends them to the verifier.
## Committing to the lookup permutation product columns
In addition to committing to the individual permuted lookups, for each lookup,
the prover needs to commit to the permutation product column:
- The prover constructs a vector $P$:
$$
P_j = \frac{(A_\text{compressed}(\omega^j) + \beta)(S_\text{compressed}(\omega^j) + \gamma)}{(A'(\omega^j) + \beta)(S'(\omega^j) + \gamma)}
$$
- The prover constructs a polynomial $Z_L$ which has a Lagrange basis representation
corresponding to a running product of $P$, starting at $Z_L(1) = 1$.
$\beta$ and $\gamma$ are used to combine the permutation arguments for $A'(X)$ and $S'(X)$
while keeping them independent. We can reuse $\beta$ and $\gamma$ from the equality
constraint permutation here because they serve the same purpose in both places, and we
aren't trying to combine the lookup and equality constraint permutation arguments. The
important thing here is that the verifier samples $\beta$ and $\gamma$ after the prover
has created $\mathbf{A}$, $\mathbf{F}$, and $\mathbf{L}$ (and thus committed to all the
cell values used in lookup columns, as well as $A'(X)$ and $S'(X)$ for each lookup).
As before, the prover creates blinding commitments to each $Z_L$ polynomial:
$$\mathbf{Z_L} = \left[\text{Commit}(Z_L(X)), \dots \right]$$
and sends them to the verifier.
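The grand-product mechanics can be sketched over a toy prime field (the modulus,
challenges, and column values below are arbitrary stand-ins, with single columns in place
of the compressed columns): because $A'$ and $S'$ are permutations of $A$ and $S$, the
running product of the $P_j$ terms returns to $1$ after the last row.

```rust
const P: u128 = 7919; // toy prime standing in for the real field modulus

fn mul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }
fn inv(a: u128) -> u128 {
    // Fermat's little theorem: a^(P-2) ≡ a^(-1) (mod P).
    let (mut b, mut e, mut acc) = (a % P, P - 2, 1);
    while e > 0 { if e & 1 == 1 { acc = mul(acc, b); } b = mul(b, b); e >>= 1; }
    acc
}

/// Z[0] = 1 and Z[j+1] = Z[j] · P_j, with P_j as in the text.
fn running_product(a: &[u128], s: &[u128], a_p: &[u128], s_p: &[u128],
                   beta: u128, gamma: u128) -> Vec<u128> {
    let mut z = vec![1u128];
    for j in 0..a.len() {
        let num = mul(a[j] + beta, s[j] + gamma);
        let den = mul(a_p[j] + beta, s_p[j] + gamma);
        z.push(mul(z[j], mul(num, inv(den))));
    }
    z
}

fn main() {
    // A' and S' are (arbitrary) permutations of A and S respectively.
    let a = [5u128, 3, 5, 7];
    let s = [3u128, 5, 7, 9];
    let a_p = [3u128, 5, 5, 7];
    let s_p = [9u128, 5, 3, 7];
    let z = running_product(&a, &s, &a_p, &s_p, 11, 13);
    // The numerator and denominator multisets match, so the running
    // product returns to 1 after the final row.
    assert_eq!(*z.last().unwrap(), 1);
}
```

Note that this only demonstrates the product column itself; the lookup rules constraining
*how* $A'$ and $S'$ must be arranged are covered in the lookup argument section.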

View File

@ -1,54 +0,0 @@
# Comparison to other work
## BCMS20 Appendix A.2
Appendix A.2 of [BCMS20] describes a polynomial commitment scheme that is similar to the
one described in [BGH19] (BCMS20 being a generalization of the original Halo paper). Halo
2 builds on both of these works, and thus itself uses a polynomial commitment scheme that
is very similar to the one in BCMS20.
[BGH19]: https://eprint.iacr.org/2019/1021
[BCMS20]: https://eprint.iacr.org/2020/499
The following table provides a mapping between the variable names in BCMS20, and the
equivalent objects in Halo 2 (which builds on the nomenclature from the Halo paper):
| BCMS20 | Halo 2 |
| :------------: | :-----------------: |
| $S$ | $H$ |
| $H$ | $U$ |
| $C$ | `msm` or $P$ |
| $\alpha$ | $\iota$ |
| $\xi_0$ | $z$ |
| $\xi_i$ | `challenge_i` |
| $H'$ | $[z] U$ |
| $\bar{p}$ | `s_poly` |
| $\bar{\omega}$ | `s_poly_blind` |
| $\bar{C}$ | `s_poly_commitment` |
| $h(X)$ | $g(X)$ |
| $\omega'$ | `blind` / $\xi$ |
| $\mathbf{c}$ | $\mathbf{a}$ |
| $c$ | $a = \mathbf{a}_0$ |
| $v'$ | $ab$ |
Halo 2's polynomial commitment scheme differs from Appendix A.2 of BCMS20 in two ways:
1. Step 8 of the $\text{Open}$ algorithm computes a "non-hiding" commitment $C'$ prior to
the inner product argument, which opens to the same value as $C$ but is a commitment to
a randomly-drawn polynomial. The remainder of the protocol involves no blinding. By
contrast, in Halo 2 we blind every single commitment that we make (even for instance
and fixed polynomials, though using a blinding factor of 1 for the fixed polynomials);
this makes the protocol simpler to reason about. As a consequence of this, the verifier
needs to handle the cumulative blinding factor at the end of the protocol, and so there
is no need to derive an equivalent to $C'$ at the start of the protocol.
- $C'$ is also an input to the random oracle for $\xi_0$; in Halo 2 we utilize a
transcript that has already committed to the equivalent components of $C'$ prior to
sampling $z$.
2. The $\text{PC}_\text{DL}.\text{SuccinctCheck}$ subroutine (Figure 2 of BCMS20) computes
the initial group element $C_0$ by adding $[v] H' = [v \epsilon] H$, which requires two
scalar multiplications. Instead, we subtract $[v] G_0$ from the original commitment $P$,
so that we're effectively opening the polynomial at the point to the value zero. The
computation $[v] G_0$ is more efficient in the context of recursion because $G_0$ is a
fixed base (so we can use lookup tables).
@ -1,11 +0,0 @@
# Inner product argument
Halo 2 uses a polynomial commitment scheme for which we can create polynomial commitment
opening proofs, based around the Inner Product Argument.
> TODO: Explain Halo 2's variant of the IPA.
>
> It is very similar to $\text{PC}_\text{DL}.\text{Open}$ from Appendix A.2 of [BCMS20].
> See [this comparison](comparison.md#bcms20-appendix-a2) for details.
>
> [BCMS20]: https://eprint.iacr.org/2020/499
@ -1,111 +0,0 @@
# Lookup argument
halo2 uses the following lookup technique, which allows for lookups in arbitrary sets, and
is arguably simpler than Plookup.
## Note on Language
In addition to the [general notes on language](../design.md#note-on-language):
- We call the $Z(X)$ polynomial (the grand product argument polynomial for the permutation
argument) the "permutation product" column.
## Technique Description
We express lookups in terms of a "subset argument" over a table with $2^k$ rows (numbered
from 0), and columns $A$ and $S$.
The goal of the subset argument is to enforce that every cell in $A$ is equal to _some_
cell in $S$. This means that more than one cell in $A$ can be equal to the _same_ cell in
$S$, and some cells in $S$ don't need to be equal to any of the cells in $A$.
- $S$ might be fixed, but it doesn't need to be. That is, we can support looking up values
in either fixed or variable tables (where the latter includes advice columns).
- $A$ and $S$ can contain duplicates. If the sets represented by $A$ and/or $S$ are not
naturally of size $2^k$, we extend $S$ with duplicates and $A$ with dummy values known
to be in $S$.
- Alternatively we could add a "lookup selector" that controls which elements of the $A$
column participate in lookups. This would modify the occurrence of $A(X)$ in the
permutation rule below to replace $A$ with, say, $S_0$ if a lookup is not selected.
Let $\ell_i$ be the Lagrange basis polynomial that evaluates to $1$ at row $i$, and $0$
otherwise.
We start by allowing the prover to supply permutation columns of $A$ and $S$. Let's call
these $A'$ and $S'$, respectively. We can enforce that they are permutations using a
permutation argument with product column $Z$ with the rules:
$$
Z(X) (A(X) + \beta) (S(X) + \gamma) - Z(\omega^{-1} X) (A'(X) + \beta) (S'(X) + \gamma) = 0
$$$$
\ell_0(X) (Z(X) - 1) = 0
$$
This is a version of the permutation argument which allows $A'$ and $S'$ to be
permutations of $A$ and $S$, respectively, but doesn't specify the exact permutations.
$\beta$ and $\gamma$ are separate challenges so that we can combine these two permutation
arguments into one without worrying that they might interfere with each other.
The goal of these permutations is to allow $A'$ and $S'$ to be arranged by the prover in a
particular way:
1. All the cells of column $A'$ are arranged so that like-valued cells are vertically
adjacent to each other. This could be done by some kind of sorting algorithm, but all
that matters is that like-valued cells are on consecutive rows in column $A'$, and that
$A'$ is a permutation of $A$.
2. The first row in a sequence of like values in $A'$ is the row that has the
corresponding value in $S'.$ Apart from this constraint, $S'$ is any arbitrary
permutation of $S$.
Now, we'll enforce that either $A'_i = S'_i$ or that $A'_i = A'_{i-1}$, using the rule
$$
(A'(X) - S'(X)) \cdot (A'(X) - A'(\omega^{-1} X)) = 0
$$
In addition, we enforce $A'_0 = S'_0$ using the rule
$$
\ell_0(X) \cdot (A'(X) - S'(X)) = 0
$$
Together these constraints effectively force every element in $A'$ (and thus $A$) to equal
at least one element in $S'$ (and thus $S$). Proof: by induction on prefixes of the rows.
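The arrangement rules and row constraints can be sketched with plain integers standing in
for field elements (the function names are ours, and $A$ must only contain values present
in $S$):

```rust
/// Arrange A' and S' according to the two rules above.
fn arrange(a: &[i64], s: &[i64]) -> (Vec<i64>, Vec<i64>) {
    let mut a_p = a.to_vec();
    a_p.sort(); // rule 1: like-valued cells become vertically adjacent
    let mut s_rest = s.to_vec();
    let mut s_p = vec![0i64; a.len()];
    // Rule 2: the first row of each run of equal values in A' holds the
    // corresponding value in S'.
    for i in 0..a_p.len() {
        if i == 0 || a_p[i] != a_p[i - 1] {
            let k = s_rest.iter().position(|&v| v == a_p[i]).unwrap();
            s_p[i] = s_rest.remove(k);
        }
    }
    // The remaining rows of S' hold the unused table values, in any order.
    let mut leftover = s_rest.into_iter();
    for i in 1..s_p.len() {
        if a_p[i] == a_p[i - 1] { s_p[i] = leftover.next().unwrap(); }
    }
    (a_p, s_p)
}

/// Check the two polynomial rules row by row.
fn check(a_p: &[i64], s_p: &[i64]) -> bool {
    // ℓ_0 rule: A'_0 = S'_0.
    if a_p[0] != s_p[0] { return false; }
    // Main rule: (A'_i - S'_i) · (A'_i - A'_{i-1}) = 0 for i > 0.
    (1..a_p.len()).all(|i| (a_p[i] - s_p[i]) * (a_p[i] - a_p[i - 1]) == 0)
}

fn main() {
    let s = [1i64, 2, 4, 8]; // the table (here of size 2^k = 4)
    let a = [2i64, 8, 2, 2]; // every cell of A appears somewhere in S
    let (a_p, s_p) = arrange(&a, &s);
    assert!(check(&a_p, &s_p));
    // Tampering with a cell of A' breaks the constraints.
    let mut bad = a_p.clone();
    bad[2] = 3;
    assert!(!check(&bad, &s_p));
}
```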
## Cost
* There is the original column $A$ and the fixed column $S$.
* There is a permutation product column $Z$.
* There are the two permutations $A'$ and $S'$.
* The gates are all of low degree.
## Generalizations
halo2's lookup argument implementation generalizes the above technique in the following
ways:
- $A$ and $S$ can be extended to multiple columns, combined using a random challenge. $A'$
and $S'$ stay as single columns.
- The commitments to the columns of $S$ can be precomputed, then combined cheaply once
the challenge is known by taking advantage of the homomorphic property of Pedersen
commitments.
- The columns of $A$ can be given as arbitrary polynomial expressions using relative
references. These will be substituted into the product column constraint, subject to
the maximum degree bound. This potentially saves one or more advice columns.
- Then, a lookup argument for an arbitrary-width relation can be implemented in terms of a
subset argument, i.e. to constrain $\mathcal{R}(x, y, ...)$ in each row, consider
$\mathcal{R}$ as a set of tuples $S$ (using the method of the previous point), and check
that $(x, y, ...) \in \mathcal{R}$.
- In the case where $\mathcal{R}$ represents a function, this implicitly also checks
that the inputs are in the domain. This is typically what we want, and often saves an
additional range check.
- We can support multiple tables in the same circuit, by combining them into a single
table that includes a tag column to identify the original table.
- The tag column could be merged with the "lookup selector" mentioned earlier, if this
were implemented.
These generalizations are similar to those in sections 4 and 5 of the
[Plookup paper](https://eprint.iacr.org/2020/315.pdf). That is, the differences from
Plookup are in the subset argument. This argument can then be used in all the same ways;
for instance, the optimized range check technique in section 5 of the Plookup paper can
also be used with this subset argument.
@ -1,93 +0,0 @@
# Multipoint opening argument
Consider the commitments $A, B, C, D$ to polynomials $a(X), b(X), c(X), d(X)$.
Let's say that $a$ and $b$ were queried at the point $x$, while $c$ and $d$
were queried at both points $x$ and $\omega x$. (Here, $\omega$ is the primitive
root of unity in the multiplicative subgroup over which we constructed the
polynomials).
To open these commitments, we could create a polynomial $Q$ for each point that we queried
at (corresponding to each relative rotation used in the circuit). But this would not be
efficient in the circuit; for example, $c(X)$ would appear in multiple polynomials.
Instead, we can group the commitments by the sets of points at which they were queried:
$$
\begin{array}{cccc}
&\{x\}& &\{x, \omega x\}& \\
&A& &C& \\
&B& &D&
\end{array}
$$
For each of these groups, we combine them into a polynomial set, and create a single $Q$
for that set, which we open at each rotation.
## Optimisation steps
The multipoint opening optimisation takes as input:
- A random $x$ sampled by the verifier, at which we evaluate $a(X), b(X), c(X), d(X)$.
- Evaluations of each polynomial at each point of interest, provided by the prover:
$a(x), b(x), c(x), d(x), c(\omega x), d(\omega x)$
These are the outputs of the [vanishing argument](vanishing.md#evaluating-the-polynomials).
The multipoint opening optimisation proceeds as such:
1. Sample random $x_1$, to keep $a, b, c, d$ linearly independent.
2. Accumulate polynomials and their corresponding evaluations according
to the point set at which they were queried:
`q_polys`:
$$
\begin{array}{rccl}
q_1(X) &=& a(X) &+& x_1 b(X) \\
q_2(X) &=& c(X) &+& x_1 d(X)
\end{array}
$$
`q_eval_sets`:
```math
[
[a(x) + x_1 b(x)],
[
c(x) + x_1 d(x),
c(\omega x) + x_1 d(\omega x)
]
]
```
NB: `q_eval_sets` is a vector of sets of evaluations, where the outer vector
goes over the point sets, and the inner vector goes over the points in each set.
3. Interpolate each set of values in `q_eval_sets`:
`r_polys`:
$$
\begin{array}{cccc}
r_1(X) s.t.&&& \\
&r_1(x) &=& a(x) + x_1 b(x) \\
r_2(X) s.t.&&& \\
&r_2(x) &=& c(x) + x_1 d(x) \\
&r_2(\omega x) &=& c(\omega x) + x_1 d(\omega x) \\
\end{array}
$$
4. Construct `f_polys` which check the correctness of `q_polys`:
`f_polys`
$$
\begin{array}{rcl}
f_1(X) &=& \frac{ q_1(X) - r_1(X)}{X - x} \\
f_2(X) &=& \frac{ q_2(X) - r_2(X)}{(X - x)(X - \omega x)} \\
\end{array}
$$
If $q_1(x) = r_1(x)$, then $f_1(X)$ should be a polynomial.
If $q_2(x) = r_2(x)$ and $q_2(\omega x) = r_2(\omega x)$
then $f_2(X)$ should be a polynomial.
5. Sample random $x_2$ to keep the `f_polys` linearly independent.
6. Construct $f(X) = f_1(X) + x_2 f_2(X)$.
7. Sample random $x_3$, at which we evaluate $f(X)$:
$$
\begin{array}{rcccl}
f(x_3) &=& f_1(x_3) &+& x_2 f_2(x_3) \\
&=& \frac{q_1(x_3) - r_1(x_3)}{x_3 - x} &+& x_2\frac{q_2(x_3) - r_2(x_3)}{(x_3 - x)(x_3 - \omega x)}
\end{array}
$$
8. Sample random $x_4$ to keep $f(X)$ and `q_polys` linearly independent.
9. Construct `final_poly`, $$final\_poly(X) = f(X) + x_4 q_1(X) + x_4^2 q_2(X),$$
which is the polynomial we commit to in the inner product argument.
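The optimisation steps can be checked numerically over a toy prime field; all polynomials,
points, and challenges below are arbitrary stand-ins. The exact divisions in step 4
succeed precisely because the $r$ polynomials interpolate the claimed evaluations:

```rust
const P: u128 = 7919; // toy prime standing in for the real field modulus

fn add(a: u128, b: u128) -> u128 { (a + b) % P }
fn sub(a: u128, b: u128) -> u128 { (a + P - b % P) % P }
fn mul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }
fn inv(a: u128) -> u128 {
    let (mut b, mut e, mut acc) = (a % P, P - 2, 1);
    while e > 0 { if e & 1 == 1 { acc = mul(acc, b); } b = mul(b, b); e >>= 1; }
    acc
}
fn eval(p: &[u128], x: u128) -> u128 {
    p.iter().rev().fold(0, |acc, &c| add(mul(acc, x), c))
}
/// p + k·q, coefficient-wise (low degree first).
fn poly_add(p: &[u128], q: &[u128], k: u128) -> Vec<u128> {
    (0..p.len().max(q.len()))
        .map(|i| add(*p.get(i).unwrap_or(&0), mul(k, *q.get(i).unwrap_or(&0))))
        .collect()
}
/// Synthetic division by (X - x); panics unless the remainder is zero.
fn div_linear(p: &[u128], x: u128) -> Vec<u128> {
    let mut q = vec![0u128; p.len() - 1];
    let mut carry = 0u128;
    for i in (1..p.len()).rev() {
        carry = add(p[i], mul(x, carry));
        q[i - 1] = carry;
    }
    assert_eq!(add(p[0], mul(x, carry)), 0, "division is not exact");
    q
}

fn demo() -> bool {
    let (a, b) = (vec![3u128, 1, 4], vec![1u128, 5, 9]);
    let (c, d) = (vec![2u128, 6, 5], vec![3u128, 5, 8]);
    let (x, wx) = (10u128, 20u128); // the two query points x and ωx
    let (x1, x2, x3, x4) = (7u128, 11u128, 13u128, 17u128);
    // Step 2: q_polys.
    let q1 = poly_add(&a, &b, x1);
    let q2 = poly_add(&c, &d, x1);
    // Step 3: r_polys interpolating the evaluations at each point set.
    let r1 = vec![eval(&q1, x)]; // constant through (x, q1(x))
    let (e0, e1) = (eval(&q2, x), eval(&q2, wx));
    let slope = mul(sub(e1, e0), inv(sub(wx, x)));
    let r2 = vec![sub(e0, mul(slope, x)), slope]; // line through both points
    // Step 4: f_polys — the divisions are exact (P - 1 plays the role of -1).
    let f1 = div_linear(&poly_add(&q1, &r1, P - 1), x);
    let f2 = div_linear(&div_linear(&poly_add(&q2, &r2, P - 1), x), wx);
    // Steps 6-7: f(x3) matches the quotient expression.
    let f = poly_add(&f1, &f2, x2);
    let rhs = add(
        mul(sub(eval(&q1, x3), eval(&r1, x3)), inv(sub(x3, x))),
        mul(x2, mul(sub(eval(&q2, x3), eval(&r2, x3)),
                    inv(mul(sub(x3, x), sub(x3, wx))))),
    );
    assert_eq!(eval(&f, x3), rhs);
    // Step 9: final_poly = f + x4·q1 + x4²·q2.
    let final_poly = poly_add(&poly_add(&f, &q1, x4), &q2, mul(x4, x4));
    assert_eq!(
        eval(&final_poly, x3),
        add(eval(&f, x3),
            add(mul(x4, eval(&q1, x3)), mul(mul(x4, x4), eval(&q2, x3))))
    );
    true
}

fn main() { assert!(demo()); }
```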
@ -1,162 +0,0 @@
# Permutation argument
Given that gates in halo2 circuits operate "locally" (on cells in the current row or at
defined relative row offsets), it is common to need to copy a value from some arbitrary cell into
the current row for use in a gate. This is performed with an equality constraint, which
enforces that the source and destination cells contain the same value.
We implement these equality constraints by constructing a permutation that represents the
constraints, and then using a permutation argument within the proof to enforce them.
## Notation
A permutation is a one-to-one and onto mapping of a set onto itself. A permutation can be
factored uniquely into a composition of cycles (up to ordering of cycles, and rotation of
each cycle).
We sometimes use [cycle notation](https://en.wikipedia.org/wiki/Permutation#Cycle_notation)
to write permutations. Let $(a\ b\ c)$ denote a cycle where $a$ maps to $b$, $b$ maps to
$c$, and $c$ maps to $a$ (with the obvious generalisation to arbitrary-sized cycles).
Writing two or more cycles next to each other denotes a composition of the corresponding
permutations. For example, $(a\ b)\ (c\ d)$ denotes the permutation that maps $a$ to $b$,
$b$ to $a$, $c$ to $d$, and $d$ to $c$.
## Constructing the permutation
### Goal
We want to construct a permutation in which the variables within each equality-constraint
set form a cycle. For example, suppose that we have a circuit that defines the following
equality constraints:
- $a \equiv b$
- $a \equiv c$
- $d \equiv e$
From this we have the equality-constraint sets $\{a, b, c\}$ and $\{d, e\}$. We want to
construct the permutation:
$$(a\ b\ c)\ (d\ e)$$
which defines the mapping of $[a, b, c, d, e]$ to $[b, c, a, e, d]$.
### Algorithm
We need to keep track of the set of cycles, which is a
[set of disjoint sets](https://en.wikipedia.org/wiki/Disjoint-set_data_structure).
Efficient data structures for this problem are known; for the sake of simplicity we choose
one that is not asymptotically optimal but is easy to implement.
We represent the current state as:
- an array $\mathsf{mapping}$ for the permutation itself;
- an auxiliary array $\mathsf{aux}$ that keeps track of a distinguished element of each
cycle;
- another array $\mathsf{sizes}$ that keeps track of the size of each cycle.
We have the invariant that for each element $x$ in a given cycle $C$, $\mathsf{aux}(x)$
points to the same element $c \in C$. This allows us to quickly decide whether two given
elements $x$ and $y$ are in the same cycle, by checking whether
$\mathsf{aux}(x) = \mathsf{aux}(y)$. Also, $\mathsf{sizes}(\mathsf{aux}(x))$ gives the
size of the cycle containing $x$. (This is guaranteed only for
$\mathsf{sizes}(\mathsf{aux}(x))$, not for $\mathsf{sizes}(x)$.)
The algorithm starts with a representation of the identity permutation:
for all $x$, we set $\mathsf{mapping}(x) = x$, $\mathsf{aux}(x) = x$, and
$\mathsf{sizes}(x) = 1$.
To add an equality constraint $\mathit{left} \equiv \mathit{right}$:
1. Check whether $\mathit{left}$ and $\mathit{right}$ are already in the same cycle, i.e.
whether $\mathsf{aux}(\mathit{left}) = \mathsf{aux}(\mathit{right})$. If so, there is
nothing to do.
2. Otherwise, $\mathit{left}$ and $\mathit{right}$ belong to different cycles. Make
$\mathit{left}$ the larger cycle and $\mathit{right}$ the smaller one, by swapping them
iff $\mathsf{sizes}(\mathsf{aux}(\mathit{left})) < \mathsf{sizes}(\mathsf{aux}(\mathit{right}))$.
3. Following the mapping around the right (smaller) cycle, for each element $x$ set
$\mathsf{aux}(x) = \mathsf{aux}(\mathit{left})$.
4. Splice the smaller cycle into the larger one by swapping $\mathsf{mapping}(\mathit{left})$
with $\mathsf{mapping}(\mathit{right})$.
For example, given two disjoint cycles $(A\ B\ C\ D)$ and $(E\ F\ G\ H)$:
```plaintext
A +---> B
^ +
| |
+ v
D <---+ C E +---> F
^ +
| |
+ v
H <---+ G
```
After adding constraint $B \equiv E$ the above algorithm produces the cycle:
```plaintext
A +---> B +-------------+
^ |
| |
+ v
D <---+ C <---+ E F
^ +
| |
+ v
H <---+ G
```
### Broken alternatives
If we did not check whether $\mathit{left}$ and $\mathit{right}$ were already in the same
cycle, then we could end up undoing an equality constraint. For example, if we have the
following constraints:
- $a \equiv b$
- $b \equiv c$
- $c \equiv d$
- $b \equiv d$
and we tried to implement adding an equality constraint just using step 4 of the above
algorithm, then we would end up constructing the cycle $(a\ b)\ (c\ d)$, rather than the
correct $(a\ b\ c\ d)$.
## Argument specification
We need to represent permutations over $m$ columns, represented by polynomials $p_0, \ldots, p_{m-1}$.
We first assign a unique element of $\mathbb{F}^\times$ as an "extended domain" element for each cell
that can participate in the permutation argument.
Let $\omega$ be a primitive $2^k$-th root of unity and let $\delta$ be a primitive
$T$-th root of unity, where $T \cdot 2^S + 1 = p$ with $T$ odd and $k \leq S$.
We will use $\delta^i \cdot \omega^j \in \mathbb{F}^\times$ as the extended domain element for the
cell in the $j$th row of the $i$th column of the permutation argument.
If we have a permutation $\sigma(\mathsf{column}: i, \mathsf{row}: j) = (\mathsf{column}: i', \mathsf{row}: j')$,
we can represent it as a vector of $m$ polynomials $s_i(X)$ such that $s_i(\omega^j) = \delta^{i'} \cdot \omega^{j'}$.
Notice that the identity permutation can be represented by the vector of $m$ polynomials
$\mathsf{ID}_i(X)$ such that $\mathsf{ID}_i(X) = \delta^i \cdot X$.
Now given our permutation represented by $s_0, \ldots, s_{m-1}$, over advice columns represented by
$p_0, \ldots, p_{m-1}$, we want to ensure that:
$$
\prod\limits_{i=0}^{m-1} \prod\limits_{j=0}^{n-1} \left(\frac{p_i(\omega^j) + \beta \cdot \delta^i \cdot \omega^j + \gamma}{p_i(\omega^j) + \beta \cdot s_i(\omega^j) + \gamma}\right) = 1
$$
Let $Z_P$ be such that $Z_P(\omega^0) = Z_P(\omega^n) = 1$ and for $0 \leq j < n$:
$$\begin{array}{rl}
Z_P(\omega^{j+1}) &= \prod\limits_{h=0}^{j} \prod\limits_{i=0}^{m-1} \frac{p_i(\omega^h) + \beta \cdot \delta^i \cdot \omega^h + \gamma}{p_i(\omega^h) + \beta \cdot s_i(\omega^h) + \gamma} \\
&= Z_P(\omega^j) \prod\limits_{i=0}^{m-1} \frac{p_i(\omega^j) + \beta \cdot \delta^i \cdot \omega^j + \gamma}{p_i(\omega^j) + \beta \cdot s_i(\omega^j) + \gamma}
\end{array}$$
Then it is sufficient to enforce the constraints:
$$
l_0 \cdot (Z_P(X) - 1) = 0 \\
Z_P(\omega X) \cdot \prod\limits_{i=0}^{m-1} \left(p_i(X) + \beta \cdot s_i(X) + \gamma\right) - Z_P(X) \cdot \prod\limits_{i=0}^{m-1} \left(p_i(X) + \beta \cdot \delta^i \cdot X + \gamma\right) = 0
$$
> The optimization used to obtain the simple representation of the identity permutation was suggested
> by Vitalik Buterin for PLONK, and is described at the end of section 8 of the PLONK paper. Note that
> the $\delta^i$ are all distinct quadratic non-residues.
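The grand product $Z_P$ defined above can be built column by column. A minimal sketch,
over a toy prime field ($p = 7919$ here purely for illustration; the real argument works
over the proof system's scalar field), where `cols` holds the values $p_i(\omega^j)$,
`id` the identity values $\delta^i \cdot \omega^j$, and `perm` the permuted values
$s_i(\omega^j)$:

```rust
const P: u64 = 7919; // toy prime, for illustration only

// Modular exponentiation by squaring.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

// Inverse via Fermat's little theorem (x nonzero mod P).
fn inv(x: u64) -> u64 {
    pow_mod(x, P - 2)
}

// Build the running-product column Z_P(ω^0), ..., Z_P(ω^n), which should
// start and end at 1 when the permutation is satisfied.
fn grand_product(
    cols: &[Vec<u64>],
    id: &[Vec<u64>],
    perm: &[Vec<u64>],
    beta: u64,
    gamma: u64,
) -> Vec<u64> {
    let n = cols[0].len();
    let mut z = vec![1u64; n + 1];
    for j in 0..n {
        let mut factor = 1u64;
        for i in 0..cols.len() {
            let num = (cols[i][j] + beta * id[i][j] + gamma) % P;
            let den = (cols[i][j] + beta * perm[i][j] + gamma) % P;
            factor = factor * num % P * inv(den) % P;
        }
        z[j + 1] = z[j] * factor % P;
    }
    z
}
```

For a genuine permutation in which equality-constrained cells hold equal values, the
numerator and denominator terms pair up across the product, so the final entry returns
to $1$.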
# Vanishing argument
Having committed to the circuit assignments, the prover now needs to demonstrate that the
various circuit relations are satisfied:
- The custom gates, represented by polynomials $\text{gate}_i(X)$.
- The rules of the lookup arguments.
- The rules of the equality constraint permutations.
Each of these relations is represented as a polynomial of degree at most $d$ (the maximum degree
of any of the relations) with respect to the circuit columns. Given that the degree of the
assignment polynomials for each column is $n - 1$, the relation polynomials have degree
$d(n - 1)$ with respect to $X$.
> In our [example](../proving-system.md#example), these would be the gate polynomials, of
> degree $3n - 3$:
>
> - $\text{gate}_0(X) = a_0(X) \cdot a_1(X) \cdot a_2(X \omega^{-1}) - a_3(X)$
> - $\text{gate}_1(X) = f_0(X \omega^{-1}) \cdot a_2(X)$
> - $\text{gate}_2(X) = f_0(X) \cdot a_3(X) \cdot a_0(X)$
A relation is satisfied if its polynomial is equal to zero. One way to demonstrate this is
to divide each polynomial relation by the vanishing polynomial $t(X) = (X^n - 1)$, which
is the lowest-degree monic polynomial that has roots at every $\omega^i$. If a relation's
polynomial is perfectly divisible by $t(X)$, it is equal to zero over the domain (as desired).
This simple construction would require a polynomial commitment per relation. Instead, we
commit to all of the circuit relations simultaneously: the verifier samples $y$, and then
the prover constructs the quotient polynomial
$$h(X) = \frac{\text{gate}_0(X) + y \cdot \text{gate}_1(X) + \dots + y^i \cdot \text{gate}_i(X) + \dots}{t(X)},$$
where the numerator is a random linear combination of the circuit relations (random
because the prover commits to the cell assignments before the verifier samples $y$).
- If the numerator polynomial (in formal indeterminate $X$) is perfectly divisible by
$t(X)$, then with high probability all relations are satisfied.
- Conversely, if at least one relation is not satisfied, then with high probability
$h(x) \cdot t(x)$ will not equal the evaluation of the numerator at $x$. In this case,
the numerator polynomial would not be perfectly divisible by $t(X)$.
## Committing to $h(X)$
$h(X)$ has degree $(d - 1)n - d$ (because the divisor $t(X)$ has degree $n$). However, the
polynomial commitment scheme we use for Halo 2 only supports committing to polynomials of
degree $n - 1$ (which is the maximum degree that the rest of the protocol needs to commit
to). Instead of increasing the cost of the polynomial commitment scheme, the prover splits
$h(X)$ into pieces of degree at most $n - 1$
$$h_0(X) + X^n h_1(X) + \dots + X^{n(d-1)} h_{d-1}(X),$$
and produces blinding commitments to each piece
$$\mathbf{H} = [\text{Commit}(h_0(X)), \text{Commit}(h_1(X)), \dots, \text{Commit}(h_{d-1}(X))].$$
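The splitting step amounts to chunking the coefficient vector of $h(X)$. A sketch (with
`u64` coefficients standing in for field elements, and the commitment step omitted):

```rust
// Split a coefficient vector for h(X) into pieces of at most n coefficients
// each (i.e. degree at most n-1), so that
//   h(X) = h_0(X) + X^n · h_1(X) + X^{2n} · h_2(X) + ...
// The final piece is zero-padded to a full n coefficients.
fn split_quotient(h: &[u64], n: usize) -> Vec<Vec<u64>> {
    h.chunks(n)
        .map(|piece| {
            let mut p = piece.to_vec();
            p.resize(n, 0);
            p
        })
        .collect()
}
```

Each piece would then be committed to separately, producing the vector $\mathbf{H}$.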
## Evaluating the polynomials
At this point, all properties of the circuit have been committed to. The verifier now
wants to see if the prover committed to the correct $h(X)$ polynomial. The verifier
samples $x$, and the prover produces the purported evaluations of the various polynomials
at $x$, for all the relative offsets used in the circuit, as well as the pieces of $h(X)$.
> In our [example](../proving-system.md#example), this would be:
>
> - $a_0(x)$
> - $a_1(x)$
> - $a_2(x)$, $a_2(x \omega^{-1})$
> - $a_3(x)$
> - $f_0(x)$, $f_0(x \omega^{-1})$
> - $h_0(x)$, ..., $h_{d-1}(x)$
The verifier checks that these evaluations satisfy the form of $h(X)$:
$$\frac{\text{gate}_0(x) + \dots + y^i \cdot \text{gate}_i(x) + \dots}{t(x)} = h_0(x) + \dots + x^{n(d-1)} h_{d-1}(x)$$
Now content that the evaluations collectively satisfy the gate constraints, the verifier
needs to check that the evaluations themselves are consistent with the original
[circuit commitments](circuit-commitments.md), as well as $\mathbf{H}$. To implement this
efficiently, we use a [multipoint opening argument](multipoint-opening.md).
# User Documentation
You're probably here because you want to write circuits? Excellent!
This section will guide you through the process of creating circuits with halo2.
# Gadgets
# Lookup tables
In normal programs, you can trade memory for CPU to improve performance, by pre-computing
and storing lookup tables for some part of the computation. We can do the same thing in
halo2 circuits!
A lookup table can be thought of as enforcing a *relation* between variables, where the relation is expressed as a table.
Assuming we have only one lookup argument in our constraint system, the total size of tables is constrained by the size of the circuit:
each table entry costs one row, and it also costs one row to do each lookup.
TODO
# A simple example
Let's start with a simple circuit, to introduce you to the common APIs and how they are
used. The circuit will take a public input $c$, and will prove knowledge of two private
inputs $a$ and $b$ such that
$$a^2 \cdot b^2 = c.$$
## Define instructions
Firstly, we need to define the instructions that our circuit will rely on. Instructions
are the boundary between high-level [gadgets](../concepts/gadgets.md) and the low-level
circuit operations. Instructions may be as coarse or as granular as desired, but in
practice you want to strike a balance between an instruction being large enough to
effectively optimize its implementation, and small enough that it is meaningfully
reusable.
For our circuit, we will use three instructions:
- Load a private number into the circuit.
- Multiply two numbers.
- Expose a number as a public input to the circuit.
We also need a type for a variable representing a number. Instruction interfaces provide
associated types for their inputs and outputs, to allow the implementations to represent
these in a way that makes the most sense for their optimization goals.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:instructions}}
```
## Define a chip implementation
For our circuit, we will build a [chip](../concepts/chips.md) that provides the above
numeric instructions for a finite field.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:chip}}
```
Every chip needs to implement the `Chip` trait. This defines the properties of the chip
that a `Layouter` may rely on when synthesizing a circuit, as well as enabling any initial
state that the chip requires to be loaded into the circuit.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:chip-impl}}
```
## Configure the chip
The chip needs to be configured with the columns, permutations, and gates that will be
required to implement all of the desired instructions.
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:chip-config}}
```
## Implement chip traits
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:instructions-impl}}
```
## Build the circuit
Now that we have the instructions we need, and a chip that implements them, we can finally
build our circuit!
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:circuit}}
```
## Testing the circuit
`halo2::dev::MockProver` can be used to test that the circuit is working correctly. The
private and public inputs to the circuit are constructed as we will do to create a proof,
but by passing them to `MockProver::run` we get an object that can test every constraint
in the circuit, and tell us exactly what is failing (if anything).
```rust,ignore,no_run
{{#include ../../../examples/simple-example.rs:test-circuit}}
```
## Full example
You can find the source code for this example
[here](https://github.com/zcash/halo2/tree/main/examples/simple-example.rs).
# Tips and tricks
This section contains various ideas and snippets that you might find useful while writing
halo2 circuits.
## Small range constraints
A common constraint used in R1CS circuits is the boolean constraint: $b * (1 - b) = 0$.
This constraint can only be satisfied by $b = 0$ or $b = 1$.
In halo2 circuits, you can similarly constrain a cell to have one of a small set of
values. For example, to constrain $a$ to the range $[0..5]$, you would create a gate of
the form:
$$a \cdot (1 - a) \cdot (2 - a) \cdot (3 - a) \cdot (4 - a) = 0$$
while to constrain $c$ to be either 7 or 13, you would use:
$$(7 - c) \cdot (13 - c) = 0$$
> The underlying principle here is that we create a polynomial constraint with roots at
> each value in the set of possible values we want to allow. In R1CS circuits, the maximum
> supported polynomial degree is 2 (due to all constraints being of the form $a * b = c$).
> In halo2 circuits, you can use arbitrary-degree polynomials - with the proviso that
> higher-degree constraints are more expensive to use.
Note that the roots don't have to be constants; for example $(a - x) \cdot (a - y) \cdot (a - z) = 0$ will constrain $a$ to be equal to one of $\{ x, y, z \}$ where the latter can be arbitrary polynomials, as long as the whole expression stays within the maximum degree bound.
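As a plain-Rust sanity check (ordinary integer arithmetic, not halo2 circuit API), the
constraint polynomials above vanish exactly on their intended sets:

```rust
// a · (1 - a) · (2 - a) · (3 - a) · (4 - a): zero iff a ∈ {0, 1, 2, 3, 4}.
fn range_constraint(a: i64) -> i64 {
    a * (1 - a) * (2 - a) * (3 - a) * (4 - a)
}

// (7 - c) · (13 - c): zero iff c ∈ {7, 13}.
fn either_7_or_13(c: i64) -> i64 {
    (7 - c) * (13 - c)
}
```

In a real circuit these expressions would be built from field elements and enforced as
gates, but the root structure is the same.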
## Small set interpolation
We can use Lagrange interpolation to create a polynomial constraint that maps
$f(X) = Y$ for small sets of $X \in \{x_i\}, Y \in \{y_i\}$.
For instance, say we want to map a 2-bit value to a "spread" version interleaved
with zeros. We first precompute the evaluations at each point:
$$
\begin{array}{rcl}
00 \rightarrow 0000 &\implies& 0 \rightarrow 0 \\
01 \rightarrow 0001 &\implies& 1 \rightarrow 1 \\
10 \rightarrow 0100 &\implies& 2 \rightarrow 4 \\
11 \rightarrow 0101 &\implies& 3 \rightarrow 5
\end{array}
$$
Then, we construct the Lagrange basis polynomial for each point using the
identity:
$$l_j(X) = \prod_{0 \leq m < k,\; m \neq j} \frac{X - x_m}{x_j - x_m},$$
where $k$ is the number of data points. ($k = 4$ in our example above.)
Recall that the Lagrange basis polynomial $l_j(X)$ evaluates to $1$ at
$X = x_j$ and $0$ at all other $x_i$, $i \neq j.$
Continuing our example, we get four Lagrange basis polynomials:
$$
\begin{array}{ccc}
l_0(X) &=& \frac{(X - 3)(X - 2)(X - 1)}{(-3)(-2)(-1)} \\[1ex]
l_1(X) &=& \frac{(X - 3)(X - 2)(X)}{(-2)(-1)(1)} \\[1ex]
l_2(X) &=& \frac{(X - 3)(X - 1)(X)}{(-1)(1)(2)} \\[1ex]
l_3(X) &=& \frac{(X - 2)(X - 1)(X)}{(1)(2)(3)}
\end{array}
$$
Our polynomial constraint is then
$$
\begin{array}{cccccccccccl}
&f(0) \cdot l_0(X) &+& f(1) \cdot l_1(X) &+& f(2) \cdot l_2(X) &+& f(3) \cdot l_3(X) &-& f(X) &=& 0 \\
\implies& 0 \cdot l_0(X) &+& 1 \cdot l_1(X) &+& 4 \cdot l_2(X) &+& 5 \cdot l_3(X) &-& f(X) &=& 0. \\
\end{array}
$$
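The interpolation above can be checked numerically. A sketch using `f64` arithmetic (a
real circuit would of course use field elements; this only verifies the algebra of the
spread map):

```rust
// Evaluate the Lagrange interpolation of `points` at `x`:
// sum over j of y_j · l_j(x), with l_j the basis polynomial for point j.
fn lagrange_eval(points: &[(f64, f64)], x: f64) -> f64 {
    points
        .iter()
        .enumerate()
        .map(|(j, &(xj, yj))| {
            // l_j(x) = Π_{m ≠ j} (x - x_m) / (x_j - x_m)
            let lj: f64 = points
                .iter()
                .enumerate()
                .filter(|&(m, _)| m != j)
                .map(|(_, &(xm, _))| (x - xm) / (xj - xm))
                .product();
            yj * lj
        })
        .sum()
}
```

Evaluating at the four interpolation points $0, 1, 2, 3$ recovers the spread values
$0, 1, 4, 5$ from the table above.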