From e9e208e3c4752c1747750b8753d67ee87a4fa653 Mon Sep 17 00:00:00 2001
From: Jack Grigg <jack@electriccoin.co>
Date: Tue, 15 Dec 2020 16:02:45 +0000
Subject: [PATCH 1/4] book: Bring in the lookup argument description from
 HackMD

---
 book/src/SUMMARY.md                |  1 +
 book/src/design/lookup-argument.md | 75 ++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)
 create mode 100644 book/src/design/lookup-argument.md

diff --git a/book/src/SUMMARY.md b/book/src/SUMMARY.md
index 33639964..1bee6b2b 100644
--- a/book/src/SUMMARY.md
+++ b/book/src/SUMMARY.md
@@ -10,3 +10,4 @@
   - [Tips and tricks](user/tips-and-tricks.md)
 - [Design](design.md)
   - [Permutation argument](design/permutation.md)
+  - [Lookup argument](design/lookup-argument.md)
diff --git a/book/src/design/lookup-argument.md b/book/src/design/lookup-argument.md
new file mode 100644
index 00000000..7f63ff30
--- /dev/null
+++ b/book/src/design/lookup-argument.md
@@ -0,0 +1,75 @@
+# Generic Lookups with PLONK (DRAFT)
+
+_By Sean Bowe and Daira Hopwood_
+
+## Note on Language
+
+We use slightly different language than others, here's the overview.
+
+1. We like to think of PLONK-like arguments as tables, where each column corresponds to a "wire".
+2. We like to call "selector polynomials" and so on "fixed columns" instead.
+3. We call the others "advice columns" usually, when they're populated by the prover.
+4. We call the $Z(X)$ polynomial (the grand product argument polynomial for the permutation argument) the "permutation product" column.
+5. We use the term "rule" to refer to a "gate" like $A(X) \cdot q_A(X) + B(X) \cdot q_B(X) + A(X) \cdot B(X) \cdot q_M(X) + C(X) \cdot q_C(X) = 0.$
+
+## Technique Description
+
+The following is a description of a lookup technique which allows for lookups in arbitrary sets, and is arguably simpler than Plookup. We will express lookups in terms of a "subset argument".
+
+* There are $2^k$ rows (numbered from 0).
+* There are columns $A$ and $S$.
+
+The goal of the subset argument is to enforce that every cell in $A$ is equal to _some_ cell in $S$. This means that more than one cell in $A$ can be equal to the _same_ cell in $S$, and some cells in $S$ don't need to be equal to any of the cells in $A$.
+
+$S$ might be fixed, but it doesn't need to be. That is, we can support looking up values in either fixed or variable tables. $A$ and $S$ can contain duplicates. If the sets represented by $A$ and/or $S$ are not naturally of size $2^k$, we extend $S$ with duplicates and $A$ with dummy values known to be in $S$. Alternatively we can add a "lookup selector" that controls which elements of the $A$ column participate in lookups (this would modify the occurrence of $A(X)$ in the permutation rule below to replace $A$ with, say, $S_0$ if a lookup is not selected).
+
+Let $\ell_i$ be the Lagrange basis polynomial that evaluates to $1$ at row $i$ and $0$ otherwise.
+
+We'll start by allowing the prover to supply permutation columns of $A$ and $S$. Let's call these $A'$ and $S'$, respectively. We can enforce that they are permutations using a permutation argument with product column $Z$ with the rules:
+
+$$
+Z(X) (A(X) + \beta) (S(X) + \gamma) - Z(\omega^{-1} X) (A'(X) + \beta) (S'(X) + \gamma) = 0
+$$$$
+\ell_0(X) (Z(X) - 1) = 0
+$$
+
+This is a version of the permutation argument which allows $A'$ and $S'$ to be permutations of $A$ and $S$, respectively, but doesn't specify the exact permutations. $\beta$ and $\gamma$ are separate challenges so that we can combine these two permutation arguments into one without worrying that they might interfere with each other.
+
+The goal of these permutations is to allow $A'$ and $S'$ to be arranged by the prover in a particular way:
+
+1. All the cells of column $A'$ are arranged so that like-valued cells are vertically adjacent to each other. This could be done by some kind of sorting algorithm but all that matters is that like-valued cells are on consecutive rows in column $A'$, and that $A'$ is a permutation of $A$.
+2. The first row  in a sequence of like values in $A'$ is the row that has the corresponding value in $S'$. Apart from this constraint, $S'$ is any arbitrary permutation of $S$.
+
+Now, we'll enforce that either $A'_i = S'_i$ or that $A'_i = A'_{i-1}$, using the rule
+
+$$
+(A'(X) - S'(X)) \cdot (A'(X) - A'(\omega^{-1} X)) = 0
+$$
+
+In addition, we enforce $A'_0 = S'_0$ using the rule
+
+$$
+\ell_0(X) \cdot (A'(X) - S'(X)) = 0
+$$
+
+Together these constraints effectively force every element in $A'$ (and thus $A$) to equal at least one element in $S'$ (and thus $S$). Proof: by induction on prefixes of the rows.
+
+## Cost
+
+* There is the original column $A$ and the fixed column $S$.
+* There is a permutation product column $Z$.
+* There are the two permutations $A'$ and $S'$.
+* The gates are all of low degree.
+
+## Generalizations
+
+These generalizations are similar to those in sections 4  and 5 of the [Plookup paper](https://eprint.iacr.org/2020/315.pdf):
+
+* $A$ and $S$ can be extended to multiple columns, combined using a random challenge. $A'$ and $S'$ stay as single columns.
+  * The commitments to the columns of $S$ can be precomputed, then combined cheaply once the challenge is known by taking advantage of the homomorphic property of Pedersen commitments.
+* Then, a lookup argument for an arbitrary-width relation can be implemented in terms of a subset argument, i.e. to constrain $\mathcal{R}(x, y, ...)$ in each row, consider $\mathcal{R}$ as a set of tuples $S$ (using the  method of the previous point) and check that $(x, y, ...) \in \mathcal{R}$.
+  * In the case where $\mathcal{R}$ represents a function, this implicitly also checks that the inputs are in the domain. This is typically what we want, and often saves an additional range check.
+* We can support multiple tables in the same circuit, by combining them into a single table that includes a tag column to identify the original table. The tag column could be merged with the "lookup selector" mentioned earlier.
+* The optimized range check technique in section 5 of the Plookup paper can also be used with this subset argument.
+
+That is, the differences from Plookup are in the subset argument. This argument can then be used in all the same ways.

From c1fe1537c1c093893d210649d55f914397c7e12f Mon Sep 17 00:00:00 2001
From: Jack Grigg <jack@electriccoin.co>
Date: Tue, 22 Dec 2020 20:45:55 +0000
Subject: [PATCH 2/4] book: Update authors

---
 book/book.toml | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/book/book.toml b/book/book.toml
index c6278076..005ef54b 100644
--- a/book/book.toml
+++ b/book/book.toml
@@ -1,5 +1,10 @@
 [book]
-authors = ["Jack Grigg"]
+authors = [
+    "Jack Grigg",
+    "Sean Bowe",
+    "Daira Hopwood",
+    "Ying Tong Lai",
+]
 language = "en"
 multilingual = false
 src = "src"

From d41e8ef36409b7456830c7ab72e504cf66214546 Mon Sep 17 00:00:00 2001
From: Jack Grigg <jack@electriccoin.co>
Date: Tue, 22 Dec 2020 21:09:05 +0000
Subject: [PATCH 3/4] book: Move general PLONK language differences to top of
 design section

---
 book/src/design.md                 | 16 ++++++++++++++++
 book/src/design/lookup-argument.md |  9 +++------
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/book/src/design.md b/book/src/design.md
index 3d14cb7c..db8a6d09 100644
--- a/book/src/design.md
+++ b/book/src/design.md
@@ -1 +1,17 @@
 # Design
+
+## Note on Language
+
+We use slightly different language than others to describe PLONK concepts. Here's the
+overview:
+
+1. We like to think of PLONK-like arguments as tables, where each column corresponds to a
+   "wire". We refer to entries in this table as "cells".
+2. We like to call "selector polynomials" and so on "fixed columns" instead. We then refer
+   specifically to a "selector constraint" when a cell in a fixed column is being used to
+   control whether a particular constraint is enabled in that row.
+3. We call the other polynomials "advice columns" usually, when they're populated by the
+   prover.
+4. We use the term "rule" to refer to a "gate" like
+   $$A(X) \cdot q_A(X) + B(X) \cdot q_B(X) + A(X) \cdot B(X) \cdot q_M(X) + C(X) \cdot q_C(X) = 0.$$
+   - TODO: Check how consistent we are with this, and update the code and docs to match.
diff --git a/book/src/design/lookup-argument.md b/book/src/design/lookup-argument.md
index 7f63ff30..5e72525d 100644
--- a/book/src/design/lookup-argument.md
+++ b/book/src/design/lookup-argument.md
@@ -4,13 +4,10 @@ _By Sean Bowe and Daira Hopwood_
 
 ## Note on Language
 
-We use slightly different language than others, here's the overview.
+In addition to the [general notes on language](../design.md#note-on-language):
 
-1. We like to think of PLONK-like arguments as tables, where each column corresponds to a "wire".
-2. We like to call "selector polynomials" and so on "fixed columns" instead.
-3. We call the others "advice columns" usually, when they're populated by the prover.
-4. We call the $Z(X)$ polynomial (the grand product argument polynomial for the permutation argument) the "permutation product" column.
-5. We use the term "rule" to refer to a "gate" like $A(X) \cdot q_A(X) + B(X) \cdot q_B(X) + A(X) \cdot B(X) \cdot q_M(X) + C(X) \cdot q_C(X) = 0.$
+- We call the $Z(X)$ polynomial (the grand product argument polynomial for the permutation
+  argument) the "permutation product" column.
 
 ## Technique Description
 

From c2742c0d05391079a2814aad5d4bf63f0f77399b Mon Sep 17 00:00:00 2001
From: Jack Grigg <jack@electriccoin.co>
Date: Tue, 22 Dec 2020 21:23:32 +0000
Subject: [PATCH 4/4] book: Edit the lookup argument page to fit the design
 section

---
 book/src/design/lookup-argument.md | 82 +++++++++++++++++++++---------
 1 file changed, 59 insertions(+), 23 deletions(-)

diff --git a/book/src/design/lookup-argument.md b/book/src/design/lookup-argument.md
index 5e72525d..81076711 100644
--- a/book/src/design/lookup-argument.md
+++ b/book/src/design/lookup-argument.md
@@ -1,6 +1,7 @@
-# Generic Lookups with PLONK (DRAFT)
+# Lookup argument
 
-_By Sean Bowe and Daira Hopwood_
+halo2 uses the following lookup technique, which allows for lookups in arbitrary sets, and
+is arguably simpler than Plookup.
 
 ## Note on Language
 
@@ -11,18 +12,28 @@ In addition to the [general notes on language](../design.md#note-on-language):
 
 ## Technique Description
 
-The following is a description of a lookup technique which allows for lookups in arbitrary sets, and is arguably simpler than Plookup. We will express lookups in terms of a "subset argument".
+We express lookups in terms of a "subset argument" over a table with $2^k$ rows (numbered
+from 0), and columns $A$ and $S$.
 
-* There are $2^k$ rows (numbered from 0).
-* There are columns $A$ and $S$.
+The goal of the subset argument is to enforce that every cell in $A$ is equal to _some_
+cell in $S$. This means that more than one cell in $A$ can be equal to the _same_ cell in
+$S$, and some cells in $S$ don't need to be equal to any of the cells in $A$.
 
-The goal of the subset argument is to enforce that every cell in $A$ is equal to _some_ cell in $S$. This means that more than one cell in $A$ can be equal to the _same_ cell in $S$, and some cells in $S$ don't need to be equal to any of the cells in $A$.
+- $S$ might be fixed, but it doesn't need to be. That is, we can support looking up values
+  in either fixed or variable tables (where the latter includes advice columns).
+- $A$ and $S$ can contain duplicates. If the sets represented by $A$ and/or $S$ are not
+  naturally of size $2^k$, we extend $S$ with duplicates and $A$ with dummy values known
+  to be in $S$.
+  - Alternatively we could add a "lookup selector" that controls which elements of the $A$
+    column participate in lookups. This would modify the occurrence of $A(X)$ in the
+    permutation rule below to replace $A$ with, say, $S_0$ if a lookup is not selected.
 
-$S$ might be fixed, but it doesn't need to be. That is, we can support looking up values in either fixed or variable tables. $A$ and $S$ can contain duplicates. If the sets represented by $A$ and/or $S$ are not naturally of size $2^k$, we extend $S$ with duplicates and $A$ with dummy values known to be in $S$. Alternatively we can add a "lookup selector" that controls which elements of the $A$ column participate in lookups (this would modify the occurrence of $A(X)$ in the permutation rule below to replace $A$ with, say, $S_0$ if a lookup is not selected).
+Let $\ell_i$ be the Lagrange basis polynomial that evaluates to $1$ at row $i$, and $0$
+otherwise.
 
-Let $\ell_i$ be the Lagrange basis polynomial that evaluates to $1$ at row $i$ and $0$ otherwise.
-
-We'll start by allowing the prover to supply permutation columns of $A$ and $S$. Let's call these $A'$ and $S'$, respectively. We can enforce that they are permutations using a permutation argument with product column $Z$ with the rules:
+We start by allowing the prover to supply permutation columns of $A$ and $S$. Let's call
+these $A'$ and $S'$, respectively. We can enforce that they are permutations using a
+permutation argument with product column $Z$ with the rules:
 
 $$
 Z(X) (A(X) + \beta) (S(X) + \gamma) - Z(\omega^{-1} X) (A'(X) + \beta) (S'(X) + \gamma) = 0
@@ -30,12 +41,21 @@ $$$$
 \ell_0(X) (Z(X) - 1) = 0
 $$
 
-This is a version of the permutation argument which allows $A'$ and $S'$ to be permutations of $A$ and $S$, respectively, but doesn't specify the exact permutations. $\beta$ and $\gamma$ are separate challenges so that we can combine these two permutation arguments into one without worrying that they might interfere with each other.
+This is a version of the permutation argument which allows $A'$ and $S'$ to be
+permutations of $A$ and $S$, respectively, but doesn't specify the exact permutations.
+$\beta$ and $\gamma$ are separate challenges so that we can combine these two permutation
+arguments into one without worrying that they might interfere with each other.
 
-The goal of these permutations is to allow $A'$ and $S'$ to be arranged by the prover in a particular way:
+The goal of these permutations is to allow $A'$ and $S'$ to be arranged by the prover in a
+particular way:
 
-1. All the cells of column $A'$ are arranged so that like-valued cells are vertically adjacent to each other. This could be done by some kind of sorting algorithm but all that matters is that like-valued cells are on consecutive rows in column $A'$, and that $A'$ is a permutation of $A$.
-2. The first row  in a sequence of like values in $A'$ is the row that has the corresponding value in $S'$. Apart from this constraint, $S'$ is any arbitrary permutation of $S$.
+1. All the cells of column $A'$ are arranged so that like-valued cells are vertically
+   adjacent to each other. This could be done by some kind of sorting algorithm, but all
+   that matters is that like-valued cells are on consecutive rows in column $A'$, and that
+   $A'$ is a permutation of $A$.
+2. The first row in a sequence of like values in $A'$ is the row that has the
+   corresponding value in $S'.$ Apart from this constraint, $S'$ is any arbitrary
+   permutation of $S$.
 
 Now, we'll enforce that either $A'_i = S'_i$ or that $A'_i = A'_{i-1}$, using the rule
 
@@ -49,7 +69,8 @@ $$
 \ell_0(X) \cdot (A'(X) - S'(X)) = 0
 $$
 
-Together these constraints effectively force every element in $A'$ (and thus $A$) to equal at least one element in $S'$ (and thus $S$). Proof: by induction on prefixes of the rows.
+Together these constraints effectively force every element in $A'$ (and thus $A$) to equal
+at least one element in $S'$ (and thus $S$). Proof: by induction on prefixes of the rows.
 
 ## Cost
 
@@ -60,13 +81,28 @@ Together these constraints effectively force every element in $A'$ (and thus $A$
 
 ## Generalizations
 
-These generalizations are similar to those in sections 4  and 5 of the [Plookup paper](https://eprint.iacr.org/2020/315.pdf):
+halo2's lookup argument implementation generalizes the above technique in the following
+ways:
 
-* $A$ and $S$ can be extended to multiple columns, combined using a random challenge. $A'$ and $S'$ stay as single columns.
-  * The commitments to the columns of $S$ can be precomputed, then combined cheaply once the challenge is known by taking advantage of the homomorphic property of Pedersen commitments.
-* Then, a lookup argument for an arbitrary-width relation can be implemented in terms of a subset argument, i.e. to constrain $\mathcal{R}(x, y, ...)$ in each row, consider $\mathcal{R}$ as a set of tuples $S$ (using the  method of the previous point) and check that $(x, y, ...) \in \mathcal{R}$.
-  * In the case where $\mathcal{R}$ represents a function, this implicitly also checks that the inputs are in the domain. This is typically what we want, and often saves an additional range check.
-* We can support multiple tables in the same circuit, by combining them into a single table that includes a tag column to identify the original table. The tag column could be merged with the "lookup selector" mentioned earlier.
-* The optimized range check technique in section 5 of the Plookup paper can also be used with this subset argument.
+- $A$ and $S$ can be extended to multiple columns, combined using a random challenge. $A'$
+  and $S'$ stay as single columns.
+  - The commitments to the columns of $S$ can be precomputed, then combined cheaply once
+    the challenge is known by taking advantage of the homomorphic property of Pedersen
+    commitments.
+- Then, a lookup argument for an arbitrary-width relation can be implemented in terms of a
+  subset argument, i.e. to constrain $\mathcal{R}(x, y, ...)$ in each row, consider
+  $\mathcal{R}$ as a set of tuples $S$ (using the method of the previous point), and check
+  that $(x, y, ...) \in \mathcal{R}$.
+  - In the case where $\mathcal{R}$ represents a function, this implicitly also checks
+    that the inputs are in the domain. This is typically what we want, and often saves an
+    additional range check.
+- We can support multiple tables in the same circuit, by combining them into a single
+  table that includes a tag column to identify the original table.
+  - The tag column could be merged with the "lookup selector" mentioned earlier, if this
+    were implemented.
 
-That is, the differences from Plookup are in the subset argument. This argument can then be used in all the same ways.
+These generalizations are similar to those in sections 4 and 5 of the
+[Plookup paper](https://eprint.iacr.org/2020/315.pdf) That is, the differences from
+Plookup are in the subset argument. This argument can then be used in all the same ways;
+for instance, the optimized range check technique in section 5 of the Plookup paper can
+also be used with this subset argument.