[book] Write Concepts section.

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2020-12-25 12:56:25 +00:00 · 2020-12-25 12:56:25 +00:00 · f23677cea9
parent 591ab49266
commit f23677cea9
7 changed files with 219 additions and 0 deletions
--- a/book/src/SUMMARY.md
+++ b/book/src/SUMMARY.md
@ -2,6 +2,11 @@

 [halo2](README.md)
 - [Concepts](concepts.md)
+  - [Background](concepts/background.md)
+  - [UltraPLONK Arithmetization](concepts/arithmetization.md)
+  - [Cores](concepts/cores.md)
+  - [Chips](concepts/chips.md)
+  - [Gadgets](concepts/gadgets.md)
 - [User Documentation](user.md)
  - [A simple example](user/simple-example.md)
  - [Lookup tables](user/lookup-tables.md)
--- a/book/src/concepts.md
+++ b/book/src/concepts.md
@ -1 +1,5 @@
 # Concepts
+
+First we'll describe the concepts behind zero-knowledge proof systems; the
+*arithmetization* (kind of circuit description) used by Halo 2; and the
+abstractions we use to build circuit implementations.
--- a/book/src/concepts/arithmetization.md
+++ b/book/src/concepts/arithmetization.md
@ -0,0 +1,58 @@
+# UltraPLONK Arithmetization
+
+The arithmetization used by Halo 2 comes from [PLONK](https://eprint.iacr.org/2019/953), or
+more precisely its extension UltraPLONK that supports custom gates and lookup arguments. We'll
+call it ***UPA*** (***UltraPLONK arithmetization***).
+
+> The term UPA and some of the other terms we use to describe it are not used in the PLONK
+> paper.
+
+***UPA circuits*** are defined in terms of a rectangular matrix of values. We refer to
+***rows***, ***columns***, and ***cells*** of this matrix with the conventional meanings.
+
+A UPA circuit depends on a ***configuration***:
+
+* A finite field $\mathbb{F}$, where cell values (for a given instance and witness) will be
+  elements of $\mathbb{F}$.
+* The number of columns in the matrix, and a specification of each column as being
+  ***fixed***, ***advice***, or ***auxiliary***. Fixed columns are fixed by the circuit;
+  advice columns correspond to witness values; and auxiliary columns are used for public inputs.
+
+* A subset of the columns that can participate in equality constraints.
+
+* A ***polynomial degree bound***.
+
+* A sequence of ***polynomial constraints***. These are multivariate polynomials over
+  $\mathbb{F}$ that must evaluate to zero *for each row*. The variables in a polynomial
+  constraint may refer to a cell in a given column of the current row, or a given column of
+  another row relative to this one (with wrap-around, i.e. taken modulo $n$). The maximum
+  degree of each polynomial is given by the polynomial degree bound.
+
+* A sequence of ***lookup arguments*** defined over tuples of ***input columns*** and
+  ***table columns***.
+
+A UPA circuit also defines:
+
+* The number of rows $n$ in the matrix. $n$ must correspond to the size of a multiplicative
+  subgroup of $\mathbb{F}^\times$; typically a power of two.
+
+* A sequence of ***equality constraints***, which specify that two given cells must have equal
+  values.
+
+* The values of the fixed columns at each row.
+
+From a circuit description we can generate a ***proving key*** and a ***verification key***,
+which are needed for the operations of proving and verification for that circuit.
+
+> Note that we specify the ordering of columns, polynomial constraints, lookup arguments, and
+> equality constraints, even though these do not affect the meaning of the circuit. This makes
+> it easier to define the generation of proving and verification keys as a deterministic
+> process.
+
+Typically, a configuration will define polynomial constraints that are switched off and on by
+***selectors*** defined in fixed columns. For example, a constraint $q_i \cdot p(...) = 0$ can
+be switched off for a particular row $i$ by setting $q_i = 0$. In this case we sometimes refer
+to a set of constraints controlled by a set of selector columns that are designed to be used
+together, as a ***gate***. Typically there will be a ***standard gate*** that supports generic
+operations like field multiplication and division, and possibly also ***custom gates*** that
+support more specialized operations.
--- a/book/src/concepts/background.md
+++ b/book/src/concepts/background.md
@ -0,0 +1,41 @@
+# Background
+
+The aim of any ***proof system*** is to be able to prove ***instances*** of ***statements***.
+
+A statement is a high-level description of what is to be proven, parameterized by
+***public inputs*** of given types. An instance is a statement with particular values for
+these public inputs.
+
+Normally the statement will also have ***private inputs***. The private inputs and any
+intermediate values that make an instance of a statement hold, are collectively called a
+***witness*** for that instance.
+
+A ***Non-interactive Argument of Knowledge*** allows a ***prover*** to create a ***proof***
+for a given instance and witness. The proof is data that can be used to convince a
+***verifier*** that the creator of the proof knew a witness for which the statement holds on
+this instance. The security property that such proofs cannot falsely convince a verifier is
+called ***knowledge soundness***.
+
+> This property is subtle given that proofs can be ***malleable***. That is, depending on the
+> proof system it may be possible to take an existing proof (or set of proofs) and, without
+> knowing the witness(es), modify it/them to produce a distinct proof of the same or a related
+> statement. Higher-level protocols that use malleable proof systems need to take this into
+> account.
+
+If a proof yields no information about the witness (other than that a witness exists and was
+known to the prover), then we say that the proof system is ***zero knowledge***.
+
+A proof system will define an ***arithmetization***, which is a way of describing statements
+--- typically in terms of polynomial constraints on variables over a field. An arithmetized
+statement is called a ***circuit***.
+
+If the proof is short ---i.e. it has length polylogarithmic in the circuit size--- then
+we say that the proof system is ***succinct***, and call it a ***SNARK***
+(***Succinct Non-Interactive Argument of Knowledge***).
+
+> By this definition, a SNARK need not have verification time polylogarithmic in the circuit
+> size. Some papers use the term ***efficient*** to describe a SNARK with that property, but
+> we'll avoid that term since it's ambiguous for SNARKs that support amortized or recursive
+> verification, which we'll get to later.
+
+A ***zk-SNARK*** is a zero-knowledge SNARK.
--- a/book/src/concepts/chips.md
+++ b/book/src/concepts/chips.md
@ -0,0 +1,18 @@
+# Chips
+
+In order to combine functionality from several cores, we use a ***chip***. To implement a
+chip, we define a set of fixed, advice, and auxiliary columns, and then specify how they
+should be distributed between cores.
+
+In the simplest case, each core will use columns disjoint from the other cores. However, it
+is allowed to share a column between cores. It is important to optimize the number of advice
+columns in particular, because that affects proof size.
+
+The result (possibly after optimization) is a UPA configuration. Our circuit implementation
+will be parameterized on a chip, and can use any features of the supported cores via the chip.
+
+Our hope is that less expert users will normally be able to find an existing chip that
+supports the operations they need, or only have to make minor modifications to an existing
+chip. Expert users will have full control to do the kind of
+[circuit optimizations](https://zips.z.cash/protocol/canopy.pdf#circuitdesign)
+[that ECC is famous  for](https://electriccoin.co/blog/cultivating-sapling-faster-zksnarks/) 🙂.
--- a/book/src/concepts/cores.md
+++ b/book/src/concepts/cores.md
@ -0,0 +1,68 @@
+# Cores
+
+The above is a fairly low-level description of a circuit. When implementing circuits we will
+typically use a higher-level API which aims for the desirable characteristics of auditability,
+efficiency, modularity, and expressiveness.
+
+Some of the terminology and concepts used in this API are taken from an analogy with
+integrated circuit design and layout. [As for integrated circuits](https://opencores.org/),
+the above desirable characteristics are easier to obtain by composing ***cores*** that provide
+efficient pre-built implementations of particular functionality.
+
+For example, we might have cores that implement particular cryptographic primitives such as a
+hash function or cipher, or algorithms like scalar multiplication or pairings.
+
+In UPA, it is possible to build up arbitrary logic just from standard gates that do field
+multiplication and addition. However, very significant efficiency gains can be obtained by
+using custom gates.
+
+Using our API, we define cores that "know" how to use particular sets of custom gates. This
+creates an abstraction layer that isolates the implementation of a high-level circuit from the
+complexity of using custom gates directly.
+
+> Even if we sometimes need to "wear two hats", by implementing both a high-level circuit and
+> the cores that it uses, the intention is that this separation will result in code that is
+> easier to understand, audit, and maintain/reuse. This is partly because some potential
+> implementation errors are ruled out by construction.
+
+Gates in UPA refer to cells by ***relative references***, i.e. to the cell in a given column,
+and the row at a given offset relative to the one in which the gate's selector is set. We call
+this an ***offset reference*** when the offset is nonzero (i.e. offset references are a subset
+of relative references).
+
+Relative references contrast with ***absolute references*** used in equality constraints,
+which can point to any cell.
+
+The motivation for offset references is to reduce the number of columns needed in the
+configuration, which reduces proof size. If we did not have offset references then we would
+need a column to hold each value referred to by a custom gate, and we would need to use
+equality constraints to copy values from other cells of the circuit into that column. With
+offset references, we not only need fewer columns; we also do not need equality constraints to
+be supported for all of those columns, which improves efficiency.
+
+In R1CS (which may be more familiar to some readers, but don't worry if it isn't), a circuit
+consists of a "sea of gates" with no semantically significant ordering. Because of offset
+references, the order of rows in a UPA circuit, on the other hand, *is* significant. We're
+going to make some simplifying assumptions and define some abstractions to tame the resulting
+complexity: the aim will be that, [at the gadget level](#Gadgets) where we do most of our
+circuit construction, we will not have to deal with relative references or with gate layout
+explicitly.
+
+We will partition a circuit into ***regions***, where each region contains a disjoint subset
+of cells, and relative references only ever point *within* a region. Part of the responsibility
+of a core implementation is to ensure that gates that make offset references are laid out in
+the correct positions in a region.
+
+Given the set of regions and their ***shapes***, we will use a separate ***floor planner***
+to decide where (i.e. at what starting row) each region is placed. There is a default floor
+planner that implements a very general algorithm, but you can write your own floor planner if
+you need to.
+
+Floor planning will in general leave gaps in the matrix, because the gates in a given row did
+not use all available columns. These are filled in ---as far as possible--- by gates that do
+not require offset references, which allows them to be placed on any row.
+
+Cores can also define lookup tables. If more than one table is defined for the same lookup
+argument, we can use a ***tag column*** to specify which table is used on each row. It is also
+possible to perform a lookup in the union of several tables (limited by the polynomial degree
+bound).
--- a/book/src/concepts/gadgets.md
+++ b/book/src/concepts/gadgets.md
@ -0,0 +1,25 @@
+# Gadgets
+
+When implementing a circuit, we could use the features of the cores we've selected directly
+via the chip. Typically, though, we will use them via ***gadgets***. This indirection is
+useful because, for reasons of efficiency and limitations imposed by UPA, the core interfaces
+will often be dependent on low-level implementation details. The gadget interface can provide
+a more convenient and stable API that abstracts away from extraneous detail.
+
+For example, consider a hash function such as SHA-256. The interface of a core supporting
+SHA-256 might be dependent on internals of the hash function design such as the separation
+between message schedule and compression function. The corresponding gadget interface can
+provide a more convenient and familiar `update`/`finalize` API, and can also handle parts
+of the hash function that do not need core support, such as padding. This is similar to how
+[accelerated](https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html)
+[instructions](https://developer.arm.com/documentation/ddi0514/g/introduction/about-the-cortex-a57-processor-cryptography-engine)
+for cryptographic primitives on CPUs are typically accessed via software libraries, rather
+than directly.
+
+Gadgets can also provide modular and reusable abstractions for circuit programming
+at a higher level, similar to their use in libraries such as
+[libsnark](https://github.com/christianlundkvist/libsnark-tutorial) and
+[bellman](https://electriccoin.co/blog/bellman-zksnarks-in-rust/). As well as abstracting
+*functions*, they can also abstract *types*, such as elliptic curve points or integers of
+specific sizes.
+