Fields

A fundamental component of many cryptographic protocols is the algebraic structure known as a field. Fields are sets of objects (usually numbers) with two associated binary operators + and \times such that various field axioms hold. The real numbers \mathbb{R} are an example of a field with uncountably many elements.

Halo makes use of finite fields which have a finite number of elements. Finite fields are fully classified as follows:

if \mathbb{F} is a finite field, it contains |\mathbb{F}| = p^k elements for some integer k \geq 1 and some prime p;
any two finite fields with the same number of elements are isomorphic. In particular, all of the arithmetic in a prime field \mathbb{F}_p is isomorphic to addition and multiplication of integers modulo p, i.e. in \mathbb{Z}_p. This is why we often refer to p as the modulus.

We'll write a field as \mathbb{F}_q where q = p^k. The prime p is called its characteristic. In the cases where k \gt 1 the field \mathbb{F}_q is a k-degree extension of the field \mathbb{F}_p. (By analogy, the complex numbers \mathbb{C} = \mathbb{R}(i) are an extension of the real numbers.) However, in Halo we do not use extension fields. Whenever we write \mathbb{F}_p we are referring to what we call a prime field, which has a prime number p of elements, i.e. k = 1.

Important notes:

There are two special elements in any field: 0, the additive identity, and 1, the multiplicative identity.
The least significant bit of a field element, when represented as an integer in binary format, can be interpreted as its "sign" to help distinguish it from its additive inverse (negation). This works whenever p is odd: for a nonzero element a which has a least significant bit 0 we have that -a = p - a has a least significant bit 1, and vice versa. We could also use whether or not an element is larger than (p - 1) / 2 to give it a "sign."
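The "sign" convention above can be checked on a small example. This is a minimal sketch, assuming a toy prime p = 11 and the hypothetical helper names `sign_bit` and `negate`; it only illustrates that a and -a always differ in their least significant bit when p is odd.

```python
# Toy odd prime, chosen only for illustration (assumption).
P = 11

def sign_bit(a: int) -> int:
    """Least significant bit of the canonical representative of a in [0, P)."""
    return (a % P) & 1

def negate(a: int) -> int:
    """Additive inverse of a in F_P."""
    return (-a) % P

# For every nonzero a, a and -a = P - a have opposite parity because P is odd.
opposite = all(sign_bit(a) != sign_bit(negate(a)) for a in range(1, P))
```

The element 0 is its own negation, so it has no meaningful "sign" under either convention.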

Finite fields will be useful later for constructing polynomials and elliptic curves. Elliptic curves are examples of groups, which we discuss next.

Groups

Groups are simpler and more limited than fields; they have only one binary operator \cdot and fewer axioms. They also have an identity, which we'll denote as 1.

Any element a in a group has an inverse b = a^{-1}, which is the unique element b such that a \cdot b = 1.

For example, the set of nonzero elements of \mathbb{F}_p forms a group, where the group operation is given by multiplication on the field.

(aside) Additive vs multiplicative notation

If \cdot is written as \times or omitted (i.e. a \cdot b written as ab), the identity as 1, and inversion as a^{-1}, as we did above, then we say that the group is "written multiplicatively". If \cdot is written as +, the identity as 0 or \mathcal{O}, and inversion as -a, then we say it is "written additively".

It's conventional to use additive notation for elliptic curve groups, and multiplicative notation when the elements come from a finite field.

When additive notation is used, we also write
[k] A = \underbrace{A + A + \cdots + A}_{k \text{ times}}
for nonnegative k and call this "scalar multiplication"; we also often use uppercase letters for variables denoting group elements. When multiplicative notation is used, we also write
a^k = \underbrace{a \times a \times \cdots \times a}_{k \text{ times}}
and call this "exponentiation". In either case we call the scalar k such that [k] g = a or g^k = a the "discrete logarithm" of a to base g. We can extend scalars to negative integers by inversion, i.e. [-k] A + [k] A = \mathcal{O} or a^{-k} \times a^k = 1.

The order of an element a of a finite group is defined as the smallest positive integer k such that a^k = 1 (in multiplicative notation) or [k] a = \mathcal{O} (in additive notation). The order of the group is the number of elements.
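As a small illustration of element order, the following sketch computes the order of every element of the multiplicative group of a toy prime field. The prime 11 and the function name `element_order` are assumptions made for the example.

```python
P = 11  # toy prime (assumption)

def element_order(a: int, p: int = P) -> int:
    """Smallest positive k with a^k = 1 in F_p^x (multiplicative notation)."""
    if a % p == 0:
        raise ValueError("0 is not in the multiplicative group")
    k, acc = 1, a % p
    while acc != 1:
        acc = (acc * a) % p
        k += 1
    return k

# Order of each of the P - 1 group elements.
orders = {a: element_order(a) for a in range(1, P)}
```

In this group of order 10, every element order that appears is a divisor of 10.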

Groups always have a generating set, which is a set of elements such that we can produce any element of the group as (in multiplicative terminology) a product of powers of those elements. So if the generating set is g_{1..n}, we can produce any element of the group as \prod\limits_{i=1}^{n} g_i^{k_i} where k_i \in \mathbb{Z}. There can be many different generating sets for a given group.

A group is called cyclic if it has a (not necessarily unique) generating set with only a single element — call it g. In that case we can say that g generates the group, and that the order of g is the order of the group.

Any finite cyclic group \mathbb{G} of order n is isomorphic to the integers modulo n (denoted \mathbb{Z}/n\mathbb{Z}), such that:

the operation \cdot in \mathbb{G} corresponds to addition modulo n;
the identity in \mathbb{G} corresponds to 0;
some generator g \in \mathbb{G} corresponds to 1.

Given a generator g, the isomorphism is always easy to compute in the \mathbb{Z}/n\mathbb{Z} \rightarrow \mathbb{G} direction; it is just a \mapsto g^a (or in additive notation, a \mapsto [a] g). It may be difficult in general to compute in the \mathbb{G} \rightarrow \mathbb{Z}/n\mathbb{Z} direction; we'll discuss this further when we come to elliptic curves.
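The two directions of the isomorphism can be sketched as follows. This assumes a toy group \mathbb{F}_{11}^\times with generator 2 (both chosen for illustration) and the hypothetical names `from_exponent` and `discrete_log`. The reverse direction is done by exhaustive search, which is only feasible because the group is tiny.

```python
P = 11      # toy prime (assumption)
G = 2       # a generator of F_11^x (assumption; 2 has order 10 modulo 11)
N = P - 1   # order of the group

def from_exponent(a: int) -> int:
    """Easy direction Z/nZ -> G: a |-> g^a."""
    return pow(G, a % N, P)

def discrete_log(h: int) -> int:
    """Hard direction G -> Z/nZ, done here by exhaustive search over k."""
    acc = 1
    for k in range(N):
        if acc == h % P:
            return k
        acc = (acc * G) % P
    raise ValueError("h is not in the group generated by G")
```

Addition modulo n on the left corresponds to the group operation on the right: `from_exponent(3 + 9)` equals the product of `from_exponent(3)` and `from_exponent(9)` modulo 11.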

If the order n of a finite group is prime, then the group is cyclic, and every non-identity element is a generator.

The multiplicative group of a finite field

We use the notation \mathbb{F}_p^\times for the multiplicative group (i.e. the group operation is multiplication in \mathbb{F}_p) over the set \mathbb{F}_p - \{0\}.

A quick way of obtaining the inverse in \mathbb{F}_p^\times is a^{-1} = a^{p - 2}. The reason for this stems from Fermat's little theorem, which states that a^p \equiv a \pmod p for any integer a. If a is nonzero, we can divide by a twice to get a^{p-2} \equiv a^{-1} \pmod p.
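A minimal sketch of inversion by exponentiation, assuming a toy prime p = 11 and a hypothetical function name `inverse`:

```python
P = 11  # toy prime (assumption)

def inverse(a: int, p: int = P) -> int:
    """Inverse in F_p^x via Fermat's little theorem: a^(p-2) mod p."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, p - 2, p)
```

For example, `inverse(3)` is 4, since 3 \times 4 = 12 \equiv 1 \pmod{11}.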

Let's assume that \alpha is a generator of \mathbb{F}_p^\times, so it has order p-1 (equal to the number of elements in \mathbb{F}_p^\times). Therefore, for any element a \in \mathbb{F}_p^\times there is a unique integer i \in \{0..p-2\} such that a = \alpha^i.

Notice that a \times b where a, b \in \mathbb{F}_p^\times can really be interpreted as \alpha^i \times \alpha^j where a = \alpha^i and b = \alpha^j. Indeed, it holds that \alpha^i \times \alpha^j = \alpha^{i + j} for all 0 \leq i, j \lt p - 1. As a result the multiplication of nonzero field elements can be interpreted as addition modulo p - 1 with respect to some fixed generator \alpha. The addition just happens "in the exponent."

This is another way to look at where a^{p - 2} comes from for computing inverses in the field:

p - 2 \equiv -1 \pmod{p - 1},

so a^{p - 2} = a^{-1}.
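The "addition in the exponent" view can be made concrete with a lookup table of discrete logarithms. This is a sketch under the assumption of a toy field \mathbb{F}_{11} with generator \alpha = 2; the names `LOG`, `mul_via_exponents` and `inv_via_exponents` are illustrative only.

```python
P = 11       # toy prime (assumption)
ALPHA = 2    # generator of F_11^x (assumption; 2 has order 10 modulo 11)

# Table of discrete logs: LOG[alpha^i] = i for i in 0..p-2.
LOG = {pow(ALPHA, i, P): i for i in range(P - 1)}

def mul_via_exponents(a: int, b: int) -> int:
    """Multiply nonzero a, b by adding their exponents modulo p - 1."""
    return pow(ALPHA, (LOG[a] + LOG[b]) % (P - 1), P)

def inv_via_exponents(a: int) -> int:
    """Invert a by negating its exponent; note -1 = p - 2 (mod p - 1)."""
    return pow(ALPHA, (-LOG[a]) % (P - 1), P)
```

Both functions agree with ordinary field multiplication and with a^{p - 2} respectively, which is the content of the two observations above.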

Montgomery's Trick

Montgomery's trick, named after Peter Montgomery (RIP), is a way to compute many group inversions at the same time. It is commonly used to compute inversions in \mathbb{F}_p^\times, which are quite computationally expensive compared to multiplication.

Imagine we need to compute the inverses of three nonzero elements a, b, c \in \mathbb{F}_p^\times. Instead, we'll compute the products x = ab and y = xc = abc, and compute the inversion

z = y^{p - 2} = \frac{1}{abc}.

We can now multiply z by x to obtain \frac{1}{c} and multiply z by c to obtain \frac{1}{ab}, which we can then multiply by b and by a to obtain \frac{1}{a} and \frac{1}{b} respectively.

This technique generalizes to arbitrary numbers of group elements with just a single inversion necessary.
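The general form of the trick can be sketched as a batch inversion routine. This is one possible arrangement (running prefix products followed by a backward pass), not necessarily how any particular library implements it; the function name `batch_inverse` is an assumption.

```python
def batch_inverse(values: list[int], p: int) -> list[int]:
    """Invert every element of `values` modulo the prime p using one inversion."""
    if any(v % p == 0 for v in values):
        raise ZeroDivisionError("all inputs must be nonzero")
    # prefix[i] = values[0] * ... * values[i-1], with prefix[0] = 1.
    prefix = [1]
    for v in values:
        prefix.append((prefix[-1] * v) % p)
    # The single inversion: the inverse of the product of all inputs.
    acc = pow(prefix[-1], p - 2, p)
    out = [0] * len(values)
    for i in range(len(values) - 1, -1, -1):
        # acc is 1/(v_0 ... v_i); multiplying by v_0 ... v_{i-1} leaves 1/v_i.
        out[i] = (acc * prefix[i]) % p
        # Drop v_i from the running inverse.
        acc = (acc * values[i]) % p
    return out
```

For n inputs this costs one exponentiation plus roughly 3n multiplications, instead of n exponentiations.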

Multiplicative subgroups

A subgroup of a group G with operation \cdot is a subset of elements of G that also form a group under \cdot.

In the previous section we said that \alpha is a generator of the (p - 1)-order multiplicative group \mathbb{F}_p^\times. This group has composite order, and so by the Chinese remainder theorem¹ it has proper subgroups. As an example let's imagine that p = 11, and so p - 1 factors into 5 \cdot 2. Thus, there is a generator \beta of the 5-order subgroup and a generator \gamma of the 2-order subgroup. All elements in \mathbb{F}_p^\times, therefore, can be written uniquely as \beta^i \cdot \gamma^j for some i (modulo 5) and some j (modulo 2).

If we have a = \beta^i \cdot \gamma^j notice what happens when we compute


a^5 = (\beta^i \cdot \gamma^j)^5
    = \beta^{i \cdot 5} \cdot \gamma^{j \cdot 5}
    = \beta^0 \cdot \gamma^{j \cdot 5}
    = \gamma^{j \cdot 5};

we have effectively "killed" the 5-order subgroup component, producing a value in the 2-order subgroup.
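The p = 11 example can be checked numerically. In this sketch the concrete generators \beta = 3 (order 5) and \gamma = 10 (order 2) are choices made for illustration; the text does not fix them.

```python
P = 11      # the example prime from the text; p - 1 = 5 * 2
BETA = 3    # an element of order 5 modulo 11 (assumption)
GAMMA = 10  # the element of order 2 modulo 11, i.e. -1 (assumption)

# Every nonzero element is beta^i * gamma^j for a unique pair (i, j).
decomposition = {
    (pow(BETA, i, P) * pow(GAMMA, j, P)) % P: (i, j)
    for i in range(5)
    for j in range(2)
}

# Raising to the 5th power kills the 5-order component ...
fifth_powers = {pow(a, 5, P) for a in range(1, P)}
# ... and raising to the 2nd power kills the 2-order component.
squares = {pow(a, 2, P) for a in range(1, P)}
```

Here `fifth_powers` is the 2-order subgroup {1, 10} and `squares` is the 5-order subgroup {1, 3, 4, 5, 9}.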

Lagrange's theorem (group theory) states that the order of any subgroup H of a finite group G divides the order of G. Therefore, the order of any subgroup of \mathbb{F}_p^\times must divide p-1.

PLONK-based proving systems like Halo 2 are more convenient to use with fields that have a large number of multiplicative subgroups with a "smooth" distribution (which makes the performance cliffs smaller and more granular as circuit sizes increase). The Pallas and Vesta curves specifically have primes of the form

T \cdot 2^S = p - 1

with S = 32 and T odd (i.e. p - 1 has 32 lower zero-bits). This means they have multiplicative subgroups of order 2^k for all k \leq 32. These 2-adic subgroups are nice for efficient FFTs, as well as enabling a wide variety of circuit sizes.
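The decomposition T \cdot 2^S = p - 1 is easy to compute for any odd prime. The sketch below uses the toy prime 97 rather than the Pallas or Vesta moduli, which are not reproduced here; the function name `two_adicity` is an assumption.

```python
def two_adicity(p: int) -> tuple[int, int]:
    """Return (S, T) with p - 1 = T * 2**S and T odd, for an odd prime p."""
    if p < 3 or p % 2 == 0:
        raise ValueError("p must be an odd prime")
    s, t = 0, p - 1
    while t % 2 == 0:
        t //= 2
        s += 1
    return s, t

# Toy example: 97 - 1 = 96 = 3 * 2^5.
S, T = two_adicity(97)
```

With S = 5 the field \mathbb{F}_{97} has multiplicative subgroups of order 2^k for every k \leq 5, in the same way that S = 32 gives subgroups up to order 2^{32}.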

Square roots

In a field \mathbb{F}_p exactly half of all nonzero elements are squares; the remainder are non-squares or "quadratic non-residues". In order to see why, consider an \alpha that generates the 2-order multiplicative subgroup of \mathbb{F}_p^\times (this exists because p - 1 is divisible by 2 since p is a prime greater than 2) and \beta that generates the t-order multiplicative subgroup of \mathbb{F}_p^\times where p - 1 = 2t. Then every element a \in \mathbb{F}_p^\times can be written uniquely as \alpha^i \cdot \beta^j with i \in \mathbb{Z}_2 and j \in \mathbb{Z}_t. Half of all elements will have i = 0 and the other half will have i = 1.

Let's consider the simple case where p \equiv 3 \pmod{4} and so t is odd (if t is even, then p - 1 would be divisible by 4, which contradicts p being 3 \pmod{4}). If a \in \mathbb{F}_p^\times is a square, then there must exist b = \alpha^i \cdot \beta^j such that b^2 = a. But this means that

a = (\alpha^i \cdot \beta^j)^2 = \alpha^{2i} \cdot \beta^{2j} = \beta^{2j}.

In other words, no square in this particular field has a component in the 2-order multiplicative subgroup (every square has i = 0). Since half of the elements have i = 1, at most half of the elements are squares. In fact exactly half of the elements are squares (since squaring each nonsquare element gives a unique square). This means we can assume all squares can be written as \beta^m for some m, and therefore finding the square root is a matter of exponentiating by 2^{-1} \pmod{t}.
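For p \equiv 3 \pmod{4} the square root is therefore a single exponentiation. A minimal sketch, assuming the toy prime p = 11 (so t = 5) and the hypothetical names `is_square` and `sqrt_3mod4`:

```python
P = 11            # toy prime with P % 4 == 3 (assumption)
T = (P - 1) // 2  # odd, so 2 is invertible modulo T

def is_square(a: int) -> bool:
    """A nonzero a is a square iff a^T = 1, i.e. it has no 2-order component."""
    return pow(a, T, P) == 1

def sqrt_3mod4(a: int) -> int:
    """Square root of a nonzero square a, by exponentiating by 2^{-1} mod T."""
    if not is_square(a):
        raise ValueError("a is not a nonzero square")
    half = pow(2, -1, T)  # modular inverse of 2; needs Python 3.8+
    return pow(a, half, P)
```

Since 2^{-1} \pmod{t} equals (t + 1)/2 = (p + 1)/4 when t is odd, this is the same as the familiar formula a^{(p+1)/4}. The other square root is the negation of the returned value.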

In the event that p \equiv 1 \pmod{4} then things get more complicated because t = (p - 1)/2 is even and so 2^{-1} \pmod{t} does not exist. Let's instead write p - 1 as 2^k \cdot t with t odd. The case k = 0 is impossible, and the case k = 1 is what we already described, so consider k \geq 2. Now \alpha generates the 2^k-order multiplicative subgroup and \beta generates the odd t-order multiplicative subgroup. Then every element a \in \mathbb{F}_p^\times can be written as \alpha^i \cdot \beta^j for i \in \mathbb{Z}_{2^k} and j \in \mathbb{Z}_t. If the element is a square, then there exists some b = \sqrt{a} which can be written b = \alpha^{i'} \cdot \beta^{j'} for i' \in \mathbb{Z}_{2^k} and j' \in \mathbb{Z}_t. This means that a = b^2 = \alpha^{2i'} \cdot \beta^{2j'}, therefore we have i \equiv 2i' \pmod{2^k}, and j \equiv 2j' \pmod{t}. So i must be even, because otherwise it would be impossible to have i \equiv 2i' \pmod{2^k} for any i'. Conversely, if a is not a square then i is odd, and so half of all elements are squares.

In order to compute the square root, we can first raise the element a = \alpha^i \cdot \beta^j to the power t to "kill" the t-order component, giving

a^t = \alpha^{it \pmod {2^k}} \cdot \beta^{jt \pmod t} = \alpha^{it \pmod {2^k}}

and then raise this result to the power t^{-1} \pmod{2^k} to undo the effect of the original exponentiation on the 2^k-order component:

(\alpha^{it \pmod {2^k}})^{t^{-1} \pmod{2^k}} = \alpha^i

(since t is relatively prime to 2^k). This leaves bare the \alpha^i value which we can trivially handle. We can similarly kill the 2^k-order component to obtain \beta^{j \cdot 2^{-1} \pmod{t}}, and put the values together to obtain the square root.

It turns out that in the cases k = 2, 3 there are simpler algorithms that merge several of these exponentiations together for efficiency. For other values of k, the only known way is to manually extract i by squaring until you obtain the identity for every single bit of i. This is the essence of the Tonelli-Shanks square root algorithm and describes the general strategy. (There is another square root algorithm that uses quadratic extension fields, but it doesn't pay off in efficiency until the prime becomes quite large.)
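The general strategy can be sketched as a textbook Tonelli-Shanks routine. This is a common formulation of the algorithm, not the specific variant used by any particular library; the function name `tonelli_shanks` and the local variable names are assumptions.

```python
def tonelli_shanks(a: int, p: int) -> int:
    """Return some r with r*r = a (mod p), for an odd prime p and nonzero square a."""
    a %= p
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a nonzero square modulo p")
    # Write p - 1 = 2^k * t with t odd.
    k, t = 0, p - 1
    while t % 2 == 0:
        t //= 2
        k += 1
    # Find a non-square z; z^t then generates the 2^k-order subgroup.
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1
    c = pow(z, t, p)              # generator of the 2^k-order subgroup
    r = pow(a, (t + 1) // 2, p)   # candidate root, with r^2 = a * a^t
    b = pow(a, t, p)              # error term, lies in the 2^k-order subgroup
    m = k
    while b != 1:
        # Find the least i with b^(2^i) = 1 by repeated squaring.
        i, sq = 0, b
        while sq != 1:
            sq = (sq * sq) % p
            i += 1
        # Multiply in a correction from the 2-power subgroup to lower b's order.
        d = pow(c, 1 << (m - i - 1), p)
        r = (r * d) % p
        c = (d * d) % p
        b = (b * c) % p
        m = i
    return r
```

The inner loop is the "squaring until you obtain the identity" step; each pass of the outer loop resolves one more bit of the 2^k-order component. When k = 1 the error term b is already 1 and the routine reduces to a single exponentiation.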

Roots of unity

In the previous sections we wrote p - 1 = 2^k \cdot t with t odd, and stated that an element \alpha \in \mathbb{F}_p^\times generated the 2^k-order subgroup. For convenience, let's denote n := 2^k. The elements \{1, \alpha, \ldots, \alpha^{n-1}\} are known as the n-th roots of unity.

A primitive root of unity, \omega, is an n-th root of unity such that \omega^i \neq 1 except when i \equiv 0 \pmod{n}.
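One way to find such an \omega is to raise candidate elements to the power t, which kills the t-order component, and keep the first result whose order is exactly n. The sketch below assumes the toy prime p = 97 (so k = 5, t = 3, n = 32); the names `find_primitive_root_of_unity`, `OMEGA` and `ROOTS` are illustrative.

```python
P = 97        # toy prime (assumption): P - 1 = 2^5 * 3
K, T = 5, 3
N = 1 << K    # n = 2^k = 32

def find_primitive_root_of_unity() -> int:
    """Return an element of order exactly N in F_P^x."""
    for g in range(2, P):
        omega = pow(g, T, P)            # kills the T-order component
        # omega^N = 1 always holds; the order is exactly N iff omega^(N/2) != 1.
        if pow(omega, N // 2, P) != 1:
            return omega
    raise ValueError("no primitive root of unity found")

OMEGA = find_primitive_root_of_unity()
# All n-th roots of unity, as powers of the primitive root.
ROOTS = [pow(OMEGA, i, P) for i in range(N)]
```

Because n is a power of two, checking the single exponent n/2 is enough: any element whose order properly divides n also has order dividing n/2.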