The pattern is preserved in one location inside the inner product
argument, where we instead desire to avoid allocations by collapsing
p_prime into itself. Using `+=` here requires both mutable and immutable
borrows simultaneously, and assigning temporaries to avoid this makes
the implementation less clear.
These are all cases inside the ECC chip, where we are inherently
producing valid coordinate pairs as a result of the constraints being
implemented, but it is useful to be explicit about the contract being
asserted at each point we construct `EccPoint` or `NonIdentityEccPoint`.
Previously we used the existing `Polynomial::rotate` and
`EvaluationDomain::rotate_extended` APIs to rotate the polynomials we
query at non-zero rotations, and then stored references to the chunks of
each (rotated and unrotated) polynomial as slices. When we reached an
`AstLeaf`, we would then clone the corresponding polynomial chunk.
We now avoid all of this with combined "rotate-and-chunk" APIs, making
use of the observation that a rotation is simply a renumbering of the
indices of the polynomial coefficients. Given that when we reach an
`AstLeaf` we already needed to allocate, these new APIs have the same
number of allocations during AST evaluation, but enable us to completely
avoid caching any information about the rotations or rotated polynomials
ahead of time.
`poly::Evaluator` stores all of the polynomials registered with it in
memory for the duration of its existence. When evaluating an AST, it
additionally caches the rotated polynomials in memory, and then chunks
all of the rotated polynomials for parallel evaluation.
Previously, it stored a polynomial for every unique AST leaf, regardless
of whether that leaf required a rotation or not. This resulted in the
unrotated polynomials being stored twice in memory. However, the chunks
simply refer to slices over cached polynomials, so we can reference the
unrotated polynomials stored in `poly::Evaluator` instead of copies of
them stored in the rotated polynomial `HashMap`. This strictly reduces
memory usage during proving with no effect on correctness.
`BatchVerifier` now manages the entire batch verification process.
Individual proofs are verified on a threadpool, and the resulting MSMs
are then batch-checked as before. The addition of parallelism here
couples with zcash/halo2#608 to make parallelism less fine-grained and
reduce the overhead of multi-threading.