ZIP 316 Work in Progress.

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
Daira Hopwood 2021-04-08 23:59:30 +01:00
parent cb141ac91e
commit 3de014d33c
3 changed files with 547 additions and 2 deletions

ZIP: 316
Title: Unified Addresses
Owners: Daira Hopwood <daira@electriccoin.co>
Nathan Wilcox <nathan@electriccoin.co>
Taylor Hornby <taylor@electriccoin.co>
Jack Grigg <jack@electriccoin.co>
Sean Bowe <sean@electriccoin.co>
Kris Nuttycombe <kris@electriccoin.co>
Ying Tong Lai <yingtong@electriccoin.co>
Status: Proposed
Category: Standards / RPC / Wallet
Created: 2021-04-07
License: MIT
Discussions-To: <https://github.com/zcash/zips/issues/482>
Terminology
===========
The key words "MUST", "MUST NOT", and "SHOULD" in this document are to
be interpreted as described in RFC 2119. [#RFC2119]_
The terms below are to be interpreted as follows:
Recipient
A wallet or other software that can receive transfers of assets (such
as ZEC) or in the future potentially other transaction-based state changes.
Sender
A wallet or other software that can send transfers of assets, or other
future consensus state side-effects.
Receiver
The necessary information to transfer an asset to a Recipient that generated
that Receiver using a specific Transfer Protocol. Each Receiver is associated
unambiguously with a specific Receiver Type.
Legacy Address (or LA)
A transparent, Sprout, or Sapling Address.
Unified Address (or UA)
A Unified Address combines multiple Receivers.
Address
Either a Legacy Address or a Unified Address.
Transfer Protocol
A specification of how a Sender can transfer assets to a Recipient.
For example, the Transfer Protocol for a Sapling Receiver is the subset
of the Zcash protocol required to successfully transfer ZEC using Sapling
Spend/Output Transfers as specified in the Zcash Protocol Specification.
(A single Zcash transaction can contain transfers of multiple
Transfer Protocols. For example a t→z transaction that shields to the
Sapling pool requires both Transparent and Sapling Transfer Protocols.)
Address Encoding
The externally visible encoding of an Address (e.g. as a string of
characters or a QR code).
Abstract
========
This proposal defines Unified Addresses, which bundle together Zcash addresses
(or other payment methods) of different types in a way that can be presented as
a single Address Encoding. It also defines Unified Viewing Keys, which perform
a similar function for Zcash viewing keys.
Motivation
==========
Up to and including the Canopy network upgrade, Zcash supported the following
payment address types:
* Transparent Addresses (P2PKH and P2SH)
* Sprout Addresses
* Sapling Addresses
Each of these has its own Address Encodings, as a string and as a QR code.
(Since the QR code is derivable from the string encoding, for many purposes
it suffices to consider the string encoding.)
The Orchard proposal [#zip-0224]_ adds a new address type, Orchard Addresses.
The difficulty with defining new Address Encodings for each address type is
that end-users are forced to be aware of the various types, and in particular
which types are supported by a given Sender or Recipient. In order to make
sure that transfers are completed successfully, users may be forced to
explicitly generate addresses of different types and re-distribute encodings
of them, which adds significant friction and cognitive overhead to
understanding and using Zcash.
The goals for a Unified Address standard are as follows:
- Simplify coordination between receivers and senders of Zcash wallets by
removing complexity from negotiating address types.
- Provide a “bridging mechanism” to allow shielded wallets to successfully
interact with conformant Transparent-Only wallets.
- Allow older conformant wallets to interact seamlessly with newer wallets.
- Enable users of newer wallets to upgrade to newer transaction technologies
and/or pools while maintaining seamless interactions with counterparties
using older wallets.
- Allow wallets to shoulder more sophisticated responsibilities for shielding
and/or migrating user funds.
- Allow wallets to potentially develop new transfer mechanisms without
underlying protocol changes.
- Provide forward compatibility that is standard for all wallets across a
range of potential future features. Some examples might include Layer 2
features, cross-chain interoperability and bridging, and decentralized
exchange.
- The standard should work well for Zcash today and upcoming potential
upgrades, and also anticipate even broader use cases down the road such
as cross-chain functionality.
Requirements
============
Overview
--------
Unified Addresses specify multiple methods for payment to a Recipient's Wallet.
The Sender's Wallet can then non-interactively select the method of payment.
Importantly, any wallet can support Unified Addresses, even when that wallet
only supports a subset of payment methods for receiving and/or sending.
Despite having some similar characteristics, the Unified Address standard is
orthogonal to Payment Request URIs [#zip-0321]_ and similar schemes, and the
Unified Address format is likely to be incorporated into such schemes as a new
address type.
Concepts
--------
Wallets follow a model *Interaction Flow* as follows:
1. A Recipient *generates* an Address.
2. The Recipient wallet or human user encodes the Address and
*distributes* this Address Encoding. Distribution may be
“out-of-band” (example: spray painting a QR Code on a sign) or
more or less “in-band” (example: including a string encoding of a
``Reply-To`` address in an encrypted memo following a common
standard).
3. A Sender wallet or user *imports* the Address Encoding through any of
a variety of mechanisms (QR Code scanning, Payment URIs, cut-and-paste,
or “in-band” protocols like ``Reply-To`` memos). This includes decoding
and validity checks.
4. (Perhaps later in time) the Sender wallet executes a transfer of ZEC
(or other assets or future protocol state changes) to the Address.
These steps are a funnel: encodings of the same Address may be distributed
zero or more times through different means. Zero or more Senders may import
addresses. Zero or more of those may execute a Transfer. A single Sender may
execute multiple Transfers over time from a single import.
[TODO: examples]
Addresses
---------
A Unified Address (or UA for short) combines one or more Receivers.
When new Transfer Protocols are introduced to the Zcash protocol after
Unified Addresses are standardized, those should introduce new Receiver Types
but *not* different address types outside of the UA standard. There needs
to be a compelling reason to deviate from the standard, since the benefits
of UA come precisely from their applicability across all new protocol
upgrades.
Receivers
---------
Every Wallet must anticipate and properly parse a UA with any unknown
arbitrary Receiver Type.
When Transferring to a valid UA, a Sender must behave as if any unknown
Receiver Type is simply not present for the purposes of the transfer.
A Wallet may process unknown Receiver Types by indicating to the user
their presence or similar information for usability or diagnostic purposes.
Transport Encoding
------------------
The string encoding is “opaque” to human readers: it does *not* allow
visual identification of which Receivers or Receiver Types are present.
- Rationale: The general thinking behind UAs is to allow wallets to
streamline user experience (UX). If human users can parse a UA and
alter their behaviour based on that, then different users will end up
using the same wallet very differently; this complicates troubleshooting
and learning from other users or educational resources. Note that this
does not preclude a wallet from providing user-friendly displays or
indications about Receiver support, and the wallet's UX design can
decide when and how to do this and build a behavioural flow around that.
The string encoding is resilient against typos, transcription errors,
cut-and-paste errors, unanticipated truncation, or other anticipated
UX hazards.
There is a well-defined encoding of a Unified Address as a QR Code,
which produces QR codes that are reasonably compact and robust.
There is a well-defined transformation between the QR Code and string
encodings in either direction.
The string encoding fits into ZIP-321 Payment URIs [#zip-0321]_ and
general URIs without introducing parse ambiguities.
The encoding must support sufficiently many Receiver Types to allow
for reasonable future expansion.
The encoding must allow all wallets to safely and correctly parse out
unknown Receiver Types well enough to ignore them.
Transfers
---------
When executing a Transfer the Sender selects a transfer method via a
Selection process.
Given a valid UA, Selection must treat any unrecognized Receiver as
though it were absent.
- This property is crucial for forward compatibility to ensure users
who upgrade to newer protocols / UAs don't lose the ability to smoothly
interact with older wallets.
- This property is crucial for allowing Transparent-Only UA-Conformant
wallets to interact with newer shielded wallets, removing a
disincentive for adopting newer shielded wallets.
- This property also allows Transparent-Only wallets to upgrade to
shielded support without re-acquiring counterparty UAs. Even when UAs
are re-acquired, the user flow and usability will be minimally
disrupted.
Open Issues and Known Concerns
------------------------------
FIXME: We have a few of these I [Nathan] will add in future edits.
This is especially true of privacy impacts of transparent or cross-pool
transactions and the associated UX issues.
Non-requirements
================
...
Specification
=============
Definitions
-----------
Encoding of Unified Payment Addresses
-------------------------------------
Rather than defining a Bech32 string encoding of Orchard shielded
payment addresses, we instead define a unified payment address format
that is able to encode a set of payment addresses of different types.
This enables the consumer of an address to choose the best address
type it supports, providing a better user experience as new formats
are added in the future.
Assume that we are given a set of one or more raw encodings of
payment addresses of distinct types. That is, the set may optionally
contain one of each of the payment address types in the following
list:
* typecode :math:`\mathtt{0x03}` — an Orchard raw address as defined
in [#protocol-orchardpaymentaddrencoding]_;
* typecode :math:`\mathtt{0x02}` — a Sapling raw address as defined
in [#protocol-saplingpaymentaddrencoding]_;
* typecode :math:`\mathtt{0x01}` — a transparent P2SH address, *or*
typecode :math:`\mathtt{0x00}` — a transparent P2PKH address.
A unified payment address MUST contain at least one shielded payment
address (typecodes :math:`\geq \mathtt{0x02}`).
The intended semantics is that the consumer of a unified payment
address SHOULD take the “best” address type that it supports from
the set, i.e. the first in the above list. For example, if the
unified payment address includes an Orchard address, and the consumer
supports sending funds to Orchard addresses, and no more recent
address format has been defined at the time of use, then the Orchard
address SHOULD be used.
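The selection rule above can be sketched as follows. This is only an illustration: the function name ``select_receiver``, the dictionary representation of a decoded address, and the typecode ``0x7F`` standing in for an unknown future Receiver Type are all assumptions made for the example, not part of this specification.

```python
# Preference order taken from the typecode list above: Orchard, Sapling,
# then transparent. A unified payment address holds at most one transparent
# address, so the relative order of 0x01 and 0x00 never matters.
PREFERENCE = [0x03, 0x02, 0x01, 0x00]


def select_receiver(receivers, supported):
    """Return (typecode, addr) for the most preferred usable address.

    `receivers` maps typecode -> raw address bytes (hypothetical decoded form).
    `supported` is the set of typecodes this sender is able to pay to.
    Typecodes outside PREFERENCE are never consulted, so an unrecognized
    constituent address behaves exactly as if it were absent.
    """
    for typecode in PREFERENCE:
        if typecode in supported and typecode in receivers:
            return typecode, receivers[typecode]
    return None  # nothing in the address is usable by this sender


# Placeholder address bytes; 0x7F is a made-up "future" typecode.
ua = {0x03: bytes(43), 0x02: bytes(43), 0x00: bytes(20), 0x7F: b"future"}
orchard_capable = select_receiver(ua, {0x00, 0x01, 0x02, 0x03})
sapling_only = select_receiver(ua, {0x00, 0x02})
```

A sender that supports Orchard picks typecode ``0x03`` here, while a sender limited to Sapling and P2PKH picks ``0x02``. The unknown typecode has no effect on either outcome.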
The raw encoding of a unified payment address is a concatenation of
:math:`(\mathtt{typecode}, \mathtt{length}, \mathtt{addr})` encodings
of the constituent addresses:
* :math:`\mathtt{typecode} : \mathtt{byte}` — the typecode from the
above list;
* :math:`\mathtt{length} : \mathtt{byte}` — the length in bytes of
:math:`\mathtt{addr}`;
* :math:`\mathtt{addr} : \mathtt{byte[length]}`
the raw encoding of a shielded payment address, or the :math:`160`-bit
script hash of a P2SH address [#P2SH]_, or the :math:`160`-bit
validating key hash of a P2PKH address [#P2PKH]_.
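As a rough sketch of the raw encoding just described, the following concatenates (typecode, length, addr) items. The function names are hypothetical, the address bytes are zero-filled placeholders, and emitting items in descending typecode order is merely one way to follow the list order given above for the currently defined typecodes.

```python
def encode_item(typecode: int, addr: bytes) -> bytes:
    """Encode one (typecode, length, addr) item; both header fields are one byte."""
    if not 0 <= typecode <= 0xFF:
        raise ValueError("typecode must fit in one byte")
    if len(addr) > 0xFF:
        raise ValueError("addr too long for a single-byte length field")
    return bytes([typecode, len(addr)]) + addr


def encode_raw_ua(receivers: dict) -> bytes:
    """Concatenate the items, highest typecode first (Orchard, Sapling, transparent)."""
    ordered = sorted(receivers, reverse=True)
    return b"".join(encode_item(t, receivers[t]) for t in ordered)


# Placeholder contents: 43-byte shielded raw addresses and a 20-byte key hash.
raw = encode_raw_ua({0x00: bytes(20), 0x02: bytes(43), 0x03: bytes(43)})
```

With one Orchard, one Sapling, and one P2PKH address, the result is 45 + 45 + 22 = 112 bytes long.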
The result of the concatenation is then encoded with Bech32m
[#bip-0350]_, ignoring any length restrictions. This is chosen over
Bech32 in order to better handle variable-length inputs.
For unified payment addresses on Mainnet, the Human-Readable Part (as
defined in [#bip-0350]_) is “``u``”. For unified payment addresses
on Testnet, the Human-Readable Part is “``utest``”.
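For illustration, here is a minimal Bech32m encoder in the style of the BIP 350 reference code, with no limit on payload length. It covers only the Bech32m step (not the address hardening transformation described later in this document), and the helper names are choices made for this sketch.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32M_CONST = 0x2BC830A3
GEN = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def polymod(values):
    """BCH checksum state over a sequence of 5-bit values."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GEN[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def to_5bit(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, zero-padding the final group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def bech32m_encode(hrp: str, payload: bytes) -> str:
    data = to_5bit(payload)
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ BECH32M_CONST
    checksum = [(pm >> (5 * (5 - i))) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def bech32m_checksum_ok(s: str) -> bool:
    """Check only the checksum; assumes lowercase input in the Bech32 charset."""
    hrp, _, rest = s.rpartition("1")
    return polymod(hrp_expand(hrp) + [CHARSET.index(c) for c in rest]) == BECH32M_CONST


# A 112-byte placeholder payload under the Mainnet Human-Readable Part.
encoded = bech32m_encode("u", bytes(range(112)))
```

The 112-byte payload becomes 180 data characters, so the full string is 188 characters including the ``u1`` prefix and the 6-character checksum.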
Notes:
* The :math:`\mathtt{length}` field is always encoded as a single
byte, *not* as a :math:`\mathtt{compactSize}`.
* For transparent addresses, the :math:`\mathtt{addr}` field does not
include the first two bytes of a raw encoding.
* There is intentionally no typecode defined for a Sprout shielded
payment address. Because it has not been possible to send funds into
the Sprout chain value pool since activation of ZIP 211 in the Canopy
network upgrade [#zip-0211]_, such a typecode would not be generally
useful.
* Consumers MUST ignore constituent addresses with typecodes they do
not recognize.
* Consumers MUST reject unified payment addresses in which the same
typecode appears more than once, or that include both P2SH and
P2PKH transparent addresses, or that contain only a transparent
address.
* Producers SHOULD order the constituent addresses in the same order
as the list of address types above. However, consumers MUST NOT
assume this ordering, and it does not affect which address should
be used by a consumer.
* There MUST NOT be additional bytes at the end of the encoding that
cannot be interpreted as specified above.
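The consumer rules in the notes above might be implemented along these lines. This is a sketch under assumptions: the name ``parse_raw_ua`` is hypothetical, per-type address lengths are not validated, and the "at least one shielded address" check is applied to every typecode :math:`\geq \mathtt{0x02}` seen in the input (following the requirement stated earlier), even if the consumer does not recognize that typecode.

```python
KNOWN_TYPECODES = {0x00, 0x01, 0x02, 0x03}


def parse_raw_ua(raw: bytes) -> dict:
    """Split a raw encoding into {typecode: addr}, enforcing the rules above.

    Unrecognized typecodes are parsed (so the items after them can be read)
    but are left out of the returned mapping.
    """
    seen = {}
    pos = 0
    while pos < len(raw):
        if pos + 2 > len(raw):
            raise ValueError("trailing byte that is not a complete item header")
        typecode, length = raw[pos], raw[pos + 1]
        addr = raw[pos + 2 : pos + 2 + length]
        if len(addr) != length:
            raise ValueError("addr field is truncated")
        if typecode in seen:
            raise ValueError("typecode appears more than once")
        seen[typecode] = addr
        pos += 2 + length
    if 0x00 in seen and 0x01 in seen:
        raise ValueError("both P2PKH and P2SH addresses present")
    if not any(t >= 0x02 for t in seen):
        raise ValueError("no shielded payment address present")
    return {t: a for t, a in seen.items() if t in KNOWN_TYPECODES}


# Placeholder data: an unknown item (0x7F), an Orchard item and a P2PKH item.
example = bytes([0x7F, 2, 1, 2]) + bytes([0x03, 43]) + bytes(43) + bytes([0x00, 20]) + bytes(20)
parsed = parse_raw_ua(example)
```

Because ordering is not assumed, the unknown item may appear anywhere in the sequence without affecting the result.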
Address hardening
-----------------
Security goal (**near second preimage resistance**):
* An adversary is given :math:`q` Unified Addresses, generated honestly.
* The attack goal is to produce a “partially colliding” valid Unified
Address that:
a) has a string encoding matching that of *one of* the input
addresses on some subset of characters (for concreteness, consider
the first :math:`n` and last :math:`m` characters, up to some bound
on :math:`n+m`);
b) is controlled by the adversary (for concreteness, the adversary
knows *at least one* of the private keys of the constituent
addresses).
Security goal (**nonmalleability**):
In this variant, part b) above is replaced by the meaning of the new
address being “usefully” different than the address it is based on, even
though the adversary does not know any of the private keys. For example,
if it were possible to delete a shielded constituent address from a UA
leaving only a transparent address, that would be a significant malleability
attack.
Discussion
''''''''''
There is a generic brute force attack against near second preimage
resistance. The adversary generates UAs at random with known keys, until
one has an encoding that partially collides with one of the :math:`q` target
addresses. It may be possible to improve on this attack by making use of
properties of checksums, etc.
The generic attack puts an upper bound on the achievable security: if it
takes work :math:`w` to produce and verify a UA, and the size of the character
set is :math:`c`, then the generic attack costs :math:`\sim \frac{w \cdot
c^{n+m}}{q}`.
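To make the bound concrete, the cost can be evaluated in bits. The sketch below assumes the 32-character Bech32m alphabet for :math:`c` and measures work in units of :math:`w` by default. The particular values of :math:`n+m` and :math:`q` are illustrative choices, not figures from this document.

```python
import math


def generic_attack_cost_log2(chars_matched, q=1, charset_size=32, work_log2=0.0):
    """log2 of w * c^(n+m) / q, with `chars_matched` playing the role of n+m."""
    return work_log2 + chars_matched * math.log2(charset_size) - math.log2(q)


# Matching 16 characters of a single target address: 32^16 = 2^80 attempts.
single_target = generic_attack_cost_log2(16)
# The same match against any of 1024 target addresses is 2^10 times cheaper.
many_targets = generic_attack_cost_log2(16, q=1024)
```

Each additional matched character adds 5 bits of work, and each doubling of :math:`q` removes one bit.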
Proposed solution
'''''''''''''''''
We use an unkeyed 4-round Feistel construction to approximate a random
permutation. (As explained below, 3 rounds would not be sufficient.)
Let :math:`H_i` be a hash personalized by :math:`i`, with maximum output
length :math:`\ell_H` bytes. Let :math:`G_i` be a XOF (a hash function with
extendable output length) based on :math:`H`, personalized by :math:`i`.
Given input :math:`M` of length :math:`\ell_M` bytes such that
:math:`22 \leq \ell_M \leq 16448`, define :math:`\mathsf{F4Jumble}(M)`
by:
* let :math:`\ell_L = \mathsf{min}(\ell_H, \mathsf{floor}(\ell_M/2))`
* let :math:`\ell_R = \ell_M - \ell_L`
* split :math:`M` into :math:`a` of length :math:`\ell_L` and :math:`b` of length :math:`\ell_R`
* let :math:`x = b \oplus G_0(a)`
* let :math:`y = a \oplus H_0(x)`
* let :math:`d = x \oplus G_1(y)`
* let :math:`c = y \oplus H_1(d)`
* return :math:`c \,||\, d`.
The first argument to BLAKE2b below is the personalization.
We instantiate :math:`H_i(u)` by
:math:`\mathsf{BLAKE2b}(8\ell_L)(“\mathtt{UA\_F4Jumble\_H\_}” \,||\,`
:math:`[i, 0], u)`.
We instantiate :math:`G_i(u)` as the first :math:`\ell_R` bytes of the
concatenation of
:math:`[\mathsf{BLAKE2b}\text{-}512(“\mathtt{UA\_F4Jumble\_G\_}” \,||\,`
:math:`[i, j], u) \text{ for } j \text{ from } 0 \text{ up to}`
:math:`\mathsf{ceiling}(\ell_R/\ell_H)-1]`.
.. figure:: zip-0316-f4.png
:align: center
:figclass: align-center
Diagram of 4-round unkeyed Feistel construction
(In practice the lengths :math:`\ell_L` and :math:`\ell_R` will be roughly
the same until :math:`\ell_M` is larger than :math:`128` bytes.)
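A direct transcription of the definition above into Python, using ``hashlib.blake2b`` with its ``person`` and ``digest_size`` parameters, might look like the following. It follows this draft exactly as written and has not been checked against any test vectors. The helper names ``_H``, ``_G`` and ``_xor`` are choices made for this sketch.

```python
import hashlib

L_H = 64  # maximum BLAKE2b output length in bytes


def _xor(p: bytes, q: bytes) -> bytes:
    return bytes(s ^ t for s, t in zip(p, q))


def _H(i: int, u: bytes, out_len: int) -> bytes:
    """H_i: BLAKE2b with an out_len-byte digest, personalized by [i, 0]."""
    person = b"UA_F4Jumble_H_" + bytes([i, 0])
    return hashlib.blake2b(u, digest_size=out_len, person=person).digest()


def _G(i: int, u: bytes, out_len: int) -> bytes:
    """G_i: concatenated BLAKE2b-512 outputs for j = 0, 1, ..., truncated to out_len."""
    blocks = []
    for j in range((out_len + L_H - 1) // L_H):
        person = b"UA_F4Jumble_G_" + bytes([i, j])
        blocks.append(hashlib.blake2b(u, digest_size=64, person=person).digest())
    return b"".join(blocks)[:out_len]


def f4jumble(m: bytes) -> bytes:
    if not 22 <= len(m) <= 16448:
        raise ValueError("input length outside the range for which F4Jumble is defined")
    l_l = min(L_H, len(m) // 2)
    a, b = m[:l_l], m[l_l:]          # |a| = l_L, |b| = l_R
    x = _xor(b, _G(0, a, len(b)))
    y = _xor(a, _H(0, x, l_l))
    d = _xor(x, _G(1, y, len(b)))
    c = _xor(y, _H(1, d, l_l))
    return c + d


message = bytes(range(112))          # placeholder 112-byte input
jumbled = f4jumble(message)
```

The output has the same length as the input, and changing any single input byte is expected to change both halves of the output.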
Usage for Unified Addresses
'''''''''''''''''''''''''''
The producer of a unified address applies :math:`\mathsf{F4Jumble}` to the
encoding of the sequence of (typecode, length, addr) before encoding it
with Bech32m.
The consumer rejects any Bech32m-decoded byte sequence that is less than
22 bytes; otherwise it applies :math:`\mathsf{F4Jumble}^{-1}` before
parsing the result. (22 bytes is the minimum size of a valid encoded
address sequence, corresponding to just a transparent address.)
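The consumer side can be sketched by running the four Feistel rounds in reverse. The block below repeats a compact forward direction only so that it is self-contained and the round trip can be demonstrated. As before, it follows this draft as written, the function names are hypothetical, and no test vectors have been consulted.

```python
import hashlib


def _xor(p, q):
    return bytes(s ^ t for s, t in zip(p, q))


def _H(i, u, n):
    return hashlib.blake2b(u, digest_size=n, person=b"UA_F4Jumble_H_" + bytes([i, 0])).digest()


def _G(i, u, n):
    out = b"".join(
        hashlib.blake2b(u, digest_size=64, person=b"UA_F4Jumble_G_" + bytes([i, j])).digest()
        for j in range((n + 63) // 64)
    )
    return out[:n]


def _check_length(m):
    if len(m) < 22:
        raise ValueError("byte sequence shorter than 22 bytes: reject")
    if len(m) > 16448:
        raise ValueError("byte sequence longer than F4Jumble supports")
    return min(64, len(m) // 2)


def f4jumble(m: bytes) -> bytes:
    l_l = _check_length(m)
    a, b = m[:l_l], m[l_l:]
    x = _xor(b, _G(0, a, len(b)))
    y = _xor(a, _H(0, x, l_l))
    d = _xor(x, _G(1, y, len(b)))
    c = _xor(y, _H(1, d, l_l))
    return c + d


def f4jumble_inv(m: bytes) -> bytes:
    """Undo f4jumble: recover y, x, a, b from (c, d) in that order."""
    l_l = _check_length(m)
    c, d = m[:l_l], m[l_l:]
    y = _xor(c, _H(1, d, l_l))
    x = _xor(d, _G(1, y, len(d)))
    a = _xor(y, _H(0, x, l_l))
    b = _xor(x, _G(0, a, len(x)))
    return a + b


original = bytes((7 * k) % 256 for k in range(112))   # placeholder payload
recovered = f4jumble_inv(f4jumble(original))
```

Because the split point depends only on the total length, the inverse can recompute :math:`\ell_L` from the decoded byte sequence without any extra framing.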
Heuristic analysis
''''''''''''''''''
A 3-round unkeyed Feistel, as shown, is not sufficient:
.. figure:: zip-0316-f3.png
:align: center
:figclass: align-center
Diagram of 3-round unkeyed Feistel construction
Suppose that an adversary has a target input/output pair
:math:`(a \,||\, b, c \,||\, d)`, and that the input to :math:`G_0` is
:math:`x`. By fixing :math:`x`, we can obtain another pair
:math:`((a \oplus t) \,||\, b', (c \oplus t) \,||\, d')` such that
:math:`a \oplus t` is close to :math:`a` and :math:`c \oplus t` is close
to :math:`c`.
(:math:`b'` and :math:`d'` will not be close to :math:`b` and :math:`d`,
but that isn't necessarily required for a valid attack.)
A 4-round Feistel thwarts this and similar attacks. Defining :math:`x` and
:math:`y` as the intermediate values in the first diagram above:
* if :math:`(x', y')` are fixed to the same values as :math:`(x, y)`, then
:math:`(a', b', c', d') = (a, b, c, d)`;
* if :math:`x' = x` but :math:`y' \neq y`, then the adversary is able to
introduce a controlled :math:`\oplus`-difference
:math:`a \oplus a' = y \oplus y'`, but the other three pieces
:math:`(b, c, d)` are all randomized, which is sufficient;
* if :math:`y' = y` but :math:`x' \neq x`, then the adversary is able to
introduce a controlled :math:`\oplus`-difference
:math:`d \oplus d' = x \oplus x'`, but the other three pieces
:math:`(a, b, c)` are all randomized, which is sufficient;
* if :math:`x' \neq x` and :math:`y' \neq y`, all four pieces are
randomized.
Note that the size of each piece is at least 11 bytes. TODO: analyze
whether this is sufficient when using 4 rounds.
It would be possible to make an attack more expensive by making the work
done by an address producer more expensive. (This wouldn't necessarily
have to increase the work done by the consumer.) However, given that
addresses may need to be produced on constrained computing platforms, this
was not considered beneficial overall.
Efficiency
''''''''''
The cost is dominated by 4 BLAKE2b compressions for :math:`\ell_M \leq 128`
bytes. A UA containing a transparent address, a Sapling address, and an
Orchard address, would have :math:`\ell_M = 112` bytes. The restriction
to a single address with a given typecode (and at most one transparent
address) means that this is also the maximum length as of NU5 activation.
For longer UAs (when other typecodes are added), the cost increases to 6
BLAKE2b compressions for :math:`128 < \ell_M \leq 192`, and 10 BLAKE2b
compressions for :math:`192 < \ell_M \leq 256`, for example. The maximum
cost for which the algorithm is defined would be 768 BLAKE2b compressions
at :math:`\ell_M = 16448` bytes. We will almost certainly never add enough
typecodes to reach that, and we might want to define a smaller limit.
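The compression counts quoted above can be reproduced with a small calculation. The sketch assumes that BLAKE2b consumes its input in 128-byte blocks, that the personalization adds no extra compression, and that each of the two :math:`H` calls hashes :math:`\ell_R` bytes while each of the two :math:`G` calls makes :math:`\mathsf{ceiling}(\ell_R/\ell_H)` single-block BLAKE2b invocations. The function name is hypothetical.

```python
def blake2b_compressions(l_m: int) -> int:
    """Estimate BLAKE2b compression calls for F4Jumble on an l_m-byte input."""
    if not 22 <= l_m <= 16448:
        raise ValueError("length outside the range for which F4Jumble is defined")
    l_l = min(64, l_m // 2)
    l_r = l_m - l_l
    h_cost = -(-l_r // 128)   # ceil(l_r / 128): one H call over an l_r-byte input
    g_cost = -(-l_r // 64)    # ceil(l_r / 64): one G call, one compression per 64 output bytes
    return 2 * h_cost + 2 * g_cost


costs = {l_m: blake2b_compressions(l_m) for l_m in (112, 128, 129, 192, 193, 256, 16448)}
```

Under these assumptions the function returns 4 up to 128 bytes, 6 up to 192 bytes, 10 up to 256 bytes, and 768 at the 16448-byte limit, matching the figures in the text.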
The memory usage, for a memory-optimized implementation, is roughly
:math:`\ell_M` bytes plus the size of a BLAKE2b hash state.
Dependencies
''''''''''''
BLAKE2b, with personalization and variable output length, is the only
external dependency. TODO: would it be useful to remove the requirement
for variable output length?
Related work
''''''''''''
`Eliminating Random Permutation Oracles in the Even-Mansour Cipher <https://www.iacr.org/cryptodb/data/paper.php?pubkey=218>`_
* This paper argues that a 4-round unkeyed Feistel is sufficient to
replace a random permutation in the Even-Mansour cipher construction.
`On the Round Security of Symmetric-Key Cryptographic Primitives <https://www.iacr.org/archive/crypto2000/18800377/18800377.pdf>`_
`LIONESS <https://www.cl.cam.ac.uk/~rja14/Papers/bear-lion.pdf>`_
* LIONESS is a similarly structured 4-round unbalanced Feistel cipher.
Open questions
--------------
Reference implementation
========================
Acknowledgements
================
The authors would like to thank Benjamin Winston, Zooko Wilcox, Francisco Gindre,
Marshall Gaucher, Joseph Van Geffen, Brad Miller, Deirdre Connolly, and Teor for
discussions on the subject of Unified Addresses.
References
==========
.. [#RFC2119] `RFC 2119: Key words for use in RFCs to Indicate Requirement Levels <https://www.rfc-editor.org/rfc/rfc2119.html>`_
.. [#protocol-nu5] `Zcash Protocol Specification, Version 2020.1.22 or later [NU5 proposal] <protocol/nu5.pdf>`_
.. [#protocol-saplingpaymentaddrencoding] `Zcash Protocol Specification, Version 2020.1.22 [NU5 proposal]. Section 5.6.3.1: Sapling Payment Addresses <protocol/nu5.pdf#saplingpaymentaddrencoding>`_
.. [#protocol-orchardpaymentaddrencoding] `Zcash Protocol Specification, Version 2020.1.22 [NU5 proposal]. Section 5.6.4.2: Orchard Raw Payment Addresses <protocol/nu5.pdf#orchardpaymentaddrencoding>`_
.. [#zip-0211] `ZIP 211: Disabling Addition of New Value to the Sprout Chain Value Pool <zip-0211.rst>`_
.. [#zip-0224] `ZIP 224: Orchard Shielded Protocol <zip-0224.rst>`_
.. [#zip-0321] `ZIP 321: Payment Request URIs <zip-0321.rst>`_
.. [#bip-0350] `BIP 350: Bech32m format for v1+ witness addresses <https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki>`_
.. [#P2PKH] `Transactions: P2PKH Script Validation — Bitcoin Developer Guide <https://developer.bitcoin.org/devguide/transactions.html#p2pkh-script-validation>`_
.. [#P2SH] `Transactions: P2SH Scripts — Bitcoin Developer Guide <https://developer.bitcoin.org/devguide/transactions.html#pay-to-script-hash-p2sh>`_