ZIP 316 Work in Progress.

Signed-off-by: Daira Hopwood <daira@jacaranda.org>
2021-04-08 23:59:30 +01:00 · 3de014d33c
parent cb141ac91e
commit 3de014d33c
3 changed files with 547 additions and 2 deletions
--- a/zip-0316-f3.png
+++ b/zip-0316-f3.png
--- a/zip-0316-f4.png
+++ b/zip-0316-f4.png
--- a/zip-0316.rst
+++ b/zip-0316.rst
@ -3,11 +3,556 @@
  ZIP: 316
  Title: Unified Addresses
  Owners: Daira Hopwood <daira@electriccoin.co>
+          Nathan Wilcox <nathan@electriccoin.co>
+          Taylor Hornby <taylor@electriccoin.co>
          Jack Grigg <jack@electriccoin.co>
          Sean Bowe <sean@electriccoin.co>
          Kris Nuttycombe <kris@electriccoin.co>
          Ying Tong Lai <yingtong@electriccoin.co>
-          Nathan Wilcox <nathan@electriccoin.co>
-  Status: Reserved
+  Status: Proposed
  Category: Standards / RPC / Wallet
+  Created: 2021-04-07
+  License: MIT
  Discussions-To: <https://github.com/zcash/zips/issues/482>
+
+
+Terminology
+===========
+
+The key words "MUST", "MUST NOT", and "SHOULD" in this document are to
+be interpreted as described in RFC 2119. [#RFC2119]_
+
+The terms below are to be interpreted as follows:
+
+Recipient
+  A wallet or other software that can receive transfers of assets (such
+  as ZEC) or in the future potentially other transaction-based state changes.
+Sender
+  A wallet or other software that can send transfers of assets, or other
+  future consensus state side-effects.
+Receiver
+  The necessary information to transfer an asset to a Recipient that generated
+  that Receiver using a specific Transfer Protocol. Each Receiver is associated
+  unambiguously with a specific Receiver Type.
+Legacy Address (or LA)
+  A transparent, Sprout, or Sapling Address.
+Unified Address (or UA)
+  A Unified Address combines multiple Receivers.
+Address
+  Either a Legacy Address or a Unified Address.
+Transfer Protocol
+  A specification of how a Sender can transfer assets to a Recipient.
+  For example, the Transfer Protocol for a Sapling Receiver is the subset
+  of the Zcash protocol required to successfully transfer ZEC using Sapling
+  Spend/Output Transfers as specified in the Zcash Protocol Specification.
+  (A single Zcash transaction can contain transfers of multiple
+  Transfer Protocols. For example, a t→z transaction that shields to the
+  Sapling pool requires both Transparent and Sapling Transfer Protocols.)
+Address Encoding
+  The externally visible encoding of an Address (e.g. as a string of
+  characters or a QR code).
+
+
+Abstract
+========
+
+This proposal defines Unified Addresses, which bundle together Zcash addresses
+(or other payment methods) of different types in a way that can be presented as
+a single Address Encoding. It also defines Unified Viewing Keys, which perform
+a similar function for Zcash viewing keys.
+
+
+Motivation
+==========
+
+Up to and including the Canopy network upgrade, Zcash supported the following
+payment address types:
+
+* Transparent Addresses (P2PKH and P2SH)
+* Sprout Addresses
+* Sapling Addresses
+
+Each of these has its own Address Encodings, as a string and as a QR code.
+(Since the QR code is derivable from the string encoding, for many purposes
+it suffices to consider the string encoding.)
+
+The Orchard proposal [#zip-0224]_ adds a new address type, Orchard Addresses.
+
+The difficulty with defining new Address Encodings for each address type is
+that end-users are forced to be aware of the various types, and in particular
+which types are supported by a given Sender or Recipient. In order to make
+sure that transfers are completed successfully, users may be forced to
+explicitly generate addresses of different types and re-distribute encodings
+of them, which adds significant friction and cognitive overhead to
+understanding and using Zcash.
+
+The goals for a Unified Address standard are as follows:
+
+- Simplify coordination between Recipients and Senders using Zcash wallets
+  by removing the complexity of negotiating address types.
+- Provide a “bridging mechanism” to allow shielded wallets to successfully
+  interact with conformant Transparent-Only wallets.
+- Allow older conformant wallets to interact seamlessly with newer wallets.
+- Enable users of newer wallets to upgrade to newer transaction technologies
+  and/or pools while maintaining seamless interactions with counterparties
+  using older wallets.
+- Allow wallets to shoulder more sophisticated responsibilities for shielding
+  and/or migrating user funds.
+- Allow wallets to potentially develop new transfer mechanisms without
+  underlying protocol changes.
+- Provide forward compatibility that is standard for all wallets across a
+  range of potential future features. Some examples might include Layer 2
+  features, cross-chain interoperability and bridging, and decentralized
+  exchange.
+- The standard should work well for Zcash today and upcoming potential
+  upgrades, and also anticipate even broader use cases down the road such
+  as cross-chain functionality.
+
+
+Requirements
+============
+
+Overview
+--------
+
+Unified Addresses specify multiple methods for payment to a Recipient's Wallet.
+The Sender's Wallet can then non-interactively select the method of payment.
+
+Importantly, any wallet can support Unified Addresses, even when that wallet
+only supports a subset of payment methods for receiving and/or sending.
+
+Despite having some similar characteristics, the Unified Address standard is
+orthogonal to Payment Request URIs [#zip-0321]_ and similar schemes, and the
+Unified Address format is likely to be incorporated into such schemes as a new
+address type.
+
+Concepts
+--------
+
+Wallets follow a model *Interaction Flow* as follows:
+
+1. A Recipient *generates* an Address.
+2. The Recipient wallet or human user encodes the address, and
+   *distributes* this Address Encoding through mechanisms which may be
+   “out-of-band” (example: they spray paint a QR Code on a sign) or
+   they may be more or less “in-band” (example: they include a string
+   encoding of a ``Reply-To`` address in an encrypted memo following a
+   common standard).
+3. A Sender wallet or user *imports* the Address Encoding through any of
+   a variety of mechanisms (QR Code scanning, Payment URIs, cut-and-paste,
+   or “in-band” protocols like ``Reply-To`` memos). This includes decoding
+   and validity checks.
+4. (Perhaps later in time) the Sender wallet executes a transfer of ZEC
+   (or other assets or future protocol state changes) to the Address.
+
+These steps are a funnel: encodings of the same Address may be distributed
+zero or more times through different means. Zero or more Senders may import
+addresses. Zero or more of those may execute a Transfer. A single Sender may
+execute multiple Transfers over time from a single import.
+
+[TODO: examples]
+
+Addresses
+---------
+
+A Unified Address (or UA for short) combines one or more Receivers.
+
+When new Transfer Protocols are introduced to the Zcash protocol after
+Unified Addresses are standardized, those should introduce new Receiver Types
+but *not* different address types outside of the UA standard. There needs
+to be a compelling reason to deviate from the standard, since the benefits
+of UA come precisely from their applicability across all new protocol
+upgrades.
+
+Receivers
+---------
+
+Every Wallet must anticipate and properly parse a UA with any unknown
+arbitrary Receiver Type.
+
+When Transferring to a valid UA, a Sender must behave as if any unknown
+Receiver Type is simply not present for the purposes of the transfer.
+
+A Wallet may process unknown Receiver Types by indicating to the user
+their presence or similar information for usability or diagnostic purposes.
+
+Transport Encoding
+------------------
+
+The string encoding is “opaque” to human readers: it does *not* allow
+visual identification of which Receivers or Receiver Types are present.
+
+- Rationale: The general thinking behind UAs is to allow wallets to
+  streamline user experience (UX). If human users can parse a UA and
+  alter their behaviour based on that, then different users will end up
+  using the same wallet very differently; this complicates troubleshooting
+  and learning from other users or educational resources. Note that this
+  does not preclude a wallet from providing user-friendly displays or
+  indications about Receiver support, and the wallet's UX design can
+  decide when and how to do this and build a behavioural flow around that.
+
+The string encoding is resilient against typos, transcription errors,
+cut-and-paste errors, unanticipated truncation, or other anticipated
+UX hazards.
+
+There is a well-defined encoding of a Unified Address as a QR Code,
+which produces QR codes that are reasonably compact and robust.
+
+There is a well-defined transformation between the QR Code and string
+encodings in either direction.
+
+The string encoding fits into ZIP-321 Payment URIs [#zip-0321]_ and
+general URIs without introducing parse ambiguities.
+
+The encoding must support sufficiently many Receiver Types to allow
+for reasonable future expansion.
+
+The encoding must allow all wallets to safely and correctly parse out
+unknown Receiver Types well enough to ignore them.
+
+Transfers
+---------
+
+When executing a Transfer the Sender selects a transfer method via a
+Selection process.
+
+Given a valid UA, Selection must treat any unrecognized Receiver as
+though it were absent.
+
+- This property is crucial for forward compatibility to ensure users
+  who upgrade to newer protocols / UAs don't lose the ability to smoothly
+  interact with older wallets.
+
+- This property is crucial for allowing Transparent-Only UA-Conformant
+  wallets to interact with newer shielded wallets, removing a
+  disincentive for adopting newer shielded wallets.
+
+- This property also allows Transparent-Only wallets to upgrade to
+  shielded support without re-acquiring counterparty UAs. Even if UAs
+  are re-acquired, the user flow and usability will be minimally
+  disrupted.
+
+Open Issues and Known Concerns
+------------------------------
+
+FIXME: We have a few of these I [Nathan] will add in future edits.
+This is especially true of privacy impacts of transparent or cross-pool
+transactions and the associated UX issues.
+
+
+Non-requirements
+================
+
+...
+
+
+Specification
+=============
+
+Definitions
+-----------
+
+
+
+Encoding of Unified Payment Addresses
+-------------------------------------
+
+Rather than defining a Bech32 string encoding of Orchard shielded
+payment addresses, we instead define a unified payment address format
+that is able to encode a set of payment addresses of different types.
+This enables the consumer of an address to choose the best address
+type it supports, providing a better user experience as new formats
+are added in the future.
+
+Assume that we are given a set of one or more raw encodings of
+payment addresses of distinct types. That is, the set may optionally
+contain one of each of the payment address types in the following
+list:
+
+* typecode :math:`\mathtt{0x03}` — an Orchard raw address as defined
+  in [#protocol-orchardpaymentaddrencoding]_;
+
+* typecode :math:`\mathtt{0x02}` — a Sapling raw address as defined
+  in [#protocol-saplingpaymentaddrencoding]_;
+
+* typecode :math:`\mathtt{0x01}` — a transparent P2SH address, *or*
+  typecode :math:`\mathtt{0x00}` — a transparent P2PKH address.
+
+A unified payment address MUST contain at least one shielded payment
+address (typecodes :math:`\geq \mathtt{0x02}`).
+
+The intended semantics is that the consumer of a unified payment
+address SHOULD take the “best” address type that it supports from
+the set, i.e. the first in the above list. For example, if the
+unified payment address includes an Orchard address, and the consumer
+supports sending funds to Orchard addresses, and no more recent
+address format has been defined at the time of use, then the Orchard
+address SHOULD be used.
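+
+The selection rule above can be sketched in Python as follows. This is
+only an illustration: the names ``PREFERENCE`` and ``select_receiver``
+are hypothetical, and the sketch assumes the constituent addresses have
+already been parsed into a mapping from typecode to raw address bytes.
+

```python
# Typecodes from the list above, in decreasing order of preference.
PREFERENCE = [0x03, 0x02, 0x01, 0x00]


def select_receiver(receivers, supported):
    """Pick the "best" constituent address the consumer can pay to.

    receivers: dict mapping typecode -> raw address bytes (already parsed).
    supported: set of typecodes this consumer is able to send to.
    Returns (typecode, addr), or None if nothing usable is present.
    """
    for typecode in PREFERENCE:
        if typecode in receivers and typecode in supported:
            return typecode, receivers[typecode]
    return None
```

+
+A consumer that supports only Sapling and transparent transfers would
+call ``select_receiver(receivers, {0x00, 0x01, 0x02})`` and obtain the
+Sapling address when one is present, even if the UA also contains an
+Orchard address.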
+
+The raw encoding of a unified payment address is a concatenation of
+:math:`(\mathtt{typecode}, \mathtt{length}, \mathtt{addr})` encodings
+of the constituent addresses:
+
+* :math:`\mathtt{typecode} : \mathtt{byte}` — the typecode from the
+  above list;
+
+* :math:`\mathtt{length} : \mathtt{byte}` — the length in bytes of
+  :math:`\mathtt{addr}`;
+
+* :math:`\mathtt{addr} : \mathtt{byte[length]}` —
+  the raw encoding of a shielded payment address, or the :math:`160`-bit
+  script hash of a P2SH address [#P2SH]_, or the :math:`160`-bit
+  validating key hash of a P2PKH address [#P2PKH]_.
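+
+As a rough illustration of the concatenation described above, the
+following Python sketch builds the raw encoding from a sequence of
+``(typecode, addr)`` pairs. The function name ``encode_raw_ua`` is
+hypothetical, and the sketch does not check that each ``addr`` is a
+well-formed address of the stated type.
+

```python
def encode_raw_ua(receivers):
    """Concatenate (typecode, length, addr) encodings.

    receivers: iterable of (typecode, addr) pairs, where typecode is an
    int and addr is the raw address bytes for that typecode.
    """
    out = bytearray()
    for typecode, addr in receivers:
        if not 0 <= typecode <= 0xFF:
            raise ValueError("typecode must fit in a single byte")
        if len(addr) > 0xFF:
            raise ValueError("addr length must fit in a single byte")
        out.append(typecode)   # typecode : byte
        out.append(len(addr))  # length : byte
        out += addr            # addr : byte[length]
    return bytes(out)
```
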
+
+The result of the concatenation is then encoded with Bech32m
+[#bip-0350]_, ignoring any length restrictions. This is chosen over
+Bech32 in order to better handle variable-length inputs.
+
+For unified payment addresses on Mainnet, the Human-Readable Part (as
+defined in [#bip-0350]_) is “``u``”. For unified payment addresses
+on Testnet, the Human-Readable Part is “``utest``”.
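+
+For illustration, a minimal Bech32m encoder in Python is sketched below.
+It follows the checksum algorithm of BIP 350 but, as stated above,
+imposes no length restriction. The function names are illustrative and
+not part of this specification; ``payload`` stands for the byte sequence
+that is to be encoded.
+

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32M_CONST = 0x2BC830A3  # checksum constant from BIP 350
GENERATORS = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    """BCH checksum polynomial shared by Bech32 and Bech32m."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i, gen in enumerate(GENERATORS):
            if (top >> i) & 1:
                chk ^= gen
    return chk


def hrp_expand(hrp):
    """Expand the Human-Readable Part for checksum computation."""
    return [ord(ch) >> 5 for ch in hrp] + [0] + [ord(ch) & 31 for ch in hrp]


def convert_8_to_5(data):
    """Regroup bytes into 5-bit values, zero-padding the final group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = ((acc << 8) | byte) & 0xFFF  # keep only the bits still needed
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def bech32m_encode(hrp, payload):
    """Encode payload bytes as a Bech32m string with no length limit."""
    data = convert_8_to_5(payload)
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ BECH32M_CONST
    checksum = [(pm >> (5 * (5 - i))) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)
```

+
+With this sketch, a Mainnet encoding would be produced by
+``bech32m_encode("u", payload)`` and a Testnet encoding by
+``bech32m_encode("utest", payload)``.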
+
+Notes:
+
+* The :math:`\mathtt{length}` field is always encoded as a single
+  byte, *not* as a :math:`\mathtt{compactSize}`.
+
+* For transparent addresses, the :math:`\mathtt{addr}` field does not
+  include the first two bytes of a raw encoding.
+
+* There is intentionally no typecode defined for a Sprout shielded
+  payment address. Since it is no longer possible (since activation
+  of ZIP 211 in the Canopy network upgrade [#zip-0211]_) to send
+  funds into the Sprout chain value pool, this would not be generally
+  useful.
+
+* Consumers MUST ignore constituent addresses with typecodes they do
+  not recognize.
+
+* Consumers MUST reject unified payment addresses in which the same
+  typecode appears more than once, or that include both P2SH and
+  P2PKH transparent addresses, or that contain only a transparent
+  address.
+
+* Producers SHOULD order the constituent addresses in the same order
+  as the list of address types above. However, consumers MUST NOT
+  assume this ordering, and it does not affect which address should
+  be used by a consumer.
+
+* There MUST NOT be additional bytes at the end of the encoding that
+  cannot be interpreted as specified above.
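+
+The consumer-side rules in these notes can be sketched in Python as
+follows. The function name ``parse_raw_ua`` is hypothetical. The sketch
+reads the "at least one shielded payment address" rule as "at least one
+typecode greater than or equal to 0x02 is present, whether recognized
+or not", which is an interpretation of the text above rather than
+something it states. It does not validate the length or contents of
+each ``addr``.
+

```python
TRANSPARENT = {0x00, 0x01}       # P2PKH and P2SH typecodes
KNOWN = {0x00, 0x01, 0x02, 0x03}  # typecodes defined in the list above


def parse_raw_ua(raw):
    """Parse a raw unified payment address into {typecode: addr}.

    Unrecognized typecodes are skipped; malformed inputs raise ValueError.
    """
    seen = set()
    receivers = {}
    pos = 0
    while pos < len(raw):
        if len(raw) - pos < 2:
            raise ValueError("trailing bytes: incomplete header")
        typecode, length = raw[pos], raw[pos + 1]
        addr = raw[pos + 2 : pos + 2 + length]
        if len(addr) != length:
            raise ValueError("trailing bytes: addr shorter than length")
        pos += 2 + length
        if typecode in seen:
            raise ValueError("typecode appears more than once")
        seen.add(typecode)
        if typecode in KNOWN:  # ignore unrecognized typecodes
            receivers[typecode] = addr
    if TRANSPARENT <= seen:
        raise ValueError("both P2SH and P2PKH addresses present")
    if not any(t >= 0x02 for t in seen):
        raise ValueError("no shielded payment address present")
    return receivers
```
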
+
+
+Address hardening
+-----------------
+
+Security goal (**near second preimage resistance**):
+
+* An adversary is given :math:`q` Unified Addresses, generated honestly.
+* The attack goal is to produce a “partially colliding” valid Unified
+  Address that:
+
+  a) has a string encoding matching that of *one of* the input
+     addresses on some subset of characters (for concreteness, consider
+     the first :math:`n` and last :math:`m` characters, up to some bound
+     on :math:`n+m`);
+  b) is controlled by the adversary (for concreteness, the adversary
+     knows *at least one* of the private keys of the constituent
+     addresses).
+
+Security goal (**nonmalleability**):
+
+In this variant, part b) above is replaced by the meaning of the new
+address being “usefully” different than the address it is based on, even
+though the adversary does not know any of the private keys. For example,
+if it were possible to delete a shielded constituent address from a UA
+leaving only a transparent address, that would be a significant malleability
+attack.
+
+Discussion
+''''''''''
+
+There is a generic brute force attack against near second preimage
+resistance. The adversary generates UAs at random with known keys, until
+one has an encoding that partially collides with one of the :math:`q` target
+addresses. It may be possible to improve on this attack by making use of
+properties of checksums, etc.
+
+The generic attack puts an upper bound on the achievable security: if it
+takes work :math:`w` to produce and verify a UA, and the size of the character
+set is :math:`c`, then the generic attack costs :math:`\sim \frac{w \cdot
+c^{n+m}}{q}`.
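+
+To make the bound concrete, the following Python sketch evaluates the
+base-2 logarithm of the generic attack cost in units of :math:`w`. The
+function name is hypothetical, and the example parameters (a 32-character
+alphabet, as used by Bech32m, and 16 matched characters) are chosen only
+for illustration.
+

```python
import math


def generic_attack_cost_log2(c, matched_chars, q):
    """log2 of the expected generic attack work, in units of w.

    c: size of the character set.
    matched_chars: n + m, the number of characters that must collide.
    q: number of honestly generated target addresses.
    """
    return matched_chars * math.log2(c) - math.log2(q)
```

+
+For example, ``generic_attack_cost_log2(32, 16, 1)`` evaluates to
+``80.0``, i.e. about :math:`2^{80} \cdot w` work against a single
+target, and each doubling of :math:`q` removes one bit from that figure.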
+
+Proposed solution
+'''''''''''''''''
+
+We use an unkeyed 4-round Feistel construction to approximate a random
+permutation. (As explained below, 3 rounds would not be sufficient.)
+
+Let :math:`H_i` be a hash personalized by :math:`i`, with maximum output
+length :math:`\ell_H` bytes. Let :math:`G_i` be a XOF (a hash function with
+extendable output length) based on :math:`H`, personalized by :math:`i`.
+
+Given input :math:`M` of length :math:`\ell_M` bytes such that
+:math:`22 \leq \ell_M \leq 16448`, define :math:`\mathsf{F4Jumble}(M)`
+by:
+
+* let :math:`\ell_L = \mathsf{min}(\ell_H, \mathsf{floor}(\ell_M/2))`
+* let :math:`\ell_R = \ell_M - \ell_L`
+* split :math:`M` into :math:`a` of length :math:`\ell_L` and :math:`b` of length :math:`\ell_R`
+* let :math:`x = b \oplus G_0(a)`
+* let :math:`y = a \oplus H_0(x)`
+* let :math:`d = x \oplus G_1(y)`
+* let :math:`c = y \oplus H_1(d)`
+* return :math:`c \,||\, d`.
+   
+The first argument to BLAKE2b below is the personalization.
+
+We instantiate :math:`H_i(u)` by
+:math:`\mathsf{BLAKE2b‐}(8\ell_L)(“\mathtt{UA\_F4Jumble\_H\_}” \,||\,`
+:math:`[i, 0], u)`.
+
+We instantiate :math:`G_i(u)` as the first :math:`\ell_R` bytes of the
+concatenation of
+:math:`[\mathsf{BLAKE2b‐}512(“\mathtt{UA\_F4Jumble\_G\_}” \,||\,`
+:math:`[i, j], u) \text{ for } j \text{ from } 0 \text{ up to}`
+:math:`\mathsf{ceiling}(\ell_R/\ell_H)-1]`.
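+
+A minimal Python sketch of the construction as written above follows.
+It assumes :math:`\ell_H = 64` bytes (the maximum BLAKE2b output length)
+and uses the ``digest_size`` and ``person`` parameters of
+``hashlib.blake2b``. The function names are illustrative, and the
+output has not been checked against any test vectors.
+

```python
import hashlib

L_H = 64  # assumed maximum hash output length, in bytes


def _xor(p, q):
    return bytes(s ^ t for s, t in zip(p, q))


def _H(i, u, l_L):
    # BLAKE2b with an l_L-byte digest, personalized by "UA_F4Jumble_H_" || [i, 0]
    person = b"UA_F4Jumble_H_" + bytes([i, 0])
    return hashlib.blake2b(u, digest_size=l_L, person=person).digest()


def _G(i, u, l_R):
    # First l_R bytes of BLAKE2b-512 outputs for j = 0 .. ceiling(l_R / L_H) - 1
    blocks = -(-l_R // L_H)
    out = b"".join(
        hashlib.blake2b(
            u, digest_size=64, person=b"UA_F4Jumble_G_" + bytes([i, j])
        ).digest()
        for j in range(blocks)
    )
    return out[:l_R]


def _lengths(M):
    if not 22 <= len(M) <= 16448:
        raise ValueError("input length out of range")
    l_L = min(L_H, len(M) // 2)
    return l_L, len(M) - l_L


def f4jumble(M):
    l_L, l_R = _lengths(M)
    a, b = M[:l_L], M[l_L:]
    x = _xor(b, _G(0, a, l_R))
    y = _xor(a, _H(0, x, l_L))
    d = _xor(x, _G(1, y, l_R))
    c = _xor(y, _H(1, d, l_L))
    return c + d


def f4jumble_inv(M):
    # Undo the four rounds in reverse order.
    l_L, l_R = _lengths(M)
    c, d = M[:l_L], M[l_L:]
    y = _xor(c, _H(1, d, l_L))
    x = _xor(d, _G(1, y, l_R))
    a = _xor(y, _H(0, x, l_L))
    b = _xor(x, _G(0, a, l_R))
    return a + b
```

+
+Because each round only XORs one half with a function of the other
+half, the inverse needs no inverse of :math:`H_i` or :math:`G_i`; it
+recomputes the same hashes in the opposite order.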
+
+.. figure:: zip-0316-f4.png
+    :align: center
+    :figclass: align-center
+
+    Diagram of 4-round unkeyed Feistel construction
+
+(In practice the lengths :math:`\ell_L` and :math:`\ell_R` will be roughly
+the same until :math:`\ell_M` is larger than :math:`128` bytes.)
+
+Usage for Unified Addresses
+'''''''''''''''''''''''''''
+
+The producer of a unified address applies :math:`\mathsf{F4Jumble}` to the
+encoding of the sequence of (typecode, length, addr) before encoding it
+with Bech32m.
+
+The consumer rejects any Bech32m-decoded byte sequence that is less than
+22 bytes; otherwise it applies :math:`\mathsf{F4Jumble}^{-1}` before
+parsing the result. (22 bytes is the minimum size of a valid encoded
+address sequence, corresponding to just a transparent address.)
+
+Heuristic analysis
+''''''''''''''''''
+
+A 3-round unkeyed Feistel, as shown, is not sufficient:
+
+.. figure:: zip-0316-f3.png
+    :align: center
+    :figclass: align-center
+
+    Diagram of 3-round unkeyed Feistel construction
+
+Suppose that an adversary has a target input/output pair
+:math:`(a \,||\, b, c \,||\, d)`, and that the input to :math:`G_0` is
+:math:`x`. By fixing :math:`x`, we can obtain another pair
+:math:`((a \oplus t) \,||\, b', (c \oplus t) \,||\, d')` such that
+:math:`a \oplus t` is close to :math:`a` and :math:`c \oplus t` is close
+to :math:`c`.
+(:math:`b'` and :math:`d'` will not be close to :math:`b` and :math:`d`,
+but that isn't necessarily required for a valid attack.)
+
+A 4-round Feistel thwarts this and similar attacks. Defining :math:`x` and
+:math:`y` as the intermediate values in the first diagram above:
+
+* if :math:`(x', y')` are fixed to the same values as :math:`(x, y)`, then
+  :math:`(a', b', c', d') = (a, b, c, d)`;
+
+* if :math:`x' = x` but :math:`y' \neq y`, then the adversary is able to
+  introduce a controlled :math:`\oplus`-difference
+  :math:`a \oplus a' = y \oplus y'`, but the other three pieces
+  :math:`(b, c, d)` are all randomized, which is sufficient;
+
+* if :math:`y' = y` but :math:`x' \neq x`, then the adversary is able to
+  introduce a controlled :math:`\oplus`-difference
+  :math:`d \oplus d' = x \oplus x'`, but the other three pieces
+  :math:`(a, b, c)` are all randomized, which is sufficient;
+
+* if :math:`x' \neq x` and :math:`y' \neq y`, all four pieces are
+  randomized.
+
+Note that the size of each piece is at least 11 bytes. TODO: analyze
+whether this is sufficient when using 4 rounds.
+
+It would be possible to make an attack more expensive by making the work
+done by an address producer more expensive. (This wouldn't necessarily
+have to increase the work done by the consumer.) However, given that
+addresses may need to be produced on constrained computing platforms,
+this was not considered beneficial overall.
+
+Efficiency
+''''''''''
+
+The cost is dominated by 4 BLAKE2b compressions for :math:`\ell_M \leq 128`
+bytes. A UA containing a transparent address, a Sapling address, and an
+Orchard address, would have :math:`\ell_M = 112` bytes. The restriction
+to a single address with a given typecode (and at most one transparent
+address) means that this is also the maximum length as of NU5 activation.
+
+For longer UAs (when other typecodes are added), the cost increases to 6
+BLAKE2b compressions for :math:`128 < \ell_M \leq 192`, and 10 BLAKE2b
+compressions for :math:`192 < \ell_M \leq 256`, for example. The maximum
+cost for which the algorithm is defined would be 768 BLAKE2b compressions
+at :math:`\ell_M = 16448` bytes. We will almost certainly never add enough
+typecodes to reach that, and we might want to define a smaller limit.
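+
+The compression counts quoted above can be reproduced with the short
+Python sketch below. It assumes a 128-byte BLAKE2b block, a 64-byte
+maximum digest, and that personalization adds no extra compression.
+The function name is hypothetical.
+

```python
def f4jumble_compressions(l_M):
    """Count BLAKE2b compressions used by F4Jumble on an l_M-byte input."""
    if not 22 <= l_M <= 16448:
        raise ValueError("input length out of range")
    l_L = min(64, l_M // 2)
    l_R = l_M - l_L
    # Each G_i makes ceiling(l_R / 64) BLAKE2b-512 calls on an l_L-byte
    # input, and l_L <= 64 fits in a single 128-byte block.
    g_compressions = -(-l_R // 64)
    # Each H_i hashes an l_R-byte input, one compression per 128-byte block.
    h_compressions = -(-l_R // 128)
    # Two G rounds and two H rounds in total.
    return 2 * g_compressions + 2 * h_compressions
```
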
+
+The memory usage, for a memory-optimized implementation, is roughly
+:math:`\ell_M` bytes plus the size of a BLAKE2b hash state.
+
+Dependencies
+''''''''''''
+
+BLAKE2b, with personalization and variable output length, is the only
+external dependency. TODO: would it be useful to remove the requirement
+for variable output length?
+
+Related work
+''''''''''''
+
+`Eliminating Random Permutation Oracles in the Even–Mansour Cipher <https://www.iacr.org/cryptodb/data/paper.php?pubkey=218>`_
+
+* This paper argues that a 4-round unkeyed Feistel is sufficient to
+  replace a random permutation in the Even–Mansour cipher construction.
+
+`On the Round Security of Symmetric-Key Cryptographic Primitives <https://www.iacr.org/archive/crypto2000/18800377/18800377.pdf>`_
+
+LIONESS: https://www.cl.cam.ac.uk/~rja14/Papers/bear-lion.pdf
+
+* LIONESS is a similarly structured 4-round unbalanced Feistel cipher.
+
+Open questions
+--------------
+
+
+Reference implementation
+========================
+
+
+Acknowledgements
+================
+
+The authors would like to thank Benjamin Winston, Zooko Wilcox, Francisco Gindre,
+Marshall Gaucher, Joseph Van Geffen, Brad Miller, Deirdre Connolly, and Teor for
+discussions on the subject of Unified Addresses.
+
+
+References
+==========
+
+.. [#RFC2119] `RFC 2119: Key words for use in RFCs to Indicate Requirement Levels <https://www.rfc-editor.org/rfc/rfc2119.html>`_
+.. [#protocol-nu5] `Zcash Protocol Specification, Version 2020.1.22 or later [NU5 proposal] <protocol/nu5.pdf>`_
+.. [#protocol-saplingpaymentaddrencoding] `Zcash Protocol Specification, Version 2020.1.22 [NU5 proposal]. Section 5.6.3.1: Sapling Payment Addresses <protocol/nu5.pdf#saplingpaymentaddrencoding>`_
+.. [#protocol-orchardpaymentaddrencoding] `Zcash Protocol Specification, Version 2020.1.22 [NU5 proposal]. Section 5.6.4.2: Orchard Raw Payment Addresses <protocol/nu5.pdf#orchardpaymentaddrencoding>`_
+.. [#zip-0211] `ZIP 211: Disabling Addition of New Value to the Sprout Chain Value Pool <zip-0211.rst>`_
+.. [#zip-0224] `ZIP 224: Orchard Shielded Protocol <zip-0224.rst>`_
+.. [#zip-0321] `ZIP 321: Payment Request URIs <zip-0321.rst>`_
+.. [#bip-0350] `BIP 350: Bech32m format for v1+ witness addresses <https://github.com/bitcoin/bips/blob/master/bip-0350.mediawiki>`_
+.. [#P2PKH] `Transactions: P2PKH Script Validation — Bitcoin Developer Guide <https://developer.bitcoin.org/devguide/transactions.html#p2pkh-script-validation>`_
+.. [#P2SH] `Transactions: P2SH Scripts — Bitcoin Developer Guide <https://developer.bitcoin.org/devguide/transactions.html#pay-to-script-hash-p2sh>`_