[ZcF-general] Grant project update - new PoW scheme
Solar Designer
solar at openwall.com
Mon May 6 16:15:21 EDT 2019
Hi,
This is another update on GrantProposals-2018Q2 #25 "review, tweaks, and
maybe design of a new PoW scheme for Zcash."
https://github.com/ZcashFoundation/GrantProposals-2018Q2/issues/25
This time, I thought of and experimented with a floating-point hack of
ProgPoW. As expected, it's complicated to get it right. I arrived at
some conclusions I recently posted as comments to:
On Sat, Apr 06, 2019 at 04:12:20PM +0200, Solar Designer wrote:
> "Make greater use of MADs"
> https://github.com/ifdefelse/ProgPOW/issues/34
Specifically, to reasonably switch to FP, it appears that ProgPoW needs
to be redesigned: instead of its single, repeated inner loop, its
"compiler" would generate a larger "unrolled" function, where
floating-point constraints could be enforced at irregular intervals,
just as needed to guarantee that no value can ever become unsafe
(uncertainty or fatal entropy loss).
Also, to continue indexing the ProgPoW cache - which is an important
component of ProgPoW's computational cost - a few (perhaps very few)
registers would need to remain with integers. I included more detail,
along with pros and cons of this approach, in a comment to the issue.
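
To illustrate what enforcing constraints "just as needed" could look
like, here is a minimal Python sketch. It is not ProgPoW code and not
the design from the issue: the op set ("mul", "add"), the "norm" step
(a hypothetical operation that forces a register back into [1, 2)), the
MAX_EXP limit, and all function names are assumptions made for
illustration. It tracks only a worst-case upper bound per register and
assumes all values are positive, so it ignores underflow and entropy
loss entirely.

```python
import random

MAX_EXP = 64  # assumed safety limit, well below FP32's largest exponent (127)

def insert_renormalizations(ops, num_regs, max_exp=MAX_EXP):
    """Return a copy of ops with ("norm", r) steps added only where needed.

    bound[r] == k means register r is known to hold a value in [1, 2**k).
    """
    bound = [1] * num_regs
    program = []
    for op, dst, a, b in ops:
        while True:
            if op == "mul":
                new = bound[a] + bound[b]          # exponents add
            else:  # "add"
                new = max(bound[a], bound[b]) + 1  # at most one extra bit
            if new <= max_exp:
                break
            # Worst case would be unsafe: renormalize the larger source first.
            worst = a if bound[a] >= bound[b] else b
            program.append(("norm", worst))
            bound[worst] = 1
        program.append((op, dst, a, b))
        bound[dst] = new
    return program

def random_ops(count, num_regs, seed):
    """Hypothetical stand-in for a generator of random (op, dst, a, b) tuples."""
    rng = random.Random(seed)
    return [(rng.choice(("mul", "add")),
             rng.randrange(num_regs), rng.randrange(num_regs),
             rng.randrange(num_regs))
            for _ in range(count)]
```

Calling insert_renormalizations(random_ops(40, 4, seed=1), 4) yields the
original 40 operations with "norm" steps interleaved wherever the
tracked bound would otherwise exceed the limit, so their spacing depends
on the generated program rather than on a fixed loop period.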
As I shared in another comment, while staying with integers only, I've
only been able to get to a (geometric) midpoint between official
ProgPoW's MUL/s rate and the theoretical maximum FP32 MUL/s rate: ~7x
higher than the original, but still ~9x lower than the theoretical
maximum.
(Of course, that theoretical maximum is for FP32 and not INT32, and it
assumes not doing any other work, whereas we need to do lots of that.)
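
The "geometric" part of that statement can be checked with a few lines
of arithmetic. This sketch uses only the two approximate ratios quoted
above (~7x and ~9x); the variable names are mine.

```python
import math

gain_over_official = 7.0   # ~7x higher than official ProgPoW's MUL/s rate
gap_to_fp32_max = 9.0      # still ~9x lower than the theoretical FP32 maximum

# Total spread between official ProgPoW and the theoretical maximum.
total_spread = gain_over_official * gap_to_fp32_max

# An exact geometric midpoint would be sqrt(total_spread) above the original.
geometric_midpoint = math.sqrt(total_spread)

# Position of the ~7x result on a log scale: 0.0 = official, 1.0 = maximum.
log_position = math.log(gain_over_official) / math.log(total_spread)

print(total_spread, geometric_midpoint, log_position)
```

With these inputs the total spread is about 63x and the exact geometric
midpoint is about 7.9x, so a ~7x gain sits just short of the middle on
a logarithmic scale.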
> On (repairing) Ethash's and ProgPoW's performance drop on older GPUs:
[...]
> I experimented with it some more, and got success at recovering the
> speed on NVIDIA Maxwell (aka GTX 9xx series GPUs, or two generations
> behind from latest RTX 2xxx):
>
> https://github.com/ifdefelse/ProgPOW/issues/26#issuecomment-480382319
>
> Specifically, combining a minor cleanup to untie the different
> parameters, a parameters tweak, and a code hack (not yet final, but
> works for proof-of-concept), I got a 3x+ speedup on Titan X Maxwell
I've started to maintain an unofficial fork of ProgPoW with a more
elaborate revision of this change and more in the "maxwell" branch here:
https://github.com/solardiz/ProgPOW
Another change I included is "Index cache with byte offsets", which also
breaks compatibility with official ProgPoW, but I hope will be accepted
upstream (for a new spec revision) as it provides a 1% speedup on new
GPUs (and further 2% speedup on Maxwell) with no ill effects, as far as
I can tell. I propose it upstream here:
https://github.com/ifdefelse/ProgPOW/issues/40
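
For readers unfamiliar with the general idea, here is a minimal Python
sketch of indexing a cache of 32-bit words by word index versus by byte
offset. It is an assumption about what such a change looks like in
general, not the actual code from the issue: the cache size, function
names, and the choice of which bits of the random value are used are
all hypothetical.

```python
import struct

CACHE_WORDS = 4096             # hypothetical size; must be a power of two
CACHE_BYTES = CACHE_WORDS * 4  # 32-bit words

def load_by_word_index(cache: bytes, rnd: int) -> int:
    """Reduce rnd to a word index, then scale it to bytes at access time."""
    index = rnd % CACHE_WORDS
    return struct.unpack_from("<I", cache, index * 4)[0]

def load_by_byte_offset(cache: bytes, rnd: int) -> int:
    """Reduce rnd directly to a 4-byte-aligned byte offset with one mask."""
    offset = rnd & (CACHE_BYTES - 4)
    return struct.unpack_from("<I", cache, offset)[0]

# Fill the cache so that word i holds the value i.
cache = struct.pack("<%dI" % CACHE_WORDS, *range(CACHE_WORDS))
print(load_by_word_index(cache, 5), load_by_byte_offset(cache, 5))
```

In this sketch the two functions select different words for the same
random value (the mask drops the two low bits instead of the high
ones), which illustrates why a change of this kind would not stay
compatible with hashes computed the old way.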
Yet another topic I arrived at an opinion on and brought up for
discussion with upstream is "Make cache content vary per-hash":
https://github.com/ifdefelse/ProgPOW/issues/41
Both of these are described in detail in the issues above, with
rationale and pros and cons (well, there are no cons for #40).
In related news, I heard of a planned ProgPoW audit by Least Authority:
https://github.com/ethereum-cat-herders/progpow-audit
https://medium.com/ethereum-cat-herders/progpow-audit-goals-expectations-75bb902a1f01
I'm in contact with Least Authority via e-mail, but as of this writing I
don't know what that project's status is, including whether it got
funded to a sufficient extent.
Regardless, as I tweeted last month, this is similar to yet different
from my work. As I understand, theirs will focus on ProgPoW as-is,
whereas I focus on tweaking ProgPoW. I think the projects can co-exist.
Alexander