equihash/blake2-avx2
John Tromp 525a16a1ff replace USE_AVX2 by NBLAKES; try x8 as well as x4 2016-11-11 23:17:01 -05:00
File                        Last commit                                             Date
LICENSE                     try make blake2bip work                                 2016-10-26 21:52:46 -04:00
Makefile                    try make blake2bip work                                 2016-10-26 21:52:46 -04:00
README.md                   try make blake2bip work                                 2016-10-26 21:52:46 -04:00
blake2.h                    try make blake2bip work                                 2016-10-26 21:52:46 -04:00
blake2b-common.h            try make blake2bip work                                 2016-10-26 21:52:46 -04:00
blake2b-load-avx2-simple.h  try make blake2bip work                                 2016-10-26 21:52:46 -04:00
blake2bip.c                 replace USE_AVX2 by NBLAKES; try x8 as well as x4       2016-11-11 23:17:01 -05:00
blake2bip.h                 optimize dupe test                                      2016-11-10 21:34:16 -05:00

README.md

BLAKE2 AVX2 implementations

This is experimental code implementing BLAKE2 using the AVX2 instruction set present in the Intel Haswell and later microarchitectures.

It currently implements BLAKE2b, BLAKE2bp, and BLAKE2sp using three similar approaches that differ in how the message words are permuted: the first lets the compiler choose the permutation instructions, the second permutes manually, and the third uses the gather instructions introduced with AVX2. Current recorded speeds for long messages are:

  • 3.19 cycles per byte on Haswell for BLAKE2b;
  • 1.45 cycles per byte on Haswell for BLAKE2bp;
  • 1.56 cycles per byte on Haswell for BLAKE2sp.
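
To make the message permutation step concrete, here is a minimal portable sketch of what the gather-based approach computes for BLAKE2b. It is not the code in this directory: `gather4` and `column_first_words` are hypothetical helper names, and the scalar loop only stands in for a 4-lane 64-bit AVX2 gather such as the `_mm256_i32gather_epi64` intrinsic. Only the first two rows of the message schedule from the BLAKE2 specification are included.

```c
#include <assert.h>
#include <stdint.h>

/* First two rows of the BLAKE2b message schedule (sigma).
 * The full table in the specification has more rows; two are
 * enough to show the idea. */
static const uint8_t SIGMA[2][16] = {
    {  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15 },
    { 14, 10,  4,  8,  9, 15, 13,  6,  1, 12,  0,  2, 11,  7,  5,  3 },
};

/* Hypothetical helper: scalar stand-in for a 4-lane 64-bit gather.
 * Each output lane loads the message word selected by its index. */
static void gather4(uint64_t out[4], const uint64_t m[16],
                    const uint8_t idx[4])
{
    for (int lane = 0; lane < 4; lane++) {
        assert(idx[lane] < 16);
        out[lane] = m[idx[lane]];
    }
}

/* Hypothetical helper: builds the first message vector of the column
 * step for the given round (0 or 1 in this sketch). The four column G
 * functions take words sigma[round][0], [2], [4] and [6] as their
 * first message input, so those four words form one 4-lane vector. */
static void column_first_words(uint64_t out[4], const uint64_t m[16],
                               int round)
{
    uint8_t idx[4];
    assert(round >= 0 && round < 2);
    for (int lane = 0; lane < 4; lane++)
        idx[lane] = SIGMA[round][2 * lane];
    gather4(out, m, idx);
}
```

Calling `column_first_words(v, m, 1)` fills `v` with message words 14, 4, 9 and 13, in that order. In the other two approaches the same four words would be assembled from vector loads followed by permute or shuffle instructions, chosen either by the compiler or by hand.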