FAQ: Why t1ha don’t follow NH-approach like FARSH, XXH3, HighwayHash and so on?
Okay, just for clarity, we should distinguish functions families: MMH (Multilinear-Modular-Hashing), NMH (Non-linear Modular-Hashing) and the next simplification step UMAC’s NH.
Now take a look to NH hash-function family definition:
It is very SIMD-friendly, since SSE2’s _mm_add_epi32()
and
_mm_mul_epu32()
is enough for .
On the other hand, the result of the inner multiplication becomes zero
when (m[2i] + k[2i]) mod 2^32 == 0 or (m[2i+1] + k[2i+1]) mod
2^32 == 0, in which case the opposite multiplier will not affect the
result of hashing, i.e. NH function just ignores part of the input data.
I called this an “blinding multiplication”. That’s all.
More useful related information can be googled by “UMAC NH key
recovery
attack”.
The right NMH/NH code without entropy loss should be looking like this:
uint64_t proper_NH_block(const uint32_t *M /* message data */,
const uint64_t *K /* 64-bit primes */,
size_t N_even, uint64_t optional_weak_seed) {
uint64_t H = optional_weak_seed;
for (size_t i = 0; i < N_even / 2; ++i)
H += (uint64_t(M[i*2]) + K[i*2]) * (uint64_t(M[i*2+1]) + K[i*2+1]);
return H;
}
Planned: t1ha4
= 128 and 256 bits, fast insecure fingerprinting
Planned: t1ha5
= 256 bits, fast Cryptographic, but with some limitations
Planned: t1ha6
= 256 and 512 bits, Cryptographic with reasonable resistance to acceleration on GPU and FPGA.
Planned: t1ha7
= 256, 512 and 1024 bits, Cryptographic, Strong Post-Quantum
Acknowledgement:
The t1ha was originally developed by Leonid Yuriev (Леонид Юрьев) for The 1Hippeus project - zerocopy messaging in the spirit of Sparta!
The --hash-stdin-strings
option
One noteable option is --hash-stdin-strings
, it intended to estimate hash collisions on your custom data.
With this option test
tool will hash each line from standard input and print its hash to standard output.
For instance, you could count collisions for lines from some words.list
file by bash’s command:
./t1ha/test --hash-stdin-strings < words.list | sort | uniq -c -d | wc -l
More complex example - count xxhash()
collisions for lines from words.list
and 0…10000 numbers,
with distinction only in 32 bit of hash values:
(cat words.list && seq 0 10000) | \
./t1ha/test --xxhash --hash-stdin-strings | \
cut --bytes=-8 | sort | uniq -c -d | wc -l
SMHasher
SMHasher is a wellknown test suite designed to test the distribution, collision, and performance properties of non-cryptographic hash functions.
Reini Urban provides extended version/fork of SMHasher which integrates a lot of modern hash functions, including t1ha.
So, the quality and speed of t1ha can be easily checked with the following scenario:
git clone https://github.com/rurban/smhasher
cd smhasher
cmake .
make
./SMHasher City64
./SMHasher metrohash64_1
./SMHasher xxHash64
...
./SMHasher t1ha
For properly performance please use at least GCC 5.5, Clang 6.0 or Visual Studio 2017.
Scores
Please take in account that the results is significantly depend on actual CPU, compiler version and CFLAGS. The results below were obtained in 2016 with:
- CPU:
Intel(R) Core(TM) i7-6700K CPU
; - Compiler:
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
; - CFLAGS:
-march=native -O3 -fPIC
;
The SMALL KEYS case
Order by average Cycles per Hash for 1..31 bytes (less is better).
Function | MiB/Second | Cycles/Hash | Notes (quality, portability) |
---|---|---|---|
donothing | 15747227.36 | 6.00 | not a hash (just for reference) |
sumhash32 | 43317.86 | 16.69 | not a hash (just for reference) |
FNV1a_YoshimitsuTRIAD | 13000.49 | 24.96 | poor (100% bias, collisions, distrib) |
crc64_hw | 7308.06 | 28.37 | poor (insecure, 100% bias, collisions, distrib), non-portable (SSE4.2) |
crc32_hw | 5577.64 | 29.10 | poor (insecure, 100% bias, collisions, distrib), non-portable (SSE4.2) |
NOP_OAAT_read64 | 1991.31 | 30.46 | poor (100% bias, 2.17x collisions) |
Crap8 | 2743.80 | 32.50 | poor (2.42% bias, collisions, 2% distrib) |
t1ha_aes | 34636.42 | 33.03 | non-portable (AES-NI) |
t1ha | 12228.80 | 35.55 | |
MUM | 10246.20 | 37.25 | non-portable (different result, machine specific) |
Murmur2 | 2789.89 | 38.37 | poor (1.7% bias, 81x coll, 1.7% distrib) |
t1ha_32le | 5958.54 | 38.54 | alien (designed for 32-bit CPU) |
t1ha_64be | 9321.23 | 38.29 | alien (designed for big-endian CPU) |
lookup3 | 1817.11 | 39.30 | poor (28% bias, collisions, 30% distrib) |
t1ha_32be | 5873.45 | 39.81 | alien (designed for 32-bit big-endian CPU) |
Murmur2C | 3655.60 | 42.68 | poor (91% bias, collisions, distrib) |
fasthash64 | 5578.06 | 43.42 | |
Murmur2A | 2789.85 | 43.38 | poor (12.7% bias) |
xxHash32 | 5513.55 | 43.72 | |
Murmur2B | 5578.21 | 44.13 | weak (1.8% bias, collisions, distrib) |
fasthash32 | 5381.46 | 45.50 | |
cmetrohash64_1_optshort | 11808.92 | 46.33 | seems weak (likely cyclic collisions) |
metrohash64_2 | 12113.12 | 46.88 | seems weak (likely cyclic collisions) |
cmetrohash64_1 | 12081.32 | 47.28 | seems weak (likely cyclic collisions) |
metrohash64_1 | 12024.68 | 47.21 | seems weak (likely cyclic collisions) |
Murmur3F | 5473.62 | 47.37 | |
superfast | 1860.25 | 47.45 | poor (91% bias, 5273.01x collisions, 37% distrib) |
cmetrohash64_2 | 12052.58 | 48.66 | |
Murmur3A | 2232.00 | 48.16 | |
City32 | 5014.33 | 51.13 | far to perfect (2 minor collisions) |
City64 | 11041.72 | 51.77 | |
metrohash64crc_2 | 20582.76 | 51.39 | seems weak (likely cyclic collisions), non-portable (SSE4.2) |
sumhash | 9668.13 | 51.31 | not a hash (just for reference) |
metrohash64crc_1 | 21319.23 | 52.36 | weak (cyclic collisions), non-portable (SSE4.2) |
PMurHash32 | 2232.26 | 53.18 | |
Murmur3C | 3719.22 | 54.05 | |
bernstein | 921.43 | 55.17 | poor (100% bias, collisions, distrib) |
xxHash64 | 11123.15 | 56.17 | |
Spooky32 | 11464.20 | 59.45 | |
City128 | 12551.54 | 60.93 | |
FarmHash64 | 12145.36 | 60.12 | non-portable (SSE4.2) |
Spooky128 | 11735.99 | 60.45 | weak (collisions with 4bit diff) |
Spooky64 | 11820.20 | 60.39 | |
CityCrc128 | 14821.82 | 62.38 | non-portable (SSE4.2) |
MicroOAAT | 826.32 | 62.06 | poor (100% bias, distrib) |
metrohash128_1 | 11063.78 | 66.58 | seems weak (likely cyclic collisions) |
metrohash128_2 | 11465.18 | 66.72 | weak (cyclic collisions) |
GoodOAAT | 930.18 | 68.24 | |
metrohash128crc_1 | 21322.80 | 70.33 | seems weak (likely cyclic collisions), non-portable (SSE4.2) |
metrohash128crc_2 | 20990.70 | 70.40 | seems weak (likely cyclic collisions), non-portable (SSE4.2) |
farmhash64_c | 12033.13 | 71.30 | non-portable (SSE4.2) |
sdbm | 695.29 | 71.76 | poor (100% bias, collisions, distrib) |
FNV1a | 684.17 | 72.75 | poor (zeros, 100% bias, collisions, distrib) |
FNV64 | 697.67 | 72.70 | poor (100% bias, collisions, distrib) |
FarmHash128 | 12515.98 | 77.43 | non-portable (SSE4.2) |
hasshe2 | 2587.39 | 81.23 | poor (insecure, 100% bias, collisions, distrib), non-portable (SSE2) |
BadHash | 558.14 | 87.87 | not a hash (just for reference) |
x17 | 551.99 | 89.24 | poor (99.98% bias, collisions, distrib) |
JenkinsOOAT_perl | 558.14 | 95.26 | poor (1.5-11.5% bias, 7.2x collisions) |
farmhash128_c | 12709.06 | 96.42 | non-portable (SSE4.1) |
MurmurOAAT | 465.12 | 107.61 | poor (collisions, 99.99% distrib) |
JenkinsOOAT | 558.13 | 116.75 | poor (53.5% bias, collisions, distrib) |
falkhash | 8909.54 | 124.48 | non-portable (AES-NI) |
crc32 | 342.27 | 142.06 | poor (insecure, 8589.93x collisions, distrib) |
SipHash | 962.35 | 147.36 | |
md5_32a | 433.03 | 508.98 | |
sha1_32a | 531.44 | 1222.44 |
The LARGE KEYS case
Order by hashing speed in Mi-bytes (2^20 = 1048576) per second for 262144-byte block (more is better).
Function | MiB/Second | Cycles/Hash | Notes (quality, portability) |
---|---|---|---|
donothing | 15747227.36 | 6.00 | not a hash (just for reference) |
sumhash32 | 43317.86 | 16.69 | not a hash (just for reference) |
t1ha_aes | 34636.42 | 33.03 | non-portable (AES-NI) |
metrohash128crc_1 | 21322.80 | 70.33 | seems weak (likely cyclic collisions), non-portable (SSE4.2) |
metrohash64crc_1 | 21319.23 | 52.36 | seems weak (cyclic collisions), non-portable (SSE4.2) |
metrohash128crc_2 | 20990.70 | 70.40 | seems weak (likely cyclic collisions), non-portable (SSE4.2) |
metrohash64crc_2 | 20582.76 | 51.39 | seems weak (likely cyclic collisions), non-portable (SSE4.2) |
CityCrc128 | 14821.82 | 62.38 | non-portable (SSE4.2) |
FNV1a_YoshimitsuTRIAD | 13000.49 | 24.96 | poor (100% bias, collisions, distrib) |
farmhash128_c | 12709.06 | 96.42 | non-portable (SSE4.1) |
City128 | 12551.54 | 60.93 | |
FarmHash128 | 12515.98 | 77.43 | non-portable (SSE4.2) |
t1ha | 12228.80 | 35.55 | |
FarmHash64 | 12145.36 | 60.12 | non-portable (SSE4.2) |
metrohash64_2 | 12113.12 | 46.88 | seems weak (likely cyclic collisions) |
cmetrohash64_1 | 12081.32 | 47.28 | seems weak (likely cyclic collisions) |
cmetrohash64_2 | 12052.58 | 48.66 | seems weak (likely cyclic collisions) |
farmhash64_c | 12033.13 | 71.30 | non-portable (SSE4.2) |
metrohash64_1 | 12024.68 | 47.21 | seems weak (likely cyclic collisions) |
Spooky64 | 11820.20 | 60.39 | |
cmetrohash64_1_optshort | 11808.92 | 46.33 | seems weak (likely cyclic collisions) |
Spooky128 | 11735.99 | 60.45 | weak (collisions with 4-bit diff) |
metrohash128_2 | 11465.18 | 66.72 | weak (cyclic collisions) |
Spooky32 | 11464.20 | 59.45 | |
xxHash64 | 11123.15 | 56.17 | |
metrohash128_1 | 11063.78 | 66.58 | seems weak (likely cyclic collisions) |
City64 | 11041.72 | 51.77 | |
MUM | 10246.20 | 37.25 | non-portable (different result, machine specific) |
sumhash | 9668.13 | 51.31 | not a hash (just for reference) |
t1ha_64be | 9321.23 | 38.29 | alien (designed for big-endian CPU) |
falkhash | 8909.54 | 124.48 | non-portable (AES-NI) |
crc64_hw | 7308.06 | 28.37 | poor (insecure, 100% bias, collisions, distrib), non-portable (SSE4.2) |
t1ha_32le | 5958.54 | 38.54 | alien (designed for 32-bit CPU) |
t1ha_32be | 5873.45 | 39.81 | alien (designed for 32-bit big-endian CPU) |
fasthash64 | 5578.06 | 43.42 | |
Murmur2B | 5578.21 | 44.13 | weak (1.8% bias, collisions, distrib) |
crc32_hw | 5577.64 | 29.10 | poor (insecure, 100% bias, collisions, distrib), non-portable (SSE4.2) |
xxHash32 | 5513.55 | 43.72 | |
Murmur3F | 5473.62 | 47.37 | |
fasthash32 | 5381.46 | 45.50 | |
City32 | 5014.33 | 51.13 | far to perfect (2 minor collisions) |
Murmur3C | 3719.22 | 54.05 | |
Murmur2C | 3655.60 | 42.68 | poor (91% bias, collisions, distrib) |
Murmur2 | 2789.89 | 38.37 | poor (1.7% bias, 81x coll, 1.7% distrib) |
Murmur2A | 2789.85 | 43.38 | poor (12.7% bias) |
Crap8 | 2743.80 | 32.50 | poor (2.42% bias, collisions, 2% distrib) |
hasshe2 | 2587.39 | 81.23 | poor (insecure, 100% bias, collisions, distrib), non-portable (SSE2) |
Murmur3A | 2232.00 | 48.16 | |
PMurHash32 | 2232.26 | 53.18 | |
NOP_OAAT_read64 | 1991.31 | 30.46 | poor (100% bias, 2.17x collisions) |
superfast | 1860.25 | 47.45 | poor (91% bias, 5273.01x collisions, 37% distrib) |
lookup3 | 1817.11 | 39.30 | poor (28% bias, collisions, 30% distrib) |
SipHash | 962.35 | 147.36 | |
GoodOAAT | 930.18 | 68.24 | |
bernstein | 921.43 | 55.17 | poor (100% bias, collisions, distrib) |
MicroOAAT | 826.32 | 62.06 | poor (100% bias, distrib) |
FNV64 | 697.67 | 72.70 | poor (100% bias, collisions, distrib) |
sdbm | 695.29 | 71.76 | poor (100% bias, collisions, distrib) |
FNV1a | 684.17 | 72.75 | poor (zeros, 100% bias, collisions, distrib) |
BadHash | 558.14 | 87.87 | not a hash (just for reference) |
JenkinsOOAT | 558.13 | 116.75 | poor (53.5% bias, collisions, distrib) |
JenkinsOOAT_perl | 558.14 | 95.26 | poor (1.5-11.5% bias, 7.2x collisions) |
x17 | 551.99 | 89.24 | poor (99.98% bias, collisions, distrib) |
sha1_32a | 531.44 | 1222.44 | |
MurmurOAAT | 465.12 | 107.61 | poor (collisions, 99.99% distrib) |
md5_32a | 433.03 | 508.98 | |
crc32 | 342.27 | 142.06 | poor (insecure, 8589.93x collisions, distrib) |