smhasher
xxHash
| | smhasher | xxHash |
---|---|---|
Mentions | 30 | 28 |
Stars | 1,690 | 8,462 |
Growth | - | - |
Activity | 7.1 | 8.4 |
Last commit | about 2 months ago | 5 days ago |
Language | C++ | C |
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
smhasher
-
GxHash - A new (extremely) fast and robust hashing algorithm 🚀
The algorithm passes all SMHasher quality tests and uses rounds of AES block cipher internally, so it is quite robust! For comparison XxH3, t1ha0 and many others don't pass SMHasher (while being slower).
-
The PolymurHash universal hash function
Confirmed, I tested it. https://github.com/rurban/smhasher
-
Show HN: Discohash – simply, quality, fast hash
There are lots of great hash functions out there. Some, like xxhash, are super fast and highly optimized. Others, like umash, are also super fast and are based on interesting math ideas from finite fields^1 while maintaining high quality (according to SMHasher). Others are fast and interesting in other ways (tabulation hashing, which may sometimes be seemingly universal); one of the main originators of those ideas is Mikkel Thorup^2. Anyway, a couple of years ago I also tried my hand at building hashes and created a few that passed SMHasher: tifuhash (a floating point hash), beamsplitter (a seemingly-universal tabulation-style hash), and this one, discohash, a "more traditional" ARX-based design (addition, rotation, xor)^3.
0: https://github.com/rurban/smhasher/blob/master/xxh3.h
1: https://pvk.ca/Blog/2022/12/29/fixing-hashing-modulo-alpha-e...
2: https://arxiv.org/abs/1505.01523
3: https://eprint.iacr.org/2018/898.pdf https://crypto.polito.it/content/download/480/2850/file/docu...
4: https://en.wikipedia.org/wiki/BLAKE_(hash_function)
Discohash (posted here) is the fastest one I made, it's simple and doesn't rely on any arch-specific optimizations or vector instructions (AVX etc ~ tho I suppose...they could be added? I'm definitely no expert in them, I barely get away with doing the C/C++ implementations!)
The main mixing round function is:
mix(const int A) {
-
A Vulnerability in Implementations of SHA-3, Shake, EdDSA
Ubsan, asan, and valgrind tests are missing. Some do offer symbolic verification of the algorithm, but not of the implementations.
See my https://github.com/rurban/smhasher#crypto paragraph, and
-
Academic Urban Legends
The spinach story reminds me a lot of the false recommendation of siphash for hash table DDOS prevention. https://github.com/rurban/smhasher#security
In their widely cited paper, the authors came up with a proper solution: spreading the random hash seed into the inner loop, which vastly enhances security by avoiding trivial hash collision attacks. But a secure, slow hash function can never prevent ordinary hash seed attacks when the random seed is somehow known. Especially with dynamic languages, it's trivial to get the seed externally.
Other trivial countermeasures must be used then, which also don't make hash tables 10x slower, keeping them practical.
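The argument above can be illustrated with a small sketch. It assumes a toy hash table with 64 buckets and uses Python's keyed `hashlib.blake2b` as a stand-in for a "secure" seeded hash; the bucket count, the seed values, and the helper names `bucket` and `find_colliding_keys` are hypothetical and are not taken from smhasher or the siphash paper. Once the seed is known, an attacker can brute-force keys that all land in one bucket, however strong the hash is:

```python
import hashlib
from itertools import count

NUM_BUCKETS = 64  # assumed toy table size


def bucket(key: bytes, seed: bytes) -> int:
    """Map a key to a bucket using a keyed (seeded) cryptographic hash."""
    digest = hashlib.blake2b(key, digest_size=8, key=seed).digest()
    return int.from_bytes(digest, "little") % NUM_BUCKETS


def find_colliding_keys(seed: bytes, target: int, wanted: int) -> list:
    """Brute-force keys that fall into the target bucket for a known seed."""
    found = []
    for i in count():
        key = f"key{i}".encode()
        if bucket(key, seed) == target:
            found.append(key)
            if len(found) == wanted:
                return found


# The attacker has learned the seed, so every generated key hits bucket 0.
leaked_seed = b"leaked-seed"
attack_keys = find_colliding_keys(leaked_seed, target=0, wanted=5)
print(attack_keys)

# With a seed the attacker does not know, the same keys are placed
# independently of the attacker's choice.
fresh_seed = b"fresh-seed"
print([bucket(k, fresh_seed) for k in attack_keys])
```

The search needs only about `NUM_BUCKETS` attempts per colliding key, which is why the strength of the hash function alone does not help once the seed leaks.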
- SHA-1 is out. NIST recommends switching to the SHA-2 and SHA-3 groups of hash algorithms as soon as possible, with an official deadline of Dec. 31, 2030.
- Adventures in Advent of Code
-
New ScyllaDB Go Driver: Faster Than GoCQL and Its Rust Counterpart
This is the best, most comprehensive hash test suite I know of: https://github.com/rurban/smhasher/
You might particularly want to look into murmur, spooky, and metrohash. I'm not exactly sure what the tradeoffs involved are, or what your need is, but that site should serve as a good starting point at least.
-
What do you typically use for non-cryptographic hash functions?
Here is a good comparison table. As you can see, BLAKE can run securely and much faster than crc32. So my original point stands: use non-weak hashes unless you really have a reason or requirement not to.
-
What hash function you use for hash maps / hash tables?
smhasher is a great place to find test results for a massive number of hash algorithms.
xxHash
-
The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
> GPU Hash Table?
How bad would performance have suffered if you sha256'd the lines to build the map? I'm going to guess "badly"?
Maybe something like this in CUDA: https://github.com/Cyan4973/xxHash ?
- ETag and HTTP Caching
-
Day 64: Implementing a basic Bloom Filter Using Java BitSet api
Examples of fast, simple hashes that are independent enough include murmur, xxHash, the Fowler–Noll–Vo hash function, and many others.
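As a rough illustration of how such a hash feeds a Bloom filter, here is a minimal sketch in Python rather than Java. It assumes 64-bit FNV-1a as the base hash, derives the k bit positions by double hashing, and uses a plain integer in place of a `BitSet`. The XOR-based seeding of FNV-1a, the second seed constant, and the class layout are assumptions made for this example, not part of the original post:

```python
FNV_OFFSET = 0xCBF29CE484222325  # 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # 64-bit FNV prime
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes, seed: int = 0) -> int:
    """64-bit FNV-1a; XORing a seed into the start state is an assumption."""
    h = (FNV_OFFSET ^ seed) & MASK64
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & MASK64
    return h


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit set

    def _positions(self, item: str):
        data = item.encode("utf-8")
        h1 = fnv1a_64(data)
        # Force h2 to be odd so the probe step is never zero.
        h2 = fnv1a_64(data, seed=0x9E3779B97F4A7C15) | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all((self.bits >> p) & 1 for p in self._positions(item))


bf = BloomFilter(num_bits=1024, num_hashes=3)
bf.add("apple")
bf.add("banana")
print(bf.might_contain("apple"), bf.might_contain("cherry"))
```

A lookup for an item that was added always returns `True`; a lookup for an item that was never added usually returns `False` but can occasionally return `True`, with a rate that depends on the filter size and the number of hashes.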
- Closed-addressing hashtables implementation
-
NIST Retires SHA-1 Cryptographic Algorithm
If you're only using the hash for non-cryptographic applications, there are much faster hashes: https://github.com/Cyan4973/xxHash
-
Does the checksum algorithm crc32c-intel support AMD Ryzen series 3000 or newer?
I found the benchmark result for the AMD Ryzen 5950X.
-
[Study Project] A memory-optimized JSON data structure
But what's the catch, you're thinking? Well, it is a bit slower than its counterparts when it comes to deserializing (and marginally faster for serializing). To achieve a smaller footprint, it uses a few tricks, notably a custom hash table to deduplicate strings. This comes at a cost of course (even when featuring xxHash to speed things up), but keeps the slowdown reasonable (I think).
-
What do you typically use for non-cryptographic hash functions?
Non-cryptographic hashes have collisions. For example, assume you have content like "abcdefg" whose hash value is "123". With a weak hash algorithm, some other content like "abcdefZ" can also hash to "123", which basically means the hash function fails to be a unique fingerprint of that content. BLAKE3, for example, can do 6-7Gb/s, which makes it pretty fast and secure. If your requirements accept collisions at a defined error rate, I would advise you to take a look at XXH3 if you need a very snappy hash algorithm, which can run at the pace of RAM access (30GB/s+). But again, run tests on the particular equipment you are targeting; maybe the AES hardware-accelerated MeowHash will serve you better.
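To make the collision point concrete, here is a small sketch under stated assumptions: the "weak" hash is a byte-sum checksum chosen purely for illustration, and the stronger non-cryptographic hash is 64-bit FNV-1a. Neither function is from the original comment, and the input pair differs from the one quoted there because a byte sum collides on reordered bytes rather than on a changed final character:

```python
FNV_OFFSET = 0xCBF29CE484222325  # 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # 64-bit FNV prime
MASK64 = (1 << 64) - 1


def byte_sum(data: bytes) -> int:
    """Deliberately weak hash: any reordering of the bytes collides."""
    return sum(data) & 0xFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a, a simple order-sensitive non-cryptographic hash."""
    h = FNV_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & MASK64
    return h


a, b = b"abcdefg", b"abcdegf"  # same bytes, last two swapped

print(byte_sum(a) == byte_sum(b))   # the weak hash cannot tell them apart
print(fnv1a_64(a) == fnv1a_64(b))   # FNV-1a separates this pair
```

FNV-1a still has collisions on other inputs, as any fixed-width hash must; the difference is that they are not this easy to produce by accident.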
- C++ gonna die😥
- rsync, article 3: How does rsync work?
What are some alternatives?
wyhash - The FASTEST QUALITY hash function, random number generators (PRNG) and hash map.
BLAKE3 - the official Rust and C implementations of the BLAKE3 cryptographic hash function
meow_hash - Official version of the Meow hash, an extremely fast level 1 hash
Hashids.java - Hashids algorithm v1.0.0 implementation in Java
xxh - 🚀 Bring your favorite shell wherever you go through the ssh. Xonsh shell, fish, zsh, osquery and so on.
png-decoder - A pure-Rust, no_std compatible PNG decoder
blake3 - An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function
rustls - A modern TLS library in Rust
swift-crypto - Open-source implementation of a substantial portion of the API of Apple CryptoKit suitable for use on Linux platforms.
Halide - a language for fast, portable data-parallel computation
PostgreSQL - Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch