smhasher
Hashids.java
Metric | smhasher | Hashids.java
---|---|---
Mentions | 30 | 31
Stars | 1,690 | 1,012
Growth | - | 0.3%
Activity | 7.1 | 0.0
Last commit | about 2 months ago | 6 months ago
Language | C++ | Java
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
smhasher
- GxHash - A new (extremely) fast and robust hashing algorithm 🚀
The algorithm passes all SMHasher quality tests and uses rounds of the AES block cipher internally, so it is quite robust! For comparison, XxH3, t1ha0, and many others don't pass SMHasher (while being slower).
- The PolymurHash universal hash function
Confirmed, I tested it. https://github.com/rurban/smhasher
- Show HN: Discohash – simply, quality, fast hash
There are lots of great hash functions out there. Some, like xxhash, are super fast and highly optimized. Others, like umash, are also super fast and based on interesting math ideas from finite fields^1, while maintaining high quality (according to SMHasher). Others are also fast and interesting (tabulation hashes, which may sometimes be seemingly universal); one of the main originators of those ideas is Mikkel Thorup^2. Anyway, a couple of years ago I also tried my hand at building hashes and created a few that passed SMHasher: tifuhash (a floating-point hash), beamsplitter (a seemingly universal tabulation-style hash), and this one, discohash (a "more traditional" ARX-based design: addition, rotation, xor)^3.
0: https://github.com/rurban/smhasher/blob/master/xxh3.h
1: https://pvk.ca/Blog/2022/12/29/fixing-hashing-modulo-alpha-e...
2: https://arxiv.org/abs/1505.01523
3: https://eprint.iacr.org/2018/898.pdf https://crypto.polito.it/content/download/480/2850/file/docu...
4: https://en.wikipedia.org/wiki/BLAKE_(hash_function)
Discohash (posted here) is the fastest one I made. It's simple and doesn't rely on any arch-specific optimizations or vector instructions (AVX etc., though I suppose they could be added? I'm definitely no expert in them; I barely get away with doing the C/C++ implementations!)
The main mixing round function is:
mix(const int A) {
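The round function is cut off in this excerpt. As a generic illustration of what "ARX" (addition, rotation, xor) means, and explicitly not Discohash's actual mix, here is a minimal Python sketch. The names `rotl64` and `arx_round` and the rotation amounts 13 and 32 are arbitrary choices made for this example.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound in Python


def rotl64(x: int, r: int) -> int:
    """Rotate a 64-bit value left by r bits (0 <= r < 64)."""
    return ((x << r) | (x >> (64 - r))) & MASK64


def arx_round(a: int, b: int) -> tuple[int, int]:
    """One illustrative ARX step on two 64-bit lanes (hypothetical constants)."""
    a = (a + b) & MASK64      # Addition, modulo 2^64
    b = rotl64(b, 13) ^ a     # Rotation, then Xor with the other lane
    a = rotl64(a, 32)         # Rotation to move high bits into low positions
    return a, b


if __name__ == "__main__":
    state = (1, 2)
    for _ in range(4):
        state = arx_round(*state)
    print([hex(lane) for lane in state])
```

Each of the three operations is cheap and invertible on its own; ARX designs get their mixing from repeating such steps over several rounds.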
- A Vulnerability in Implementations of SHA-3, Shake, EdDSA
UBSan, ASan, and Valgrind tests are missing. Some do offer symbolic verification of the algorithm, but not of the implementations.
See my https://github.com/rurban/smhasher#crypto paragraph, and
- Academic Urban Legends
The spinach story reminds me a lot of the false recommendation of SipHash for hash table DDoS prevention. https://github.com/rurban/smhasher#security
In their widely cited paper, the authors came up with a proper solution: spreading the random hash seed into the inner loop, which vastly enhances security by avoiding trivial hash collision attacks. But a secure, slow hash function can never prevent ordinary hash seed attacks when the random seed is somehow known. Especially with dynamic languages, it's trivial to get the seed externally.
Other trivial countermeasures must then be used, ones that don't make hash tables 10x slower and so keep them practical.
- SHA-1 is out. NIST recommends switching to the SHA-2 and SHA-3 groups of hash algorithms as soon as possible, with an official deadline of Dec. 31, 2030.
- Adventures in Advent of Code
- New ScyllaDB Go Driver: Faster Than GoCQL and Its Rust Counterpart
This is the best, most comprehensive hash test suite I know of: https://github.com/rurban/smhasher/
You might particularly want to look into murmur, spooky, and metrohash. I'm not exactly sure what the tradeoffs are, or what your needs are, but that site should serve as a good starting point at least.
- What do you typically use for non-cryptographic hash functions?
Here is a good comparison table. As you can see, BLAKE can be secure and still perform much faster than crc32, hence my original point: use non-weak hashes unless you really have a reason or requirement not to.
- What hash function you use for hash maps / hash tables?
smhasher is a great place to find test results for a massive number of hash algorithms.
Hashids.java
- Hashids: Generate short unique ids from integers
- Auto Generate Sequential UIID
You basically want Hashids but sequential? Why not simply convert a base-10 (0-9) number to hex (0-F)?
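The base-10 to hex suggestion can be sketched in a few lines of Python. The helper names `to_hex` and `from_hex` and the optional zero-padding are assumptions for this example, not part of any library mentioned here.

```python
def to_hex(n: int, width: int = 0) -> str:
    """Render a non-negative integer as uppercase hex, zero-padded to `width`."""
    if n < 0:
        raise ValueError("expected a non-negative integer")
    return format(n, "X").zfill(width)


def from_hex(s: str) -> int:
    """Parse a hex string back into the original integer."""
    return int(s, 16)


if __name__ == "__main__":
    # Consecutive integers stay consecutive (and sortable when padded).
    print([to_hex(n, 4) for n in range(9, 12)])  # ['0009', '000A', '000B']
    print(from_hex("FF"))                        # 255
```

Unlike Hashids output, these strings are plainly sequential, which is the point of the suggestion: anyone can read the next ID off the previous one.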
- Features I'd Like in PostgreSQL
I found hashids [1] to be a great compromise between integer ids in the database and copyable non-enumerable strings on the client.
[1] https://hashids.org/
- Short, friendly base32 slugs from timestamps
- We Chose NanoIDs for PlanetScale’s API
I wonder how this might compare to just storing regular autoincrementing ints in the database, and converting to/from hashids (https://hashids.org/) at the edge. It eliminates the collision concern and stores more compactly at the cost of a tiny amount of encode/decode when processing requests. You’d want to push it down as close to the database layer as possible to avoid inadvertent int ID leaks; I added native hashids support to clickhouse but I’m not sure what other database support might entail.
- How can I generate truly unique slugs?
Since hashids are not really hashes and are not secure at all, this is not even achieved. Hashids can be easily decoded without the salt by a simple brute-force attack, described by the authors of Hashids themselves right on their website: https://hashids.org/
- How to handle id-based routes with UUID
You don't necessarily need to use UUIDs. You could use something like Hashids to generate random strings from your sequential IDs in a reversible way, so that users can't predict what their values will be, but you can decode them as needed.
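To make "reversible" concrete, here is a minimal Python sketch of encoding an integer ID with a salt-shuffled alphabet and decoding it again. This is not the Hashids algorithm and does not reproduce its output; `make_alphabet`, `encode`, and `decode` are hypothetical names, and the scheme is only an illustration of round-tripping IDs, not a security measure.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits  # 62 URL-safe symbols


def make_alphabet(salt: str) -> str:
    """Derive a salt-dependent permutation of ALPHABET (deterministic per salt)."""
    chars = list(ALPHABET)
    random.Random(salt).shuffle(chars)
    return "".join(chars)


def encode(n: int, alphabet: str) -> str:
    """Write n in base len(alphabet) using the shuffled symbols."""
    if n < 0:
        raise ValueError("expected a non-negative integer")
    base = len(alphabet)
    digits = []
    while True:
        n, rem = divmod(n, base)
        digits.append(alphabet[rem])
        if n == 0:
            break
    return "".join(reversed(digits))


def decode(s: str, alphabet: str) -> int:
    """Invert encode(): map symbols back to digits and rebuild the integer."""
    base = len(alphabet)
    n = 0
    for ch in s:
        n = n * base + alphabet.index(ch)  # raises ValueError on unknown symbols
    return n


if __name__ == "__main__":
    alphabet = make_alphabet("example salt")
    token = encode(123456, alphabet)
    print(token, decode(token, alphabet))
```

In this simple sketch, neighbouring IDs still produce strings that differ only in their last symbol, so it hides the numeric value from casual inspection but does not make IDs unpredictable.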
- All of my database models have id replaced with UUID4s. Is there any risk to using these in URLs?
You should not use UUIDv4 as a primary key. You can use normal int values and then use hashids to make them safe for URLs. UUIDv7 might also be good to use once it is more widely supported.
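For readers unfamiliar with UUIDv7, the following Python sketch builds one by hand to show why it sorts by creation time: the top 48 bits are a Unix timestamp in milliseconds, followed by the version and variant bits and random filler. The function name `uuid7_sketch` is an assumption for this example; it is a rough illustration of the bit layout, not a replacement for a maintained library.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None) -> uuid.UUID:
    """Assemble a UUIDv7-style value: 48-bit ms timestamp + version/variant + random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 of them used
    rand_a = rand >> 68                            # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # bits 127..80: timestamp
    value |= 0x7 << 76                             # bits 79..76: version 7
    value |= rand_a << 64                          # bits 75..64: random
    value |= 0b10 << 62                            # bits 63..62: RFC variant
    value |= rand_b                                # bits 61..0: random
    return uuid.UUID(int=value)


if __name__ == "__main__":
    first = uuid7_sketch()
    time.sleep(0.002)
    second = uuid7_sketch()
    print(first, second, first < second)
```

Because the timestamp leads, values created in different milliseconds compare in creation order, both as integers and as their canonical strings. Values created within the same millisecond are ordered only by their random bits in this sketch.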
- What’s Django’s argument for using 64-bit int as default pk over uuid. Can anyone point me to something I can read?
- Library for generating string IDs from number IDs
What are some alternatives?
xxHash - Extremely fast non-cryptographic hash algorithm
BLAKE3 - the official Rust and C implementations of the BLAKE3 cryptographic hash function
wyhash - The FASTEST QUALITY hash function, random number generators (PRNG) and hash map.
uuid7 - UUID version 7 values, which are time-sortable (following the Peabody RFC4122 draft)
Guava - Google core libraries for Java
png-decoder - A pure-Rust, no_std compatible PNG decoder
JGit - JGit project repository (jgit)
rustls - A modern TLS library in Rust
Embulk - Embulk: Pluggable Bulk Data Loader.
Halide - a language for fast, portable data-parallel computation
JADE - a pug implementation written in Java (formerly known as jade)