wyhash
aHash
Metric | wyhash | aHash
---|---|---
Mentions | 9 | 11
Stars | 899 | 912
Growth | - | -
Activity | 6.6 | 7.2
Last commit | about 2 months ago | 8 days ago
Language | C | Rust
License | The Unlicense | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wyhash
-
What hash function you use for hash maps / hash tables?
I recently switched to wyhash as it seems to have a good combination of speed and stability.
-
Are there any weaker hashes than MD5, but still randomly distributed?
wyhash is a decent option if you don't need a cryptographic-quality hash.
-
Hacker News top posts: Mar 15, 2021
New Bare Hash Map: 2X-3X Speedup over SOTA (32 comments)
-
New Bare Hash Map: 2X-3X Speedup over SOTA
I feel like you’d want something a bit safer than “we don’t store the keys and just rely on the hash to be really good” [1], putting “please do not use this for serious tasks” in a comment embedded in the header file isn’t a clear enough warning.
It’s not clear to me that the collision-probability assumptions hold. The argument basically assumes that the hashing is perfect and distributes any inputs over the full 64-bit space with uniform probability. That’s the usual hash map / randomized algorithm hope, but does BigCrush or similar avalanche testing really prove that? (Presumably not, otherwise there wouldn’t be image attacks for things like md5).
[1] https://github.com/wangyi-fudan/wyhash/blob/d2a305811972f391...
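To put a number on the uniformity assumption questioned above, here is a minimal sketch of the standard birthday-bound approximation. It assumes an ideal hash that spreads keys uniformly over the output space, which is exactly the property the comment doubts, so the result is a best case rather than a guarantee for wyhash or any other real hash. The function name `collision_probability` is made up for this illustration.

```rust
/// Approximate probability that at least two of `n` distinct keys share a
/// hash value, ASSUMING an ideal hash that is uniform over `bits` bits.
/// Birthday approximation: p ≈ 1 - exp(-n(n-1) / 2^(bits+1)).
fn collision_probability(n: u64, bits: u32) -> f64 {
    let n = n as f64;
    let space = 2f64.powi(bits as i32);
    let x = n * (n - 1.0) / (2.0 * space);
    // exp_m1(-x) is exp(-x) - 1, computed accurately even for tiny x,
    // so negating it yields 1 - exp(-x).
    -(-x).exp_m1()
}
```

Under that assumption, a million keys hashed to 64 bits give a collision probability of roughly 2.7e-8, while `collision_probability(1 << 32, 64)` comes out near 0.39. Any bias in a real hash function can only make these numbers worse.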
aHash
-
I wrote kubernetes admission controller in Rust. And it's blazingly fast!
If you find yourself in a situation where you've got some kind of HashMap in your JSON data, try using ahash as the hasher, either via the ready-made ahash::AHashMap or via something like `type AHashMap<K, V> = std::collections::HashMap<K, V, ahash::RandomState>;` if you're using something like serde_with, which doesn't like the ready-made one.
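The type-alias approach from the comment above can be sketched with the standard library alone. To keep the example free of external crates, std's `DefaultHasher` stands in for ahash's hasher state; with the `ahash` crate, that type would go in the third parameter instead. The names `FixedState`, `FastMap`, and `count_words` are hypothetical and exist only for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Stand-in hasher builder (an assumption for this sketch). It creates every
/// hasher via `Default`, so there is no per-map random seed.
type FixedState = BuildHasherDefault<DefaultHasher>;

/// A std `HashMap` whose third type parameter selects the hasher.
type FastMap<K, V> = HashMap<K, V, FixedState>;

/// Count word occurrences using the aliased map type.
fn count_words(text: &str) -> FastMap<&str, usize> {
    // `HashMap::new()` is only defined for the default `RandomState`, so a
    // map with a custom hasher is built with `default()` or `with_hasher()`.
    let mut counts = FastMap::default();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}
```

Because the alias is still a plain `std::collections::HashMap`, code that is generic over the map's hasher parameter keeps working, which is presumably why the commenter suggests it for libraries that do not accept a wrapper type.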
-
New ScyllaDB Go Driver: Faster Than GoCQL and Its Rust Counterpart
aHash claims it is faster than t1ha[1].
The t1ha crate also hasn't been updated in over three years so the benchmark in this link should be current.
[1] https://github.com/tkaitchuck/aHash/blob/master/compare/read...
There are probably a bunch of reasons, which is why I want an easy "run benchmarks" command that I can use. I'd even be fine using infra so long as I had pulumi/terraform to set it all up for me.
I just don't want to spin up EC2 instances manually, get the connections all working, make sure I can reset state, etc.
I already have a fork of Scylla where I removed a lot of unnecessary cloning of `String` but no way I'm gonna PR it without a benchmark.
I also opened a PR to replace the hash algorithm used in their PreparedStatement cache, which gets hit for every query, but they wanted benchmarks before accepting (completely fair) and I have none. `ahash` is extremely fast compared to Rust's default - https://github.com/tkaitchuck/ahash and with the `comptime` randomness (more than sufficient for the scylla use case) you can avoid a system call when creating the HashMap.
-
The quick and practical “MSI” hash table
When I recently went shopping for fast hashes for short strings, I settled on wyhash, but ahash[0] seemed like it would have been better if I had bothered to port it from Rust.
> In that time you can FNV-1a a "short" string.
Not if you read it one byte at a time like in TFA!
It looks like the best FNV for short strings in smhasher[1] is comparably fast to ahash[2] on short strings, but I proposed doing slightly less work than ahash.
> From the top of my head, t1ha, falkhash, meowhash and metrohash are using AES-NI and none of them are particularly fast on short inputs, and at least two of them have severe issues, despite guarding against lots of vulnerabilities, which your construction does not.
For issues like reading past the ends of buffers and discarding the extra values, it would be nice if programmers could arrange to have buffers that could be used this way. I posted a thing for hashing strings of a fixed length though, to compare with the thing for hashing strings of a fixed length in TFA.
[0]: https://github.com/tkaitchuck/aHash/blob/master/src/aes_hash...
[1]: https://github.com/rurban/smhasher/blob/master/doc/FNV1a_YT....
[2]: https://github.com/rurban/smhasher/blob/master/doc/ahash64.t...
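For reference, the byte-at-a-time FNV-1a that the thread criticizes as slow looks roughly like the following. This is a minimal sketch of the standard 64-bit variant, not the code from TFA or from smhasher; the constants are the published 64-bit FNV offset basis and prime.

```rust
/// 64-bit FNV offset basis and prime (standard published constants).
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a over a byte slice: XOR each byte into the state, then multiply.
/// Every step depends on the previous one, so the loop performs one
/// multiplication per input byte with no parallelism between bytes.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}
```

The serial dependency between iterations is the cost the commenter points at: hashes such as wyhash and aHash consume several bytes per step, so a short string takes only a handful of operations.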
-
Lox interpreter in Rust slower than in Java
Regarding the hashing function: I've already tried using aHash, which sped things up, but not by a lot.
-
Any "surprises" in Rust to be aware of?
aHash has a very good comparison doc: https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md (Personally, I use it more to compare non-aHash hashes than to aHash; aHash has no reason to be biased between other hashes, though it does have reason to be biased for itself. I trust their analysis to not be biased, but it's always better to be more sure.)
-
New Bare Hash Map: 2X-3X Speedup over SOTA
Apparently there is a patch for the SMHasher here which adds support for ahash:
https://github.com/tkaitchuck/aHash/tree/master/smhasher
There are also ahash's own benchmarks here:
https://github.com/tkaitchuck/aHash/blob/master/compare/test...
They use the wyhash Rust crate, so if wyhash itself were updated, doing a head-to-head comparison would boil down to updating the wyhash crate and rerunning ahash's benchmark suite.
What are some alternatives?
smhasher - Hash function quality and speed tests
meow_hash - Official version of the Meow hash, an extremely fast level 1 hash
smhasher - Automatically exported from code.google.com/p/smhasher
leocad - A CAD application for creating virtual LEGO models
wyhash-rs - wyhash fast portable non-cryptographic hashing algorithm and random number generator in Rust
Mersenne-Twister-in-Python - A Mersenne Twister Random Number Generator
countwords - Playing with counting word frequencies (and performance) in various languages.
houndsniff - Hash identification program.
securitytxt.org - Static website for security.txt.