New Bare Hash Map: 2X-3X Speedup over SOTA

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • wyhash

    The FASTEST QUALITY hash function, random number generators (PRNG) and hash map.

  • I feel like you’d want something a bit safer than “we don’t store the keys and just rely on the hash to be really good” [1]. Putting “please do not use this for serious tasks” in a comment embedded in the header file isn’t a clear enough warning.

    It’s not clear to me that the collision-probability assumptions hold. They basically assume that the hash is perfect and distributes any input over the full 64-bit space with uniform probability. That’s the usual hash map / randomized algorithm hope, but does BigCrush or similar avalanche testing really prove that? (Presumably not, otherwise there wouldn’t be image attacks for things like md5).

    [1] https://github.com/wangyi-fudan/wyhash/blob/d2a305811972f391...
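
To make the concern above concrete, here is a minimal Python sketch of the "keyless" idea and of the birthday-bound estimate it relies on. It is not wyhash's code: `BareHashSet` and `collision_probability` are hypothetical names, blake2b is only a stand-in hash, and the probability formula assumes exactly the uniform-hash property the comment questions.

```python
import hashlib
import math


def collision_probability(n_keys: int, hash_bits: int = 64) -> float:
    """Birthday-bound estimate of the chance that any two of n_keys share a hash.

    ASSUMPTION: the hash is uniform over 2**hash_bits values.
    """
    space = 2.0 ** hash_bits
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))


class BareHashSet:
    """Toy keyless set: it remembers only a (possibly truncated) hash per key."""

    def __init__(self, hash_bits: int = 64):
        self.hash_bits = hash_bits
        self._hashes = set()

    def _hash(self, key: bytes) -> int:
        # blake2b is an illustrative stand-in, not the hash used by wyhash.
        digest = hashlib.blake2b(key, digest_size=8).digest()
        return int.from_bytes(digest, "little") >> (64 - self.hash_bits)

    def add(self, key: bytes) -> None:
        self._hashes.add(self._hash(key))

    def __contains__(self, key: bytes) -> bool:
        # No key comparison happens here, so any colliding key looks "present".
        return self._hash(key) in self._hashes
```

Under the uniformity assumption, `collision_probability(2**32)` comes out near 0.39, while a million keys give a chance on the order of 1e-8. Shrinking `hash_bits` (for example `BareHashSet(hash_bits=8)`) makes the false positives easy to observe, which is the failure mode a keyless map cannot detect on its own.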

  • meow_hash

    Official version of the Meow hash, an extremely fast level 1 hash

  • Meow hash claims 3-4x faster hashing than this, still passes smhasher, and is a few years old. https://mollyrocket.com/meowhash

  • aHash

    aHash is a non-cryptographic hashing algorithm that uses the AES hardware instruction

  • Apparently there is a patch for SMHasher here which adds support for ahash:

    https://github.com/tkaitchuck/aHash/tree/master/smhasher

    There are also ahash's own benchmarks here:

    https://github.com/tkaitchuck/aHash/blob/master/compare/test...

    They use the wyhash Rust crate, so if wyhash itself were updated, doing a head-to-head comparison would boil down to updating the wyhash crate and rerunning ahash's benchmark suite.

  • smhasher

    Automatically exported from code.google.com/p/smhasher (by injinj)

  • Meow 0.4 was faster at short keys, but failed the smhasher "LongNeighborTest" [1]. However, doubling the AES rounds makes it pass that test; two rounds are enough for full diffusion in AES [2]. I recently looked at computing 4 Meow keys per hash function [3], and found the speedup to be almost 2x in a microbenchmark. That puts it in rare territory for hash speed.

    [1] https://github.com/injinj/smhasher/

    [2] Section 5.4 of Introduction to Cryptography by Trappe and Washington -- It can be shown that two rounds are sufficient to obtain full diffusion, namely, each of the 128 output bits depends on each of the 128 input bits.

    [3] https://github.com/raitechnology/raikv/blob/3ce2b23e0d9853fe...
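
The "full diffusion" property quoted in [2] (every output bit depends on every input bit) can be probed empirically with a simple avalanche measurement. The Python sketch below is only an illustration of that idea: it is not the smhasher test suite, and `hash64` uses blake2b as a stand-in rather than Meow or AES. The helper names are hypothetical.

```python
import hashlib
import random


def hash64(data: bytes) -> int:
    # Stand-in 64-bit hash (truncated blake2b); NOT Meow hash or AES.
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def avalanche_fraction(hash_fn, key_len=16, trials=200, out_bits=64, seed=1):
    """Average fraction of output bits that flip when a single input bit flips.

    A well-diffusing hash should land close to 0.5.
    """
    rng = random.Random(seed)  # fixed seed keeps the measurement repeatable
    flipped = 0
    total = 0
    for _ in range(trials):
        key = bytearray(rng.getrandbits(8) for _ in range(key_len))
        base = hash_fn(bytes(key))
        for bit in range(key_len * 8):
            key[bit // 8] ^= 1 << (bit % 8)   # flip one input bit
            diff = base ^ hash_fn(bytes(key))
            key[bit // 8] ^= 1 << (bit % 8)   # restore the original bit
            flipped += bin(diff).count("1")
            total += out_bits
    return flipped / total


def weak_hash(data: bytes) -> int:
    # Deliberately poor hash (byte sum) to show what weak diffusion looks like.
    return sum(data) & 0xFFFFFFFFFFFFFFFF
```

Calling `avalanche_fraction(hash64)` should give a value near 0.5, while `avalanche_fraction(weak_hash)` stays far below it. A good average here is necessary but not sufficient: it says nothing about structured inputs such as the long neighboring keys that the "LongNeighborTest" targets.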

  • smhasher

    Hash function quality and speed tests (by rurban)

  • https://github.com/rurban/smhasher/tree/aHash#smhasher

    It may be the fastest Rust hash, but it is certainly not faster than other fast hashes. More like 2x slower.

    xxh3, t1ha0, and wyhash are all much faster on the 2 machines I tested it on: an old, 7-year-old Intel i5-2300 and a new Ryzen 3200U.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
