Meow Hash

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • meow_hash

    Official version of the Meow hash, an extremely fast level 1 hash

  • smhasher

    Hash function quality and speed tests (by rurban)

    smhasher does not report any failure for blake3. The one failure for blake2b 256 is not ideal for a hash function, but it is not necessarily evidence that the function's output doesn't look random: `Sparse` generated 50643 16-bit values, hashed them, and found 2 collisions in the high 32 bits of the output. I'm not sure what kind of flaw in the test harness you think can explain that.

    There could definitely be issues in the integration code that lets the harness call into all these functions. For example, smhasher finds issues with SHA3 for the "PerlinNoise" input sets. That input set hashes small integers in [0, 4096), with seeds in [0, 4096); I'm not convinced the SHA3 wrapper does anything useful with the seed here https://github.com/rurban/smhasher/blob/37cffd7b9cdaa2140c53... . I expect something similar is happening with SHA1 and SHA2.

    The MD5 row shows no failure; only the function that truncates to the low 32 bits has failures.

    You can read the test harness or the test log (e.g., https://github.com/rurban/smhasher/blob/master/doc/blake2b-2...) and apply your own significance threshold. The statistical tests are nothing special or novel (counting collisions in bitranges of the input, and bias in individual bits, mostly); the interesting part is how the various tests generate interesting sets of inputs. In the end, it's a bit like the PRNG wars: you can always come up with a test that makes a function look bad, but a ton of failures is definitely a bad sign.

  • xxHash

    Extremely fast non-cryptographic hash algorithm

    The README for xxhash has benchmarks covering fast hashes including Meow:

    https://github.com/Cyan4973/xxHash/wiki/Performance-comparis...

  • highwayhash

    Fast strong hash functions: SipHash/HighwayHash

    Google made faster SipHash variants, and also HighwayHash, which is much faster.

    https://github.com/google/highwayhash

  • umash

    UMASH: a fast enough hash and fingerprint with collision bounds

    umash (https://github.com/backtrace-labs/umash) has a similar PH block structure, but was designed for decent bit mixing (enough to satisfy smhasher, unlike CLHASH, which needs an additional finalizer) with a lower fixed time cost: 22 cycles for a one-byte hash.

    I'm not sure how one would use that linear regression. What kind of hardware offers 675 GB/s of memory bandwidth? 140 bytes/cycle is easily more than twice the L2 read bandwidth offered by any COTS chip I'm aware of. There are also warm-up effects past the fixed cost of setup and finalizers that slow down hashing for short inputs. For what range of input sizes (and hot/cold cache state) would you say the regression is a useful model?
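
To make the question about the regression concrete, here is a minimal sketch of the arithmetic. It assumes 1 GB = 10^9 bytes and a simple linear cost model, cycles = fixed + bytes / peak rate. The 50-cycle fixed cost is a hypothetical placeholder, not a measured value for any hash discussed here, and the function names are made up for illustration.

```python
def implied_clock_hz(bandwidth_bytes_per_s: float, bytes_per_cycle: float) -> float:
    """Clock frequency at which the two quoted figures would agree."""
    return bandwidth_bytes_per_s / bytes_per_cycle


def effective_bytes_per_cycle(n_bytes: int, fixed_cycles: float,
                              peak_bytes_per_cycle: float) -> float:
    """Throughput predicted by a linear model: fixed setup/finalizer cost
    plus a per-byte cost at the peak rate."""
    cycles = fixed_cycles + n_bytes / peak_bytes_per_cycle
    return n_bytes / cycles


# Figures quoted in the comment: 675 GB/s and 140 bytes/cycle.
clock = implied_clock_hz(675e9, 140)
print(f"implied clock: {clock / 1e9:.2f} GHz")

FIXED_CYCLES = 50  # hypothetical fixed cost, chosen only for illustration
for n in (16, 64, 1024, 1 << 20):
    rate = effective_bytes_per_cycle(n, FIXED_CYCLES, 140)
    print(f"{n:>8} bytes: {rate:7.2f} bytes/cycle")
```

Even in this idealized model the peak rate is only approached for inputs far larger than typical cache sizes, and the model ignores the warm-up and cache effects mentioned above.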

  • BLAKE3

    the official Rust and C implementations of the BLAKE3 cryptographic hash function

    This is great! Many applications (like dedupe) don't need full crypto guarantees.

    If you need something fast and cryptographically secure, I recommend checking out Blake3/b3sum. I'm just learning about XXH3 in this thread, so I cannot comment on how it stacks up, but I love b3sum for fast file hashing.

    https://github.com/BLAKE3-team/BLAKE3/tree/master/b3sum

  • Hashids.java

    Hashids algorithm v1.0.0 implementation in Java

  • cligen

    Nim library to infer/generate command-line-interfaces / option / argument parsing; Docs at

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
