highwayhash
SimSIMD

Metric | highwayhash | SimSIMD |
---|---|---|
Mentions | 2 | 21 |
Stars | 911 | 1,269 |
Growth | 1.1% | 4.6% |
Activity | 4.9 | 9.8 |
Last commit | 3 months ago | 17 days ago |
Language | Go | C |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
highwayhash
-
Can I concatenate multiple non-crypto hash functions to reduce collision?
highwayhash (alt) provides 256 bits. Fast mainly for larger inputs and supports seeds.
-
Fastest way to encode []int8 to bytes hash ?
HighwayHash can produce 256 bit output, if you really need a hash that long.
SimSIMD
-
How to Implement a Cosine Similarity Function in TypeScript
It’s a nice post, but “using array methods” probably shouldn’t be placed in the “Efficient Implementation” section. As often happens with high-level languages, a single plain old loop is faster than three array methods.
Similarly, if you plan to query those vectors in search, you should consider contiguous `TypedArray` types and smaller scalars than the double-precision `number`.
I know very little about JS, but some of the amazing HackerNews community members have previously helped port SimSIMD to JavaScript (https://github.com/ashvardanian/SimSIMD), and I wrote a blogpost covering some of those JS/TS-specifics, NumJS, and MathJS in 2023 (https://ashvardanian.com/posts/javascript-ai-vector-search/).
Hopefully it should help unlock another 10-100x in performance.
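To make the comment above concrete, here is a minimal sketch of cosine similarity written as one plain loop over contiguous `Float32Array` inputs. The function name `cosineSimilarity` and the choice to return 0 for zero-length vectors are assumptions made for illustration; this is not SimSIMD's API, and it uses no SIMD.

```typescript
// Cosine similarity over contiguous typed arrays, using a single plain loop.
// `cosineSimilarity` is a hypothetical helper name, not a SimSIMD export.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  // One pass accumulates all three sums, instead of three array-method passes.
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption: treat similarity with a zero vector as 0 rather than NaN.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

const x = new Float32Array([1, 0, 1]);
const y = new Float32Array([1, 0, 1]);
const z = new Float32Array([0, 1, 0]);
console.log(cosineSimilarity(x, y)); // 1
console.log(cosineSimilarity(x, z)); // 0
```

Storing vectors as `Float32Array` halves the memory of double-precision `number` arrays; how much faster the loop runs than chained array methods depends on the engine and vector length, so it is worth benchmarking on your own data.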
-
A not so fast implementation of cosine similarity in C++ and SIMD
Arch specific similarity functions in SIMD, with bindings in Python and other languages:
https://github.com/ashvardanian/SimSIMD
- HPy – A better C API for Python
-
How does cosine similarity work?
SciPy's distances module has its own problems. It's pretty slow and constantly overflows in mixed-precision scenarios. It also raises the wrong type of errors when it overflows, and uses the general-purpose `math` package instead of `numpy` for square roots. So use it with caution.
I've outlined some of the related issues here: https://github.com/ashvardanian/SimSIMD#cosine-similarity-re...
-
SIMD-accelerated computer vision on a $2 microcontroller
SimSIMD https://github.com/ashvardanian/SimSIMD :
> Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE
https://news.ycombinator.com/item?id=37808036
- Deep Learning in JavaScript
-
From slow to SIMD: A Go optimization story
For other languages (including nodejs/bun/rust/python etc.) you can have a look at SimSIMD [0], which I have contributed to this year (I made precompiled binaries for nodejs/bun part of the build process for x86_64 and arm64 on Mac and Linux, and x86 and x86_64 on Windows).
[0] https://github.com/ashvardanian/SimSIMD
-
Python, C, Assembly – Faster Cosine Similarity
Kahan floats are also commonly used in such cases, but I believe there is room for improvement without hitting those extremes. First of all, we should tune the epsilon here: https://github.com/ashvardanian/SimSIMD/blob/f8ff727dcddcd14...
As for the 64-bit version, it's harder, as the higher-precision `rsqrt` approximations are only available with "AVX512ER". I'm not sure which CPUs support that, but it's not available on Sapphire Rapids.
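The "Kahan floats" mentioned above presumably refers to Kahan (compensated) summation. As a rough sketch of that technique, and not of anything in SimSIMD, the snippet below compares naive and compensated accumulation while emulating single precision with `Math.fround`. The function names and the test data are hypothetical.

```typescript
// Naive float32 accumulation: every partial sum is rounded to single precision.
function naiveSumF32(values: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum = Math.fround(sum + values[i]);
  }
  return sum;
}

// Kahan (compensated) summation, also emulated in float32 via Math.fround.
function kahanSumF32(values: Float32Array): number {
  let sum = 0;
  let compensation = 0; // running estimate of the low-order bits lost so far
  for (let i = 0; i < values.length; i++) {
    const y = Math.fround(values[i] - compensation);
    const t = Math.fround(sum + y);
    // (t - sum) recovers what was actually added; subtracting y leaves the rounding error.
    compensation = Math.fround(Math.fround(t - sum) - y);
    sum = t;
  }
  return sum;
}

// Hypothetical data: 1 followed by 1024 values too small to change a float32 sum of 1.
const tiny = Math.pow(2, -25);
const samples = new Float32Array(1025);
samples[0] = 1;
for (let i = 1; i < samples.length; i++) samples[i] = tiny;

console.log(naiveSumF32(samples)); // 1, every small addend is rounded away
console.log(kahanSumF32(samples)); // close to the exact sum 1 + 2^-15
```

The compensated version costs a few extra operations per element, which is the trade-off the comment alludes to when it suggests tuning the epsilon first.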
- Beating GCC 12 - 118x Speedup for Jensen Shannon Divergence via AVX-512FP16
- Show HN: Beating GCC 12 – 118x Speedup for Jensen Shannon D. Via AVX-512FP16
What are some alternatives?
xxh3 - XXH3 algorithm in Go
usearch - Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
c2goasm - C to Go Assembly
kuzu - Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
go-highway - Go implementation of Google's HighwayHash
nsimd - Agenium Scale vectorization library for CPUs and GPUs
