I'd be curious how this stacks up on ANN-benchmarks (https://ann-benchmarks.com/).
FWIW, exhaustive search is still probably good enough for most use cases. IMO the exhaustive search should use a heap (https://github.com/fennel-ai/fann/blob/main/src/main.rs#L12): since you're only looking for the top-k results, it greatly reduces time and memory complexity. On a relatively unoptimized Golang implementation (though on much beefier hardware) I get ~100ms to process 1M vectors of 300 dimensions. Still quite a bit slower than approximate search, of course, but in absolute terms probably good enough for most use cases, especially since many use cases don't even have 1M vectors. O(1) insert as well :)
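To make the heap idea concrete, here's a minimal Go sketch of exhaustive top-k search with a bounded max-heap. The names (`topK`, `squaredL2`) are hypothetical, not from the linked repo; the point is that keeping only k candidates gives O(n log k) time and O(k) extra memory instead of sorting all n distances.

```go
package main

import (
	"container/heap"
	"fmt"
)

type result struct {
	dist float64
	idx  int
}

// maxHeap of (distance, index) pairs: the root is the WORST of the current
// top-k, so it can be evicted cheaply when a closer vector shows up.
type maxHeap []result

func (h maxHeap) Len() int           { return len(h) }
func (h maxHeap) Less(i, j int) bool { return h[i].dist > h[j].dist }
func (h maxHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *maxHeap) Push(x any)        { *h = append(*h, x.(result)) }
func (h *maxHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// squaredL2 skips the sqrt; it preserves the nearest-neighbour ordering.
func squaredL2(a, b []float64) float64 {
	var s float64
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return s
}

// topK scans every vector once but keeps only k candidates in memory.
func topK(vectors [][]float64, query []float64, k int) []result {
	h := &maxHeap{}
	for i, v := range vectors {
		d := squaredL2(v, query)
		if h.Len() < k {
			heap.Push(h, result{d, i})
		} else if d < (*h)[0].dist {
			// Replace the current worst candidate and restore heap order.
			(*h)[0] = result{d, i}
			heap.Fix(h, 0)
		}
	}
	// Popping a max-heap yields descending distance; fill back-to-front
	// so the output is nearest-first.
	out := make([]result, h.Len())
	for i := len(out) - 1; i >= 0; i-- {
		out[i] = heap.Pop(h).(result)
	}
	return out
}

func main() {
	vectors := [][]float64{{0, 0}, {1, 1}, {5, 5}, {0.5, 0}}
	nearest := topK(vectors, []float64{0, 0}, 2)
	fmt.Println(nearest[0].idx, nearest[1].idx)
}
```

Inserting a new vector is just an append to the slice, which is the O(1) insert mentioned above.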
Always a pleasure to see short implementations, but I'd still take HNSW over such an approach. Its core implementation can also be quite compact: in our case with USearch [1] it's about 2k LOC, even with a custom implementation of priority queues and custom numeric types like uint40_t for indexes beyond 4B points.
Obviously, together with the bindings for Rust, Python, JavaScript, Java, Objective-C, Swift, and Wolfram it gets somewhat larger, but on the bright side you can pass JIT-compiled functions and store/search elements of different dimensionality, which allows much broader applicability [2].
[1]: https://github.com/unum-cloud/usearch
This is great! See also Voy, a WASM vector similarity search engine written in Rust:
https://github.com/tantaraio/voy
I have gotten 10x speedups with SIMD on modern hardware. Goroutines actually make this fairly tricky, as you essentially have to collect all the results and then sort the entire array, which is usually the bottleneck. The heap adds a small amount of complexity but is significantly more efficient, which feels like good ROI.
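One way around the "sort the entire array" bottleneck is to shard the scan across goroutines and have each shard keep only its own top-k, so the final merge sorts just shards×k candidates instead of n. This is a hedged sketch, not the linked library's actual code; `parallelTopK` and `insertTopK` are made-up names, and the per-shard structure here is a small sorted slice rather than a heap for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type hit struct {
	dist float64
	idx  int
}

// dist is plain squared L2; ordering matches Euclidean distance.
func dist(a, b []float64) float64 {
	var s float64
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return s
}

// insertTopK keeps the best (smallest-distance) k hits in a small sorted
// slice; for small k this behaves like a bounded heap and is simpler to show.
func insertTopK(best []hit, h hit, k int) []hit {
	i := sort.Search(len(best), func(i int) bool { return best[i].dist >= h.dist })
	if i == k {
		return best // worse than every hit we already keep
	}
	best = append(best, hit{})
	copy(best[i+1:], best[i:])
	best[i] = h
	if len(best) > k {
		best = best[:k]
	}
	return best
}

// parallelTopK: each goroutine scans a strided shard and keeps its own
// top-k, so the final merge touches only shards*k candidates, not n.
func parallelTopK(vectors [][]float64, query []float64, k, shards int) []hit {
	partial := make([][]hit, shards)
	var wg sync.WaitGroup
	for s := 0; s < shards; s++ {
		wg.Add(1)
		go func(s int) {
			defer wg.Done()
			var best []hit
			for i := s; i < len(vectors); i += shards {
				best = insertTopK(best, hit{dist(vectors[i], query), i}, k)
			}
			partial[s] = best
		}(s)
	}
	wg.Wait()
	var merged []hit
	for _, p := range partial {
		merged = append(merged, p...)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].dist < merged[j].dist })
	if len(merged) > k {
		merged = merged[:k]
	}
	return merged
}

func main() {
	vectors := [][]float64{{0, 0}, {3, 4}, {1, 0}, {0, 2}, {6, 8}}
	top := parallelTopK(vectors, []float64{0, 0}, 2, 2)
	fmt.Println(top[0].idx, top[1].idx)
}
```

Since each goroutine writes only its own slot in `partial`, no mutex is needed; the `WaitGroup` is the only synchronization.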
https://github.com/stillmatic/gollum/blob/main/vectorstore.g...
Weaviate actually does use SIMD on AMD64 via Go assembly (cosine distance is just a normalized dot product): https://github.com/weaviate/weaviate/blob/master/adapters/re...
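The "cosine distance is just a normalized dot product" trick is worth spelling out: if you normalize vectors once at insert time, cosine distance at query time reduces to `1 - dot(a, b)`, and any SIMD work only needs to target the dot-product inner loop. A minimal scalar sketch (hypothetical function names, not Weaviate's actual code):

```go
package main

import (
	"fmt"
	"math"
)

// dot is the scalar inner loop; a SIMD version (e.g. Go assembly on AMD64)
// would replace exactly this function while keeping the same contract.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// normalize scales v to unit length; done once when the vector is stored.
func normalize(v []float32) {
	n := float32(math.Sqrt(float64(dot(v, v))))
	for i := range v {
		v[i] /= n
	}
}

// cosineDistance assumes both inputs are already unit-length, so the
// whole distance computation is a single dot product.
func cosineDistance(a, b []float32) float32 {
	return 1 - dot(a, b)
}

func main() {
	a := []float32{3, 4}
	b := []float32{4, 3}
	normalize(a)
	normalize(b)
	fmt.Println(cosineDistance(a, b)) // cos of the angle is 0.96, so ~0.04
}
```

Paying the normalization cost on write rather than on every query is what makes the dot product alone sufficient at search time.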