FANN: Vector Search in 200 Lines of Rust

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • fann

    Approx nearest neighbor search in Rust (by fennel-ai)

  • I'd be curious how this stacks up on ANN-benchmarks (https://ann-benchmarks.com/).

    FWIW exhaustive search is still probably good enough for most use cases. IMO the exhaustive search should use a heap (https://github.com/fennel-ai/fann/blob/main/src/main.rs#L12): since you're only looking for the top-k, a bounded heap of size k avoids sorting every distance and reduces time and memory complexity greatly. On a relatively unoptimized Golang implementation (though on much beefier hardware) I get ~100ms to process 1M vectors of 300 dimensions. That is still quite a bit slower than approximate search, of course, but in absolute terms it is probably good enough, especially because many use cases don't have 1M vectors. O(1) insert as well :)
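The heap suggestion in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code from fann or from the commenter's Go implementation: the names `Candidate`, `squared_l2`, and `top_k` are hypothetical, and squared Euclidean distance is an assumed metric. The idea is to keep a max-heap of at most k entries, so the root is always the worst of the current best k and can be evicted when a closer vector shows up.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical heap entry: a vector id and its distance to the query.
struct Candidate {
    dist: f32,
    id: usize,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp gives f32 a total order, which BinaryHeap requires.
        self.dist
            .total_cmp(&other.dist)
            .then(self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Squared Euclidean distance (assumed metric for this sketch).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exhaustive top-k search using a bounded max-heap.
/// Time is O(n log k) and extra memory is O(k), versus O(n log n) time
/// and O(n) memory for computing every distance and sorting.
fn top_k(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    if k == 0 {
        return Vec::new();
    }
    let mut heap: BinaryHeap<Candidate> = BinaryHeap::with_capacity(k + 1);
    for (id, v) in vectors.iter().enumerate() {
        let dist = squared_l2(v, query);
        if heap.len() < k {
            heap.push(Candidate { dist, id });
        } else if dist < heap.peek().map_or(f32::INFINITY, |c| c.dist) {
            // The new vector beats the current worst of the best k.
            heap.pop();
            heap.push(Candidate { dist, id });
        }
    }
    // into_sorted_vec returns ascending order, so the nearest comes first.
    heap.into_sorted_vec()
        .into_iter()
        .map(|c| (c.id, c.dist))
        .collect()
}
```

Calling `top_k(&vectors, &query, 10)` returns up to ten `(id, distance)` pairs, nearest first. Inserting a new vector into the dataset is just a push onto `vectors`, which is the O(1) insert the comment mentions.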

  • ann-benchmarks

    Benchmarks of approximate nearest neighbor libraries in Python

  • usearch

    Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

  • Always a pleasure to see short implementations, but I’d still take HNSW over such an approach. Its core implementation can also be quite compact. In our case with USearch [1] it's about 2k LOC, even with a custom implementation of priority queues and custom numeric types like uint40_t for indexes beyond 4B points.

    Obviously, together with bindings for Rust, Python, JavaScript, Java, Objective-C, Swift, and Wolfram it gets slightly larger, but on the bright side, you can pass JIT-compiled functions and store/search elements of different dimensionality, which allows much broader applicability [2].

    [1]: https://github.com/unum-cloud/usearch
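To make the `uint40_t` remark concrete: a 32-bit index tops out at about 4.29 billion entries, while a 64-bit index spends 8 bytes per stored reference. A 40-bit integer packed into 5 bytes addresses roughly 1.1 trillion entries. The sketch below is only an illustration of that idea in Rust; the `U40` type and its methods are hypothetical and are not USearch's actual implementation.

```rust
/// Hypothetical 40-bit unsigned integer stored in exactly 5 bytes
/// (little-endian). Not taken from USearch; for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct U40([u8; 5]);

impl U40 {
    /// Largest representable value: 2^40 - 1.
    const MAX: u64 = (1u64 << 40) - 1;

    /// Packs a u64 into 5 bytes, or returns None if it does not fit.
    fn new(value: u64) -> Option<Self> {
        if value > Self::MAX {
            return None;
        }
        let b = value.to_le_bytes();
        Some(U40([b[0], b[1], b[2], b[3], b[4]]))
    }

    /// Widens the stored value back to a u64.
    fn get(self) -> u64 {
        let mut bytes = [0u8; 8];
        bytes[..5].copy_from_slice(&self.0);
        u64::from_le_bytes(bytes)
    }
}
```

Because the wrapper is a plain byte array, `std::mem::size_of::<U40>()` is 5, so an array of such indexes takes 5 bytes per entry instead of 8.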

  • voy

    🕸️🦀 A WASM vector similarity search written in Rust

  • This is great! See also Voy, A WASM vector similarity search written in Rust:

    https://github.com/tantaraio/voy

  • gollum

    Production grade LLM-ops in Golang (by stillmatic)

  • I have gotten 10x speedups with SIMD on modern hardware. Goroutines make this actually fairly tricky, as you essentially have to process all the events and then sort the entire array, which is usually the bottleneck. The heap adds a small amount of complexity but is significantly more efficient, feels like good ROI.

    https://github.com/stillmatic/gollum/blob/main/vectorstore.g...

  • Weaviate

    Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

  • Weaviate actually does use SIMD for AMD64 with Go assembly (cosine similarity is just the dot product of normalized vectors) https://github.com/weaviate/weaviate/blob/master/adapters/re...
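The parenthetical in the comment above is why a fast dot-product kernel is enough for cosine search: if vectors are normalized to unit length once at insert time, each query only needs a dot product. A scalar sketch in Rust follows; the function names are hypothetical, and this is not Weaviate's code, which the comment describes as Go assembly.

```rust
/// Plain dot product; this is the inner loop a SIMD kernel would replace.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns a unit-length copy of `v` (zero vectors are returned unchanged).
fn normalize(v: &[f32]) -> Vec<f32> {
    let norm = dot(v, v).sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Textbook cosine similarity: dot(a, b) / (|a| * |b|).
/// Assumes neither vector is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}
```

For non-zero inputs, `dot(&normalize(a), &normalize(b))` matches `cosine_similarity(a, b)` up to floating-point rounding, and cosine distance is then one minus that value.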

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
