[D]: Best nearest neighbour search for high dimensions

faiss

Mentions: 70 | Stars: 27,924 | Activity: 9.4 | Language: C++

A library for efficient similarity search and clustering of dense vectors.

If you need large scale (1000+ dimensions, millions of source points, >1000 queries per second) and can accept imperfect results, i.e. approximate nearest neighbors, then other people have already mentioned some of the best libraries (FAISS, Annoy).
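For context on what "imperfect results" are measured against, here is a rough sketch of the exact brute-force search that the approximate libraries try to beat on speed. It is plain standard-library Python, not FAISS or Annoy code; the function name `exact_knn` and the sizes are made up for illustration.

```python
import heapq
import math
import random


def exact_knn(points, query, k):
    """Return the indices of the k points closest to query (Euclidean), nearest first."""
    # Every point is scanned, so each query costs O(n * d).
    # Approximate indexes exist to avoid exactly this full scan.
    scored = ((math.dist(p, query), i) for i, p in enumerate(points))
    return [i for _, i in heapq.nsmallest(k, scored)]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_points = 64, 2000  # illustrative sizes, far below the scale discussed above
    points = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_points)]
    query = [rng.gauss(0, 1) for _ in range(dim)]
    print(exact_knn(points, query, k=5))
```

The result of a scan like this is the ground truth; an approximate index returns most, but not necessarily all, of the same neighbors in far less time.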

annoy

Mentions: 40 | Stars: 12,662 | Activity: 5.3 | Language: C++

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
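Annoy's index is a forest of random projection trees. The sketch below illustrates that general idea in pure Python; it is not Annoy's code or API, and it is simplified (each tree is descended to a single leaf rather than searched with a priority queue). All names here (`build_tree`, `query_forest`, `leaf_size`) are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def build_tree(points, ids, rng, leaf_size=8):
    """Recursively split ids with the hyperplane midway between two random points."""
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    a, b = (points[i] for i in rng.sample(ids, 2))
    normal = [x - y for x, y in zip(a, b)]
    offset = dot(normal, [(x + y) / 2 for x, y in zip(a, b)])
    left = [i for i in ids if dot(normal, points[i]) < offset]
    right = [i for i in ids if dot(normal, points[i]) >= offset]
    if not left or not right:  # degenerate split, e.g. duplicate points
        return {"leaf": ids}
    return {
        "normal": normal,
        "offset": offset,
        "left": build_tree(points, left, rng, leaf_size),
        "right": build_tree(points, right, rng, leaf_size),
    }


def leaf_for(node, query):
    """Follow the splits down to the leaf whose region contains query."""
    while "leaf" not in node:
        side = "left" if dot(node["normal"], query) < node["offset"] else "right"
        node = node[side]
    return node["leaf"]


def query_forest(forest, points, query, k):
    """Pool one leaf per tree, then rank only those candidates by exact distance."""
    candidates = set()
    for tree in forest:
        candidates.update(leaf_for(tree, query))
    return sorted(candidates, key=lambda i: math.dist(points[i], query))[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    points = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(1000)]
    forest = [build_tree(points, list(range(len(points))), rng) for _ in range(10)]
    print(query_forest(forest, points, points[0], k=5))
```

More trees mean more candidates per query, which raises accuracy at the cost of memory and query time; that is the trade-off being accepted with approximate search.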

ann-benchmarks

Mentions: 50 | Stars: 4,568 | Activity: 8.1 | Language: Python

Benchmarks of approximate nearest neighbor libraries in Python

Look at ANN Benchmarks - there are quite a few indexes tested on various datasets.
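When reading those benchmark plots, the accuracy axis is recall: how many of the true nearest neighbors an index actually returned. Below is a simplified set-overlap version of that metric, as a sketch only. It is not the ann-benchmarks implementation, and the function names are made up.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true k nearest neighbors present in the approximate top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    true_set = set(exact_ids[:k])
    if not true_set:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids[:k]) & true_set) / len(true_set)


def mean_recall(approx_results, exact_results, k):
    """Average recall over a batch of queries (one id list per query)."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    exact = [[1, 2, 3, 4], [10, 11, 12, 13]]
    approx = [[1, 2, 3, 9], [10, 11, 12, 13]]
    print(mean_recall(approx, exact, k=4))
```

An index is usually judged by how many queries per second it sustains at a given recall, so the two numbers are best compared together.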

pynndescent

Mentions: 4 | Stars: 837 | Activity: 6.5 | Language: Python

A Python nearest neighbor descent for approximate nearest neighbors

I'll assume this is the link to pynndescent. Looks cool, thanks for sharing! I haven't used it before. Just FYI for others seeing this: it also appears to be an approximate nearest neighbor algorithm.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.