pynndescent vs pgANN
Metric | pynndescent | pgANN
---|---|---
Mentions | 4 | 2
Stars | 841 | 290
Growth | - | 0.3%
Activity | 6.3 | 0.0
Last commit | about 18 hours ago | 4 months ago
Language | Python | Python
License | BSD 2-clause "Simplified" License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pynndescent
- [D]: Best nearest neighbour search for high dimensions
  I'll assume this is the link to pynndescent, looks cool! Thanks for sharing. I haven't used it before. Also seems like it's an approximate nearest neighbor algorithm, just FYI for others seeing this.
- How to find "k" nearest embeddings in a space with a very large number of N embeddings (efficiently)?
  If you just want quick in-memory search then pynndescent is a decent option: it's easy to install, and easy to get running. Another good option is Annoy; it's just as easy to install and get running with Python, but it is a little less performant if you want to do a lot of queries, or get a knn-graph quickly.
- PynnDescent: Importing pickled index gives error - 'NNDescent' object has no attribute 'shape'
  Using the latest version of PyNNDescent via pip install. Running this on Google Colab with Python 3.7.13. Followed the docs and created an index with the params: pynnindex = pynndescent.NNDescent(arr, metric="cosine", n_neighbors=100). Everything works fine and I get results from pynnindex.neighbor_graph as expected.
- [D] In UMAP and PyNNDescent, the conversion of Cosine and Correlation measures to distance metric seems problematic
  PyNNDescent distances.py: pynndescent/distances.py at master · lmcinnes/pynndescent (github.com)
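The last thread above concerns turning cosine similarity into a distance. The thread's exact objection is not reproduced on this page, so the sketch below only illustrates one well-known issue with the common conversion `1 - cos`: it does not satisfy the triangle inequality, whereas the angular distance `arccos(cos) / pi` does. The function names are illustrative and are not taken from pynndescent's distances.py. The same reasoning applies to `1 - correlation`, since correlation is the cosine of mean-centered vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """The common conversion 1 - cos; not a true metric."""
    return 1.0 - cosine_similarity(u, v)


def angular_distance(u, v):
    """Normalized angle between u and v, in [0, 1]; a true metric on directions."""
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))  # clamp rounding error
    return math.acos(cos) / math.pi


def triangle_holds(dist, x, y, z, tol=1e-12):
    """Check dist(x, z) <= dist(x, y) + dist(y, z), up to a small tolerance."""
    return dist(x, z) <= dist(x, y) + dist(y, z) + tol


# Three unit-direction vectors at 0, 45 and 90 degrees.
a, b, c = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)

print(triangle_holds(cosine_distance, a, b, c))   # False: 1.0 > 0.2929 + 0.2929
print(triangle_holds(angular_distance, a, b, c))  # True: 0.5 <= 0.25 + 0.25
```

Whether this matters in practice depends on the index: a violation like this only becomes a concern for algorithms that rely on metric properties for pruning.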
pgANN
- Pinecone raises $100M Series B
  Why do you use pgvector instead of pgANN? My understanding is pgANN is built with FAISS. When I compared pgvector with FAISS, pgvector was 3-5x slower.
  https://github.com/netrasys/pgANN
- Pgvector – vector similarity search for Postgres
  Love it! We need better ANN approaches in databases; using external servers such as Milvus or faiss is a pita.
  related: https://github.com/netrasys/pgANN
What are some alternatives?
umap - Uniform Manifold Approximation and Projection
ann-benchmarks - Benchmarks of approximate nearest neighbor libraries in Python
annoy - Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
smlar - PostgreSQL extension for an effective similarity search || mirror of git://sigaev.ru/smlar.git || see https://www.pgcon.org/2012/schedule/track/Hacking/443.en.html
Milvus - A cloud-native vector database, storage for next generation AI applications
citrus - (distributed) vector database
similarity - TensorFlow Similarity is a python package focused on making similarity learning quick and easy.
faiss - A library for efficient similarity search and clustering of dense vectors.
DiskANN - Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
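Most of the projects above approximate the same operation: finding the k vectors nearest to a query. When comparing them on a small sample, an exact brute-force search is a handy baseline for measuring recall. The sketch below uses only the Python standard library; `exact_knn` and `recall_at_k` are hypothetical helper names, not part of pynndescent, pgANN, or any library listed here, and cosine distance is chosen only because it appears in the posts quoted earlier.

```python
import heapq
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumed convention: a zero vector is treated as orthogonal
    return 1.0 - dot / (norm_u * norm_v)


def exact_knn(embeddings, query, k):
    """Exact k nearest neighbours by scanning all N embeddings.

    Returns a list of (distance, index) pairs sorted by increasing distance.
    Cost is O(N * d) per query, so this is only practical for small N.
    """
    scored = ((cosine_distance(vec, query), idx) for idx, vec in enumerate(embeddings))
    return heapq.nsmallest(k, scored)


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbours that the approximate result recovered."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact.intersection(approx_ids)) / len(exact)


embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.1]

exact_ids = [idx for _, idx in exact_knn(embeddings, query, k=2)]
print(exact_ids)                       # [0, 2]
print(recall_at_k([0, 3], exact_ids))  # 0.5
```

In a real comparison, the first argument to `recall_at_k` would be the indices returned by the approximate index under test for the same query and the same k.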