pecos
pynndescent
Metric | pecos | pynndescent
---|---|---
Mentions | 1 | 4
Stars | 490 | 838
Growth | 0.8% | -
Activity | 6.8 | 6.5
Last commit | 11 days ago | about 1 month ago
Language | Python | Python
License | Apache License 2.0 | BSD 2-clause "Simplified" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pecos
-
Multi label classification on sparse labels
Code: https://github.com/amzn/pecos
pynndescent
-
[D]: Best nearest neighbour search for high dimensions
I'll assume this is the link to pynndescent, looks cool! Thanks for sharing. I haven't used it before. Also seems like it's an approximate nearest neighbor algorithm, just FYI for others seeing this.
-
How to find "k" nearest embeddings in a space with a very large number of N embeddings (efficiently)?
If you just want quick in memory search then pynndescent is a decent option: it's easy to install, and easy to get running. Another good option is Annoy; it's just as easy to install and get running with python, but it is a little less performant if you want to do a lot of queries, or get a knn-graph quickly.
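For small collections, an exact brute-force search is a useful baseline before reaching for an approximate index. The sketch below is not taken from the thread and uses neither pynndescent nor Annoy; `exact_knn`, `cosine_distance`, and the toy `embeddings` are hypothetical names and data chosen for illustration, and cosine distance is an assumed choice of metric.

```python
import heapq
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def exact_knn(query, embeddings, k, distance=cosine_distance):
    """Return the k (index, distance) pairs closest to query, nearest first."""
    # Score every stored vector: O(N) distance computations per query.
    scored = ((distance(query, vec), idx) for idx, vec in enumerate(embeddings))
    # nsmallest keeps only k candidates in memory while scanning.
    return [(idx, dist) for dist, idx in heapq.nsmallest(k, scored)]


# Toy data (assumed for illustration only).
embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
]
neighbors = exact_knn([1.0, 0.05], embeddings, k=2)
print(neighbors)
```

Each query scans all N vectors, which is the cost that approximate libraries are designed to avoid when N or the number of queries grows large.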
-
PynnDescent: Importing pickled index gives error - 'NNDescent' object has no attribute 'shape'
Using the latest version of PyNNDescent via pip install. Running this on Google Colab with Python 3.7.13. Followed the docs and created an index with the params: `pynnindex = pynndescent.NNDescent(arr, metric="cosine", n_neighbors=100)`. Everything works fine and I get results from `pynnindex.neighbor_graph` as expected.
-
[D] In UMAP and PyNNDescent, the conversion of Cosine and Correlation measures to distance metric seems problematic
PyNNDescent distances.py: pynndescent/distances.py at master · lmcinnes/pynndescent (github.com)
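The thread's own argument is not reproduced in this excerpt. One commonly raised concern with this kind of conversion is that "1 - cosine similarity" does not satisfy the triangle inequality, so it is not a true metric, whereas the angle between vectors is. The sketch below checks this on three 2-D vectors; the function names are hypothetical and the code is independent of PyNNDescent's actual implementation, which is assumed rather than quoted here.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    # The usual conversion: 1 - cosine similarity.
    return 1.0 - cosine_similarity(u, v)


def angular_distance(u, v):
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cosine_similarity(u, v))))


a = (1.0, 0.0)
b = (1.0, 1.0)  # halfway between a and c in angle
c = (0.0, 1.0)

# Triangle inequality requires d(a, c) <= d(a, b) + d(b, c).
direct = cosine_distance(a, c)
detour = cosine_distance(a, b) + cosine_distance(b, c)
violates_triangle = direct > detour

# The angular distance passes the same check (small tolerance for floats).
angular_ok = (
    angular_distance(a, c)
    <= angular_distance(a, b) + angular_distance(b, c) + 1e-12
)

print(f"cosine distance: direct={direct:.4f}, detour={detour:.4f}")
print(f"violates triangle inequality: {violates_triangle}")
print(f"angular distance satisfies it: {angular_ok}")
```

Whether this matters in practice depends on whether the downstream algorithm relies on metric properties.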
What are some alternatives?
Machine-Learning-Collection - A resource for learning about Machine learning & Deep Learning
umap - Uniform Manifold Approximation and Projection
nni - An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
annoy - Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
m2cgen - Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
ann-benchmarks - Benchmarks of approximate nearest neighbor libraries in Python
citrus - (distributed) vector database
faiss - A library for efficient similarity search and clustering of dense vectors.