datasette-faiss
Maintain a FAISS index for specified Datasette tables (by simonw)
DP_means
Dirichlet Process K-means (by vsmolyakov)
| | datasette-faiss | DP_means |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 32 | 45 |
| Growth | - | - |
| Activity | 10.0 | 1.7 |
| Last commit | over 1 year ago | about 1 year ago |
| Language | Python | C++ |
| License | Apache License 2.0 | MIT License |
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
datasette-faiss
Posts with mentions or reviews of datasette-faiss.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-09-04.
- LLM now provides tools for working with embeddings
I experimented with that a few months ago. Building a fresh FAISS index for a few thousand matches is really quick, so I think it's often better to filter first, build a scratch index, and then use that for similarity: https://github.com/simonw/datasette-faiss/issues/3
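The filter-first approach described in that comment can be sketched roughly as follows. This is an illustrative sketch only, not how datasette-faiss itself works: the function name `similar_within_subset`, the boolean `keep_mask`, and the NumPy fallback used when `faiss` is not installed are all assumptions made for the example.

```python
import numpy as np

try:
    import faiss  # optional dependency, e.g. `pip install faiss-cpu`
except ImportError:
    faiss = None


def similar_within_subset(embeddings, ids, keep_mask, query, k=5):
    """Filter rows first, then search a throwaway index built from the survivors.

    Hypothetical helper: returns a list of (id, squared L2 distance) pairs.
    """
    subset = np.ascontiguousarray(embeddings[keep_mask], dtype="float32")
    subset_ids = np.asarray(ids)[keep_mask]
    query = np.ascontiguousarray(query, dtype="float32").reshape(1, -1)
    k = min(k, len(subset))
    if k == 0:
        return []
    if faiss is not None:
        # Scratch index: exact L2 search, no training step, discarded after use.
        index = faiss.IndexFlatL2(subset.shape[1])
        index.add(subset)
        distances, positions = index.search(query, k)
        distances, positions = distances[0], positions[0]
    else:
        # Assumed fallback: the same exact squared-L2 search in plain NumPy.
        all_distances = ((subset - query) ** 2).sum(axis=1)
        positions = np.argsort(all_distances, kind="stable")[:k]
        distances = all_distances[positions]
    return [(subset_ids[p].item(), float(d)) for p, d in zip(positions, distances)]
```

Here `keep_mask` stands in for whatever filter (a SQL `WHERE` clause, a tag match) narrows the rows before the similarity search; because the index is rebuilt per query, it never goes stale.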
DP_means
Posts with mentions or reviews of DP_means.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-09-04.
- LLM now provides tools for working with embeddings
I found one implementation here: https://github.com/vsmolyakov/DP_means
Alternatively, there is a Bayesian GMM in sklearn. If you restrict it to diagonal covariance matrices, you should be fine in high dimensions.
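A minimal sketch of the sklearn alternative mentioned in that comment, assuming scikit-learn's `BayesianGaussianMixture` is the "Bayesian GMM" meant. The helper name `cluster_embeddings` and the parameter values (`max_clusters=10`, `max_iter=500`) are illustrative choices, not recommendations from the original post.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def cluster_embeddings(embeddings, max_clusters=10, seed=0):
    """Cluster vectors with a Dirichlet-process GMM using diagonal covariances.

    Hypothetical helper: `max_clusters` is only an upper bound; the model can
    leave unneeded components with near-zero weight.
    """
    model = BayesianGaussianMixture(
        n_components=max_clusters,
        covariance_type="diag",  # one variance per dimension, cheap in high dimensions
        weight_concentration_prior_type="dirichlet_process",
        max_iter=500,
        random_state=seed,
    )
    labels = model.fit_predict(np.asarray(embeddings))
    return labels, model
```

With diagonal covariances each component stores one variance per dimension instead of a full d x d matrix, which is what keeps the fit tractable for embedding-sized vectors.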
What are some alternatives?
When comparing datasette-faiss and DP_means you can also consider the following projects:
llm-cluster - LLM plugin for clustering embeddings
llm-gpt4all - Plugin for LLM adding support for the GPT4All collection of models
llm-llama-cpp - LLM plugin for running models using llama.cpp