llm-cluster VS datasette-faiss

Compare llm-cluster and datasette-faiss to see how they differ.

llm-cluster

LLM plugin for clustering embeddings (by simonw)

datasette-faiss

Maintain a FAISS index for specified Datasette tables (by simonw)
              llm-cluster          datasette-faiss
Mentions      3                    1
Stars         60                   32
Growth        -                    -
Activity      4.9                  10.0
Last commit   3 months ago         over 1 year ago
Language      Python               Python
License       Apache License 2.0   Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

llm-cluster

Posts with mentions or reviews of llm-cluster. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-24.
  • Embeddings: What they are and why they matter
    9 projects | news.ycombinator.com | 24 Oct 2023
    I'm trying to understand the clustering code but not doing too well.

    https://github.com/simonw/llm-cluster/blob/main/llm_cluster....

    So does this take each row from the DB, convert it to a numpy array (?), then use an existing model called MiniBatchKMeans (?) to go over that array and generate a bunch of labels, then add them to a dictionary and print it to the console?

  • LLM now provides tools for working with embeddings
    7 projects | news.ycombinator.com | 4 Sep 2023
    I imagine there are all kinds of improvements that could be made to this kind of thing.

    I'd love to understand if there's a good way to automatically pick an interesting number of clusters, as opposed to picking a number at the start.

    https://github.com/simonw/llm-cluster/blob/main/llm_cluster....
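The flow described in the first post above (rows to a NumPy array, MiniBatchKMeans, labels grouped into a dictionary, printed to the console) can be sketched roughly as follows. This is a minimal illustration and not the llm-cluster source: the `cluster_rows` helper, the toy data, and the `n_init` and `seed` settings are all assumptions made for the example, and the rows are generated in memory rather than read from a database.

```python
import json
from collections import defaultdict

import numpy as np
from sklearn.cluster import MiniBatchKMeans


def cluster_rows(rows, n_clusters, seed=0):
    """Group (id, embedding) pairs by their k-means cluster label.

    Hypothetical helper for illustration; not part of llm-cluster.
    """
    ids = [row_id for row_id, _ in rows]
    # Stack the per-row vectors into one (n_rows, n_dims) array.
    matrix = np.array([vector for _, vector in rows], dtype=np.float32)
    model = MiniBatchKMeans(n_clusters=n_clusters, n_init=3, random_state=seed)
    labels = model.fit_predict(matrix)  # one integer label per row
    clusters = defaultdict(list)
    for row_id, label in zip(ids, labels):
        clusters[int(label)].append(row_id)
    return dict(clusters)


# Toy stand-in for rows read from a database: two well-separated blobs.
rng = np.random.default_rng(0)
rows = [(f"a{i}", rng.normal(0.0, 0.1, 8)) for i in range(20)]
rows += [(f"b{i}", rng.normal(5.0, 0.1, 8)) for i in range(20)]

clusters = cluster_rows(rows, n_clusters=2)
print(json.dumps(clusters, indent=2))
```

Note that the number of clusters is still chosen up front here, which is the limitation the second post raises.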

datasette-faiss

Posts with mentions or reviews of datasette-faiss. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-09-04.
  • LLM now provides tools for working with embeddings
    7 projects | news.ycombinator.com | 4 Sep 2023
    I experimented with that a few months ago. Building a fresh FAISS index for a few thousand matches is really quick, so I think it's often better to filter first, build a scratch index and then use that for similarity: https://github.com/simonw/datasette-faiss/issues/3

What are some alternatives?

When comparing llm-cluster and datasette-faiss you can also consider the following projects:

telekinesis - Control Objects and Functions Remotely

llm-gpt4all - Plugin for LLM adding support for the GPT4All collection of models

roadmap - This is the public roadmap for Salesforce Heroku services.

DP_means - Dirichlet Process K-means

DBoW2 - Enhanced hierarchical bag-of-word library for C++

bert - TensorFlow code and pre-trained models for BERT

vectordb - A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.

marqo - Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

supabase - The open source Firebase alternative.