vectordb VS llm-cluster

Compare vectordb and llm-cluster to see how they differ.

vectordb

A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search. (by kagisearch)

llm-cluster

LLM plugin for clustering embeddings (by simonw)
              vectordb      llm-cluster
Mentions      6             3
Stars         552           59
Growth        5.1%          -
Activity      7.6           4.9
Last commit   1 day ago     3 months ago
Language      Python        Python
License       MIT License   Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

vectordb

Posts with mentions or reviews of vectordb. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-26.
  • VectorDB: Vector Database Built by Kagi Search
    9 projects | news.ycombinator.com | 26 Nov 2023
    We needed a low-latency, on-premises solution that we can run on edge nodes (so lightweight) with sane defaults that anyone in the team can whim in a sec.

    Result is this and we constantly benchmark performance of different embeddings to ensure best defaults.

    [1] https://github.com/kagisearch/vectordb#embeddings-performanc...

  • Embeddings: What they are and why they matter
    9 projects | news.ycombinator.com | 24 Oct 2023
    If you are looking for a lightweight, low-latency, fully local, end-to-end solution (chunking, embedding, storage and vector search), try vectordb [1]

    Just spent a day updating it with latest benchmarks for text embedding models.

    [1] https://github.com/kagisearch/vectordb
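
The "chunking, embedding, storage and vector search" pipeline mentioned in the posts above can be illustrated with a toy sketch. This is not vectordb's API: `ToyStore`, `chunk`, `embed`, and `cosine` are hypothetical names, and the bag-of-words "embedding" is only a stand-in for a real embedding model.

```python
import math
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(w.strip(".,").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyStore:
    """Hypothetical in-memory store: keeps (chunk, vector) pairs in a list."""

    def __init__(self):
        self.items = []

    def save(self, text):
        # Chunk the text, embed each chunk, and store both together.
        for piece in chunk(text):
            self.items.append((piece, embed(piece)))

    def search(self, query, top_n=1):
        # Brute-force vector search: rank every chunk by similarity to the query.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [piece for piece, _ in ranked[:top_n]]


store = ToyStore()
store.save("Embeddings map text to vectors. Similar texts end up close together in vector space.")
store.save("K-means groups points into clusters around centroids.")
results = store.search("group points into clusters", top_n=1)
print(results)
```

A real system would replace `embed` with a learned model and the linear scan with an approximate nearest-neighbour index; the overall flow stays the same.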

llm-cluster

Posts with mentions or reviews of llm-cluster. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-24.
  • Embeddings: What they are and why they matter
    9 projects | news.ycombinator.com | 24 Oct 2023
    I'm trying to understand the clustering code but not doing too well.

    https://github.com/simonw/llm-cluster/blob/main/llm_cluster....

    So does this take each row from the DB, convert it to a numpy array (?), then use an existing model called MiniBatchKMeans (?) to go over that array and generate a bunch of labels, then add them to a dictionary and print it to the console?

  • LLM now provides tools for working with embeddings
    7 projects | news.ycombinator.com | 4 Sep 2023
    I imagine there are all kinds of improvements that could be made to this kind of thing.

    I'd love to understand if there's a good way to automatically pick an interesting number of clusters, as opposed to picking a number at the start.

    https://github.com/simonw/llm-cluster/blob/main/llm_cluster....
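
The flow described in the first post above (rows to a numpy array, MiniBatchKMeans labels, then a dictionary printed to the console) can be sketched roughly as follows. This is an illustration under assumptions, not the llm-cluster source: the `rows` data and the `cluster_rows` helper are hypothetical, and the number of clusters is fixed up front, as the second post notes.

```python
import json
from collections import defaultdict

import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Hypothetical stand-in for rows read from an embeddings table: (id, vector).
rows = [
    ("a", [0.0, 0.1]),
    ("b", [0.1, 0.0]),
    ("c", [5.0, 5.1]),
    ("d", [5.1, 5.0]),
]


def cluster_rows(rows, n_clusters):
    """Group row ids by the cluster label that MiniBatchKMeans assigns them."""
    ids = [row_id for row_id, _ in rows]
    # Stack the embedding vectors into one 2-D array, one row per item.
    matrix = np.array([vector for _, vector in rows], dtype=float)
    model = MiniBatchKMeans(n_clusters=n_clusters, n_init=3, random_state=0)
    model.fit(matrix)
    # model.labels_ holds one integer cluster label per input row.
    clusters = defaultdict(list)
    for row_id, label in zip(ids, model.labels_):
        clusters[str(label)].append(row_id)
    return dict(clusters)


clusters = cluster_rows(rows, n_clusters=2)
print(json.dumps(clusters, indent=2))
```

The label numbers themselves are arbitrary; only the grouping of ids is meaningful.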

What are some alternatives?

When comparing vectordb and llm-cluster you can also consider the following projects:

langroid - Harness LLMs with Multi-Agent Programming

telekinesis - Control Objects and Functions Remotely

onnxruntime - ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

roadmap - This is the public roadmap for Salesforce Heroku services.

txtai - 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

DBoW2 - Enhanced hierarchical bag-of-word library for C++

datasette-faiss - Maintain a FAISS index for specified Datasette tables

marqo - Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

DP_means - Dirichlet Process K-means

supabase - The open source Firebase alternative.

bert - TensorFlow code and pre-trained models for BERT