Top 23 vector-database Open-Source Projects

MeiliSearch

129 43,397 9.8 Rust

A lightning-fast search API that fits effortlessly into your apps, websites, and workflow

Project mention: Publish/Subscribe with Sidekiq | dev.to | 2024-02-21

We needed to introduce a new service for search. As we settled on using meilisearch, we needed a way to sync updates on our models with the records in meilisearch. We could've continued to use callbacks but we needed something better.

llama_index

75 30,910 10.0 Python

LlamaIndex is a data framework for your LLM applications

Project mention: LlamaIndex: A data framework for your LLM applications | news.ycombinator.com | 2024-04-07

Milvus

104 26,857 10.0 Go

A cloud-native vector database, storage for next generation AI applications

Project mention: Ask HN: Who is hiring? (April 2024) | news.ycombinator.com | 2024-04-01

Zilliz (zilliz.com) | Hybrid/ONSITE (SF, NYC) | Full-time
I am part of the hiring team for DevRel
NYC - https://boards.greenhouse.io/zilliz/jobs/4307910005
SF - https://boards.greenhouse.io/zilliz/jobs/4317590005
Zilliz is the company behind Milvus (https://github.com/milvus-io/milvus), the most starred vector database on GitHub. Milvus is a distributed vector database that shines in 1B+ vector use cases. Examples include autonomous driving, e-commerce, and drug discovery. (and, of course, RAG)
We are also hiring for other roles where I am not personally involved in the hiring process, such as product managers, software engineers, and recruiters.

qdrant

140 17,839 9.9 Rust

Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Project mention: Boost Your Code's Efficiency: Introducing Semantic Cache with Qdrant | dev.to | 2024-04-25

I chose Qdrant for this project because it stands for high-performance vector search, which makes it the best choice for use cases like finding similar function calls based on semantic similarity. Qdrant is not only powerful but also scalable, and it supports a variety of advanced search features that are very useful for nuanced caching mechanisms like ours.
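The idea of a semantic cache mentioned above can be illustrated with a minimal, standard-library-only sketch. It does not use Qdrant or its API: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the `SemanticCache` class, its methods, and the 0.8 threshold are hypothetical names and values chosen for illustration. In a real setup, the linear scan would presumably be replaced by a vector database query.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stands in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache that matches queries by similarity, not equality."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached value) pairs

    def put(self, query, value):
        self.entries.append((embed(query), value))

    def get(self, query):
        # Linear scan for the most similar stored query.
        q = embed(query)
        best_score, best_value = 0.0, None
        for vec, value in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_value = score, value
        # Only count it as a hit when the match is close enough.
        return best_value if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.8)
cache.put("sort a list of numbers ascending", "sorted(xs)")
hit = cache.get("sort a list of numbers ascending please")
miss = cache.get("open a file for reading")
```

Here `hit` is the cached value because the two queries differ by a single word, while `miss` is `None` because the unrelated query falls below the similarity threshold.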

Weaviate

76 9,524 10.0 Go

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

Project mention: pgvecto.rs alternatives - qdrant and Weaviate | libhunt.com/r/pgvecto.rs | 2024-03-13

orama

11 8,059 9.4 TypeScript

🌌 Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!

Project mention: Sky's the Limit! Supercharging Your Astro Blog with Orama, the Ultimate Stargazing Search Engine! | dev.to | 2023-08-03

Let's break down the steps to use Orama and analyze how it works. I won't dig into the technical stuff because, hey, it's an open-source project, which means you can easily peek at the source code, no problemo!

deeplake

13 7,708 9.8 Python

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25

txtai

355 6,990 9.3 Python

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Project mention: What contributing to Open-source is, and what it isn't | news.ycombinator.com | 2024-04-27

I tend to agree with this sentiment. Many junior devs and/or those in college want to contribute. Then they feel entitled to merge a PR that they worked hard on, often without guidance. I'm all for working with people, but projects have standards and not all ideas make sense. In many cases, especially with commercial open source, the project is the base of a company's identity. So it's not just for drive-by ideas to pad a resume or finish a school project.
For those who do want to do this, I'd recommend writing an issue and/or reaching out to the developers to engage in a dialogue. This takes work but it will increase the likelihood of a PR being merged.
Disclaimer: I'm the primary developer of txtai (https://github.com/neuml/txtai), an open-source vector database + RAG framework

RediSearch

4 5,211 9.5 C

A query and indexing engine for Redis, providing secondary indexing, full-text search, vector similarity search and aggregations.

superduperdb

24 4,346 9.9 Python

🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.

Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

paradedb

16 3,803 9.8 Rust

Postgres for Search and Analytics

Project mention: Using ClickHouse to scale an events engine | news.ycombinator.com | 2024-04-11

examples

4 2,433 9.3 Jupyter Notebook

Jupyter Notebooks to help you get hands-on with Pinecone vector databases (by pinecone-io)

featureform

28 1,674 9.7 Jupyter Notebook

The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

Project mention: Still look familiar? | /r/u_featureform | 2023-07-13

pgvecto.rs

17 1,375 9.3 Rust

Scalable, Low-latency and Hybrid-enabled Vector Search in Postgres. Revolutionize Vector Search, not Database.

Project mention: My binary vector search is better than your FP32 vectors | dev.to | 2024-03-25

To evaluate the performance metrics in comparison to the original vector approach, we conducted benchmarking using the dbpedia-entities-openai3-text-embedding-3-large-3072-1M dataset. The benchmark was performed on a Google Cloud virtual machine (VM) with specifications of n2-standard-8, which includes 8 virtual CPUs and 32GB of memory. We used pgvecto.rs v0.2.1 as the vector database.
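As a rough illustration of what binary vector search means, here is a minimal sketch. It assumes sign-based quantization (one bit per dimension) and Hamming distance for ranking, which is a common approach; the excerpt above does not describe the article's actual method, and this code does not use pgvecto.rs. The function names and the tiny 4-dimensional corpus are made up for the example.

```python
def binarize(vector):
    """Pack sign bits into an int: bit i is set when vector[i] > 0."""
    bits = 0
    for i, x in enumerate(vector):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two packed binary vectors."""
    return bin(a ^ b).count("1")


def search(query, corpus, k=2):
    """Return indices of the k corpus vectors closest in Hamming distance."""
    q = binarize(query)
    scored = sorted(
        (hamming(q, binarize(v)), idx) for idx, v in enumerate(corpus)
    )
    return [idx for _, idx in scored[:k]]


# Hypothetical FP32-style vectors; real embeddings have thousands of dimensions.
corpus = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.8, 0.3, -0.5, 0.6],
    [0.7, -0.1, -0.3, -0.9],
]
query = [0.8, -0.3, 0.5, -0.6]
top = search(query, corpus, k=2)
```

Each dimension shrinks from 32 bits to 1 bit, and the distance reduces to an XOR plus a bit count, which is where the speed and memory savings of this family of techniques come from. The trade-off is lost precision, so such schemes are often paired with a reranking step on the full vectors.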

SeaGOAT

7 911 9.7 Python

local-first semantic code search engine

Project mention: Reviewing AI Code Search Tools | dev.to | 2023-09-28

In this blog post, I’ll be comparing 3 distinct AI-first code search tools I recently came across: Cody (developed by late-stage startup, Sourcegraph), SeaGOAT (an open-source project that was trending on HN last week), and Bloop (an early-stage YC startup). I’ll be evaluating them along the dimensions of user-friendliness as well as their accuracy.

autollm

1 908 9.0 Python

Ship RAG based LLM web apps in seconds.

Project mention: FLaNK Stack Weekly 06 Nov 2023 | dev.to | 2023-11-06

attu

1 875 9.6 TypeScript

The GUI for Milvus

Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

canopy

13 873 9.8 Python

Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone

Project mention: How to choose the right type of database | dev.to | 2024-02-28

Pinecone: A scalable vector database service that facilitates efficient similarity search in high-dimensional spaces. Ideal for building real-time applications in AI, such as personalized recommendation engines and content-based retrieval systems.

vector-admin

1 808 9.5 TypeScript

The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.

Project mention: CRUD operations on Vector Databases | /r/LangChain | 2023-12-08

It comes with more than a CRUD UI: it also has other built-in tools for RAG applications. https://github.com/Mintplex-Labs/vector-admin

NeumAI

2 774 8.7 Python

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

Project mention: Show HN: Neum AI – Open-source large-scale RAG framework | news.ycombinator.com | 2023-11-21

Interesting to see that the semantic chunking in the tools library is a wrapper around GPT-4. It asks GPT for the Python code and executes it: https://github.com/NeumTry/NeumAI/blob/main/neumai-tools/neu...

qdrant-client

2 616 9.2 Python

Python client for Qdrant vector search engine

Project mention: Show HN: Chromem-go – Embeddable vector database for Go | news.ycombinator.com | 2024-04-05

There is the Qdrant lib project, https://github.com/tyrchen/qdrant-lib. The Qdrant SDK also supports local mode, which means it is embeddable: https://github.com/qdrant/qdrant-client

llmflows

1 616 8.6 Python

LLMFlows - Simple, Explicit and Transparent LLM Apps

Project mention: Show HN: LLMFlows – LangChain alternative for explicit and transparent apps | news.ycombinator.com | 2023-07-29

embedbase

5 480 9.5 TypeScript

A dead-simple API to build LLM-powered apps

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

vector-database related posts

Show HN: Chromem-go – Embeddable vector database for Go
4 projects | news.ycombinator.com | 5 Apr 2024
My binary vector search is better than your FP32 vectors
1 project | dev.to | 25 Mar 2024
pgvecto.rs alternatives - qdrant and Weaviate
3 projects | 13 Mar 2024
Milvus VS pgvecto.rs - a user suggested alternative
2 projects | 13 Mar 2024
RAG is Dead. Long Live RAG!
1 project | dev.to | 28 Feb 2024
Show HN: OasysDB, Storing vectors for RAG in Rust simplified
1 project | news.ycombinator.com | 27 Feb 2024
Show HN: Vector-Io: Universal Vector Data Import/Export
1 project | news.ycombinator.com | 12 Feb 2024

Index

What are some of the best open-source vector-database projects? This list will help you:

#	Project	Stars
1	MeiliSearch	43,397
2	llama_index	30,910
3	Milvus	26,857
4	qdrant	17,839
5	Weaviate	9,524
6	orama	8,059
7	deeplake	7,708
8	txtai	6,990
9	RediSearch	5,211
10	superduperdb	4,346
11	paradedb	3,803
12	examples	2,433
13	featureform	1,674
14	pgvecto.rs	1,375
15	SeaGOAT	911
16	autollm	908
17	attu	875
18	canopy	873
19	vector-admin	808
20	NeumAI	774
21	qdrant-client	616
22	llmflows	616
23	embedbase	480