SaaSHub helps you find the best software and product alternatives Learn more →
Top 23 Python vector-database Projects
-
llama_index
Project mention: Quick tip: Replace MongoDB® Atlas with SingleStore Kai in LlamaIndex | dev.to | 2025-01-21
The notebook is adapted from the LlamaIndex GitHub repo.
-
mem0
Project mention: 13 GitHub Projects that Supercharge Your AI and Development Journey 🚀 | dev.to | 2025-03-03
Stars: 25085 Author: mem0ai Star the mem0 repository⭐
-
txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
-
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Finally, we stored these vectors in our chosen database: the activeloop DeepLake database. This database is open source, something near and dear to our own open-source hearts. We will cover some additional details in a further section, but it is specifically designed to handle vector data and perform efficient similarity searches, which is crucial for quick and accurate retrieval during the RAG process.
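The similarity search mentioned in this excerpt can be sketched as a brute-force cosine-similarity lookup. This is a minimal illustration using only the standard library, not Deep Lake's API; the names `cosine_similarity`, `top_k`, and the sample `store` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from an embedding model.
store = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], store, k=2)
print(results)
```

A vector database typically replaces the linear scan in `top_k` with an approximate index so that retrieval stays fast as the collection grows.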
-
lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
Project mention: The Best Way to Use Text Embeddings Portably Is with Parquet and Polars | news.ycombinator.com | 2025-02-24
For another library that has great performance and features like full-text indexing and the ability to version changes, I’d recommend lancedb: https://lancedb.github.io/lancedb/
Yes, it’s a vector database and has more complexity. But you can use it without creating indexes, and it also has excellent zero-copy Arrow support for Polars and pandas.
-
deep-searcher
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Project mention: Deep Searcher, Open source deep researcher on your private data | news.ycombinator.com | 2025-02-21
GitHub: https://github.com/zilliztech/deep-searcher
-
raptor
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Project mention: RAPTOR: A Novel Tree-Based Retrieval System for Enhancing Language Models – Research Summary | dev.to | 2024-12-13
This study introduces RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval), a novel tree-based retrieval system designed to enhance search capabilities for extended language models.
-
qdrant-client
Project mention: Show HN: Chromem-go – Embeddable vector database for Go | news.ycombinator.com | 2024-04-05
There is the Qdrant lib project, https://github.com/tyrchen/qdrant-lib. The Qdrant SDK also supports local mode, which makes it embeddable: https://github.com/qdrant/qdrant-client
-
NeumAI
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.
-
airweave
Project mention: Ask HN: What Are You Working On? (February 2025) | news.ycombinator.com | 2025-02-23
I'm working on Airweave https://github.com/airweave-ai/airweave , an open-source dev tool that makes any app searchable for AI agents. It connects to a source app, db, or API and converts its contents to accessible knowledge for agents. Airweave automates authentication, ingestion, enrichment, mapping, and syncing to vector stores and graph databases of choice. You can use it via our UI, API, or SDKs https://docs.airweave.ai/
We originally built this for our previous agent startup as an internal solution to ensure agents could find the relevant data on apps they're using. We then pivoted to this after some early positive reactions and decided to open-source it.
Here's a short demo: https://tinyurl.com/demo-airweave
We're two engineers/friends based in Amsterdam, NL. We just launched the project, so it's rough around the edges ofc, but we're very eager to get some feedback!
Feel free to reach out to me personally if you like this!
-
langchain-chatbot
AI Chatbot for analyzing/extracting information from data in conversational format.
-
RedisVL
-
vicinity
Project mention: Show HN: Vicinity – Fast, Lightweight Nearest Neighbors with Flexible Back Ends | news.ycombinator.com | 2024-12-01
Not author of the library, but the documentation lists the backends here: https://github.com/MinishLab/vicinity?tab=readme-ov-file#sup...
So these are nearest neighbor search implementations, not database backends.
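The distinction drawn in this comment, interchangeable nearest-neighbor implementations behind one interface rather than database backends, can be sketched as follows. This is an illustrative sketch under stated assumptions and not Vicinity's actual API; `NearestNeighborBackend`, `BruteForceBackend`, and `build_index` are hypothetical names, and exact Euclidean search stands in for whichever algorithm a real backend would use.

```python
from __future__ import annotations

import math
from typing import Protocol, Sequence

Vector = Sequence[float]


class NearestNeighborBackend(Protocol):
    """Hypothetical interface that any search implementation could satisfy."""

    def insert(self, key: str, vector: Vector) -> None: ...

    def query(self, vector: Vector, k: int) -> list[tuple[str, float]]: ...


class BruteForceBackend:
    """Exact search: compares the query against every stored vector."""

    def __init__(self) -> None:
        self._items: dict[str, Vector] = {}

    def insert(self, key: str, vector: Vector) -> None:
        self._items[key] = vector

    def query(self, vector: Vector, k: int) -> list[tuple[str, float]]:
        # Smaller Euclidean distance means a closer neighbor.
        scored = [(key, math.dist(vector, stored)) for key, stored in self._items.items()]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]


def build_index(backend: NearestNeighborBackend, items: dict[str, Vector]) -> NearestNeighborBackend:
    """Load items into whichever backend was passed in."""
    for key, vector in items.items():
        backend.insert(key, vector)
    return backend


index = build_index(
    BruteForceBackend(),
    {"cat": [0.0, 0.0], "dog": [1.0, 0.0], "car": [5.0, 5.0]},
)
neighbors = index.query([0.9, 0.1], k=2)
print(neighbors)
```

Because callers depend only on `insert` and `query`, an approximate algorithm could be swapped in without changing `build_index` or the calling code.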
-
everything-ai
Get the source code (and leave a little ⭐ while you're there): https://github.com/AstraBert/everything-ai Get a quick-start with the documentation: https://astrabert.github.io/everything-ai/
-
ChatData
ChatData 🔍 📖 brings RAG to real applications with FREE✨ knowledge bases. Now enjoy your chat with 6 million wikipedia pages and 2 million arxiv papers.
-
pixeltable
Pixeltable — AI Data infrastructure providing a declarative, incremental approach for multimodal workloads.
Project mention: Pixeltable: Store, transform, index, and iterate on data for ML | news.ycombinator.com | 2024-12-17
Python vector-database related posts
-
13 GitHub Projects that Supercharge Your AI and Development Journey 🚀
-
Deep Searcher, Open source deep researcher on your private data
-
Show HN: Vicinity – Fast, Lightweight Nearest Neighbors with Flexible Back Ends
-
Ask HN: Local RAG with private knowledge base
-
AIM Weekly 28 Oct 2024
-
vec2pg: Migrate to pgvector from Pinecone and Qdrant
-
Removing stuff is never obvious yet often better
-
A note from our sponsor - SaaSHub
www.saashub.com | 20 Mar 2025
Index
What are some of the best open-source vector-database projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | llama_index | 40,107 |
2 | mem0 | 26,246 |
3 | txtai | 10,565 |
4 | deeplake | 8,467 |
5 | lancedb | 5,869 |
6 | deep-searcher | 4,179 |
7 | raptor | 1,140 |
8 | SeaGOAT | 1,050 |
9 | qdrant-client | 900 |
10 | NeumAI | 847 |
11 | rag-demystified | 816 |
12 | llmflows | 688 |
13 | vectordb | 595 |
14 | airweave | 514 |
15 | langchain-chatbot | 414 |
16 | GradCache | 378 |
17 | vector-db-benchmark | 314 |
18 | redis-vl-python | 271 |
19 | vicinity | 258 |
20 | everything-ai | 232 |
21 | relevanceai | 221 |
22 | ChatData | 168 |
23 | pixeltable | 161 |