Python vector-database

Open-source Python projects categorized as vector-database

Top 23 Python vector-database Projects

  • llama_index

    LlamaIndex is a data framework for your LLM applications

  • Project mention: LlamaIndex: A data framework for your LLM applications | news.ycombinator.com | 2024-04-07
  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

  • Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25
  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

  • Project mention: Build knowledge graphs with LLM-driven entity extraction | dev.to | 2024-02-21

    txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

  • superduperdb

    🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.

  • Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • SeaGOAT

    local-first semantic code search engine

  • Project mention: Reviewing AI Code Search Tools | dev.to | 2023-09-28

    In this blog post, I’ll be comparing 3 distinct AI-first code search tools I recently came across: Cody (developed by late-stage startup, Sourcegraph), SeaGOAT (an open-source project that was trending on HN last week), and Bloop (an early-stage YC startup). I’ll be evaluating them along the dimensions of user-friendliness as well as their accuracy.

  • autollm

    Ship RAG based LLM web apps in seconds.

  • Project mention: FLaNK Stack Weekly 06 Nov 2023 | dev.to | 2023-11-06
  • canopy

    Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone

  • Project mention: How to choose the right type of database | dev.to | 2024-02-28

    Pinecone: A scalable vector database service that facilitates efficient similarity search in high-dimensional spaces. Ideal for building real-time applications in AI, such as personalized recommendation engines and content-based retrieval systems.

  • NeumAI

    Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

  • Project mention: Show HN: Neum AI – Open-source large-scale RAG framework | news.ycombinator.com | 2023-11-21

    Interesting to see that the semantic chunking in the tools library is a wrapper around GPT-4. Asks GPT for the python code and executes it: https://github.com/NeumTry/NeumAI/blob/main/neumai-tools/neu...

  • llmflows

    LLMFlows - Simple, Explicit and Transparent LLM Apps

  • Project mention: Show HN: LLMFlows – LangChain alternative for explicit and transparent apps | news.ycombinator.com | 2023-07-29
  • qdrant-client

    Python client for Qdrant vector search engine

  • Project mention: Show HN: Chromem-go – Embeddable vector database for Go | news.ycombinator.com | 2024-04-05

    There is the qdrant-lib project (https://github.com/tyrchen/qdrant-lib). The Qdrant SDK also supports a local mode, which makes it embeddable: https://github.com/qdrant/qdrant-client

  • vectordb

    A Python vector database you just need - no more, no less. (by jina-ai)

  • Project mention: A Python Vector Database | news.ycombinator.com | 2023-08-13
  • langchain-chatbot

    AI Chatbot for analyzing/extracting information from data in conversational format.

  • Project mention: Legalyze – AI for Lawyers to Query Case Files | news.ycombinator.com | 2023-05-21

    We have built Legalyze.ai, a tool for lawyers to query thousands of files at once. We are using Langchain in coordination with GPT-4 and Pinecone to query massive sets of data at once.

    Lawyers can also generate procedural documents like motions and requests using their case as context.

    Contact [email protected] for a trial and check out our open source project - https://github.com/Haste171/langchain-chatbot

  • vector-db-benchmark

    Framework for benchmarking vector search engines

  • Project mention: RAG is Dead. Long Live RAG! | dev.to | 2024-02-28

    Qdrant’s benchmark results are strongly in favor of accuracy and efficiency. We recommend that you consider them before deciding that an LLM is enough. Take a look at our open-source benchmark reports and try out the tests yourself.

  • ChatData

    ChatData 🔍 📖 brings RAG to real applications with FREE✨ knowledge bases. Now enjoy your chat with 6 million wikipedia pages and 2 million arxiv papers.

  • Project mention: Show HN: ChatData – an open-source ChatGPT-like chatbot | news.ycombinator.com | 2023-11-28

    Hey there, wonderful Hacker News community! We're excited to share something special with you - ChatData. This isn't just another chat-with-documents app; it's a game-changer that melds MyScale and LangChain, empowering you to query millions of files effortlessly.

    ChatData redefines the conversation between you and knowledge. Explore the MyScale free knowledge base or delve into your uploaded documents for tailored insights and answers.

    Retriever Type: Fueled by the Retrieval Augmented Generation (RAG) framework, ChatData introduces the Self-querying retriever and VectorSQL. Build intricate queries effortlessly using LangChain, covering everything from timestamps to arrays of strings.

    Session Management: Elevate your chat experience with intuitive session management. Customize your session ID, tweak prompts, and guide ChatData through your queries with ease. It's like having a personal conversation with your knowledge!

    Build Your Own Knowledge Base: Beyond MyScale's external knowledge base, ChatData invites you to upload your files using the Unstructured API. Your privacy matters - only processed texts are stored. It's your knowledge, your way!

    Whether you're a researcher, a student, or just someone hungry for knowledge, ChatData simplifies your journey through vast data. Unleash the true potential of information retrieval and explore a world of knowledge with a friendly touch.

    We genuinely can't wait to hear your thoughts and feedback. Let's embark on this exciting journey of knowledge discovery together with ChatData (https://github.com/myscale/ChatData)!

  • DocumentGPT

    DocumentGPT is a web application that allows you to chat over your research document using OpenAI's chat API and perform semantic search using vector databases. This tool provides a seamless interface for interacting with your research document, exploring search results, and engaging in a conversation with an AI chatbot.

  • Project mention: DocumentGPT with Agents | /r/StreamlitOfficial | 2023-07-07

    Was really excited to get everything working! Check it out at: https://github.com/aju22/DocumentGPT

  • relevanceai

    Home of the AI workforce - Multi-agent system, AI agents & tools

  • citrus

    (distributed) vector database (by 0xDebabrata)

  • Project mention: Created a smol vector database in my free time. Looking to provide a LangChain integration soon! | /r/LangChain | 2023-05-06

    It supports all the basic features like creating an index, inserting vectors and searching through them. Here's the GitHub link if anyone's interested in going over it: https://github.com/0xDebabrata/citrus

  • NeoGPT

    Your Local AI Assistant: Seamlessly Chat, Execute Commands, and Interpret Code with Local Models for Ultimate Privacy.

  • Project mention: HacktoberRest | dev.to | 2023-11-01

    One of the most interesting projects I came across this month was NeoGPT. It's a GPT based application that is being built to converse with documents and videos. While still in its infancy, the project has outlined a cool roadmap and has a very active base of contributors continuously expanding on its functionality. The project appeals to my desire to learn how to work with AI and neural networks. It is also at a development stage that it is not outside of the reach of my comprehension. Icing on the cake being it's Py based, which is my sharpest tool at the moment. I see it as a decent project to stay tapped into and grow my skills as the application develops.

  • biochatter

    Backend library for conversational AI in biomedicine

  • Project mention: [D] ChatGPT4 doesn’t cut it for my work. Need a more accurate tool. | /r/MachineLearning | 2023-12-06

    We have a research-focused framework for these kinds of tasks here: https://github.com/biocypher/biochatter. Requests and contributions welcome.

  • markdown-file-query

    Semantic QA with a markdown database: Query any markdown file using vector embedding, Pinecone vector database and GPT (langchain). A weaker version of privateGPT

  • Project mention: [P] I built a Chatbot to talk with any Github Repo. 🪄 | /r/MachineLearning | 2023-04-29
  • YassQueenDB

    Graph database library that allows you to store, analyze, and search through your data in a graph format. By using the Universal Sentence Encoder, it provides an efficient and semantic approach to handle text data. 📚🧠🚀

  • vektor

    a mini vector database implementation that intends to be educational and interpretable (by notallm)

  • Project mention: Weekly Thread: What questions do you have about vector databases? | /r/vectordatabase | 2023-07-12
  • QDrant-NLP

    QDrant-NLP

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
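
The citrus entry above describes the basic operations of a vector database: creating an index, inserting vectors, and searching through them. As a rough illustration only, those three steps can be sketched in plain Python with a brute-force cosine-similarity scan. The class name TinyVectorIndex and its methods are hypothetical and are not taken from any project on this list; real engines typically rely on approximate nearest-neighbor indexes rather than a linear scan.

```python
import math


class TinyVectorIndex:
    """Hypothetical brute-force index: stores vectors, ranks them by cosine similarity."""

    def __init__(self, dim):
        self.dim = dim
        self.ids = []
        self.vectors = []

    def _normalize(self, vector):
        # Reject wrong-sized or zero vectors, then scale to unit length.
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("zero vectors have no direction to compare")
        return [x / norm for x in vector]

    def insert(self, item_id, vector):
        # Unit-length storage means search only needs a dot product.
        self.ids.append(item_id)
        self.vectors.append(self._normalize(vector))

    def search(self, query, top_k=3):
        # Linear scan: score every stored vector, keep the best top_k.
        q = self._normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), item_id)
            for item_id, v in zip(self.ids, self.vectors)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]


index = TinyVectorIndex(dim=3)
index.insert("a", [1.0, 0.0, 0.0])
index.insert("b", [0.0, 1.0, 0.0])
index.insert("c", [0.9, 0.1, 0.0])

results = index.search([1.0, 0.0, 0.0], top_k=2)
print([item_id for item_id, _ in results])  # ids ranked "a" first, then "c"
```

A linear scan like this is only practical for small collections, which is one reason the larger projects on the list exist.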

Index

What are some of the best open-source vector-database projects in Python? This list will help you:

Rank  Project               Stars
1     llama_index           30,910
2     deeplake              7,708
3     txtai                 6,953
4     superduperdb          4,346
5     SeaGOAT               911
6     autollm               908
7     canopy                873
8     NeumAI                774
9     llmflows              615
10    qdrant-client         608
11    vectordb              462
12    langchain-chatbot     371
13    vector-db-benchmark   224
14    ChatData              133
15    DocumentGPT           99
16    relevanceai           97
17    citrus                92
18    NeoGPT                55
19    biochatter            40
20    markdown-file-query   25
21    YassQueenDB           14
22    vektor                12
23    QDrant-NLP            11
