Top 22 Python knowledge-base Projects
-
memvid
Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Project mention: Ragged – Leveraging Video Container Formats for Efficient Vector DB Distribution | news.ycombinator.com | 2025-06-28
- An open-source implementation to facilitate reproduction and adoption.
I was inspired by the innovative work of Memvid (https://github.com/Olow304/memvid), which demonstrated the potential of using video formats for data storage. My project builds on this concept with a focus on CDNs and semantic search.
I believe Ragged offers a promising solution for deploying semantic search capabilities in edge computing and serverless environments, leveraging the mature video distribution ecosystem. Sharing indexed knowledge bases as offline MP4 files could also unlock a new class of applications.
I'm eager to hear your thoughts, feedback, and any potential use cases you envision for this approach. You can find the full paper and implementation details [here](https://github.com/nikitph/ragged).
Thank you for your time, fellows.
Nikit
-
archivy
Archivy is a self-hostable knowledge repository that allows you to learn and retain information in your own personal and extensible wiki.
-
simba
Portable KMS (knowledge management system) designed to integrate seamlessly with any Retrieval-Augmented Generation (RAG) system
Project mention: Simba: Unleash the Power of Your Knowledge with this Open-Source KMS | dev.to | 2025-04-12
-
-
Project mention: Ask HN: How do you store the knowledge gained in a day? | news.ycombinator.com | 2025-05-13
I created my personal search engine to index and retrieve content I like. I like the idea of keeping track of content you like without too much effort.
https://github.com/raphaelsty/knowledge
I have not yet found a solution that suits me for knowledge that is not an online webpage or PDF.
-
Chat2Graph is a graph-native agentic system designed to bridge the power of graph databases with advanced AI capabilities. It establishes a multi-agent system (MAS) directly upon a graph database, and harnesses the inherent strengths of graph data structures, such as relationship modeling and interpretability, to enhance core AI agent capabilities like reasoning, planning, memory, and tool utilization.
With Chat2Graph, we aim to fully realize the "Graph + AI" concept by deeply integrating graph computing and AI technology into the design and implementation of agents.
GitHub: https://github.com/TuGraph-family/chat2graph
-
stark
(NeurIPS D&B 2024) STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (by snap-stanford)
-
codex
CoDEx: A set of knowledge graph Completion Datasets Extracted from Wikidata and Wikipedia (by tsafavi)
-
chatgpt-faq
A ChatGPT-powered FAQ chatbot template for connecting your external data sources to an LLM, using LlamaIndex as the backend
-
Inquisitive - self-hosted knowledge-base with a touch of LLM/RAG.
Link: https://github.com/kanishka-linux/inquisitive
I've been working on this on and off for the last couple of months to consolidate my digital knowledge base (notes, links, PDFs, etc.) with the help of LLM/RAG. It is fully self-hosted with minimal setup instructions, so anyone should be able to install it easily on their local machine.
The UI is bare-bones and still has some rough edges, but it is usable. I've been using it regularly for some time, and it works well (at least for my use case) for searching and organizing a personal knowledge base.
Feel free to try it!
Python knowledge-base discussion
Python knowledge-base related posts
-
Show HN: Yet Another Memory System for LLM's
-
Ragged – Leveraging Video Container Formats for Efficient Vector DB Distribution
-
Memvid – Video-Based AI Memory
-
Show HN: I compressed 10k PDFs into a 1.4GB video for LLM memory
-
Rete Algorithm
-
First Personal Search Engine Prototype
-
I made a simple, open source personal knowledge management app
Index
What are some of the best open-source knowledge-base projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | memvid | 8,305 |
2 | kb | 3,263 |
3 | archivy | 3,238 |
4 | simba | 1,351 |
5 | pygraft | 690 |
6 | knowledge | 684 |
7 | chat2graph | 340 |
8 | DataChad | 319 |
9 | stark | 318 |
10 | topic-db | 265 |
11 | silicon | 243 |
12 | experta | 172 |
13 | codex | 166 |
14 | typedb-bio | 85 |
15 | dewy | 72 |
16 | quilly | 37 |
17 | ckb | 24 |
18 | chatgpt-faq | 20 |
19 | Tyche | 9 |
20 | sdk | 5 |
21 | inquisitive | 4 |
22 | Linki | 3 |