| | airweave | txtai |
|---|---|---|
| Mentions | 2 | 385 |
| Stars | 514 | 10,565 |
| Growth | 47.9% | 5.1% |
| Activity | 9.8 | 9.6 |
| Last commit | 2 days ago | 6 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
airweave
- Ask HN: What Are You Working On? (February 2025)
I'm working on Airweave (https://github.com/airweave-ai/airweave), an open-source dev tool that makes any app searchable for AI agents. It connects to a source app, database, or API and converts its contents into knowledge that agents can access. Airweave automates authentication, ingestion, enrichment, mapping, and syncing to the vector stores and graph databases of your choice. You can use it via our UI, API, or SDKs: https://docs.airweave.ai/
We originally built this for our previous agent startup as an internal solution to ensure agents could find the relevant data in the apps they're using. We then pivoted to this after some early positive reactions and decided to open-source it.
Here's a short demo: https://tinyurl.com/demo-airweave
We're two engineers and friends based in Amsterdam, NL. We just launched the project, so it's rough around the edges of course, but we're very eager to get some feedback!
Feel free to reach out to me personally if you like this!
- Show HN: Airweave – Open-Source Tool That Turns App Data into Agent Knowledge
txtai
- Chunking your data for RAG
- Analyzing LinkedIn Company Posts with Graphs and Agents
txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
- Getting started with LLM APIs
- Lists of open-source frameworks for building RAG applications
Ideal For: Projects requiring quick setup and robust search capabilities.
- Show HN: I made a website to semantically search ArXiv papers
Excellent project.
As mentioned in another comment, I've put together an embeddings database using the arxiv dataset (https://huggingface.co/NeuML/txtai-arxiv) recently.
For those interested in the literature search space, here are a couple of other projects I've worked on that may be of interest.
annotateai (https://github.com/neuml/annotateai) - Annotates papers with LLMs. Supports searching the arxiv database mentioned above.
paperai (https://github.com/neuml/paperai) - Semantic search and workflows for medical/scientific papers. Built on txtai (https://github.com/neuml/txtai)
paperetl (https://github.com/neuml/paperetl) - ETL processes for medical and scientific papers. Supports full PDF docs.
- Building Effective "Agents"
If you're looking for a lightweight open-source framework designed to handle the patterns mentioned in this article: https://github.com/neuml/txtai
Disclaimer: I'm the author of the framework.
- Postgres for Everything (E/Postgres)
I fully agree. Postgres has solved many of the problems that many are re-solving with GenAI related databases.
With txtai (https://github.com/neuml/txtai), I've gone all in with Postgres + pgvector. Projects can start small with a SQLite backend, then switch the persistence to Postgres. With this, you get all the years of battle-tested production experience from Postgres built in for free.
- Voice Activity Detection in Elixir with Membrane
VAD is certainly a complex but underappreciated topic. If you like signal processing and FFTs and want to see a similar concept implemented in Python, then check out this code. It has a fairly well-tuned VAD component built in.
https://github.com/neuml/txtai/blob/master/src/python/txtai/...
- Pinecone integrates AI inferencing with vector database
txtai (https://github.com/neuml/txtai) has had inline vectorization since 2020. It supports Transformers, llama.cpp and LLM API services. It also has inline integration with LLM models and a built-in RAG pipeline.
- Show HN: Open-Source Colab Notebooks to Implement Advanced RAG Techniques
An alternative you can try is txtai (https://github.com/neuml/txtai).
Disclaimer: I'm the primary developer
What are some alternatives?
clientai - A unified client for AI providers with built-in agent support.
llmsherpa - Developer APIs to Accelerate LLM Projects
canopy - Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
tika-python - Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
gptme - Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision.
Milvus - Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search