MemoRAG – Enhance RAG with memory-based knowledge discovery for long contexts

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. MemoRAG

    Empowering RAG with a memory-based data interface for all-purpose applications!

    You've got all the details right though, so that's pretty impressive :). AFAICT from a quick glance at the code (https://github.com/qhjqhj00/MemoRAG/blob/main/memorag/memora...), it is indeed "fine-tuning" (jargon!) a model on your chosen book, presumably in the most basic/direct sense: asking it to reproduce sections of text at random from the book given their surrounding context, and rewarding/penalizing the neural network based on how well it does.
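    For the curious, here's roughly what that kind of loop looks like in vanilla Transformers. This is a minimal sketch of standard causal-LM fine-tuning, not MemoRAG's actual code; the model name, chunk size, and "book.txt" path are illustrative guesses on my part:

    ```python
    # Minimal causal-LM fine-tuning sketch -- NOT MemoRAG's actual code.
    # Model name, chunk size, and "book.txt" are illustrative assumptions.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # small stand-in for whatever base model you'd use
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Tokenize the whole book, then split it into fixed-length chunks.
    tokens = tokenizer(open("book.txt").read(), return_tensors="pt").input_ids[0]
    chunk_len = 512
    chunks = [tokens[i:i + chunk_len]
              for i in range(0, len(tokens) - chunk_len, chunk_len)]

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for chunk in chunks:
        input_ids = chunk.unsqueeze(0)  # batch of one
        # labels == input_ids is the standard LM objective: the loss rewards
        # reproducing the book text given its preceding context.
        loss = model(input_ids=input_ids, labels=input_ids).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    # The "offline" artifact: a modified copy of the base model.
    model.save_pretrained("book-finetuned")
    tokenizer.save_pretrained("book-finetuned")
    ```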

    The comment mentions GPU memory in the Colab notebook merely because this process is expensive -- "fine-tuning" is the same thing as "training", just with a nearly-complete starting point. Thus the call to `AutoModelForCausalLM.from_pretrained()`. To answer your question explicitly: the fine-tuning step creates a modified version of the base model as an "offline" step, so the memory requirements during inference (aka "online" operation) are unaffected, both in terms of storage and GPU VRAM. I'm not the dev though, so apologies if I'm off base!
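    Concretely, the offline/online split looks like this: the checkpoint saved above loads exactly like the base model did, so inference costs don't change. (Again, a sketch under my own assumed paths and names, not the repo's API:)

    ```python
    # "Online" side: loading the fine-tuned checkpoint is identical to
    # loading the base model, so VRAM and storage needs are unchanged.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("book-finetuned")
    model = AutoModelForCausalLM.from_pretrained("book-finetuned")
    model.eval()

    prompt = "In the opening chapter, the narrator"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(input_ids, max_new_tokens=50,
                             pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    ```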

    I would passionately argue that this step is more of a small addition to the overall pipeline than a core necessity, though. Fine-tuning is really good for teaching a model to recreate style, tone, structure, and other linguistic details, but it's not a feasible way to teach it facts. That's what "RAG" is for: making up for this deficiency in fine-tuning (there's a toy sketch of that division of labor at the end of this comment).

    In other words, this repo is basically like that post from a few weeks back that was advocating for "modular monoliths" that employ both strategies (monolith vs. microservices) in a deeply collaborative way. And my reaction is the same: I'm not convinced the details of this meshing will be very revolutionary, but the idea itself is deceptively clever!
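    To make the fine-tuning-vs-RAG division of labor concrete, here's the toy retrieval sketch promised above. TF-IDF retrieval, paragraph chunking, and the file path are all my own illustrative assumptions, not MemoRAG's actual retriever; the point is just that retrieval supplies the facts while the (possibly fine-tuned) model supplies the language:

    ```python
    # Toy RAG sketch: retrieve relevant book chunks, then stuff them into
    # the prompt. TF-IDF stands in for a real retriever/embedding model.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    chunks = [p for p in open("book.txt").read().split("\n\n") if p.strip()]
    vectorizer = TfidfVectorizer().fit(chunks)
    chunk_vecs = vectorizer.transform(chunks)

    def retrieve(query: str, k: int = 3) -> list[str]:
        """Return the k chunks most similar to the query."""
        scores = cosine_similarity(vectorizer.transform([query]), chunk_vecs)[0]
        return [chunks[i] for i in scores.argsort()[::-1][:k]]

    question = "Who betrays the protagonist, and why?"
    context = "\n\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n\n{context}\n\nQ: {question}\nA:"
    # `prompt` then goes to the model -- fine-tuned or not.
    ```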


Related posts

  • Document Search in .NET with Kernel Memory

    2 projects | dev.to | 20 May 2025
  • Kernel Memory document ingestion

    1 project | dev.to | 17 Dec 2024
  • Kernel Memory with Azure OpenAI, Blob storage and AI Search services

    1 project | dev.to | 17 Dec 2024
  • Kernel Memory with Cosmos DB for NoSQL vector search.

    2 projects | dev.to | 17 Dec 2024
  • Open source alternative to ChatGPT and ChatPDF-like AI tools

    6 projects | news.ycombinator.com | 9 Dec 2023
