DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. semantifly

    My company (actually our two amazing interns) was working on this over the summer. We abandoned it, but it’s 85% of the way to doing what you want it to do: https://github.com/accretional/semantifly

    We stopped working on it mostly because we had higher priorities and because I became pretty disillusioned with top-K RAG. We had to build out a better workflow system anyway, and with that we could instead just have models write and run specific queries (e.g., list all .ts files containing the word “DatabaseClient”), and otherwise have their context set explicitly by users.
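
A specific query of the kind described above (listing all .ts files that contain the word "DatabaseClient") could look roughly like the following Python sketch. The function name `files_containing` and its parameters are hypothetical and are not taken from semantifly or from the commenter's workflow system.

```python
from pathlib import Path


def files_containing(root, suffix, needle):
    """Return sorted paths under `root` ending in `suffix` whose text contains `needle`."""
    matches = []
    for path in Path(root).rglob(f"*{suffix}"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than fail the whole query
        if needle in text:
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical usage: search the current directory for TypeScript files.
    for hit in files_containing(".", ".ts", "DatabaseClient"):
        print(hit)
```

A query like this is exact and cheap to run, which is presumably the appeal compared with a similarity search that returns a fixed number of loosely related chunks.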

    The problem with RAG is that simplistic implementations distract and slow down models. You probably need an implementation that makes multiple passes to prune the context down to what you need to get good results, but that’s complicated enough that you might want to build something else that gives you more bang for your buck.
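
To make the "multiple passes" idea concrete, here is a minimal sketch of one possible arrangement: a cheap recall pass, a stricter filter, and a final trim to a size budget. Everything in it is an assumption made for illustration: the name `prune_context`, the thresholds, and the keyword-overlap scoring, which stands in for whatever embedding model or reranker a real system would use.

```python
import re


def tokens(text):
    """Lowercased word set, used as a stand-in for a real relevance signal."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def prune_context(query, chunks, first_k=20, min_coverage=0.5, budget_chars=2000):
    q = tokens(query)
    if not q:
        return []
    # Pass 1: recall-oriented. Keep up to first_k chunks that share any query term.
    scored = [(len(q & tokens(c)), i, c) for i, c in enumerate(chunks)]
    candidates = sorted(
        (s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1])
    )[:first_k]
    # Pass 2: precision-oriented. Require a minimum fraction of query terms.
    kept = [s for s in candidates if s[0] / len(q) >= min_coverage]
    # Pass 3: fit the survivors into a character budget, best-scoring first.
    selected, used = [], 0
    for _, _, chunk in kept:
        if used + len(chunk) > budget_chars:
            continue
        selected.append(chunk)
        used += len(chunk)
    return selected
```

Even this toy version needs three tunable parameters, which hints at the complexity the comment is warning about.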

  2. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

  3. chonkie

    🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library

    Semantic chunking is where I would start now. Also check this out: https://github.com/chonkie-ai/chonkie
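
As a rough illustration of what semantic chunking means, the sketch below starts a new chunk whenever two adjacent sentences look unrelated. It does not use Chonkie's API. The function names are hypothetical, and the bag-of-words cosine similarity is a deliberately crude placeholder for the sentence embeddings a real implementation would compare.

```python
import math
import re
from collections import Counter


def bow(sentence):
    """Bag-of-words vector; a placeholder for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_chunks(text, threshold=0.2):
    """Group consecutive sentences, splitting where adjacent similarity drops."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(bow(prev), bow(cur)) >= threshold:
            chunks[-1].append(cur)  # similar enough: extend the current chunk
        else:
            chunks.append([cur])  # topic shift: start a new chunk
    return [" ".join(c) for c in chunks]
```

The `threshold` value is arbitrary here. With real embeddings it would need tuning against the documents being chunked.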

  4. tldw

    tl/dw (Too Long, Didn't Watch): Your Personal Research Multi-Tool - a naive attempt at 'A Young Lady's Illustrated Primer' (Open Source NotebookLM)

    Not the person you asked, but it's dependent on what you're trying to chunk. I've written a standalone chunking library for an app I'm building: https://github.com/rmusser01/tldw/blob/main/App_Function_Lib...

    It's set up so that you can perform whatever type of chunking you might prefer.

  5. webwright

    Webwright is an AI-powered terminal emulator that lives within your OS. It eliminates time spent on repetitive tasks, conjures code, summons software, and bends the OS to its will. Are you ready to release the ghost in your shell?

    https://github.com/MittaAI/webwright

    Let me know if you want to go over the code or want to discuss what works and what doesn’t. We had a loop on the action/function call “pipeline” but I changed it to just test if there was a function call or not and then just keep calling.
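
The simplified loop described in that comment (check whether the reply contains a function call, run it, and call the model again) might be sketched as follows. This is not Webwright's code. The message shapes, the `run_until_text` name, and the `model` callable are all assumptions chosen to keep the example self-contained.

```python
import json


def run_until_text(model, tools, messages, max_steps=10):
    """Keep calling `model` until it answers with text instead of a function call.

    `model` is any callable that takes the message list and returns a dict with
    either a "function_call" entry ({"name": ..., "arguments": {...}}) or a
    "content" entry. `tools` maps function names to Python callables.
    """
    for _ in range(max_steps):
        reply = model(messages)
        call = reply.get("function_call")
        if call is None:
            return reply["content"]  # no function call: this is the final answer
        result = tools[call["name"]](**call["arguments"])
        # Feed the tool result back so the next model call can use it.
        messages.append(
            {"role": "function", "name": call["name"], "content": json.dumps(result)}
        )
    raise RuntimeError("model kept requesting function calls")
```

The `max_steps` cap is an addition of this sketch, included so that a model that never stops requesting calls cannot loop forever.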

