A Comprehensive Guide for Building RAG-Based LLM Applications

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llm-applications

    A comprehensive guide to building RAG-based LLM applications for production.

  • vectara-answer

    LLM-powered Conversational AI experience using Vectara

  • RAG is a very useful flow, but I agree the complexity is often overwhelming, especially as you move from a toy example to a real production deployment. It's not just choosing a vector DB (last time I checked there were about 50), managing it, and deciding how to chunk data; you also need to ensure your retrieval pipeline is accurate and fast, keep data secure and private, and manage the whole thing as it scales. That's one of the main benefits of using Vectara (https://vectara.com; FD: I work there): it's a GenAI platform that abstracts all this complexity away so you can focus on building your application.
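To make the chunking decision mentioned above concrete, here is a minimal fixed-size chunking sketch with overlap. The function name and default sizes are illustrative assumptions, not recommendations from the comment; production systems typically chunk on semantic boundaries (sentences, sections) instead.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping fixed-size character chunks.

    Overlap helps a retrieved chunk carry enough surrounding context,
    at the cost of some duplicated storage.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Chunk size and overlap are exactly the kind of knobs that need evaluation per corpus; there is no universal right answer.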

  • model.nvim

    Neovim plugin for interacting with LLMs and building editor-integrated prompts.

  • For local stuff with a handful of documents, you can even just throw it into a JSON file and call it a day. The similarity search is as simple as a call to np.dot: https://github.com/gsuuon/llm.nvim/blob/main/python3/store.p...
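A sketch of the approach the comment describes: keep the documents and their embedding vectors in a plain JSON file, then rank by cosine similarity with a single matrix-vector product. The function and field names here are illustrative, not taken from llm.nvim's actual code.

```python
import json

import numpy as np

def save_store(path, texts, embeddings):
    """Persist texts and their embedding vectors as a JSON file."""
    with open(path, "w") as f:
        json.dump(
            {"texts": texts, "vectors": [v.tolist() for v in embeddings]}, f
        )

def search(path, query_vec, top_k=3):
    """Return the top_k (text, score) pairs most similar to query_vec."""
    with open(path) as f:
        store = json.load(f)
    vectors = np.array(store["vectors"])
    # Normalize rows so the dot product equals cosine similarity.
    vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = vectors @ q  # one matrix-vector product does the whole search
    top = np.argsort(scores)[::-1][:top_k]
    return [(store["texts"][i], float(scores[i])) for i in top]
```

Brute-force search like this is perfectly adequate up to thousands of documents; a dedicated vector database only starts paying off well beyond that.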

  • LLMStack

    No-code platform to build LLM Agents, workflows and applications with your data

  • Kudos to the team for a very detailed notebook covering things like pipeline evaluation with respect to performance, costs, etc. Even if we ignore the framework-specific bits, it is a great guide to follow when building RAG systems in production.

    We have been building RAG systems in production for a few months and have been tinkering with different strategies to get the most performance out of these pipelines. As others have pointed out, a vector database may not be the right strategy for every problem. Similarly, there are issues like the "lost in the middle" problem (https://arxiv.org/abs/2307.03172) that one may have to deal with. We put together our learnings from building and optimizing these pipelines in a post at https://llmstack.ai/blog/retrieval-augmented-generation.

    https://github.com/trypromptly/LLMStack is a low-code platform we open-sourced recently that ships these RAG pipelines out of the box, with some app templates if anyone wants to try them out.
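The "lost in the middle" finding (arXiv:2307.03172) is that models attend best to the start and end of a long context. One common mitigation is to reorder retrieved passages so the strongest hits sit at the edges of the prompt; a minimal sketch (the function name is an assumption, not from the paper or LLMStack):

```python
def reorder_for_context(docs_by_relevance):
    """Interleave documents so the most relevant land at the beginning
    and end of the context, pushing weaker ones toward the middle,
    to mitigate the 'lost in the middle' effect."""
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        if i % 2 == 0:
            front.append(doc)      # even ranks fill from the start
        else:
            back.insert(0, doc)    # odd ranks fill from the end
    return front + back
```

Given documents ranked 1 (best) through 5, this yields the order 1, 3, 5, 4, 2: the two strongest passages bracket the context and the weakest sits in the middle.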

  • llama-hub

    Discontinued. A library of community-built data loaders for LLMs, to be used with LlamaIndex and/or LangChain.

  • My favorite example is the asana loader[0] for llama-index. It's literally just the most basic wrapper around the Asana SDK to concatenate some strings.

    [0] - https://github.com/emptycrown/llama-hub/blob/main/llama_hub/...
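A hypothetical sketch of what such a "thin wrapper" loader tends to look like: call the service SDK, then concatenate a few fields into one text blob per document. The `Document` class and field names below are illustrative, not llama-hub's actual code.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for a loader's output record."""
    text: str
    extra_info: dict = field(default_factory=dict)

def load_tasks(tasks):
    """Turn a list of task dicts (as an SDK might return them)
    into Documents by concatenating a couple of string fields."""
    docs = []
    for task in tasks:
        text = (task.get("name", "") + " " + task.get("notes", "")).strip()
        docs.append(Document(text=text, extra_info={"gid": task.get("gid")}))
    return docs
```

The point of the comment stands: most loaders add little beyond this glue, which is exactly why a shared community library of them is useful.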

  • pyod

    A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)

  • This is a feature in many commercial products already, as well as open source libraries like PyOD. https://github.com/yzhao062/pyod

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
