Show HN: Route your prompts to the best LLM

This page summarizes the projects mentioned and recommended in the original post.

  • dspy

    DSPy: The framework for programming—not prompting—foundation models

    I agree this is an interesting direction, I think this is on the roadmap for DSPy [], but right now they mainly focus on optimizing the in-context examples.

  • llama_index

    LlamaIndex is a data framework for your LLM applications

  • openrouter-runner

    Inference engine powering open source models on OpenRouter

    I've bumped into a few of these. I use one as a model abstraction, but not as a router. Another does the same thing but with a more enterprise feel, though it's less clear how committed they are to that feature.

    That said, while I've really enjoyed the LLM abstraction (making it easy for me to test different models without changing my code), I haven't felt any desire for a router. I _do_ have some prompts that I send to gpt-3.5-turbo, and could potentially use other models, but it's kind of niche.

    In part this is because I try to do as much in a single prompt as I can, meaning I want a model that can handle the hardest parts of the prompt, and the easy parts come along for free. As a result there aren't many "easy" prompts. The easy prompts are usually text fixup and routing.

    My "routing" prompts are at a different level of abstraction, usually routing some input or activity to one of several prompts (each of which has its own context, and the sum of all contexts across those prompts is too large, hence the routing). I don't know if there's some meaningful crossover between these two routing concepts.
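    A minimal sketch of that prompt-level routing pattern: a cheap classification step (in practice often a small, fast model) picks which specialized prompt, each carrying its own context, handles the input. The prompt names and the keyword classifier below are hypothetical stand-ins for a real routing call.

```python
# Hypothetical specialized prompts, each with its own (large) context.
PROMPTS = {
    "billing": "You are a billing assistant. Context: ...",
    "tech": "You are a technical support assistant. Context: ...",
    "general": "You are a general helpdesk assistant. Context: ...",
}

def classify(user_input: str) -> str:
    """Stand-in for a cheap routing call (e.g. a short gpt-3.5-turbo prompt)."""
    text = user_input.lower()
    if any(w in text for w in ("invoice", "refund", "charged")):
        return "billing"
    if any(w in text for w in ("error", "crash", "bug")):
        return "tech"
    return "general"

def route(user_input: str) -> str:
    """Return the specialized system prompt that should handle this input."""
    return PROMPTS[classify(user_input)]
```

    The point of the pattern is that no single prompt has to carry the sum of all contexts; the router pays a small extra call to keep each downstream prompt focused.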

    Another issue I have with LLM portability is the use of tools/functions/structured output. Opus and Gemini Pro 1.5 have kind of implemented this OK, but until recently GPT was the only halfway decent implementation of this. This seems to be an "advanced" feature, yet it's also a feature I use even more with smaller prompts, as those small prompts are often inside some larger algorithm and I don't want the fuss of text parsing and exceptions from ad hoc output.
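    One way to paper over those provider differences is to ask for plain JSON and validate it yourself. The helper below is an illustrative sketch (not any particular library's API) that tolerates the markdown code fences some models wrap around JSON replies:

```python
import json

def parse_structured(raw: str, required_keys: set) -> dict:
    """Parse a model's JSON reply, tolerating ```json ... ``` fences."""
    text = raw.strip()
    if text.startswith("```"):
        # Take the content between the first pair of fences.
        text = text.split("```")[1]
        if text.startswith("json"):
            text = text[len("json"):]
    data = json.loads(text)
    missing = required_keys - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

    Raising on missing keys keeps the failure inside the calling algorithm rather than letting ad hoc text parsing silently produce bad values.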

    But in the end I'm not price sensitive in my work, so I always come back to the newest GPT model. If I make a switch to Opus it definitely won't be to save money! And I'm probably not going to want to fiddle, but instead make a thoughtful choice and switch the default model in my code.

  • semantic-router

    Superfast AI decision making and intelligent processing of multi-modal data.

    Thanks for sharing! These are useful tools, but they are a bit different: they're based more on similarity search in prompt space (a bit like semantic-router). Our router uses a neural network for the routing decisions, and it can be trained on your own prompts []. We're also working on adding support for on-prem deployment :)
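    For contrast, the similarity-search style of routing mentioned above amounts to a nearest-neighbor lookup in embedding space. The tiny 3-dimensional vectors below are stand-ins; a real setup would embed example utterances with an embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical per-route reference embeddings.
ROUTES = {
    "chitchat": [0.9, 0.1, 0.0],
    "code_help": [0.1, 0.9, 0.2],
}

def route_by_similarity(query_embedding, threshold=0.5):
    """Pick the most similar route, or None if nothing clears the threshold."""
    name, score = max(
        ((n, cosine(query_embedding, v)) for n, v in ROUTES.items()),
        key=lambda t: t[1],
    )
    return name if score >= threshold else None
```

    A trained routing model can learn decision boundaries that raw similarity search can't express, which is the distinction the comment is drawing.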

  • RAG under "Usage" step 2.

    > "Input your Unify API"

    Your product looks good in my view, although I have only spent about 10 minutes on it thus far. The docs look pretty easy to follow.

    I'll probably give this a try soon!

  • gateway

    A Blazing Fast AI Gateway. Route to 200+ LLMs with 1 fast & friendly API.

    Great to know this!

    I have come across Portkey's Open-source AI Gateway which kind of does the same.

    It looks like with more LLM adoption, resiliency- and cost-related concerns are taking off sooner than they did in past technology trends.

    I'm also wondering whether something like this could help build a better RAG pipeline or evals for a GenAI app, because at the end of the day you want to reduce hallucinations while still getting good generative responses.
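    The resiliency idea gateways advertise, a fallback chain across providers, can be sketched in a few lines; the provider callables here are hypothetical placeholders for real SDK calls.

```python
def call_with_fallback(prompt, providers):
    """Try each provider in order, falling back on failure.

    providers: list of callables that take a prompt and return text.
    """
    errors = []
    for call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # real code would catch provider-specific errors
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```

    Gateways typically layer retries, timeouts, and load balancing on top of this basic chain, but the routing decision is the same shape.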

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Synthetic Data Benchmark [pdf]

    1 project | 21 Jun 2024
  • How to create LLM fallback from Gemini Flash to GPT-4o?

    2 projects | 13 Jun 2024
  • Show HN: Anthropic's Prompt Engineering Interactive Tutorial (Web Version)

    2 projects | 18 May 2024
  • Show HN: LLM-powered NPCs running on your hardware

    4 projects | 30 Apr 2024
  • Looking for cofounders to build open reliable LLM infra

    1 project | 29 Apr 2024
