Rust LLM

Open-source Rust projects categorized as LLM

Large Language Models

Top 23 Rust LLM Projects

  1. burn

    Burn is a next-generation deep learning framework that doesn't compromise on flexibility, efficiency, or portability.

    Project mention: Conduit: A UI-less node-based system | dev.to | 2025-05-03

    I intend to grow this into an open-source project because deep inside, this is ideally how I would like ComfyUI to be. There's still a long journey ahead for building all the custom nodes, which is especially challenging given that the majority of code for AI workflows is written in Python. However, with my hands-on experience with Candle and Burn libraries, I may be able to get pretty close!

  3. aichat

    All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.

    Project mention: A tiny Go tool for generating conventional commits using Claude | news.ycombinator.com | 2025-03-24


    I like that your tool includes some previous commit messages... I'm not sure if that can be done with `aichat` but it seems like a great idea.

    I'd be tempted to wrap individual commit messages in pseudo-xml tags, as Claude really likes those[^2] and the `%B` format doesn't really show the breaks between commit messages.

    1: https://github.com/sigoden/aichat
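The pseudo-XML wrapping suggested above can be sketched in a few lines of Python (an illustration only; the `<commit>` tag name and sample messages are invented, not part of aichat):

```python
# Sketch: wrap each prior commit message in pseudo-XML tags so the
# boundaries between messages are unambiguous to the model.
def wrap_commits(messages):
    """Join commit messages, each delimited by <commit>...</commit> tags."""
    return "\n".join(f"<commit>\n{m.strip()}\n</commit>" for m in messages)

prompt = wrap_commits([
    "feat: add retry logic to HTTP client",
    "fix: handle empty config file",
])
print(prompt)
```

The resulting string could then be prepended to whatever instruction you pass to the CLI, so the model sees clearly delimited examples rather than `%B` output with no visible breaks.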

  4. postgresml

    Postgres with GPUs for ML/AI apps.

    Project mention: Postgres Learns to RAG: Wikipedia Q&A using Llama 3.1 inside the database | news.ycombinator.com | 2024-09-24

    GitHub: https://github.com/postgresml/postgresml

    Looking forward to your feedback and any questions about the technical details.

  5. mistral.rs

    Blazingly fast LLM inference.

    Project mention: Thoughts on Mistral.rs | news.ycombinator.com | 2025-04-29
  6. code2prompt

    A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.

    Project mention: Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison | news.ycombinator.com | 2025-03-31
  7. deepclaude

    A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models.

    Project mention: OpenAI O3-Mini | news.ycombinator.com | 2025-01-31

    If you would like to see the CoT process visualized, try the “Improve prompt” feature in Anthropic console. Also check out https://github.com/getAsterisk/deepclaude

  8. tensorzero

    TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.

    Project mention: Ask HN: Freelancer? Seeking freelancer? (April 2025) | news.ycombinator.com | 2025-04-01

    SEEKING FREELANCER

    TensorZero | https://github.com/tensorzero/tensorzero | Staff Front-end / Design Engineer | Remote or Onsite (NYC) | Full-time or Part-time

    TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.

    We're looking for a contract / freelance Staff Front-end / Design Engineer with the following skillset:

    ‣ Must have: expert in TypeScript, React, and web fundamentals

    ‣ Nice to have: familiar with LLMs, experience with Vite / React Router V7 (RemixJS) / Tailwind

    What we offer:

    ‣ Vast majority of your work → open source

    ‣ Flexible arrangement: remote or onsite (NYC), full-time or part-time

    ‣ Small and entirely technical team: former Rust compiler maintainer, ML researchers with thousands of citations, decacorn CPO

    ‣ Engagement expected to last a few months

    ‣ Compensation in line with staff+ experience

    Also hiring full-time employees: https://news.ycombinator.com/item?id=43569646

    Apply: [email protected]

  10. text-embeddings-inference

    A blazing fast inference solution for text embeddings models

    Project mention: Zero-Shot Text Classification on a low-end CPU-only machine? | news.ycombinator.com | 2024-10-07

    Hugging Face maintains a package named Text Embeddings Inference (TEI) with GPU- and CPU-optimized container images. While I have only used it for hosting embedding models, it does appear to support RoBERTa-architecture classifiers (specifically sentiment analysis).

    https://github.com/huggingface/text-embeddings-inference

    You can always run a zero shot pipeline in HF with a simple Flask/FastAPI application.

  11. baml

    The AI framework that adds the engineering to prompt engineering (Python/TS/Ruby/Java/C#/Rust/Go compatible)

    Project mention: Use the Gemini API with OpenAI Fallback in TypeScript | news.ycombinator.com | 2025-04-06

    I've been using [BAML](https://github.com/boundaryml/baml) to do this, and it works really well. It lets you have multiple fallback and retry policies, and returns strongly typed outputs from LLMs.
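As a rough illustration of the fallback-and-retry pattern the comment describes (generic Python, not BAML's actual API; `call_gemini` and `call_openai` are hypothetical stand-ins for real clients):

```python
# Generic fallback-with-retry sketch; not BAML's actual API.
def with_fallback(providers, prompt, retries=2):
    """Try each provider in order, retrying transient failures, then fall back."""
    last_err = None
    for call in providers:
        for _ in range(retries):
            try:
                return call(prompt)
            except RuntimeError as err:  # stand-in for a transient API error
                last_err = err
    raise RuntimeError("all providers failed") from last_err

# Hypothetical stand-ins for real clients:
def call_gemini(prompt):
    raise RuntimeError("gemini unavailable")

def call_openai(prompt):
    return f"openai answer to: {prompt}"

print(with_fallback([call_gemini, call_openai], "hello"))
# prints: openai answer to: hello
```

BAML additionally validates the model output against a declared schema, which a bare loop like this does not attempt.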

  12. lsp-ai

    LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.

    Project mention: Zed: The Fastest AI Code Editor | news.ycombinator.com | 2025-05-07

    Unless something's changed, every AI-backed language server I've tried in Helix suffers from the same limitation when it comes to completions: suggestions aren't shown until the last language server has responded or timed out. Your slowest language server determines how long you'll be waiting.

    The only project I know of that recognizes this is https://github.com/SilasMarvin/lsp-ai, which pivoted away from completions to chat interactions via code actions.

  13. trieve

    All-in-one infrastructure for search, recommendations, RAG, and analytics offered via API

    Project mention: Accurate Hallucination Detection With NER | dev.to | 2025-01-07

    You can find all the code involved in our NER system, including benchmarks, at github.com/devflowinc/trieve/tree/main/hallucination-detection.

  14. pgvecto.rs

    Scalable, Low-latency and Hybrid-enabled Vector Search in Postgres. Revolutionize Vector Search, not Database.

    Project mention: PGVector's Missing Features | dev.to | 2024-09-13

    Pgvector is very slow (seconds to tens of seconds) on filter and order-by queries. Its maintainers are working on this, as you can see in a currently 83-comment-long issue on GitHub, and pgvecto.rs has made improvements, as you can see here, but it's messy. I strongly believe that you don't want to be fighting through these issues when adding semantic search to your product. It's going to be a long-term, hard-fought struggle to keep up with pgvector's updates here and continuously tune it.

  15. aici

    AICI: Prompts as (Wasm) Programs

    Project mention: What Is ChatGPT Doing and Why Does It Work? | news.ycombinator.com | 2024-06-18

    That’s right, if LLMs were really thinking/forming world models etc. we would expect them to be robust against word choice or phrasing. But in practice anyone using RAG can tell you that that is not the case.

    I’m just a practitioner, so my language might be imprecise, but when I say "similarly structured sentences," what I mean (based on my experience using agents and LLMs) is that the shape of the context, i.e. the phrasing and the word choice, highly biases the outputs of LLMs.

    In my own observations at work, those who interpret LLMs as thinking often produce bad agents. LLMs are not good at open-ended questions: if you ask an LLM to "improve this code," you will often get bad results that merely look passable. But if you interpret LLMs as probabilistic models highly biased by their context, you will add much more context and more specific instructions to the prompt in order to get the agent to produce the right output.

    Side note, this is also why I like the AICI approach: https://github.com/microsoft/aici

  16. floneum

    Instant, controllable, local pre-trained AI models in Rust

  17. smartgpt

    A program that provides LLMs with the ability to complete complex tasks using plugins.

  18. llm-chain

    `llm-chain` is a powerful Rust crate for building chains in large language models, allowing you to summarise text and complete complex tasks.

    Project mention: A Comprehensive Guide to the llm-chain Rust crate | dev.to | 2024-06-06

    You can find the crate’s GitHub repository here.

  19. korvus

    Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python, JavaScript, Rust and C.

    Project mention: Korvus: Single-Query RAG with Postgres | news.ycombinator.com | 2024-07-11

    I find it misleading to use an f-string containing encoded `{CONTEXT}` <https://github.com/postgresml/korvus/blob/bce269a20a1dbea933...>, and after digging into TFM <https://postgresml.org/docs/open-source/korvus/guides/rag#si...> it seems in is not, in fact, an f-string artifact but rather the literal characters "{"+"CONTEXT"+"}" and are the same in all the language bindings?

    IMHO it would be much clearer if you just used the normal %s for the "outer" string and left the implicit f-string syntax as it is.
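The distinction at issue can be shown with plain Python string formatting; the `{CONTEXT}` placeholder name follows the quoted docs, while the template text is illustrative:

```python
# The "outer" template keeps {CONTEXT} as literal characters; it is filled in
# later via str.format (or, as the commenter suggests, a %s placeholder),
# rather than by an f-string at definition time.
template = "Answer using this context:\n{CONTEXT}\n"  # no f-prefix: braces are literal
filled = template.format(CONTEXT="retrieved passage here")
print(filled)

# An f-string would substitute immediately instead, which is why a literal
# {CONTEXT} inside what looks like an f-string reads as misleading:
context = "retrieved passage here"
immediate = f"Answer using this context:\n{context}\n"
assert filled == immediate
```

The same literal-brace template works across all the language bindings because substitution happens at query time, not at definition time.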

  20. extractous

    Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

    Project mention: Ask HN: What is the best method for turning a scanned book as a PDF into text? | news.ycombinator.com | 2025-02-16

    Seeing blind recommendations for AI slop is very disappointing for HN.

    For OP, there is a library written in Rust that can do exactly what you need with very high accuracy and performance.

    https://github.com/yobix-ai/extractous

    You would need the OCR dependencies to get it to work on scanned books [2]

    [1] https://github.com/yobix-ai/extractous

    [2] https://github.com/yobix-ai/extractous?tab=readme-ov-file#-s...

  21. indexify

    A realtime serving engine for Data-Intensive Generative AI Applications

    Project mention: Running Durable Workflows in Postgres Using DBOS | news.ycombinator.com | 2024-12-10

    Great points. Besides performance, centralized coordination with a distributed data plane is also better for the operability of schedulers. Some examples: rolling out new features in the scheduler, tracing scheduling behavior and decisions, and deploying configuration changes.

    Even with a centralized scheduler it should be possible to create a DevEx that makes use of decorators to author workflows easily.

    We are doing that with Indexify (https://github.com/tensorlakeai/indexify) for authoring data-intensive workflows to process unstructured data (documents, videos, etc.) - it's like Spark but uses Python instead of Scala/SQL/UDFs.
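The decorator-based workflow authoring described above can be sketched like this (an illustrative toy, not Indexify's actual API; the step names and registry are invented):

```python
# Minimal sketch of decorator-based workflow authoring.
REGISTRY = {}

def step(name):
    """Register a function as a named workflow step."""
    def decorate(fn):
        REGISTRY[name] = fn
        return fn
    return decorate

@step("extract")
def extract(doc):
    # Stand-in for real extraction logic.
    return doc.upper()

@step("chunk")
def chunk(text):
    # Stand-in for real chunking logic.
    return text.split()

def run(pipeline, doc):
    """Feed the document through each named step in order."""
    for name in pipeline:
        doc = REGISTRY[name](doc)
    return doc

print(run(["extract", "chunk"], "hello world"))
# prints: ['HELLO', 'WORLD']
```

In a real system the registry would be consulted by a centralized scheduler that dispatches each step to remote workers, rather than running them in-process as here.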

  22. MusicGPT

    Generate music based on natural language prompts using LLMs running locally

    Project mention: Show HN: MusicGPT – An Open Source App for Generating Music with Local LLMs | news.ycombinator.com | 2024-05-23
  23. femtoGPT

    Pure Rust implementation of a minimal Generative Pretrained Transformer

    Project mention: Show HN: Pure Rust Implementation of GPT | news.ycombinator.com | 2025-01-16
  24. llm-ls

    LSP server leveraging LLMs for code completion (and more?)

  25. paddler

    Stateful load balancer custom-tailored for llama.cpp 🏓🦙

    Project mention: Exo: Run your own AI cluster at home with everyday devices ⌚ | news.ycombinator.com | 2024-07-15

    Just got https://github.com/distantmagic/paddler working across 2 machines for load balancing. This will be next-level and useful for running Llama 400B across multiple machines.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Rust LLM discussion


Rust LLM related posts

  • Makepad 1.0: Rust UI Framework

    2 projects | news.ycombinator.com | 13 May 2025
  • Swiftide 0.26 - Streaming agents

    2 projects | dev.to | 8 May 2025
  • Zed: The Fastest AI Code Editor

    13 projects | news.ycombinator.com | 7 May 2025
  • Thoughts on Mistral.rs

    1 project | news.ycombinator.com | 29 Apr 2025
  • Beyond Vibe Coding: What I Discovered Testing 10 AI Coding Tools

    1 project | news.ycombinator.com | 22 Apr 2025
  • Ask HN: What are you working on (March 2025)?

    2 projects | news.ycombinator.com | 18 Mar 2025
  • Kwaak, a different take on AI coding tools

    1 project | dev.to | 25 Feb 2025

Index

What are some of the best open-source LLM projects in Rust? This list will help you:

# Project Stars
1 burn 11,003
2 aichat 6,681
3 postgresml 6,260
4 mistral.rs 5,568
5 code2prompt 5,611
6 deepclaude 5,096
7 tensorzero 4,065
8 text-embeddings-inference 3,523
9 baml 3,473
10 lsp-ai 2,744
11 trieve 2,121
12 pgvecto.rs 2,024
13 aici 2,021
14 floneum 1,862
15 smartgpt 1,760
16 llm-chain 1,477
17 korvus 1,360
18 extractous 1,097
19 indexify 994
20 MusicGPT 991
21 femtoGPT 871
22 llm-ls 769
23 paddler 753

