Top 23 llmops Open-Source Projects

autogen

33 26,109 9.9 Jupyter Notebook

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

Project mention: Multi AI Agent Systems using OpenAI's new GPT-4o Model | dev.to | 2024-05-17

jina

126 20,177 9.1 Python

☁️ Build multimodal AI applications with cloud-native stack

Project mention: Jina.ai: Self-host Multimodal models | news.ycombinator.com | 2024-01-26

InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
vllm

31 19,672 9.9 Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Project mention: AI leaderboards are no longer useful. It's time to switch to Pareto curves | news.ycombinator.com | 2024-04-30

I guess the root cause of my claim is that OpenAI won't tell us whether or not GPT-3.5 is an MoE model, and I assumed it wasn't. Since GPT-3.5 is clearly nondeterministic at temp=0, I believed the nondeterminism was due to FPU stuff, and this effect was amplified with GPT-4's MoE. But if GPT-3.5 is also MoE then that's just wrong.
What makes this especially tricky is that small models are truly 100% deterministic at temp=0 because the relative likelihoods are too coarse for FPU issues to be a factor. I had thought 3.5 was big enough that some of its token probabilities were too fine-grained for the FPU. But that's probably wrong.
On the other hand, it's not just GPT, there are currently floating-point difficulties in vllm which significantly affect the determinism of any model run on it: https://github.com/vllm-project/vllm/issues/966 Note that a suggested fix is upcasting to float32. So it's possible that GPT-3.5 is using an especially low-precision float and introducing nondeterminism by saving money on compute costs.
Sadly I do not have the money[1] to actually run a test to falsify any of this. It seems like this would be a good little research project.
[1] Or the time, or the motivation :) But this stuff is expensive.

OpenLLM

25 8,963 9.9 Python

Run any open-source LLMs, such as Llama 2, Mistral, as OpenAI compatible API endpoint in the cloud.

Project mention: First 15 Open Source Advent projects | dev.to | 2023-12-15

13. OpenLLM by BentoML | Github | tutorial

BentoML

16 6,603 9.8 Python

The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

Project mention: Who's hiring developer advocates? (December 2023) | dev.to | 2023-12-04

Link to GitHub -->

ragflow

7 7,744 9.7 Python

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Project mention: DeepSeek-V2 integrated, RAGFlow v0.5.0 is released | news.ycombinator.com | 2024-05-07

phidata

15 8,379 9.9 Python

Build AI Assistants with memory, knowledge and tools.

Project mention: Phidata: Add memory, knowledge and tools to LLMs | news.ycombinator.com | 2024-05-06

SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
ragas

10 5,014 9.6 Python

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines

Project mention: Show HN: Ragas – the de facto open-source standard for evaluating RAG pipelines | news.ycombinator.com | 2024-03-21

congrats on launching! i think my continuing struggle with looking at Ragas as a company rather than an oss library is that the core of it is like 8 metrics (https://github.com/explodinggradients/ragas/tree/main/src/ra...) that are each 1-200 LOC. i can inline that easily in my app and retain full control, or model that in langchain or haystack or whatever.
why is Ragas a library and a company, rather than an overall "standard" or philosophy (eg like Heroku's 12 Factor Apps) that could maybe be more robust?
(just giving an opp to pitch some underappreciated benefits of using this library)

gateway

6 4,777 9.8 TypeScript

A Blazing Fast AI Gateway. Route to 100+ LLMs with 1 fast & friendly API.

Project mention: Adding a streaming run function to the Assistants API | news.ycombinator.com | 2024-02-07

zenml

33 3,694 9.8 Python

ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.

Project mention: FLaNK AI - 01 April 2024 | dev.to | 2024-04-01

giskard

7 3,192 10.0 Python

🐢 Open-Source Evaluation & Testing for LLMs and ML models

Project mention: Show HN: Evaluate LLM-based RAG Applications with automated test set generation | news.ycombinator.com | 2024-04-11

Awesome-LLMOps

9 3,125 8.0 Shell

An awesome & curated list of best LLMOps tools for developers

Project mention: LLM Obervability Platform | news.ycombinator.com | 2023-08-23

Thanks for the awesome product. I found your project from https://github.com/tensorchord/Awesome-LLMOps

phoenix

4 2,774 9.9 Jupyter Notebook

AI Observability & Evaluation (by Arize-ai)

Project mention: First 15 Open Source Advent projects | dev.to | 2023-12-15

11. Phoenix by Arize AI | Github | tutorial

llm-app

12 2,526 8.9 Python

LLM App templates for RAG, knowledge mining, and stream analytics. Ready to run with Docker,⚡in sync with your data sources.

Project mention: How to use LLMs for real-time alerting | dev.to | 2023-11-20

Answering queries and defining alerts: Our application running on Pathway LLM-App exposes the HTTP REST API endpoint to send queries and receive real-time responses. It is used by the Streamlit UI app. Queries are answered by looking up relevant documents in the index, as in the Retrieval-augmented generation (RAG) implementation. Next, queries are categorized for intent: an LLM probes them for natural language commands synonymous with notify or send an alert.

AGiXT

26 2,474 9.9 Python

AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.

Project mention: Conversational "memory loss"? | /r/LocalLLaMA | 2023-07-07

If you are more interested in AI assistants check out AGiXT. It has some really cool features but it is under heavy development. Not everything works jet and updates break sometimes already working functions. But it is still far better than babyAGI and other proof of concepts.

OpenPipe

13 2,388 9.9 TypeScript

Turn expensive prompts into cheap fine-tuned models

Project mention: Ask HN: How does deploying a fine-tuned model work | news.ycombinator.com | 2024-04-23

- Fireworks: $0.20
If you're looking for an end-to-end flow that will help you gather the training data, validate it, run the fine tune and then define evaluations, you could also check out my company, OpenPipe (https://openpipe.ai/). In addition to hosting your model, we help you organize your training data, relabel if necessary, define evaluations on the finished fine-tune, and monitor its performance in production. Our inference prices are higher than the above providers, but once you're happy with your model you can always export your weights and host them on one of the above!

uptrain

35 2,029 9.6 Python

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.

Project mention: A Developer's Guide to Evaluating LLMs! | dev.to | 2024-05-14

You can create an account with UpTrain and generate the API key for free. Please visit https://uptrain.ai/

envd

31 1,922 8.8 Go

🏕️ Reproducible development environment

Project mention: Show HN: Dockerfile Alternative for AI/ML | news.ycombinator.com | 2023-08-15

aici

7 1,771 9.9 Rust

AICI: Prompts as (Wasm) Programs

Project mention: Google Gemini: Context Caching | news.ycombinator.com | 2024-05-16

To me, context caching is only a subset of what is possible with full control over the model. I consider this a more complete list: https://github.com/microsoft/aici?tab=readme-ov-file#flexibi...
Context caching only gets you “forking generation into multiple branches” (i.e. sharing work between multiple generations)

trulens

14 1,669 9.8 Jupyter Notebook

Evaluation and Tracking for LLM Experiments

Project mention: Why Vector Compression Matters | dev.to | 2024-04-24

Retrieval using a single vector is called dense passage retrieval (DPR), because an entire passage (dozens to hundreds of tokens) is encoded as a single vector. ColBERT instead encodes a vector-per-token, where each vector is influenced by surrounding context. This leads to meaningfully better results; for example, here’s ColBERT running on Astra DB compared to DPR using openai-v3-small vectors, compared with TruLens for the Braintrust Coda Help Desk data set. ColBERT easily beats DPR at correctness, context relevance, and groundedness.

bionic-gpt

12 1,628 9.8 Rust

BionicGPT is an on-premise replacement for ChatGPT, offering the advantages of Generative AI while maintaining strict data confidentiality

Project mention: Ask HN: How to structure Rust, Axum, and SQLx for clean architecture? | news.ycombinator.com | 2024-05-07

You can check out https://github.com/bionic-gpt/bionic-gpt
Basically I put db in it's own crate then crates for controller and another for pages.
The folders for each section of the web application.

hamilton

21 1,395 9.8 Jupyter Notebook

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage and metadata. Runs and scales everywhere python does.

Project mention: Show HN: Hamilton's UI – observability, lineage, and catalog for data pipelines | news.ycombinator.com | 2024-05-02

openllmetry

3 1,328 9.8 Python

Open-source observability for your LLM application, based on OpenTelemetry

Project mention: FLaNK-AIM Weekly 13 May 2024 | dev.to | 2024-05-13

SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

llmops related posts

An intuitive approach to building with LLMs

1 project | news.ycombinator.com | 20 May 2024
A Developer's Guide to Evaluating LLMs!

1 project | dev.to | 14 May 2024
Ask HN: How to structure Rust, Axum, and SQLx for clean architecture?

2 projects | news.ycombinator.com | 7 May 2024
Should I add CLA to my Open-source project?

2 projects | news.ycombinator.com | 4 May 2024
Pydantic Logfire

7 projects | news.ycombinator.com | 30 Apr 2024
AI leaderboards are no longer useful. It's time to switch to Pareto curves

1 project | news.ycombinator.com | 30 Apr 2024
Show HN: Cognita – open-source RAG framework for modular applications

3 projects | news.ycombinator.com | 27 Apr 2024
A note from our sponsor - InfluxDB
www.influxdata.com | 22 May 2024

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality. Learn more →

Index

What are some of the best open-source llmops projects? This list will help you:

	Project	Stars
1	autogen	26,109
2	jina	20,177
3	vllm	19,672
4	OpenLLM	8,963
5	BentoML	6,603
6	ragflow	7,744
7	phidata	8,379
8	ragas	5,014
9	gateway	4,777
10	zenml	3,694
11	giskard	3,192
12	Awesome-LLMOps	3,125
13	phoenix	2,774
14	llm-app	2,526
15	AGiXT	2,474
16	OpenPipe	2,388
17	uptrain	2,029
18	envd	1,922
19	aici	1,771
20	trulens	1,669
21	bionic-gpt	1,628
22	hamilton	1,395
23	openllmetry	1,328

llmops

Top 23 llmops Open-Source Projects

llmops related posts

An intuitive approach to building with LLMs

A Developer's Guide to Evaluating LLMs!

Ask HN: How to structure Rust, Axum, and SQLx for clean architecture?

Should I add CLA to my Open-source project?

Pydantic Logfire

AI leaderboards are no longer useful. It's time to switch to Pareto curves

Show HN: Cognita – open-source RAG framework for modular applications

Index