Can we discuss MLOps, Deployment, Optimizations, and Speed?

accelerate

18 6,948 9.7 Python

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

accelerate is a best-in-class library for deploying models, especially across multiple GPUs and multiple nodes.

transformers

175 125,021 10.0 Python

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

transformers uses accelerate under the hood if you load a model with device_map='auto'.
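
As a rough illustration of the device_map='auto' call, here is a minimal sketch. It assumes transformers, accelerate, and torch are installed; the helper name load_with_auto_device_map is made up for this example and is not part of either library.

```python
def load_with_auto_device_map(model_id: str):
    """Load a causal LM and let accelerate decide where each layer goes."""
    # Imported lazily: transformers (plus accelerate and torch) are heavy
    # dependencies and must be installed for this call to work.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" hands layer placement to accelerate, which spreads
    # the weights over the available devices instead of loading everything
    # onto a single one.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model
```

Calling it with a small model id of your choice (for example "gpt2", picked here only as a placeholder) and then inspecting model.hf_device_map should show which device each module ended up on.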

unsloth

15 7,263 9.4 Python

Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory

The unsloth project offers some low-level optimizations for Llama and similar models, and as of today some preliminary Mistral support (Mistral, from what I've heard, uses the Llama architecture).

llama.cpp

769 55,846 10.0 C++

LLM inference in C/C++

llama.cpp is a great resource for running quantized models, and even though it's called llama, it's the go-to backend for basically all LLMs right now (ctransformers is dead).
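
To make the appeal of quantized models concrete, here is a back-of-the-envelope sketch of how much memory the weights alone need at different precisions. The 7-billion-parameter size is an assumed example, and the estimate ignores the KV cache, activations, and the per-block scale factors that real quantization formats store, so actual files come out somewhat larger.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


n_params = 7e9  # assumed model size, for illustration only
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    # Halving the bits per weight halves the weight footprint.
    print(f"{label:>5}: {weight_memory_gib(n_params, bits):.1f} GiB")
```

The point is only the ratio: going from 16-bit to 4-bit weights cuts the weight footprint to roughly a quarter, which is often the difference between fitting on a consumer GPU or not.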

DeepSpeed

51 32,550 9.8 Python

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

DeepSpeed can handle parallelism concerns, and even offload data/model to RAM, or even NVMe (!?). I'm surprised I don't see this project used more.
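
For a sense of what the offloading looks like in practice, here is a minimal sketch of a DeepSpeed ZeRO stage 3 config that keeps optimizer state in CPU RAM and pushes parameters out to NVMe. The mount point and batch size are placeholder assumptions, and this has not been tuned for any particular hardware.

```python
import json

# Assumed values for illustration: the NVMe mount point and the batch size
# are placeholders, not taken from the post.
NVME_PATH = "/local_nvme"

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,  # parameter offload is only available with ZeRO stage 3
        # Keep optimizer state in pinned CPU RAM.
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        # Push model parameters out to NVMe storage.
        "offload_param": {"device": "nvme", "nvme_path": NVME_PATH},
    },
}

# Serialize to the JSON form DeepSpeed config files use.
config_text = json.dumps(ds_config, indent=2)
print(config_text)
```

The resulting JSON can be saved as a config file, or the dict can be passed as the config argument to deepspeed.initialize. A real NVMe setup will likely also want the aio section tuned.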

ollama

192 58,943 9.9 Go

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

uniteai

17 218 8.2 Python

Your AI Stack in Your Editor

I recently went through the same with UniteAI, and had to swap ctransformers back out for llama.cpp.

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.