Top 23 Python Transformer Projects
-
transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Project mention: The $100 ChatGPT: Why Karpathy's nanochat Represents the Next Big Thing | dev.to | 2026-05-04
Hugging Face Transformers: 500,000+ lines
-
Project mention: Speculative decoding: when and why it actually speeds up inference | dev.to | 2026-06-04
Here's a real, runnable config that uses EAGLE for offline batched generation. It's straight from the vLLM repo's eagle.md example:
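The config from eagle.md is not reproduced in this listing. As a rough illustration of the draft-then-verify loop that speculative decoding relies on, here is a toy greedy sketch. It is not EAGLE and does not use vLLM's API; `target_model`, `draft_model`, and `speculative_decode` are hypothetical stand-ins invented for this example.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def target_model(ctx: List[Token]) -> Token:
    # Hypothetical "large" model: always predicts the next digit.
    return (ctx[-1] + 1) % 10


def draft_model(ctx: List[Token]) -> Token:
    # Hypothetical "small" model: agrees with the target most of the time,
    # but guesses wrong whenever the last token is a multiple of 3.
    return 0 if ctx[-1] % 3 == 0 else (ctx[-1] + 1) % 10


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new_tokens: int,
                       k: int = 4) -> List[Token]:
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1. The draft model proposes up to k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))

        # 2. The target checks each proposed position. A real engine does
        #    this in one batched forward pass; here it is a plain loop.
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        else:
            # Every draft token matched, so the target adds one bonus token.
            if len(out) + len(accepted) < limit:
                accepted.append(target(out + accepted))
        out.extend(accepted)
    return out


result = speculative_decode(target_model, draft_model, [1], max_new_tokens=8)
```

Under greedy verification like this, the output matches what the target model would have produced on its own. Any speedup comes only from how many draft tokens are accepted per verification step, which is why a poorly matched draft model can make things slower rather than faster.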
-
nn
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
-
There certainly are issues on Linux as well. The Detectron2 library alone has several hundred issues related to incorrect versions of something: https://github.com/facebookresearch/detectron2/issues
The mmdetection library (https://github.com/open-mmlab/mmdetection/issues) also has hundreds of version-related issues. Admittedly, that library has not seen any updates for over a year now, but it is sad that things just break and become basically unusable on modern Linux operating systems because NVIDIA can't stop breaking backwards and forwards compatibility for what is essentially just fancy matrix multiplication.
-
git clone https://github.com/fishaudio/fish-speech.git
cd fish-speech
pip install uv
uv sync
-
sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
Project mention: DeepSeek makes the V4 Pro price discount permanent | news.ycombinator.com | 2026-05-22
There are several things at play:
Inference stack efficiency: Many of these providers take off-the-shelf sglang / vllm / trtllm and hope for the best. Meanwhile, the DeepSeek team is known for pushing the boundary of optimizations.
Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). Introduced 1.5 years ago (https://arxiv.org/abs/2512.02556), used by DeepSeek 3.2, GLM 5, DeepSeek V4. Only now is it slowly starting to get optimized in the major inference engines (https://github.com/sgl-project/sglang/issues/19380 https://github.com/sgl-project/sglang/pull/22851 etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and those will take more time to be taken full advantage of by the open source inference engines.
Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.
And a few other things (scale (matters a lot for MoEs), reliability, soft enterprise lock-in, etc.)
---
There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is a much better model, and because Z.AI raised their price as well.
-
Project mention: Mellum2 MoE, Heretic Censorship Removal, & NVIDIA Cosmos 3 Omni-model for Local AI | dev.to | 2026-06-01
-
Project mention: I built a free, local video transcription tool, because I didn't want to pay $10/hour or upload my files to a stranger's server | dev.to | 2026-05-09
Transcribes it locally using faster-whisper
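A minimal sketch of what local transcription with faster-whisper can look like, assuming the package is installed and a model can be downloaded on first use. The helper names `format_timestamp` and `transcribe_locally`, the `"small"` model size, and the CPU/int8 settings are illustrative choices for this example, not details taken from the tool described in the post.

```python
from typing import List


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm for a readable transcript."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def transcribe_locally(path: str, model_size: str = "small") -> List[str]:
    """Transcribe an audio or video file on the local machine."""
    # Imported lazily so the timestamp helper works without the package.
    from faster_whisper import WhisperModel

    # CPU with int8 weights is an assumed, conservative default.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # transcribe() returns a lazy iterator of segments plus metadata;
    # the actual decoding happens as the segments are consumed.
    segments, _info = model.transcribe(path)
    return [
        f"[{format_timestamp(s.start)} -> {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    ]
```

Calling `transcribe_locally("talk.mp4")` (a hypothetical file name) would return one formatted line per segment, and the audio never leaves the machine.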
-
RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
Project mention: RWKV-7 beats Llama 3.2 with 3x fewer training tokens and formally exceeds TC^0 | news.ycombinator.com | 2026-02-23
-
Project mention: What is an LLM evaluation harness? A deep dive into lm-eval-harness | dev.to | 2026-06-03
EleutherAI started the project in 2020 as a unified way to reproduce published LLM benchmark numbers. It's now at v0.4.12 (May 2026), ships with 200+ tasks spanning reasoning, knowledge, coding, math, multilingual, and long-context benchmarks, and supports a long list of model backends: Hugging Face transformers, vLLM, SGLang, GPT-NeoX, Megatron-DeepSpeed, plus API endpoints for OpenAI, Anthropic, and a few others.
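For a sense of how the harness is driven from Python, here is a hedged sketch that uses the Hugging Face backend through `lm_eval.simple_evaluate`. The helper names `build_model_args` and `run_eval`, and the model and task named in the usage note, are assumptions made for illustration; argument names can differ between harness versions, so check them against the installed release.

```python
from typing import Dict, List, Optional


def build_model_args(**kwargs: object) -> str:
    """Join keyword arguments into the comma-separated key=value string
    that the harness accepts for model arguments."""
    return ",".join(f"{key}={value}" for key, value in kwargs.items())


def run_eval(pretrained: str, tasks: List[str],
             limit: Optional[int] = None) -> Dict:
    """Evaluate a Hugging Face model on the given tasks."""
    # Imported lazily so build_model_args works without the package.
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",  # Hugging Face transformers backend
        model_args=build_model_args(pretrained=pretrained),
        tasks=tasks,
        limit=limit,  # cap examples per task for a quick smoke test
    )
    # Per-task metrics live under the "results" key.
    return results["results"]
```

Something like `run_eval("gpt2", ["hellaswag"], limit=50)` gives a fast sanity check before committing to a full benchmark run; numbers from a limited run are not comparable to published scores.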
-
PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
-
petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
-
manga-image-translator
Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/ (no longer working)
-
PaddleSeg
Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
-
LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
-
Project mention: Gradient Descent on Token Input Embeddings: A ModernBERT experiment | dev.to | 2025-06-23
ModernBERT-large was chosen because it is a relatively lightweight model with a strong visualization suite and a simplified attention mask (full cross-attention) that is easy to reason about. It would be interesting to see if the results in this post hold across other models.
-
Python Transformer discussion
Python Transformer related posts
-
What is an LLM evaluation harness? A deep dive into lm-eval-harness
-
Mellum2 MoE, Heretic Censorship Removal, & NVIDIA Cosmos 3 Omni-model for Local AI
-
What GenAI Actually Costs in Production
-
EleutherAI / Lm-Evaluation-Harness
-
Refusal in Language Models Is Mediated by a Single Direction
-
Comparison: vLLM 0.6 vs. Text Generation Inference 1.4 for Serving Code LLMs
-
War Story: We Migrated from Hugging Face Inference API to Self-Hosted LLMs and Cut Latency by 60%
-
A note from our sponsor - SaaSHub
www.saashub.com | 8 Jun 2026
Index
What are some of the best open-source Transformer projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | transformers | 161,343 |
| 2 | vllm | 81,898 |
| 3 | nn | 66,902 |
| 4 | mmdetection | 32,695 |
| 5 | fish-speech | 30,666 |
| 6 | sglang | 28,872 |
| 7 | best-of-ml-python | 23,620 |
| 8 | heretic | 23,422 |
| 9 | faster-whisper | 23,393 |
| 10 | LaTeX-OCR | 16,324 |
| 11 | RWKV-LM | 14,548 |
| 12 | lm-evaluation-harness | 12,818 |
| 13 | PaddleSpeech | 12,611 |
| 14 | petals | 10,171 |
| 15 | manga-image-translator | 9,997 |
| 16 | mmsegmentation | 9,781 |
| 17 | PaddleSeg | 9,338 |
| 18 | LMFlow | 8,488 |
| 19 | bertviz | 8,078 |
| 20 | jukebox | 8,045 |
| 21 | GPT2-Chinese | 7,605 |
| 22 | BERT-pytorch | 6,518 |
| 23 | Informer2020 | 6,503 |