Top 23 Python quantization Projects
-
Project mention: Llama-Factory: Unified, Efficient Fine-Tuning for 100 Open LLMs | news.ycombinator.com | 2025-09-18
-
Project mention: I built a free, local video transcription tool, because I didn't want to pay $10/hour or upload my files to a stranger's server | dev.to | 2026-05-09
Transcribes it locally using faster-whisper
-
nunchaku
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
-
optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
-
llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
-
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
-
Project mention: Gemma 3 270M re-implemented in pure PyTorch for local tinkering | news.ycombinator.com | 2025-08-20
-
xTuring
Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6
-
neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
-
aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
-
model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
TensorFlow Model Optimization Toolkit
-
auto-round
A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.
Hmm... at Q4_K_M, stock-style quantization retains ~99–99.8% of BF16 accuracy, and AutoRound pushes that to ~99.4–100.n% (??). The gap is roughly 0.1–0.7 percentage points.
https://github.com/intel/auto-round/blob/main/docs/gguf_alg_...
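Retention figures like the ones quoted in this comment are usually the quantized model's benchmark score expressed as a percentage of the BF16 baseline score. A minimal sketch of that arithmetic follows. The function names and all scores are hypothetical and are not taken from the AutoRound documentation.

```python
def retention_pct(quantized_score: float, baseline_score: float) -> float:
    """Quantized score as a percentage of the baseline (e.g. BF16) score."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return 100.0 * quantized_score / baseline_score


def gap_points(retention_a: float, retention_b: float) -> float:
    """Difference between two retention figures, in percentage points."""
    return retention_b - retention_a


# Hypothetical benchmark accuracies, for illustration only.
bf16_acc = 0.700
stock_q4_acc = 0.695
tuned_q4_acc = 0.699

stock = retention_pct(stock_q4_acc, bf16_acc)
tuned = retention_pct(tuned_q4_acc, bf16_acc)
print(f"stock: {stock:.2f}%  tuned: {tuned:.2f}%  gap: {gap_points(stock, tuned):.2f} points")
```

Under this definition, a retention above 100% only means the quantized model scored higher than the baseline on that particular benchmark, which can happen within evaluation noise.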
-
z80ai
Z80-μLM is a 2-bit quantized language model small enough to run on an 8-bit Z80 processor. Train conversational models in Python, export them as CP/M .COM binaries, and chat with your vintage computer.
Project mention: Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB | news.ycombinator.com | 2025-12-28
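To make "2-bit quantized" concrete: each weight is stored as one of four levels, so four weights fit in a single byte. The sketch below is a generic illustration in plain Python. The level set, the per-tensor scaling rule, and every function name are assumptions for illustration and are not taken from the z80ai code.

```python
# Assumed scheme: four evenly spaced levels and one shared scale per tensor.
LEVELS = (-1.5, -0.5, 0.5, 1.5)


def quantize_2bit(weights):
    """Map each weight to the index (0..3) of its nearest scaled level."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 1.5 if max_abs > 0 else 1.0
    codes = [
        min(range(4), key=lambda i: abs(LEVELS[i] - w / scale))
        for w in weights
    ]
    return codes, scale


def pack(codes):
    """Pack four 2-bit codes per byte, first code in the lowest two bits."""
    out = bytearray()
    for i in range(0, len(codes), 4):
        byte = 0
        for j, code in enumerate(codes[i:i + 4]):
            byte |= (code & 0b11) << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack(data, count):
    """Recover the first `count` 2-bit codes from packed bytes."""
    codes = []
    for byte in data:
        for j in range(4):
            codes.append((byte >> (2 * j)) & 0b11)
    return codes[:count]


def dequantize(codes, scale):
    """Turn codes back into approximate weights."""
    return [LEVELS[c] * scale for c in codes]


weights = [0.9, -0.3, 0.1, -0.9, 0.5]
codes, scale = quantize_2bit(weights)
packed = pack(codes)
restored = dequantize(unpack(packed, len(weights)), scale)
print(len(weights), "weights stored in", len(packed), "bytes")
```

Five weights occupy two bytes here, with the last byte only partly used. Real 2-bit formats typically add per-group scales and a training procedure that tolerates the coarse levels.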
Python quantization discussion
Python quantization related posts
-
I built a free, local video transcription tool, because I didn't want to pay $10/hour or upload my files to a stranger's server
-
Advanced Quantization Algorithm for LLMs
-
LLM compressor: compress models for efficient deployment
-
Creating Automatic Subtitles for Videos with Python, Faster-Whisper, FFmpeg, Streamlit, Pillow
-
Apple Explores Home Robotics as Potential 'Next Big Thing'
-
Half-Quadratic Quantization of Large Machine Learning Models
-
New Mixtral HQQ Quantized 4-bit/2-bit configuration
Index
What are some of the best open-source quantization projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | LlamaFactory | 71,870 |
| 2 | faster-whisper | 23,393 |
| 3 | Chinese-LLaMA-Alpaca | 18,949 |
| 4 | bitsandbytes | 8,258 |
| 5 | nunchaku | 3,861 |
| 6 | optimum | 3,409 |
| 7 | llm-compressor | 3,331 |
| 8 | Pretrained-Language-Model | 3,158 |
| 9 | ao | 2,843 |
| 10 | xTuring | 2,667 |
| 11 | neural-compressor | 2,651 |
| 12 | aimet | 2,634 |
| 13 | mixtral-offloading | 2,329 |
| 14 | mmrazor | 1,672 |
| 15 | model-optimization | 1,573 |
| 16 | auto-round | 1,436 |
| 17 | nncf | 1,169 |
| 18 | z80ai | 1,092 |
| 19 | optimum-quanto | 1,042 |
| 20 | finn | 1,003 |
| 21 | hqq | 940 |
| 22 | OmniQuant | 899 |
| 23 | SqueezeLLM | 718 |