Top 14 Python model-compression Projects
- Efficient-AI-Backbones: Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
- Pretrained-Language-Model: Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.
- model-optimization: A toolkit to optimize ML models for deployment with Keras and TensorFlow, including quantization and pruning.
  - Project mention: CVPR Edition: Voxel51 Filtered Views Newsletter - June 21, 2024 | dev.to | 2024-06-21
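The toolkit's pruning support is built around magnitude pruning: the smallest-magnitude weights are zeroed until a target sparsity is reached. A minimal standalone sketch of that idea (not the toolkit's actual API, which applies pruning per layer during training):

```python
# Conceptual sketch of magnitude pruning: zero out the smallest-magnitude
# weights until the requested sparsity is reached.

def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest |w| values set to 0."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Magnitude threshold below which weights are dropped.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    pruned, remaining = [], n_prune
    for w in weights:
        if remaining and abs(w) <= threshold:
            pruned.append(0.0)
            remaining -= 1  # counter guards against over-pruning on ties
        else:
            pruned.append(w)
    return pruned
```

Real pruning schedules typically increase the sparsity gradually and fine-tune between steps, so the remaining weights can compensate for the removed ones.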
- archai: Accelerate your Neural Architecture Search (NAS) through fast, reproducible, and modular research.
- KVQuant: [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization.
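KV-cache quantization saves memory by storing attention keys and values at low bit-widths. As a minimal illustration of the underlying mechanics, here is a plain uniform affine quantization round-trip (KVQuant itself uses non-uniform, per-channel schemes, so this is only the basic idea):

```python
# Uniform affine quantization round-trip: map floats to `bits`-bit integers
# over the observed value range, then back to floats.

def quantize_dequantize(values, bits=3):
    """Quantize to 2**bits levels, then dequantize back to floats."""
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0   # avoid zero scale on constant input
    ints = [round((v - lo) / scale) for v in values]  # integers in [0, levels]
    return [i * scale + lo for i in ints]
```

The round-trip error is bounded by half the quantization step, which is why range handling (and outlier treatment) dominates low-bit quantization quality.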
- only_train_once_personal_footprint: OTOv1-v3 (NeurIPS, ICLR, TMLR): DNN training, compression, structured pruning, erasing operators; CNN, diffusion, and LLM models.
  - Project mention: On-Device LLM Inference Powered by X-Bit Quantization | news.ycombinator.com | 2024-05-29
- UPop: [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Python model-compression related posts
- CVPR Edition: Voxel51 Filtered Views Newsletter - June 21, 2024
- Llama33B vs Falcon40B vs MPT30B
- [P] Help: I want to compress EfficientnetV2 using pruning.
- SqueezeLLM: Dense-and-Sparse Quantization
  - The new SqueezeLLM quantization method enables lossless 3-bit compression and outperforms GPTQ and AWQ at both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.
- Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems
  - A GNN for computer vision that beats CNNs and Transformers.
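The "dense-and-sparse" idea referenced in the SqueezeLLM post above is simple to sketch: a few large-magnitude outlier weights are kept in full precision (the sparse part) while everything else is quantized (the dense part). A toy illustration of the split, not SqueezeLLM's actual code, which also uses sensitivity-weighted non-uniform quantization:

```python
# Split a weight vector into a quantizable dense part and a small set of
# full-precision outliers stored sparsely as (index, value) pairs.

def dense_and_sparse_split(weights, outlier_fraction=0.05):
    """Return (dense, sparse): outliers zeroed in dense, kept in sparse."""
    n_out = max(1, int(len(weights) * outlier_fraction))
    # Outliers are the largest-magnitude entries.
    by_mag = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    outlier_idx = set(by_mag[:n_out])
    sparse = {i: weights[i] for i in outlier_idx}
    dense = [0.0 if i in outlier_idx else w for i, w in enumerate(weights)]
    return dense, sparse
```

Removing outliers narrows the range the quantizer has to cover, which is what makes very low bit-widths (3-bit and below) viable for the dense remainder.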
Index
What are some of the best open-source model-compression projects in Python? This list will help you:
| # | Project | Stars |
|---|---------|-------|
1 | Efficient-AI-Backbones | 4,186 |
2 | Pretrained-Language-Model | 3,081 |
3 | Torch-Pruning | 2,980 |
4 | model-optimization | 1,531 |
5 | DeepCache | 886 |
6 | SqueezeLLM | 685 |
7 | archai | 475 |
8 | q-diffusion | 347 |
9 | KVQuant | 339 |
10 | only_train_once_personal_footprint | 302 |
11 | picollm | 233 |
12 | SVD-LLM | 198 |
13 | UPop | 101 |
14 | MQAT | 3 |