Top 13 model-compression Open-Source Projects
-
nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
-
Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
-
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
-
Torch-Pruning
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
-
model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
-
archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
-
only_train_once
OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
-
UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
-
Awesome-Pruning-at-Initialization
[IJCAI'22 Survey] Recent Advances on Neural Network Pruning at Initialization.
Project mention: Does anyone know a downloadable chatgpt model that supports conversation in Albanian? | /r/Programimi | 2023-05-16
Project mention: [P] Help: I want to compress EfficientnetV2 using pruning. | /r/MachineLearning | 2023-06-28
I also tried structured pruning from https://github.com/VainF/Torch-Pruning, as they report EfficientNetV2 to be "prunable", but got much worse results. However, this approach has the advantage of keeping the model dense, so you get a real speed-up on common GPUs, whereas unstructured pruning sparsifies the model and you need hardware that can exploit such sparsity.
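The dense-vs-sparse distinction in that mention can be illustrated with a minimal NumPy sketch (this is not Torch-Pruning's actual API; the matrix, pruning ratio, and magnitude criterion here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))  # toy weight matrix: 8 output channels, 16 inputs

# Unstructured pruning: zero the 50% of individual weights with the smallest
# magnitude. The shape is unchanged and the matrix becomes sparse, so a real
# speed-up requires hardware/kernels that can exploit the sparsity.
threshold = np.sort(np.abs(W), axis=None)[W.size // 2]
W_unstructured = np.where(np.abs(W) < threshold, 0.0, W)

# Structured pruning: drop the whole output channels (rows) with the lowest
# L2 norm. The result is a smaller *dense* matrix, so common GPUs see a
# real speed-up without any special sparse kernels.
norms = np.linalg.norm(W, axis=1)
keep = np.sort(np.argsort(norms)[len(norms) // 2:])  # keep the strongest half
W_structured = W[keep]

print(W_unstructured.shape, (W_unstructured == 0).mean())  # (8, 16) 0.5
print(W_structured.shape)                                  # (4, 16)
```

The unstructured result keeps its original shape but is half zeros; the structured result is physically smaller, which is why it accelerates inference on ordinary hardware.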
Project mention: DeepCache: Accelerating Diffusion Models for Free | news.ycombinator.com | 2023-12-05
With the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there are also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which manage 3-bit without as much of a performance drop - I hope to see them used more commonly.
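The reason the 3-bit/4-bit gap is so stark follows from how few representable levels remain. A minimal round-to-nearest sketch makes the error growth visible (real methods like GPTQ, AWQ, and SqueezeLLM are far more sophisticated, using per-group scales and activation- or sensitivity-aware weighting; this toy `quantize` helper is an assumption for illustration only):

```python
import numpy as np

def quantize(w, bits):
    """Naive min-max uniform quantization: snap each weight to the nearest
    of 2**bits evenly spaced levels, then map back to float."""
    levels = 2 ** bits - 1
    scale = (w.max() - w.min()) / levels
    q = np.round((w - w.min()) / scale)   # integer code in [0, levels]
    return q * scale + w.min()            # dequantized value

rng = np.random.default_rng(0)
w = rng.normal(size=10_000)  # stand-in for a layer's weights

for bits in (4, 3):
    err = np.abs(w - quantize(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

Dropping from 4 bits (15 intervals) to 3 bits (7 intervals) roughly doubles the rounding error, which is why naive 3-bit quantization degrades models so much more and why the cleverer allocation schemes in AWQ/SqueezeLLM matter.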
Project mention: Show HN: Compress vision-language and unimodal AI models by structured pruning | news.ycombinator.com | 2023-07-31
model-compression related posts
- Llama 33B vs Falcon 40B vs MPT 30B
- Has anyone tried out Squeezellm?
- [P] Help: I want to compress EfficientnetV2 using pruning.
- SqueezeLLM: Dense-and-Sparse Quantization
- New quantization method SqueezeLLM allows for lossless compression at 3-bit and outperforms GPTQ and AWQ at both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.
- Requesting help with Custom Layers (Layer Subclassing) - Model fit builds the model again! [Keras]
- Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems
Index
What are some of the best open-source model-compression projects? This list will help you:
| # | Project | Stars |
|---|---------|-------|
| 1 | nni | 13,708 |
| 2 | Efficient-AI-Backbones | 3,774 |
| 3 | Pretrained-Language-Model | 2,953 |
| 4 | Awesome-Knowledge-Distillation | 2,393 |
| 5 | Torch-Pruning | 2,274 |
| 6 | model-optimization | 1,464 |
| 7 | DeepCache | 582 |
| 8 | SqueezeLLM | 560 |
| 9 | archai | 451 |
| 10 | only_train_once | 259 |
| 11 | KVQuant | 177 |
| 12 | UPop | 82 |
| 13 | Awesome-Pruning-at-Initialization | 55 |