Top 13 model-compression Open-Source Projects
-
nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
-
Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
-
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
-
Torch-Pruning
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
-
model-optimization
A toolkit to optimize Keras and TensorFlow ML models for deployment, including quantization and pruning.
-
archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
-
only_train_once
OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
-
UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
-
Awesome-Pruning-at-Initialization
[IJCAI'22 Survey] Recent Advances on Neural Network Pruning at Initialization.
Project mention: Does anyone know a downloadable chatgpt model that supports conversation in Albanian? | /r/Programimi | 2023-05-16
Project mention: [P] Help: I want to compress EfficientnetV2 using pruning. | /r/MachineLearning | 2023-06-28
I also tried structured pruning from https://github.com/VainF/Torch-Pruning, as they report EfficientNetV2 to be "prunable", but got much worse results. However, the advantage of this approach is that it keeps the model dense, so you get a real speed-up on common GPUs, while unstructured pruning sparsifies the model and you need hardware that can exploit that sparsity.
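The dense-vs-sparse distinction in the mention above can be sketched in a few lines. This is a hypothetical, dependency-free illustration (not the Torch-Pruning API): unstructured pruning zeroes individual small-magnitude weights and leaves the matrix shape intact, while structured pruning drops whole rows (output channels), so the layer genuinely shrinks and runs faster on ordinary hardware.

```python
def unstructured_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries; the shape is unchanged."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)
    threshold = flat[k - 1] if k > 0 else float("-inf")
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def structured_prune(weights, ratio):
    """Drop the rows with the smallest L1 norm; the matrix gets smaller."""
    norms = [sum(abs(w) for w in row) for row in weights]
    ranked = sorted(range(len(weights)), key=lambda i: -norms[i])
    keep = sorted(ranked[: len(weights) - int(len(weights) * ratio)])
    return [weights[i] for i in keep]


W = [[0.1, -2.0], [3.0, 0.2], [-0.05, 0.4]]  # toy 3x2 weight matrix
sparse = unstructured_prune(W, 0.5)  # still 3x2, but half the entries are 0.0
dense = structured_prune(W, 1 / 3)   # 2x2: one whole row removed
```

The sparse result only pays off on hardware that skips zeros; the dense result is a strictly smaller matrix multiply.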
Project mention: DeepCache: Accelerating Diffusion Models for Free | news.ycombinator.com | 2023-12-05
With the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there's also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which manage 3-bit without as much of a performance drop - I hope to see them used more commonly.
Project mention: Show HN: Compress vision-language and unimodal AI models by structured pruning | news.ycombinator.com | 2023-07-31
model-compression related posts
-
Llama33B vs Falcon40B vs MPT30B
-
Has anyone tried out Squeezellm?
-
[P] Help: I want to compress EfficientnetV2 using pruning.
-
SqueezeLLM: Dense-and-Sparse Quantization
-
New quantization method SqueezeLLM allows for lossless compression at 3-bit and outperforms GPTQ and AWQ in both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.
-
Requesting help with Custom Layers (Layer Subclassing) - Model fit builds the model again! [Keras]
-
Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems
Index
What are some of the best open-source model-compression projects? This list will help you:
| # | Project | Stars |
|---|---------|-------|
| 1 | nni | 13,742 |
| 2 | Efficient-AI-Backbones | 3,804 |
| 3 | Pretrained-Language-Model | 2,960 |
| 4 | Awesome-Knowledge-Distillation | 2,402 |
| 5 | Torch-Pruning | 2,307 |
| 6 | model-optimization | 1,470 |
| 7 | DeepCache | 603 |
| 8 | SqueezeLLM | 569 |
| 9 | archai | 455 |
| 10 | only_train_once | 261 |
| 11 | KVQuant | 190 |
| 12 | UPop | 83 |
| 13 | Awesome-Pruning-at-Initialization | 55 |