Top 11 Python model-compression Projects
-
nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
-
Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
-
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
-
Torch-Pruning
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
-
model-optimization
A toolkit for optimizing ML models for deployment with Keras and TensorFlow, including quantization and pruning.
-
archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
-
only_train_once
OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM
-
UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Project mention: Does anyone know a downloadable chatgpt model that supports conversation in Albanian? | /r/Programimi | 2023-05-16
Project mention: [P] Help: I want to compress EfficientnetV2 using pruning. | /r/MachineLearning | 2023-06-28
I also tried structured pruning from https://github.com/VainF/Torch-Pruning, as they report EfficientNetV2 to be "prunable", but got much worse results. However, the advantage of this approach is that it keeps the model dense, so you get a real speed-up on common GPUs, whereas unstructured pruning sparsifies the model and you need hardware that can exploit that sparsity.
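The structured-vs-unstructured distinction above can be illustrated with a minimal, dependency-free sketch of magnitude pruning on a toy weight matrix. This is not Torch-Pruning's API; the function names and the row-as-channel interpretation are illustrative assumptions only.

```python
# Illustrative sketch (NOT Torch-Pruning's API): magnitude pruning on a toy
# weight matrix, contrasting unstructured and structured pruning.

def unstructured_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries; the matrix keeps its shape,
    so speed-ups require hardware that exploits the resulting sparsity."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)
    threshold = flat[k] if k < len(flat) else float("inf")
    return [[0.0 if abs(w) < threshold else w for w in row] for row in weights]

def structured_prune(weights, n_rows_to_remove):
    """Remove whole rows (think: output channels) with the smallest L1 norm;
    the result is a smaller but still dense matrix, fast on ordinary GPUs."""
    norms = sorted((sum(abs(w) for w in row), i) for i, row in enumerate(weights))
    keep = sorted(i for _, i in norms[n_rows_to_remove:])
    return [weights[i] for i in keep]

W = [[0.9, -0.1], [0.05, 0.02], [-0.7, 0.6]]
sparse = unstructured_prune(W, 0.5)  # same 3x2 shape, half the entries zeroed
dense = structured_prune(W, 1)       # 2x2: the low-norm middle row is removed
```

Real structured pruners additionally have to propagate each removed channel through dependent layers (the hard part that Torch-Pruning automates), but the selection criterion is the same idea.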
Project mention: DeepCache: Accelerating Diffusion Models for Free | news.ycombinator.com | 2023-12-05
With the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there are also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which manage 3-bit without as much of a performance drop - I hope to see them used more commonly.
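The bit-width trade-off mentioned above can be seen in a minimal sketch of uniform symmetric k-bit quantization, the baseline that methods like GPTQ, AWQ, and SqueezeLLM improve on with smarter rounding and outlier handling. The function names and toy weights are illustrative assumptions, not any of those libraries' APIs.

```python
# Illustrative sketch: uniform symmetric k-bit quantization of a weight vector.
# Fewer bits means a coarser grid of representable values, hence larger
# reconstruction error - which is why 3-bit is much harder than 4-bit.

def quantize_dequantize(weights, bits):
    """Round each weight to the nearest of the 2**bits uniform levels,
    then map back to floating point."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 levels each side for 4-bit, 3 for 3-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) * scale for w in weights]

def mean_sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

w = [0.8, -0.31, 0.05, 0.42, -0.77, 0.13]
err3 = mean_sq_error(w, quantize_dequantize(w, 3))
err4 = mean_sq_error(w, quantize_dequantize(w, 4))
assert err3 > err4  # the 3-bit grid loses noticeably more precision
```

GPTQ-style methods reduce this gap by choosing roundings that minimize layer output error rather than per-weight error, and SqueezeLLM additionally keeps a small set of outlier weights in a sparse high-precision format.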
Project mention: Show HN: Compress vision-language and unimodal AI models by structured pruning | news.ycombinator.com | 2023-07-31
Python model-compression related posts
- Llama33B vs Falcon40B vs MPT30B
- [P] Help: I want to compress EfficientnetV2 using pruning.
- SqueezeLLM: Dense-and-Sparse Quantization
- New quantization method SqueezeLLM allows for lossless compression at 3-bit and outperforms GPTQ and AWQ in both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.
- Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems
- GNN for computer vision, beating CNN & Transformer
Index
What are some of the best open-source model-compression projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | nni | 13,726 |
| 2 | Efficient-AI-Backbones | 3,783 |
| 3 | Pretrained-Language-Model | 2,953 |
| 4 | Torch-Pruning | 2,288 |
| 5 | model-optimization | 1,464 |
| 6 | DeepCache | 595 |
| 7 | SqueezeLLM | 560 |
| 8 | archai | 453 |
| 9 | only_train_once | 260 |
| 10 | KVQuant | 183 |
| 11 | UPop | 83 |