Python model-compression

Open-source Python projects categorized as model-compression

Top 14 Python model-compression Projects

  1. Efficient-AI-Backbones

    Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

  2. Pretrained-Language-Model

    Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.

  3. Torch-Pruning

    [CVPR 2023] DepGraph: Towards Any Structural Pruning

  4. model-optimization

    A toolkit to optimize ML models for deployment with Keras and TensorFlow, including quantization and pruning.

  5. DeepCache

    [CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

    Project mention: CVPR Edition: Voxel51 Filtered Views Newsletter - June 21, 2024 | dev.to | 2024-06-21

  6. SqueezeLLM

    [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

  7. archai

    Accelerate your Neural Architecture Search (NAS) through fast, reproducible, and modular research.

  8. q-diffusion

    [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

  9. KVQuant

    [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

  10. only_train_once_personal_footprint

    OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM

  11. picollm

    On-device LLM Inference Powered by X-Bit Quantization

    Project mention: On-Device LLM Inference Powered by X-Bit Quantization | news.ycombinator.com | 2024-05-29

  12. SVD-LLM

    [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2

  13. UPop

    [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.

  14. MQAT

    [TMLR 2024] Modular Quantization-Aware Training for 6D Object Pose Estimation

    Project mention: Modular Quantization Aware Training | news.ycombinator.com | 2025-03-25
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates how often a repo has been mentioned in the last 12 months or since we started tracking (Dec 2020).
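Most projects on this list implement one of two ideas: pruning (dropping low-importance weights) or quantization (storing weights in fewer bits). As a minimal, framework-free sketch of both ideas (the function names and thresholds here are illustrative, not the API of any listed project):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    Unstructured magnitude pruning in its simplest form; ties at the
    threshold may zero slightly more than the requested fraction.
    """
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else float("-inf")
    return [0.0 if abs(w) <= threshold else w for w in weights]


def quantize_affine(weights, bits=8):
    """Map floats onto a uniform integer grid; return ints plus the
    (scale, zero-point) needed to reconstruct approximate floats."""
    lo, hi = min(weights), max(weights)
    qmax = 2 ** bits - 1
    scale = (hi - lo) / qmax or 1.0  # avoid zero scale for constant inputs
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo


def dequantize(q, scale, zero):
    """Reconstruct approximate float weights from the integer grid."""
    return [v * scale + zero for v in q]
```

Real toolkits refine both ideas: Torch-Pruning, for instance, prunes whole structurally dependent channels rather than individual weights, and SqueezeLLM replaces the uniform grid above with a non-uniform (dense-and-sparse) one.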

Python model-compression discussion


Python model-compression related posts

  • CVPR Edition: Voxel51 Filtered Views Newsletter - June 21, 2024

    5 projects | dev.to | 21 Jun 2024
  • Llama33B vs Falcon40B vs MPT30B

    2 projects | /r/LocalLLaMA | 5 Jul 2023
  • [P] Help: I want to compress EfficientnetV2 using pruning.

    1 project | /r/MachineLearning | 28 Jun 2023
  • SqueezeLLM: Dense-and-Sparse Quantization

    1 project | news.ycombinator.com | 15 Jun 2023
  • New quantization method SqueezeLLM allows for lossless compression at 3-bit and outperforms GPTQ and AWQ in both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.

    2 projects | /r/LocalLLaMA | 14 Jun 2023
  • Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems

    1 project | /r/machinelearningnews | 8 Jun 2022
  • GNN for computer vision, beating CNN & Transformer

    1 project | /r/deeplearning | 4 Jun 2022

Index

What are some of the best open-source model-compression projects in Python? This list will help you:

# Project Stars
1 Efficient-AI-Backbones 4,186
2 Pretrained-Language-Model 3,081
3 Torch-Pruning 2,980
4 model-optimization 1,531
5 DeepCache 886
6 SqueezeLLM 685
7 archai 475
8 q-diffusion 347
9 KVQuant 339
10 only_train_once_personal_footprint 302
11 picollm 233
12 SVD-LLM 198
13 UPop 101
14 MQAT 3
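SVD-LLM, near the bottom of the table, belongs to a third family: low-rank compression, which replaces a weight matrix with the product of two thin factors obtained from a truncated SVD. A hedged NumPy sketch of the basic factorization (the matrix sizes and rank are arbitrary examples, not the paper's settings):

```python
import numpy as np


def low_rank_factors(W, rank):
    """Truncated SVD: return factors A (m x r) and B (r x n) with A @ B ≈ W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank, :]


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
A, B = low_rank_factors(W, rank=8)
# Storage drops from 64 * 32 = 2048 floats to 64 * 8 + 8 * 32 = 768.
```

The savings come from never re-multiplying the factors: a layer computes `(x @ A) @ B` directly, so both memory and FLOPs scale with the rank.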


Did you know that Python is the 2nd most popular programming language based on number of references?