Sparsebit
aimet
Metric | Sparsebit | aimet
---|---|---
Mentions | 1 | 2
Stars | 320 | 1,925
Growth | 1.3% | 3.3%
Activity | 5.9 | 9.6
Last commit | 4 months ago | 5 days ago
Language | Python | Python
License | Apache License 2.0 | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
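The exact formulas behind these numbers are not given here. As a rough illustration only, the sketch below shows one way such metrics could be computed: month-over-month star growth as a percentage, a raw activity value in which recent commits count more than older ones, and a 0 to 10 score derived from a project's percentile rank among all tracked projects. The function names, the exponential decay, and the 30-day half-life are all assumptions made for this example, not the site's actual method.

```python
from bisect import bisect_left


def month_over_month_growth(stars_now, stars_month_ago):
    """Percentage change in stars over one month (assumed definition)."""
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Raw activity: each commit's weight halves every `half_life_days`.

    The exponential decay and the 30-day half-life are hypothetical choices.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw_scores):
    """Map a raw value to a 0-10 score by percentile rank (assumed scaling).

    A score of 9.0 means 90% of tracked projects have a strictly lower raw
    value, i.e. the project is in the top 10%.
    """
    ranked = sorted(all_raw_scores)
    below = bisect_left(ranked, raw)  # count of strictly lower raw values
    return round(10.0 * below / len(ranked), 1)
```

Under these assumptions, a project whose raw value exceeds that of nine out of ten tracked projects would score 9.0, which matches the "top 10%" reading of the example above.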
- I was looking for some great quantization open-source libraries that could actually be applied in production (both edge or cloud CPU/GPU). Do you know if I am missing any good libraries?
Qualcomm AIMET | Advanced quantization and compression techniques for trained neural network models
- Model/Tool to use on Jetson for efficient Quantization/Pruning
Qualcomm AIMET may help you
What are some alternatives?
LLaMA-8bit-LoRA - Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only.
tkDNN - Deep neural network library and toolkit for high-performance inference on NVIDIA Jetson platforms
sparsegpt-for-LLaMA - Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.
ludwig - Low-code framework for building custom LLMs, neural networks, and other AI models
tabmat - Efficient matrix representations for working with tabular data
open-lpr - Open Source and Free License Plate Recognition Software
FQ-ViT - [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
model-optimization - A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
qkeras - QKeras: a quantization deep learning library for Tensorflow Keras
alpaca-lora - Instruct-tune LLaMA on consumer hardware
TensorRT - NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.