Sparsebit vs sparsegpt-for-LLaMA
Metric | Sparsebit | sparsegpt-for-LLaMA
---|---|---
Mentions | 1 | 3
Stars | 320 | 65
Growth | 1.3% | -
Activity | 5.9 | 5.2
Last commit | 4 months ago | about 1 year ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
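The definitions above do not give exact formulas, so the following is only a rough sketch of how such metrics could be computed. The function names, the exponential half-life weighting, and the linear percentile-to-score mapping are all assumptions made for illustration, not the tracker's actual method.

```python
def month_over_month_growth(stars_before: int, stars_now: int) -> float:
    """Percentage change in star count over one month."""
    if stars_before <= 0:
        raise ValueError("need a positive baseline star count")
    return 100.0 * (stars_now - stars_before) / stars_before


def recency_weighted_commits(commit_ages_days, half_life_days: float = 30.0) -> float:
    """Sum of commit weights where a commit's weight halves every
    `half_life_days` (assumed weighting; newer commits count more)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(project_score: float, all_scores) -> float:
    """Map a project's rank among all tracked projects onto a 0-10 scale.

    Assumed linear mapping: a project that beats 90% of the tracked
    projects gets 9.0, i.e. it sits in the top 10%.
    """
    scores = list(all_scores)
    if not scores:
        raise ValueError("need at least one tracked project")
    beaten = sum(1 for s in scores if s < project_score)
    return 10.0 * beaten / len(scores)


if __name__ == "__main__":
    # Hypothetical numbers, not taken from either project.
    print(month_over_month_growth(200, 210))
    print(recency_weighted_commits([0, 30]))
    print(activity_score(9, range(10)))
```

Under these assumptions, a project whose weighted commit score is higher than that of nine out of ten tracked projects lands at 9.0, which matches the "top 10%" reading given in the example above.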
SparseGPT: Language Models Can Be Accurately Pruned in One-Shot
https://github.com/AlpinDale/sparsegpt-for-LLaMA
> # Prune to 50\% + 4-bit with SparseGPT -- Currently not working
- [R] 🤖🌟 Unlock the Power of Personal AI: Introducing ChatLLaMA, Your Custom Personal Assistant! 🚀💬
What are some alternatives?
LLaMA-8bit-LoRA - Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only.
serge - A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
tabmat - Efficient matrix representations for working with tabular data
FQ-ViT - [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
trl - Train transformer language models with reinforcement learning.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
alpaca-lora - Instruct-tune LLaMA on consumer hardware
llama.cpp - LLM inference in C/C++
aimet - AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
sparsegpt - Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".