AutoGPTQ Alternatives

Similar projects and alternatives to AutoGPTQ

text-generation-webui
Mentions: 876 | Stars: 35,862 | Activity: 9.9 | Language: Python
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

llama.cpp
Mentions: 769 | Stars: 55,846 | Activity: 10.0 | Language: C++
LLM inference in C/C++

koboldcpp
Mentions: 179 | Stars: 3,749 | Activity: 10.0 | Language: C++
A simple one-file way to run various GGML and GGUF models with KoboldAI's UI

transformers
Mentions: 175 | Stars: 124,557 | Activity: 10.0 | Language: Python
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

alpaca-lora
Mentions: 107 | Stars: 18,167 | Activity: 3.6 | Language: Jupyter Notebook
Instruct-tune LLaMA on consumer hardware

mlc-llm
Mentions: 89 | Stars: 16,774 | Activity: 9.9 | Language: Python
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

GPTQ-for-LLaMa
Mentions: 75 | Stars: 2,904 | Activity: 8.6 | Language: Python
4 bits quantization of LLaMA using GPTQ

qlora
Mentions: 80 | Stars: 9,388 | Activity: 7.4 | Language: Jupyter Notebook
QLoRA: Efficient Finetuning of Quantized LLMs

exllama
Mentions: 64 | Stars: 2,582 | Activity: 9.0 | Language: Python
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

basaran
Mentions: 22 | Stars: 1,281 | Activity: 10.0 | Language: Python
Discontinued. Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

Dependencies
Mentions: 24 | Stars: 8,109 | Activity: 0.0 | Language: C#
A rewrite of the old legacy software "depends.exe" in C# for Windows devs to troubleshoot dll load dependencies issues.

sentencepiece
Mentions: 19 | Stars: 9,451 | Activity: 8.3 | Language: C++
Unsupervised text tokenizer for Neural Network-based text generation.

gpt-llama.cpp
Mentions: 12 | Stars: 585 | Activity: 8.2 | Language: JavaScript
A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.

catai
Mentions: 7 | Stars: 406 | Activity: 8.6 | Language: TypeScript
UI for 🦙model . Run AI assistant locally ✨

self-refine
Mentions: 8 | Stars: 476 | Activity: 7.9 | Language: Python
LLMs can generate feedback on their work, use it to improve the output, and repeat this process iteratively.

SpQR
Mentions: 4 | Stars: 512 | Activity: 6.7 | Language: Python

ray-llm
Mentions: 4 | Stars: 1,126 | Activity: 8.8 | Language: Python
RayLLM - LLMs on Ray

gptq-cuda-api
Mentions: 2 | Stars: 19 | Activity: 3.9 | Language: Python

lora-llm-qa-g
Mentions: 2 | Stars: 5 | Activity: 1.9 | Language: TypeScript
This TypeScript CLI tool helps you generate question-answer pairs for input examples using OpenAI's GPT-3

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better AutoGPTQ alternative or higher similarity.


AutoGPTQ reviews and mentions

Posts with mentions or reviews of AutoGPTQ. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-10.

Setting up LLAMA2 70B Chat locally
1 project | /r/developersIndia | 18 Aug 2023
Experience of setting up LLAMA 2 70B Chat locally
1 project | /r/LocalLLaMA | 17 Aug 2023
GPT-4 Details Leaked
3 projects | news.ycombinator.com | 10 Jul 2023

Deploying the 60B version is a challenge though and you might need to apply 4-bit quantization with something like https://github.com/PanQiWei/AutoGPTQ or https://github.com/qwopqwop200/GPTQ-for-LLaMa . Then you can improve the inference speed by using https://github.com/turboderp/exllama .
If you prefer to use an "instruct" model à la ChatGPT (i.e. that does not need few-shot learning to output good results) you can use something like this: https://huggingface.co/TheBloke/Wizard-Vicuna-30B-Uncensored...
Loader Types
4 projects | /r/oobaboogazz | 26 Jun 2023

AutoGPTQ: an attempt at standardizing GPTQ-for-LLaMa and turning it into a library that is easier to install and use, and that supports more models. https://github.com/PanQiWei/AutoGPTQ
WizardLM-33B-V1.0-Uncensored
1 project | /r/LocalLLaMA | 24 Jun 2023
Any help converting an interesting .bin model to 4 bit 128g GPTQ? Bloke?
1 project | /r/LocalLLaMA | 18 Jun 2023

Just use the script: https://github.com/PanQiWei/AutoGPTQ/blob/main/examples/quantization/quant_with_alpaca.py
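
For readers unfamiliar with the "4 bit 128g" shorthand in the post title: it is commonly read as 4-bit weights with one scale and offset shared per group of 128 weights. The sketch below illustrates only that storage layout using plain round-to-nearest quantization. It is not the GPTQ algorithm, which additionally adjusts the remaining weights to compensate for rounding error, and it is not what the linked script does internally. All function names here are hypothetical.

```python
from typing import List, Tuple


def quantize_group(weights: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Round-to-nearest quantization of one group (NOT GPTQ, illustration only)."""
    levels = (1 << bits) - 1              # 15 distinct steps above zero for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each weight to an integer code in [0, levels], clamped for safety.
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, lo: float) -> List[float]:
    """Recover approximate weights from integer codes plus the group's scale/offset."""
    return [c * scale + lo for c in codes]


def quantize_row(row: List[float], bits: int = 4, group_size: int = 128):
    """Split a weight row into groups; each group keeps its own scale and offset."""
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]


if __name__ == "__main__":
    row = [i / 255 for i in range(256)]   # toy "weights", two groups of 128
    for codes, scale, lo in quantize_row(row):
        print(len(codes), round(scale, 4), round(lo, 4))
```

A smaller group size means more scales and offsets to store, in exchange for a tighter fit to each group's value range.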
LLM.int8(): 8-Bit Matrix Multiplication for Transformers at Scale
5 projects | news.ycombinator.com | 10 Jun 2023

In the wild, people tend to use GPTQ quantization for pure GPU inference: https://github.com/PanQiWei/AutoGPTQ
And ggml's quant for CPU inference with some offload, which just got updated to a more GPTQ-like method days ago: https://github.com/ggerganov/llama.cpp/pull/1684
Some other runtimes like Apache TVM also have their own quant implementations: https://github.com/mlc-ai/mlc-llm
For training, 4-bit bitsandbytes is SOTA, as far as I know.
TBH I'm not sure why this November paper is being linked. Few are running 8 bit models when they could fit a better 3-5 bit model in the same memory pool.
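
The memory argument in the comment above can be checked with back-of-the-envelope arithmetic. The sketch below assumes weight storage is simply parameters times bits per weight, and ignores activations, the KV cache, and quantization metadata such as per-group scales, so real limits are somewhat lower. The function names and the 24 GiB budget are illustrative assumptions, not taken from the post.

```python
def weight_bytes(n_params: int, bits: float) -> float:
    """Approximate bytes needed to store n_params weights at the given bit width."""
    return n_params * bits / 8


def max_params(budget_gib: float, bits: float) -> int:
    """Largest parameter count whose weights alone fit in budget_gib GiB."""
    return int(budget_gib * 2**30 * 8 / bits)


if __name__ == "__main__":
    budget = 24  # GiB, a hypothetical memory pool
    for bits in (8, 5, 4, 3):
        billions = max_params(budget, bits) / 1e9
        print(f"{bits}-bit: roughly {billions:.1f}B parameters fit in {budget} GiB")
```

Under these assumptions, halving the bit width doubles the number of parameters that fit, which is the trade-off the commenter is describing.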
Introducing Basaran: self-hosted open-source alternative to the OpenAI text completion API
9 projects | /r/LocalLLaMA | 1 Jun 2023

Instead of integrating GPTQ-for-LLaMa, use AutoGPTQ.
AutoGPTQ - An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm
1 project | /r/aipromptprogramming | 1 Jun 2023

1 project | /r/AutoGPT | 31 May 2023

Stats

Basic AutoGPTQ repo stats

Mentions

Stars: 3,744
Activity: 9.5
Last Commit: 5 days ago

AutoGPTQ/AutoGPTQ is an open source project licensed under MIT License which is an OSI approved license.

The primary programming language of AutoGPTQ is Python.

Popular Comparisons