AutoGPTQ vs self-refine

| | AutoGPTQ | self-refine |
|---|---|---|
| Mentions | 19 | 8 |
| Stars | 3,806 | 488 |
| Growth | 5.0% | - |
| Activity | 9.3 | 7.9 |
| Latest commit | 4 days ago | 4 months ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
AutoGPTQ
- Setting up LLAMA2 70B Chat locally
- Experience of setting up LLAMA 2 70B Chat locally
-
GPT-4 Details Leaked
Deploying the 60B version is a challenge though and you might need to apply 4-bit quantization with something like https://github.com/PanQiWei/AutoGPTQ or https://github.com/qwopqwop200/GPTQ-for-LLaMa . Then you can improve the inference speed by using https://github.com/turboderp/exllama .
If you prefer to use an "instruct" model à la ChatGPT (i.e. that does not need few-shot learning to output good results) you can use something like this: https://huggingface.co/TheBloke/Wizard-Vicuna-30B-Uncensored...
-
Loader Types
AutoGPTQ: an attempt at standardizing GPTQ-for-LLaMa and turning it into a library that is easier to install and use, and that supports more models. https://github.com/PanQiWei/AutoGPTQ
- WizardLM-33B-V1.0-Uncensored
-
Any help converting an interesting .bin model to 4 bit 128g GPTQ? Bloke?
Just use the script: https://github.com/PanQiWei/AutoGPTQ/blob/main/examples/quantization/quant_with_alpaca.py
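The "4 bit 128g" in the question refers to group-wise quantization: weights are split into groups of 128, and each group gets its own scale. Below is an illustrative NumPy sketch of that idea only (the function names are made up, and real GPTQ does more than this: it uses second-order information to compensate for rounding error, rather than plain round-to-nearest):

```python
import numpy as np

def quantize_4bit_groupwise(w, group_size=128):
    """Round-to-nearest 4-bit quantization with one scale per group of weights.

    Illustrative only: GPTQ proper additionally corrects rounding error
    using approximate Hessian information.
    """
    w = w.reshape(-1, group_size)                      # one row per group
    scale = np.abs(w).max(axis=1, keepdims=True) / 7   # map each group into int4 range
    scale[scale == 0] = 1.0                            # avoid division by zero
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from int4 codes and group scales."""
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale = quantize_4bit_groupwise(w, group_size=128)
w_hat = dequantize(q, scale)
err = np.abs(w - w_hat).max()  # maximum reconstruction error, bounded by half a scale step
```

Storage-wise this is why group size matters: smaller groups mean more scales (overhead) but finer-grained quantization, hence lower error.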
-
LLM.int8(): 8-Bit Matrix Multiplication for Transformers at Scale
In the wild, people tend to use GPTQ quantization for pure GPU inference: https://github.com/PanQiWei/AutoGPTQ
And ggml's quant for CPU inference with some offload, which just got updated to a more GPTQ-like method days ago: https://github.com/ggerganov/llama.cpp/pull/1684
Some other runtimes like Apache TVM also have their own quant implementations: https://github.com/mlc-ai/mlc-llm
For training, 4-bit bitsandbytes is SOTA, as far as I know.
TBH I'm not sure why this November paper is being linked. Few are running 8 bit models when they could fit a better 3-5 bit model in the same memory pool.
-
Introducing Basaran: self-hosted open-source alternative to the OpenAI text completion API
Instead of integrating GPTQ-for-LLaMa, use AutoGPTQ.
- AutoGPTQ - An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm
self-refine
- Self-Refine: Iterative Refinement with Self-Feedback
-
ChemCrow: Augmenting large-language models with chemistry tools
>the system's operation is well understood
That's like saying human behavior is well understood because we know how neurons communicate signals. It's too low level to be useful, hence psychology.
>They don't have the ability to reason or reflect.
Yes they do
https://selfrefine.info/
https://arxiv.org/abs/2303.11366
-
Generative Agents: Interactive Simulacra of Human Behavior
Literally you've been told it doesn't need that much handholding.
https://selfrefine.info/
-
Large Language Models Are Human-Level Prompt Engineers
For those curious about self-refining systems: https://selfrefine.info/ (our recent work).
-
This AI Paper Introduces SELF-REFINE: A Framework For Improving Initial Outputs From LLMs Through Iterative Feedback And Refinement
Project: https://selfrefine.info/
-
Self-Refine: Iterative Refinement with Self-Feedback - a novel approach that allows LLMs to iteratively refine outputs and incorporate feedback along multiple dimensions to improve performance on diverse tasks
Found relevant code at https://github.com/madaan/self-refine + all code implementations here
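The core of Self-Refine is a generate / critique / revise loop in which a single LLM plays all three roles via different prompts. A minimal control-flow sketch, with a hypothetical `llm` callable standing in for a real model (the prompt wording and stop condition here are illustrative, not the paper's exact prompts):

```python
def self_refine(llm, task, max_iters=4):
    """Iteratively improve an answer using the model's own feedback."""
    output = llm(f"Task: {task}\nAnswer:")
    for _ in range(max_iters):
        feedback = llm(f"Task: {task}\nAnswer: {output}\nCritique:")
        if "looks good" in feedback.lower():  # model judges its own output acceptable
            break
        output = llm(
            f"Task: {task}\nAnswer: {output}\n"
            f"Critique: {feedback}\nImproved answer:"
        )
    return output

# Toy stand-in "LLM" that fixes one flaw per round, just to show the loop runs.
def toy_llm(prompt):
    if prompt.endswith("Critique:"):
        return "add units" if "42" in prompt and "km" not in prompt else "looks good"
    if "Improved answer:" in prompt:
        return "42 km"
    return "42"
```

The `max_iters` cap matters in practice: without it, a model that never approves its own output would loop forever.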
What are some alternatives?
exllama - A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
prompt-lib - A set of utilities for running few-shot prompting experiments on large-language models
llama.cpp - LLM inference in C/C++
temporal-graph-gen - Pre-trained models for our work on Temporal Graph Generation
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
basaran - Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ
ray-llm - RayLLM - LLMs on Ray
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
catai - UI for 🦙 model. Run AI assistant locally ✨
gptq-cuda-api
course - The Hugging Face course on Transformers