And exactly what Triton version are they comparing against? I just tried the latest version, and on my 4090/12900K I get 77 tokens per second for Llama 7B-128g. My own GPTQ CUDA implementation gets 151 tokens/second on the same model and same hardware. That makes it 96% faster, whereas AWQ is only 79% faster. For 30B-128g I'm currently only getting a 110% speedup over Triton compared to their 178%, but it still seems a little disingenuous to compare their own CUDA implementation against the Triton kernel only, when they're trying to present the quantization method as being faster for inference.
GitHub: https://github.com/mit-han-lab/llm-awq
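For reference, the speedup percentages in the comment above are plain throughput ratios; a minimal sketch using the numbers quoted (77 and 151 tokens/second are the commenter's measurements, not official benchmarks):

```python
def speedup_pct(new_tps: float, baseline_tps: float) -> float:
    """Percent throughput improvement of new_tps over a baseline, in tokens/sec."""
    return (new_tps / baseline_tps - 1.0) * 100.0

# Llama 7B-128g on a 4090/12900K, per the comment:
# Triton GPTQ baseline: 77 tok/s, custom CUDA GPTQ: 151 tok/s
print(f"CUDA GPTQ vs Triton: {speedup_pct(151, 77):.0f}% faster")  # ~96% faster
```

The same formula applied to AWQ's reported numbers yields the 79% and 178% figures being compared.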
Summary of the study by Claude-100k if anyone is interested: