[Nvidia] Guide: Getting llama-7b 4bit running in simple(ish?) steps!

This page summarizes the projects mentioned and recommended in the original post on /r/Oobabooga

  • GPTQ-for-LLaMa

    4-bit quantization of LLaMA using GPTQ

  • quant_cuda comes from https://github.com/qwopqwop200/GPTQ-for-LLaMa, the library needed to run 4-bit models. If you're missing it, it either was never installed or its installation failed; a build sketch follows below.
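
    A minimal build sketch, assuming an NVIDIA GPU and a CUDA toolkit that matches your PyTorch build (setup_cuda.py was the build script documented in that repo at the time of this post):

      git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
      cd GPTQ-for-LLaMa
      python setup_cuda.py install
      # verify the extension is importable; this fails if the build did not complete
      python -c "import quant_cuda"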

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
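
    For orientation, a 4-bit LLaMA model was typically launched like this (a sketch: the --wbits and --groupsize flags existed in versions from this era but have changed since, and llama-7b-4bit is a placeholder model folder name):

      # start the web UI with a GPTQ-quantized 7B model
      python server.py --model llama-7b-4bit --wbits 4 --groupsize 128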

  • To merge upstream changes into a fork: add the upstream remote, fetch it, then squash-merge. Since git merge --squash stages the changes without committing, the message goes on a separate git commit:

      git remote add upstream https://github.com/oobabooga/text-generation-webui.git
      git fetch upstream
      git merge --squash upstream/main
      git commit -m "Merge upstream"

  • stable-diffusion-webui-docker

    Easy Docker setup for Stable Diffusion with user-friendly UI

  • text-generation-webui

    A Gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, GPT-Neo, and Pygmalion. (by TheTerrasque)

  • You will need the latest git version, not the v0.1 release (https://github.com/TheTerrasque/text-generation-webui -> Code -> Download ZIP). That version holds the first official LoRA support code from the webui project, but I haven't tested it much.
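
    Alternatively, cloning the repo instead of downloading the zip makes later updates easier; a sketch, assuming git is installed:

      git clone https://github.com/TheTerrasque/text-generation-webui
      cd text-generation-webui
      # confirm you are on the latest commit rather than the v0.1 tag
      git log -1 --oneline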

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
