[D] First glance at LLaMA

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • llama-int8

    Quantized inference code for LLaMA models

  • To add a bit more context: the code other people linked (https://github.com/tloen/llama-int8) assumes a single GPU, so if you want to run it on 2x RTX 3090 you'll need to modify it a bit; a rough sketch of the multi-GPU idea follows after this list.

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and Llama models.

  • I got the 13B model to work with an RTX 3060 12GB plus CPU offload and 64GB of RAM, using this repo: https://github.com/oobabooga/text-generation-webui (see the offload sketch below).

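The original comment's multi-GPU patch isn't reproduced here. As a rough sketch of the same idea, splitting an int8 LLaMA across two 24 GB cards, one option is to load a checkpoint already converted to Hugging Face format with transformers, accelerate, and bitsandbytes, and let `device_map`/`max_memory` shard the layers; the model path and per-GPU budgets below are placeholder assumptions, not values from the thread.

```python
# Sketch: int8 LLaMA sharded across two GPUs via transformers + accelerate +
# bitsandbytes. This is NOT the llama-int8 patch from the thread; it assumes
# a checkpoint already converted to Hugging Face format (path is a placeholder).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/converted-llama-13b"  # placeholder HF-format checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,                    # int8 weights via bitsandbytes
    device_map="auto",                    # shard layers across visible GPUs
    max_memory={0: "22GiB", 1: "22GiB"},  # leave headroom on each 3090
    torch_dtype=torch.float16,            # dtype for the non-quantized parts
)

prompt = "The LLaMA models were trained on"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With `device_map="auto"`, accelerate inserts hooks that move activations between the two cards during generation, so no manual model-parallel code is needed for this route.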
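For the 12 GB GPU plus 64 GB RAM setup, the key mechanism is accelerate-style weight offloading: the GPU holds as many layers as fit its budget and the rest stay in system RAM. The standalone sketch below shows that mechanism directly; it is illustrative rather than text-generation-webui's exact code path, and the checkpoint path and memory budgets are assumptions.

```python
# Sketch: fit a 13B LLaMA on a 12 GB card by spilling layers to system RAM.
# accelerate's device_map honors the per-device budgets in max_memory, so the
# GPU holds what fits and the overflow stays on CPU (slower, but it runs).
# Illustrative only; not text-generation-webui's exact invocation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/converted-llama-13b"  # placeholder HF-format checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                # ~26 GB of weights for 13B
    device_map="auto",                        # fill the GPU first
    max_memory={0: "11GiB", "cpu": "48GiB"},  # 12 GB card + 64 GB RAM box
)

inputs = tokenizer("LLaMA is", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```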
