A brief history of LLaMA models

LLaMA_MPS

Mentions: 4 | Stars: 566 | Activity: 10.0 | Language: Python

Discontinued. Run LLaMA inference on Apple Silicon GPUs.

Most places that recommend llama.cpp for Mac fail to mention https://github.com/jankais3r/LLaMA_MPS, which runs unquantized 7B and 13B models directly on the M1/M2 GPU. It's slightly slower (not by a lot) and uses significantly less energy. To me, the win of not having to quantize is huge; I wish more people knew about it.
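For a rough sense of what skipping quantization costs in memory, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not anything taken from LLaMA_MPS: it assumes nominal parameter counts (7 and 13 billion), 16-bit weights for the unquantized case, and a 4-bit format for comparison, and it ignores activations and the KV cache. The function name `weight_gib` is made up for illustration.

```python
def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Nominal parameter counts (assumption); real checkpoints differ slightly.
for name, n_params in {"7B": 7e9, "13B": 13e9}.items():
    fp16 = weight_gib(n_params, 16)  # assumed unquantized format
    q4 = weight_gib(n_params, 4)     # assumed 4-bit quantized format
    print(f"{name}: 16-bit ~{fp16:.1f} GiB, 4-bit ~{q4:.1f} GiB")
```

Under these assumptions the unquantized weights take about four times the memory of a 4-bit version, so whether they fit depends on how much unified memory the machine has.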

llama-dfdx

Mentions: 2 | Stars: 94 | Activity: 7.3 | Language: Rust

LLaMA 7B with CUDA acceleration, implemented in Rust. Minimal GPU memory needed!

There's a Rust deep learning library called dfdx that just set up LLaMA: https://github.com/coreylowman/llama-dfdx

mlc-llm

Mentions: 89 | Stars: 16,955 | Activity: 9.9 | Language: Python

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
dalai

Mentions: 59 | Stars: 13,044 | Activity: 6.5 | Language: CSS

The simplest way to run LLaMA on your local machine

I had it running before with Dalai (https://github.com/cocktailpeanut/dalai) but have since moved to the browser-based WebGPU method (https://mlc.ai/web-llm/), which uses Vicuna 7B and is quite good.

web-llm

Mentions: 42 | Stars: 9,102 | Activity: 9.1 | Language: TypeScript

Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.

I had it running before with Dalai (https://github.com/cocktailpeanut/dalai) but have since moved to the browser-based WebGPU method (https://mlc.ai/web-llm/), which uses Vicuna 7B and is quite good.

text-generation-webui

Mentions: 876 | Stars: 36,293 | Activity: 9.9 | Language: Python

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

Is it?
Literally every example I've seen so far is completely unversioned and, as a direct consequence, simply stops working mere weeks after being written.
E.g.: https://github.com/oobabooga/text-generation-webui/blob/ee68...
Take this line:
    pip3 install torch torchvision torchaudio

peft

Mentions: 26 | Stars: 13,783 | Activity: 9.7 | Language: Python

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Wow. Fewer than half of those have any version specified. The rest? "Meh, I don't care, whatever."
Then this beauty:
    git+https://github.com/huggingface/peft
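
The complaint about unversioned installs can be checked mechanically. What follows is a minimal sketch, assuming a pip-style list of requirement lines: it flags any entry that lacks an exact `==` pin, and any `git+` URL that is not tied to a full commit hash. The `unpinned` helper, both regular expressions, and the `somepackage==1.2.3` entry are hypothetical and only for illustration; the sketch does not cover the full requirements syntax (environment markers, `--hash` options, `-r` includes).

```python
import re

# Matches "name==1.2.3" style pins, with optional extras such as name[extra]==1.2.3.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*\S+")
# Matches a git URL tied to a full 40-character commit hash: git+https://...@<sha>.
VCS_PINNED = re.compile(r"^git\+\S+@[0-9a-f]{40}(#\S*)?$")


def unpinned(lines):
    """Return the requirement lines that do not fix an exact version."""
    flagged = []
    for raw in lines:
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        if PINNED.match(line) or VCS_PINNED.match(line):
            continue
        flagged.append(line)
    return flagged


example = [
    "torch",
    "torchvision",
    "torchaudio",
    "git+https://github.com/huggingface/peft",
    "somepackage==1.2.3",  # hypothetical pinned entry, for contrast
]
print(unpinned(example))
```

Run against the lines quoted above, this flags all three PyTorch packages and the bare `git+` URL, and passes only the pinned entry.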

RedPajama-Data

Mentions: 19 | Stars: 4,329 | Activity: 6.0 | Language: Python

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

There are efforts to provide an open-source replica of the training dataset, along with independently trained models. So far, the dataset has been recreated following the original paper (allowing for some details that Meta's researchers left unspecified):
https://github.com/togethercomputer/RedPajama-Data/
https://twitter.com/togethercompute/status/16479179892645191...

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
