Alpaca: An Instruction-Tuned LLaMA 7B. Responses on par with text-davinci-003. Demo up

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • Might I suggest looking at the story between the 2nd and 10th of March? I've noticed Hacker News hasn't been following certain areas of the effort. A lot of great work has happened, and continues to happen, in close conjunction with text-generation-webui (seriously, most of the cutting-edge work with 4-bit GPTQ etc. has been closely tied to the project).

    >https://github.com/oobabooga/text-generation-webui/

  • stanford_alpaca

    Code and documentation to train Stanford's Alpaca models, and generate the data.

  • Here's a link that opens their training data (52,000 rows) in Datasette Lite: https://lite.datasette.io/?json=https://github.com/tatsu-lab...

    That means you can run SQL LIKE queries against it to get a feel for what's in there (a minimal sketch of doing the same locally follows below).
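
    For example, here is a minimal sketch of loading that data into SQLite locally and running a LIKE query with Python. The file name alpaca_data.json and the column names (instruction, input, output) are assumptions based on the stanford_alpaca repo, not something this page specifies:

    ```python
    import json
    import sqlite3

    # Load the Alpaca training data (assumed: alpaca_data.json from the
    # stanford_alpaca repo, with instruction/input/output fields) into an
    # in-memory SQLite table.
    with open("alpaca_data.json") as f:
        rows = json.load(f)

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE alpaca (instruction TEXT, input TEXT, output TEXT)")
    db.executemany("INSERT INTO alpaca VALUES (:instruction, :input, :output)", rows)

    # Example LIKE query: find instructions that mention poems.
    query = "SELECT instruction, output FROM alpaca WHERE instruction LIKE '%poem%' LIMIT 5"
    for instruction, output in db.execute(query):
        print(instruction, "->", output[:80])
    ```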

  • llama

    Inference code for Llama models

  • This is why I think we're seeing a Stable Diffusion moment for LLMs: https://simonwillison.net/2023/Mar/11/llama/

    Look at the timeline:

    24th February 2023: LLaMA is announced, starts being shared with academic partners: https://research.facebook.com/publications/llama-open-and-ef...

    2nd March: Someone posts a PR with a BitTorrent link to the models: https://github.com/facebookresearch/llama/pull/73

    10th March: First commit to llama.cpp by Georgi Gerganov: https://github.com/ggerganov/llama.cpp/commit/26c084662903dd...

    11th March: llama.cpp now runs the 7B model on a 4GB Raspberry Pi: https://twitter.com/miolini/status/1634982361757790209

    13th March (today): llama.cpp on a Pixel 6 phone: https://twitter.com/thiteanish/status/1635188333705043969

    And now, Alpaca. It's not even lunchtime yet!

  • llama.cpp

    LLM inference in C/C++

  • llama

    Inference code for LLaMA models (by gmorenz)

  • > All the magic of "7B LLaMA running on a potato" seems to involve lowering precision down to f16 and then further quantizing to int4.

    LLaMA weights are f16 to start with, so no lowering is necessary to get there.

    You can stream weights from RAM to the GPU pretty efficiently. If you have >= 32 GB of RAM and >= 2 GB of VRAM, my code here should work for you (the sketch below shows the core idea): https://github.com/gmorenz/llama/tree/gpu_offload

    There's probably a cleaner version of it somewhere else. Really you should only need >= 16 GB of RAM, but the (Meta-provided) code for loading the initial weights makes two copies of the weights in RAM simultaneously, which is completely unnecessary.
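
    For the curious, the core of the streaming idea looks roughly like this in PyTorch. This is a hypothetical simplification, not gmorenz's actual code (which lives in the gpu_offload branch linked above):

    ```python
    import torch
    from torch import nn

    @torch.no_grad()
    def streamed_forward(layers: nn.ModuleList, x: torch.Tensor, device="cuda"):
        """Run a transformer stack while keeping weights in CPU RAM.

        Each layer's weights are copied to the GPU just before the layer
        runs and moved back afterwards, so peak VRAM usage is roughly one
        layer's weights plus the activations.
        """
        x = x.to(device)
        for layer in layers:
            layer.to(device)   # stream this layer's weights RAM -> VRAM
            x = layer(x)
            layer.to("cpu")    # free VRAM before loading the next layer
        return x
    ```

    A production version would typically overlap the weight transfers with compute; this synchronous loop is just the simplest form of the trick.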

  • Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

  • Such a thing already exists, and there are already some results: https://open-assistant.io

    I'm not sure why the authors of Alpaca didn't try to train it on this dataset.

  • self-instruct

    Aligning pretrained language models with instruction data generated by themselves.

  • It says

    > We train the Alpaca model on 52K instruction-following demonstrations generated in the style of self-instruct using text-davinci-003

    Which leads to self-instruct: https://github.com/yizhongw/self-instruct

    From a glance, they used an LM to classify instructions and train the model, which IMHO is very similar to RLHF (a simplified sketch of the loop follows below).
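
    For intuition, the self-instruct bootstrapping loop looks roughly like this. This is a hypothetical sketch: complete and is_novel stand in for the LLM call (parsed into an instruction/output pair) and the repo's similarity-based filter:

    ```python
    import random

    def self_instruct(seed_tasks, complete, is_novel, n_target=52_000):
        """Sketch of self-instruct-style data generation.

        seed_tasks: human-written (instruction, output) pairs
        complete:   wraps an LLM such as text-davinci-003 and parses its
                    continuation into an (instruction, output) pair
        is_novel:   rejects instructions too similar to ones already kept
        """
        pool = list(seed_tasks)
        generated = []
        while len(generated) < n_target:
            # Show the model a few in-context examples from the pool...
            demos = random.sample(pool, k=min(3, len(pool)))
            prompt = "\n\n".join(
                f"Instruction: {ins}\nOutput: {out}" for ins, out in demos
            ) + "\n\nInstruction:"
            # ...and have it continue with a new instruction and output.
            instruction, output = complete(prompt)
            if is_novel(instruction, pool):
                pool.append((instruction, output))
                generated.append({"instruction": instruction, "output": output})
        return generated  # Alpaca-style fine-tuning data
    ```

    Note that this produces supervised fine-tuning data rather than a reward model, which is why it is much cheaper than full RLHF even if the flavor is similar.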

  • ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

  • Also today: ChatGLM released by Tsinghua University. I've made a separate submission for it: https://news.ycombinator.com/item?id=35150190

    The GitHub page is https://github.com/THUDM/ChatGLM-6B. It's all in Chinese, but the model handles English queries well on a single consumer GPU. Considering its size, I'd say the quality of its responses is outstanding (a quick-start sketch follows below).
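
    For reference, the repo's quick-start boils down to a few lines with transformers (a sketch; the exact API may vary between releases):

    ```python
    from transformers import AutoModel, AutoTokenizer

    # Load ChatGLM-6B via the custom modeling code shipped with the repo
    # (hence trust_remote_code=True). .half() keeps the weights in f16 so
    # the model fits on a single consumer GPU (roughly 13 GB of VRAM).
    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
    model = model.eval()

    # chat() is a helper defined by the repo's modeling code, not stock transformers.
    response, history = model.chat(tokenizer, "What can you do?", history=[])
    print(response)
    ```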

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
