I noticed there are a couple of open issues on llama.cpp investigating quality problems. It's interesting that a wrong implementation can still generate plausible output. It sounds like an objective quality metric would help track down issues.
https://github.com/ggerganov/llama.cpp/issues/129
https://github.com/ggerganov/llama.cpp/issues/173
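One simple objective metric is perplexity over a fixed held-out text: a subtly wrong implementation tends to show up as a clearly higher perplexity even when its samples still look plausible. A toy sketch of the computation (this assumes you can extract per-token log-probabilities from the model; it's an illustration, not anything from the llama.cpp codebase):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability) over the evaluated tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    ground-truth token of a held-out text.
    """
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# A model assigning probability 0.25 to every token has perplexity 4.
print(perplexity([math.log(0.25)] * 10))  # ≈ 4.0
```

Running two implementations over the same text with the same weights and comparing perplexities would catch divergences that eyeballing samples misses.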
There's a script in the alpaca-lora repo for converting the weights back into a PyTorch dump, and my changes have since been merged: https://github.com/tloen/alpaca-lora/pull/19
> they perform a little worse.
Be aware that LoRA performs on par with or better than full fine-tuning in model quality if trained correctly, as the paper shows: https://arxiv.org/abs/2106.09685
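For reference, the core idea in the paper is tiny: freeze the pretrained weight W and learn a low-rank update B·A, so the adapted layer computes W·x + (α/r)·B·A·x. A minimal NumPy sketch of the forward pass (the names and the zero-init of B follow the paper; this is an illustration, not the alpaca-lora code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 4  # r << d is the low-rank bottleneck

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init

def lora_forward(x):
    # W is untouched; only A and B would receive gradients during training.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0, the adapted layer is exactly the original layer at init.
assert np.allclose(lora_forward(x), W @ x)
```

Since only A and B are trained (2·r·d parameters instead of d²), the update is cheap to store and share, which is why the Alpaca-style fine-tunes circulate as small adapter files.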
I ran it on a machine with 128 GB of RAM and a Ryzen 5950X. It's not fast, about 4 seconds per token, but it just about fits without swapping. https://github.com/Noeda/rllama/
I have not, but I want to in the near future because I'm really curious myself too. I've been following the Rust community, which now has a llama.cpp port as well as my OpenCL thing, and one discussion item has been running verification and a common benchmark across the implementations. https://github.com/setzer22/llama-rs/issues/4
I've mostly heard that, at least for the larger models, quantization has barely any noticeable effect.
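That matches the intuition behind block-wise 4-bit quantization: weights are quantized in small blocks with a per-block scale, so the rounding error on each weight stays small relative to the weights in its block. A rough NumPy sketch of a symmetric per-block 4-bit round trip (simplified for illustration; not ggml's exact on-disk format):

```python
import numpy as np

def q4_roundtrip(w, block=32):
    """Symmetric 4-bit quantize/dequantize with one scale per block."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7  # symmetric int4 range: -7..7
    scale[scale == 0] = 1.0                           # avoid dividing by zero
    q = np.clip(np.round(w / scale), -7, 7)           # the stored 4-bit codes
    return (q * scale).reshape(-1)                    # dequantized weights

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
err = np.abs(q4_roundtrip(w) - w).max()
print(f"max abs round-trip error: {err:.4f}")
```

The worst-case error per weight is half a quantization step (scale / 2), and since the scale adapts to each block of 32 weights, outliers in one block don't inflate the error everywhere else.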
Georgi rewrote the code on top of his own tensor library (ggml[0]).
[0] https://github.com/ggerganov/ggml
Stanford released the exact training data as well as the training script with all parameters. Boot up a p4d.24xlarge (8 A100 GPUs), which costs about $40/hour, let it run for 2-3 hours, and voilà: roughly $80-120 of compute. See the README in their repo where it mentions the fine-tuning script[0]
[0] https://github.com/tatsu-lab/stanford_alpaca