Show HN: Alpaca.cpp – Run an Instruction-Tuned Chat-Style LLM on a MacBook

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llama.cpp

    LLM inference in C/C++

  • I noticed there are a couple of open issues on llama.cpp investigating quality problems. It's interesting that a wrong implementation can still generate plausible output. It sounds like an objective quality metric would help track down these issues.

    https://github.com/ggerganov/llama.cpp/issues/129

    https://github.com/ggerganov/llama.cpp/issues/173
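
One common objective metric for this is perplexity on a held-out text: a correct implementation should assign high probability to reference tokens, so an implementation bug tends to show up as a perplexity regression even when samples still look plausible. A minimal sketch (the per-token probabilities here are made-up stand-ins for real model output):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood) over next-token probabilities."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical per-token probabilities from a good vs. a buggy implementation:
probs_good = [0.9, 0.8, 0.95]
probs_bad = [0.2, 0.1, 0.3]
assert perplexity(probs_good) < perplexity(probs_bad)
```

Tracking this single number on a fixed corpus makes implementations directly comparable.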

  • alpaca.cpp

    Locally run an Instruction-Tuned Chat-Style LLM (discontinued)

  • alpaca-lora

    Instruct-tune LLaMA on consumer hardware

  • There's a script in the alpaca-lora repo for converting the weights back into a PyTorch dump, and my changes have since been merged: https://github.com/tloen/alpaca-lora/pull/19
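
The merge itself is simple in principle: LoRA stores a low-rank update (B·A) alongside the frozen base weight, and "converting back" folds that update into the weight. A hypothetical NumPy sketch of the idea (not the actual script, which operates on PyTorch checkpoints):

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a LoRA update into the base weight: W' = W + (alpha / r) * B @ A."""
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d, r = 8, 2                      # tiny dimensions for illustration
W = rng.standard_normal((d, d))  # frozen base weight
A = rng.standard_normal((r, d))  # LoRA down-projection
B = np.zeros((d, r))             # LoRA up-projection (zero-initialized, as in the paper)
W_merged = merge_lora(W, A, B, alpha=16, r=r)
# With B still at its zero init, the merge is a no-op.
assert np.allclose(W_merged, W)
```

After merging, the model is a plain dense checkpoint again, which is what downstream converters like llama.cpp's expect.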

  • lora

    Using Low-rank adaptation to quickly fine-tune diffusion models. (by cloneofsimo)

  • >they perform a little worse.

    Be aware that, trained correctly, LoRA performs on par with or better than full fine-tuning in model quality, as the paper shows: https://arxiv.org/abs/2106.09685
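
The appeal is that a rank-r update trains only r·(d_in + d_out) parameters per adapted weight instead of d_in·d_out. A quick back-of-envelope calculation (using a 4096x4096 projection, the size found in LLaMA-7B's attention layers, and rank 8 as an illustrative choice):

```python
def full_params(d_out, d_in):
    """Parameters in a dense weight matrix."""
    return d_out * d_in

def lora_params(d_out, d_in, r):
    """Trainable parameters for a rank-r LoRA update of the same matrix."""
    return r * (d_in + d_out)

full = full_params(4096, 4096)     # 16,777,216
lora = lora_params(4096, 4096, 8)  # 65,536
assert lora / full < 0.004         # under 0.4% of the weights are trained
```

That reduction is what makes instruct-tuning feasible on consumer hardware.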

  • rllama

    Rust+OpenCL+AVX2 implementation of LLaMA inference code

  • I ran it on a machine with 128 GB of RAM and a Ryzen 5950X. It's not fast, about 4 seconds per token, but it just about fits without swapping. https://github.com/Noeda/rllama/
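
A back-of-envelope memory estimate shows why the largest model only just fits. This sketch assumes the memory footprint is dominated by the weights (ignoring activations and KV cache):

```python
def weight_gib(n_params, bits_per_weight):
    """Approximate weight footprint in GiB for a given precision."""
    return n_params * bits_per_weight / 8 / 2**30

# LLaMA sizes (billions of parameters) at float16 vs. 4-bit quantization:
for n_billion in (7, 13, 30, 65):
    f16 = weight_gib(n_billion * 10**9, 16)
    q4 = weight_gib(n_billion * 10**9, 4)
    print(f"{n_billion}B: {f16:.0f} GiB @ f16, {q4:.0f} GiB @ 4-bit")
```

At float16, a 65B model needs roughly 121 GiB of weights, so it barely squeezes into 128 GB of RAM; 4-bit quantization brings it down to about 30 GiB.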

  • llm

    An ecosystem of Rust libraries for working with large language models

  • I have not, but I want to in the near future because I'm really curious myself. I've been following the Rust community, which now has a llama.cpp port alongside my OpenCL implementation, and one discussion item has been running a verification and a common benchmark across the implementations. https://github.com/setzer22/llama-rs/issues/4

    I've mostly heard that, at least for the larger models, quantization has barely any noticeable effect.
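
The intuition is that quantization error is bounded by the step size. A minimal sketch of symmetric 4-bit quantization in the spirit of ggml's block-wise scheme (simplified: one scale per 32-value block, signed range [-7, 7]; not the exact Q4 format):

```python
import numpy as np

def quantize_q4(x):
    """Symmetric 4-bit quantization: scale the block into the int range [-7, 7]."""
    scale = float(np.abs(x).max()) / 7
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
x = rng.standard_normal(32).astype(np.float32)  # one 32-value block
q, scale = quantize_q4(x)
err = np.abs(dequantize(q, scale) - x).max()
assert err <= scale / 2 + 1e-6  # worst-case rounding error is half a step
```

Each weight is off by at most half a quantization step, which for large models tends to wash out in the aggregate, matching the "barely any noticeable effect" reports.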

  • ggml

    Tensor library for machine learning

  • Georgi rewrote the code on top of his own tensor library (ggml[0]).

    [0] https://github.com/ggerganov/ggml

  • stanford_alpaca

    Code and documentation to train Stanford's Alpaca models, and generate the data.

  • Stanford released the exact training data as well as the training script with all parameters. Boot up a p4.2xlarge (8 A100 GPUs), which costs about $40/hour, let it run for 2-3 hours, and voila. See the README in their repo, where it mentions the fine-tuning script[0]

    [0] https://github.com/tatsu-lab/stanford_alpaca
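
Back of the envelope, the quoted numbers put one full training run at about a hundred dollars:

```python
hourly_rate_usd = 40  # quoted price for the 8-GPU instance
hours = 3             # upper end of the quoted 2-3 hour run
total = hourly_rate_usd * hours
print(f"~${total} for one Alpaca fine-tuning run")  # ~$120
```

That low cost is a large part of why Alpaca-style reproductions spread so quickly.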

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • How it feels right now

    1 project | /r/singularity | 15 Jun 2023
  • LoRA tuning in julia

    1 project | /r/Julia | 27 May 2023
  • What does Lora mean?

    1 project | /r/StableDiffusion | 23 May 2023
  • [D] An ELI5 explanation for LoRA - Low-Rank Adaptation.

    4 projects | /r/MachineLearning | 19 May 2023
  • Combining LoRA, Retro, and Large Language Models for Efficient Knowledge Retrieval and Retention

    1 project | /r/MLQuestions | 18 May 2023