[Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • llama.cpp

    LLM inference in C/C++

    You can run llama-30B on a CPU using llama.cpp, it's just slow. The alpaca models I've seen are the same size as the llama model they are trained on, so I would expect running the alpaca-30B models will be possible on any system capable of running llama-30B.

  • alpaca.cpp

    Discontinued Locally run an Instruction-Tuned Chat-Style LLM

    LLaMa/Alpaca work just fine on CPU with llama.cpp/alpaca.cpp. Not very snappy but fast enough for me.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

    You are absolutely correct. text-gen-webui offers "streaming" via paging models in and out of VRAM. Using this your CPU no longer gets bogged down with running the model, but you don't see much improvement in generation speed as the GPU is churning with loading and unloading model data from main RAM all the time.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts