Guanaco 7B, 13B, 33B and 65B models by Tim Dettmers: now for your local LLM pleasure

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • GPTQ-for-LLaMa

    4 bits quantization of LLaMa using GPTQ (by oobabooga)

  • No, I still use ooba's fork to ensure the widest compatibility. I would love to use a later version; specifically, I want to move to AutoGPTQ. But if I do that, people who are still using ooba's fork (which is like 90% of people) can't use CPU offloading. They get a ton of errors and it just breaks.

  • starcoder

    Home of StarCoder: fine-tuning & inference!

  • Here's the script I use to merge a LoRA onto a base model: https://gist.github.com/TheBloke/d31d289d3198c24e0ca68aaf37a19032 (a slightly modified version of https://github.com/bigcode-project/starcoder/blob/main/finetune/merge_peft_adapters.py)
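
    For readers unfamiliar with what "merging a LoRA onto a base model" does, the following is a toy sketch of the underlying arithmetic, not the linked script. It assumes the standard LoRA formulation, in which each adapted weight matrix W gains a low-rank update scaled by alpha / r. The helper names matmul and merge_lora are hypothetical and exist only for this illustration; the linked scripts operate on full checkpoints through the PEFT library instead.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix.

    w:      base weight, shape (out_features, in_features)
    lora_a: adapter matrix A, shape (r, in_features)
    lora_b: adapter matrix B, shape (out_features, r)
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as w
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example: a 2x3 base weight with a rank-1 adapter (values are made up).
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # (r, in_features) = (1, 3)
B = [[0.5], [-1.0]]     # (out_features, r) = (2, 1)

merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)
```

    After merging, the adapter matrices are no longer needed at inference time, so the result can be saved and quantized like any ordinary checkpoint.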

  • GPTQ-for-LLaMa

    4 bits quantization of LLaMA using GPTQ

  • BTW, are you using this llama.py for quantization? https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/triton/llama.py

  • serge

    A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

  • u/The-Bloke Serge is with you (https://github.com/nsarrazin/serge/pull/334/files). Can you suggest the best ggml models to set in the model manager currently? :)

  • Local-LLM-Comparison-Colab-UI

    Compare the performance of different LLM that can be deployed locally on consumer hardware. Run yourself with Colab WebUI.

  • Colab webui for the guanaco-13B-GPTQ: Link

  • SillyTavern

    Discontinued LLM Frontend for Power Users. [Moved to: https://github.com/SillyTavern/SillyTavern] (by Cohee1207)

  • https://github.com/Cohee1207/SillyTavern: in the repo you will find everything you need. I use the Ooba Text Generation API as the backend.

  • private-gpt

    Interact with your documents using the power of GPT, 100% privately, no data leaks

  • #1: OfflineAI example stack: PrivateGPT: Interact privately with your documents using the power of GPT, 100% privately, no data leaks | 0 comments
    #2: 💫 Found Offline Code AI: StarCoder: How to use an LLM to code | 0 comments
    #3: 🍿Oobabooga with NEW Uncensored Wizard Mega 13B Model | 1 comment

  • langflow

    ⛓️ Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
