OpenLLaMA 13B Released

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • koboldcpp

    A simple one-file way to run various GGML and GGUF models with KoboldAI's UI

  • Koboldcpp [1], which builds on llama.cpp and adds a GUI, is a great way to run these models. Most people aren't running these models at full precision; GGML quantization is recommended for CPU+GPU, or GPTQ if you have the GPU VRAM.

    GGML 13B models at 4-bit (Q4_0) take around 9 GB of RAM, and Q5_K_M takes about 11 GB. GPU offloading support has also been added; I've been offloading 22 layers on my laptop's RTX 2070 Max-Q (8 GB VRAM). I get around 2-3 tokens per second with 13B models. In my experience, running 13B models is worth the extra time it takes to generate a response compared to 7B models. GPTQ is faster, but I can't fit a quantized 13B model in my VRAM, so I don't use it.

    TheBloke [2] has been quantizing models and uploading them to HF, and will probably upload a quantized version of this model soon. His Discord server also has good guides to help you get going; it is linked in the model card of most of his models.

    [1] https://github.com/LostRuins/koboldcpp

    [2] https://huggingface.co/TheBloke
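As a rough sanity check on the memory figures in the comment above, here is a back-of-the-envelope sketch. It assumes the Q4_0 layout of 32 weights per block stored as 16 bytes of 4-bit values plus a 2-byte fp16 scale, a nominal 13 billion parameters, and 40 transformer layers for a 13B LLaMA-style model. The function names are hypothetical, and the estimate covers weights only; the gap up to the quoted ~9 GB is presumably context buffers and runtime overhead, which is an assumption rather than something the comment states.

```python
# Back-of-the-envelope size of a Q4_0-quantized 13B model (weights only).
# Assumptions (not from the original post): Q4_0 block layout, nominal
# parameter count, and an even split of weights across 40 layers.

Q4_0_BLOCK_WEIGHTS = 32   # weights per quantization block
Q4_0_BLOCK_BYTES = 18     # 16 bytes of 4-bit values + 2-byte fp16 scale


def q4_0_weight_bytes(n_params: int) -> float:
    """Approximate bytes needed to store n_params weights in Q4_0."""
    return n_params / Q4_0_BLOCK_WEIGHTS * Q4_0_BLOCK_BYTES


def offloaded_gb(total_gb: float, gpu_layers: int, total_layers: int) -> float:
    """Approximate share of the weights placed on the GPU, treating all
    layers as equal in size and ignoring embeddings and the output layer."""
    return total_gb * gpu_layers / total_layers


n_params = 13_000_000_000                      # nominal "13B"
weights_gb = q4_0_weight_bytes(n_params) / 1e9
gpu_gb = offloaded_gb(weights_gb, gpu_layers=22, total_layers=40)

print(f"Q4_0 weights: {weights_gb:.2f} GB")
print(f"On GPU with 22 of 40 layers offloaded: {gpu_gb:.2f} GB")
```

Under these assumptions the weights alone come to a little over 7 GB, and offloading 22 of 40 layers would put roughly 4 GB of them on the GPU, which is at least consistent with fitting on an 8 GB card.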

  • llama.cpp

    LLM inference in C/C++

  • There are many UIs for running locally, but the easiest is koboldcpp:

    https://github.com/LostRuins/koboldcpp

    It's descended from the roleplaying community, but works fine (and performantly) for question answering and such.

    You will need to download the model from HF and quantize it yourself: https://github.com/ggerganov/llama.cpp#prepare-data--run

  • lm-evaluation-harness

    A framework for few-shot evaluation of language models.

  • There is the Language Model Evaluation Harness project which evaluates LLMs on over 200 tasks. HuggingFace has a leaderboard tracking performance on a subset of these tasks.

    https://github.com/EleutherAI/lm-evaluation-harness

    https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...

  • open_llama

    OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset

  • For how to have the LLaMA tokenizer (properly) handle repeated spaces, please see this discussion: https://github.com/openlm-research/open_llama/issues/40

  • dockerLLM

    TheBloke's Dockerfiles

  • https://www.runpod.io/console/templates

    This is the readme for the one I mentioned: https://github.com/TheBlokeAI/dockerLLM/blob/main/README_Run...

    > can I use Colab/Huggingface GPUs?

    You use these templates on the RunPod platform itself. There's no free tier equivalent like you have with Colab/HF, but currently you can rent an RTX 4090 for $0.69/hr, so it's pretty affordable.
