OpenLLaMA: a permissively licensed open-source reproduction of Meta AI's LLaMA 7B, trained on the RedPajama dataset.
Koboldcpp [1], which builds on llama.cpp and adds a GUI, is a great way to run these models. Most people aren't running these models at full weight; GGML quantization is recommended for CPU+GPU, or GPTQ if you have enough GPU VRAM.
GGML 13B models at 4-bit (Q4_0) take somewhere around 9 GB of RAM, and Q5_K_M takes about 11 GB. GPU offloading support has also been added; I've been using 22 layers on my laptop's RTX 2070 Max-Q with 8 GB VRAM. I get around 2-3 tokens per second with 13B models. In my experience, running 13B models is worth the extra time it takes to generate a response compared to 7B models. GPTQ is faster, but I can't fit a quantized 13B model in VRAM, so I don't use it.
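Those RAM figures line up with a simple back-of-envelope estimate: effective bits per weight times parameter count, plus a fixed allowance for the KV cache and runtime buffers. The numbers below are rough assumptions, not format specs (Q4_0 stores 4-bit weights plus a per-block scale, roughly 4.5 effective bits per weight; Q5_K_M lands around 5.5; the overhead constant is a guess):

```python
def quantized_ram_gb(n_params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 1.5) -> float:
    """Rough RAM estimate for a quantized GGML model.

    overhead_gb is a guessed allowance for the KV cache and runtime
    buffers; real usage also depends on context length.
    """
    weights_gb = n_params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weights_gb + overhead_gb

# 13B at ~4.5 effective bits/weight (Q4_0) -> ~8.8 GB, close to the ~9 GB observed
print(f"{quantized_ram_gb(13, 4.5):.1f} GB")
# 13B at ~5.5 effective bits/weight (Q5_K_M) -> ~10.4 GB, close to the ~11 GB observed
print(f"{quantized_ram_gb(13, 5.5):.1f} GB")
```

The same formula explains why 7B models fit comfortably on machines where 13B models are tight.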
TheBloke [2] has been quantizing models and uploading them to HF, and will probably upload a quantized version of this one soon. His Discord server also has good guides to help you get going; it's linked in the model card of most of his models.
[1] https://github.com/LostRuins/koboldcpp
[2] https://huggingface.co/TheBloke
There are many UIs for running locally, but the easiest is koboldcpp:
https://github.com/LostRuins/koboldcpp
It's descended from the roleplaying community, but works fine (and performantly) for question answering and such.
You will need to download the model from HF and quantize it yourself: https://github.com/ggerganov/llama.cpp#prepare-data--run
There is the Language Model Evaluation Harness project, which evaluates LLMs on over 200 tasks. HuggingFace has a leaderboard tracking performance on a subset of these tasks.
https://github.com/EleutherAI/lm-evaluation-harness
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...
For some discussion on how to have the LLaMa tokenizer (properly) handle repeating spaces, please see this discussion: https://github.com/openlm-research/open_llama/issues/40
https://www.runpod.io/console/templates
This is the readme for the one I mentioned: https://github.com/TheBlokeAI/dockerLLM/blob/main/README_Run...
> can I use Colab/Huggingface GPUs?
You use these templates on the RunPod platform itself. There's no free-tier equivalent like you have with Colab/HF, but currently you can rent an RTX 4090 for $0.69/hr, so it's pretty affordable.
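For a sense of what that hourly rate means per token, here is a quick cost calculation. Only the $0.69/hr rate comes from the comment above; the 40 tokens/second throughput is an assumed ballpark for a quantized 13B model on a 4090, not a benchmark:

```python
def cost_per_million_tokens(rate_per_hour: float, tokens_per_second: float) -> float:
    # seconds to generate 1M tokens, converted to hours, times the hourly rate
    hours = 1_000_000 / tokens_per_second / 3600
    return hours * rate_per_hour

# RTX 4090 rented at $0.69/hr; 40 tok/s is an assumption
print(f"${cost_per_million_tokens(0.69, 40):.2f} per 1M tokens")
```

Even if the real throughput is half that, the cost stays in the single-digit dollars per million tokens.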