Running Pygmalion-6B locally, CPU only, in less than 12 GB of RAM with a reasonable response time

llama.cpp

Mentions: 773 | Stars: 56,891 | Activity: 10.0 | Language: C++

LLM inference in C/C++

I've been playing with https://github.com/ggerganov/llama.cpp recently and was surprised by how little computing resources it requires. For people who don't know what it is, it's an implementation of inference for Facebook's LLaMA model in pure C/C++. Most of all, it doesn't require a GPU to run, uses less RAM, and responds in a reasonable time compared to running CUDA-oriented code on the CPU.

ggml

Mentions: 69 | Stars: 9,725 | Activity: 9.8 | Language: C

Tensor library for machine learning

The first problem is that llama.cpp doesn't support GPT-J models, but I found another project from the same author, https://github.com/ggerganov/ggml. It includes an example of converting the vanilla GPT-J 6B model to the ggml format, which is the format that llama.cpp supports. Since Pygmalion-6B was fine-tuned from GPT-J 6B, I believe the same conversion should work for it as well.
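To see why the precision of the stored weights matters for the "less than 12 GB of RAM" goal, here is a rough back-of-the-envelope sketch. It assumes a nominal 6 billion parameters for a GPT-J 6B class model and counts only the weights; the helper name `estimate_weight_memory_gib` is made up for illustration, and the bits-per-weight values are round numbers, not the exact on-disk layout of any ggml format. Real usage is higher once the context cache and runtime buffers are included.

```python
def estimate_weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB.

    Ignores the KV cache, activations, and any per-block quantization
    metadata, so treat the result as a lower bound.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Assumption: nominal parameter count for a "6B" model.
    N_PARAMS = 6e9

    # Assumption: idealized bit widths; actual quantized formats store
    # extra scale factors and therefore use slightly more per weight.
    for label, bits in [("float32", 32), ("float16", 16), ("4-bit (approx.)", 4)]:
        gib = estimate_weight_memory_gib(N_PARAMS, bits)
        print(f"{label:>16}: {gib:5.1f} GiB")
```

Under these assumptions, 16-bit weights alone already come close to the 12 GB budget, while a 4-bit quantized copy leaves plenty of headroom, which is why a quantized ggml file is the practical route for a CPU-only setup.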

serge

Mentions: 40 | Stars: 5,543 | Activity: 9.8 | Language: Svelte

A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Show HN: LlamaGPT – Self-hosted, offline, private AI chatbot, powered by Llama 2
12 projects | news.ycombinator.com | 16 Aug 2023

Need Help
2 projects | /r/LangChain | 15 Jun 2023

What the hell??
1 project | /r/Weird | 3 Jun 2023

We need decentralisation of AI. I'm not fan of monopoly or duopoly.
9 projects | /r/ChatGPT | 4 May 2023

SparseGPT: Language Models Can Be Accurately Pruned in One-Shot
4 projects | news.ycombinator.com | 3 May 2023

This page summarizes the projects mentioned and recommended in the original post on /r/PygmalionAI
llama alpaca Docker Fastapi llamacpp
Post date: 15 Mar 2023
