[P] rwkv.cpp: FP16 & INT4 inference on CPU for RWKV language model

rwkv.cpp

12 1,111 6.8 C++

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
cformers

4 315 6.7 C

SoTA Transformers with C-backend for fast inference on your CPU.

It's a combination of things, and removing Python from the loop isn't essential to achieving most of these performance gains. The main trick is quantizing the weights and compiling the model. A concrete example that builds on top of ggml with Python APIs: https://github.com/NolanoOrg/cformers
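To make the "quantizing the weights" part concrete, here is a minimal sketch of symmetric block-wise 4-bit quantization in plain Python. It is an illustration only: the function names, the block size of 32, and the one-scale-per-block layout are assumptions chosen for clarity, not the actual formats used by ggml, cformers, or rwkv.cpp.

```python
def quantize_block(weights, bits=4):
    """Quantize one block of floats to signed integers sharing a single scale.

    Hypothetical helper for illustration; not a real ggml/rwkv.cpp API.
    """
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit symmetric quantization
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return scale, q


def dequantize_block(scale, q):
    """Recover approximate float weights from a quantized block."""
    return [scale * v for v in q]


def quantize(weights, block_size=32, bits=4):
    """Split a flat weight list into blocks and quantize each independently."""
    return [
        quantize_block(weights[i:i + block_size], bits)
        for i in range(0, len(weights), block_size)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    out = []
    for scale, q in blocks:
        out.extend(dequantize_block(scale, q))
    return out


if __name__ == "__main__":
    weights = [0.1 * i - 1.0 for i in range(64)]
    blocks = quantize(weights)
    restored = dequantize(blocks)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{len(blocks)} blocks, worst absolute error {worst:.4f}")
```

In a packed layout, a block of 32 four-bit values plus one 32-bit scale would occupy 16 + 4 = 20 bytes, compared with 128 bytes for the same block in FP32. This sketch keeps the integers as ordinary Python ints and does not perform that packing.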

RWKV-LM

84 11,747 8.8 Python

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

The author of RWKV shows that an RWKV model with X billion parameters is comparable to a GPT model of the same size: https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-eval2.png

whisper.cpp

187 31,649 9.8 C

Port of OpenAI's Whisper model in C/C++
llama.cpp

780 57,984 10.0 C++

LLM inference in C/C++
llm

41 5,954 9.4 Rust

An ecosystem of Rust libraries for working with large language models
alpaca.cpp

94 9,878 9.4 C

Discontinued. Locally run an Instruction-Tuned Chat-Style LLM
gpt4all.cpp

2 510 6.5 C

Locally run an Assistant-Tuned Chat-Style LLM

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Does openai whisper works on termux ?

2 projects | /r/termux | 26 May 2023
Show HN: Ermine.ai – Record and transcribe speech, 100% client-side (WASM)

5 projects | news.ycombinator.com | 4 Apr 2023
Library for running deep learning models in the browser

2 projects | /r/webdev | 17 Feb 2023
Show HN: I created automatic subtitling app to boost short videos

1 project | news.ycombinator.com | 9 Apr 2024
Prompt Engineering Guide

1 project | news.ycombinator.com | 30 Mar 2024

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning
openai Deep Learning speech-to-text language-model Transformer
Post date: 2 Apr 2023
