[P] RWKV 14B is a strong chatbot despite being trained only on the Pile (16G VRAM for 14B ctx4096 INT8, more optimizations incoming)

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • ChatRWKV

    ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

  • The latest ChatRWKV v2 has a new chat prompt (works for any topic), and here are some raw user chats with the RWKV-4-Pile-14B-20230228-ctx4096-test663 model (top-p 0.85, temperature 1.0, presence penalty 0.2, frequency penalty 0.5). You are welcome to try ChatRWKV v2: https://github.com/BlinkDL/ChatRWKV

  • RWKV-LM-LoRA

    RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • https://github.com/oobabooga/text-generation-webui is a front end that works with the RWKV models
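
The sampling settings quoted for the ChatRWKV chats (top-p 0.85, temperature 1.0, presence penalty 0.2, frequency penalty 0.5) can be illustrated with a small pure-Python sketch. It assumes the common formulation in which each already-generated token has its logit reduced by the presence penalty plus the frequency penalty times its count, followed by temperature scaling and nucleus (top-p) filtering. The function name `sample_token` and its arguments are hypothetical and are not ChatRWKV's API; the exact order of operations in ChatRWKV may differ.

```python
import math
import random
from collections import Counter


def sample_token(logits, history, top_p=0.85, temperature=1.0,
                 presence_penalty=0.2, frequency_penalty=0.5, rng=random):
    """Pick one token id from raw logits (hypothetical helper, illustration only)."""
    counts = Counter(history)

    # Lower the logit of every token that was already generated:
    # a flat presence term plus a term that grows with each repetition.
    adjusted = []
    for token_id, logit in enumerate(logits):
        seen = counts[token_id]
        if seen:
            logit -= presence_penalty + frequency_penalty * seen
        adjusted.append(logit)

    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = [x / temperature for x in adjusted]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus filtering: keep the most likely tokens until their
    # cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for token_id in order:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [0.0, 5.0, 1.0]      # made-up 3-token vocabulary
    print(sample_token(toy_logits, history=[]))
    print(sample_token(toy_logits, history=[1] * 20))
```

In the toy example, token 1 dominates when nothing has been generated yet, while a history that repeats token 1 many times pushes its logit low enough that other tokens take over. This is the effect the presence and frequency penalties are meant to have on repetitive chat output.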

Related posts

  • How the RWKV language model works

    1 project | news.ycombinator.com | 4 Jul 2023
  • [P] Raven 7B & 14B 🐦(RWKV finetuned on Alpaca+CodeAlpaca+Guanaco) and Gradio Demo for Raven 7B

    1 project | /r/MachineLearning | 28 Mar 2023
  • [D] Totally Open Alternatives to ChatGPT

    7 projects | /r/MachineLearning | 18 Mar 2023
  • [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM)

    3 projects | /r/MachineLearning | 16 Mar 2023
  • [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python

    5 projects | /r/MachineLearning | 5 Mar 2023