Top 23 text-generation Open-Source Projects

LocalAI

82 19,862 9.9 C++

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.

Project mention: Drop-In Replacement for ChatGPT API | news.ycombinator.com | 2024-01-24

MOSS

4 11,819 8.5 Python

An open-source tool-augmented conversational language model from Fudan University

Project mention: Has anyone tried fine tuning on a dataset of complex tasks that require tool use? | /r/LocalLLaMA | 2023-05-05

InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
GPT2-Chinese

2 7,358 2.8 Python

Chinese version of GPT2 training code, using BERT tokenizer.
textgenrnn

7 4,943 0.0 Python

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

Project mention: Modern alternative to textgenrnn? | /r/MLQuestions | 2023-06-09

Try this: 1) (Not sure if that's necessary.) Uninstall textgenrnn: pip3 uninstall textgenrnn. 2) Install it using one of this commands: * pip3 install git+git://github.com/minimaxir/textgenrnn.git * pip3 install git+https://github.com/minimaxir/textgenrnn.git (Try the first one, but if it'll raise an error, try the second one.) That's discussion about this "multi_gpu_model not found" error: https://github.com/minimaxir/textgenrnn/issues/222.

lollms-webui

7 3,801 9.9 Vue

Lord of Large Language Models Web User Interface

Project mention: Show HN: I made an app to use local AI as daily driver | news.ycombinator.com | 2024-02-27

gpt-2-simple

13 3,366 0.0 Python

Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

Project mention: Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations | news.ycombinator.com | 2023-09-09

Tap the contact's name in WhatsApp (I think it only works on a phone) and at the bottom of that screen there's Export Chat.
For finetuning GPT-2 I think I used this thing on Google Colab. (My friend ran it on his GPU, it should be doable on most modern-ish GPUs.)
https://github.com/minimaxir/gpt-2-simple
I tried doing something with this a few months ago though and it was a bit of a hassle to get running (needed to use a specific python version for some dependencies...), I forget the details sorry!

DialoGPT

7 2,315 0.0 Python

Large-scale pretraining for dialogue
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
RL4LMs

5 2,094 0.0 Python

A modular RL library to fine-tune language models to human preferences

Project mention: How To Setup a Model With Guardrails? | /r/LocalLLaMA | 2023-05-12

I think of guardrails as another dimension of human preferences: whether you are training a model to answer questions more gooder or avoid saying horrifying stuff, you are teaching the model a preference. So I thinks it's a straightforward RLHF problem but from a different perspective.

GODEL

5 835 3.4 Python

Large-scale pretrained models for goal-directed dialog

Project mention: Microsoft: Large-scale pretrained models for goal-directed dialog | news.ycombinator.com | 2023-06-05

Accelerated Text

2 789 0.0 JavaScript

Accelerated Text is a no-code natural language generation platform. It will help you construct document plans which define how your data is converted to textual descriptions varying in wording and structure.
SqueezeLLM

5 569 6.9 Python

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Project mention: Llama33B vs Falcon40B vs MPT30B | /r/LocalLLaMA | 2023-07-05

Using the currently popular gptq the 3bit quantization hurts performance much more than 4bit, but there's also awq (https://github.com/mit-han-lab/llm-awq) and squishllm (https://github.com/SqueezeAILab/SqueezeLLM) which are able to manage 3bit without as much performance drop - I hope to see them used more commonly.

Magick

1 604 10.0 TypeScript

Magick is a cutting-edge toolkit for a new kind of AI builder. Make Magick with us! (by Oneirocom)

Project mention: Dify, a visual workflow to build/test LLM applications | news.ycombinator.com | 2024-04-22

Cornucopia-LLaMA-Fin-Chinese

19 536 4.4 Python

聚宝盆(Cornucopia): 中文金融系列开源可商用大模型，并提供一套高效轻量化的垂直领域LLM训练框架(Pretraining、SFT、RLHF、Quantize等)

Project mention: Cornucopia-LLaMA-Fin-Chinese: NEW Textual - star count:263.0 | /r/algoprojects | 2023-07-31

commit-autosuggestions

1 383 0.0 Python

A tool that AI automatically recommends commit messages.
gpt-2-cloud-run

1 313 0.0 HTML

Text-generation API via GPT-2 for Cloud Run
minimal-text-diffusion

2 261 4.9 Python

A minimal implementation of diffusion models for text generation
modular-diffusion

1 255 8.0 Python

Python library for designing and training your own Diffusion Models with PyTorch.

Project mention: I Built a Modular Python Library for Designing and Training Diffusion Models from Scratch | /r/SideProject | 2023-09-06

Last week, I released a project I've been working on for months: Modular Diffusion. It's a modular Python library for designing and training your own Diffusion Models in just a few lines of code. I also wrote a documentation page. The project has already gotten some great community feedback and I'm hoping you guys like it too!

MAGIC

2 245 0.0 Python

Language Models Can See: Plugging Visual Controls in Text Generation (by yxuansu)
GoLLIE

1 208 9.6 Python

Guideline following Large Language Model for Information Extraction

Project mention: A LLM trained to follow annotation guidelines, for information extraction tasks | news.ycombinator.com | 2023-10-30

LongForm

1 197 4.2

Reverse Instructions to generate instruction tuning data with corpus examples (by akoksal)
KVQuant

1 190 5.9 Python

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Project mention: 10M Tokens LLM Context | news.ycombinator.com | 2024-02-02

rant

2 183 0.0 Rust

Rant - The templating language for procedural generation.
genius

2 175 10.0 Python

💡GENIUS – generating text using sketches! A strong text generation & data augmentation tool.
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

text-generation related posts

Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations

4 projects | news.ycombinator.com | 9 Sep 2023
Modern alternative to textgenrnn?

1 project | /r/MLQuestions | 9 Jun 2023
Is there any nano-gpt/pico-gpt like implementation available for stable-diffusion models?

1 project | /r/deeplearning | 23 Apr 2023
indistinguishable

4 projects | /r/CuratedTumblr | 20 Mar 2023
Just a thought

1 project | /r/replika | 8 Feb 2023
training gpt on your own sources - how does it work? gpt2 v gpt3? and how much does it cost?

2 projects | /r/OpenAI | 31 Jan 2023
Gen Z says that school is not shipping them with the skills necessary to survive in a digital world

5 projects | /r/technology | 29 Jan 2023
A note from our sponsor - InfluxDB
www.influxdata.com | 3 May 2024

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality. Learn more →

Index

What are some of the best open-source text-generation projects? This list will help you:

	Project	Stars
1	LocalAI	19,862
2	MOSS	11,819
3	GPT2-Chinese	7,358
4	textgenrnn	4,943
5	lollms-webui	3,801
6	gpt-2-simple	3,366
7	DialoGPT	2,315
8	RL4LMs	2,094
9	GODEL	835
10	Accelerated Text	789
11	SqueezeLLM	569
12	Magick	604
13	Cornucopia-LLaMA-Fin-Chinese	536
14	commit-autosuggestions	383
15	gpt-2-cloud-run	313
16	minimal-text-diffusion	261
17	modular-diffusion	255
18	MAGIC	245
19	GoLLIE	208
20	LongForm	197
21	KVQuant	190
22	rant	183
23	genius	175

text-generation

Top 23 text-generation Open-Source Projects

text-generation related posts

Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations

Modern alternative to textgenrnn?

Is there any nano-gpt/pico-gpt like implementation available for stable-diffusion models?

indistinguishable

Just a thought

training gpt on your own sources - how does it work? gpt2 v gpt3? and how much does it cost?

Gen Z says that school is not shipping them with the skills necessary to survive in a digital world

Index