Top 23 Python text-generation Projects
- textgenrnn: Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
- Cornucopia-LLaMA-Fin-Chinese: Cornucopia (聚宝盆), an open-source, commercially usable family of Chinese financial large language models, plus an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, and more).
Try this:
1. (Possibly unnecessary.) Uninstall textgenrnn: pip3 uninstall textgenrnn.
2. Reinstall it straight from GitHub with one of these commands:
   * pip3 install git+git://github.com/minimaxir/textgenrnn.git
   * pip3 install git+https://github.com/minimaxir/textgenrnn.git
   (Try the first one; if it raises an error, try the second.)
The discussion of the "multi_gpu_model not found" error is here: https://github.com/minimaxir/textgenrnn/issues/222.
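Once the install succeeds, a minimal training run looks roughly like this (a sketch following the textgenrnn README; the corpus file name is just a placeholder):

```python
from textgenrnn import textgenrnn

# Create a model initialized from the bundled pretrained weights.
textgen = textgenrnn()

# Fine-tune on a plain-text file, one training example per line.
# 'my_corpus.txt' is a placeholder for your own dataset.
textgen.train_from_file('my_corpus.txt', num_epochs=5)

# Sample a few generated lines; lower temperature gives more conservative output.
textgen.generate(5, temperature=0.5)
```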
Project mention: Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations | news.ycombinator.com | 2023-09-09

Tap the contact's name in WhatsApp (I think it only works on a phone) and at the bottom of that screen there's Export Chat.
For fine-tuning GPT-2 I think I used this on Google Colab. (My friend ran it on his GPU; it should be doable on most modern-ish GPUs.)
https://github.com/minimaxir/gpt-2-simple
I tried doing something with this a few months ago, though, and it was a bit of a hassle to get running (it needed a specific Python version for some dependencies...); I forget the details, sorry!
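For reference, the basic fine-tuning flow with gpt-2-simple looks roughly like this (a sketch based on its README; the dataset path and step count are placeholders):

```python
import gpt_2_simple as gpt2

# Download the smallest pretrained GPT-2 checkpoint (124M parameters).
gpt2.download_gpt2(model_name="124M")

# Start a TensorFlow session for training.
sess = gpt2.start_tf_sess()

# Fine-tune on a plain-text file; 'chat_export.txt' is a placeholder dataset.
gpt2.finetune(sess,
              "chat_export.txt",
              model_name="124M",
              steps=1000)

# Generate samples from the fine-tuned model.
gpt2.generate(sess)
```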
I think of guardrails as another dimension of human preferences: whether you are training a model to answer questions better or to avoid saying horrifying stuff, you are teaching the model a preference. So I think it's a straightforward RLHF problem, just seen from a different perspective.
Project mention: Microsoft: Large-scale pretrained models for goal-directed dialog | news.ycombinator.com | 2023-06-05
With the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there are also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which manage 3-bit without as much of a performance drop. I hope to see them used more commonly.
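To see why each bit matters so much at these widths, here is a toy round-to-nearest quantizer (plain uniform quantization, not GPTQ/AWQ/SqueezeLLM) comparing reconstruction error at 4 and 3 bits on random weights:

```python
import numpy as np

def quantize_dequantize(w, bits):
    """Uniform round-to-nearest quantization of a weight tensor."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / levels
    q = np.round((w - lo) / scale)   # integer codes in [0, levels]
    return q * scale + lo            # dequantized weights

rng = np.random.default_rng(0)
w = rng.normal(size=100_000).astype(np.float32)  # stand-in for a weight matrix

for bits in (4, 3):
    err = np.abs(w - quantize_dequantize(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
# Dropping from 4 to 3 bits halves the number of levels and roughly
# doubles the quantization error, which is why smarter schemes are needed.
```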
Project mention: Cornucopia-LLaMA-Fin-Chinese: NEW Textual - star count: 263 | /r/algoprojects | 2023-07-31
Project mention: I Built a Modular Python Library for Designing and Training Diffusion Models from Scratch | /r/SideProject | 2023-09-06

Last week, I released a project I've been working on for months: Modular Diffusion. It's a modular Python library for designing and training your own Diffusion Models in just a few lines of code. I also wrote a documentation page. The project has already gotten some great community feedback, and I'm hoping you guys like it too!
Project mention: A LLM trained to follow annotation guidelines, for information extraction tasks | news.ycombinator.com | 2023-10-30
Leaderboard: https://github.com/allenai/CommonGen-Eval?tab=readme-ov-file...
Project mention: A Defacto Guide on Building Generative AI Apps with the Google PaLM API | /r/learnmachinelearning | 2023-09-12

⚠️ Alternating Message Authors: the API strictly expects alternating authors for chat-based messages. In llmx, I implement a simple check for consecutive messages and merge them with a newline character.
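A minimal version of that merge step might look like this (an illustrative sketch, not llmx's actual code; the message dictionary format is assumed):

```python
def merge_consecutive_authors(messages):
    """Merge consecutive messages from the same author into one,
    joining their contents with a newline so authors strictly alternate."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["author"] == msg["author"]:
            merged[-1]["content"] += "\n" + msg["content"]
        else:
            merged.append({"author": msg["author"], "content": msg["content"]})
    return merged

chat = [
    {"author": "user", "content": "Hi!"},
    {"author": "user", "content": "Can you summarize this article?"},
    {"author": "model", "content": "Sure, paste it here."},
]
print(merge_consecutive_authors(chat))
# [{'author': 'user', 'content': 'Hi!\nCan you summarize this article?'},
#  {'author': 'model', 'content': 'Sure, paste it here.'}]
```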
Python text-generation related posts
- Show HN: WhatsApp-Llama: A clone of yourself from your WhatsApp conversations
- Modern alternative to textgenrnn?
- Is there any nano-gpt/pico-gpt like implementation available for stable-diffusion models?
- indistinguishable
- Just a thought
- training gpt on your own sources - how does it work? gpt2 v gpt3? and how much does it cost?
- I (re)trained an AI using the 36 lessons of Vivec, the entirety of C0DA, the communist manifesto and the top posts of /r/copypasta and asked it the most important/unanswered lore questions. What are the lore implications of these insights?
-
Index
What are some of the best open-source text-generation projects in Python? This list will help you:
# | Project | Stars
---|---|---
1 | MOSS | 11,825 |
2 | GPT2-Chinese | 7,360 |
3 | textgenrnn | 4,943 |
4 | gpt-2-simple | 3,366 |
5 | DialoGPT | 2,315 |
6 | RL4LMs | 2,094 |
7 | GODEL | 835 |
8 | SqueezeLLM | 571 |
9 | Cornucopia-LLaMA-Fin-Chinese | 536 |
10 | commit-autosuggestions | 383 |
11 | minimal-text-diffusion | 263 |
12 | modular-diffusion | 256 |
13 | MAGIC | 245 |
14 | GoLLIE | 214 |
15 | KVQuant | 194 |
16 | genius | 175 |
17 | mutate | 149 |
18 | ctrl-sum | 145 |
19 | pistoBot | 139 |
20 | ctc-gen-eval | 93 |
21 | CommonGen-Eval | 79 |
22 | llmx | 68 |
23 | namekrea | 49 |