finetune-gpt2xl vs aitg
| | finetune-gpt2xl | aitg |
|---|---|---|
| Mentions | 9 | 1 |
| Stars | 421 | 4 |
| Growth | - | - |
| Activity | 0.0 | 0.0 |
| Latest commit | 11 months ago | over 1 year ago |
| Language | Python | Python |
| License | MIT License | - |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
finetune-gpt2xl
- Fine-tuning?
git clone the finetuning repo (https://github.com/Xirider/finetune-gpt2xl), go into the repo, and install the rest of the requirements: pip install -r requirements.txt
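After the install, a quick sanity check that the environment is ready can save a failed training run later. A minimal sketch, assuming the repo's requirements pull in transformers and deepspeed (which the guide relies on):

```python
# Quick environment check after `pip install -r requirements.txt`.
# Assumes transformers and deepspeed are in the repo's requirements;
# versions are not pinned here.
import torch
import transformers
import deepspeed

print("transformers", transformers.__version__)
print("deepspeed", deepspeed.__version__)
print("CUDA available:", torch.cuda.is_available())
```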
- Training text-generating models locally
- Dataset For GPT Fine-Tuning
I would like to understand a little better how to organize texts for fine-tuning, especially for GPT-Neo. I plan to use the procedure from this repo, which includes the following notice…
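On organizing texts: one common convention for GPT-style causal LM fine-tuning is to join separate documents with the model's end-of-text token so the model learns document boundaries. A minimal sketch of that convention; the file name is illustrative and the repo's own preprocessing step may differ:

```python
# Sketch: pack separate documents into one training file, separated by
# the end-of-text token used by GPT-2 and GPT-Neo. "train.txt" is an
# illustrative name, not necessarily the repo's expected input format.
documents = [
    "First article text ...",
    "Second article text ...",
]

EOT = "<|endoftext|>"

with open("train.txt", "w", encoding="utf-8") as f:
    f.write(EOT.join(documents))
```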
- How to share the finetuned model
The code suggested in the video (and in the repo) uses the --fp16 flag. But the "DeepSpeed Integration" article says that…
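The point of friction here is that fp16 is set in two places, the Trainer arguments and the DeepSpeed config, and the two must agree. A minimal sketch of that pairing; the config values are illustrative, not the repo's exact ds_config.json:

```python
from transformers import TrainingArguments

# Illustrative DeepSpeed config with fp16 enabled. "auto" lets
# DeepSpeed inherit the value from the Trainer arguments, which
# avoids the two settings drifting apart.
ds_config = {
    "fp16": {"enabled": "auto"},
    "zero_optimization": {"stage": 2},
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

args = TrainingArguments(
    output_dir="finetuned-gpt2xl",  # assumed output directory
    fp16=True,                      # corresponds to the --fp16 flag
    deepspeed=ds_config,            # accepts a dict or a JSON file path
)
```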
- [D] I made a script that does all the work to deploy GPT-NEO on Windows 10. (Please Test)
- [Project] Estimating fine-tuning cost
Finetuning GPT-Neo 2.7B on WikiText (180 MB) took me about 45 minutes on one preemptible V100 instance on Google Cloud. It cost $1.30 per hour, so around $1 in total. Here are the steps: https://github.com/Xirider/finetune-gpt2xl
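The cost figure follows directly from the runtime and the hourly rate; a quick check of the arithmetic:

```python
# Back-of-the-envelope cost check for the numbers quoted above.
hours = 45 / 60        # ~45 minutes of training
rate_per_hour = 1.30   # preemptible V100 price quoted above, in USD
print(f"~${hours * rate_per_hour:.2f}")  # roughly $0.98, i.e. around $1
```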
- [P] Guide: Finetune GPT2-XL (1.5 Billion Parameters, the biggest model) on a single 16 GB VRAM V100 Google Cloud instance with Huggingface Transformers using DeepSpeed
Here I explain the setup and the commands to get it running: https://github.com/Xirider/finetune-gpt2xl
- Guide: Finetune GPT2-XL (1.5 Billion Parameters, the biggest model) on a single 16 GB VRAM V100 Google Cloud instance with Huggingface Transformers using DeepSpeed
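Once a checkpoint comes out of a run like the one in the guide, using it for generation is plain Transformers. A minimal sketch; "finetuned-gpt2xl" is an assumed output directory, not a name fixed by the repo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "finetuned-gpt2xl" is an assumed checkpoint directory produced by
# a fine-tuning run; substitute your own output path.
tokenizer = AutoTokenizer.from_pretrained("finetuned-gpt2xl")
model = AutoModelForCausalLM.from_pretrained("finetuned-gpt2xl")

inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50,
                         do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```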
aitg
What are some alternatives?
detoxify - Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transformers. For access to our API, please email us at [email protected].
nvc-gpt3-chat - This is the code I used to create a small private SMS text chat system that employs GPT3 from OpenAI and "nonviolent communication", an algorithmically based method of conflict resolution. Hopefully the chat helps users process conflict.
Extracting-Training-Data-from-Large-Langauge-Models - A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020
ALAE - [CVPR2020] Adversarial Latent Autoencoders
bertviz - BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
jukebox - Code for the paper "Jukebox: A Generative Model for Music"
kogpt - KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)