cramming
Cramming the training of a (BERT-type) language model into limited compute. (by JonasGeiping)
flan-ul2-alpaca
Finetuning the commercially viable, open source Flan-UL2 model with Alpaca, Dolly15K, and LoRA. (by ConiferLabsWA)
| | cramming | flan-ul2-alpaca |
|---|---|---|
| Mentions | 6 | 4 |
| Stars | 1,238 | 32 |
| Growth | - | - |
| Activity | 7.3 | 5.1 |
| Last commit | 16 days ago | about 1 year ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Mentions indicates the total number of mentions we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
cramming
Posts with mentions or reviews of cramming. We have used some of these posts to build our list of alternatives and similar projects. The most recent was on 2023-06-03.
- [P] Notes on training BERT from scratch on an 8GB consumer GPU
- Cramming the training of a (BERT-type) language model into limited compute
- NanoGPT
- New AI Research from the University of Maryland Investigates Cramming Challenge for Training a Language Model on a Single GPU in One Day
  Quick Read: https://www.marktechpost.com/2023/01/03/new-ai-research-from-the-university-of-maryland-investigates-cramming-challenge-for-training-a-language-model-on-a-single-gpu-in-one-day/ Paper: https://arxiv.org/pdf/2212.14034.pdf Github: https://github.com/JonasGeiping/cramming
- Lucas Beyer on Twitter: “How Good of a Bert Can One Get in One Day on One GPU?”
- Cramming: Training a Language Model on a Single GPU in One Day - Jonas Geiping and Tom Goldstein, University of Maryland, 2022
  Github: https://github.com/JonasGeiping/cramming
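The cramming recipe is defined by a fixed wall-clock budget rather than a fixed number of optimizer steps: pretraining simply stops when the one-GPU, one-day budget runs out. Below is a minimal, illustrative sketch of that pattern using Hugging Face Transformers; the model size, dataset, batch size, and 24-hour cutoff are assumptions for the example, not the repository's actual launch script or configuration.

```python
# Illustrative sketch of budget-limited masked-LM pretraining, in the
# spirit of the cramming setup. Everything here (model size, dataset,
# hyperparameters) is an assumption, not the repo's actual config.
import time

import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import (AutoTokenizer, BertConfig, BertForMaskedLM,
                          DataCollatorForLanguageModeling)

BUDGET_SECONDS = 24 * 3600  # the "cramming" constraint: one day, one GPU

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM(BertConfig())  # random init: pretrain from scratch
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

dataset = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
loader = DataLoader(dataset, batch_size=64, shuffle=True, collate_fn=collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
start = time.time()
model.train()
for batch in loader:
    if time.time() - start > BUDGET_SECONDS:  # stop on the clock, not on steps
        break
    batch = {k: v.to(device) for k, v in batch.items()}
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The paper's contribution is everything that happens inside this loop: architecture and training-pipeline changes that squeeze the most downstream quality out of the fixed budget. The clock-based stopping condition is simply the framing that makes those choices comparable.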
flan-ul2-alpaca
Posts with mentions or reviews of flan-ul2-alpaca. We have used some of these posts to build our list of alternatives and similar projects. The most recent was on 2023-06-03.
- [P] Notes on training BERT from scratch on an 8GB consumer GPU
- [R][D] Anaconda env config for LLM research
- Finetuning a commercially viable open source LLM (Flan-UL2) using Alpaca, Dolly15K and LoRA - Flan-UL2-Alpaca (Github); a minimal LoRA sketch follows this list
- [P] Self-Hosted AI Chatbot Alternative: FOSS LLM with ChatGPT-Like Features (Selecting/Training)
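The Flan-UL2 + LoRA recipe above comes down to wrapping the base seq2seq model in low-rank adapters so that only a small fraction of the parameters are trained. Here is a minimal sketch using Hugging Face PEFT; the rank, alpha, and target-module names are illustrative assumptions, not the flan-ul2-alpaca repository's exact settings.

```python
# Illustrative sketch: LoRA finetuning setup for Flan-UL2 with PEFT.
# Hyperparameters below (r, alpha, dropout, target modules) are
# assumptions, not the flan-ul2-alpaca repo's exact configuration.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
base_model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-ul2",
    load_in_8bit=True,  # ~20B params; 8-bit quantization to fit one large GPU
    device_map="auto",
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=32,              # scaling factor for the adapter output
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5/UL2 attention projection names
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

From here the adapted model plugs into a standard Seq2SeqTrainer loop over the Alpaca and Dolly15K instruction pairs; because only the adapter weights receive gradients, a 20B-parameter base model becomes tunable on far more modest hardware than full finetuning would require.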
What are some alternatives?
When comparing cramming and flan-ul2-alpaca you can also consider the following projects:
nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPTs.
askai - Command Line Interface for OpenAI ChatGPT
english-lang - The English Programming Language
aitextgen - A robust Python tool for text-based AI training and generation using GPT-2.
askai - Your simple terminal helper - A CLI integration with OpenAI's GPT3
extreme-bert - ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper “ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT”.
whisper.cpp - Port of OpenAI's Whisper model in C/C++