mesh-transformer-jax
Model parallel transformers in JAX and Haiku (by AeroScripts)
the-pile
By EleutherAI
| | mesh-transformer-jax | the-pile |
|---|---|---|
| Mentions | 1 | 15 |
| Stars | 0 | 1,403 |
| Growth | - | 1.6% |
| Activity | 0.0 | 0.0 |
| Last commit | almost 3 years ago | about 1 year ago |
| Language | Jupyter Notebook | Python |
| License | Apache License 2.0 | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mesh-transformer-jax
Posts with mentions or reviews of mesh-transformer-jax.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-06-08.
-
GPT-J-6B, a 6B Parameter Text Generation Model
I'm running it quite comfortably on my 3090, although it's a really snug fit for the VRAM, and that's with a number of fixes to significantly reduce its memory use from https://github.com/AeroScripts/mesh-transformer-jax .
the-pile
Posts with mentions or reviews of the-pile.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-03-07.
-
The Pile
[2] https://github.com/EleutherAI/the-pile/issues/56
-
The Pile: a dataset for language modeling [pdf]
I came so close to getting my dataset DebateSum (https://huggingface.co/datasets/Hellisotherpeople/DebateSum) into the pile, but they decided at the last minute not to add it: https://github.com/EleutherAI/the-pile/issues/56
I'm still a tiny bit salty about that.
-
Sarah Silverman is suing OpenAI and Meta for copyright infringement
Anyone want to check if the book in question is in ThePile dataset?:
https://github.com/EleutherAI/the-pile/blob/master/the_pile/...
-
What Types Of Websites Are Typically Scraped To Train LLMs?
All of it, it’s quite diverse. Especially the commoncrawl bit, https://github.com/EleutherAI/the-pile.
-
Can anyone answer some questions on how GPT-NeoX-20B was developed, and future models?
For example, before this I didn't realize one of the sources of data that the pile uses is a massive number of emails gathered during the Enron lawsuits. Weird, but cool I guess.
-
How do I add AI modules?
NovelAI's Krake and Euterpe, and the rest, are finetuned versions of existing models. The original models were trained on a mass of text. Krake is a finetune of Neo-X 20b, which was trained on The Pile. NovelAI's finetunes involve further training but on various works of fiction rather than more text trawled from the internet. The statistical rules in the existing models are thus shifted in a (slightly) new direction. Modules refine those statistical rules, or weights, just a little bit more.
- GitHub - EleutherAI/the-pile
-
Sounds about right /s
Literally The Pile.
-
What is the difference between OpenAI and the gpt3 algorithm?
The parameters are taken from large datasets like The Pile.
-
Official Beta AMA @ June 14th, 12pm EST
We use GPT-Neo as our base model, which was trained on The Pile, and you can see its contents in their GitHub repo: https://github.com/EleutherAI/the-pile
What are some alternatives?
When comparing mesh-transformer-jax and the-pile you can also consider the following projects:
jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
mesh-transformer-jax - Model parallel transformers in JAX and Haiku
datasets - 🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
opendyslexic - OpenDyslexic, a typeface that uses typeface shapes & features to help offset some visual symptoms of Dyslexia. Now in SIL-OFL.
DALLE-mtf - Open-AI's DALL-E for large scale training in mesh-tensorflow.