cedille-ai vs mesh-transformer-jax

| | cedille-ai | mesh-transformer-jax |
|---|---|---|
| Mentions | 9 | 52 |
| Stars | 201 | 6,213 |
| Growth | 0.0% | - |
| Activity | 0.0 | 0.0 |
| Latest commit | about 2 years ago | over 1 year ago |
| Language | Python | |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
cedille-ai
- Happy 2nd birthday to GPT-3!
GPT-3’s release has inspired a gold rush, with over 30 new large language models trained since May 2020, especially in North America and China, but also in places like Israel, Germany, Switzerland, and Abu Dhabi.
- Publish your Christmas tale with Cedille!
- Cedille: The largest French language model (r/MachineLearning)
- [P] Cedille: The largest French language model
- Cedille, the largest French language model, open source with a freely accessible playground
- Cedille, the largest French language model, released in open source
The repo on GitHub: https://github.com/coteries/cedille-ai
- Show HN: Cedille, the largest French language model, released in open source
We are excited to announce Cedille, the largest language model for French (6b parameters).
Demo: https://cedille.ai
Language models are general-purpose AI systems that can solve a range of tasks simply by being prompted. They can be used, for example, to summarize text, do translations, generate ideas, or overcome writer's block; a short prompting sketch follows this post.
You may know GPT-3, the humongous model from OpenAI. Cedille is a similar model targeting the French demographic - but smaller, as we don’t yet have $1b in the bank like they do. Although GPT-3 supports multiple languages including French, our model is competitive with GPT-3 on a range of French tasks! Plus, of course we’re open source while they keep their model closed and heavily restrict access to it.
You can try it out right away from our playground: https://app.cedille.ai
We are proponents of “open AI” and as such have released a checkpoint for the world to use (MIT license): https://github.com/coteries/cedille-ai
One of the problems with large language models is the potentially toxic, sexist or in other ways unpleasant output. We tried our best to avoid this issue by doing extensive dataset filtering. As a result, our benchmark indicates that Cedille is indeed less toxic than GPT-3.
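A minimal prompting sketch, assuming the released checkpoint is the one published on the Hugging Face Hub as Cedille/fr-boris and that the transformers library is installed; the prompt text and sampling settings here are illustrative, not the project's documented usage:

```python
# Minimal sketch: prompting a Cedille checkpoint for French text generation.
# Assumes the Hugging Face Hub id "Cedille/fr-boris"; use the raw checkpoint
# from the GitHub repo instead if you work with the JAX weights directly.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Cedille/fr-boris")
model = AutoModelForCausalLM.from_pretrained("Cedille/fr-boris")

# The model simply continues the prompt, so tasks are phrased as text.
prompt = "Il était une fois"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```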
- [P] Cedille, the largest French language model (6b), released in open source
We are proponents of “open AI” and as such have released a checkpoint for the world to use (MIT license): https://github.com/coteries/cedille-ai
mesh-transformer-jax
- Large Language Models: Comparing Gen2/Gen3 Models (GPT-3, GPT-J, MT5 and More)
GPT-J is an LLM case study with two goals: training an LLM on a data source containing unique material, and using the training framework Mesh Transformer JAX to achieve high training efficiency through parallelization. There is no research paper about GPT-J, but its GitHub pages provide the model, various checkpoints, and the complete source code for training.
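mesh-transformer-jax shards the model itself across TPU cores via JAX's xmap/pjit machinery; as a much smaller, hedged illustration of the general idea of parallelizing work across devices, here is a toy data-parallel sketch using jax.pmap, with a layer and shapes invented for the example:

```python
# Toy illustration of JAX device parallelism (data parallelism via pmap).
# mesh-transformer-jax uses model-parallel sharding (xmap/pjit) instead;
# this sketch only shows the general "split work across devices" idea.
import jax
import jax.numpy as jnp

def forward(w, x):
    # Hypothetical single layer: a matrix multiply.
    return jnp.dot(x, w)

# Replicate the weights, split the batch across all local devices.
parallel_forward = jax.pmap(forward, in_axes=(None, 0))

n_dev = jax.local_device_count()
w = jnp.ones((8, 8))         # shared parameters
x = jnp.ones((n_dev, 4, 8))  # one batch shard per device
y = parallel_forward(w, x)   # shape: (n_dev, 4, 8)
print(y.shape)
```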
- [R] Parallel Attention and Feed-Forward Net Design for Pre-training and Inference on Transformers
This idea has already been proposed in ViT-22B and GPT-J-6B.
- Show HN: Finetune LLaMA-7B on commodity GPUs using your own text
- [D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset
Sure. Here's the repo I used for the fine-tuning: https://github.com/kingoflolz/mesh-transformer-jax. I used 5 epochs, and apart from that I kept the default parameters in the repo.
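For context, here is a minimal sketch of the Alpaca-style prompt formatting such an instruct fine-tune typically feeds the trainer; the two templates follow the public Stanford Alpaca release, while the file name alpaca_data.json and the helper to_training_text are illustrative, not part of either repo:

```python
# Minimal sketch: turning Alpaca instruction/input/output records into
# plain training text for a causal LM fine-tune.
import json

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def to_training_text(record):
    # Pick the template based on whether the record has a non-empty input.
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    return template.format(**record) + record["output"]

with open("alpaca_data.json") as f:  # illustrative local copy of the dataset
    examples = [to_training_text(r) for r in json.load(f)]
```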
- Boss wants me to use ChatGPT for work, but I refuse to input my personal phone number. Any advice?
- Let's build GPT: from scratch, in code, spelled out by Andrej Karpathy
You can skip to step 4 using something like GPT-J as far as I understand: https://github.com/kingoflolz/mesh-transformer-jax#links
The pretrained model is already available.
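A minimal sketch of that shortcut, assuming the Hugging Face transformers port of GPT-J (hub id EleutherAI/gpt-j-6B, linked from the repo's README) rather than the raw JAX checkpoint:

```python
# Minimal sketch: loading pretrained GPT-J-6B via Hugging Face transformers
# instead of training from scratch. Hub id assumed: "EleutherAI/gpt-j-6B".
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```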
- Best coding model?
The GitHub repo suggests you may be able to change the number of checkpoints to make it run on a GPU.
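Apart from the repo's own GPU path, a hedged alternative is the transformers port loaded in half precision on a single GPU; device_map="auto" needs the accelerate package installed, and the hub id is assumed to be EleutherAI/gpt-j-6B:

```python
# Sketch: running GPT-J-6B on a GPU via the transformers port, in half
# precision (~12 GB of weights vs. ~24 GB in float32).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,  # halves memory vs. float32
    device_map="auto",          # places weights on the available GPU(s)
)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```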
- Ask HN: What language models can I fine-tune at home?
- Self-hosted / open-source ChatGPT alternative?
GPT-J, which uses mesh-transformer-jax: https://github.com/kingoflolz/mesh-transformer-jax
- GPT-J, an open-source alternative to GPT-3
They hinted at it in the screenshot, but the goods are linked from the https://6b.eleuther.ai page: https://github.com/kingoflolz/mesh-transformer-jax#gpt-j-6b (Apache 2)
What are some alternatives?
allennlp - An open-source NLP research library, built on PyTorch.
DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
awesome-huggingface - 🤗 A list of wonderful open-source projects & applications integrated with Hugging Face libraries.
tensorflow - An Open Source Machine Learning Framework for Everyone
lm-evaluation-harness - A framework for few-shot evaluation of language models.
gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
detoxify - Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transformers. For access to our API, please email us at [email protected].
jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Awesome-pytorch-list - A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
KoboldAI-Client
alpaca-lora - Instruct-tune LLaMA on consumer hardware
Finetune_LLMs - Repo for fine-tuning causal LLMs