DeepSpeed vs mesh-transformer-jax
| | DeepSpeed | mesh-transformer-jax |
|---|---|---|
| Mentions | 41 | 51 |
| Stars | 25,088 | 5,900 |
| Growth | 61.0% | - |
| Activity | 9.6 | 0.0 |
| Latest commit | 2 days ago | 4 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DeepSpeed
- Using --deepspeed requires lots of manual tweaking
Filed a discussion item on the deepspeed project: https://github.com/microsoft/DeepSpeed/discussions/3531
Solution: I don't know; this is where I am stuck. https://github.com/microsoft/DeepSpeed/issues/1037 suggests that I just need to 'apt install libaio-dev', but I've done that and it doesn't help.
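For context, a minimal DeepSpeed training setup looks roughly like the sketch below (placeholder model, optimizer and config values, not the setup from the linked issue); the libaio dependency only comes into play once ZeRO stage 3 is configured to offload parameters or optimizer state to NVMe.

```python
# Minimal sketch of wiring a PyTorch model into DeepSpeed (placeholder model
# and config values). Launch with the `deepspeed` launcher, e.g.
# `deepspeed train_sketch.py`, which sets up the distributed environment.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real model

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    # ZeRO stage 2 partitions optimizer state and gradients; stage 3 with
    # NVMe offload is the configuration that pulls in libaio/async I/O.
    "zero_optimization": {"stage": 2},
}

# deepspeed.initialize wraps the model and optimizer into an engine that
# handles fp16 loss scaling, ZeRO partitioning and data parallelism.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for step in range(10):
    x = torch.randn(8, 1024, device=model_engine.device, dtype=torch.half)
    loss = model_engine(x).float().pow(2).mean()
    model_engine.backward(loss)  # engine owns gradient scaling/partitioning
    model_engine.step()
```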
- Whether ML computation engineering expertise will be valuable is the question.
There could be some spectrum of this expertise, for instance https://github.com/NVIDIA/FasterTransformer and https://github.com/microsoft/DeepSpeed.
- FLiPN-FLaNK Stack Weekly for 17 April 2023
- DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-Like Models
- DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-Like Models
- 12-Apr-2023 AI Summary
DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales (https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat)
- Microsoft DeepSpeed
- Apple: Transformer architecture optimized for Apple Silicon
I'm following this closely, together with other efforts like GPTQ Quantization and Microsoft's DeepSpeed, all of which are bringing down the hardware requirements of these advanced AI models.
- Facebook LLAMA is being openly distributed via torrents
- https://github.com/microsoft/DeepSpeed
Anything that could bring this to a 10GB 3080 or 24GB 3090 without 60s/it per token?
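Not a full answer to the 10 GB question, but for reference, the usual DeepSpeed-Inference starting point is fp16 weights plus kernel injection; the checkpoint below is a placeholder, and whether a given model fits on a 3080 or 3090 still depends on its size and the KV cache.

```python
# Sketch of DeepSpeed-Inference with fp16 weights and fused kernels
# (placeholder checkpoint; larger models need offload or quantization on top).
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # stand-in; swap in the model you actually want
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# mp_size > 1 would shard the model across several GPUs instead of one card.
engine = deepspeed.init_inference(
    model,
    mp_size=1,
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

inputs = tokenizer("DeepSpeed inference test:", return_tensors="pt").to("cuda")
output = engine.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```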
mesh-transformer-jax
- Show HN: Finetune LLaMA-7B on commodity GPUs using your own text
- [D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset
Sure. Here's the repo I used for the fine-tuning: https://github.com/kingoflolz/mesh-transformer-jax. I used 5 epochs, and apart from that I kept the default parameters in the repo.
- Let's build GPT: from scratch, in code, spelled out by Andrej Karpathy
You can skip to step 4 using something like GPT-J as far as I understand: https://github.com/kingoflolz/mesh-transformer-jax#links
The pretrained model is already available.
- Ask HN: Self-hosted/open-source ChatGPT alternative? Like Stable Diffusion
I know nothing, but have heard Hugging Face is in that direction.
https://github.com/huggingface/transformers
>Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.
> These models can be applied on:
> - Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages.
> - Images, for tasks like image classification, object detection, and segmentation.
> - Audio, for tasks like speech recognition and audio classification.
---
Also read about GPT-J, whose capability is comparable with GPT-3.
https://github.com/kingoflolz/mesh-transformer-jax
But I believe it requires buying or renting GPUs.
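As a rough sketch of the Hugging Face route mentioned above: the pipeline API pulls a pretrained checkpoint and runs generation locally. GPT-J-6B needs on the order of 12 GB in fp16 (roughly twice that in fp32), so a smaller model such as "gpt2" is an easy way to try the same API first.

```python
from transformers import pipeline

# GPT-J-6B is heavy; model="gpt2" is a lightweight stand-in for the same API.
generator = pipeline("text-generation", model="EleutherAI/gpt-j-6B")

print(generator("Self-hosted ChatGPT alternatives include",
                max_new_tokens=40)[0]["generated_text"])
```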
- [D]: Are there any alternatives to Huggingface in the use of GPT-Neo?
Well, many models hosted on Hugging Face were actually developed without HF Transformers first (and then were ported to HF Transformers by the community). It is the case with GPT-J. Here is the original GPT-J implementation: https://github.com/kingoflolz/mesh-transformer-jax
- dalle update
GPT-J with 6B parameters barely scrapes by on a 16GB GPU (using KoboldAI - not sure what impact different scripts and such might have).
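One common way to make GPT-J-6B fit on a 16 GB card is to load the published fp16 branch of the checkpoint, so the weights take roughly 12 GB instead of ~24 GB; actual headroom still depends on sequence length and the KV cache. A sketch using the Hugging Face port:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    revision="float16",       # fp16 weights published alongside the fp32 ones
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,   # avoid materialising a full fp32 copy in RAM first
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
```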
- Yandex opensources 100B parameter GPT-like model
It doesn't seem the code is there - only the pretrained models are. https://github.com/kingoflolz/mesh-transformer-jax/#gpt-j-6b
https://huggingface.co/EleutherAI/gpt-j-6B
Isn't that so?
- Meta announces a GPT3-size language model you can download
175B * 16 bits = 350GB, but it does compress a bit.
GPT-J-6B, which you can download at https://github.com/kingoflolz/mesh-transformer-jax, is 6B parameters but weighs 9GB. It does decompress to 12GB as expected. Assuming the same compression ratio, download size would be 263GB, not 350GB.
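The arithmetic in that comment, spelled out (decimal GB, fp16 weights, and the 9 GB to 12 GB compression ratio observed for the GPT-J download):

```python
bytes_per_param = 2                     # fp16
gptj_gb = 6e9 * bytes_per_param / 1e9   # ~12 GB uncompressed
opt_gb = 175e9 * bytes_per_param / 1e9  # ~350 GB uncompressed
compression = 9 / 12                    # GPT-J: 9 GB download -> 12 GB on disk
print(opt_gb * compression)             # ~262.5 GB estimated download size
```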
- [D] Connor Leahy on EleutherAI, Replicating GPT-2/GPT-3, AI Risk and Alignment
GPT-J
- did they fix the game and its issues?
If you have a high-end PC, you might be able to run the Fairseq 13B model locally (it's labelled on that site as "dense_13b"). Alternatively, if you have a fairly powerful PC that can't quite run Fairseq 13B, you might still be able to run the GPT-J 6B model. You can also use the KoboldAI GPT-J 6B Google Colab if you can't run GPT-J 6B locally. While on the topic of KoboldAI, KoboldAI is also generally a pretty good frontend for running AI models locally. Back on the topic of GPT-J 6B, though, the EleutherAI GPT-J 6B demo is also a decent option for testing GPT-J 6B without any computing done on your device's end. It can also be used on mobile devices.
What are some alternatives?
ColossalAI - Making large AI models cheaper, faster and more accessible
fairscale - PyTorch extensions for high performance and large scale training.
TensorRT - NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
Megatron-LM - Ongoing research training transformer models at scale
fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
llama - Inference code for LLaMA models
gpt-neox - An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
server - The Triton Inference Server provides an optimized cloud and edge inferencing solution.
tensorflow - An Open Source Machine Learning Framework for Everyone
text-generation-webui - A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Finetune_LLMs - Repo for fine-tuning GPTJ and other GPT models