-
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
-
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
GPT-NeoX is an example of a project using DeepSpeed and ZeRO-3 offloading. The wider project intends to train a GPT-3-sized model and release it freely to the world.
https://github.com/EleutherAI/gpt-neox
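For context, ZeRO-3 and its CPU offloading are enabled through DeepSpeed's JSON config file rather than in training code. A minimal sketch, assuming the current DeepSpeed config schema (the batch size value here is illustrative, not taken from the GPT-NeoX setup):

```json
{
  "train_batch_size": 32,
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" }
  }
}
```

Stage 3 partitions parameters, gradients, and optimizer states across workers; the two offload sections additionally move optimizer states and parameters to CPU memory to fit larger models.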
Support for this was also added to [Fairscale](https://fairscale.readthedocs.io/en/latest/) and [Fairseq](https://github.com/pytorch/fairseq) last week. In particular, the Fairscale implementation can be used in any PyTorch project without requiring the DeepSpeed trainer.
This is also being added to PyTorch itself:
https://github.com/pytorch/pytorch/pull/46750
Hi! I’m the one who wrote this code. My ZeRO-3 implementation is currently not working, but I’ve spoken with DeepSpeed devs and they’ve explained to me what I’ve been doing wrong. I haven’t had time to implement the fix but I don’t see any reason to assume it won’t work.
https://github.com/microsoft/DeepSpeed/issues/846