ZeRO-3 Offload: Scale DL models to trillion parameters without code changes

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • gpt-neox

    An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

  • GPT-NeoX is an example project that uses DeepSpeed and ZeRO-3 offloading; the wider project intends to train a GPT-3-sized model and release it freely to the world. A minimal ZeRO-3 offload configuration sketch appears after this list.

    https://github.com/EleutherAI/gpt-neox

  • fairseq

    Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

  • Support for this was also added to [Fairscale](https://fairscale.readthedocs.io/en/latest/) and [Fairseq](https://github.com/pytorch/fairseq) last week. In particular, the Fairscale implementation can be used in any PyTorch project without requiring the DeepSpeed trainer (see the Fairscale sketch after this list).

  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

  • This is also being added to PyTorch; see the PyTorch sketch after this list.

    https://github.com/pytorch/pytorch/pull/46750

  • DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

  • Hi! I’m the one who wrote this code. My ZeRO-3 implementation is currently not working, but I’ve spoken with DeepSpeed devs and they’ve explained to me what I’ve been doing wrong. I haven’t had time to implement the fix but I don’t see any reason to assume it won’t work.

    https://github.com/microsoft/DeepSpeed/issues/846
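
As a rough illustration of the ZeRO-3 offload setup discussed above, here is a minimal sketch of a DeepSpeed configuration that enables stage-3 sharding with CPU offload of parameters and optimizer state. The config keys follow the DeepSpeed documentation, but the model, batch size, and learning rate are placeholders, and exact argument names can vary across DeepSpeed versions; the script is meant to be started with the `deepspeed` launcher.

```python
import torch
import deepspeed

# Placeholder model; in practice this would be a large transformer.
model = torch.nn.Linear(1024, 1024)

# ZeRO stage 3 shards parameters, gradients, and optimizer state across ranks;
# the offload_* sections move parameters and optimizer state into CPU memory.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu"},
        "offload_optimizer": {"device": "cpu"},
    },
}

# deepspeed.initialize returns an engine that handles sharding, offload,
# mixed precision, and the optimizer step.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

From there, training follows the usual DeepSpeed pattern: compute a loss from `model_engine(batch)`, then call `model_engine.backward(loss)` and `model_engine.step()` instead of the plain PyTorch equivalents.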
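
The Fairscale route mentioned above wraps an ordinary PyTorch module in `FullyShardedDataParallel`, so no DeepSpeed trainer is needed. The sketch below is a minimal, assumption-laden example (toy model, `torchrun`-style launch, NCCL backend); CPU offload of the sharded parameters is controlled by additional constructor flags whose names have changed across Fairscale versions, so they are omitted here.

```python
import torch
import torch.distributed as dist
from fairscale.nn import FullyShardedDataParallel as FSDP

def main() -> None:
    # FSDP assumes torch.distributed is already initialized, e.g. via torchrun.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    # Toy stand-in for a large model.
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096),
        torch.nn.ReLU(),
        torch.nn.Linear(4096, 1024),
    ).cuda()

    # FSDP shards parameters, gradients, and optimizer state across ranks,
    # gathering full parameters only around each layer's forward/backward.
    sharded_model = FSDP(model)
    optimizer = torch.optim.Adam(sharded_model.parameters(), lr=1e-4)

    x = torch.randn(8, 1024, device="cuda")
    loss = sharded_model(x).sum()
    loss.backward()          # gradients are reduce-scattered by FSDP hooks
    optimizer.step()         # each rank updates only its own shard

if __name__ == "__main__":
    main()
```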
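
For the PyTorch pull request linked above, the underlying idea of sharding optimizer state across data-parallel workers is what PyTorch now exposes as `torch.distributed.optim.ZeroRedundancyOptimizer` (the ZeRO stage-1 piece). The following is an illustrative sketch, not code from that PR; the toy model and hyperparameters are placeholders and a `torchrun` launch is assumed.

```python
import torch
import torch.distributed as dist
from torch.distributed.optim import ZeroRedundancyOptimizer
from torch.nn.parallel import DistributedDataParallel as DDP

def main() -> None:
    dist.init_process_group(backend="nccl")
    device = torch.device("cuda", dist.get_rank() % torch.cuda.device_count())

    # Toy model wrapped in DDP for gradient synchronization.
    model = DDP(torch.nn.Linear(2048, 2048).to(device), device_ids=[device.index])

    # Each rank keeps only its shard of the Adam state (exp_avg, exp_avg_sq),
    # cutting optimizer memory roughly by the world size.
    optimizer = ZeroRedundancyOptimizer(
        model.parameters(),
        optimizer_class=torch.optim.Adam,
        lr=1e-4,
    )

    x = torch.randn(16, 2048, device=device)
    model(x).sum().backward()
    optimizer.step()  # updated parameters are broadcast back to all ranks

if __name__ == "__main__":
    main()
```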

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • A comprehensive guide to running Llama 2 locally

    19 projects | news.ycombinator.com | 25 Jul 2023
  • Cleared AWS Machine Learning - Specialty exam.. Happy to help!!!

    2 projects | /r/AWSCertifications | 4 Apr 2023
  • People tricking ChatGPT “like watching an Asimov novel come to life”

    1 project | news.ycombinator.com | 2 Dec 2022
  • Good practices for neural network training: identify, save, and document best models

    1 project | dev.to | 4 Jan 2022
  • D I Refuse To Use Pytorch Because Its A Facebook

    1 project | /r/MachineLearning | 29 Dec 2020