[D] DeepSpeed vs PyTorch native API

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • gpt-neox

    An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

  • Both EleutherAI's gpt-neox and the BigScience project use DeepSpeed under the hood, probably because DeepSpeed remains the best component for training large models. So it really depends on your scale whether DeepSpeed is still your answer, or whether you can get away with the native PyTorch alternatives.

  • Megatron-DeepSpeed

    Ongoing research training transformer language models at scale, including: BERT & GPT-2

  • fairscale

    PyTorch extensions for high performance and large scale training.

  • Things such as the ZeRO redundancy optimizer are slowly moving into PyTorch upstream, but in my experience the team behind DeepSpeed just moves faster. There is also fairscale from the FAIR team, which seems to be a staging ground for experimental optimizations before they move into PyTorch. If you use Lightning, it's easy enough to try out these various libraries (see the Lightning docs).
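To make the "depends on your scale" point a bit more concrete, here is a rough sketch of the per-GPU memory needed for model states under the different ZeRO stages. It assumes mixed-precision Adam with the commonly cited accounting of 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer state, and it ignores activations, buffers, and fragmentation. The function name `zero_memory_gb` is made up for this illustration and is not part of DeepSpeed, fairscale, or PyTorch.

```python
def zero_memory_gb(num_params, num_gpus, stage):
    """Approximate per-GPU memory (in GB) for model states only.

    Assumes mixed-precision Adam: 2 bytes/param for fp16 weights,
    2 bytes/param for fp16 gradients, 12 bytes/param for optimizer state
    (fp32 weight copy, momentum, variance). Activations are not counted.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2 or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    weights = 2 * num_params
    grads = 2 * num_params
    optim = 12 * num_params

    # Each stage shards one more kind of state across the data-parallel group.
    if stage >= 1:
        optim /= num_gpus      # stage 1: shard optimizer state
    if stage >= 2:
        grads /= num_gpus      # stage 2: also shard gradients
    if stage >= 3:
        weights /= num_gpus    # stage 3: also shard the parameters

    return (weights + grads + optim) / 1e9


if __name__ == "__main__":
    # Hypothetical setup: a 7.5B-parameter model on 64 GPUs.
    for stage in range(4):
        gb = zero_memory_gb(7.5e9, 64, stage)
        print(f"ZeRO stage {stage}: {gb:.1f} GB per GPU")
```

Under these assumptions, a 7.5B-parameter model needs roughly 120 GB per GPU for model states without sharding and about 31 GB with optimizer-state sharding alone. Optimizer-state sharding is the part that PyTorch's native ZeroRedundancyOptimizer covers. Past a certain model size the later stages become necessary, which is where DeepSpeed or fairscale come in.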
