[P] Composer: a new PyTorch library to train models ~2-4x faster with better algorithms

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • composer

    Supercharge Your Model Training (by mosaicml)

  • I’m a researcher at MosaicML and we are excited to release Composer (https://github.com/mosaicml/composer), an open-source library to speed up training of deep learning models by integrating better algorithms into the training process.
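
For readers who want to see what this looks like in practice, here is a minimal sketch of the Trainer-plus-algorithms pattern from the Composer README: the speed-up methods are passed as a list of algorithm objects. Treat the `mnist_model` helper and the specific algorithm arguments as assumptions against whichever Composer version you install.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from composer import Trainer
from composer.algorithms import BlurPool, ChannelsLast, LabelSmoothing
from composer.models import mnist_model  # assumption: README's example model helper

train_dataloader = DataLoader(
    datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor()),
    batch_size=128,
    shuffle=True,
)

# The speed-up methods are passed as a list of algorithm objects; the Trainer
# applies them by modifying the model and training loop ("changing the math").
trainer = Trainer(
    model=mnist_model(),
    train_dataloader=train_dataloader,
    max_duration="2ep",
    algorithms=[ChannelsLast(), BlurPool(), LabelSmoothing(smoothing=0.1)],
)
trainer.fit()
```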

  • ffcv

    FFCV: Fast Forward Computer Vision (and other ML workloads!)

  • PyTorch Lightning is also very slow compared to Composer. You don't have to believe us: our friends who wrote the FFCV library benchmarked us against PTL (see the lower-left plot in the first cluster of graphs), and you can see the difference for yourself. For the same accuracy, the FFCV folks found that Composer is about 5x faster than PTL on ResNet-50 on ImageNet.

  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

  • Re: Channels Last + U-Net: According to our expert on the lower-level aspects of things (the amazing Daya Khudia), the problem is InstanceNorm. [Daya filed an issue about the lack of compatibility between InstanceNorm and Channels Last](https://github.com/pytorch/pytorch/issues/72341), and we're hoping our friends at PyTorch fix it soon.
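
For context on the issue above, the snippet below is plain PyTorch (not Composer-specific) showing the standard way to opt a model and its inputs into the channels-last memory format; when an operator such as InstanceNorm lacks a channels-last kernel, PyTorch falls back to the contiguous path for that op and the expected speedup is lost.

```python
import torch
import torch.nn as nn

# Standard PyTorch channels-last usage (not Composer-specific).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.InstanceNorm2d(16),  # the op reported as lacking a channels-last kernel
    nn.ReLU(),
).to(memory_format=torch.channels_last)

x = torch.randn(8, 3, 64, 64).to(memory_format=torch.channels_last)
y = model(x)

# If an operator has no channels-last implementation, PyTorch silently falls
# back to the contiguous (NCHW) path for it, which is why the speedup
# disappears for U-Nets that rely on InstanceNorm.
print(y.is_contiguous(memory_format=torch.channels_last))
```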

  • apex

    A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch (by NVIDIA)

  • I have been using their ASP (Automatic SParsity) package and have found it to work well, though, as you said, I would like to see support during the training phase as well.
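
For reference, the sketch below is my reading of apex's documented ASP workflow: it prunes an already-trained model to 2:4 structured sparsity and patches the optimizer, which is why support during the training phase itself is the missing piece. Treat the exact entry point as an assumption against whatever apex version you have installed.

```python
import torch
import torchvision

# Sketch of the post-training ASP workflow (based on apex's documented usage;
# exact entry points may differ across apex versions).
from apex.contrib.sparsity import ASP

model = torchvision.models.resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# ... load pretrained weights / finish dense training here ...

# Rewrites eligible weights with 2:4 structured-sparse masks and patches the
# optimizer so the masks are respected during any subsequent fine-tuning.
ASP.prune_trained_model(model, optimizer)

# Fine-tune the now-sparse model with the usual training loop.
```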

  • pytorch-lightning

    Discontinued: Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning] (by PyTorchLightning)

  • PyTorch Lightning benchmarks against plain PyTorch on every PR to make sure that it is not slower.
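
To illustrate what such a regression check measures, here is a hypothetical parity benchmark, not Lightning's actual CI code: time an identical model and dataloader in a hand-written PyTorch loop, measure the trainer under test the same way, and fail if the trainer adds more than some tolerated overhead.

```python
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical parity check, illustrating the idea only; Lightning's real
# benchmarks live in its own CI and are more careful than this.
def time_vanilla_loop(model, loader, steps=100):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    start = time.perf_counter()
    for i, (x, y) in enumerate(loader):
        if i >= steps:
            break
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return time.perf_counter() - start

dataset = TensorDataset(torch.randn(12800, 32), torch.randint(0, 10, (12800,)))
loader = DataLoader(dataset, batch_size=128)
baseline = time_vanilla_loop(nn.Linear(32, 10), loader)

# trainer_time would be measured the same way with the trainer under test;
# the assertion is the "not slower than plain PyTorch (within tolerance)" gate.
# assert trainer_time <= baseline * 1.2
```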

  • open_lth

    A repository in preparation for open-sourcing lottery ticket hypothesis code.

  • The way I see it, what we're working on is really a completely new layer in the stack: speeding up the algorithm itself by changing the math. We've still taken great pains to make sure everything else in Composer runs as efficiently as it can, but - as long as you're running the same set of mathematical operations in the same order - there isn't much room to distinguish one trainer from another, and I'd guess that there isn't much of a raw speed difference between Composer and PTL in that sense. For that reason, we aren't very focused on inter-trainer speed comparisons - 10% or 20% here or there is a rounding error on the 4x or more that you can expect in the long run by changing the math. (I will say, though, that the engineers at MosaicML are really good at what they do, and Composer is performance-tuned - it absolutely wipes the floor with the OpenLTH trainer I tried to write for my PhD, even without the algorithmic speedups.)
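
As a concrete example of what "changing the math" means, the sketch below hand-rolls progressive resizing, one of the kinds of algorithmic methods Composer ships, in a vanilla PyTorch training step (illustrative only, not Composer's implementation): training on downscaled images early in the run cuts per-step FLOPs, and the resolution ramps back up so final accuracy can be recovered.

```python
import torch
import torch.nn.functional as F

# Hand-rolled sketch of progressive resizing in a plain training step
# (illustrative only; Composer ships its own implementation of this idea).
def scale_for_step(step, total_steps, start_scale=0.5):
    """Linearly ramp the input resolution from start_scale up to full size."""
    frac = min(step / (0.8 * total_steps), 1.0)  # reach full size at 80% of training
    return start_scale + (1.0 - start_scale) * frac

def training_step(model, opt, loss_fn, x, y, step, total_steps):
    scale = scale_for_step(step, total_steps)
    if scale < 1.0:
        # Shrinking the images changes the math of the forward/backward pass,
        # cutting FLOPs roughly quadratically in the scale factor.
        x = F.interpolate(x, scale_factor=scale, mode="bilinear", align_corners=False)
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    return loss.item()
```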

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
