I’m a researcher at MosaicML and we are excited to release Composer (https://github.com/mosaicml/composer), an open-source library to speed up training of deep learning models by integrating better algorithms into the training process.
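To give a sense of what that looks like in practice, here's a minimal sketch of Composer's Trainer with a couple of speedup algorithms enabled. The specific names here (ComposerClassifier, BlurPool, LabelSmoothing) come from Composer's docs and may change between versions, so treat this as illustrative rather than canonical:

```python
# Minimal sketch of training with Composer's algorithm swapping
# (assumes `pip install mosaicml torchvision`; API names may change between versions).
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

from composer import Trainer
from composer.algorithms import BlurPool, LabelSmoothing
from composer.models import ComposerClassifier

# Wrap a plain torchvision model so Composer knows how to compute loss and metrics.
model = ComposerClassifier(models.resnet18(num_classes=10))

train_dataloader = DataLoader(
    datasets.CIFAR10("data", train=True, download=True,
                     transform=transforms.ToTensor()),
    batch_size=128, shuffle=True,
)

# The algorithms list is where Composer changes the math of the training loop.
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    max_duration="1ep",
    algorithms=[BlurPool(), LabelSmoothing(smoothing=0.1)],
)
trainer.fit()
```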
PyTorch Lightning is also very slow compared to Composer. You don't have to believe us: our friends who wrote the FFCV library benchmarked us against PTL (see the lower-left plot in the first cluster of graphs), and you can see the difference for yourself. For the same accuracy, the FFCV folks found that Composer is about 5x faster than PTL on ResNet-50 on ImageNet.
Re: Channels Last + U-Net: According to our expert on the lower-level aspects of things (the amazing Daya Khudia), the problem is InstanceNorm. [Daya filed an issue about the lack of compatibility between InstanceNorm and Channels Last](https://github.com/pytorch/pytorch/issues/72341), and we're hoping our friends at PyTorch fix it soon.
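If you want to check whether a given layer preserves channels-last yourself, a quick plain-PyTorch probe (nothing Composer-specific, and the behavior may differ across PyTorch versions) looks like this:

```python
# Probe whether an op preserves the channels-last memory format
# (plain PyTorch; results may vary by version and backend).
import torch
import torch.nn as nn

x = torch.randn(8, 3, 64, 64).to(memory_format=torch.channels_last)

conv = nn.Conv2d(3, 16, 3, padding=1)
inorm = nn.InstanceNorm2d(16)

y = conv(x)
# Conv2d typically propagates the channels-last (NHWC) layout.
print(y.is_contiguous(memory_format=torch.channels_last))

z = inorm(y)
# At the time of the linked issue, InstanceNorm could fall back to a
# contiguous (NCHW) kernel, silently losing the channels-last layout.
print(z.is_contiguous(memory_format=torch.channels_last))
```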
I have been using their ASP package and have found that it works well, though, as you said, I would like to see support during the training phase as well.
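For anyone unfamiliar, ASP is apex's Automatic SParsity tooling for 2:4 structured sparsity. A rough sketch of the post-training pruning flow (assuming apex is installed with its contrib extensions; the API may differ by version):

```python
# Rough sketch of post-training 2:4 pruning with apex's ASP
# (assumes apex is built with its CUDA/contrib extensions; API may differ by version).
import torch
import torchvision.models as models
from apex.contrib.sparsity import ASP

model = models.resnet50(pretrained=True).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Computes 2:4 sparse masks over the already-trained weights in one shot;
# afterwards you fine-tune to recover accuracy, with masks re-applied
# so the weights stay sparse.
ASP.prune_trained_model(model, optimizer)

# ... fine-tuning loop would go here ...
```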
PyTorch Lightning benchmarks against PyTorch on every PR to make sure that it is not slower.
The way I see it, what we're working on is really a completely new layer in the stack: speeding up the algorithm itself by changing the math. We've still taken great pains to make sure everything else in Composer runs as efficiently as it can, but - as long as you're running the same set of mathematical operations in the same order - there isn't much room to distinguish one trainer from another, and I'd guess that there isn't much of a raw speed difference between Composer and PTL in that sense. For that reason, we aren't very focused on inter-trainer speed comparisons - 10% or 20% here or there is a rounding error on the 4x or more that you can expect in the long run by changing the math. (I will say, though, that the engineers at MosaicML are really good at what they do, and Composer is performance-tuned - it absolutely wipes the floor with the OpenLTH trainer I tried to write for my PhD, even without the algorithmic speedups.)