ignite vs xla

| | ignite | xla |
|---|---|---|
| Mentions | 3 | 8 |
| Stars | 4,458 | 2,296 |
| Growth (stars, month over month) | 0.3% | 1.4% |
| Activity | 8.7 | 9.9 |
| Last commit | 1 day ago | 1 day ago |
| Language | Python | C++ |
| License | BSD 3-Clause "New" or "Revised" License | GNU General Public License v3.0 or later |
Stars: the number of stars a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits carry more weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
ignite
-
Introducing PyTorch-Ignite's Code Generator v0.2.0
Along with the PyTorch-Ignite 0.4.5 release, we are excited to announce the new release of the web application for generating PyTorch-Ignite's training pipelines. This blog post is an overview of the key features and updates of the Code Generator v0.2.0 project release.
-
Distributed Training Made Easy with PyTorch-Ignite
PyTorch-Ignite's ignite.distributed (idist) submodule, introduced in v0.4.0 (July 2020), quickly turns single-process code into its data-distributed version.
-
Introduction to PyTorch-Ignite
More details about distributed helpers provided by PyTorch-Ignite can be found in the documentation. A complete example of training on CIFAR10 can be found here.
xla
-
Who uses Google TPUs for inference in production?
> The PyTorch/XLA Team at Google
Meanwhile you have an issue from 5 years ago with 0 support
https://github.com/pytorch/xla/issues/202
-
Google TPU v5p beats Nvidia H100
PyTorch has had an XLA backend for years. I don't know how performant it is though. https://pytorch.org/xla
-
Why Did Google Brain Exist?
It's curtains for XLA, to be precise. And PyTorch officially supports an XLA backend nowadays too ([1]), which kind of puts JAX and PyTorch on the same foundation.
1. https://github.com/pytorch/xla
-
Accelerating AI inference?
PyTorch supports other kinds of accelerators (e.g., FPGAs and https://github.com/pytorch/glow), but unless you want to become an ML systems engineer and have money and time to throw away, or a business case to fund it, it is not worth it. In general, both PyTorch and TensorFlow have hardware abstractions that compile down to device code (XLA, https://github.com/pytorch/xla, https://github.com/pytorch/glow). TPUs and GPUs have very different strengths, so getting top performance requires a lot of manual optimization. Considering the cost of training LLMs, it is time well spent.
-
[D] Colab TPU low performance
While TPUs can apparently achieve great speedups in theory, getting to the point where they beat a single GPU requires a lot of fiddling around and debugging; a specific setup is required to make it work properly. E.g., here it says that to keep the TPU busy you might need a better CPU than the one in Colab. The tutorials I looked at oversimplified the whole matter, and the same goes for pytorch-lightning, which implies that switching to TPU is as easy as changing a single parameter. Furthermore, none of the tutorials I saw (even after specifically searching for that) went into detail about why and how to set up a GCS bucket for data loading.
-
How to train large deep learning models as a startup
-
Distributed Training Made Easy with PyTorch-Ignite
XLA on TPUs via pytorch/xla.
-
[P] PyTorch for TensorFlow Users - A Minimal Diff
I don't know of any such trick except for using TensorFlow. In fact, I benchmarked PyTorch XLA vs TensorFlow and found that the former's performance was quite abysmal: PyTorch XLA is very slow on Google Colab. The developers' explanation, as I understood it, was that TF was using features not available to the PyTorch XLA developers, and that they therefore could not compete on performance. The situation may be different today; I don't really know.
What are some alternatives?
torch-metrics - Metrics for model evaluation in pytorch
NCCL - Optimized primitives for collective multi-GPU communication
image-similarity-measures - Implementation of eight evaluation metrics to assess the similarity between two images. The eight metrics are as follows: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.
pytorch-lightning - Build high-performance AI models with PyTorch Lightning (organized PyTorch). Deploy models with Lightning Apps (organized Python to build end-to-end ML systems). [Moved to: https://github.com/Lightning-AI/lightning]
prometheus_flask_exporter - Prometheus exporter for Flask applications
why-ignite - Why should we use PyTorch-Ignite?
pymetrix - A simple Plug and Play Library for getting analytics. See website for docs.
pocketsphinx - A small speech recognizer
code-generator - Web Application to generate your training scripts with PyTorch Ignite
ompi - Open MPI main development repository
gloo - Collective communications library with various primitives for multi-machine training.