Top 9 Python distributed-training Projects
-
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
Project mention: Inference on resent, cant work out the problem? | reddit.com/r/MLQuestions | 2023-05-11
Additionally, you might find the timm library handy for this sort of work.
-
FedML
FedML - The federated learning and analytics library enabling secure and collaborative machine learning on decentralized data anywhere at any scale. Supporting large-scale cross-silo federated learning, cross-device federated learning on smartphones/IoTs, and research simulation. MLOps and App Marketplace are also enabled (https://open.fedml.ai).
Project mention: Awesome-Federated-Learning: A curated list of federated learning publications, re-organized from Arxiv (mostly). | reddit.com/r/FederatedLearning | 2023-03-30
-
skypilot
SkyPilot is a framework for easily running machine learning workloads on any cloud through a unified interface.
Interesting, happy to chat and provide feedback as I have been working in this field for the last few years. Did you get inspiration by any chance from the following paper : https://arxiv.org/pdf/2205.07147.pdf and their recent implementation https://github.com/skypilot-org/skypilot ?
-
alpa
Alpa does training and serving with 175B-parameter models: https://github.com/alpa-projects/alpa
-
hivemind
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
Project mention: Do you think that AI research will slow down to a halt because of regulation? | reddit.com/r/singularity | 2023-05-21
Not if we rise to meet that challenge. Here are a few tools that facilitate AI research in the face of an advanced persistent threat: Hivemind, a distributed PyTorch framework
-
adaptdl
-
HandyRL
HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.
-
distributed-diffusion
Project mention: What is Midjourney doing better than us? | reddit.com/r/StableDiffusion | 2023-04-04
Noob here; I don't know how the community could contribute to reinforcing a shared training effort, but this is maybe what we should aim for. Imagine users contributing to training a large model, with a system of upvotes (like Midjourney)... They have control over the model and keep reinforcing it. We are fragmented across multiple models, LoRAs and such, with everyone focusing on different things. I did some research a while ago and ended up here: https://github.com/chavinlo/distributed-diffusion and this https://learning-at-home.github.io/
-
Fast-Kubeflow
This repo covers Kubeflow Environment with LABs: Kubeflow GUI, Jupyter Notebooks on pods, Kubeflow Pipelines, Experiments, KALE, KATIB (AutoML: Hyperparameter Tuning), KFServe (Model Serving), Training Operators (Distributed Training), Projects, etc.
Project mention: Fast-Kubeflow: Kubeflow Tutorial, Sample Usage Scenarios (Howto: Hands-on LAB) | reddit.com/r/mlops | 2023-01-04
Python distributed-training related posts
- Do you think that AI research will slow down to a halt because of regulation?
- [D] Google "We Have No Moat, And Neither Does OpenAI": Leaked Internal Google Document Claims Open Source AI Will Outcompete Google and OpenAI
- What is Midjourney doing better than us?
- Awesome-Federated-Learning: A curated list of federated learning publications, re-organized from Arxiv (mostly).
- Run 100B+ language models at home, BitTorrent‑style
- FedML has just released its completely revamped and redesigned AI Platform and Website.
- CAI chose the path of failure. I'd like to offer a unique skill I have to help Pygmalion
Index
What are some of the best open-source distributed-training projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | pytorch-image-models | 25,471 |
2 | FedML | 2,815 |
3 | skypilot | 2,635 |
4 | alpa | 2,489 |
5 | hivemind | 1,512 |
6 | adaptdl | 363 |
7 | HandyRL | 266 |
8 | distributed-diffusion | 135 |
9 | Fast-Kubeflow | 40 |