Top 8 Python distributed-training Projects
-
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
-
skypilot
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
-
FedML
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, FEDML Nexus AI (https://fedml.ai) is your generative AI platform at scale.
-
hivemind
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
-
HandyRL
HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.
-
Fast-Kubeflow
This repo covers Kubeflow Environment with LABs: Kubeflow GUI, Jupyter Notebooks on pods, Kubeflow Pipelines, Experiments, KALE, KATIB (AutoML: Hyperparameter Tuning), KFServe (Model Serving), Training Operators (Distributed Training), Projects, etc.
Project mention: Ask HN: Most efficient way to fine-tune an LLM in 2024? | news.ycombinator.com | 2024-04-04
Project mention: [Experiment] The future of AI is open-source, and here is the plan | /r/samkoesnadi | 2023-06-05
FedML https://github.com/FedML-AI/FedML might already provide a lot of tools to do the job. https://github.com/learning-at-home/hivemind is also relevant.
Python distributed-training related posts
- Would anyone be interested in contributing to some group projects?
- Hivemind: Train deep learning models on thousands of volunteers across the world
- Could a model not be trained by a decentralized network? Like SETI@home or kinda-sorta like Bitcoin. Petals accomplishes this somewhat, but if raw compute power is the only barrier to open source, I'd be happy to try organizing decentralized computing efforts
- Orca (built on llama13b) looks like the new sheriff in town
- [Experiment] The future of AI is open-source, and here is the plan
- Do you think that AI research will slow down to a halt because of regulation?
- [D] Google "We Have No Moat, And Neither Does OpenAI": Leaked Internal Google Document Claims Open Source AI Will Outcompete Google and OpenAI
Index
What are some of the best open-source distributed-training projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | pytorch-image-models | 29,659 |
2 | skypilot | 5,602 |
3 | FedML | 4,052 |
4 | alpa | 2,983 |
5 | hivemind | 1,833 |
6 | adaptdl | 395 |
7 | HandyRL | 282 |
8 | Fast-Kubeflow | 69 |