sagemaker-training-toolkit vs sagemaker-tensorflow-training-toolkit
 | sagemaker-training-toolkit | sagemaker-tensorflow-training-toolkit |
---|---|---|
Mentions | 1 | 1 |
Stars | 470 | 267 |
Growth | 2.8% | 0.0% |
Activity | 6.3 | 0.0 |
Last commit | about 1 month ago | about 1 year ago |
Language | Python | Python |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sagemaker-training-toolkit
- Distributed training with Horovod/MPI
I'm using sagemaker-training-toolkit to attempt hyperparameter optimization, and I'm trying to take advantage of all the cores on each machine using its MPI options (which, to my understanding, run Horovod on top of MPI). I'm pretty new to this space and can't find anything that describes in lay terms how training works in this distributed model. With AllReduce, how often does the reduce happen? I'm trying to figure out whether all training threads are training a shared model, such that every thread is training on the "latest" version of the model.
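A minimal sketch of the usual answer, assuming Horovod's default synchronous mode: each worker (a separate process rather than a thread) keeps its own full copy of the model, computes gradients on its own shard of the batch, and the gradients are averaged with an allreduce once per optimizer step. Because every copy starts from the same weights and applies the same averaged gradient, all copies stay identical, so each worker is effectively always training the "latest" model. The code below is a plain-Python simulation of that idea and does not call Horovod or MPI; the function names and the toy data are hypothetical.

```python
# Pure-Python simulation of synchronous data-parallel SGD with an averaging
# allreduce. No Horovod or MPI is involved; each "worker" is one list entry.

def allreduce_mean(values):
    """Average one value per worker and give every worker the same result."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train(shards, steps=50, lr=0.05):
    # Every worker starts from the same weights (Horovod broadcasts them).
    weights = [0.0] * len(shards)
    for _ in range(steps):
        # Each worker computes a gradient on its own data only.
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # One allreduce per optimizer step: all workers get the averaged gradient.
        grads = allreduce_mean(grads)
        # Each worker applies the same update, so the copies never diverge.
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Hypothetical toy data following y = 3x, split across two workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
weights = train(shards)
```

In this sketch the reduce happens on every step, which is why the per-worker weights end up equal. There is no single shared model object; the workers hold identical replicas that are kept in sync by the averaged gradients.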
sagemaker-tensorflow-training-toolkit
- Launch HN: Slai (YC W22) – Build ML models quickly and deploy them as apps
This is pretty cool, especially the opinionated structuring part!
SageMaker now lets you download your running code and Docker (https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangle...). It also lets you simulate running locally: https://github.com/aws/sagemaker-tensorflow-training-toolkit
More than anything else, this is basically a way to calm worries about lock-in. Google ML resisted this for a long time, but even they finally had to do it: https://cloud.google.com/automl-tables/docs/model-export
Are you planning something similar?
What are some alternatives?
image-super-resolution - 🔎 Super-scale your images and run experiments with Residual Dense and Adversarial Networks.
editGAN_release
jina - ☁️ Build multimodal AI applications with cloud-native stack
sagemaker-distribution - A set of Docker images that include popular frameworks for machine learning, data science and visualization.
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
spotty - Training deep learning models on AWS and GCP instances
torchlambda - Lightweight tool to deploy PyTorch models to AWS Lambda
sagemaker-python-sdk - A library for training and deploying machine learning models on Amazon SageMaker
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.