How we achieved hyper-parameter tuning (HPT) for deep learning workflows 1.5x faster in our clusters and 3x cheaper on AWS

This page summarizes the projects mentioned and recommended in the original post on /r/learnmachinelearning

  • adaptdl

    Resource-adaptive cluster scheduler for deep learning training.

  • To tackle the problem of long and expensive HPT workflows, our team at Petuum collaborated with Microsoft to integrate AdaptDL with Neural Network Intelligence (NNI). AdaptDL is an open-source tool in the CASL (Composable, Automatic, and Scalable Learning) ecosystem. It offers adaptive resource management for distributed clusters and reduces the cost of deep learning workloads, which can range from a few training or tuning trials to thousands. NNI, from the Microsoft open-source community, is a toolkit for automatic machine learning (AutoML) and hyper-parameter tuning.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • [Discussion] Open source scheduler and queuing system for model training/inferencing tasks?

    1 project | /r/MachineLearning | 16 Aug 2021
  • Reduce cost by 3x in the cloud and improve GPU usage in shared clusters with AdaptDL for PyTorch

    1 project | /r/u_Henry-GO | 24 Mar 2021
  • [D] Anyone deploy DL models with AWS Lambda? Trying to estimate costs

    2 projects | /r/MachineLearning | 5 Apr 2021
  • SB-1047 will stifle open-source AI and decrease safety

    2 projects | news.ycombinator.com | 29 Apr 2024
  • Getting Started with Gemma Models

    4 projects | dev.to | 15 Apr 2024