Strange training results: why is a batch size of 1 more efficient than larger batch sizes, despite using a GPU/TPU?

This page summarizes the projects mentioned and recommended in the original post on /r/learnmachinelearning

  • seed_rl

    Discontinued SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

  • I'm currently doing some tests in preparation for my first real bit of training. I'm using Google Cloud AI Platform to train, and am trying to find the optimal machine setup. It's a work in progress, but here's a table I'm putting together to get a sense of the efficiency of each setup. On the left you'll see the accelerator type, ordered from least to most expensive, along with the number of accelerators used, the cost per hour, and the batch size. To the right are the average time it took to complete an entire training iteration and the time it took to complete the minimization step.

    You'll notice those two values are almost identical for each setup; I'm using Google Research's SEED RL, so I recorded both since I wasn't sure exactly what happens between iterations. Turns out it's not much. There's also a calculation of the time it takes to complete a single "step" (i.e., a single observation from a single environment), as well as the average cost per step. (A rough timing sketch illustrating this kind of per-step comparison follows below.)
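
    The following is a minimal timing sketch, not the poster's actual SEED RL setup: it assumes a small toy Keras model in TF2 and shows how time per training iteration and time per observation could be compared across batch sizes. The model shape, batch sizes, and step counts are illustrative assumptions only.

    import time
    import tensorflow as tf

    # Toy network standing in for the agent model; sizes are arbitrary.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(64,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(4),
    ])
    optimizer = tf.keras.optimizers.Adam()
    loss_fn = tf.keras.losses.MeanSquaredError()

    @tf.function
    def train_step(x, y):
        # One "minimization step": forward pass, loss, gradients, weight update.
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss

    for batch_size in (1, 32, 256, 1024):
        x = tf.random.normal((batch_size, 64))
        y = tf.random.normal((batch_size, 4))
        train_step(x, y)  # warm-up so tf.function tracing isn't included in the timing

        steps = 100
        start = time.perf_counter()
        for _ in range(steps):
            train_step(x, y).numpy()  # .numpy() forces the step to finish before timing continues
        elapsed = time.perf_counter() - start

        per_iter = elapsed / steps       # seconds per training iteration
        per_obs = per_iter / batch_size  # seconds per single observation
        print(f"batch={batch_size:5d}  iter={per_iter * 1e3:8.2f} ms  per-obs={per_obs * 1e6:9.1f} us")

    Running a sketch like this with and without an accelerator attached makes it easier to see whether per-iteration time is dominated by fixed per-step overhead or by batch-dependent compute, which is what the cost-per-step column in the table is meant to surface.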

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • [Q]Official seed_rl repo is archived.. any alternative seed_rl style drl repo??

    1 project | /r/reinforcementlearning | 17 Dec 2022
  • Need some help understanding what steps to take to debug a RL agent

    1 project | /r/learnmachinelearning | 17 Jul 2021
  • Strange results from training with Google Cloud TPUs, seem to be very inefficient?

    1 project | /r/learnmachinelearning | 15 Jul 2021
  • Having trouble passing custom flags with AI Platform

    1 project | /r/googlecloud | 29 Jun 2021
  • New to Linux, trying to understand why a variable isn't getting assigned in an .sh file

    1 project | /r/linuxquestions | 20 Jun 2021