[P] 10x faster reinforcement learning HPO - now with CNNs!

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • AgileRL

    Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools.

  • hlb-CIFAR10

    Train CIFAR-10 in <7 seconds on an A100, the current world record.

  • In a related but different vein (with hardcoded hyperparameters): if you'd like a research toolbench that trains rapidly on CIFAR-10 (94% in <7 seconds on an A100), I made https://github.com/tysam-code/hlb-CIFAR10. It's also very breadboard-ized, for lack of a better term, so you can reclone and hack stuff in quickly to see whether it works. Most things I tested took 5 minutes or less, some a few seconds, and a few more involved ones maybe half an hour to an hour, depending on how much debugging was needed. I'm definitely curious about the software in this post though, as there was a lot of painful tuning involved (the reward space is, er, quite noisy).

  • hlb-gpt

    Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).

  • Check it out! If LLMs are your thing, I did basically the same thing, but for 3.8 val loss on WikiText-103 in maybe 2.3ish minutes or so on an A100: https://github.com/tysam-code/hlb-gpt.

NOTE: The number of mentions on this list counts mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • TransformerXL + PPO Baseline + MemoryGym

    10 projects | /r/reinforcementlearning | 15 Feb 2023
  • Resources to get started with RL

    1 project | /r/reinforcementlearning | 15 May 2021
  • Train to 94% on CIFAR-10 in 3.29 seconds on a single A100

    2 projects | news.ycombinator.com | 4 Apr 2024
  • Understand how transformers work by demystifying all the math behind them

    1 project | news.ycombinator.com | 4 Jan 2024
  • The Power of Reinforcement Learning: look how this DeepRL Sektor model found a smart, super-cool exploit for Ultimate Mortal Kombat 3 in the video of a submission on DIAMBRA competition platform!

    1 project | /r/reinforcementlearning | 9 Dec 2023