Agent trains great with PPO but terrible with SAC --> Advice for Hyperparameters

This page summarizes the projects mentioned and recommended in the original post on /r/reinforcementlearning.

  • rl-baselines-zoo

    A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

  • Take a look at the tuned sets of hyperparameters in rl-baselines-zoo for various problems with PPO and SAC. The batch sizes there are WAY smaller than yours, regardless of the problem. Your initial learning rate may also be too high.
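The advice above (smaller batch size, lower initial learning rate) can be sketched as a small set of overrides. This is only an illustration: the numbers are hypothetical placeholders, not values copied from rl-baselines-zoo, and `linear_schedule` and `sac_overrides` are names made up for this example. The schedule assumes the Stable Baselines convention that a callable learning rate receives the remaining training progress, going from 1 to 0.

```python
from typing import Callable, Dict, Union


def linear_schedule(initial_value: float) -> Callable[[float], float]:
    """Build a learning-rate schedule that decays linearly to zero.

    The returned function takes the remaining training progress
    (1.0 at the start, 0.0 at the end) and returns the learning rate.
    """

    def schedule(progress_remaining: float) -> float:
        return progress_remaining * initial_value

    return schedule


# Hypothetical starting point, NOT tuned values from rl-baselines-zoo:
# a modest batch size and a learning rate that decays over training.
sac_overrides: Dict[str, Union[int, Callable[[float], float]]] = {
    "batch_size": 256,
    "learning_rate": linear_schedule(3e-4),
}

if __name__ == "__main__":
    lr = sac_overrides["learning_rate"]
    # Print the learning rate at the start, middle, and end of training.
    for progress in (1.0, 0.5, 0.0):
        print(f"progress_remaining={progress:.1f} -> lr={lr(progress):.6f}")
```

Assuming the SAC constructor in your library accepts `batch_size` and `learning_rate` keyword arguments, the dictionary could be unpacked into it (for example `SAC("MlpPolicy", env, **sac_overrides)`). Compare your own values against the zoo's tuned entry for the closest environment before settling on anything.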

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
