RecurrentPPO (SB3-contrib) learning for autonomous driving

This page summarizes the projects mentioned and recommended in the original post on /r/reinforcementlearning

  • HighwayEnv

    A minimalist environment for decision-making in autonomous driving

  • Hi everyone! I'm a complete newbie to DRL, so please forgive my lack of understanding of some things here. I'm training RecurrentPPO from SB3-contrib on E. Leurent's highway-env [https://github.com/eleurent/highway-env] (I customized the actions to be more high-level). During training the agent shows the desired behaviour, but I noticed that some of the model's training metrics look quite off compared with the trends I've found online (especially the explained variance). I just wanted an opinion from some more experienced people here! Can I fix this trend with hyperparameter tuning, or do I have to, e.g., modify the reward function? How can I improve the training? I'm happy to provide any further details. I'm sharing the TensorBoard plots obtained for RecurrentPPO.
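For readers unsure how to read the explained-variance metric mentioned in the post, here is a minimal sketch of the usual definition, 1 - Var(returns - value_preds) / Var(returns). To my understanding this is the quantity SB3 logs as train/explained_variance, but treat that as an assumption. The function and argument names are illustrative and are not the library's API.

```python
from statistics import pvariance


def explained_variance(returns, value_preds):
    """Fraction of the variance in the returns captured by the value predictions.

    1.0 means the critic predicts the returns perfectly, 0.0 means it does no
    better than predicting a constant, and negative values mean it does worse.
    Illustrative helper, not taken from SB3.
    """
    var_returns = pvariance(returns)
    if var_returns == 0:
        # Constant returns: the ratio is undefined.
        return float("nan")
    residuals = [r - v for r, v in zip(returns, value_preds)]
    return 1.0 - pvariance(residuals) / var_returns


returns = [1, 2, 3, 4]
print(explained_variance(returns, [1, 2, 3, 4]))  # perfect critic
print(explained_variance(returns, [2.5] * 4))     # constant critic
print(explained_variance(returns, [4, 3, 2, 1]))  # anti-correlated critic
```

Under this definition, a value near zero or below zero during training suggests the value function is not tracking the returns, even if the policy itself behaves as desired.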

