Problem with Truncated Quantile Critics (TQC) and n-step learning algorithm.

This page summarizes the projects mentioned and recommended in the original post on /r/reinforcementlearning

  • tmrl

    Reinforcement Learning for real-time applications - host of the TrackMania Roborace League

  • Hi all! I'm implementing TQC with n-step learning in Trackmania. I forked the original repo (https://github.com/trackmania-rl/tmrl); my modified version is here: https://github.com/Pheoxis/AITrackmania/tree/main. It compiles, but I am pretty sure that I implemented n-step learning incorrectly, and as a beginner I don't know what I did wrong. Here's my code before implementing the n-step algorithm: https://github.com/Pheoxis/AITrackmania/blob/main/tmrl/custom/custom_algorithms.py. If anyone could check what I did wrong, I would be very grateful. I will also attach some plots from my last training run and the output of the printed lines (print.txt); maybe it will help :) If you need any additional information, feel free to ask.

  • AITrackmania

  • softlearning

    Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.

  • # see https://github.com/rail-berkeley/softlearning/issues/60

  • stable-baselines3-contrib

    Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code

  • # https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/sb3_contrib/tqc/tqc.py :
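
For readers following the question, here is a minimal sketch of how an n-step target is commonly combined with the TQC truncation step. It is not taken from tmrl, AITrackmania, or stable-baselines3-contrib: the function names (n_step_return, truncated_target_quantiles, tqc_n_step_target) and parameters (drop_per_critic, entropy_term) are hypothetical, and plain Python lists stand in for tensors. It assumes the next-state quantiles were evaluated at the state reached after the n-step window, and that entropy_term is the entropy coefficient times the log-probability of the next action.

```python
from typing import List, Sequence, Tuple


def n_step_return(
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float,
) -> Tuple[float, float, bool]:
    """Accumulate the discounted reward over one window of transitions.

    Returns (reward_sum, bootstrap_discount, terminated), where
    reward_sum is sum_k gamma^k * r_{t+k}, cut off at the first terminal step,
    bootstrap_discount is gamma^m for the m steps actually used, and
    terminated says whether the window ended the episode.
    """
    reward_sum = 0.0
    discount = 1.0
    for reward, done in zip(rewards, dones):
        reward_sum += discount * reward
        discount *= gamma
        if done:
            # Do not accumulate rewards across an episode boundary.
            return reward_sum, discount, True
    return reward_sum, discount, False


def truncated_target_quantiles(
    next_quantiles: Sequence[Sequence[float]],
    drop_per_critic: int,
) -> List[float]:
    """Pool the quantiles of all critics, sort them, drop the largest ones."""
    pooled = sorted(q for critic in next_quantiles for q in critic)
    keep = len(pooled) - drop_per_critic * len(next_quantiles)
    return pooled[:keep]


def tqc_n_step_target(
    rewards: Sequence[float],
    dones: Sequence[bool],
    next_quantiles: Sequence[Sequence[float]],
    gamma: float,
    drop_per_critic: int,
    entropy_term: float = 0.0,
) -> List[float]:
    """Build target quantiles: R_n + gamma^m * (kept quantile - entropy term)."""
    reward_sum, discount, terminated = n_step_return(rewards, dones, gamma)
    kept = truncated_target_quantiles(next_quantiles, drop_per_critic)
    # No bootstrapping when the episode terminated inside the window.
    mask = 0.0 if terminated else 1.0
    return [reward_sum + mask * discount * (q - entropy_term) for q in kept]
```

Two details in this sketch are frequent sources of n-step bugs in general, though nothing here confirms they apply to the code in the post: the bootstrap term is scaled by gamma raised to the number of steps actually used rather than by gamma alone, and the accumulation stops at the first terminal transition so that rewards from the next episode never leak into the target.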
