-
softlearning
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
-
stable-baselines3-contrib
Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
Hi all! I'm implementing TQC with n-step learning in Trackmania (I forked the original repo from https://github.com/trackmania-rl/tmrl; my modified version is here: https://github.com/Pheoxis/AITrackmania/tree/main). The code runs, but I'm pretty sure I implemented n-step learning incorrectly, and as a beginner I can't tell what I did wrong. Here's my code before implementing the n-step algorithm: https://github.com/Pheoxis/AITrackmania/blob/main/tmrl/custom/custom_algorithms.py. If anyone could check what I did wrong, I would be very grateful. I will also attach some plots from my last training run and the output of my print statements (print.txt); maybe they will help :) If you need any additional information, feel free to ask.
# see https://github.com/rail-berkeley/softlearning/issues/60
# https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/sb3_contrib/tqc/tqc.py :
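For reference when checking an n-step implementation like the one asked about above, here is a minimal sketch of how an n-step target is usually built for one stored trajectory. It is not taken from the linked repos: the function name `n_step_targets` and its arguments are hypothetical, and `next_values[t]` stands in for whatever bootstrap value the algorithm uses for the state after step `t` (in TQC that would be the truncated target quantiles minus the entropy term, reduced to a scalar here for simplicity). The points to compare against are that rewards stop accumulating at a terminal step, that the bootstrap is discounted by gamma raised to the number of steps actually taken, and that there is no bootstrap after a terminal step.

```python
import numpy as np


def n_step_targets(rewards, dones, next_values, gamma=0.99, n=3):
    """Compute n-step targets for every step of a single trajectory.

    rewards[t]     : reward received after acting at step t
    dones[t]       : True if the episode terminated at step t
    next_values[t] : bootstrap estimate for the state reached after step t
                     (hypothetical scalar stand-in for the critic target)
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=bool)
    next_values = np.asarray(next_values, dtype=np.float64)

    T = len(rewards)
    targets = np.zeros(T)
    for t in range(T):
        ret = 0.0
        discount = 1.0
        last = t
        # Accumulate up to n rewards, stopping early at a terminal step
        # or at the end of the stored trajectory.
        for k in range(t, min(t + n, T)):
            ret += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                break
        # Bootstrap only if the last step used was not terminal.
        # `discount` is now gamma ** (number of rewards summed).
        if not dones[last]:
            ret += discount * next_values[last]
        targets[t] = ret
    return targets


# With n=1 this reduces to the usual one-step target r + gamma * (1 - done) * V(s').
print(n_step_targets([1, 1, 1, 1], [False, False, False, True],
                     [10, 10, 10, 10], gamma=0.5, n=2))
```

Keep in mind that in an off-policy setting such as SAC or TQC, uncorrected n-step returns are biased because the intermediate actions came from an older policy, so a small n is the usual compromise.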