Debugging reinforcement learning


boardlaw

1 36 2.9 Python

Scaling scaling laws with board games.

The 'probe envs' section further down gives one method for achieving this. Here's a concrete example from my recent work, where I'm building out a parallel MCTS (tricky!). There are three tests in the section I've highlighted, all testing the ability of the MCTS to estimate the value of a state in increasingly complex circumstances. All the tests decisively pass or fail because I swapped out the env and agent for simple, deterministic variants. What's more, if, say, trivial_test (which uses a single player) passes but test_two_player fails, that tells me the problem has something to do with how I'm handling multiple players.
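To make the idea concrete, here is a minimal sketch of a probe-style test in Python. It is not boardlaw's code: `TrivialEnv`, `UniformAgent`, `Node`, `mcts_value` and `test_trivial_probe` are all hypothetical names, and the search is a plain sequential single-player PUCT loop rather than a parallel MCTS. The env is deterministic and ends after one move with a known reward, and the agent is a stub with uniform priors, so the correct root value is known exactly and the test can only pass or fail.

```python
import math


class TrivialEnv:
    """Hypothetical probe env: one player, one move, every action pays `reward`."""

    n_actions = 2

    def __init__(self, reward=1.0):
        self.reward = reward

    def initial(self):
        return 0  # the state is just the number of moves played

    def step(self, state, action):
        # Returns (next_state, reward, terminal); the episode ends after one move.
        return state + 1, self.reward, True


class UniformAgent:
    """Hypothetical stand-in for the network: uniform priors, zero value."""

    def evaluate(self, env, state):
        return [1.0 / env.n_actions] * env.n_actions, 0.0


class Node:
    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.total = 0.0  # summed return from the action leading here
        self.children = {}  # action -> Node

    def value(self):
        return self.total / self.visits if self.visits else 0.0


def mcts_value(env, agent, n_sims=32, c_puct=1.0):
    """Estimate the root value with a simple single-player PUCT search."""
    root = Node(prior=1.0)
    for _ in range(n_sims):
        node, state = root, env.initial()
        path, rewards, terminal = [root], [], False
        # Selection: follow the best-scoring child until we hit a leaf.
        while node.children and not terminal:
            parent = node

            def score(item):
                child = item[1]
                bonus = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
                return child.value() + bonus

            action, node = max(parent.children.items(), key=score)
            state, reward, terminal = env.step(state, action)
            path.append(node)
            rewards.append(reward)
        # Expansion: ask the stub agent for priors and a leaf value.
        leaf_value = 0.0
        if not terminal:
            priors, leaf_value = agent.evaluate(env, state)
            node.children = {a: Node(p) for a, p in enumerate(priors)}
        # Backup: accumulate the return from the leaf back up to the root.
        ret = leaf_value
        for i in reversed(range(len(path))):
            if i > 0:
                ret += rewards[i - 1]
            path[i].visits += 1
            path[i].total += ret
    # Use the visit-weighted mean of the children's action values, so the
    # stub agent's placeholder value for the root does not bias the estimate.
    children = root.children.values()
    return sum(c.total for c in children) / sum(c.visits for c in children)


def test_trivial_probe():
    # Every action pays the same reward, so the root value must match it exactly.
    assert mcts_value(TrivialEnv(reward=1.0), UniformAgent()) == 1.0
    assert mcts_value(TrivialEnv(reward=-1.0), UniformAgent()) == -1.0


test_trivial_probe()
```

A two-player variant of the same probe would add a player index to the state and flip the sign of the return at each level of the backup. If this single-player test passes and the two-player one fails, the sign handling is the first place to look.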


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

Show HN: Easily train AlphaZero-like agents on any environment you want
2 projects | news.ycombinator.com | 20 Dec 2023

I placed Stockfish (white) against ChatGPT (black). Here's how the game went.
1 project | /r/AnarchyChess | 10 Feb 2023

How to "fit" the output of the Critic to the dimension of the reward?
1 project | /r/reinforcementlearning | 8 Feb 2022

MuZero unable to solve non-slippery FrozenLake environment?
2 projects | /r/reinforcementlearning | 9 Aug 2021

RL for chess
2 projects | /r/reinforcementlearning | 5 Jun 2021


This page summarizes the projects mentioned and recommended in the original post on /r/reinforcementlearning
reinforcement-learning Alphazero
Post date: 26 May 2021
