ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
One improvement (especially from an OOP perspective) could be to have two abstract classes for on-policy and off-policy algorithms, since in general they behave differently at training time. That's an abstraction I implemented in my own RL playground; feel free to take a look if you're interested: https://github.com/diegochine/pyagents
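A minimal sketch of what that split could look like. All class and method names here are hypothetical (not taken from pyagents or this repo); the key distinction is that the on-policy agent discards its rollout after every update, while the off-policy agent keeps a persistent replay buffer:

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Shared interface: both families store transitions and train on them."""

    @abstractmethod
    def store(self, transition):
        ...

    @abstractmethod
    def train_step(self):
        ...


class OnPolicyAgent(Agent):
    """Collects a fresh rollout with the current policy, trains, then discards it."""

    def __init__(self):
        self.rollout = []  # cleared after every update

    def store(self, transition):
        self.rollout.append(transition)

    def train_step(self):
        batch = self.rollout
        self.rollout = []  # on-policy: data is stale after one update
        return self._update(batch)

    def _update(self, batch):
        return len(batch)  # placeholder for the actual gradient step


class OffPolicyAgent(Agent):
    """Samples from a persistent replay buffer that outlives each update."""

    def __init__(self, capacity=10000):
        self.buffer = []
        self.capacity = capacity

    def store(self, transition):
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)  # FIFO eviction when full
        self.buffer.append(transition)

    def train_step(self):
        return self._update(self.buffer)  # buffer is kept across updates

    def _update(self, batch):
        return len(batch)  # placeholder for the actual gradient step
```

PPO would subclass `OnPolicyAgent`, while something like DQN or SAC would subclass `OffPolicyAgent`; the training loop only needs the shared `Agent` interface.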
Another thing concerning PPO: it is well known that a lot of implementation details make a real difference with this algorithm, and most of them are not even mentioned in the original paper (vectorized environments, for example). So even though your code definitely follows the pseudocode in the paper, it most probably won't be able to solve non-trivial environments like Atari. If you're interested in these details, there is a nice ICLR blog post that goes through the original PPO code and analyses all of them.
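To illustrate the vectorized-environments detail: the idea is to step N environment copies in lockstep and auto-reset any that finish, so the learner always receives a full batch of N transitions per step. Below is a self-contained sketch with a toy environment (`ToyEnv` and `SyncVectorEnv` are illustrative names, not the API of any particular library):

```python
class ToyEnv:
    """Tiny stand-in environment: episode ends after 5 steps."""

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return float(self.t), 1.0, done


class SyncVectorEnv:
    """Steps N env copies in lockstep and auto-resets finished episodes,
    so every call yields a batch of N transitions for the learner."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rews, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                o = env.reset()  # auto-reset: the next obs starts a new episode
            obs.append(o)
            rews.append(r)
            dones.append(d)
        return obs, rews, dones


venv = SyncVectorEnv([ToyEnv for _ in range(4)])
obs = venv.reset()
for _ in range(5):
    obs, rews, dones = venv.step([0] * 4)
```

The auto-reset in `step` is the subtle part: the returned observation after a terminal step already belongs to the new episode, which the advantage computation has to account for via the `dones` flags.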