ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
One improvement (especially from an OOP perspective) could be to have two abstract classes for on-policy and off-policy algorithms, since in general they behave differently at training time. That's an abstraction I implemented in my own RL playground; feel free to take a look if you're interested: https://github.com/diegochine/pyagents
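A minimal sketch of what that split could look like. All class and method names here are hypothetical (not taken from pyagents or this repo); the key distinction is that the on-policy agent discards its rollout after every update, while the off-policy agent keeps a persistent replay buffer:

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Shared interface: both families store transitions and train on them."""

    @abstractmethod
    def store(self, transition):
        ...

    @abstractmethod
    def train_step(self):
        ...


class OnPolicyAgent(Agent):
    """Collects a fresh rollout with the current policy, trains, then discards it."""

    def __init__(self):
        self.rollout = []  # cleared after every update

    def store(self, transition):
        self.rollout.append(transition)

    def train_step(self):
        batch = self.rollout
        self.rollout = []  # on-policy: data is stale after one update
        return self._update(batch)

    def _update(self, batch):
        return len(batch)  # placeholder for the actual gradient step


class OffPolicyAgent(Agent):
    """Samples from a persistent replay buffer that outlives each update."""

    def __init__(self, capacity=10000):
        self.buffer = []
        self.capacity = capacity

    def store(self, transition):
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)  # FIFO eviction when full
        self.buffer.append(transition)

    def train_step(self):
        return self._update(self.buffer)  # buffer is kept across updates

    def _update(self, batch):
        return len(batch)  # placeholder for the actual gradient step
```

PPO would subclass `OnPolicyAgent`, while something like DQN or SAC would subclass `OffPolicyAgent`; the training loop only needs the shared `Agent` interface.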
Another thing concerning PPO: it is well known that a lot of implementation details make a real difference with this algorithm, and most of them are not even mentioned in the original paper (vectorized environments, for example). So even though your code definitely follows the pseudocode in the paper, it most probably won't be able to solve non-trivial environments like Atari. If you're interested in these details, there is a nice ICLR blog post that goes through the original PPO code and analyses all of them.
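To illustrate the vectorized-environments detail: the idea is to step N environment copies in lockstep and auto-reset any that finish, so the learner always receives a full batch of N transitions per step. Below is a self-contained sketch with a toy environment (`ToyEnv` and `SyncVectorEnv` are illustrative names, not the API of any particular library):

```python
class ToyEnv:
    """Tiny stand-in environment: episode ends after 5 steps."""

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return float(self.t), 1.0, done


class SyncVectorEnv:
    """Steps N env copies in lockstep and auto-resets finished episodes,
    so every call yields a batch of N transitions for the learner."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rews, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                o = env.reset()  # auto-reset: the next obs starts a new episode
            obs.append(o)
            rews.append(r)
            dones.append(d)
        return obs, rews, dones


venv = SyncVectorEnv([ToyEnv for _ in range(4)])
obs = venv.reset()
for _ in range(5):
    obs, rews, dones = venv.step([0] * 4)
```

The auto-reset in `step` is the subtle part: the returned observation after a terminal step already belongs to the new episode, which the advantage computation has to account for via the `dones` flags.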