PPO-PyTorch vs autonomous-learning-library
| | PPO-PyTorch | autonomous-learning-library |
|---|---|---|
| Mentions | 2 | 2 |
| Stars | 1,493 | 639 |
| Growth | - | - |
| Activity | 2.8 | 7.6 |
| Last commit | 5 months ago | about 2 months ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PPO-PyTorch
Where does the loss function for Policy Gradient come from?
It's just very convenient implementation-wise: in just a few lines you can get the "loss" (from https://github.com/nikhilbarhate99/PPO-PyTorch/blob/master/PPO.py).
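To make the "few lines" concrete, here is a minimal sketch of the PPO clipped surrogate written as a "loss". It is not the code from the linked file: the function name `ppo_clipped_loss` and the parameter `eps_clip=0.2` are assumptions chosen for illustration, and it uses plain Python floats instead of tensors. In an autograd framework the same arithmetic would run on tensors, and minimizing the negated surrogate performs gradient ascent on the objective.

```python
import math


def ppo_clipped_loss(new_logps, old_logps, advantages, eps_clip=0.2):
    """Negative PPO clipped surrogate, averaged over samples.

    Hypothetical helper for illustration only; eps_clip=0.2 is an assumed default.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps_clip, 1 + eps_clip].
        clipped = min(max(ratio, 1.0 - eps_clip), 1.0 + eps_clip)
        # Pessimistic choice between the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the "loss" maximizes the surrogate.
    return -total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the value reduces to the negative mean advantage, which is the plain policy-gradient surrogate evaluated at the current parameters.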
A2C/PPO with continuous action space
In some methods, the actor network has two heads, one for the mean and one for the variance. In other methods, the network outputs only the mean, while the variance is predefined and decays throughout training.
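The two ways of obtaining the spread of a Gaussian policy can be sketched as follows. All names here (`std_from_head`, `decayed_std`, `gaussian_log_prob`) are hypothetical, the learned head is assumed to output a log standard deviation, and the fixed variant is assumed to decay linearly down to a floor. Real implementations may parameterize or schedule this differently.

```python
import math
import random


def std_from_head(log_std):
    """Variant 1: the network has a second head; assumed here to emit log(std)."""
    return math.exp(log_std)


def decayed_std(initial_std, decay_per_step, min_std, step):
    """Variant 2: predefined std with an assumed linear decay and a lower bound."""
    return max(min_std, initial_std - decay_per_step * step)


def gaussian_log_prob(action, mean, std):
    """Log-density of a 1-D Gaussian, as needed for the policy-gradient term."""
    var = std * std
    return (-((action - mean) ** 2) / (2.0 * var)
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


def sample_action(mean, std, rng=random):
    """Draw an action from N(mean, std^2) using the given random source."""
    return rng.gauss(mean, std)
```

Either variant feeds the same `gaussian_log_prob`, so the rest of the update does not need to know where the standard deviation came from. The difference is that only the first variant receives gradients through the spread.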
autonomous-learning-library
What's the best "Non-Black Box" framework for SOTA algorithms?
I find the Autonomous Learning Library well-designed and clean, even though it is modular to some degree.
Where do people get their algorithm implementations from?
I very strongly recommend the autonomous learning library: https://github.com/cpnota/autonomous-learning-library
What are some alternatives?
HandyRL - HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.
pytorch-a2c-ppo-acktr-gail - PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
l2rpn-baselines - L2RPN Baselines is a repository to host baselines for L2RPN competitions.
deep_rl_zoo - A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.
Pytorch-PCGrad - Pytorch reimplementation for "Gradient Surgery for Multi-Task Learning"
learning-to-drive-in-5-minutes - Implementation of a reinforcement learning approach to make a car learn to drive smoothly in minutes
cleanrl - High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Tetris-deep-Q-learning-pytorch - Deep Q-learning for playing Tetris
pytorch-accelerated - A lightweight library designed to accelerate the process of training PyTorch models by providing a minimal, but extensible training loop which is flexible enough to handle the majority of use cases, and capable of utilizing different hardware options with no code changes required. Docs: https://pytorch-accelerated.readthedocs.io/en/latest/
Meta-SAC - Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient - 7th ICML AutoML workshop 2020
nes-torch - Minimal PyTorch Library for Natural Evolution Strategies
Fleet-AI - Using Reinforcement Learning to play Battleship