Top 9 policy-gradient Open-Source Projects

tianshou

8 7,528 9.5 Python

An elegant PyTorch deep reinforcement learning library.

Project mention: Is it better to not use the Target Update Frequency in Double DQN or depends on the application? | /r/reinforcementlearning | 2023-07-05

The tianshou implementation I found at https://github.com/thu-ml/tianshou/blob/master/tianshou/policy/modelfree/dqn.py is DQN by default.

PPO-PyTorch

2 1,527 2.8 Python

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
HandyRL

1 282 4.3 Python

HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.
pytorch-learn-reinforcement-learning

3 143 0.0 Python

A collection of various RL algorithms like policy gradients, DQN and PPO. The goal of this repo is to become a go-to resource for learning about RL: how to visualize, debug and solve RL problems. I've additionally included playground.py for learning more about OpenAI gym, etc.
episodic-transformer-memory-ppo

5 120 3.4 Python

Clean baseline implementation of PPO using an episodic TransformerXL memory

Project mention: Question about Transformer model input in RL | /r/reinforcementlearning | 2023-06-17

Check out this implementation https://github.com/MarcoMeter/episodic-transformer-memory-ppo

recurrent-ppo-truncated-bptt

6 108 3.2 Jupyter Notebook

Baseline implementation of recurrent PPO using truncated BPTT
nes-torch

3 17 3.6 Python

Minimal PyTorch Library for Natural Evolution Strategies
pbo

1 14 3.0 Python

Policy-based optimization: single-step policy gradient seen as an evolution strategy
snakeAI

1 10 5.3 C++

Testing MLP, DQN, PPO, SAC, and policy-gradient methods on the game Snake

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

policy-gradient discussion

policy-gradient related posts

Is it better to not use the Target Update Frequency in Double DQN or depends on the application?

1 project | /r/reinforcementlearning | 5 Jul 2023
Can they come back?

1 project | /r/real_China_irl | 25 Feb 2023
Multi-Agent Stable Baselines

2 projects | /r/reinforcementlearning | 2 Feb 2023
Question about the old policy and new policy in TRPO code

4 projects | /r/reinforcementlearning | 6 Jul 2022
Tensorflow vs PyTorch for A3C

4 projects | /r/reinforcementlearning | 17 Nov 2021
"Tianshou: a Highly Modularized Deep Reinforcement Learning Library", Weng et al 2021 (Python PyTorch MuJuCo; PPO, DQN, A2C, DDPG, SAC, TD3, REINFORCE, NPG, TRPO, ACKTR)

1 project | /r/ResearchML | 12 Aug 2021
"Tianshou: a Highly Modularized Deep Reinforcement Learning Library", Weng et al 2021 (Python PyTorch MuJuCo; PPO, DQN, A2C, DDPG, SAC, TD3, REINFORCE, NPG, TRPO, ACKTR)

1 project | /r/reinforcementlearning | 10 Aug 2021

Index

What are some of the best open-source policy-gradient projects? This list will help you:

#	Project	Stars
1	tianshou	7,528
2	PPO-PyTorch	1,527
3	HandyRL	282
4	pytorch-learn-reinforcement-learning	143
5	episodic-transformer-memory-ppo	120
6	recurrent-ppo-truncated-bptt	108
7	nes-torch	17
8	pbo	14
9	snakeAI	10
