rlai
d3rlpy
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
rlai
Python libraries for solving reinforcement learning problems implemented in OpenAI gym
I've worked through several OpenAI Gym environments with my RL library, which is based almost entirely on the RL textbook by Sutton and Barto (case studies here). No neural networks, nothing too fancy. But I do explore JAX for policy gradient methods / continuous control.
d3rlpy
Python libraries for solving reinforcement learning problems implemented in OpenAI gym
Conservative Q Learning TD error not converging
Hi, I am using the discrete Conservative Q-Learning (CQL) implementation in the d3rlpy library (https://github.com/takuseno/d3rlpy) to train an offline policy that optimizes mechanical ventilation treatment, using the MIMIC-III dataset (https://physionet.org/content/mimiciii-demo/1.4/).
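For context on the question above, here is a minimal NumPy sketch of how a discrete CQL objective is commonly written: a TD loss plus a conservative penalty that pushes down Q-values of actions not seen in the data. It is not d3rlpy's implementation. The function name `discrete_cql_loss`, its arguments, and the defaults for `gamma` and `alpha` are assumptions chosen for illustration.

```python
import numpy as np


def logsumexp(x, axis=-1):
    # Numerically stable log(sum(exp(x))) along the given axis.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def discrete_cql_loss(q, q_next_target, actions, rewards, dones,
                      gamma=0.99, alpha=1.0):
    """Hypothetical helper: returns (total, td_loss, conservative_penalty).

    q              : (batch, n_actions) Q-values for the current states
    q_next_target  : (batch, n_actions) target-network Q-values for next states
    actions        : (batch,) integer actions taken in the dataset
    rewards, dones : (batch,) rewards and episode-termination flags (0 or 1)
    """
    idx = np.arange(len(actions))
    q_data = q[idx, actions]  # Q(s, a) for the logged actions

    # One-step bootstrapped target; no bootstrapping on terminal transitions.
    target = rewards + gamma * (1.0 - dones) * q_next_target.max(axis=1)
    td_loss = 0.5 * np.mean((q_data - target) ** 2)

    # Conservative term: logsumexp over all actions minus the logged action's Q.
    conservative = np.mean(logsumexp(q, axis=1) - q_data)

    return td_loss + alpha * conservative, td_loss, conservative


# Tiny usage example with made-up numbers (not real MIMIC-III data).
q = np.zeros((2, 3))
q_next = np.zeros((2, 3))
total, td, penalty = discrete_cql_loss(
    q, q_next,
    actions=np.array([0, 1]),
    rewards=np.array([0.0, 0.0]),
    dones=np.array([1.0, 1.0]),
)
print(total, td, penalty)
```

Returning the two terms separately means the TD loss and the conservative penalty can be logged and inspected individually rather than only as a combined value.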
What are some alternatives?
habitat-lab - A modular high-level library to train embodied AI agents across a variety of tasks and environments.
cleanrl - High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Coursera_Reinforcement_Learning - Coursera Reinforcement Learning Specialization by University of Alberta & Alberta Machine Intelligence Institute
exorl - ExORL: Exploratory Data for Offline Reinforcement Learning
Minari - A standard format for offline reinforcement learning datasets, with popular reference datasets and related utilities
pytorch-a2c-ppo-acktr-gail - PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).