stable-baselines
SuperSuit
| | stable-baselines | SuperSuit |
|---|---|---|
| Mentions | 10 | 4 |
| Stars | 4,000 | 430 |
| Growth | - | 1.4% |
| Activity | 0.0 | 8.0 |
| Latest commit | over 1 year ago | about 1 month ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
stable-baselines
- Distributed implementation tips
As underlined by gold-panda, you can give it a try with multiprocessing. I once implemented a version based on what is done in stable_baselines v1 (https://github.com/hill-a/stable-baselines/blob/master/stable_baselines/common/vec_env/subproc_vec_env.py)
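The idea behind the linked subproc_vec_env.py is to run each environment in its own worker process and talk to it over a pipe. A minimal stand-alone sketch of that pattern, using only the standard library (the `ToyEnv` and class names here are illustrative placeholders, not stable-baselines' actual code):

```python
# Sketch of a SubprocVecEnv-style vectorized environment using only the
# standard library. The real stable_baselines implementation handles
# spaces, auto-reset, and more; this only shows the process/pipe pattern.
import multiprocessing as mp

# The fork start method keeps the sketch simple (POSIX-only assumption).
_CTX = mp.get_context("fork")

class ToyEnv:
    """Stand-in environment: observation is a step counter, done after 3 steps."""
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        return self.t, float(action), self.t >= 3, {}

def _worker(conn, env_fn):
    # Each worker owns one environment and serves commands from the pipe.
    env = env_fn()
    while True:
        cmd, data = conn.recv()
        if cmd == "reset":
            conn.send(env.reset())
        elif cmd == "step":
            conn.send(env.step(data))
        elif cmd == "close":
            conn.close()
            break

class SubprocVecEnvSketch:
    def __init__(self, env_fns):
        self.conns, self.procs = [], []
        for fn in env_fns:
            parent, child = _CTX.Pipe()
            p = _CTX.Process(target=_worker, args=(child, fn), daemon=True)
            p.start()
            self.conns.append(parent)
            self.procs.append(p)

    def reset(self):
        for c in self.conns:
            c.send(("reset", None))
        return [c.recv() for c in self.conns]

    def step(self, actions):
        # Send all actions first so the environments step in parallel.
        for c, a in zip(self.conns, actions):
            c.send(("step", a))
        obs, rews, dones, infos = zip(*[c.recv() for c in self.conns])
        return list(obs), list(rews), list(dones), list(infos)

    def close(self):
        for c in self.conns:
            c.send(("close", None))
        for p in self.procs:
            p.join()

venv = SubprocVecEnvSketch([ToyEnv, ToyEnv])
first_obs = venv.reset()   # one observation per worker: [0, 0]
venv.close()
```

The key design point is that `step` sends every action before receiving any result, so the workers run their environment steps concurrently rather than one after another.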
- GAIL without actions?
Found relevant code at https://github.com/hill-a/stable-baselines + all code implementations here
- Best framework to use if learning today
Depends on what you want to do. The universal answer would be https://stable-baselines.readthedocs.io/
- weird mean reward graph
As you will see here, it is recommended to augment this safety measure with a target KL divergence, which ensures even smoother learning and enforces early stopping to prevent learning collapse.
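The early-stopping idea mentioned above can be shown in a few lines: stop a PPO update's inner epochs once the approximate KL divergence between the old and new policies drifts past a threshold. This is a hedged sketch in plain Python; the helper names, the `1.5 *` margin, and the `target_kl=0.03` default are illustrative, not taken from any specific library:

```python
# Sketch of target-KL early stopping for PPO's inner epoch loop.
# Assumption: log-probabilities are plain Python floats; real code
# would use tensors and take a gradient step per epoch.

def approx_kl(old_logprobs, new_logprobs):
    # Common first-order KL estimate: mean(old_logp - new_logp).
    return sum(o - n for o, n in zip(old_logprobs, new_logprobs)) / len(old_logprobs)

def ppo_epochs_with_early_stop(old_logprobs, new_logprobs_per_epoch,
                               target_kl=0.03):
    """Run scheduled epochs, stopping when the policy drifts too far."""
    completed = 0
    for new_logprobs in new_logprobs_per_epoch:
        kl = approx_kl(old_logprobs, new_logprobs)
        if kl > 1.5 * target_kl:   # margin before stopping (illustrative)
            break                  # abort remaining epochs for this batch
        completed += 1             # (the gradient step would happen here)
    return completed

# Small drift passes the check, large drift triggers the stop:
n = ppo_epochs_with_early_stop([0.0, 0.0], [[-0.01, -0.01], [-0.1, -0.1]])
# n == 1: the second epoch's KL of 0.1 exceeds 1.5 * 0.03
```

The point of the margin is to let the policy move a little each epoch while still cutting the update short before a single large batch pushes it into the collapse regime the answer above describes.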
- Nvidia ISAAC gym/RL
Code for https://arxiv.org/abs/1707.06347 found: https://github.com/hill-a/stable-baselines
- Bounds for observation
- Understanding multi agent learning in OpenAI gym and stable-baselines
I haven't read the code, but stable-baselines doesn't support multi-agent environments (https://github.com/hill-a/stable-baselines/issues/423), so I think they're trying to make multi-agent learning easier with Environment.train().
- Using Reinforcement Learning to beat the first boss in Dark Souls 3 with Proximal Policy Optimization
- Reinforcement Learning Crash Course (Free)
- https://github.com/hill-a/stable-baselines (TensorFlow)
- JAX Implementations of Actor-Critic Algorithms
- tf2 speed: https://github.com/hill-a/stable-baselines/issues/576#issuecomment-573331715
SuperSuit
- What is a wrapper in RL?
"SuperSuit is a library that includes all commonly used wrappers in RL (frame stacking, observation normalization, etc.) for PettingZoo and Gym environments with a nice API. We developed it in lieu of wrappers built into PettingZoo. https://github.com/Farama-Foundation/SuperSuit "
- Simple (few states) two-agent environments?
+1 on PettingZoo, and the wrappers they provide as SuperSuit come in handy as well! Also check out OpenSpiel.
- Take a look at SuperSuit - it contains mature versions of all common preprocessing wrappers for Gym environments, including ones that accept lambda functions for observations/actions/rewards.
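To make the two wrapper styles mentioned above concrete, here is a minimal stand-alone sketch of a frame-stacking wrapper and an observation-lambda wrapper. `ToyEnv` and the class names are illustrative stand-ins, not SuperSuit's actual implementation (SuperSuit's versions also handle observation spaces, vector observations, and so on):

```python
# Illustrative sketch of two common RL wrapper styles: frame stacking and
# an observation-lambda wrapper. Assumption: a Gym-like env API with
# reset() -> obs and step(action) -> (obs, reward, done, info).
from collections import deque

class ToyEnv:
    """Stand-in environment whose observation is a step counter."""
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        return self.t, 0.0, False, {}

class FrameStack:
    """Return a tuple of the last `k` observations."""
    def __init__(self, env, k):
        self.env, self.k = env, k
        self.frames = deque(maxlen=k)
    def reset(self):
        obs = self.env.reset()
        self.frames.extend([obs] * self.k)   # pad history with first frame
        return tuple(self.frames)
    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        self.frames.append(obs)              # maxlen drops the oldest frame
        return tuple(self.frames), rew, done, info

class ObservationLambda:
    """Apply an arbitrary function to every observation."""
    def __init__(self, env, fn):
        self.env, self.fn = env, fn
    def reset(self):
        return self.fn(self.env.reset())
    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        return self.fn(obs), rew, done, info

# Wrappers compose: stack 3 frames, then reduce the stack with a lambda.
env = ObservationLambda(FrameStack(ToyEnv(), 3), lambda frames: sum(frames))
stacked = env.reset()   # frames are (0, 0, 0), so the lambda returns 0
```

Composition is the point: because each wrapper exposes the same `reset`/`step` interface as the environment it wraps, preprocessing steps can be chained in any order.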
- Understanding multi agent learning in OpenAI gym and stable-baselines
Multi-agent isn’t supported by default in stable baselines, but you can make it work with PettingZoo. This example trains a single policy to control every agent in an environment (Parameter sharing). You could use these SuperSuit wrappers to work with other methods (self-play, independent learning, etc) but you would probably need to write some custom training code. https://github.com/PettingZoo-Team/SuperSuit#parallel-environment-vectorization
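The parameter-sharing idea in the answer above reduces to this: one set of policy parameters produces the action for every agent, so a single-agent trainer can learn a multi-agent task. A toy sketch with a stand-in policy and a dict of per-agent observations (everything here is illustrative; with the real libraries, SuperSuit's vectorization is what feeds the per-agent transitions to a stable-baselines-style trainer):

```python
# Minimal sketch of parameter sharing in a multi-agent step: every agent's
# action comes from the SAME policy object, so only one parameter set is
# ever trained. SharedPolicy and run_joint_step are hypothetical names.

class SharedPolicy:
    def __init__(self, weight=2):
        self.weight = weight              # the single shared parameter set

    def act(self, obs):
        # Trivial "policy": scale the observation by the shared weight.
        return obs * self.weight

def run_joint_step(observations, policy):
    """Map each agent's observation to an action via the shared policy."""
    return {agent: policy.act(obs) for agent, obs in observations.items()}

policy = SharedPolicy()
actions = run_joint_step({"agent_0": 1, "agent_1": 3}, policy)
# Both actions come from the same weight, so updating it from any agent's
# experience improves the behavior of every agent at once.
```

Self-play and independent learning differ exactly here: independent learning would hold one `SharedPolicy`-like object per agent, which is why the answer notes those setups need custom training code.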
What are some alternatives?
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Ray - Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
stable-baselines - Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
rl-baselines3-zoo - A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
PettingZoo - An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
Super-mario-bros-PPO-pytorch - Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
open_spiel - OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
Tic-Tac-Toe-Gym - This is the Tic-Tac-Toe game made with Python using the PyGame library and the Gym library to implement the AI with Reinforcement Learning
kaggle-environments
DI-engine - OpenDILab Decision AI Engine
gym