gym
baselines
| | gym | baselines |
|---|---|---|
| Mentions | 96 | 14 |
| Stars | 33,750 | 15,255 |
| Activity | 0.8% | 0.8% |
| Growth | 0.0 | 0.0 |
| Latest commit | about 1 month ago | 4 months ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gym
- Shimmy 1.0: Gymnasium & PettingZoo bindings for popular external RL environments
This includes single-agent Gymnasium wrappers for DM Control, DM Lab, Behavior Suite, Arcade Learning Environment, OpenAI Gym V21 & V26. Multi-agent PettingZoo wrappers support DM Control Soccer, OpenSpiel and Melting Pot. For more information, read the release notes here:
- [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up
how would this interact/compare with https://github.com/openai/gym?
- What has replaced OpenAI Retro Gym?
- Understanding Reinforcement Learning
If you'd like to learn more about reinforcement learning or play with a number of samples in controlled environments, I highly recommend you look at the documentation for OpenAI's Gym library, particularly the basic usage page. OpenAI's Gym provides a standardized environment for performing reinforcement learning on classic Atari games and a few other platforms, and it is a great educational resource. If you'd like a more detailed example, check out this tutorial on Paperspace's blog.
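The basic usage pattern the Gym docs teach is a reset/step loop. The sketch below follows the contract of recent Gym versions (`reset()` returns `(obs, info)`; `step()` returns a 5-tuple) but uses a tiny hypothetical stand-in environment so it runs without Gym installed:

```python
import random

class ToyEnv:
    """Hypothetical stand-in for a Gym environment, for illustration only:
    a counter that terminates the episode after 5 steps."""

    def reset(self, seed=None):
        random.seed(seed)
        self.t = 0
        return self.t, {}              # (observation, info), as in Gym >= 0.26

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        terminated = self.t >= 5       # the episode reached a terminal state
        truncated = False              # no time-limit cutoff in this toy env
        return self.t, reward, terminated, truncated, {}

env = ToyEnv()
obs, info = env.reset(seed=0)
total_reward, done = 0.0, False
while not done:
    action = random.choice([0, 1])     # a trained agent would choose here
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
```

With a real Gym env, only the `gym.make(...)` call and the action sampling (`env.action_space.sample()`) would differ; the loop shape is the same.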
- Using the cross-entropy method to solve Frozen Lake
Frozen Lake is an OpenAI Gym environment in which an agent is rewarded for traversing a frozen surface from a start position to a goal position without falling through any perilous holes in the ice.
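The cross-entropy method itself is simple: roll out episodes with a stochastic policy, keep the "elite" (successful) ones, and refit the policy to imitate their actions. A stdlib-only sketch on a hypothetical 1-D stand-in for Frozen Lake (walk right from cell 0 to the goal at cell 6; this toy omits the holes and slipperiness of the real env):

```python
import random
from collections import defaultdict

GOAL, STEP_LIMIT = 6, 20
N_STATES = GOAL + 1                    # cells 0..6; actions: 0 = left, 1 = right

def run_episode(policy):
    """Roll out one episode; return (reward, list of (state, action))."""
    s, trace = 0, []
    for _ in range(STEP_LIMIT):
        a = random.choices((0, 1), weights=policy[s])[0]
        trace.append((s, a))
        s = max(0, s + (1 if a == 1 else -1))
        if s == GOAL:
            return 1.0, trace          # reached the goal
    return 0.0, trace                  # ran out of steps

def cross_entropy_method(iterations=20, batch=50):
    policy = [[0.5, 0.5] for _ in range(N_STATES)]      # start uniform
    for _ in range(iterations):
        episodes = [run_episode(policy) for _ in range(batch)]
        elite = [tr for r, tr in episodes if r == 1.0]  # keep the winners
        if not elite:
            continue
        counts = defaultdict(lambda: [1.0, 1.0])        # Laplace smoothing
        for trace in elite:
            for s, a in trace:
                counts[s][a] += 1
        for s, (left, right) in counts.items():
            policy[s] = [left / (left + right), right / (left + right)]
    return policy

random.seed(0)
policy = cross_entropy_method()
```

On the real Frozen Lake you would instead rank episodes by return and keep a top percentile, since the slippery dynamics make a hard success/failure cutoff noisier.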
- How can we model an observation space of an env with different features and sizes?
After some googling, I have found that there are wrappers for normalization (https://github.com/openai/gym/blob/master/gym/wrappers/normalize.py)
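The idea behind that normalization wrapper is just online standardization: track a running mean and variance of observations (e.g. via Welford's algorithm) and rescale each one. A simplified 1-D sketch of the idea, not the library code:

```python
class RunningMeanStd:
    """Online mean/variance via Welford's algorithm, the same idea the
    linked NormalizeObservation wrapper is built on (simplified sketch)."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def var(self):
        return self.m2 / self.n if self.n > 1 else 1.0

    def normalize(self, x, eps=1e-8):
        return (x - self.mean) / (self.var + eps) ** 0.5

rms = RunningMeanStd()
for obs in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rms.update(obs)
```

For a mixed observation space (different features and sizes), you would keep one such tracker per feature, which is effectively what per-dimension normalization over a flattened observation does.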
- RL Agent Library to use graph in spaces
- What is the "state of the art" in terms of game AI?
Regarding competitive game AI, the papers from OpenAI / DeepMind give you insight into what is coming:
* Go: AlphaGo
* Dota: OpenAI Five
* StarCraft: AlphaStar
If you wanna have a go at it yourself, try this: https://github.com/openai/gym.
- [N] Gym 0.26.0 was just released, with the last breaking changes to the core Gym API, and it will be stable going forward; this is the stable version you want to finally upgrade all your things to
It's had docs for like 9 months now: https://www.gymlibrary.dev/
Release notes available here: https://github.com/openai/gym/releases/tag/0.26.0
baselines
- How to proceed further? (Learning RL)
Ah sorry, I misunderstood your post. It has helped me to code quite a few of them from scratch, but you can also check out https://github.com/openai/baselines or similar.
- How to tune hyperparameters in A2C-PPO?
I'm currently working with A2C. The model was able to learn Pong (I ran this as a sanity check that I haven't made any bugs). Now I'm trying to make the model play Breakout, but even after 10M steps the model has not made any significant progress. I'm using the baseline hyperparameters, which can be found here: https://github.com/openai/baselines/blob/master/baselines/a2c/a2c.py, except my buffer size has ranged from 512 to 4096. I've noticed that entropy decreases extremely slowly for buffer sizes in the interval I just gave. So my questions are: how do I make entropy decrease, and how do I increase reward per buffer? I've tried decreasing the entropy coefficient to almost zero, but it still acts very weirdly.
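For intuition on that entropy term: A2C's loss subtracts `entropy_coef` times the policy's entropy, so the coefficient directly trades exploration against commitment. A small stdlib sketch of categorical entropy (the loss layout in the comments is the standard A2C form, not code from baselines):

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a categorical action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # maximal entropy: log(4) ~ 1.386
peaked  = [0.97, 0.01, 0.01, 0.01]   # near-deterministic: entropy ~ 0.17

# Standard A2C objective (schematic):
#   loss = policy_loss + value_coef * value_loss - entropy_coef * entropy(pi)
# A larger entropy_coef keeps the policy closer to uniform (slower entropy
# decay, more exploration); shrinking it lets the policy commit sooner.
```

So slowly decreasing entropy means the policy is staying close to uniform; lowering the coefficient speeds the decay, but if the advantage estimates are noisy the policy may then commit to a bad action distribution, which matches the "acts very weirdly" symptom.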
- Boycotting 2.0 or rather PoS
I used a multitude of agents to train it but the best I found was A3C, there are a bunch of examples here you can use to test their performance (although they may require some tweaking).
- How to speed up off-policy algorithms?
I noticed that off-policy algorithms, including DQN, DDPG and TD3, are implemented with a single environment in both baselines and stable-baselines. And even if more environments were added, this won't affect performance, because it would only add more fresh samples to the replay buffer(s). What are some ways to improve speed without major changes to the algorithms? The only thing I can think of is adding an on-policy update like in ACER, but that changes the algorithms, and I don't know whether it would improve or worsen convergence.
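One common way to speed up wall-clock sampling without touching the algorithm is to step several environments in lockstep, so each collection step adds `n_envs` fresh transitions to the shared replay buffer (sample efficiency is unchanged; only throughput improves). A stdlib-only sketch with a hypothetical toy env:

```python
import random
from collections import deque

class ToyEnv:
    """Hypothetical stand-in env with a simplified (obs, reward, done) step."""
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        return self.t, random.random(), self.t >= 10   # fixed-length episodes

def collect(envs, buffer, steps, policy=lambda obs: 0):
    """Step all envs in lockstep; each loop iteration appends len(envs)
    transitions to the shared replay buffer instead of one."""
    obs = [env.reset() for env in envs]
    for _ in range(steps):
        for i, env in enumerate(envs):
            action = policy(obs[i])
            next_obs, reward, done = env.step(action)
            buffer.append((obs[i], action, reward, next_obs, done))
            obs[i] = env.reset() if done else next_obs

buffer = deque(maxlen=100_000)
envs = [ToyEnv() for _ in range(8)]
collect(envs, buffer, steps=50)    # 8 envs x 50 steps = 400 transitions
```

The remaining knob is the update-to-data ratio: with `n_envs` transitions per step you can either perform more gradient updates per collection step or simply fill the buffer faster with the same update schedule.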
- Any beginner resources for RL in Robotics?
OpenAI baselines https://github.com/openai/baselines
- Convergence of the PPO
It might be worth comparing your implementation to the DeepMind PPO1 & 2 ones to see if they have the same side effect: https://github.com/openai/baselines
What are some alternatives?
ml-agents - The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
carla - Open-source simulator for autonomous driving research.
tensorflow - An Open Source Machine Learning Framework for Everyone
dm_control - Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
open_spiel - OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
rlcard - Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
agents - TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
PaddlePaddle - PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment for deep learning & machine learning)
LightFM - A Python implementation of LightFM, a hybrid recommendation algorithm.
gensim - Topic Modelling for Humans