gym
baselines
| | gym | baselines |
| --- | --- | --- |
| Mentions | 96 | 14 |
| Stars | 33,873 | 15,339 |
| Growth | 0.8% | 1.0% |
| Activity | 0.0 | 0.0 |
| Latest commit | 21 days ago | 5 months ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gym
- OpenAI Acquires Global Illumination
A co-founder announced they disbanded their robotics team a couple of years ago: https://venturebeat.com/business/openai-disbands-its-robotic...
That was around the same time they deprecated OpenAI Gym: https://github.com/openai/gym
- Shimmy 1.0: Gymnasium & PettingZoo bindings for popular external RL environments
This includes single-agent Gymnasium wrappers for DM Control, DM Lab, Behavior Suite, Arcade Learning Environment, OpenAI Gym V21 & V26. Multi-agent PettingZoo wrappers support DM Control Soccer, OpenSpiel and Melting Pot. For more information, read the release notes here:
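For a quick feel for what the bindings provide, here is a hedged sketch of the Gymnasium-side usage; the environment ID and the need for the shimmy import are assumptions, so check Shimmy's release notes for the registered names.

```python
import gymnasium
import shimmy  # assumed to register the compatibility environments on import

# With Shimmy and DM Control installed (e.g. pip install "shimmy[dm-control]"),
# DM Control tasks are exposed under namespaced Gymnasium IDs. The exact ID
# string below is an assumption.
env = gymnasium.make("dm_control/walker-walk-v0")
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()
```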
- Some confusion about variables and functions in mujoco-py
When I browse fetch_env.py, I have a question about the following code snippet:
- pip install stable-baselines3[extra]
Nvm, this works for me: '!pip install setuptools==65.5.0'. Source: https://github.com/openai/gym/issues/3176
- [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up
how would this interact/compare with https://github.com/openai/gym?
- What has replaced OpenAI Retro Gym?
- Understanding Reinforcement Learning
If you'd like to learn more about reinforcement learning or play with a number of samples in controlled environments, I highly recommend you look at the documentation for OpenAI's Gym library and particularly the basic usage page. OpenAI's Gym provides a standardized environment for performing reinforcement learning on classic Atari games and a few other platforms and should be an educational resource. If you'd like a more detailed example, check out this tutorial on Paperspace's blog.
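To make the recommendation concrete, here is a minimal loop in the spirit of Gym's basic usage page (a sketch assuming the five-value step API introduced in gym 0.26; older releases return a four-value tuple instead):

```python
import gym

# Make an environment, reset it, and step it with random actions.
env = gym.make("CartPole-v1")
obs, info = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # random policy as a placeholder
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```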
- Using the cross-entropy method to solve Frozen Lake
Frozen Lake is an OpenAI Gym environment in which an agent is rewarded for traversing a frozen surface from a start position to a goal position without falling through any perilous holes in the ice.
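As a rough illustration of the method, here is a sketch of tabular cross-entropy training on Frozen Lake; the episode counts and elite percentile are illustrative choices, not values from the original post.

```python
import numpy as np
import gym

# Cross-entropy method sketch: sample episodes from a stochastic tabular
# policy, keep the elite fraction by return, and refit the policy to the
# elite state-action pairs.
env = gym.make("FrozenLake-v1", is_slippery=False)
n_states, n_actions = env.observation_space.n, env.action_space.n
policy = np.full((n_states, n_actions), 1.0 / n_actions)  # uniform start

for _ in range(50):
    episodes = []
    for _ in range(100):
        state, _ = env.reset()
        traj, total, done = [], 0.0, False
        while not done:
            action = np.random.choice(n_actions, p=policy[state])
            next_state, reward, terminated, truncated, _ = env.step(action)
            traj.append((state, action))
            total += reward
            state, done = next_state, terminated or truncated
        episodes.append((total, traj))
    # keep episodes in the top 20% by return and refit to their actions
    cutoff = np.percentile([ret for ret, _ in episodes], 80)
    counts = np.zeros_like(policy)
    for ret, traj in episodes:
        if ret >= cutoff:
            for s, a in traj:
                counts[s, a] += 1
    visited = counts.sum(axis=1, keepdims=True) > 0
    policy = np.where(
        visited, counts / np.maximum(counts.sum(axis=1, keepdims=True), 1), policy
    )
```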
- Is there a publicly available state space model for the Lunar Lander environment?
- How to Create a Behavioral Cloning Bot to Play Online Games?
Typically a more relaxed approach is taken via reinforcement learning, but that requires being able to simulate the game from a given game state. Take a look at e.g. https://www.gymlibrary.dev/
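For contrast with the RL route, behavioral cloning itself reduces to supervised learning on recorded (observation, action) pairs. A minimal sketch with placeholder data and dimensions (all names and shapes here are hypothetical):

```python
import torch
import torch.nn as nn

# Fit a small policy network to demonstration data with cross-entropy loss.
obs_dim, n_actions = 8, 4  # hypothetical game-state size and action count
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Placeholder dataset: in practice these are observations and actions
# recorded from a human player.
demo_obs = torch.randn(1024, obs_dim)
demo_actions = torch.randint(0, n_actions, (1024,))

for epoch in range(10):
    logits = policy(demo_obs)
    loss = loss_fn(logits, demo_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```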
baselines
- What if I can't reproduce a UMAP exactly for a paper in revision?
I do not for the life of me pretend to really understand either GPUs or floating-point arithmetic, but I think the basic problem is that floating point arithmetic isn't associative, so the order of operations matters immensely. For reasons I don't fully grok, GPUs don't always dispatch the same inputs the same way, so these floating-point non-associativity discrepancies can pile up. See also here and here.
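The non-associativity is easy to demonstrate in plain Python, no GPU required:

```python
# Floating-point addition is not associative: each operation rounds, and
# the rounding depends on operand magnitudes, so summation order matters.
# GPU reductions may add terms in a different order on each run, which is
# one source of bit-level nondeterminism.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c)                  # 0.6000000000000001
print(a + (b + c))                  # 0.6
print((a + b) + c == a + (b + c))   # False
```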
- How to proceed further? (Learning RL)
Ah sorry, I misunderstood your post. It has helped me to code quite a few of them from scratch, but you can also check out https://github.com/openai/baselines or similar.
- Does the value of the reward matter?
Yes, this is a good point. I always normalize my rewards such that *returns* are around -3 to 3; the baselines implementation has a good example of this. Aside from normalizing returns, it's common to also normalize the advantages. Together this should handle any scale of rewards (I have games where scores range from 0-20 and games that range from 0-600,000, and I haven't found a problem so long as I normalize everything :) )
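A minimal sketch of the two normalizations described above; the class and function names are illustrative, loosely in the spirit of the VecNormalize-style return scaling found in the baselines family.

```python
import numpy as np

def normalize_advantages(advantages):
    """Standardize a batch of advantage estimates to zero mean, unit std."""
    advantages = np.asarray(advantages, dtype=np.float64)
    return (advantages - advantages.mean()) / (advantages.std() + 1e-8)

class ReturnScaler:
    """Scale rewards by a running std of the discounted return, so returns
    land in a small, consistent range regardless of the raw score scale."""
    def __init__(self, gamma=0.99):
        self.gamma, self.ret, self.rets = gamma, 0.0, []

    def scale(self, reward):
        self.ret = self.gamma * self.ret + reward
        self.rets.append(self.ret)
        return reward / (np.std(self.rets) + 1e-8)
```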
- How to tune hyperparameters in A2C/PPO?
I'm currently working with A2C. The model was able to learn OpenAI Pong; I ran this as a sanity check that I haven't introduced any bugs. Now I'm trying to make the model play Breakout, but even after 10M steps the model has made no significant progress. I'm using the baseline hyperparameters, which can be found here: https://github.com/openai/baselines/blob/master/baselines/a2c/a2c.py, except that my buffer size has ranged from 512 to 4096. I've noticed that entropy decreases extremely slowly for buffer sizes in that interval. So my questions are: how do I make entropy decrease, and how do I increase reward per buffer? I've tried decreasing the entropy coefficient to almost zero, but the model still acts very strangely.
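For reference, here is a sketch of where the entropy coefficient enters the A2C objective; the tensors are placeholders for one batch, and the 0.01 default matches the value in baselines' a2c.py.

```python
import torch

logits = torch.randn(32, 4, requires_grad=True)  # policy logits for one batch
advantages = torch.randn(32)                     # advantage estimates (placeholder)
actions = torch.randint(0, 4, (32,))             # actions taken

dist = torch.distributions.Categorical(logits=logits)
pg_loss = -(dist.log_prob(actions) * advantages).mean()
entropy = dist.entropy().mean()

# The entropy bonus is subtracted from the loss, so a larger ent_coef keeps
# the policy closer to uniform and entropy falls more slowly.
ent_coef = 0.01
loss = pg_loss - ent_coef * entropy
loss.backward()
```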
- Boycotting 2.0 or rather PoS
I used a multitude of agents to train it, but the best I found was A3C; there are a bunch of examples here you can use to test their performance (although they may require some tweaking).
- How to speed up off-policy algorithms?
I noticed that off-policy algorithms, including DQN, DDPG, and TD3, are implemented with a single environment in the various baselines and stable-baselines. Even if more environments were added, that alone wouldn't improve performance, because it would only add more fresh samples to the replay buffer(s). What are some ways to improve speed without major changes to the algorithms? The only thing I can think of is adding an on-policy update, as in ACER, but that changes the algorithms and I don't know whether it would improve or worsen convergence.
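One common approach is to collect experience from several environment copies while also increasing the number of gradient updates per environment step, so the extra samples actually translate into faster learning. A sketch with a naive loop over environments (names and counts are illustrative):

```python
import gym

# Gather transitions from several environment copies per iteration, then
# run multiple gradient updates so the update-to-data ratio stays constant.
# Random actions stand in for the current policy.
envs = [gym.make("CartPole-v1") for _ in range(8)]
obs = [env.reset()[0] for env in envs]
replay_buffer = []

for step in range(1000):
    for i, env in enumerate(envs):
        action = env.action_space.sample()
        next_obs, reward, terminated, truncated, _ = env.step(action)
        replay_buffer.append((obs[i], action, reward, next_obs, terminated))
        obs[i] = env.reset()[0] if (terminated or truncated) else next_obs
    # here one would run several gradient steps (e.g. one per environment)
    # against minibatches sampled from replay_buffer
```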
- Any beginner resources for RL in Robotics?
OpenAI baselines https://github.com/openai/baselines
- Atari BreakoutDeterministic-v4
Without seeing your source code/hyperparameters it's going to be difficult to give you advice. I would suggest comparing against good open-source implementations such as OpenAI Baselines and making sure you have implemented it correctly.
- Convergence of PPO
It might be worth comparing your implementation to the PPO1 and PPO2 implementations in OpenAI Baselines to see if they exhibit the same side effect: https://github.com/openai/baselines
- Using CNN in Reinforcement Learning
For Atari games, RGB is often not useful. However, channels are still needed because a single-frame observation is often not a good proxy of a state. For example in games like Breakout or Pong, a single screenshot doesn't tell you the direction to which the ball is moving. So typically we preprocess the observation and use channels to stack a few recent frames. You can take a look at https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py
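The stacking itself is simple; here is a minimal sketch in the spirit of the FrameStack wrapper in atari_wrappers.py (the class name and frame shape are illustrative):

```python
from collections import deque
import numpy as np

class FrameStacker:
    """Keep the k most recent preprocessed frames and stack them along the
    channel axis so the observation encodes motion (e.g. ball direction)."""
    def __init__(self, k=4, shape=(84, 84)):
        self.frames = deque([np.zeros(shape, np.uint8)] * k, maxlen=k)

    def push(self, frame):
        self.frames.append(frame)
        return np.stack(self.frames, axis=-1)  # shape (84, 84, k)
```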
What are some alternatives?
ml-agents - The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
dm_control - Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
carla - Open-source simulator for autonomous driving research.
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
tensorflow - An Open Source Machine Learning Framework for Everyone
gdrl - Grokking Deep Reinforcement Learning
Robotics Library (RL) - The Robotics Library (RL) is a self-contained C++ library for rigid body kinematics and dynamics, motion planning, and control.
open_spiel - OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
ppo-implementation-details - The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
rlcard - Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
gym-solutions - OpenAI Gym Solutions