CleanRL Alternatives
Similar projects and alternatives to cleanrl
-
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
-
stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
-
PettingZoo
An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
-
wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
-
machin
Reinforcement learning library (framework) designed for PyTorch; implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...
-
salina
Discontinued: a lightweight library for sequential learning agents, including reinforcement learning
-
Deep-Reinforcement-Learning-Algorithms-with-PyTorch
PyTorch implementations of deep reinforcement learning algorithms and environments
-
dopamine
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
cleanrl reviews and mentions
-
[P] PettingZoo 1.24.0 has been released (including Stable-Baselines3 tutorials)
PettingZoo 1.24.0 is now live! This release includes Python 3.11 support, updated Chess and Hanabi environment versions, and many bugfixes, documentation updates and testing expansions. We are also very excited to announce 3 tutorials using Stable-Baselines3, and a full training script using CleanRL with TensorBoard and WandB.
-
PPO agent for "2048": help requested
Here's where the problem starts: after implementing a custom environment that follows the typical gymnasium interface and using a slightly adjusted PPO implementation from CleanRL, I cannot get the agent to learn anything at all, even though this specific implementation seems to work just fine on basic gymnasium examples. I am hoping the RL community here can give me some useful pointers.
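For context, the gymnasium-style interface the poster refers to boils down to a `reset()`/`step()` contract. The sketch below is illustrative only (not the poster's code, and it does not depend on gymnasium itself): a heavily simplified "2048"-like environment whose board mechanics are placeholders, shown just to make the expected return signatures concrete.

```python
import random

class Toy2048Env:
    """Illustrative sketch of the gymnasium-style reset()/step() interface
    a custom "2048" environment would follow. Board mechanics are
    deliberately simplified placeholders, not real 2048 rules."""

    def __init__(self, size=4, max_steps=100):
        self.size = size
        self.max_steps = max_steps

    def reset(self, seed=None):
        self.rng = random.Random(seed)
        self.board = [0] * (self.size * self.size)
        self.board[self.rng.randrange(len(self.board))] = 2  # spawn one tile
        self.steps = 0
        return list(self.board), {}  # observation, info

    def step(self, action):  # action in {0, 1, 2, 3}: up/down/left/right
        self.steps += 1
        empties = [i for i, v in enumerate(self.board) if v == 0]
        if empties:  # placeholder dynamics: just spawn another tile
            self.board[self.rng.choice(empties)] = 2
        reward = 0.0                          # real env: reward for merges
        terminated = not empties              # board was full -> episode over
        truncated = self.steps >= self.max_steps
        return list(self.board), reward, terminated, truncated, {}
```

A common failure mode when an agent "learns nothing" is an interface bug in exactly these return values (e.g. swapping `terminated` and `truncated`, or returning the old 4-tuple API), which is worth ruling out before touching the PPO code.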
-
[P] 10x faster reinforcement learning hyperparameter optimization than SOTA - now with distributed training!
-
PPO ignores high rewards in deterministic system
Try out a standard implementation with some standard parameters from here: https://github.com/vwxyzjn/cleanrl/tree/master/cleanrl
-
SB3 - NotImplementedError: Box([-1. -1. -8.], [1. 1. 8.], (3,), <class 'numpy.float32'>) observation space is not supported
I am trying to run cleanrl on the `Pendulum-v1` environment. I did that by going here and changing the default `env-id` to `parser.add_argument("--env-id", type=str, default="Pendulum-v1", ...`
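Editing the source isn't strictly necessary for this: CleanRL's scripts expose the environment id as an argparse option, so it can be overridden from the command line. A minimal sketch of that pattern (the default value and help text here are assumptions for illustration):

```python
import argparse

# Sketch of the argparse pattern CleanRL's training scripts use: because
# the env id is a command-line option, it can be overridden at launch time
# instead of editing the default in the source file.
parser = argparse.ArgumentParser()
parser.add_argument("--env-id", type=str, default="CartPole-v1",
                    help="the id of the gymnasium environment")
args = parser.parse_args(["--env-id", "Pendulum-v1"])  # as if passed on the CLI
print(args.env_id)
```

So running the script with `--env-id Pendulum-v1` achieves the same thing as changing the default.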
-
Cartpole and mountain car
-
cleanrl gym issues
git clone https://github.com/vwxyzjn/cleanrl.git && cd cleanrl && poetry install
-
Why is my Soft Actor Critic Algorithm not learning?
Can someone please help me debug my implementation of SAC? Please let me know if you have any questions. I tried comparing my work with CleanRL and caught a couple of errors. However, my implementation diverges quite a lot from theirs, as I wanted to test my understanding.
-
Model-based hierarchical reinforcement learning
Shameless self-plug: as far as implementation is concerned, I am working on a (hopefully) easier-to-understand Dreamer architecture under the CleanRL library, toward also re-implementing Director, Dreamer-v3, and a JAX variant for faster training.
-
[P] Robust Policy Optimization is now in CleanRL 🔥!
Happy to share that CleanRL now has a new algorithm called Robust Policy Optimization — 5 lines of code change to PPO to get better performance in 57 out of 61 continuous action envs 🚀 (e.g., dm_control)
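The "5 lines of code" change in RPO is, as described in the paper, a perturbation of the Gaussian policy's mean with uniform noise during the update. A hedged, simplified sketch of that core idea (simplified from the actual CleanRL diff; the function name and default `rpo_alpha` are assumptions for illustration):

```python
import random

def rpo_perturbed_mean(action_mean, rpo_alpha=0.5, rng=random):
    """Sketch of the core RPO idea: during the policy update, perturb the
    mean of the Gaussian action distribution with uniform noise
    z ~ U(-rpo_alpha, rpo_alpha) before recomputing log-probs, which keeps
    the policy's entropy from collapsing. (Simplified illustration, not
    the actual CleanRL code.)"""
    z = rng.uniform(-rpo_alpha, rpo_alpha)
    return action_mean + z
```

In the full algorithm, everything else is identical to PPO; only the mean fed into the action distribution during training is perturbed this way.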
Stats
vwxyzjn/cleanrl is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.
The primary programming language of cleanrl is Python.