-
cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Here's where the problem starts: after implementing a custom environment that follows the typical Gymnasium interface and using a slightly adjusted PPO implementation from CleanRL, I cannot get the agent to learn anything at all, even though this specific implementation seems to work just fine on basic Gymnasium examples. I am hoping the RL community here can help me with some useful pointers.
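One common first step when a custom environment will not train is to check the environment itself before touching the PPO code. Below is a minimal sketch of such a check, not the poster's actual setup. It runs random actions through any object that follows the Gymnasium convention (`reset` returns `(obs, info)`, `step` returns `(obs, reward, terminated, truncated, info)`) and summarizes returns and episode lengths. `CountToTargetEnv` and `random_rollout_report` are hypothetical names invented for illustration, and the toy environment only duck-types the interface so the snippet runs without any dependencies.

```python
import random


class CountToTargetEnv:
    """Hypothetical toy env that duck-types the Gymnasium reset/step API."""

    def __init__(self, target=5, max_steps=20):
        self.target = target
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None, options=None):
        # Gymnasium convention: reset returns (observation, info).
        self.position = 0
        self.steps = 0
        return float(self.position), {}

    def step(self, action):
        # Action 1 moves right, any other action moves left.
        self.position += 1 if action == 1 else -1
        self.steps += 1
        terminated = self.position == self.target
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else -0.01
        # Gymnasium convention: a 5-tuple with separate terminated/truncated.
        return float(self.position), reward, terminated, truncated, {}


def random_rollout_report(env, actions=(0, 1), episodes=50, seed=0):
    """Run a random policy and summarize returns and episode lengths."""
    rng = random.Random(seed)
    returns, lengths, n_terminated = [], [], 0
    for ep in range(episodes):
        obs, info = env.reset(seed=seed + ep)
        assert isinstance(info, dict), "reset() should return (obs, info)"
        ep_return, ep_len, done = 0.0, 0, False
        while not done:
            result = env.step(rng.choice(actions))
            assert len(result) == 5, "step() should return a 5-tuple"
            obs, reward, terminated, truncated, info = result
            ep_return += float(reward)
            ep_len += 1
            done = terminated or truncated
        n_terminated += int(terminated)
        returns.append(ep_return)
        lengths.append(ep_len)
    return {
        "episodes": episodes,
        "min_return": min(returns),
        "max_return": max(returns),
        "mean_return": sum(returns) / episodes,
        "min_length": min(lengths),
        "max_length": max(lengths),
        "terminated_fraction": n_terminated / episodes,
    }


if __name__ == "__main__":
    print(random_rollout_report(CountToTargetEnv()))
```

If the random policy never sees a non-trivial reward, if episodes never end, or if returns span many orders of magnitude, the problem is more likely in the environment (reward design, termination logic, observation scaling) than in the PPO code. For a real `gymnasium.Env` subclass, the library's own `gymnasium.utils.env_checker.check_env` is worth running as well.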