PPO agent for "2048": help requested

cleanrl

41 4,459 6.3 Python

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Here's where the problem starts: after implementing a custom environment that follows the standard gymnasium interface and using a slightly adjusted PPO implementation from CleanRL, I cannot get the agent to learn anything at all, even though this specific implementation seems to work just fine on basic gymnasium examples. I am hoping the RL community here can help me with some useful pointers.
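For readers who want something concrete to compare against, here is a minimal sketch of what a 2048 environment with gymnasium-style `reset`/`step` signatures might look like. It is not the poster's environment and does not import gymnasium. The class name `Game2048Env`, the action numbering, the log2 observation encoding, and the merge-score reward are all assumptions made for illustration.

```python
import random

def merge_row_left(row):
    """Slide one row to the left, merging equal neighbours once. Returns (new_row, score_gained)."""
    tiles = [v for v in row if v]
    merged, gained, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            gained += tiles[i] * 2
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged)), gained

def move(board, action):
    """Apply an action without mutating `board`. Assumed numbering: 0=left, 1=right, 2=up, 3=down."""
    if action in (2, 3):                      # vertical moves: work on the transpose
        board = [list(col) for col in zip(*board)]
    if action in (1, 3):                      # right/down: mirror so the merge goes left
        board = [row[::-1] for row in board]
    rows, gained = [], 0
    for row in board:
        new_row, g = merge_row_left(row)
        rows.append(new_row)
        gained += g
    if action in (1, 3):
        rows = [row[::-1] for row in rows]
    if action in (2, 3):
        rows = [list(col) for col in zip(*rows)]
    return rows, gained

class Game2048Env:
    """Hypothetical 4x4 environment mirroring gymnasium's reset/step return shapes."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.board = [[0] * 4 for _ in range(4)]

    def _spawn(self):
        empties = [(r, c) for r in range(4) for c in range(4) if self.board[r][c] == 0]
        if empties:
            r, c = self.rng.choice(empties)
            self.board[r][c] = 4 if self.rng.random() < 0.1 else 2

    def _obs(self):
        # Assumed encoding: log2 of each tile (0 for empty), flattened to 16 values.
        return [v.bit_length() - 1 if v else 0 for row in self.board for v in row]

    def reset(self, seed=None):
        if seed is not None:
            self.rng = random.Random(seed)
        self.board = [[0] * 4 for _ in range(4)]
        self._spawn()
        self._spawn()
        return self._obs(), {}

    def step(self, action):
        new_board, gained = move(self.board, action)
        moved = new_board != self.board
        if moved:                             # an invalid move leaves the state unchanged
            self.board = new_board
            self._spawn()
        terminated = all(move(self.board, a)[0] == self.board for a in range(4))
        return self._obs(), float(gained), terminated, False, {"moved": moved}

def random_rollout(env, max_steps=10_000):
    """Play uniformly random actions; returns (total_reward, steps_taken)."""
    env.reset()
    total, steps = 0.0, 0
    for steps in range(1, max_steps + 1):
        _, reward, terminated, _, _ = env.step(env.rng.randrange(4))
        total += reward
        if terminated:
            break
    return total, steps

if __name__ == "__main__":
    print(random_rollout(Game2048Env(seed=0)))
```

A random rollout like this gives a baseline return to compare the PPO agent against. In this sketch an invalid move returns the same observation with zero reward, which is worth keeping in mind when inspecting what the agent actually sees during training.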
