Understanding multi-agent learning in OpenAI Gym and stable-baselines

This page summarizes the projects mentioned and recommended in the original post on /r/reinforcementlearning.

  • kaggle-environments

  • The ks_env.train() method seems to be the one from kaggle_environments.Environment:

  • stable-baselines3

    PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

  • Q1. However, I got confused. Why does the environment contain a train() method? That is, why should the environment do the training? I feel train() should be part of the model: the article above uses the PPO algorithm, which has a train() method. PPO.train() is called when we call PPO.learn(), which makes sense.

  • stable-baselines

    A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

  • I haven't read the code, but stable-baselines doesn't support multi-agent environments (https://github.com/hill-a/stable-baselines/issues/423), so I think they're trying to make multi-agent learning easier with Environment.train().
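
One way to read the answer above: a train() method on an environment does not have to do any learning. It can simply fix the other players and hand back an object with reset() and step(), so that a single-agent algorithm can drive one seat. The following is a toy sketch of that idea only. ToyEnvironment, _Trainer, always_two and the "higher number wins" game are all hypothetical names made up for illustration; this is not the kaggle_environments code.

```python
class ToyEnvironment:
    """Hypothetical two-player 'higher number wins' game (illustration only)."""

    def __init__(self, num_rounds=3):
        self.num_rounds = num_rounds

    def train(self, agents):
        """Return a single-agent view of the game.

        `agents` has one entry per seat. Exactly one entry must be None,
        marking the seat that the external learner controls; the other
        entries are callables that map an observation to an action.
        """
        if agents.count(None) != 1:
            raise ValueError("exactly one seat must be None (the learner)")
        return _Trainer(self, agents)


class _Trainer:
    """Gym-like reset()/step() wrapper that plays the fixed opponents."""

    def __init__(self, env, agents):
        self.env = env
        self.agents = agents
        self.learner = agents.index(None)
        self.round = 0

    def reset(self):
        self.round = 0
        return {"round": self.round}

    def step(self, action):
        obs = {"round": self.round}
        # The learner supplies its own action; opponents are called here.
        actions = [action if a is None else a(obs) for a in self.agents]
        best = max(actions)
        winners = [i for i, a in enumerate(actions) if a == best]
        # Reward 1.0 only if the learner is the sole winner of the round.
        reward = 1.0 if winners == [self.learner] else 0.0
        self.round += 1
        done = self.round >= self.env.num_rounds
        return {"round": self.round}, reward, done, {}


def always_two(obs):
    """Fixed opponent that always plays 2."""
    return 2


# No learning happens inside train(); the loop below is where a model
# such as PPO would choose actions and update itself.
trainer = ToyEnvironment().train([None, always_two])
obs = trainer.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = trainer.step(3)
    total_reward += reward
```

Under this reading, the environment's train() only prepares the game for training, while the actual parameter updates would still live in the model's learn() call.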

  • SuperSuit

    A collection of wrappers for Gymnasium and PettingZoo environments (being merged into gymnasium.wrappers and pettingzoo.wrappers).

  • Multi-agent isn’t supported by default in Stable Baselines, but you can make it work with PettingZoo. This example trains a single policy to control every agent in an environment (parameter sharing). You could use these SuperSuit wrappers to work with other methods (self-play, independent learning, etc.), but you would probably need to write some custom training code. https://github.com/PettingZoo-Team/SuperSuit#parallel-environment-vectorization
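
To make "a single policy to control every agent" concrete, here is a minimal, dependency-free sketch of parameter sharing. It does not use PettingZoo, SuperSuit, or Stable Baselines. The SharedPolicy class, the run() helper, and the toy task (each agent is rewarded for choosing the action equal to its own observation) are assumptions made up for illustration, and the tabular bandit update stands in for whatever algorithm you would actually train.

```python
import random


class SharedPolicy:
    """One tabular policy whose parameters are used by every agent."""

    def __init__(self, n_obs, n_actions, lr=0.5, epsilon=0.2, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_obs)]
        self.lr = lr
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self, obs, greedy=False):
        # Epsilon-greedy action selection over the shared table.
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[obs]))
        row = self.q[obs]
        return row.index(max(row))

    def update(self, obs, action, reward):
        # One-step bandit update; all agents write into the same table.
        self.q[obs][action] += self.lr * (reward - self.q[obs][action])


def run(num_agents=3, steps=200, seed=0):
    """Train one SharedPolicy on experience gathered by all agents."""
    rng = random.Random(seed)
    policy = SharedPolicy(n_obs=2, n_actions=2, seed=seed)
    agents = [f"agent_{i}" for i in range(num_agents)]
    for _ in range(steps):
        # Each agent gets its own observation, keyed by agent name.
        observations = {name: rng.randrange(2) for name in agents}
        # The same policy object picks the action for every agent.
        actions = {name: policy.act(obs) for name, obs in observations.items()}
        for name in agents:
            reward = 1.0 if actions[name] == observations[name] else 0.0
            policy.update(observations[name], actions[name], reward)
    return policy


policy = run()
```

Because every agent feeds the same table, the policy sees num_agents transitions per environment step. Self-play or independent learning would instead keep separate policy objects per agent, which is where the custom training code mentioned above would come in.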

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
