The ks_env.train() method seems to be the one from kaggle_environments.Environment:
Q1. However, I'm confused: why does the environment contain a train() method? Why should the environment be responsible for training? I feel train() should be part of the model: the article above uses the PPO algorithm, which has a train() method. PPO.train() gets called when we call PPO.learn(), which makes sense.
I haven't read the code, but stable-baselines doesn't support multi-agent environments (https://github.com/hill-a/stable-baselines/issues/423), so I think they're trying to make multi-agent learning easier with Environment.train().
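Conceptually, Environment.train() folds the other seats' agents into the environment, so the one remaining seat looks like an ordinary single-agent env with reset()/step(). Here's a minimal sketch of that idea in plain Python (the class and function names are illustrative, not the actual kaggle_environments implementation):

```python
class TwoPlayerGame:
    """Toy turn-based game: each player picks a number; the higher number wins."""
    def __init__(self):
        self.done = False

    def reset(self):
        self.done = False
        return 0  # dummy shared observation

    def step(self, actions):
        # actions: [p0_action, p1_action]; reward from player 0's perspective
        reward = 1 if actions[0] > actions[1] else -1 if actions[0] < actions[1] else 0
        self.done = True
        return 0, reward, self.done


def train(env, agents):
    """Sketch of the Environment.train() pattern: seats with a fixed agent
    are played automatically inside the wrapper; the seat marked None is
    exposed through a standard single-agent reset()/step() interface."""
    seat = agents.index(None)  # the seat the learner controls

    class Trainer:
        def reset(self):
            return env.reset()

        def step(self, action):
            # Fill in the other seats by querying their fixed agents,
            # then step the underlying multi-agent environment once.
            joint = [action if i == seat else agent(None)
                     for i, agent in enumerate(agents)]
            return env.step(joint)

    return Trainer()


# Usage: learn seat 0 against a fixed "always play 1" opponent.
trainer = train(TwoPlayerGame(), [None, lambda obs: 1])
trainer.reset()
obs, reward, done = trainer.step(2)  # our 2 beats the opponent's 1
```

From the learner's point of view the opponent is just part of the environment dynamics, which is why putting train() on the Environment (rather than the model) makes single-agent libraries usable here.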
Multi-agent isn't supported by default in Stable Baselines, but you can make it work with PettingZoo. This example trains a single policy to control every agent in an environment (parameter sharing). You could use these SuperSuit wrappers to work with other methods (self-play, independent learning, etc.), but you would probably need to write some custom training code: https://github.com/PettingZoo-Team/SuperSuit#parallel-environment-vectorization
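For context, "parameter sharing" just means one set of policy parameters is used for every agent, and the agents' experience is pooled into one training stream, which is what lets a single-agent algorithm like PPO train in a multi-agent environment. A toy sketch of the concept (this is not the SuperSuit/SB3 code, just an illustration; the update rule is a stand-in for a real PPO step):

```python
class SharedPolicy:
    """One set of parameters used by every agent (parameter sharing)."""
    def __init__(self):
        self.threshold = 0.0  # a single shared "parameter"

    def act(self, obs):
        return 1 if obs > self.threshold else 0

    def update(self, experience):
        # Stand-in for a gradient step: nudge the threshold toward the mean
        # observation of rewarded transitions. A real setup would run PPO here.
        rewarded = [obs for obs, act, rew in experience if rew > 0]
        if rewarded:
            self.threshold = sum(rewarded) / len(rewarded) - 1.0


policy = SharedPolicy()

# Every agent acts with the SAME policy, and their transitions are pooled
# into one batch -- the vectorization that SuperSuit's wrappers arrange
# for Stable Baselines under the hood.
observations = {"agent_0": 0.5, "agent_1": -0.3, "agent_2": 1.2}
actions = {name: policy.act(obs) for name, obs in observations.items()}
experience = [(obs, actions[name], 1 if actions[name] == 1 else 0)
              for name, obs in observations.items()]
policy.update(experience)
```

Self-play and independent learning differ exactly here: self-play trains against frozen copies of the policy, and independent learning gives each agent its own parameters, which is why those variants need custom training loops on top of the wrappers.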