Sorry, this was the paper: https://arxiv.org/abs/2104.07750. But I guess you already answered my question. Indeed, agents receive a global observation, but cannot directly observe other agents' actions, states, or rewards, and do not share parameters. So, if I understand correctly, what they're using here is independent PPO with a global observation but no centralized critic. That is what MAPPO (https://github.com/marlbenchmark/on-policy/blob/main/onpolicy/algorithms/r_mappo/algorithm/r_actor_critic.py) does: a centralized observation space but (if I'm correct) a decentralized critic.
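To make concrete what I mean by "independent, global observation, no shared parameters", here is a minimal sketch of how I picture the setup. It is not the paper's code or MAPPO's code: `LinearHead`, `IndependentAgent`, and `build_agents` are names I made up, the linear maps only stand in for real networks, action selection is greedy for simplicity, and the PPO update itself is left out.

```python
import random


class LinearHead:
    """Tiny linear map standing in for a neural network (hypothetical)."""

    def __init__(self, in_dim, out_dim, rng):
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)
        ]

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]


class IndependentAgent:
    """One agent with its own actor and its own critic (nothing is shared)."""

    def __init__(self, obs_dim, n_actions, rng):
        self.actor = LinearHead(obs_dim, n_actions, rng)
        self.critic = LinearHead(obs_dim, 1, rng)

    def act(self, global_obs):
        # Greedy choice over the actor's outputs; a real PPO actor would sample.
        logits = self.actor(global_obs)
        return max(range(len(logits)), key=logits.__getitem__)

    def value(self, global_obs):
        # Per-agent value estimate computed from the same global observation.
        return self.critic(global_obs)[0]


def build_agents(n_agents, obs_dim, n_actions, seed=0):
    rng = random.Random(seed)
    return [IndependentAgent(obs_dim, n_actions, rng) for _ in range(n_agents)]


agents = build_agents(n_agents=3, obs_dim=4, n_actions=2)
global_obs = [0.5, -1.0, 0.25, 2.0]  # every agent sees the same vector

# Each agent acts and evaluates on its own; no agent reads another's output.
actions = [agent.act(global_obs) for agent in agents]
values = [agent.value(global_obs) for agent in agents]
```

A centralized critic would instead be a single value network used when training all agents, which is the part I'm unsure about in the MAPPO case.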