on-policy vs DI-engine
| | on-policy | DI-engine |
|---|---|---|
| Mentions | 12 | 3 |
| Stars | 1,125 | 2,553 |
| Growth | 7.8% | 2.8% |
| Activity | 4.9 | 8.7 |
| Last commit | 10 days ago | 4 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
on-policy
- How do you compute rewards when you are using parallel environments?
- Renderer of the environment does not work?
I am trying to feed the agents visual observations and am therefore using the renderer of this environment (https://github.com/marlbenchmark/on-policy/blob/main/onpolicy/envs/mpe/rendering.py), but the image I get back is broken.
- Stuck on this error for days: I can't use importlib the right way
- Difference between setup.py, environments.yaml and requirements.txt
- Ubuntu terminal crashes when I launch a deep reinforcement learning model
I am trying to run this code (https://github.com/marlbenchmark/on-policy) on my Ubuntu machine.
- "chmod" is not recognized as an internal or external command, operable program or batch file
If you don't want to install a Linux VM, the other option is to read the source of the train_mpe.sh script and write your own version as a Windows batch file.
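A third option is to skip batch files entirely and drive the training script from Python, which works the same on Windows and Linux. The sketch below shows the general shape of such a launcher; the script path and every flag name are assumptions for illustration, not the repo's actual interface, so check train_mpe.sh for the real arguments.

```python
# Hypothetical cross-platform replacement for train_mpe.sh (illustrative only:
# the script path and flag names below are assumptions, not the repo's API).
import os
import subprocess
import sys

env = os.environ.copy()
env["CUDA_VISIBLE_DEVICES"] = "0"  # select the GPU, as the shell script does

subprocess.run(
    [
        sys.executable, "onpolicy/scripts/train/train_mpe.py",  # assumed path
        "--env_name", "MPE",               # assumed flag names
        "--scenario_name", "simple_spread",
        "--seed", "1",
    ],
    env=env,
    check=True,  # raise if training exits with a non-zero status
)
```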
- Confused between "centralized critic" and "centralized training decentralized execution"
Sorry, this was the paper: https://arxiv.org/abs/2104.07750. But I guess you already answered my question. Indeed, agents receive a global observation but cannot directly observe other agents' actions, states, or rewards, and do not share parameters. So if I understand correctly, what they're using here is independent PPO with a global observation but no centralized critic, which is what MAPPO (https://github.com/marlbenchmark/on-policy/blob/main/onpolicy/algorithms/r_mappo/algorithm/r_actor_critic.py) does: a centralized observation space but (if I'm correct) a decentralized critic.
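For reference, here is a minimal sketch of the centralized-training-decentralized-execution split being discussed. All class names and shapes are made up for illustration and are not the repo's classes: each actor conditions only on its own local observation (so it can act at execution time), while a single critic conditions on the global state and is needed only during training.

```python
# CTDE sketch (all names and shapes are illustrative, not the repo's classes).
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized: sees only its own agent's local observation."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, act_dim))

    def forward(self, local_obs):
        return torch.distributions.Categorical(logits=self.net(local_obs))

class CentralCritic(nn.Module):
    """Centralized: sees the global state, used only while training."""
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                                 nn.Linear(64, 1))

    def forward(self, global_state):
        return self.net(global_state)

n_agents, obs_dim, act_dim = 3, 18, 5
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]  # no parameter sharing
critic = CentralCritic(n_agents * obs_dim)  # e.g. concatenated observations
```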
- Why is this implementation of PPO using a replay buffer?
I don't see the buffer being cleared anywhere, but it looks to me like it may not need to be. For example, the implementation of SeparatedReplayBuffer receives the episode_length (or "horizon", as it is sometimes called) and sets the size of the buffer accordingly when it's initialized. That way, the number of samples collected before each policy/value update is constant. You just need one giant tensor block to collect all your samples; after a network update, why clear them out? Just overwrite the existing samples, since you know you'll collect exactly the same number of new samples.
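To make that concrete, here is a minimal sketch of such a fixed-size rollout buffer. The field names are illustrative, not SeparatedReplayBuffer's actual attributes: the storage is allocated once for exactly episode_length steps and simply overwritten on the next collection pass, so there is nothing to clear.

```python
# Fixed-size on-policy rollout buffer sketch (illustrative, not the repo's class).
import numpy as np

class RolloutBuffer:
    def __init__(self, episode_length, n_envs, obs_dim):
        # One big preallocated block; its size never changes after init.
        self.obs = np.zeros((episode_length, n_envs, obs_dim), dtype=np.float32)
        self.rewards = np.zeros((episode_length, n_envs), dtype=np.float32)
        self.step = 0

    def insert(self, obs, reward):
        # Overwrite whatever last iteration left at this slot.
        self.obs[self.step] = obs
        self.rewards[self.step] = reward
        # Wrap back to 0 after the update, instead of clearing anything.
        self.step = (self.step + 1) % self.obs.shape[0]
```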
- MARL top conference papers are ridiculous
https://github.com/marlbenchmark/on-policy (MAPPO-FP)
DI-engine
- Anyone have experience with DI-Engine?
I posted a while back asking people what frameworks they were using for RL research. Recently I stumbled upon DI-Engine, which looks promising! It is actively maintained, with a diverse set of algorithms already implemented.
- TransformerXL + PPO Baseline + MemoryGym
- Struggling with algorithm generality? Try DI engine; here is the solution
What are some alternatives?
gym-pybullet-drones - PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control
stable-baselines - A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
auto-sklearn - Automated Machine Learning with scikit-learn
pytorch-a2c-ppo-acktr-gail - PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
tianshou - An elegant PyTorch deep reinforcement learning library.
seed_rl - SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
myosuite - MyoSuite is a collection of environments/tasks to be solved by musculoskeletal models simulated with the MuJoCo physics engine and wrapped in the OpenAI gym API.
godot_rl_agents - An Open Source package that allows video game creators, AI researchers and hobbyists the opportunity to learn complex behaviors for their Non Player Characters or agents
brain-agent - Brain Agent for Large-Scale and Multi-Task Agent Learning
ml-agents - The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
Gymnasium - An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)