Ray vs maddpg

Ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. (by ray-project)

Source Code

ray.io

Docs

maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments" (by openai)

Paper

Source Code

arxiv.org

Metric	Ray	maddpg
Mentions	42	2
Stars	30,988	1,516
Growth	2.8%	3.9%
Activity	10.0	0.0
Latest Commit	about 5 hours ago	18 days ago
Language	Python	Python
License	Apache License 2.0	MIT License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
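As a rough illustration of the Growth figure, month-over-month growth is usually computed as the relative change between two monthly star counts. The sketch below assumes that standard definition; the page does not state its exact formula, and the function name and star counts are hypothetical.

```python
def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Return the percentage change in stars between two monthly snapshots.

    Assumes the usual definition of month-over-month growth; the site's
    actual formula is not given in the text.
    """
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


# Hypothetical snapshot values, chosen only for illustration.
growth = month_over_month_growth(1000, 1028)
print(f"{growth:.1f}%")
```

Because the result is relative, a small project can show a higher growth percentage than a much larger one while gaining far fewer stars in absolute terms.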

Ray

Posts with mentions or reviews of Ray. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-05.

Open Source Advent Fun Wraps Up!
10 projects | dev.to | 5 Jan 2024

22. Ray | GitHub | tutorial

Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Custom Models
1 project | news.ycombinator.com | 11 Aug 2023

Training times for GSM8k are mentioned here: https://github.com/ray-project/ray/tree/master/doc/source/te...

Ray – an open source project for scaling AI workloads
1 project | news.ycombinator.com | 11 Aug 2023

Methods to keep agents inside grid world.
1 project | /r/reinforcementlearning | 30 Jun 2023

Here's a reference from RLlib that points to docs and an example, and here's one from one of my projects that includes all my own implementations

TransformerXL + PPO Baseline + MemoryGym
10 projects | /r/reinforcementlearning | 15 Feb 2023

RLlib

Is dynamic action masking possible in Rllib?
1 project | /r/reinforcementlearning | 23 Jan 2023

AWS re:Invent 2022 Recap | Data & Analytics services
1 project | dev.to | 3 Jan 2023

⦿ AWS Glue Data Quality - Automatic data quality rule recommendations based on your data
⦿ AWS Glue for Ray - Data integration with Ray (ray.io), a popular new open-source compute framework that helps you scale Python workloads

Think about it for a second
1 project | /r/mathmemes | 19 Oct 2022

https://ray.io (just dropping the link)

Elixir Livebook now as a desktop app
12 projects | news.ycombinator.com | 2 Aug 2022

I've wondered whether it's easier to add the data analysis tooling that Python has to Elixir, or to add to Python the features that Erlang (and by extension Elixir) provides out of the box.
From what I can see, if you want easier multiprocessing in Python (let's say running async), you have to use something like ray core[0], and if you want multiple machines you need Redis(?). Elixir/Erlang supports this out of the box.
Explorer[1] is an interesting approach: it uses Rust via Rustler (an Elixir library for calling Rust code) and uses Polars as its dataframe library. I think Rustler needs to be reworked for this use case, as it can be slow to return data. I made initial improvements which drastically improve encoding (https://github.com/elixir-nx/explorer/pull/282 and https://github.com/elixir-nx/explorer/pull/286; TL;DR: 20+ seconds down to 3).
[0] https://github.com/ray-project/ray

Learn various techniques to reduce data processing time by using multiprocessing, joblib, and tqdm concurrent
1 project | /r/Python | 13 Jul 2022

Adding these for anyone who had a similar question about Ray vs dask 1, 2, 3

maddpg

Posts with mentions or reviews of maddpg. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-02-15.

How is the backward pass performed in MADDPG algorithm from MARL
1 project | dev.to | 5 Oct 2022

I'm using the MADDPG algorithm from https://github.com/openai/maddpg/blob/master/maddpg/trainer/maddpg.py. I understand the forward pass for both the actor and critic networks, but I'm not able to understand how the actor and critic networks are updated. At lines 188 and 191 the authors compute the critic loss and the actor loss, but can anyone explain how the critic and actor networks are then updated?

Also, as far as I understand, when the number of agents increases from 3 to 6 for a simple spread policy in MADDPG, the computation time for the Q loss and P loss at lines 188 and 191 increases super-linearly. I'm assuming this might be because both the Q loss and the P loss use the Q values, and the dimension of the input used to calculate the Q values increases linearly with the number of agents. It would be great if anyone could help me understand this backpropagation phase better, and why the computation time grows super-linearly.

I also added a timer to track the computation time of the Q loss and P loss over 60,000 episodes with the simple spread policy (3 agents, 3 landmarks, 0 adversaries). Thanks for the help, in advance!

Q loss: 3 agents 74.31 sec, 6 agents 243.31 sec (3x)
P loss: 3 agents 114.86 sec, 6 agents 321.76 sec (3x)
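One way to see why the cost can grow faster than linearly is to count the inputs a centralized critic receives. The sketch below is not taken from the openai/maddpg code: the function names are hypothetical, the per-agent action size of 5 is an assumption, and the observation layout (own velocity and position, relative landmark positions, relative positions and communication of other agents) is an assumed simple-spread-like layout. It also assumes the number of landmarks equals the number of agents.

```python
# Hypothetical sizes for illustration; not read from the openai/maddpg code.
ACTION_DIM = 5  # assumed per-agent action size


def observation_dim(n_agents: int, n_landmarks: int) -> int:
    """Assumed per-agent observation size for a simple-spread-like task.

    own velocity (2) + own position (2)
    + relative landmark positions (2 each)
    + relative position and communication of each other agent (4 each)
    """
    return 4 + 2 * n_landmarks + 4 * (n_agents - 1)


def critic_input_dim(n_agents: int, n_landmarks: int) -> int:
    """A centralized critic sees every agent's observation and action."""
    return n_agents * (observation_dim(n_agents, n_landmarks) + ACTION_DIM)


def total_critic_inputs(n_agents: int, n_landmarks: int) -> int:
    """With one centralized critic per agent, the input work repeats N times."""
    return n_agents * critic_input_dim(n_agents, n_landmarks)


for n in (3, 6):
    # Assumption: as many landmarks as agents.
    print(n, observation_dim(n, n), critic_input_dim(n, n), total_critic_inputs(n, n))
```

Under these assumptions, doubling the agents from 3 to 6 grows each critic's input from 69 to 246 values, because both the number of agents and each agent's observation size increase. This only counts inputs; actual wall-clock time also depends on hidden-layer sizes, batch size, and framework overhead, so it should not be read as an explanation of the measured timings.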

How to get my multi-agents more collaborative?
3 projects | /r/reinforcementlearning | 15 Feb 2021

Another thing is that I don't use only one centralized critic; I use one for each agent (they are all centralized), and you could use parameter sharing for agents of the same type if you want. A great start would be to look at how MADDPG works in an implementation (original, tf2, pytorch-1, pytorch-2). Then you can see how the actor and the critic are trained and adapt the ideas to your MA-PPO implementation.

What are some alternatives?

When comparing Ray and maddpg you can also consider the following projects:

optuna - A hyperparameter optimization framework

pymarl - Python Multi-Agent Reinforcement Learning framework

stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

multiagent-particle-envs - Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Faust - Python Stream Processing

gpt-2 - Code for the paper "Language Models are Unsupervised Multitask Learners"

gevent - Coroutine-based concurrency library for Python

transferlearning - Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, codes, datasets, applications, tutorials.-迁移学习

stable-baselines - A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

SCOOP (Scalable COncurrent Operations in Python) - SCOOP (Scalable COncurrent Operations in Python)

Thespian Actor Library - Python Actor concurrency library

Dask - Parallel computing with task scheduling

