On-policy Alternatives
Similar projects and alternatives to on-policy based on common topics and language
- gym-pybullet-drones: PyBullet Gymnasium environments for single- and multi-agent reinforcement learning of quadcopter control
- pymarl2: Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)
- ACE: [AAAI 2023] Official PyTorch implementation of the paper "ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency" (by opendilab)
on-policy reviews and mentions
- "chmod" is not recognized as an internal or external command, operable program or batch file
I am trying to run this code (https://github.com/marlbenchmark/on-policy) on my Windows machine. Everything works until section 3, where the `chmod` step fails with the error above.
If you don't want to install a Linux VM, the other option is to read the source of the train_mpe.sh script and write your own version of it as a Windows batch file.
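As a cross-platform alternative to a batch file, the shell script's job can be sketched in Python: rebuild the command line it assembles and launch the trainer with `subprocess`. This is a hypothetical sketch — the flag names and the `train/train_mpe.py` entry point below mirror typical MAPPO launch scripts and may differ from the repo's actual train_mpe.sh, so check the script's source before using it.

```python
import sys

def build_train_command(env="MPE", scenario="simple_spread",
                        num_agents=3, seed=1):
    """Assemble the argv list a train_mpe.sh-style script would build.

    Flag names here are illustrative; read the actual shell script and
    adjust them to match.
    """
    return [
        sys.executable, "train/train_mpe.py",
        "--env_name", env,
        "--scenario_name", scenario,
        "--num_agents", str(num_agents),
        "--seed", str(seed),
    ]

cmd = build_train_command(num_agents=3, seed=1)
print(" ".join(cmd))
# To actually launch training once the repo is installed:
# import subprocess; subprocess.run(cmd, check=True)
```

Because this uses `sys.executable` and `subprocess` instead of bash builtins like `chmod`, it runs unchanged on Windows, macOS, and Linux.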
- Why is this implementation of PPO using a replay buffer?
I don't see the buffer being cleared anywhere, but it looks to me like it may not need to be. For example, the implementation of SeparatedReplayBuffer receives the episode_length (sometimes called the "horizon") and sets the size of the buffer accordingly when it's initialized. That way, the number of samples collected before each policy/value update is constant. You just need one fixed-size tensor block to collect all your samples; after a network update, there is no need to clear them out. Just overwrite the existing samples, since you know you'll collect exactly the same number of new samples.
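The overwrite-instead-of-clear idea above can be sketched with a minimal fixed-size rollout buffer. This is an illustration of the pattern, not the repo's actual SeparatedReplayBuffer API — the class and method names here are made up:

```python
import numpy as np

class RolloutBuffer:
    """Illustrative fixed-size rollout buffer (names are hypothetical).

    Sized to one horizon; 'clearing' is just resetting the write pointer,
    since the next rollout overwrites every slot anyway.
    """

    def __init__(self, episode_length, obs_dim):
        self.episode_length = episode_length
        self.obs = np.zeros((episode_length, obs_dim), dtype=np.float32)
        self.rewards = np.zeros(episode_length, dtype=np.float32)
        self.step = 0  # write pointer into the fixed block

    def insert(self, obs, reward):
        # Write in place; the buffer never grows or reallocates.
        self.obs[self.step] = obs
        self.rewards[self.step] = reward
        self.step = (self.step + 1) % self.episode_length

    def after_update(self):
        # No deallocation needed: reset the pointer and overwrite.
        self.step = 0

buf = RolloutBuffer(episode_length=4, obs_dim=2)
for t in range(4):                 # one full rollout fills the buffer
    buf.insert(np.full(2, t), reward=float(t))
buf.after_update()                  # "clear" is just a pointer reset
buf.insert(np.zeros(2), 9.0)        # old sample at index 0 is overwritten
```

This is why on-policy implementations can keep one giant tensor block alive for the whole run: the sample count per update never changes, so overwriting is equivalent to clearing.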
- MARL top conference papers are ridiculous
https://github.com/marlbenchmark/on-policy (MAPPO-FP)
Stats
marlbenchmark/on-policy is an open-source project licensed under the MIT License, an OSI-approved license.
The primary programming language of on-policy is Python.