es_pytorch
policy-adaptation-during-deployment
Metric | es_pytorch | policy-adaptation-during-deployment
---|---|---
Mentions | 1 | 1
Stars | 23 | 109
Growth | - | -
Activity | 0.0 | 1.8
Last commit | over 2 years ago | over 3 years ago
Language | Python | Python
License | - | -
Stars: the number of stars that a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
es_pytorch
- What is the greatest achievement of Genetic Algorithms? [D]
ES, specifically OpenAI's ES (and to an extent CMA-ES). It has been shown to be very competitive with modern state-of-the-art RL algorithms. A huge benefit is that it's incredibly easy to implement (I'm gonna shamelessly plug my implementation if you want to see the inner workings).
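To give a sense of how little code the core idea needs, here is a minimal NumPy sketch of an OpenAI-style evolution strategy. It is not taken from es_pytorch: the function names, the toy objective, and all hyperparameter values are assumptions chosen for illustration. It perturbs the parameters with Gaussian noise, scores each perturbation, and moves the parameters along the reward-weighted average of the noise.

```python
import numpy as np


def evolution_strategy(fitness, theta, sigma=0.1, lr=0.05,
                       pop_size=50, iterations=300, seed=0):
    """Maximize `fitness` with a basic OpenAI-style evolution strategy.

    All default values are illustrative assumptions, not tuned settings.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noise = rng.standard_normal((pop_size, theta.size))
        # Score every perturbed copy of the parameters.
        rewards = np.array([fitness(theta + sigma * eps) for eps in noise])
        # Subtract the mean reward as a simple baseline to reduce variance.
        centered = rewards - rewards.mean()
        # Gradient estimate: reward-weighted average of the noise, scaled by 1/sigma.
        grad = noise.T @ centered / (pop_size * sigma)
        # Gradient ascent step on the parameters.
        theta += lr * grad
    return theta


def toy_fitness(params):
    # Hypothetical objective: largest (zero) when params equals the target.
    target = np.array([0.5, 0.1, -0.3])
    return -np.sum((params - target) ** 2)


solution = evolution_strategy(toy_fitness, np.zeros(3))
print(solution)
```

Note that no gradient of `toy_fitness` is ever computed; only its return values are used, which is why the same loop works for non-differentiable RL returns. Practical implementations often add rank-based reward normalization, mirrored (antithetic) noise, and parallel evaluation across workers, all of which this sketch leaves out.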
policy-adaptation-during-deployment
- Exploring Self-Supervised Policy Adaptation To Continue Training After Deployment Without Using Any Rewards
Code: https://github.com/nicklashansen/policy-adaptation-during-deployment
What are some alternatives?
muzero-general - MuZero
Ne2Ne-Image-Denoising - Deep Unsupervised Image Denoising, based on Neighbour2Neighbour training
pureples - Pure Python Library for ES-HyperNEAT. Contains implementations of HyperNEAT and ES-HyperNEAT.
envpool - C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
neat-python - Python implementation of the NEAT neuroevolution algorithm
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Super-mario-bros-PPO-pytorch - Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
pytorch-a2c-ppo-acktr-gail - PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
drl_grasping - Deep Reinforcement Learning for Robotic Grasping from Octrees
dmc2gymnasium - Gymnasium integration for the DeepMind Control (DMC) suite
drq - DrQ: Data regularized Q