| | policy-adaptation-during-deployment | es_pytorch |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 109 | 23 |
| Growth | - | - |
| Activity | 1.8 | 0.0 |
| Last commit | over 3 years ago | over 2 years ago |
| Language | Python | Python |
| License | - | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
policy-adaptation-during-deployment
Exploring Self-Supervised Policy Adaptation To Continue Training After Deployment Without Using Any Rewards
Code: https://github.com/nicklashansen/policy-adaptation-during-deployment
es_pytorch
What is the greatest achievement of Genetic Algorithms? [D]
ES, specifically OpenAI's ES (and to an extent CMA-ES). This has been shown to be very competitive with modern state-of-the-art RL algorithms. A huge benefit of it is that it's incredibly easy to implement (I'm gonna shamelessly plug my implementation if you want to see the inner workings).
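To give a feel for why ES is considered easy to implement, here is a minimal sketch of a basic evolution strategy in plain Python. It is not the es_pytorch code and not OpenAI's exact method: it assumes a simple mean baseline in place of the rank-based fitness shaping and antithetic sampling that OpenAI's ES uses. The function name `evolution_strategy`, its parameters, and the toy quadratic objective are all hypothetical choices made for illustration.

```python
import random


def evolution_strategy(fitness, theta, sigma=0.1, lr=0.1,
                       population=50, iterations=200, seed=0):
    """Maximize `fitness` with a simplified evolution strategy (illustrative only)."""
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(population)]
        # Evaluate the objective at each perturbed parameter vector.
        rewards = [fitness([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noises]
        # Assumption: subtract the mean reward as a baseline to reduce variance.
        baseline = sum(rewards) / population
        centered = [r - baseline for r in rewards]
        # Reward-weighted sum of the noise vectors estimates the gradient.
        for i in range(dim):
            grad_i = sum(c * eps[i] for c, eps in zip(centered, noises))
            grad_i /= population * sigma
            theta[i] += lr * grad_i  # gradient ascent step
    return theta


# Toy objective (hypothetical): peak at (3, -2).
TARGET = [3.0, -2.0]


def toy_fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


best = evolution_strategy(toy_fitness, [0.0, 0.0])
print(best)
```

The whole method needs only forward evaluations of the objective, with no backpropagation, which is also why each population member can be evaluated independently in parallel.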
What are some alternatives?
Ne2Ne-Image-Denoising - Deep Unsupervised Image Denoising, based on Neighbour2Neighbour training
muzero-general - MuZero
envpool - C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
pureples - Pure Python Library for ES-HyperNEAT. Contains implementations of HyperNEAT and ES-HyperNEAT.
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
neat-python - Python implementation of the NEAT neuroevolution algorithm
pytorch-a2c-ppo-acktr-gail - PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Super-mario-bros-PPO-pytorch - Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
drl_grasping - Deep Reinforcement Learning for Robotic Grasping from Octrees
dmc2gymnasium - Gymnasium integration for the DeepMind Control (DMC) suite
drq - DrQ: Data regularized Q