q-learning-algorithms
AgileRL
| | q-learning-algorithms | AgileRL |
|---|---|---|
| Mentions | 1 | 12 |
| Stars | 4 | 493 |
| Growth | - | 2.6% |
| Activity | 0.0 | 9.8 |
| Latest commit | almost 3 years ago | 6 days ago |
| Language | Python | Python |
| License | - | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
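The page does not give the formulas behind these numbers, so the following is only a minimal sketch of how such metrics could be computed. The function names, the 30-day half-life, and the percentile-to-score mapping are all assumptions made for illustration; they are not the site's actual method.

```python
from bisect import bisect_left


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars relative to last month's count."""
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return 100.0 * (stars_now - stars_last_month) / stars_last_month


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum of commit weights, where recent commits count more.

    Assumption: a commit's weight halves every `half_life_days` days.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw):
    """Map a raw value to a 0-10 scale by its rank among all tracked projects.

    Assumption: the score is the fraction of projects with a strictly
    lower raw value, times 10, so 9.0 means "ahead of 90% of projects".
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # projects strictly below `raw`
    return round(10.0 * below / len(ranked), 1)
```

Under these assumptions, a project whose raw weighted commit count beats 9 out of 10 tracked projects gets `activity_score` 9.0, which matches the "top 10%" reading above.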
q-learning-algorithms
- actor-critic algorithms
I have learned quite a few things about reinforcement learning over the last few months, and I feel like I understand deep Q-learning algorithms much better (if you want, you can check my [repo](https://github.com/thomashirtz/q-learning-algorithms)). I would now like to shift my focus a little towards actor-critic algorithms. The only thing is, I feel that, compared with Q-learning, the explanations in the papers are not as precise, and explanations on the internet diverge greatly (e.g. the original paper does not give A2C, only A3C for one learner).
AgileRL
- [P] Introducing PPO and Rainbow DQN to our super fast evolutionary HPO reinforcement learning framework
- Introducing PPO and Rainbow DQN to our super fast evolutionary HPO reinforcement learning framework
- [P] Significant improvements for multi-agent reinforcement learning!
Please check it out! https://github.com/AgileRL/AgileRL
- 10x faster reinforcement learning hyperparameter optimization than SOTA - now with distributed training!
- [P] 10x faster reinforcement learning hyperparameter optimization than SOTA - now with distributed training!
- (1/2) May 2023
Deep Reinforcement Learning library focused on improving development by introducing RLOps - MLOps for reinforcement learning (https://github.com/AgileRL/AgileRL)
- [P] 10x faster reinforcement learning HPO - now for RLHF!
https://github.com/AgileRL/AgileRL/blob/main/CONTRIBUTING.md has a link to our Discord too
- 10x faster reinforcement learning HPO - now with CNNs!
- [P] 10x faster reinforcement learning HPO - now with CNNs!
- [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up
GitHub: https://github.com/AgileRL/AgileRL
What are some alternatives?
bomberland - Bomberland: a multi-agent AI competition based on Bomberman. This repository contains both starter / hello world kits + the engine source code
chat-ui - Open source codebase powering the HuggingChat app
chess - Program for playing chess in the console against AI or human opponents
RLeXplore - RLeXplore provides stable baselines of exploration methods in reinforcement learning, such as intrinsic curiosity module (ICM), random network distillation (RND) and rewarding impact-driven exploration (RIDE).
fragile - Framework for building algorithms based on FractalAI
loopquest - A Production Tool for Embodied AI
de-torch - Minimal PyTorch Library for Differential Evolution
Muzero - PyTorch implementation of MuZero for gym environments. It supports any Discrete, Box, and Box2D configuration for the action space and observation space.
Open-Llama - The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
easyopt - Zero-code hyperparameter optimization framework
tnt - A lightweight library for PyTorch training tools and utilities
hlb-gpt - Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).