Practical_RL
alpha-zero-general
| | Practical_RL | alpha-zero-general |
|---|---|---|
| Mentions | 2 | 4 |
| Stars | 5,702 | 3,656 |
| Growth | 1.0% | - |
| Activity | 6.5 | 3.1 |
| Latest commit | 6 days ago | about 2 months ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | The Unlicense | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Practical_RL
- Alternatives to OpenAI's Spinning Up?
There is this great GitHub repo with lectures and other resources, organized week by week as Jupyter notebooks that explain and code the material, with homework at the end of each. It covers the basics of deep RL (just DQN and DDPG/PPO), but I think it will give you a good start in the topic before you move on to working on your own.
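Since the quote names DQN as the starting point, it may help to see the tabular Q-learning update that DQN generalizes with a neural network. This is a minimal illustrative sketch (the function and variable names are my own, not from the course materials):

```python
import random
from collections import defaultdict

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the TD target
    r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])

def epsilon_greedy(Q, s, actions, eps=0.1):
    """Explore with probability eps, otherwise pick the greedy action."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[s][a])

# Q maps state -> {action: value}; defaultdict keeps unseen entries at 0.0.
Q = defaultdict(lambda: {a: 0.0 for a in (0, 1)})
```

DQN replaces the `Q` table with a neural network and this per-step update with gradient descent on the same TD target, which is roughly the progression the notebooks follow.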
alpha-zero-general
- Competitive reinforcement learning for turn-based games
This is a good intro to AlphaZero and Monte Carlo tree search, followed by this repo.
- Looking for a deeper understanding of the AlphaZero algorithm
- Any interest in a strong Santorini (no powers) AI?
I'm not planning on sharing the code at the moment, as I'm still working on improving it. The main part of the code is simply from https://github.com/suragnair/alpha-zero-general, plus my implementation of the game logic (about 100 lines). So to use the AI you really need the weights for the neural network. I plan on releasing a better version than the current one in, say, two months or so.
Thanks for the question. Code-wise I didn't have to do too much work. I used the code from https://github.com/suragnair/alpha-zero-general for the base MCTS algorithm used in the AlphaZero architecture. Within that framework I implemented the logic of Santorini, which isn't much code. Then I used PyTorch to train the neural networks. The main network (policy and value combined) is a 10x64 ResNet.
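The MCTS used in AlphaZero-style frameworks like alpha-zero-general selects moves with a PUCT rule that balances accumulated value estimates against the policy network's priors. A minimal sketch of that selection step (the dict layout and function name are illustrative, not the repo's actual API):

```python
import math

def puct_select(node, c_puct=1.0):
    """Pick the child maximizing Q + U, the AlphaZero-style PUCT rule.

    node['children'] maps action -> {'N': visit count,
                                     'W': total value,
                                     'P': prior probability}.
    """
    total_n = sum(ch['N'] for ch in node['children'].values())
    best, best_score = None, -float('inf')
    for action, ch in node['children'].items():
        q = ch['W'] / ch['N'] if ch['N'] else 0.0          # mean value
        u = c_puct * ch['P'] * math.sqrt(total_n) / (1 + ch['N'])  # exploration bonus
        if q + u > best_score:
            best, best_score = action, q + u
    return best
```

In a full implementation, `P` comes from the policy head of the network (here a combined policy/value ResNet), and `W` is accumulated from value-head evaluations backed up after each simulation.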
What are some alternatives?
muzero-general - MuZero
webdataset - A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
FunMatch-Distillation - TF2 implementation of knowledge distillation using the "function matching" hypothesis from https://arxiv.org/abs/2106.05237.
minigo - An open-source implementation of the AlphaGoZero algorithm
a3c_trading - Trading with recurrent actor-critic reinforcement learning
tensorflow-onnx - Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX
awesome-rl - Reinforcement learning resources curated
labml - 🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
redisai-examples - RedisAI showcase
TensorFlow-Tutorials - TensorFlow Tutorials with YouTube Videos
reversatile - Reversatile: Reversi for Android
rl-trading - Using Reinforcement Learning agents as Algorithmic Traders