Metric | adaptive-policy-iteration | mctx
---|---|---
Mentions | 2 | 10
Stars | 5 | 2,209
Growth | - | 1.4%
Activity | 2.6 | 0.0
Last commit | almost 3 years ago | 3 months ago
Language | Python | Python
License | GNU General Public License v3.0 only | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mctx
- About Monte Carlo tree search in Jax
- Programming language dilemma
  Maybe you can have your cake and eat it too. :) You could use Python with a hardware-accelerated library like JAX. This project, for example, uses JAX to implement Monte Carlo tree search and includes a few games as examples: https://github.com/deepmind/mctx
- Is there any proof that AlphaZero actually exist?
  Recently, the tree search part of AlphaZero went open source: https://github.com/deepmind/mctx
- [D] Anyone interested in training an AI for Tigris and Euphrates?
  You could try starting with https://github.com/deepmind/mctx. You'll probably need to expose your game state and actions via some form of IPC, or call your Rust code from Python through FFI.
- DeepMind has open-sourced the heart of AlphaGo and AlphaZero
  Interesting approach to private variables: https://github.com/deepmind/mctx/blob/577fc77a3cda1b796e277e...
- AlphaZero's Monte Carlo tree search implementation in Jax
- Anyone found any working replication repo for MuZero?
  Just have a look at the DM repo: https://github.com/deepmind/mctx
- MuZero Implementation
- Official DeepMind MuZero Implementation
- Finally an official MuZero implementation
  deepmind/mctx: Monte Carlo tree search in JAX (github.com)
What are some alternatives?
numpyro - Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
EfficientZero - Fork of EfficientZero to use newer libraries and to fix a few runtime bugs. Also includes pretrained models!
HJxB - Continuous-Time/State/Action Fitted Value Iteration via Hamilton-Jacobi-Bellman (HJB)
minihack - MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
KataGo - GTP engine and self-play learning in Go
leela-zero - Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.
alpha-zero-boosted - A "build to learn" Alpha Zero implementation using Gradient Boosted Decision Trees (LightGBM)
craftingway - A ffxiv crafting tool
omega - A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environments.
EfficientZero - Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.