mctx vs minihack

| | mctx | minihack |
|---|---|---|
| Mentions | 10 | 5 |
| Stars | 2,209 | 451 |
| Star growth | 1.4% | 2.7% |
| Activity | 0.0 | 6.7 |
| Latest commit | 3 months ago | 18 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mctx
Monte Carlo tree search in JAX
Programming language dilemma
Maybe you can have your cake and eat it too. :) You could use Python with one of the hardware accelerating languages like Jax. This project for example uses Jax to implement Monte Carlo Tree Search and includes a few games as examples. https://github.com/deepmind/mctx
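As background for the quote above: mctx's actual API is batched JAX code, but the core selection/backup idea it builds on can be sketched in a few lines of plain Python. Everything below (function names, the toy bandit setting) is illustrative, not mctx's API:

```python
import math
import random

def ucb1(total_value, visits, parent_visits, c=1.414):
    # UCB1 score: average reward (exploitation) plus an exploration bonus.
    if visits == 0:
        return float("inf")  # always try unvisited actions first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

def mcts_bandit(arm_means, n_iters=2000, seed=0):
    """Toy Monte Carlo search over one-step 'arms' with Gaussian rewards.

    This is only the select/simulate/backup skeleton that full MCTS
    repeats down a tree; mctx generalizes it to batched trees in JAX.
    """
    rng = random.Random(seed)
    visits = [0] * len(arm_means)
    values = [0.0] * len(arm_means)
    for t in range(1, n_iters + 1):
        # Selection: pick the action with the highest UCB1 score.
        arm = max(range(len(arm_means)),
                  key=lambda a: ucb1(values[a], visits[a], t))
        # Simulation: sample a noisy reward from the chosen action.
        reward = rng.gauss(arm_means[arm], 1.0)
        # Backup: accumulate visit counts and total value.
        visits[arm] += 1
        values[arm] += reward
    # The recommended action is the most-visited one.
    return max(range(len(arm_means)), key=lambda a: visits[a])

best = mcts_bandit([0.1, 0.5, 0.9])
```

With enough iterations the search concentrates visits on the highest-mean action; the same statistics-driven selection is what the tree policy applies at every node of a real game tree.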
- Is there any proof that AlphaZero actually exists?
  Recently, the tree search part of AlphaZero was open-sourced: https://github.com/deepmind/mctx
- [D] Anyone interested in training an AI for Tigris and Euphrates?
  You could try starting with https://github.com/deepmind/mctx. You'll probably need to expose your game state and actions via IPC of some sort, or FFI your Rust code to Python.
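The IPC route suggested here can be sketched in plain Python: the search process sends actions to a game-engine process and reads back states, one JSON message per line. The child below is a Python stand-in for the Rust engine (it just accumulates a running total); the message format and names are illustrative, not any real protocol:

```python
import json
import subprocess
import sys

# Stand-in for a Rust game engine: a child process that reads JSON actions
# on stdin and writes JSON states on stdout, one message per line.
# In practice this would be your compiled Rust binary.
CHILD = r"""
import json, sys
total = 0
for line in sys.stdin:
    msg = json.loads(line)
    total += msg["action"]
    print(json.dumps({"state": total}), flush=True)
"""

def step(proc, action):
    # Send one action and read back the resulting game state.
    proc.stdin.write(json.dumps({"action": action}) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())["state"]

proc = subprocess.Popen([sys.executable, "-c", CHILD],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        text=True)
states = [step(proc, a) for a in (1, 2, 3)]
proc.stdin.close()
proc.wait()
```

Line-delimited JSON over pipes is slow but trivially debuggable; once it works, the same request/response shape ports to a faster FFI binding (e.g. PyO3 on the Rust side) without changing the search code.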
- DeepMind has open-sourced the heart of AlphaGo and AlphaZero
  Interesting approach to private variables: https://github.com/deepmind/mctx/blob/577fc77a3cda1b796e277e...
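Without digging into the linked file, the conventional Python pattern for "private" variables is a leading-underscore attribute exposed through a read-only property. A generic sketch of that convention (not mctx's actual code):

```python
class Tree:
    """Minimal example of Python's private-variable convention:
    a leading-underscore attribute plus a read-only property."""

    def __init__(self, num_nodes):
        self._num_nodes = num_nodes  # underscore marks it as internal

    @property
    def num_nodes(self):
        # Readable from outside; assignment raises AttributeError
        # because no setter is defined.
        return self._num_nodes

t = Tree(8)
```

The underscore is purely a convention (nothing stops `t._num_nodes = 3`), but the property makes the intended public surface explicit.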
- AlphaZero's Monte Carlo tree search implementation in Jax
- Anyone found any working replication repo for MuZero?
  Just have a look at the DeepMind repo: https://github.com/deepmind/mctx
- MuZero Implementation
- Official DeepMind MuZero Implementation
- Finally an official MuZero implementation
  deepmind/mctx: Monte Carlo tree search in JAX (github.com)
minihack
- Difficult RL generalization benchmarks
- Anyone found any working replication repo for MuZero?
  I have an implementation of Stochastic MuZero in JAX. It's been tested solely in MiniHack environments, but it can be made to work in other environments by changing the representation function.
- Best GridWorld environment?
  If you want something as simple as possible, I'd go with MiniGrid; if you want a richer world with more complex settings, then MiniHack.
- Facebook AI Introduces ‘MiniHack’: A Sandbox Framework For Designing Rich And Diverse Environments For Reinforcement Learning (RL)
- MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
What are some alternatives?
- EfficientZero - Fork of EfficientZero that uses newer libraries and fixes a few runtime bugs. Also includes pretrained models!
- gym-simplegrid - Simple gridworld Gymnasium environment
- KataGo - GTP engine and self-play learning in Go
- Minigrid - Simple and easily configurable grid world environments for reinforcement learning
- leela-zero - Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper
- gym-gridverse - Gridworld domains in the Gym interface
- alpha-zero-boosted - A "build to learn" AlphaZero implementation using gradient-boosted decision trees (LightGBM)
- ede - Code for the paper "Uncertainty-Driven Exploration for Generalization in Reinforcement Learning"
- craftingway - An FFXIV crafting tool
- EfficientZero - Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021
- omega - A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in NetHack/MiniHack environments