minihack
mctx
Metric | minihack | mctx
---|---|---
Mentions | 5 | 10
Stars | 449 | 2,203
Growth | 4.5% | 2.1%
Activity | 6.8 | 0.0
Last commit | 8 days ago | 3 months ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
minihack
- Difficult RL generalization benchmarks
- Anyone found any working replication repo for MuZero?
  I have an implementation of Stochastic MuZero in JAX. It's been tested solely in MiniHack environments, but can be made to work in other environments by changing the representation function.
- Best GridWorld environment?
  If you want something as simple as possible, I'd go with MiniGrid, and if you want to have a richer world with more complex settings, then MiniHack.
- Facebook AI Introduces ‘MiniHack’: A Sandbox Framework For Designing Rich And Diverse Environments For Reinforcement Learning (RL)
  4 Min Read | Github | Paper
- MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
mctx
- About Monte Carlo tree search in Jax
- Programming language dilemma
  Maybe you can have your cake and eat it too. :) You could use Python with a hardware-accelerated library like JAX. This project, for example, uses JAX to implement Monte Carlo tree search and includes a few games as examples: https://github.com/deepmind/mctx
- Is there any proof that AlphaZero actually exist?
  Recently, the tree search part of AlphaZero was open-sourced: https://github.com/deepmind/mctx
- [D] Anyone interested in training an AI for Tigris and Euphrates?
  You could try starting with https://github.com/deepmind/mctx. You’ll probably need to expose your game state and actions via some form of IPC, or call your Rust code from Python through FFI.
- DeepMind has open-sourced the heart of AlphaGo and AlphaZero
  Interesting approach to private variables https://github.com/deepmind/mctx/blob/577fc77a3cda1b796e277e...
- AlphaZero's Monte Carlo tree search implementation in Jax
- Anyone found any working replication repo for MuZero?
  Just have a look at the DM repo: https://github.com/deepmind/mctx
- MuZero Implementation
- Official DeepMind MuZero Implementation
- Finally an official MuZero implementation
  deepmind/mctx: Monte Carlo tree search in JAX (github.com)
What are some alternatives?
gym-simplegrid - Simple Gridworld Gymnasium Environment
EfficientZero - Fork of EfficientZero to use newer libraries and to fix a few runtime bugs. Also includes pretrained models!
Minigrid - Simple and easily configurable grid world environments for reinforcement learning
KataGo - GTP engine and self-play learning in Go
gym-gridverse - Gridworld domains in the gym interface
leela-zero - Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.
ede - Code for the paper "Uncertainty-Driven Exploration for Generalization in Reinforcement Learning".
alpha-zero-boosted - A "build to learn" Alpha Zero implementation using Gradient Boosted Decision Trees (LightGBM)
EfficientZero - Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
craftingway - An FFXIV crafting tool
omega - A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environments.