alpha-zero-boosted
KataGo
| | alpha-zero-boosted | KataGo |
|---|---|---|
| Mentions | 2 | 49 |
| Stars | 79 | 3,235 |
| Growth | - | - |
| Activity | 3.2 | 9.3 |
| Latest commit | almost 4 years ago | 8 days ago |
| Language | Python | C++ |
| License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
alpha-zero-boosted
-
DeepMind has open-sourced the heart of AlphaGo and AlphaZero
> I came up with a nifty implementation in Python that outperforms the naive impl by 30x, allowing a pure python MCTS/NN interop implementation. See https://www.moderndescartes.com/essays/deep_dive_mcts/
Great post!
Chasing pointers in the MCTS tree is definitely a slow approach, although typically there are fewer than 900 "considerations" per move for AlphaZero. I've found that getting value/policy predictions from a neural network (or GBDT[1]) for the node expansions during those considerations is at least an order of magnitude slower than the MCTS tree-hopping logic.
[1] https://github.com/cgreer/alpha-zero-boosted
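To make the cost split concrete, here is a minimal PUCT-style MCTS sketch in Python. The names (`Node`, `evaluate`, `apply_action`) and constants are illustrative assumptions, not the actual API of alpha-zero-boosted or the linked essay; the point is that selection and backup are cheap pointer-chasing, while each "consideration" pays exactly one value/policy model call at expansion.

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior          # P(s, a) from the policy model
        self.visit_count = 0
        self.value_sum = 0.0
        self.children = {}          # action -> Node

    def value(self):
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def puct_score(parent, child, c_puct=1.5):
    # Exploration bonus scales with parent visits and the child's prior.
    u = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.value() + u

def run_simulations(root, evaluate, apply_action, state, num_simulations=800):
    """evaluate(state) -> (policy: dict[action, prob], value: float) is the
    expensive model call (NN or GBDT); everything else is cheap tree logic.
    Note: for brevity this backs up the raw value with no player-perspective
    sign flip, which a real two-player implementation would need."""
    policy, _ = evaluate(state)
    for action, p in policy.items():
        root.children[action] = Node(p)
    for _ in range(num_simulations):
        node, s, path = root, state, [root]
        while node.children:                      # selection: pure tree-hopping
            action, node = max(node.children.items(),
                               key=lambda kv: puct_score(path[-1], kv[1]))
            s = apply_action(s, action)
            path.append(node)
        policy, value = evaluate(s)               # expansion: one model call
        for action, p in policy.items():
            node.children[action] = Node(p)
        for n in path:                            # backup
            n.visit_count += 1
            n.value_sum += value
    return root
```

With ~800 simulations per move, that is ~800 `evaluate` calls, which is why batching or otherwise speeding up the model interop dominates any micro-optimization of the tree walk itself.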
-
MuZero: Mastering Go, chess, shogi and Atari without rules
What you can do is check out the algorithm at particular stages of development. AlphaZero & friends start out not being very good at the game, then over time they learn and become superhuman. You typically checkpoint the weights for the model at various stages. So early on, the algo would be like a 600 Elo player for chess, and eventually it gets to superhuman Elo levels. So if you wanted to train, you could gradually play against versions of the algo until you can beat them, by loading up the weights at various difficulty stages.
I implemented AlphaZero (but not Mu yet) using GBDTs instead of NNs here if you're curious about how it would work: https://github.com/cgreer/alpha-zero-boosted. Instead of saving the "weights" for a GBDT, you save the splitpoints for the value/policy models, but the concept is the same.
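To illustrate the "save split points instead of weights" idea, here is a toy gradient-boosted model built from depth-1 regression stumps in pure Python. This is not the repo's actual code (which presumably uses a real GBDT library); it just shows that the entire checkpoint is a list of (split point, leaf value, leaf value) triples, playing the role NN weights play in standard AlphaZero.

```python
def fit_stump(xs, residuals):
    """Find the split on a single feature minimizing squared error."""
    best = None
    for split in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    return best[1:]  # (split, left_value, right_value)

def fit_gbdt(xs, ys, n_rounds=20, lr=0.5):
    """Boost stumps on residuals; the returned list IS the 'checkpoint'."""
    preds = [0.0] * len(xs)
    model = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        split, lval, rval = fit_stump(xs, residuals)
        model.append((split, lval, rval))
        preds = [p + lr * (lval if x <= split else rval)
                 for x, p in zip(xs, preds)]
    return model

def predict(model, x, lr=0.5):
    # Evaluation is just walking the stored split points -- no weights.
    return sum(lr * (l if x <= s else r) for s, l, r in model)
```

A value model for a game would boost over board features rather than a single scalar, and checkpointing at different training stages (the 600-Elo vs. superhuman versions above) just means saving the model list at different round counts.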
KataGo
-
After AI beat them, professional Go players got better and more creative
> KataGo was trained with more knowledge of the game (feature engineering and loss engineering), so it trained faster.
Not really important to your point, but it's not really just that it uses more game knowledge. Mostly it's that a small but dedicated community (especially lightvector) worked hard to build on what AlphaGo and LeelaZero did. Lightvector is a genius and put a lot of effort into KataGo. It wasn't just a matter of adding some game knowledge and stopping there. https://github.com/lightvector/KataGo?tab=readme-ov-file#tra... has a bunch of info if you're interested.
-
Monte-Carlo Graph Search from First Principles
Immediately recognise the author as the genius behind KataGo: https://github.com/lightvector/KataGo
- Request for help getting two specific outputs from the KataGo AI engine
-
KataGo should be partially resistant to cyclic groups now
(also, if you want to donate GPU time, https://katagotraining.org/ would be happy to have more people contributing to training as well!)
-
Man beats machine at Go in human victory over AI
> Kellin Pelrine, an American player who is one level below the top amateur ranking, beat the machine by taking advantage of a previously unknown flaw that had been identified by another computer. But the head-to-head confrontation in which he won 14 of 15 games was undertaken without direct computer support.
My take: what Kellin Pelrine really exploited is that the AI can't learn and adapt. Even GPT can't learn or adapt to anything beyond its context window. It took a computer to find and teach him the winning strategy, and it probably took a lot longer than AlphaGo did to train. But once he learned, he had the advantage; meanwhile AlphaGo never adapted and learned to counter the strategy itself, because it can't.
One thing to note is that he beat KataGo [1] and Leela Zero [2], but not AlphaGo or AlphaZero, because the AlphaGos aren't public. So it's possible he wouldn't actually beat the real AlphaZero with this strategy. But considering the strategy he used should in theory work against any model with AlphaGo/AlphaZero's design (he beat Leela Zero, which has the exact same design), and Leela Chess and Stockfish are apparently better than AlphaZero now, I think he would still win.
[1] https://github.com/lightvector/KataGo
[2] https://github.com/leela-zero/leela-zero
Experimentally, KataGo did also try some limited ways of using external data at the end of its June 2020 run, and has continued to do so into its most recent public distributed run, "kata1" at https://katagotraining.org/. External data is not necessary for reaching top levels of play, but still appears to provide some mild benefits against some opponents, and noticeable benefits in a useful analysis tool for a variety of kinds of situations that don't occur in self-play but that do occur in human games and games that users wish to analyze.
-
I wonder if these ChatGPT answers will ever get nuked
I've been using ChatGPT since launch and constantly seeking out examples of how others have been using it. A few years ago I started using KataGo with Sabaki to improve my go-playing abilities. I've known about token embeddings in neural networks before ChatGPT was a twinkle in OpenAI's eye. I was there, but I haven't seen everything you've seen, so please show me. If the truth is that ChatGPT has canned responses to some prompt or set of prompts, then I want to believe that it does. If I have misconceptions about anything, I want to break those misconceptions. As long as your beliefs and mine contradict one another, one of us has the opportunity to learn.
-
Human Go players beat top Go AIs using a "trick"
For some stuff besides LCB, see https://github.com/lightvector/KataGo/blob/master/docs/KataGoMethods.md for a summary of a few more recent other things KataGo added that hadn't been done in earlier bots.
-
DeepMind has open-sourced the heart of AlphaGo and AlphaZero
I'd suggest KataGo, which is much stronger and more actively developed than Leela Zero https://github.com/lightvector/KataGo
- KataGo changes training framework from TensorFlow to PyTorch
What are some alternatives?
neural_network_chess - Free Book about Deep-Learning approaches for Chess (like AlphaZero, Leela Chess Zero and Stockfish NNUE)
katrain - Improve your Baduk skills by training with KataGo!
online-go.com - Source code for the Online-Go.com web interface
adversarial-robustness-toolbox - Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
lizzie - Lizzie - Leela Zero Interface
leela-zero - Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.
nnue-pytorch - Stockfish NNUE (Chess evaluation) trainer in Pytorch
mars - Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
BadukMegapack - Installer for various AI Baduk softwares
mctx - Monte Carlo tree search in JAX