I realized there was a really simple way to implement Counterfactual Regret Minimization (CFR) with a value neural network trained from self-play.
The code is on https://github.com/thomasahle/liars-dice . I will try to write a blog post about how it works later.
It starts learning from completely random play, but after about a million games the model is close to the Nash equilibrium. The PyTorch model is converted to ONNX and runs entirely in the browser.
Hi Jens! I also worked on it a few years ago, trying to solve the smaller cases using Linear Programming: https://github.com/thomasahle/snyd . Those small cases are the most common "one on one" situations I've encountered, since usually at pubs there are more than two players when the number of dice is large! :)
I also found the CFR papers really confusing for a long time. The papers have a lot of strange notation. But I think I finally got it, and it's very simple :-)
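For anyone else stuck on the notation, here is a rough sketch of regret matching, the per-decision update that CFR builds on. It is not the code from the liars-dice repo and it is not full CFR (no game tree, no neural network): it just runs self-play on rock-paper-scissors. The names `regret_matching`, `train` and `PAYOFF` are made up for this example, and the lopsided starting regrets are an assumption added so that play does not begin at the equilibrium.

```python
# Row player's payoff in rock-paper-scissors (zero-sum: column player gets the negative).
# Action order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret anywhere: fall back to uniform play.
    return [1.0 / len(regrets)] * len(regrets)


def train(iterations, payoff=PAYOFF):
    """Self-play with regret matching; returns each player's average strategy."""
    n = len(payoff)
    # Assumption for the demo: biased starting regrets, so the players
    # initially favour rock and paper respectively.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * n for _ in range(2)]

    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        # Expected value of every action against the opponent's current strategy.
        action_values = [
            [sum(payoff[a][b] * strategies[1][b] for b in range(n)) for a in range(n)],
            [sum(-payoff[a][b] * strategies[0][a] for a in range(n)) for b in range(n)],
        ]
        for p in range(2):
            expected = sum(s * v for s, v in zip(strategies[p], action_values[p]))
            for a in range(n):
                # Regret: how much better action a would have done than the current mix.
                regrets[p][a] += action_values[p][a] - expected
                strategy_sums[p][a] += strategies[p][a]

    # The average strategy (not the last one) is what approaches the equilibrium.
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, strategy in enumerate(train(10_000)):
        print(player, [round(p, 3) for p in strategy])
```

The average strategies should drift towards the uniform mix, which is the Nash equilibrium of rock-paper-scissors. Roughly speaking, CFR applies this same update at every information set of a larger game, with the action values coming from the subtree below (or, in the approach described above, from a learned value network).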