[Computer-go] Bayeisan/Probablistic Playouts in Computer Go

Álvaro Begué alvaro.begue at gmail.com
Thu Sep 25 16:06:06 PDT 2014


I believe this has been discussed in the mailing list before: If your prior
distribution of the win rate of a move is uniform, after L losses and W
wins the posterior distribution will be a beta distribution with alpha=W+1
and beta=L+1. The expected value of this distribution is alpha/(alpha+beta)
= (W+1)/(W+L+2), which is equivalent to the common trick of starting the
counters W and L at 1 instead of at 0.

Of course one could start with a different prior, but I think staying
within the family of beta distributions makes sense because it's very
tractable.

Is that the kind of thing you were looking for?


Álvaro.



On Thu, Sep 25, 2014 at 6:28 PM, Alexander Terenin <aterenin at ucsc.edu>
wrote:

> Hello everybody,
>
> I’m a PhD student in statistics at the University of California, Santa
> Cruz who previously worked on the Go program Orego, currently in the
> process of applying for the NSF fellowship. I am working on a Bayesian
> statistics - related research proposal that I would like to use in my
> application, and wanted to know if someone was aware of any research
> related to my topic that has been done.
>
> Currently, it seems most MCTS-based Go programs, in the playouts, treat
> the strength (win rate) of each move as a fixed, unknown value, which is
> then estimated using frequentist techniques (specifically, by playing a
> random game, and taking the estimate to be wins / total runs). Has anyone
> attempted to instead statistically estimate the strength of each move using
> Bayesian techniques, by defining a set of prior beliefs about the strength
> of a certain move, playing a random game, and then integrating the
> information gained from the random game together with the prior beliefs
> using Bayes' Rule? Equivalently, has anyone defined the strength of each
> move to be a random variable rather than a fixed and unknown value? Without
> making this email too long, there’s some theoretical advantages that might
> allow for more information to be extracted from each playout if this setup
> is used.
>
> If you are aware of any work in this direction that has been done, I would
> love to hear from you! I’ve been looking through a variety of papers, and
> have yet to find anything - it seems that any work remotely related to
> Bayes’ Rule has concerned the tree, not the playouts.
> Thank you in advance,
>
> Alex Terenin​
> aterenin at ucsc.edu> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://computer-go.org/pipermail/computer-go/attachments/20140925/dbb8c298/attachment.html>


More information about the Computer-go mailing list