[Computer-go] Bayeisan/Probablistic Playouts in Computer Go
Alexander Terenin
aterenin at ucsc.edu
Mon Sep 29 14:35:13 PDT 2014
Thanks to everybody for the links! They have given me a good amount of stuff to look at that will help with the proposal.
Many of these are very much in the same spirit as what I am proposing, though most seem to be concerned primarily with the tree rather than the playouts. It’s interesting that starting the counters at 1 is equivalent to using a uniform prior. Similarly, it can be easily seen that with a Beta(prior wins, prior losses) prior, the posterior will be Beta(prior wins + observed wins, prior losses + observed losses), and thus the new expected value is just total wins / total runs, as is used everywhere.
Thank you again for your time and the links,
Alex
> On Sep 25, 2014, at 4:06 PM, Álvaro Begué <alvaro.begue at gmail.com> wrote:
>
> I believe this has been discussed in the mailing list before: If your prior distribution of the win rate of a move is uniform, after L losses and W wins the posterior distribution will be a beta distribution with alpha=W+1 and beta=L+1. The expected value of this distribution is alpha/(alpha+beta) = (W+1)/(W+L+2), which is equivalent to the common trick of starting the counters W and L at 1 instead of at 0.
>
> Of course one could start with a different prior, but I think staying within the family of beta distributions makes sense because it's very tractable.
>
> Is that the kind of thing you were looking for?
>
>
> Álvaro.
>
>
>
> On Thu, Sep 25, 2014 at 6:28 PM, Alexander Terenin <aterenin at ucsc.edu <mailto:aterenin at ucsc.edu>> wrote:
> Hello everybody,
>
> I’m a PhD student in statistics at the University of California, Santa Cruz who previously worked on the Go program Orego, currently in the process of applying for the NSF fellowship. I am working on a Bayesian statistics - related research proposal that I would like to use in my application, and wanted to know if someone was aware of any research related to my topic that has been done.
>
> Currently, it seems most MCTS-based Go programs, in the playouts, treat the strength (win rate) of each move as a fixed, unknown value, which is then estimated using frequentist techniques (specifically, by playing a random game, and taking the estimate to be wins / total runs). Has anyone attempted to instead statistically estimate the strength of each move using Bayesian techniques, by defining a set of prior beliefs about the strength of a certain move, playing a random game, and then integrating the information gained from the random game together with the prior beliefs using Bayes' Rule? Equivalently, has anyone defined the strength of each move to be a random variable rather than a fixed and unknown value? Without making this email too long, there’s some theoretical advantages that might allow for more information to be extracted from each playout if this setup is used.
>
> If you are aware of any work in this direction that has been done, I would love to hear from you! I’ve been looking through a variety of papers, and have yet to find anything - it seems that any work remotely related to Bayes’ Rule has concerned the tree, not the playouts.
> Thank you in advance,
>
> Alex Terenin
> aterenin at ucsc.edu <mailto:aterenin at ucsc.edu>
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org <mailto:Computer-go at dvandva.org>
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go <http://dvandva.org/cgi-bin/mailman/listinfo/computer-go>
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://computer-go.org/pipermail/computer-go/attachments/20140929/386d5c1f/attachment.html>
More information about the Computer-go
mailing list