[Computer-go] evaluating number of wins versus losses

Peter McKenzie peter.mckenzie at gmail.com
Tue Mar 31 12:12:37 PDT 2015


On Mon, Mar 30, 2015 at 7:09 AM, Petr Baudis <pasky at ucw.cz> wrote:

> On Mon, Mar 30, 2015 at 09:11:52AM -0400, Jason House wrote:
> > The complex formula at the end is for a lower confidence bound of a
> > Bernoulli distribution with independent trials (AKA biased coin flip) and
> > no prior knowledge. At a leaf of your search tree, that is the most correct
> > distribution. Higher up in a search tree, I'm not so sure that's the
> > correct distribution. For a sufficiently high number of samples, most
> > averaging processes converge to a Normal distribution (due to the central
> > limit theorem). For a Bernoulli distribution with a mean near 50% the
> > required number of samples is ridiculously low.
> >
> > I believe a lower confidence bound is probably best for final move
> > selection, but UCT uses an upper confidence bound for tree exploration. I
> > recommend reading the paper, but it uses a gradually increasing confidence
> > interval which was shown to be an optimal solution for the multi-armed
> > bandit problem. I don't think that's the best model for computer go, but
> > the success of the method cannot be denied.
> >
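For readers who have not seen it written out, the upper confidence bound mentioned above can be sketched roughly as follows. This is the textbook UCB1 rule, not code from any particular Go program. The function names, the (wins, visits) representation and the exploration constant c = sqrt(2) are assumptions made for illustration.

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1 score for one child node.

    c is an assumed exploration constant; real programs tune it.
    Unvisited children score +inf so that each is tried at least once.
    """
    if visits == 0:
        return math.inf
    mean = wins / visits
    # The bonus grows slowly with the parent's visit count: this is the
    # "gradually increasing confidence interval".
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus

def select_child(children):
    """children: list of (wins, visits) pairs. Returns the index to explore next."""
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: ucb1(children[i][0], children[i][1], parent_visits),
    )
```

With equal visit counts the child with the higher win rate is picked; a child that has been neglected gets a growing bonus and is eventually revisited.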
> > The strongest programs have good "prior knowledge" to initialize wins and
> > losses. My understanding is that they use average win rate directly
> > (incorrect solution #2) instead of any kind of confidence bound.
> >
> > TL;DR: Use UCT until your program matures
>
> The strongest programs often use RAVE or LGRF or something like that,
> with or without the UCB for tree exploration.
>
> For selecting the final move, the move with most simulations is used.
>

What about the optimization that selects the move with the most wins?
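To make the difference between the criteria concrete, here is a small sketch. The move names and counts are made up for illustration, and the dict-of-(wins, simulations) layout is an assumption, not how any specific engine stores its root statistics.

```python
def most_simulations(stats):
    """Pick the move with the most simulations (the rule described above)."""
    return max(stats, key=lambda move: stats[move][1])

def most_wins(stats):
    """Pick the move with the largest absolute number of wins."""
    return max(stats, key=lambda move: stats[move][0])

def highest_win_rate(stats):
    """Pick the move with the best raw win rate, ignoring sample size."""
    return max(stats, key=lambda move: stats[move][0] / stats[move][1])

# Hypothetical root statistics: move -> (wins, simulations).
stats = {
    "A": (60, 100),
    "B": (58, 120),
    "C": (9, 10),
}

print(most_simulations(stats), most_wins(stats), highest_win_rate(stats))
```

On these made-up numbers the three rules disagree: B has the most simulations, A the most wins, and C the best win rate on only 10 samples.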


> (Using the product reviews analogy - assume all your products go on sale
> at once, have the same price, shipping etc., then with number of buyers
> going to infinity, the best product should get the most buyers and
> ratings even if some explore other products.)  I think trying the Wilson
> lower bound could be also interesting, but the inconvenience is that you
> need to specify some arbitrary confidence level.
>
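A minimal sketch of the Wilson lower bound, to show where the arbitrary confidence level enters. The function name is hypothetical, and z = 1.96 (roughly a 95% two-sided interval) is just an assumed default. Returning 0.0 for a move with no simulations is also an assumption.

```python
import math

def wilson_lower_bound(wins, visits, z=1.96):
    """Lower end of the Wilson score interval for a win rate.

    z is the normal quantile for the chosen confidence level
    (1.96 is an assumed default, about 95% two-sided).
    """
    if visits == 0:
        return 0.0  # assumed convention for unvisited moves
    p = wins / visits
    z2 = z * z
    centre = p + z2 / (2 * visits)
    margin = z * math.sqrt(p * (1 - p) / visits + z2 / (4 * visits * visits))
    return (centre - margin) / (1 + z2 / visits)
```

The bound rises with more samples at the same win rate (90/100 scores higher than 9/10), and it also moves when z changes, so the ranking of moves can depend on the confidence level picked.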
> > On Mar 30, 2015 8:06 AM, "folkert" <folkert at vanheusden.com> wrote:
> > > --
> > > Finally want to win in the MegaMillions lottery? www.smartwinning.info
>
> funny in the context :)
>
> --
>                                 Petr Baudis
>         If you do not work on an important problem, it's unlikely
>         you'll do important work.  -- R. Hamming
>         http://www.cs.virginia.edu/~robins/YouAndYourResearch.html