[Computer-go] Monte-Carlo Tree Search in other games

Heikki Levanto heikki at lsd.dk
Sun Oct 31 04:44:53 PDT 2010


On Thu, Oct 28, 2010 at 10:15:46AM +0200, Olivier Teytaud wrote:
>
> From: Olivier Teytaud <olivier.teytaud at lri.fr>
> > After the program plays all simulations, which move should it choose?
> >
> > (Wins/Visits) + SQRT(ln(...))
> > or
> > (Wins+Draw/2)/Visits + SQRT(ln(...))
> >
> >
> Neither of these two formulas :-)
> These formulas are for choosing moves to be simulated. For turn-based games,
> when all simulations are finished, we should choose
> 
> move = argmax_m number_of_simulations(m)
> 
> or something like that (you can introduce a bias built from the success
> rate...).

I have been thinking that it might make sense to take the move with the
highest *lower* confidence bound and play that move. In a way, that would
be the move we have least reason to believe will be a blunder.
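
To make the difference concrete, here is a rough Python sketch of both
final-move rules. It is only an illustration under assumptions: the stats
layout (move -> (wins, visits)), the function names, the constant c, and
the UCB1-style width sqrt(ln(N)/n) are all my own choices, not something
taken from a real program, and the numbers are made up.

```python
import math

def most_visited(stats):
    """Return the move with the largest number of simulations."""
    return max(stats, key=lambda m: stats[m][1])

def lower_confidence_bound(wins, visits, total_visits, c=1.0):
    """Win rate minus an exploration-style width (assumed UCB1-like form)."""
    if visits == 0:
        return float("-inf")  # an unvisited move gives us no evidence at all
    return wins / visits - c * math.sqrt(math.log(total_visits) / visits)

def highest_lcb(stats, c=1.0):
    """Return the move whose lower confidence bound is highest."""
    total = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda m: lower_confidence_bound(*stats[m], total, c))

# Made-up root statistics: move -> (wins, visits)
stats = {"a": (300, 400), "b": (9, 10), "c": (270, 500)}

print(most_visited(stats))  # the move simulated most often
print(highest_lcb(stats))   # the move with the best pessimistic estimate
```

With these numbers the two rules disagree: "c" has the most simulations,
"a" has the highest lower bound, and "b" has the best raw win rate but far
too few visits to be trusted. Whether the lower bound actually plays
better is exactly what would need testing.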

But I am just a programmer, not a mathematician, so what do I know. If I had
more time and energy, I'd run many experiments. But at the moment, all I can
afford is to follow this list...


  - Heikki


-- 
Heikki Levanto   "In Murphy We Turst"     heikki (at) lsd (dot) dk



