[Computer-go] Monte-Carlo Tree Search in other games
heikki at lsd.dk
Sun Oct 31 04:44:53 PDT 2010
On Thu, Oct 28, 2010 at 10:15:46AM +0200, Olivier Teytaud wrote:
> From: Olivier Teytaud <olivier.teytaud at lri.fr>
> > After the program plays all simulations, which move should it choose?
> > (Wins/Visits) + SQRT(ln(...))
> > or
> > (Wins+Draw/2)/Visits + SQRT(ln(...))
> Neither of these two formulas :-)
> These formulas are for choosing moves to be simulated. For turn-based games,
> when all simulations are finished, we should choose
> move = argmax_m number_of_simulations(m)
> or something like that (you can introduce a bias built from the success
I have been thinking that it might make sense to play the move with the
highest *lower* confidence bound. In a way, that would be the move we have
the least reason to believe is a blunder.
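To make the two final-move rules concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the per-move statistics are a made-up dict of (wins, draws, visits), the exploration term is assumed to have the usual UCB1-style form sqrt(ln(N)/n) since the formulas quoted above elide it, draws are counted as half a win, and the constant c and all function names are hypothetical.

```python
import math

def most_visited(stats):
    """Pick the move with the largest number of simulations (argmax over visits)."""
    return max(stats, key=lambda m: stats[m][2])

def lower_confidence_bound(wins, draws, visits, total_visits, c=1.0):
    """Mean score minus an exploration-style margin.

    Assumption: the margin is c * sqrt(ln(N) / n), mirroring the UCB1 bonus
    but subtracted instead of added. Draws count as half a win.
    """
    if visits == 0:
        return float("-inf")  # an unvisited move gives us no evidence at all
    mean = (wins + 0.5 * draws) / visits
    return mean - c * math.sqrt(math.log(total_visits) / visits)

def best_by_lcb(stats, c=1.0):
    """Pick the move whose lower confidence bound is highest."""
    total = sum(visits for _, _, visits in stats.values())
    return max(stats, key=lambda m: lower_confidence_bound(*stats[m], total, c))

# Made-up root statistics: move -> (wins, draws, visits)
stats = {
    "a": (50, 0, 100),  # most simulations, mediocre win rate
    "b": (72, 0, 90),   # slightly fewer simulations, much better win rate
    "c": (3, 0, 3),     # perfect win rate, but almost no evidence
}

print(most_visited(stats))  # a
print(best_by_lcb(stats))   # b
```

With these invented numbers the two rules disagree: the visit count picks "a", while the lower bound picks "b" and still rejects "c", whose perfect score rests on only three simulations. Whether that behaviour actually plays stronger is exactly what the experiments would have to show.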
But I am just a programmer, not a mathematician, so what do I know. If I had
more time and energy, I'd run many experiments. But at the moment, all I can
afford is to follow this list...
Heikki Levanto "In Murphy We Turst" heikki (at) lsd (dot) dk