[computer-go] UCT
Rémi Coulom
Remi.Coulom at univ-lille3.fr
Fri Nov 3 08:10:55 PST 2006
Peter Drake wrote:
> I've decided to bite the bullet and implement UCT in Orego, since (a)
> everyone else appears to be using it and (b) not being a statistician,
> improving on this is probably not where I'm likely to make a
> contribution.
>
> Here's my understanding of UCT. It is an algorithm for choosing the
> best move from a given parent node. Let
>
> p = # of runs through parent node
> r = # of runs through move in question
> w = # of wins through move in question
>
> UCT proposes that we choose the node with the highest sum
>
> average + uncertainty
>
> where
>
> average = w/r
>
> and
>
> uncertainty = sqrt(2 * logn(p) / r)
>
> Is this correct? Are there any things to watch out for?
You can try multiplying uncertainty by a well-chosen constant value.
This way, you can tune how selective your search is. I found that using
a constant < 1 improves the search on 9x9 for Crazy Stone (I use
1/sqrt(10), if I recall correctly). I wonder what is the experience of
other UCT programmers, by the way.
>
> Also, are people pruning any move for which (average + uncertainty) is
> less than the (average - uncertainty) of some other move?
I am not sure what you mean. In UCT, you only search the move that has
the highest average + uncertainty, and you don't search any other at all.
-- Rémi
More information about the computer-go
mailing list