[computer-go] Optimal explore rates for plain UCT

Christoph Birk birk at ociw.edu
Thu Mar 13 11:25:04 PDT 2008


On Thu, 13 Mar 2008, Petr Baudis wrote:
> So I have created this page:
>
> 	http://senseis.xmp.net/?CGOSBasicUCTBots
>
> and summed up what I could find in the thread about the various bots.
> Please clarify if anything there is wrong / unknown, and add your bots
> if they aren't there. I wanted to add Fluke too, but I do not know which
> of the many incarnations should I choose. :-)

I am not sure if we have the same understanding of node expansion. myCtest
does not really expand the parent node ... let me explain what I am doing:

While descending the UCT tree (root at the top):
  if current-node is a leaf
    if number of visits is at least MIN_VISITS then
      determine all legal moves and create children nodes
      choose a random child and descend
    endif
    run a random playout and propagate score upwards
  else
    calculate UCT score = win-ratio + C * sqrt(log(n)/m)
    descend to "best" child
  endif
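
The steps above can be sketched in Python roughly as follows. This is only
a sketch, not the myCtest source: the toy game (one pile of stones, take 1
to 3, whoever takes the last stone wins), the names Node, legal_moves,
random_playout, uct_score and simulate, and the rule that unvisited
children are tried first are all assumptions made for illustration. In the
score, n is taken to be the visit count of the parent and m that of the
child.

```python
import math
import random

C = 0.5          # exploration constant, same value as in the results table
MIN_VISITS = 5   # visits a leaf needs before its children are created

class Node:
    def __init__(self, stones, to_move):
        self.stones = stones      # toy game state: stones left in the pile
        self.to_move = to_move    # player (0 or 1) who moves from here
        self.children = []        # an empty list means the node is a leaf
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def legal_moves(stones):
    return [k for k in (1, 2, 3) if k <= stones]

def random_playout(stones, to_move, rng):
    """Play random moves to the end; the winner takes the last stone."""
    if stones == 0:
        return 1 - to_move        # the previous player took the last stone
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return to_move
        to_move = 1 - to_move

def uct_score(parent, child):
    if child.visits == 0:
        return math.inf           # assumption: try unvisited children first
    win_ratio = child.wins / child.visits
    return win_ratio + C * math.sqrt(math.log(parent.visits) / child.visits)

def simulate(root, rng):
    """One descent from the root, one random playout, one backup."""
    node, path = root, [root]
    while True:
        if not node.children:     # current node is a leaf
            moves = legal_moves(node.stones)
            if node.visits >= MIN_VISITS and moves:
                # create all children, then descend into a random one
                node.children = [Node(node.stones - k, 1 - node.to_move)
                                 for k in moves]
                node = rng.choice(node.children)
                path.append(node)
            winner = random_playout(node.stones, node.to_move, rng)
            break
        node = max(node.children, key=lambda ch: uct_score(node, ch))
        path.append(node)
    for n in path:                # propagate the score upwards
        n.visits += 1
        if winner == 1 - n.to_move:
            n.wins += 1
```

Calling simulate(root, random.Random(0)) repeatedly on
root = Node(5, 0) grows the tree; how the final move is then picked
(most visits, best win ratio, ...) is not covered by the description above.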

> Curiously, while pachi1 with 10k playouts is 30 ELO weaker than
> drdGeneric-10k and myCtest-10k-UCT (it seems like ~1230 is _the_ rating
> for 10k UCT), with 50k playouts it is 60 ELO stronger than
> myCtest-V-0003 - is that one really just UCT with 50k playouts?

Name               #playouts    C      MIN_VISITS     ELO
myCtest-10k-UCT    10k          0.5    50             1228

myCtest-V-0020     50k          0.5    50             1459
myCtest-V-0021     50k          0.5    25             1483
myCtest-V-0022     50k          0.5    10             1467
myCtest-V-0023     50k          0.5    5              1523
myCtest-V-0024     50k          0.5    2              ?

My explanation is that with fewer playouts the reduced noise from
a larger MIN_VISITS matters more, while with more playouts the
deeper search tree from a smaller MIN_VISITS improves play.
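
To put a rough number on the noise half of that argument: if the playouts
through a leaf are treated as independent coin flips with win probability
p (an assumption, with p = 0.5 picked only as an example), the standard
error of the win ratio after MIN_VISITS playouts is sqrt(p*(1-p)/MIN_VISITS).
The helper name win_ratio_std_error is made up for this sketch, and the
sketch says nothing about the tree-depth half of the trade-off.

```python
import math

def win_ratio_std_error(visits, p=0.5):
    """Standard error of a win ratio estimated from `visits` independent playouts."""
    return math.sqrt(p * (1.0 - p) / visits)

# MIN_VISITS values from the table above
for min_visits in (50, 25, 10, 5, 2):
    print(min_visits, round(win_ratio_std_error(min_visits), 3))
```

Under these assumptions a leaf expanded after 5 visits has a win-ratio
estimate about three times noisier than one expanded after 50.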

Christoph

