[computer-go] Improvement of UCT search algorithm

Don Dailey drd at mit.edu
Mon Oct 9 07:35:36 PDT 2006


On Mon, 2006-10-09 at 10:21 +0200, sylvain.gelly at m4x.org wrote:
> The solution given by Don (giving more time until the best move and
> the most 
> sampled are the same) seems quite good. But we didn't try (no time).
> Don, did this solution give you significant improvements? 

I never did a hard core test but I'm quite sure it was a significant
improvement.   Let me be more specific on what I do:

  1.  If the score is above -0.90 and below 0.90,  I require that the
      best move be the most sampled move.

  2.  If not, I extend the thinking time 2X.

  3.  But I abort the extended search as soon as the best scoring move
      is also the most sampled move.

If you want to be more conservative,  just require that the move with
the best score is "almost" the most sampled move.   "Almost" could be
defined as some percentage of the most sampled move.

If you don't want to extend your thinking time,  just use most sampled
(which I believe is a safer choice) or you could use the "best move"
that is also sampled at least N percentage as much as the most sampled.

>From observing my log files, I can see that this does in fact save the
day more often than you would think.  It is fairly common for the best
scoring move to only be temporary and not really the best move.

I'm wondering if there is way to extend this idea recursively or whether
there is any benefit to doing so.   Because this same exact problem
exists inside the tree too.   I doubt there is a lot to be gained here,
but it's something I may look in to at some point.

What I would prefer is a time-control scheme that is even more dynamic.
Right now I allocate progressively less time to each move as the game
proceeds, and that is a good thing in general,  but often you waste a
lot of time considering whether to play the only obvious move and other
times you play too quickly when the right move needs to be carefully
considered.    I believe there is much to be gained here too.

- Don







More information about the computer-go mailing list