[computer-go] analysis of UCT and other bandit algorithms for tree search

Jason House jason.james.house at gmail.com
Wed Apr 25 11:00:26 PDT 2007


On 4/25/07, Remi Munos <remi.munos at inria.fr> wrote:

> > When I did sit down and read the paper for real, I saw that both of
> those
> > things were just building up the support for the BAST algorithm.  (a) is
> to
> > justify accepting a higher regret (in exchange for better worst case
> > performance).
>
> well, this is not really accepting higher regret. I would say that this is
> to
> guarantee lower regret in all cases. UCT has (non-asymptotic, ie. for a
> fixed
> stage n) low regret (in the worst case) only when the tree is very smooth,
> ie. when the branches that appear good (from obtained rewards so far) are
> actually good and the branches that appear bad are truly bad. Of course, I
> don't know how much true this is for go. But in bad cases, UCT is terribly
> bad! Basically, one cannot expect UCT to have (for a fixed n) a regret
> independent of D (depth of the tree) unless the tree is smooth with a
> smoothness constant = 0 (ie. all values are the same!). I see BAST as a
> natural extension of UCT when the tree is not that smooth. If no
> smoothness
> exists at all, I think flat-UCB is the best (for a given tree of depth D).



I saw the analysis of regret = log(n) for UCT and regret = sqrt(n) for the
modified UCB.  I then assumed that sqrt(n) >> log(n) for most n.  The
graphical example would find (D-1)/D almost immediately.  I then assumed the
regret for normal UCT was better up until the modified UCT found the 1.

I suspect that this discussion hits on the main point of the paper.  When I
read the paper, the focus on the worst case performance overshadowed general
applicability to a larger variety of trees.  The additional description of
the variance in the L=0 results (for normal UCT and the BAST variation on
it) would do a lot to help me appreciate the results in the paper.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://computer-go.org/pipermail/computer-go/attachments/20070425/8f450da8/attachment.htm


More information about the computer-go mailing list