[computer-go] analysis of UCT and other bandit algorithms for tree search

Jason House jason.james.house at gmail.com
Wed Apr 25 06:49:53 PDT 2007


On 4/25/07, Remi Munos <remi.munos at inria.fr> wrote:
>
> > 3. The goal is to find the optimal leaf rather than the optimal subtree.
>
> I'm not sure to understand this comment. Actually the goal is to find a
> close-to optimal play at the root level. Thus finding a close-to-optimal
> min-max leaf achieves that purpose.
>

I think that was based off of two things...
a) The analysis of max time to "find the 1", and
b) The pure bandit problem applied directly to the leaves of the tree.

When I did sit down and read the paper for real, I saw that both of those
things were just building up the support for the BAST algorithm.  (a) is to
justify accepting a higher regret (in exchange for better worst case
performance).  (b) is showing the non-UCT extreme of the BAST algorithm
(L=infinity).  (If I understand correctly, L=0 is UCT)

PS: I was surprised that the bar chart near the end had L=infinity, L=20,
and L=7, but not L=0 plotted on it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://computer-go.org/pipermail/computer-go/attachments/20070425/707dc143/attachment.htm


More information about the computer-go mailing list