[computer-go] transposition

Brian Slesinsky brian at slesinsky.org
Fri May 11 19:10:50 PDT 2007


On 5/11/07, Jason House <jason.james.house at gmail.com> wrote:

> Consider a node who's had one child extensively evaluated through
> transpositions.  When UCT does come to visit it, the sqrt(sum(child
> simulations)) will be very high.  That will artificially favor exploration
> of children with less simulations.  That will have a continuing negative
> impact on the exploitation.
>
> ... but then once the parent gets updated with the sum(child simulations),
> UCT may simply stop searching that branch for a long time... Because it
> likely had a low average win rate, and now looks to be extensively explored.
>  When UCT eventually does come back to it, the problem in the previous
> paragraph will still apply.

So the issue is that if UCT is given, a priori, a tree that's already
been explored (say by some other algorithm) in a highly imbalanced
order, it isn't able to rebalance?  It seems like there ought to be a
way to make it use whatever information it's been given in a smarter
way.  After all, knowing that someone else explored one of the nodes
1000 times does mean something.

Another issue (or maybe another way of putting it): a parent may get a
large number of "virtual simulations" for free via differing paths to
the same descendants, but these simulations can't be considered
independent when calculating probabilities and confidence.

On the other hand, if you don't have transposition tables, some paths
will be explored that happen to mirror each other and yet the
algorithm won't notice the correlation and consider them as
independent runs when calculating probabilities. It seems like knowing
about a correlation should make it possible to estimate the true
probability better than not knowing it.


More information about the computer-go mailing list