[computer-go] Monte Carlo combined with minimax search
Magnus Persson
magnus.persson at phmp.se
Sun Jul 23 02:21:52 PDT 2006
Quoting Rémi Coulom <Remi.Coulom at univ-lille3.fr>:
> Peter Drake wrote:
>> On Jul 22, 2006, at 11:19 PM, Rémi Coulom wrote:
>>> I am not sure that a continuum would be better than using the
>>> probability of winning all the time. At least, I am certain that
>>> using the probability of winning all the time is much better than
>>> using expected territory all the time.
>>
>> Can you say more on this?
>
> When I switched from using territory to probability of winning, Crazy
> Stone changed from scoring 36% against GNU Go 3.6 at level 10 to
> scoring more than 60%, at 16 minutes per game, single CPU.
I found the same effect for Viking4. I do not have any numbers, but I have
never seen such a huge improvement of a program (and that with only two lines
of code).
Here is my explanation. If expected territory is used, the program will be
greedy in won positions, that is, it will play moves that are risky. For
example, it will prefer a move that kills a big opponent group 2 times out of
3, for an expected win of +10, but loses the game otherwise, over a safe move
that wins for certain by about +5 points. Also, in positions where it is
clearly losing, the expected territory evaluation will just happily minimize
the loss and do nothing to win. A program using probability of winning will
instead play tricky moves that have some chance of winning.
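The won-position example can be checked with a few lines of arithmetic. This
is only a sketch: the margins of +30 and -30 for the risky move are made-up
numbers chosen so that its expected territory comes out at +10, and the names
expected_territory and win_probability are illustrative, not taken from
Viking or Crazy Stone.

```python
from fractions import Fraction

# Hypothetical outcome distributions: (probability, final margin in points).
# A positive margin means the program wins the game.
risky = [(Fraction(2, 3), 30), (Fraction(1, 3), -30)]  # kill succeeds / fails
safe = [(Fraction(1), 5)]                              # certain small win


def expected_territory(outcomes):
    """Probability-weighted average of the final margin."""
    return sum(p * margin for p, margin in outcomes)


def win_probability(outcomes):
    """Total probability of the outcomes with a positive margin."""
    return sum(p for p, margin in outcomes if margin > 0)


for name, outcomes in (("risky", risky), ("safe", safe)):
    print(name, expected_territory(outcomes), win_probability(outcomes))
```

With these numbers the risky move has the higher expected territory (10
against 5), while the safe move has the higher probability of winning (1
against 2/3).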
It is difficult to explain this in words, but with paper and pencil it is easy
to construct examples of territory evaluation distributions for two
alternative moves where the move with the highest mean territory is clearly
one that one would not want to play.
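One such paper-and-pencil example, for a lost position, might look like the
sketch below. The playout margins are invented for illustration, and the
helper names (mean_margin, win_rate, best_move) are assumptions rather than
code from any actual program.

```python
# Hypothetical playout results (final margins) for two candidate moves in a
# position the program is losing. A positive margin means a win.
solid = [-3, -2, -3, -2, -3, -2, -3, -2, -3, -2]       # always a narrow loss
tricky = [-25, -20, 4, -30, -22, 2, -28, -24, 3, -26]  # mostly a big loss


def mean_margin(samples):
    """Average final margin over the playouts (expected territory)."""
    return sum(samples) / len(samples)


def win_rate(samples):
    """Fraction of playouts that ended with a positive margin."""
    return sum(1 for margin in samples if margin > 0) / len(samples)


def best_move(candidates, evaluate):
    """Name of the candidate whose samples score highest under evaluate."""
    return max(candidates, key=lambda name: evaluate(candidates[name]))


candidates = {"solid": solid, "tricky": tricky}
print("by mean margin:", best_move(candidates, mean_margin))
print("by win rate:   ", best_move(candidates, win_rate))
```

Here the mean margin picks the solid move, which never wins a single playout,
while the win rate picks the tricky move, which wins 3 playouts out of 10.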
I thought for a long time that something complicated, taking the whole
distribution of evaluated territory into account, would be needed to get
these properties, but it turned out that simply picking the move that is most
likely to win gives the best move. I also did some experiments with mixing in
some territory evaluation, but it always led to worse play.
Best
Magnus
--
Magnus Persson, 2 Dan
Zapp at KGS
Author of the go program Viking