[computer-go] Monte Carlo combined with minimax search
Don Dailey
drd at mit.edu
Sun Jul 23 06:05:09 PDT 2006
On Sun, 2006-07-23 at 09:25 +0200, Rémi Coulom wrote:
>
> When I switched from using territory to probability of winning, Crazy
> Stone changed from scoring 36% against GNU Go 3.6 at level 10 to scoring
> more than 60%, at 16 minutes per game, single CPU.
In the paper http://www.cs.unimaas.nl/g.chaslot/mcscg.pdf , one of the
footnotes says: "Some programmers use the average of the final scores,
others use the winning percentage. It seems that many factors impact
this choice. In our program, the average of the final scores leads to a
better program."
To me this is a mystery. I, too, found that winning percentage is so
much stronger in comparison that it's not even close.
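To make the two criteria concrete, here is a minimal sketch of how they can
disagree on the same playout results. The move names and the score lists are
made up for illustration; they are not taken from Crazy Stone or from the
paper, and a real program would gather these numbers from random playouts.

```python
from statistics import mean

def mean_score(scores):
    """Territory criterion: average final margin over all playouts."""
    return mean(scores)

def win_rate(scores):
    """Winning-percentage criterion: fraction of playouts with a positive margin."""
    return sum(1 for s in scores if s > 0) / len(scores)

def best_move(playouts, evaluator):
    """Pick the candidate move whose playout results score highest."""
    return max(playouts, key=lambda move: evaluator(playouts[move]))

# Hypothetical final margins (komi already included) from the program's side.
playouts = {
    "greedy": [40, 35, 30, -5, -5, -10, -5, -10, 45, -5],  # big wins, many small losses
    "safe":   [2, 1, 3, 1, -1, 2, 1, 2, 1, 3],              # small wins, almost always
}

if __name__ == "__main__":
    for name, evaluator in (("mean score", mean_score), ("win rate", win_rate)):
        print(name, "prefers", best_move(playouts, evaluator))
```

With these numbers the average-score criterion prefers "greedy" and the
winning-percentage criterion prefers "safe", even though both see exactly the
same playouts.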
My theory on why some still believe in territory is that if you look at
some of the moves of a "winning percentage" based program, you might
easily conclude that it is playing worse.
You also cannot test it with conventional problem sets because problem
test positions usually isolate a tactical theme and it's understood that
the goal is to "win that group" or "defend this group." A "winning
percentage" based program doesn't care about any tactic that is isolated
from the goal of winning the game.
You might think that "winning that group" is pretty much always
compatible with winning the game but you would be wrong! When I first
discovered this wonderful idea I kept trying to "fix" it - I would find
positions where it stupidly failed to defend or attack but with a lot of
agony I discovered that it wasn't actually very relevant in a large
percentage of the cases. I could always FIX the problem by adding a
few stones somewhere to make it so that winning or defending the group
was more relevant to winning or losing the game.
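A rough way to see why the group only matters when the game is close: the
sketch below assumes, purely for illustration, that the final margin is
normally distributed around the expected lead with a fixed spread. That model,
the function names and all the numbers are my assumptions, not something
measured from a real program.

```python
import math

def win_probability(expected_margin, sigma):
    """P(final margin > 0), assuming a normal distribution (an assumption)."""
    return 0.5 * (1.0 + math.erf(expected_margin / (sigma * math.sqrt(2.0))))

def gain_from_saving_group(lead_if_group_lost, group_value, sigma=10.0):
    """Change in win probability from saving a group worth group_value points."""
    saved = win_probability(lead_if_group_lost + group_value, sigma)
    lost = win_probability(lead_if_group_lost, sigma)
    return saved - lost

if __name__ == "__main__":
    # Hypothetical leads: far ahead, close game, far behind.
    for lead in (40, -10, -60):
        gain = gain_from_saving_group(lead, group_value=20)
        print(f"lead without group {lead:+d}: saving it adds {gain:.4f}")
```

Under these assumptions a 20 point group is worth almost nothing in winning
probability when the program is far ahead or far behind, and a great deal when
the game hinges on it. Adding a few stones to a test position amounts to
moving the lead into that middle range.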
Finally, the first time I implemented this and played on KGS, someone
criticized some of its moves in a private email. They gave me a
little analysis of "why this move was bad", or "why it should have
played a different move." I looked in my log files, and discovered
that the program was LOSING the game no matter what move it played. At
that moment I realized that in a sense, the program was the smart one -
the player doing the analysis was the delusional one and his wonderful
analysis was totally irrelevant.
I think there is a bit of cultural bias here too. We want to see our
program "put up a fight" even when losing is inevitable and "keep the
score close." Or when it's winning we all love to see a program try to
"mop up" and own the whole board if possible, as a matter of pride. But
a "winning percentage" based program has no vanity or ego and doesn't
care about all that superficial nonsense!
- Don