[computer-go] Re: Should 9x9 komi be 8.0 ?

Magnus Persson magnus.persson at phmp.se
Tue Mar 4 10:23:59 PST 2008


Quoting Christoph Birk <birk at ociw.edu>:

> On Tue, 4 Mar 2008, Magnus Persson wrote:
>> But here you are missing the point that close to 0% winning   
>> probability means that it cannot win against random play. The   
>> opponent could lose only by killing his own groups.
>
> I don't know why you (and Don) keep bringing up the 0% against random
> play ...

Maybe we should all stop discussing hypothetical positions that have
properties that fit our own arguments.

I mean, if there is a typical situation then please post it so we can
see what different programs do in that situation and why.

> I am talking about a (typical) situation in the endgame
> where best play (as seen from the program) leads to a sure 0.5 pt loss.
> Many MC programs will make unreasonable attempts of winning by choosing
> a line that shows a possible win (10 pt) if the opponent makes a
> (stupid) mistake. Instead they should go for the (supposedly sure)
> 0.5 pt loss, because the opponent will much more likely make
> the 1pt mistake, and not the 10 pt mistake.

I do not see why MC programs in general would be biased towards
winning by 10p instead of through a single 1p mistake.

As we have repeatedly discussed here all strong programs go for  
winrate and ignore the size of the win.
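
A toy sketch of what "go for winrate and ignore the size of the win"
means, in Python. The move names and playout margins below are made up
for illustration and do not come from any real program or position.

```python
# Each candidate move maps to the final score margins of its playouts
# (positive = win for us). All numbers are hypothetical.
playouts = {
    "safe": [-0.5] * 10,               # always loses by half a point
    "gamble": [9.5] + [-20.5] * 9,     # wins once, loses big otherwise
}


def win_rate(margins):
    """Fraction of playouts won; the size of the margin is ignored."""
    return sum(1 for m in margins if m > 0) / len(margins)


def mean_margin(margins):
    """Average score margin; this is what a score-maximizer would use."""
    return sum(margins) / len(margins)


best_by_win_rate = max(playouts, key=lambda mv: win_rate(playouts[mv]))
best_by_margin = max(playouts, key=lambda mv: mean_margin(playouts[mv]))

print(best_by_win_rate, best_by_margin)
```

With these made-up numbers the winrate criterion prefers "gamble" and
the margin criterion prefers "safe", which is the disagreement this
thread is about.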

What is happening here is that when an MC program knows that a simple
endgame is lost, it will play a sequence that makes the game as
long and complicated as possible. I believe this is a perfectly
reasonable strategy. If this is wrong, someone needs to provide a
solution and show that it really makes a difference against, for
example, gnugo, which makes humanlike endgame mistakes. Testing against
humans is too noisy unless there is an astronomical improvement in
playing strength.



> The problem is that the likelihood of your opponent making a mistake
> is hard to determine by the UCT (MC) playouts. I guess one needs
> to use the meta information that it is more likely to make a small
> mistake than to make a big one.

Random playouts make small and big endgame mistakes on almost
every move played. The likelihood is measured all the time, and that is
the reason UCT (MC) is successful.
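
A minimal sketch of how playouts "measure" this, in Python. It is a toy
model, not how any real program evaluates a position: the mistake sizes,
the alternating-error rule and the function names are all assumptions
made for illustration.

```python
import random

# Hypothetical point losses per move: mostly small, occasionally large.
MISTAKE_SIZES = [0, 0, 1, 1, 2, 10]


def playout_margin(start_margin, moves_left, rng):
    """Play out a position where both sides err at random."""
    margin = start_margin
    for i in range(moves_left):
        loss = rng.choice(MISTAKE_SIZES)
        # Even moves are ours (a mistake costs us points),
        # odd moves are the opponent's (a mistake gains us points).
        margin += -loss if i % 2 == 0 else loss
    return margin


def estimate_win_rate(start_margin, moves_left, n_playouts, seed=0):
    """Fraction of random playouts that end with a positive margin."""
    rng = random.Random(seed)
    wins = sum(
        playout_margin(start_margin, moves_left, rng) > 0
        for _ in range(n_playouts)
    )
    return wins / n_playouts


close_loss = estimate_win_rate(-0.5, 10, 2000)
big_loss = estimate_win_rate(-30.5, 10, 2000)
print(close_loss, big_loss)
```

In this toy model a half-point deficit is recovered far more often than
a 30-point one, because small mistakes are sampled much more frequently
than large ones.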

The argument I do not like here is, in short, something like this:

1) UCT (MC) programs are so strong that they freak out when they are
behind in a game.
2) Solution: Make them believe they can win by playing losing moves.

I have been thinking like this. I have tried it and it failed. So did
Don, and this is why we are a little stubborn in arguing that it is not
possible to improve playing strength this way.

It is much better to make it even stronger so it takes the lead in
more games and refuses to lose those games. It will still freak out, but
in fewer games and later in the game.


But as always I am willing to admit that I am wrong. I am happy to see
1) real positions to discuss, 2) solutions that are backed up with 3)
solid empirical data. (That I can easily incorporate in my own
program... ;-) )

-Magnus



