[computer-go] Some ideas how to make strong heavy playouts

Magnus Persson magnus.persson at phmp.se
Tue Apr 1 09:08:02 PDT 2008


A recurrent concept popping up in discussions on how to improve  
playouts is "balance". So I would like to try to share my philosophy  
behind the playouts of Valkyria and how I define "balance" and how it  
relates to the evaluation of go positions.

*Background

In an old school program the evaluation function would try to see  
which stones are safely connected to other stones of the same colours.  
Connected stones are called groups, and the program would probably  
also try to evaluate the safety of the groups looking at the eyespace  
at hand, weak neighbouring groups and so on. This quickly gets very  
commplicated. My old program Viking had several 1000's of handmade  
patterns for evaluating connectivity alone. This worked as a dream as  
long as the position consisted of patterns in the database... but in  
each an every game there were new situations and new patterns had to  
be added. A more robust method would be to use tactical search in the  
program to evaluate connectivity. The problem there is to ensure  
accuracy efficiently. Any tactical search tends to either become too  
time consuming, or resort to guessing.

*MC-evaluation

Programs using uniformly random MC-eval favors very solid but  
inefficient shape,  often building blocks of adjascent stones in the  
center of the board. The reason is that if stones are played more  
loosely the stones often get cut off and get killed in the simulations.

What we rather want is a program that can play efficent moves where  
stones are safely connected but occupy as much territory/eyespace as  
possible.

The tree search (UCT) cannot alone solve this problem. Shapes created  
in a 19x19 game may exist untouched to the late endgame and it is not  
possible to read out all shapes on the board. It is much better if  
secure shapes stay stable in the simulation.

They way I implemented that in Valkyria is that the playout part is  
basically reactive to random moves that attacks shapes on the board.  
It does not in any sense attempt to play the best move on the board.  
If it does not need to defend a shape it plays uniformly random  
somewhere. [Note that Valkyria also prunes really ugly moves, thus it  
plays uniformly the first move that is not pruned]

This is also how the pattern system works in Mogo as I understand it.  
If I remember correctly I would say that all Mogo patterns are very  
basic and common sense defenses against attacks on otherwise stable  
shapes.

But there also have to be balance. Valkyria also punishes bad shape.  
That is if a weak shape already is on the board, or a random move  
attacked two shapes simulatanously in the simulation, then the program  
may attack the weakness (or in a way it also reacts to the situation  
defending against "the weak shape becoming stronger"). Often the same  
move that would have been the proper defence is played.


*Eliminating noise rather than predicting  the best move

Nothing I wrote above is original or new to the readers of this list.  
But I want to make a distinction between systems that tries to predict  
the best move and a system that only tries to eliminate noise from the  
otherwise very random simulations.

Noise is eliminated when strong stones live and weak stones die almost  
always in the simulations. This way the random evaluations will mostly  
react to moves that matter in urgent fighting with shapes that are not  
yet stable. A MC-program that does this should stop defending and  
attacking strong shapes and would require much less simulations to  
discriminate between good and bad moves. Valkyria2 and Valkyria3 has  
slightly different tree search algorithms but uses the same playouts.  
Both versions needs only 512 playouts per move to win 50% against  
Gnugo 3.7.10.


Still I think predicting the best moves is very important in the tree  
part, but this may be much less important in the playouts, and perhaps  
even detrimental as some people have experienced.

*19x19

The best programs on 19x19 seems to focus the uct search on local  
fighting. This temporarilly overcomes the biases of the simulations  
locally. But the information gained locally about good shape in the  
tree is forgotten when the fighting moves elswhere. But this knowledge  
can then be rediscovered later if the fighting comes back. Could a  
future improvement to 19x19 go be to use locally narrow searches that  
seeds the playouts with strong patterns for the current shapes on the  
board? Maybe someone is already doing this? A really strong approach  
would be to eliminate the need of hard coded patterns or offline  
pattern harvesting and let the program learn during the game.

-Magnus


More information about the computer-go mailing list