[computer-go] Monte Carlo combined with minimax search

Peter Drake drake at lclark.edu
Tue Jul 18 09:53:24 PDT 2006


I've been working with Monte Carlo Go lately.  An idea has sprung to  
mind, and I want to make sure it's original and not something I read  
in a paper.  (It seems similar to some things Bouzy has done, but I  
can't find anything specific.)

Here's the plan:

- Maintain a large minimax search tree in memory.  (For reasons that  
will become clear, we can't use alpha-beta pruning.  I think we could  
use a transposition table, making the tree a DAG.)

- At each node in the tree, there is a weight for each possible  
move.  Initially, all of these weights are set to some neutral value,  
say, 0.5.

- Perform a single Monte Carlo run, making random moves to the end of  
the game.  At each node along this path to the end of the game, back  
up the resulting score.  Also, adjust the weights for the moves made  
on that path.  Thus, if a move led to a good outcome, it is more  
likely to be played again.  (The change in weights has to be fairly  
gradual so that a good move is not eliminated by one or two bad runs.)

- Keep performing runs until time (or some other resource) runs out.   
The moves in these runs are random, but favor higher-weight moves.

Has this been done before?  Is it flawed in some obvious way?

Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/




-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://computer-go.org/pipermail/computer-go/attachments/20060718/8147b060/attachment.htm


More information about the computer-go mailing list