[computer-go] Monte Carlo combined with minimax search

Don Dailey drd at mit.edu
Tue Jul 18 10:51:11 PDT 2006


I believe several of us are using this same basic algorithm, but the
devil is in the details.   

The basic idea is simply stated:  Play thousands of games, best first
fashion,  1 game at a time to the bitter end.   While doing this,  bias
the moves in favor of "good" moves (or moves that help clarify the
situation) and maintain in memory a tree of positions with the
appropriate statistics as you visit them.   

The details of how this is done probably varies significantly between
one program author and another but it works.   It's possible that the
very strongest 9x9 programs are  Monte Carlo Go programs.

You probably won't want to maintain a tree for the complete games,  it
has little value beyond some reasonable depth and will consume all the
memory in your machine (or thrash the cache and slow you way down.)  So
my program starts expanding to children nodes  once the parent has been
visited some fixed number of times.  

There are many papers out about this.  A very good one is Remi's paper:

  http://remi.coulom.free.fr/CG2006/CG2006.pdf
  

I have a version of my program that does pure alpha/beta searching where
the evaluation function is pure monte/carlo random games.   There are
lots of tricks you can use to gain  cutoff's in the tree or save a lot
of work (by varying the number of simulations intelligently for each
invocation of the evaluator) but so far I have not been able to make it
play quite as good as my best Monte Carlo which doesn't use alpha/beta
pruning (but does build a tree in memory.)    However it's close -
within about 100 ELO points.   My suspicion and it's only my opinion is
that a pure a/b searcher with monte/carlo evaluation will ultimately
prove to be the best way to do monte/carlo once some of the issues are
figured out.   Things like null move pruning doesn't seem to help in
this case.

Bouzy describes some of the problems with alpha/beta pruning with an
evaluation function that is not deterministic.    For statistical
reasons the score returned from an odd ply search is heavily biased in
favor of the first player and an even ply search is heavily biased in
favor of the opponent.   This is true in just about any game, but it is
especially nasty when using a monte/carlo evaluation since the program
latches on to lines of play that end with an especially fortunate (or
unfortunate) monte/carlo run.   

So for now, best first approaches like you are considering seem to be
"winning."

- Don




On Tue, 2006-07-18 at 09:53 -0700, Peter Drake wrote:
> I've been working with Monte Carlo Go lately.  An idea has sprung to
> mind, and I want to make sure it's original and not something I read
> in a paper.  (It seems similar to some things Bouzy has done, but I
> can't find anything specific.)
> 
> 
> Here's the plan:
> 
> 
> - Maintain a large minimax search tree in memory.  (For reasons that
> will become clear, we can't use alpha-beta pruning.  I think we could
> use a transposition table, making the tree a DAG.)
> 
> 
> - At each node in the tree, there is a weight for each possible move.
> Initially, all of these weights are set to some neutral value, say,
> 0.5.
> 
> 
> - Perform a single Monte Carlo run, making random moves to the end of
> the game.  At each node along this path to the end of the game, back
> up the resulting score.  Also, adjust the weights for the moves made
> on that path.  Thus, if a move led to a good outcome, it is more
> likely to be played again.  (The change in weights has to be fairly
> gradual so that a good move is not eliminated by one or two bad runs.)
> 
> 
> - Keep performing runs until time (or some other resource) runs out.
> The moves in these runs are random, but favor higher-weight moves.
> 
> 
> Has this been done before?  Is it flawed in some obvious way?
> 
> Peter Drake
> Assistant Professor of Computer Science
> Lewis & Clark College
> http://www.lclark.edu/~drake/
> 
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/



More information about the computer-go mailing list