[computer-go] CGOS Monte Carlo Simulations

Don Dailey drd at mit.edu
Thu Sep 7 05:01:08 PDT 2006


A few days ago I posted a Generic Monte Carlo program on CGOS in order
to get a baseline on what to expect from a very simple Monte Carlo
players.

This is a very simple player that doesn't scale infinitely,  but shows
practical improvement at reasonable levels on current hardware.  

I posted 2 of them on CGOS,  a version doing 10,000 simulation and one
doing 100,000

Then 10,000 simulation version has achieved a rating of about 1143 and
the one doing 100,000 rates about 1456,  about 300 points stronger.

There are several very simple techniques to make it play stronger with
less simulations - but without incorporating a more directed search this
basic type of monte carlo players is probably close to it's maximum
strength - it won't get a lot better even if you gave it several days
per move to do simulations.

I would like to crank the level up even higher to see how close I am to
the ceiling, and maybe I will, but it would probably start losing due to
time if I go for a lot more simulations.   From experiments I've done
with AnchorMan I know that there is a ceiling and I believe it is well
under 1600 on CGOS.     AnchorMan is close to it's ceiling at 1500,  but
it uses the "all moves as first" heuristic and this may have a limiting
effect on the highest level it can achieve.  It is the same program in
Ogo, a program for the palm and it was optimized to play a good move as
fast as possible.   

- Don



On Thu, 2006-08-31 at 20:29 -0400, John Doe wrote:
> I understand that some authors may not wish to reveal too many details
> of their programs, but I am confused and a little frustrated by my
> inability to make a bot that climbs even above 800 in rating on CGOS.
> I am following a fairly straightforward Monte Carlo style approach --
> playing out random simulations for each potential move from a position
> and choosing the one that led to the most wins.  The random moves
> within the simulations avoid filling single-point eyes (using the
> counting of corner-touching enemies discussed on thi list) and playing
> into self-atari.  The top-level move chosen to simulate next is based
> on the UCT-style algorithm, although I do not (yet) keep any more of
> the tree in memory.  I have tried to make the simulations as fast as
> possible; the number run is based on remaining time, but is usually
> around 50,000 for a move. 
>  
> My basic question is this: what makes some other similar programs so
> much stronger?  I read the Sensai library descriptions for AnchorMan
> and ControlBoy and see that they are very similar, yet do only 5,000
> simulations, and yet are much much stronger programs overall.  Does
> this difference come primarily from the benefits of keeping more of
> the tree in memory?  From better heuristics for selecting top-level
> and/or simulation moves?  I don't imagine anyone will have a
> completely definitive answer for this, but I am just at a loss at this
> point.  Any guidance would be most appreciated. 
>  
> ~ Jon
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/



More information about the computer-go mailing list