[computer-go] Bot ratings and strength
Don Dailey
drd at mit.edu
Mon Sep 4 19:51:47 PDT 2006
John Doe,
I just posted "GenericMC_10000 on CGOS. This is a test to see what a
very generic monte/carlo player should be able to achieve doing 10000
simulations - less than a second per move.
Others should be able to duplicate this bot exactly from my description
and it would be nice if someone would try just as a check. If there is
anything or unclear missing from my description I will try to clarify
it.
Here is how it works:
1. It plays 10,000 random games per move.
2. I keep stats on each move played that is played FIRST in the
random games.
3. PASS is never played in the random games until there are no
legal moves left (which do not fill eyes of course.)
4. Do full KO checking only at the root - not in the random games.
5. Results not territory. So a win is scored as a 1 and a loss as -1
6. Never move to an "eye" which is defined as an empty point
surrounded by friendly stones on the orthogonals and no more than
1 diagonal enemy stone - unless the empty point is on the edge, in
which case it must not have any diagonal enemies.
The table of first move statistics is initialized by setting the number
of games to 1 and the score tally to zero in order that there is no
possibility of divide by zero errors and no checking necessary. This
would correspond to 1 draw.
Assuming there are no bugs in my implementation, we will see what it can
achieve on CGOS. (I think there actually may be a KO bug because some
of this is new untested code.)
The very next major enhancement to this would be the "all moves as
first" heuristic when you considered statistics of many of the moves
played during the game, not just the first move.
Earlier I claimed this should easily achieve over 1200 on CGOS. I
remember the "all as first" heuristic was a pretty large improvement at
low levels like 10000, so I'm not sure I was correct but we shall see.
At any rate, if you are doing a very large number of games, "all as
first" is not useful, but it is a huge boost if you are doing less than
a few tens of thousands. I'm not sure where the break even point is,
but "all as first" probably hurts it at some point if you are really
doing a lot of simulations.
- Don
On Thu, 2006-08-31 at 20:29 -0400, John Doe wrote:
> I understand that some authors may not wish to reveal too many details
> of their programs, but I am confused and a little frustrated by my
> inability to make a bot that climbs even above 800 in rating on CGOS.
> I am following a fairly straightforward Monte Carlo style approach --
> playing out random simulations for each potential move from a position
> and choosing the one that led to the most wins. The random moves
> within the simulations avoid filling single-point eyes (using the
> counting of corner-touching enemies discussed on thi list) and playing
> into self-atari. The top-level move chosen to simulate next is based
> on the UCT-style algorithm, although I do not (yet) keep any more of
> the tree in memory. I have tried to make the simulations as fast as
> possible; the number run is based on remaining time, but is usually
> around 50,000 for a move.
>
> My basic question is this: what makes some other similar programs so
> much stronger? I read the Sensai library descriptions for AnchorMan
> and ControlBoy and see that they are very similar, yet do only 5,000
> simulations, and yet are much much stronger programs overall. Does
> this difference come primarily from the benefits of keeping more of
> the tree in memory? From better heuristics for selecting top-level
> and/or simulation moves? I don't imagine anyone will have a
> completely definitive answer for this, but I am just at a loss at this
> point. Any guidance would be most appreciated.
>
> ~ Jon
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
More information about the computer-go
mailing list