[computer-go] Bot ratings and strength
Don Dailey
drd at mit.edu
Thu Aug 31 19:37:16 PDT 2006
I think you have a bug somewhere. If you are doing 50,000 games, you
should be well over 1200 on CGOS.
I would start very simple and get it working correctly before you play
with UCT and the fancier stuff. Just play the games, use the eye rule
(and you should make sure the eye rule is correct) and keep track of
which first move got the highest score. Go by number of games won, not
the total score, and be sure to factor in komi correctly.
If this does NOT get you past 1200, it's broken.
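As a rough sketch of that baseline (Python, with made-up names): `playout`
below is a hypothetical callback that plays one random game after a given
first move and returns the final point totals, and the komi value is an
assumption, so use whatever the server actually specifies. It counts games
won, not accumulated score:

```python
KOMI = 7.5  # assumed value; use whatever komi the server actually specifies


def black_wins(black_points, white_points, komi=KOMI):
    """True if Black wins once komi is added to White's total."""
    return black_points > white_points + komi


def pick_move(candidates, playout, to_move, games_per_move):
    """Return the candidate first move that won the most playouts.

    `playout(move)` is a hypothetical callback: it plays one random game
    after `move` and returns (black_points, white_points).
    """
    wins = {}
    for move in candidates:
        wins[move] = 0
        for _ in range(games_per_move):
            b, w = playout(move)
            # Count a win for the side to move; the margin is ignored.
            if black_wins(b, w) == (to_move == "black"):
                wins[move] += 1
    return max(candidates, key=wins.get)
```

Only the result of each game is counted, so a move that wins every playout
by a point or two ranks above one that wins big now and then but usually
loses.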
AnchorMan does exactly 5,000 simulations and of course by definition is
1500. AnchorMan does have a few little helpers that add a lot of
strength, though certainly no more than 200 points and probably less.
I can think of a few things that can easily go wrong:
1. Are you sure you are scoring the end of the game correctly?
2. Are you sure you are playing a truly random game without bias?
There are some slightly less than intuitive gotchas that can
get you if you are not careful. For instance, you can't just pick
a random point in the list and if it's not legal try the next
points until you find a legal one. That's not random unless
the list you are working from was scrambled just before the
selection process. And you can't scramble the list just before
starting the random game and then get moves by traversing it in
order.
3. Is your stopping rule correct? If one side is out of moves, the
game is not over - the other side may still have moves.
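To make points 2 and 3 concrete, here is a minimal Python sketch, not
anyone's actual engine. `biased_pick` is the flawed "random index, then
scan forward" method; `uniform_pick` draws without replacement until it
hits a legal point, which is one way (an assumption, there are others) to
get a uniform choice among legal moves; `run_playout` stops only after both
sides pass in a row. The `legal_for` and `play` hooks and the `max_turns`
safety cap are hypothetical and stand in for your own board code:

```python
import random


def biased_pick(points, is_legal, rng=random):
    """Flawed: random starting index, then scan forward to the first legal point."""
    n = len(points)
    start = rng.randrange(n)
    for offset in range(n):
        p = points[(start + offset) % n]
        if is_legal(p):
            return p
    return None  # no legal move


def uniform_pick(points, is_legal, rng=random):
    """Draw candidates without replacement until one is legal."""
    pool = list(points)
    while pool:
        i = rng.randrange(len(pool))
        pool[i], pool[-1] = pool[-1], pool[i]  # move the drawn point to the end
        p = pool.pop()
        if is_legal(p):
            return p
    return None  # no legal move: the side to move must pass


def run_playout(legal_for, play, points, rng=random, max_turns=1000):
    """Play until both sides pass in a row; one pass alone does not end the game.

    `legal_for(color)` returns a legality predicate and `play(color, point)`
    updates the board. Both are hypothetical hooks supplied by the caller.
    Returns the number of turns taken, passes included.
    """
    color, passes, turns = "black", 0, 0
    while passes < 2 and turns < max_turns:
        p = uniform_pick(points, legal_for(color), rng)
        if p is None:
            passes += 1
        else:
            passes = 0
            play(color, p)
        color = "white" if color == "black" else "black"
        turns += 1
    return turns
```

With ten points of which only the first and last are legal, `biased_pick`
returns the last one about nine times out of ten, because every illegal
starting index in between slides forward onto it. `uniform_pick` returns
each about half the time.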
I'm just trying to imagine the possible bugs you may have, but I'm quite
sure you must have something wrong.
- Don
On Thu, 2006-08-31 at 20:29 -0400, John Doe wrote:
> I understand that some authors may not wish to reveal too many details
> of their programs, but I am confused and a little frustrated by my
> inability to make a bot that climbs even above 800 in rating on CGOS.
> I am following a fairly straightforward Monte Carlo style approach --
> playing out random simulations for each potential move from a position
> and choosing the one that led to the most wins. The random moves
> within the simulations avoid filling single-point eyes (using the
> counting of corner-touching enemies discussed on this list) and playing
> into self-atari. The top-level move chosen to simulate next is based
> on the UCT-style algorithm, although I do not (yet) keep any more of
> the tree in memory. I have tried to make the simulations as fast as
> possible; the number run is based on remaining time, but is usually
> around 50,000 for a move.
>
> My basic question is this: what makes some other similar programs so
> much stronger? I read the Sensei's Library descriptions for AnchorMan
> and ControlBoy and see that they are very similar, yet do only 5,000
> simulations, and yet are much, much stronger programs overall. Does
> this difference come primarily from the benefits of keeping more of
> the tree in memory? From better heuristics for selecting top-level
> and/or simulation moves? I don't imagine anyone will have a
> completely definitive answer for this, but I am just at a loss at this
> point. Any guidance would be most appreciated.
>
> ~ Jon