[computer-go] Lazy Evaluation in Monte Carlo/Alpha Beta Search
for Viking4
Magnus Persson
magnus.persson at phmp.se
Sun Jul 23 10:10:30 PDT 2006
Quoting Don Dailey <drd at mit.edu>:
> Magnus,
>
> It Viking5 primarily an alpha/beta searcher using monte/carlo as the
> evaluation function? I know you have other good stuff in there, but
> does this characterize your program in general terms?
Yes, it is.
I could probably turn off a lot of features of the program and still have
reasonable strong straight forward alpha/beta searcher similar to chess
programs (but not using Q-search and Null-moves since it did not work
(Null-moves by the way appeared to work very well pruning a lot of bad moves
most of the time, but it played weaker. After that I came up with lazy
evaluations and have not tried null moves again)). It has a
transposition table
that works nicely with iterative deepening.
Move ordering works well enough on 7x7 and 9x9 using simple killer moves, a
messy variant of the history heuristic and the "promote and play the opponents
move that was the best reponse to my last move"-heuristic.
On larger boards even 1-ply search is difficult so it has to a) prune very bad
moves that should not be generated b) remaining moves has to be ordered
better.
For this I use hand made patterns of many types, and some tactical code
which I
use in the simulation part for urgent moves. Recently I added some patterns of
the type "if the opponent plays here, the next move must be one the
following".
This sort of shortcuts the idea of global search, but the evaluation is
too weak
on 19x19 to give it a "free" will.
All of this does not make the program much stronger on 19x19 but it certainly
looks better...
I have not made any testing of the gains on 13x13 and larger since it would be
so time consuming.
The monte/carlo part of the program is has a lot of extra stuff that
contribute
a lot to the playing strength. If there was nothing special about the
last move
it picks a random move and if that move is not "bad" it plays it, otherwise it
play a better move sometimes or picks a new random move. A move can be bad if
it is bad shape, is on the edge of the board with no nearby stones or it is a
tactical blunder. The tactics it can handle are simple ladders (if anyting
complex happens it is assumed that the ladder is broken), and some simple
capturing races. There are probably a lot of little things that I do not
remember. The important thing is that any code added to the simulation part is
not allowed to slow it down. The exception is probably the ladder code but it
makes the program tactically much stronger so it is worth it. Some things
actually speeds up the the program slightly, because it makes the simulations
shorter. The most complex code has to do with detecting eyes and eyeshapes
correctly. Some eyeshape is computed incrementally by my acient go board
manager, but often it is not good enough so some eyeshape code is called
whenever needed.
Other than that it has some time management code and resignation code and some
tricks to end games quickly when it is winning.
MP
More information about the computer-go
mailing list