[Computer-go] Evaluating improvements differently
alvaro.begue at gmail.com
Thu Apr 7 11:27:32 PDT 2011
I haven't spent any time on Go programming recently, but a few months
ago I thought of a method to evaluate proposed improvements that might
be much better than playing a gazillion games. A search produces two
things: a move and a probability of winning (or a score that can be
mapped into a probability of winning, but let's ignore that issue for
now). Evaluating whether the moves picked by a strategy are good is
really hard, but evaluating whether the estimated probability of
winning is a good estimate seems much easier.
For instance, take a database of games played by strong players.
Extract a few positions from each game. Run your engine for some fixed
amount of time on each position, and measure how well it predicted the
winner after each position (cross entropy is probably the correct
measure to use). Do this before and after the proposed modification
and compare the two numbers: the lower cross entropy is the better
predictor.
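
To make the measurement concrete, here is a minimal sketch in Python of the cross-entropy score. The function name `cross_entropy`, the `eps` clamp, and the sample numbers are all assumptions made up for illustration; they are not output from any real engine.

```python
import math

def cross_entropy(predictions, outcomes, eps=1e-12):
    """Mean cross entropy (in nats) of predicted win probabilities.

    predictions: the engine's estimate that a fixed side (say, Black)
                 wins, one value per position.
    outcomes:    1 if that side actually won the game, else 0.
    eps:         clamp so that a prediction of exactly 0 or 1 does not
                 produce log(0). Hypothetical parameter.
    """
    if not predictions or len(predictions) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for p, won in zip(predictions, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= math.log(p) if won else math.log(1.0 - p)
    return total / len(predictions)

# Made-up numbers, purely to show the comparison.
outcomes = [1, 0, 1, 1]
before = [0.60, 0.50, 0.55, 0.40]   # engine without the modification
after = [0.70, 0.30, 0.65, 0.60]    # engine with the modification

score_before = cross_entropy(before, outcomes)
score_after = cross_entropy(after, outcomes)
improved = score_after < score_before   # lower is better
```

An engine that always answers 0.5 scores log(2), about 0.693 nats, so that is a handy baseline: anything above it is predicting worse than a coin flip.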
Of course one has to be careful to pick reasonably well-played games
(games played by top engines with more time per move than you'll use
to evaluate your engine seem good enough, and will result in a much
cleaner database than collecting games played by humans) and to have a
large enough and varied enough set of positions. Also, one should
worry about over-fitting for those particular positions, but one could
use another set of positions for confirmation. These problems all seem
manageable to me.
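
One way the confirmation set could be set up, sketched in Python. Splitting by game rather than by position is my own assumption (it keeps positions from the same game out of both sets), and `split_by_game`, `holdout_fraction`, and `seed` are hypothetical names.

```python
import random

def split_by_game(positions, holdout_fraction=0.2, seed=0):
    """Split (game_id, position) pairs into a tuning and a confirmation set.

    Whole games go to one side or the other, so the confirmation set
    never contains a position from a game used for tuning.
    """
    game_ids = sorted({game_id for game_id, _ in positions})
    rng = random.Random(seed)          # fixed seed: same split every run
    rng.shuffle(game_ids)
    n_holdout = max(1, int(len(game_ids) * holdout_fraction))
    holdout = set(game_ids[:n_holdout])
    tuning = [p for p in positions if p[0] not in holdout]
    confirmation = [p for p in positions if p[0] in holdout]
    return tuning, confirmation

# Toy data: 10 games, 3 extracted positions each.
positions = [(g, "pos%d" % i) for g in range(10) for i in range(3)]
tuning, confirmation = split_by_game(positions)
```

The idea would be to develop against the tuning set only, and run the confirmation set once at the end to check that the gain in cross entropy holds up.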
It is possible that certain improvements can really only be measured
by playing games (time control comes to mind), but I would imagine
that for a large class of things this procedure can give you useful
results in much more reasonable times.
Your thoughts are appreciated.