[Computer-go] Evaluating improvements differently
jay at satirist.org
Thu Apr 7 14:37:49 PDT 2011
Álvaro Begué alvaro.begue at gmail.com:
>a method to evaluate proposed improvements that might
>be much better than playing a gazillion games. A search results in two
>things: A move and a probability of winning (or a score that can be
>mapped into a probability of winning, but let's ignore that issue for
>now). Evaluating whether the moves picked by a strategy are good is
>really hard, but evaluating whether the estimate of a probability of
>winning is a good estimate seems much easier.
My suggestion is well-known--isn't it? I made it over ten years ago and
it's been on my web site the whole time. The basic insight is similar to
Instead of looking only at game results, look at the temporal
differences in the score over the games. That contains strictly more
information, so if you use it well you at least can't do any worse.
A sudden increase in score from one move to the next may mean that the
opponent has made a mistake, but a sudden decrease means that the
program has definitely made some mistake--it misevaluated at least one
of the before and after positions. Having a measure of mistakes, even a
rough one like this, must be worth something.
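As a rough illustration of the idea, here is a minimal Python sketch. It
assumes the program logs one win-probability estimate per search, always
from its own point of view. The names score_deltas and flag_drops and the
0.1 threshold are made up for the example, not taken from any program.

```python
from typing import List, Tuple


def score_deltas(scores: List[float]) -> List[float]:
    """Temporal differences between consecutive score estimates."""
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


def flag_drops(scores: List[float], threshold: float = 0.1) -> List[Tuple[int, float]]:
    """Return (index, delta) for each estimate that fell by at least
    `threshold` relative to the previous one.

    `index` is the position in `scores` of the estimate after the drop.
    The threshold is an arbitrary illustrative value (an assumption).
    """
    return [
        (i + 1, delta)
        for i, delta in enumerate(score_deltas(scores))
        if delta <= -threshold
    ]


# Hypothetical estimates from one game, from the program's point of view.
game_scores = [0.50, 0.52, 0.55, 0.40, 0.42]
suspects = flag_drops(game_scores)
```

Each flagged index marks a pair of positions where at least one of the
two evaluations was wrong. Counting such drops over a set of games gives
a crude mistake measure to compare between program versions.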
With playout-based programs that don't have sharp horizons and may
realize their mistakes slowly, it might make sense to look at trends in
the score deltas over a sequence of moves. Basically, low-pass filter
the temporal differences. Well, I think it'd be worth a try.
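One simple way to low-pass filter the differences is an exponential
moving average. The Python sketch below is only one possible choice of
filter. The function name smoothed_deltas, the smoothing factor alpha,
and starting the average at zero are all assumptions for illustration.

```python
from typing import List


def smoothed_deltas(scores: List[float], alpha: float = 0.3) -> List[float]:
    """Exponential moving average of the move-to-move score differences.

    `alpha` in (0, 1] controls the smoothing: smaller values average over
    more moves. The average starts at 0.0 (an assumption), which pulls
    the first few outputs toward zero.
    """
    smoothed = []
    ema = 0.0
    for earlier, later in zip(scores, scores[1:]):
        delta = later - earlier
        ema = alpha * delta + (1.0 - alpha) * ema
        smoothed.append(ema)
    return smoothed


# Hypothetical estimates: a slow, steady slide versus pure jitter.
slow_slide = smoothed_deltas([0.60, 0.57, 0.54, 0.51, 0.48])
jitter = smoothed_deltas([0.50, 0.55, 0.50, 0.55, 0.50])
```

With these inputs the slow slide yields a smoothed value that is negative
at every step and keeps growing in magnitude, while the jitter mostly
cancels out. A sustained negative trend is the signal that a playout-based
program is gradually realizing an earlier mistake.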