[computer-go] Lazy Evaluation in Monte Carlo/Alpha Beta Search
for Viking4
Magnus Persson
magnus.persson at phmp.se
Sun Jul 23 08:19:25 PDT 2006
Quoting Rémi Coulom <Remi.Coulom at univ-lille3.fr>:
> I believe your approach is dangerous, because your confidence
> intervals shrink as 1/N. They should shrink as 1/sqrt(N). Your idea
> might work in practice, but it does not look very consistant with
> theory. My idea of a fudge can be justified in the Bayesian framework
> in terms of a "safe prior". It applies to the estimation of the
> value, not the confidence bounds.
I was just thinking about this difference, since DD's method appears to prune
more aggressively when N gets large. But if the fudge constant is large
then it
might be more conservative when N is small and still prune aggressively for
increasing N´s. I had to use a minimum N to avoid instability in the search.
When N is sufficently large the effect of mistaken lazy evaluations might not
be important because the error cannot be very large compared to a small N.
Take this hypothetical scenario for Viking4 playing 19x19 searching 1 ply with
20 candidate moves all of almost equal value followed by one critical
move that
is superior. 1/N might here reject several moves that might be evaluated
slightly better then the current best but may search more moves and then find
the critical move in time that the more picky 1/sqrt(N) would not find
in time.
--
Magnus Persson
Berlin, Germany
http://www.phmp.se (Under construction)
More information about the computer-go
mailing list