[computer-go] Lazy Evaluation in Monte Carlo/Alpha Beta Search
for Viking4
Don Dailey
drd at mit.edu
Sun Jul 23 07:24:02 PDT 2006
Yes, I misunderstood your message and agree with you. Since the
number of nodes I search is essentially fixed, I can find a value that works
well in practice but would be slightly non-optimal.
The best value I could choose would be too conservative on one end and
too aggressive on the other end. So this is an optimization I should
do.
My mini-max searcher actually adjusts FUDGE dynamically to get an effect
similar to what you describe. I use this as a simple method for
choosing moves probabilistically: I just always play the move in the
tree part of the search that has the best fudged value.
- Don
On Sun, 2006-07-23 at 16:05 +0200, Rémi Coulom wrote:
> Don Dailey wrote:
> > I wasn't very clear about one of these points (and Remi just posted a
> > clearer explanation of the same technique I use.)
>
> My idea was maybe not quite the same as yours. From what I understand of
> your message, you decide about pruning based on the fudged value
> directly. My idea was to use the fudged value as an estimate of the move
> value, and still use alpha * sqrt(sigma²/N) as the size of the
> confidence interval around the fudged value.
>
> I believe your approach is dangerous, because your confidence intervals
> shrink as 1/N. They should shrink as 1/sqrt(N). Your idea might work in
> practice, but it does not look very consistent with theory. My idea of a
> fudge can be justified in the Bayesian framework in terms of a "safe
> prior". It applies to the estimation of the value, not the confidence
> bounds.
>
> Rémi
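
To make the distinction in the quoted message concrete, here is a minimal
sketch of the second approach: the fudge only shifts the value estimate,
while the interval half-width stays alpha * sqrt(sigma²/N). The exact form
of the fudge is not given in this thread, so the version below (a number of
pseudo-playouts pulling the win rate toward a prior of 0.5) is an
assumption, as are all function names, the default alpha = 2, and
sigma² = 0.25 (the largest variance a win/loss outcome can have).

```python
import math

def fudged_value(wins, n, fudge=1.0, prior=0.5):
    # Assumed form of the fudge: `fudge` pseudo-playouts scoring `prior`.
    # Its effect on the estimate fades roughly as 1/N.
    return (wins + fudge * prior) / (n + fudge)

def half_width(n, alpha=2.0, sigma2=0.25):
    # Half-width of the confidence interval: alpha * sqrt(sigma^2 / N).
    # This shrinks as 1/sqrt(N), independently of the fudge.
    return alpha * math.sqrt(sigma2 / n)

def interval(wins, n, fudge=1.0, alpha=2.0):
    # Interval centred on the fudged value, not on the raw win rate.
    v = fudged_value(wins, n, fudge)
    h = half_width(n, alpha)
    return v - h, v + h

def can_prune(move, best, fudge=1.0, alpha=2.0):
    # move and best are (wins, playouts) pairs. Prune only when the move's
    # upper bound falls below the best move's lower bound.
    _, move_upper = interval(move[0], move[1], fudge, alpha)
    best_lower, _ = interval(best[0], best[1], fudge, alpha)
    return move_upper < best_lower

if __name__ == "__main__":
    for n in (100, 400, 1600):
        wins = int(0.6 * n)
        shift = wins / n - fudged_value(wins, n)
        print(n, round(half_width(n), 4), round(shift, 6))
```

Quadrupling N halves the half-width but cuts the shift caused by the fudge
by roughly a factor of four. That is the mismatch being pointed out: using
the fudged shift itself as the pruning margin gives a margin that closes
as 1/N rather than 1/sqrt(N).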