[computer-go] Example for Murphy's law in Evaluation

Don Dailey drd at mit.edu
Thu Sep 21 07:30:12 PDT 2006


That may be part of the power of Monte Carlo methods: they are "almost"
immune to horizon effects.

Of course they are not completely immune, but there is no sudden stopping
effect where really obvious problems are pushed over the horizon.
Instead, the playout reflects the problems that are likely to be
encountered.

I wonder if there is a feasible way to replace the playout portion of
our Monte Carlo programs with Chrilly's static evaluation function?  

This would basically yield a best-first search, which is what we are
doing anyway.  Obviously, you couldn't copy our methods directly, as UCT
wouldn't work as is, and you would need some kind of reasonable backup
rule - perhaps just plain minimax.
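
A minimal sketch of that idea, assuming a toy take-1-or-2 stones game
(whoever takes the last stone wins) in place of Go, and a made-up
static_eval in place of Chrilly's evaluation function. All names here
are hypothetical. Each iteration follows the currently best line down
to a leaf, expands it, scores the new children statically, and backs
the values up with plain minimax (negamax form):

```python
def static_eval(pile):
    # Hypothetical stand-in for a real static evaluation, from the
    # side to move's point of view. pile == 0 means the opponent just
    # took the last stone, so the side to move has lost.
    return -1.0 if pile == 0 else 0.0

class Node:
    def __init__(self, pile):
        self.pile = pile                  # stones left
        self.children = []                # filled in on expansion
        self.value = static_eval(pile)    # replaced by backed-up value

def expand(node):
    # Legal moves in the toy game: take one or two stones.
    node.children = [Node(node.pile - k) for k in (1, 2) if node.pile - k >= 0]

def backup(path):
    # Plain minimax backup: a node is worth the best of its children,
    # negated because the child is scored from the opponent's side.
    for node in reversed(path):
        if node.children:
            node.value = max(-c.value for c in node.children)

def best_first(root_pile, iterations):
    root = Node(root_pile)
    for _ in range(iterations):
        node, path = root, [root]
        while node.children:
            # Best move for the mover is the child that is worst for
            # the opponent, i.e. the child with the lowest value.
            node = min(node.children, key=lambda c: c.value)
            path.append(node)
        if node.pile > 0:
            expand(node)
        backup(path)
    return root

print(best_first(4, 50).value)
print(best_first(3, 50).value)
```

The selection here is purely greedy; a real program would need some
exploration term so that a misjudged move still gets looked at again.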

The main characteristics are that it must do significantly more pruning
than alpha/beta, and that it probabilistically selects some moves more
than others.
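
One possible way to get that kind of probabilistic selection is to
sample moves with a softmax over their evaluation scores. This is only
an illustration, not what any of the programs discussed here does; the
function name, the temperature parameter, and the example scores are
assumptions:

```python
import math
import random

def select_move(scores, temperature=0.5, rng=random):
    """Pick a move index with probability proportional to exp(score / temperature)."""
    best = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - best) / temperature) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]

# Made-up scores for three candidate moves.
rng = random.Random(0)
scores = [1.0, 0.0, -1.0]
counts = [0, 0, 0]
for _ in range(10_000):
    counts[select_move(scores, rng=rng)] += 1
print(counts)
```

A lower temperature concentrates the effort on the top-rated moves
(heavier pruning); a higher one spreads it more evenly.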

I'm thinking one static evaluation is probably roughly equivalent to a
few simulations.

At one point it occurred to me that you could STOP each simulation
before the game ended if you already know the result (overwhelming
advantage for one side or the other).  However, I gave up on that idea
because it doesn't save a huge amount of time and it's expensive to take
the time to analyze whether the game can be stopped early.  But
Chrilly's static evaluation could "pretend" to be an estimate of the
score after some fixed number of MC simulations and be substituted
directly.
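
A rough sketch of how such a substitution might look, assuming the
static score is squashed into a win probability with a logistic curve
and then credited as a fixed number of "virtual" playouts. The
function name, the scale constant, and the choice of four virtual
playouts are all hypothetical:

```python
import math

def virtual_playouts(static_score, n_virtual=4, scale=10.0):
    """Turn one static score into (wins, visits) as if n_virtual playouts had run."""
    # Assumed mapping: score 0 means an even game, large positive
    # scores approach a certain win for the side being scored.
    p_win = 1.0 / (1.0 + math.exp(-static_score / scale))
    return p_win * n_virtual, n_virtual

wins, visits = virtual_playouts(0.0)
print(wins, visits)
```

The (wins, visits) pair could then be added to a node's statistics in
the same place a batch of real simulation results would go.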

Just an idea - probably a bad one!

- Don

 


On Thu, 2006-09-21 at 15:41 +0200, Chrilly wrote:
> >>>
> >>> With static full board evaluation one needs a complete and correct
> >>> implementation of all possible situations that are likely to arise on a
> >>> go board. If there are limits in the evaluation then I think search will
> >>> not be able to compensate for this.
> >>>
> >
> >>> With MC evaluation one can add some simple local knowledge which
> >>> improves the evaluation in most situations, and search works quite well
> >>> in removing those moves that were evaluated too high early in search.
> >>
> Alpha-Beta has exactly the same purpose. The search is an
> error-filter/corrector. The problem is that the current Alpha-Beta search
> in Go is too shallow - or the problems are too deep.
> The position I presented initially was not the board position, but the end
> of a 5-ply search.  The black area was greater, but white could cut and
> reduce it. Suzie did not prevent this, because she thought there was no
> problem even after the cut.
> One has in principle the same problems with a 20-ply search. But in this
> case I would not have seen the consequences. Neither would the opponent
> GoAhead be able to exploit the problem.
> In this position Suzie would have seen the problem with a 6-ply search,
> where white can connect q2 to the strong white group.
> 
> Chrilly
> 


