[computer-go] Explanation to MoGo paper wanted.

George Dahl george.dahl at gmail.com
Wed Jul 4 13:57:23 PDT 2007


On 7/4/07, Don Dailey <drd at mit.edu> wrote:
> On Wed, 2007-07-04 at 11:34 +0200, Magnus Persson wrote:
> > but what really will make a
> > difference is in the quality in the playouts.
>
> I would like to suggest a more abstract view of things.  In the purest
> form of the algorithm there isn't an artificial distinction between the
> tree and the play-outs.  The algorithm is applied as if the whole tree
> already exists (conceptually) and nodes are updated to the end of the
> game.
>
> We had to impose end nodes and a tree that grows in depth due to the
> fact that it's impractical to store the whole tree in memory.  So we
> have a tree phase on the one hand, and on the other hand we have a
> play-out phase that simulates an unexplored tree (but without updates
> which introduces out of necessity a small inefficiency.)
>
> This makes everything a bit of a compromise but a well advised one due
> to hardware limitations.   But then we started imposing our will on the
> play-outs in order to make them "smarter."   But we didn't do the same
> to the tree portion because we now believe they are 2 separate things
> (even though they really are not.)
>
> So I prefer to think of the play-outs and the tree as the same thing.  I
> think whatever is done can be applied to both.   For instance Lazarus
> does a lot of pruning and the pruning rules are the same for tree
> portion and the play-out portion.   Actually, Lazarus saw most of the
> improvement from the tree pruning when I test each without the other.
>
> But I notice that we are now looking at the tree as the "search" portion
> and the play-outs as the "evaluation function."   I think that is
> incredible because I have always believed that "tree search" and
> "evaluation" are the same thing, just different forms or states.  Like
> water and ice,  or matter and energy.
>
> It's interesting that chess has this too.  Traditionally programs have
> always had these 3 very crude phases,   search, quiescence, evaluation.
> Modern programs have somewhat blurred these distinctions but it hasn't
> changed very much.
>
> UCT comes along and finally does away with the distinction altogether.
> Now you can call it all evaluation or search, whatever pleases you.
> But in it's purest form, UCT with totally random play-outs is a
> beautiful thing - a recursive evaluation function with zero (almost)
> domain specific knowledge.
>
> Of course now we just had to go and spoil it all by imposing domain
> specific rules.  I have done the same and I admit it.    It would be fun
> to see how far we could go if domain specific knowledge was forbidden as
> an experiment.   Once patterns are introduced along with other direct Go
> knowledge, it's still fun but it feels a bit "wrong", kind of like
> cheating.

Is it still cheating if the program learns and discover's the patterns
itself?  Then isn't it just precomputing a few things?



It's clear that when we do this, we introduce strengths and
> weaknesses to the program,  making it a bit more fragile, less
> "universal" or robust.  Stronger too, but more susceptible to
> in-transitivity.
>
> Of course we do this in Chess programs in a big way.  We very tediously
> "tell" the program what is good and what is bad.   It has no choice, it
> must accept our definition of right and wrong, our morality.   However
> in our great wisdom we provide a search mechanism in order to correct
> our bad judgments.   The search mechanism is an admission that we know
> we are wrong about many things.
>
> Of course you are right - if the play-outs are improved, the quality of
> the moves will also improve.
>
> - Don
>
>
>
>
>
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>


More information about the computer-go mailing list