[computer-go] Learning (was: thoughts on 100,000 cgos games)

David G Doshay ddoshay at mac.com
Sat May 6 15:39:23 PDT 2006


On 6, May 2006, at 2:41 PM, Don Dailey wrote:

> I believe that if there is a way to learn from game collections, it has
> to be centered around the mistakes. We really get too focused on
> "good moves" and try to make our program play "good moves" instead of
> not playing "bad moves."

I also agree that with the state of computer Go, limiting your own
mistakes and taking advantage of the opponent's mistakes is where
the most progress can be made. It is clear to me that what SlugGo
does is avoid some GNU Go mistakes and exploit those same
mistakes when playing against GNU Go. Unfortunately, the way it has
done this to date also generates a whole new set of different
mistakes.

I just don't know how to run any automated learning algorithm
based upon the fact that "I lost all the games in this set." If I could
automate the discovery of bad moves in a database of games that
my program lost, then I could change the code so that it does not
make those bad moves. But the best evaluation functions I have thought
of are already in the program ... and that is why the bad move was
made.

I have a picture from an advertisement that I have used as my
example when discussing Machine Learning. The picture shows
a bear in the foreground about to catch with its teeth a fish that
has jumped from the river as it fights its way upstream. Behind
this bear is a fisherman with eyes firmly on the bear and assuming
the same pose, mouth wide open. The caption says "Learn from
the experts."

I feel that trying to learn from a huge set of expert games how to
make expert moves is similar, and raises these questions:

1) once you perfectly imitate the expert for this one move, do you
have the specialized abilities of that expert to do what comes next?

2) if you do not, then why even start by copying that expert?

Bluntly, without the teeth or claws of the bear, what would the
fisherman do even if he did get a fish in his bite? Likewise,
without the reading ability of a pro, can the first move of one of
their tesujis/invasions/reductions be the right move for one of
our programs? Maybe, but I think it is unlikely without the ability
to consistently follow up.

So, I agree that we will "learn more" or start playing better when
we decrease the bad moves, but other than following those games
by hand and building up a database of patterns that specifically
get the program to "play here, not there," how can this information
be used? It is clear that this is very close to what the GNU Go team
is doing ... after a blunder in a tournament or other game, the bad
move is reviewed against the known patterns, and a new pattern is
developed. But this just does not scale well, even though GNU Go is
getting a long way with the method.
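
To make the "play here, not there" idea concrete, here is a minimal
sketch of such a hand-built blunder database. Everything in it is an
assumption made for illustration: the 3x3 neighborhood key, the
BlunderPatterns class, and the string board format are hypothetical
and are not GNU Go's actual pattern format or SlugGo code.

```python
from typing import Dict, List, Tuple

Board = List[str]        # rows of '.', 'X', 'O' (hypothetical board format)
Point = Tuple[int, int]  # (row, col)


def neighborhood(board: Board, p: Point) -> str:
    """Return the 3x3 area around p as a 9-character key; '#' is off-board."""
    r0, c0 = p
    cells = []
    for r in range(r0 - 1, r0 + 2):
        for c in range(c0 - 1, c0 + 2):
            if 0 <= r < len(board) and 0 <= c < len(board[r]):
                cells.append(board[r][c])
            else:
                cells.append('#')
    return ''.join(cells)


class BlunderPatterns:
    """Hypothetical table of hand-reviewed "play here, not there" fixes."""

    def __init__(self) -> None:
        # local shape around the bad move -> offset to the better move
        self._fixes: Dict[str, Point] = {}

    def record(self, board: Board, bad: Point, better: Point) -> None:
        """Store one reviewed blunder: in this local shape, prefer `better`."""
        key = neighborhood(board, bad)
        self._fixes[key] = (better[0] - bad[0], better[1] - bad[1])

    def correct(self, board: Board, candidate: Point) -> Point:
        """Redirect a candidate move if its local shape matches a known blunder.

        Legality of the redirected move is not checked in this sketch.
        """
        offset = self._fixes.get(neighborhood(board, candidate))
        if offset is None:
            return candidate
        return (candidate[0] + offset[0], candidate[1] + offset[1])
```

Each entry has to come from a person reviewing one specific blunder
and choosing the better move, and an exact-match key like this only
fires on an identical local shape, which is one way to see why the
approach does not scale.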



Cheers,
David





