[computer-go] Explanation to MoGo paper wanted.
Richard Brown
batmagoo at gmail.com
Tue Jul 10 08:33:33 PDT 2007
Gunnar Farnebäck wrote:
> Don wrote:
>> Of course now we just had to go and spoil it all by imposing domain
>> specific rules. I have done the same and I admit it. It would be fun
>> to see how far we could go if domain specific knowledge was forbidden as
>> an experiment. Once patterns are introduced along with other direct Go
>> knowledge, it's still fun but it feels a bit "wrong", kind of like
>> cheating. It's clear that when we do this, we introduce strengths and
>> weaknesses to the program, making it a bit more fragile, less
>> "universal" or robust. Stronger too, but more susceptible to
>> intransitivity.
>
> I'm on the other side of this issue. In my opinion all kinds of go
> knowledge are fair game and I'm rather disappointed that such small
> amounts of domain specific knowledge have been merged with the UCT
> search approaches.
Thank you for articulating this viewpoint, Gunnar.
A program trained upon millions of patterns harvested from professional games
[appearing soon on a server near you, for some appropriate value of "soon"]
has the property that it _severely_ prunes the game tree according to what
might be described as "heuristics", though these heuristics are _learned_ by
several hundred feedforward networks. [The heuristics are not hand-coded.]
The patterns comprise observations of what pros do, and more importantly,
what they don't do. Selecting which features of go-behavior to measure,
and upon what scale to express those measurements, are both, of course,
crucial. [The feature-extraction phase of pattern-recognition is a tricky
problem, regardless of domain: eschewing noise in favor of information.]
Once the feedforward networks have done their learning, the patterns themselves
are discarded. [Thus there is no pattern "library", only the trained networks.]
Applying a UCT-layer atop such a severely-pruned game-tree seems promising.
I would re-phrase what Gunnar said: I am optimistic that when UCT is
layered atop a strong program that has already pruned -- via learned,
domain-specific knowledge -- most of the fruitless branches from the
tree, such a program will become even stronger by virtue of the UCT-layer.
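To make the layering idea concrete, here is a minimal sketch, not the program
described above. It assumes the trained networks can be summarized as a
dictionary of per-move scores (the hypothetical `priors`), prunes to the
best-scoring few, and then applies the standard UCB1 rule only among the
survivors. The names `prune`, `ucb1_select`, `keep` and `c` are illustrative
assumptions.

```python
import math

def prune(priors, keep=3):
    """Keep the `keep` moves with the highest learned score (hypothetical priors)."""
    ranked = sorted(priors, key=priors.get, reverse=True)
    return ranked[:keep]

def ucb1_select(candidates, visits, wins, c=math.sqrt(2)):
    """Pick the candidate with the highest UCB1 score; unvisited moves are tried first."""
    total = sum(visits.get(m, 0) for m in candidates)
    best, best_score = None, -math.inf
    for m in candidates:
        n = visits.get(m, 0)
        if n == 0:
            return m  # every surviving move gets at least one playout
        # mean win rate plus an exploration bonus that shrinks as n grows
        score = wins.get(m, 0) / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = m, score
    return best

# Usage: five legal moves, only three survive the learned pruning.
priors = {"D4": 0.4, "Q16": 0.3, "A1": 0.01, "C3": 0.2, "T19": 0.005}
candidates = prune(priors, keep=3)
move = ucb1_select(candidates, visits={"D4": 10, "Q16": 10}, wins={"D4": 3, "Q16": 7})
```

The point of the sketch is only that UCT never spends playouts on moves the
learned pruning has already discarded; how aggressively to prune is a separate
question.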
Don is right, too, in the sense that such a program is somehow less "pure"
than one which has obtained all its knowledge from first principles.
Nonetheless, a program that could not only play a decent game of go, but
somehow emulate the _style_ of a given professional would be of interest,
would it not?
Of even more interest, perhaps, would be a program that, given a game of
go between two (unidentified) professionals whose games it had studied,
whose styles it had _learned_, could then tell you who the two players
were, _even_if_it_had_never_before_seen_that_particular_game_.
Alternatively, if the identity of the players were known, such a program
could be expected to perform well (much better than random, anyway) at a
"guess-the-next-move" task.
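One way such a "guess-the-next-move" task might be scored is sketched below,
assuming the program emits a ranked list of candidate moves for each position.
The function name `top_k_accuracy` and the sample moves are made up for
illustration, not taken from any real game record.

```python
def top_k_accuracy(predictions, actual_moves, k=1):
    """Fraction of positions where the move actually played is among the top k guesses."""
    if not actual_moves:
        return 0.0
    hits = sum(
        1
        for ranked, played in zip(predictions, actual_moves)
        if played in ranked[:k]
    )
    return hits / len(actual_moves)

# Usage: three positions, each with a ranked list of guesses (illustrative data).
predictions = [["D4", "Q16"], ["C3", "R4"], ["K10", "D16"]]
actual_moves = ["D4", "R4", "Q3"]
first_guess_rate = top_k_accuracy(predictions, actual_moves, k=1)
top_two_rate = top_k_accuracy(predictions, actual_moves, k=2)
```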
Could a pure-UCT program ever perform in these ways? Even if the pure-UCT
program were able to reliably and repeatedly defeat the domain-specific
program at playing go, it would not be able to perform the other tasks.
So, I must agree with Gunnar that the domain-specific approach is "fair game".
It all depends on which problem one is trying to solve, to begin with.
--
Rich