[Computer-go] mini-max with Policy and Value network
Hideki Kato
hideki_katoh at ybb.ne.jp
Tue May 23 08:14:30 PDT 2017
Erik van der Werf: <CAKkgGrOxqLdtKk3VgRERNeJ9diiHC_MixxzF75y4WM7grxd4GQ at mail.gmail.com>:
>On Tue, May 23, 2017 at 10:51 AM, Hideki Kato <hideki_katoh at ybb.ne.jp>
>wrote:
>
>> Agree.
>>
>> (1) To solve L&D (life and death), some search is necessary in
>> practice. So, the value net cannot solve some of them.
>> (2) The number of possible positions (input of the value net) in
>> real games is at least 10^30 (10^170 in theory). Can the value
>> net recognize all of them? L&Ds depend on very small differences
>> in the placement of stones or liberties. Can we provide the
>> necessary amount of training data? Does the network have enough
>> capacity? The answer is almost obvious from the theory of
>> function approximation. (An ANN is just a non-linear function
>> approximator.)
>>
>
>A similar argument can be made for natural neural nets, but we know humans
>are able to come up with reasonable solutions. I suppose a pure neural net
>approach would require some form of recursion, but when combined with a
>search, and rolling out the decision process to some sufficiently high
>number of max steps, apparently it's not that important. Also, I suspect
>that nearly all positions can only be reached in real games by inferior
>moves from both sides. All that may be needed is some crude means to steer
>away from chaos (and even if one would start in chaos, humans probably
>wouldn't do well either).
My argument is for a "stand-alone" DCNN. Adding some (top-down?)
control to DCNNs could solve this (as in the human brain). #I'm not
sure about recurrence, but it may be necessary.
>> (3) CNN cannot learn the exclusive-or function due to the ReLU
>> activation function, used instead of the traditional sigmoid
>> (hyperbolic tangent). CNN is good at approximating continuous
>> (analog) functions but not Boolean (digital) ones.
>>
>
>
>Are you sure about that? I can imagine using two ReLU units to construct a
>sigmoid-like step function, so I'd think a multi-layer net should be fine
>(just like with ordinary perceptrons).
Even with many layers, it's hard to represent sharp edges by
combining ReLUs. (Not impossible, but the chances are probably low
because of the many local traps.)
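
For reference, the two-ReLU construction Erik mentions can be sketched
as below. This is only a hand-built illustration, not anything taken
from an actual Go program: the function names (relu, xor_relu,
ramp_step) and the weights are assumptions chosen for the example. It
shows that such a net can represent XOR and an arbitrarily steep step;
it says nothing about whether gradient training would find these
weights, which is the open point above.

```python
def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def xor_relu(a, b):
    """XOR on inputs in {0, 1} using one hidden layer of two ReLU units.

    Hand-picked (hypothetical) weights: both hidden units read a + b,
    with biases 0 and -1; the output layer is linear with weights 1, -2.
    """
    h1 = relu(a + b)
    h2 = relu(a + b - 1.0)
    return h1 - 2.0 * h2


def ramp_step(x, k):
    """Difference of two ReLUs: a clipped ramp from 0 to 1.

    Equals 0 for x <= -0.5/k, 1 for x >= 0.5/k, and is linear in
    between, so a larger slope k gives a sharper edge.
    """
    return relu(k * x + 0.5) - relu(k * x - 0.5)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_relu(a, b))
    for k in (1.0, 10.0, 1000.0):
        print(k, [ramp_step(x, k) for x in (-0.1, 0.0, 0.1)])
```

The edge in ramp_step only becomes sharp when k is large, i.e. when the
weights are large, which is one way to read the "hard to reach by
training" concern.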
Best, Hideki
>Best,
>Erik
>_______________________________________________
>Computer-go mailing list
>Computer-go at computer-go.org
>http://computer-go.org/mailman/listinfo/computer-go
--
Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>