[Computer-go] Creating the playout NN

Jim O'Flaherty jim.oflaherty.jr at gmail.com
Sun Jun 12 13:58:00 PDT 2016


BTW, by improvement, I don't mean higher Go playing skill...I mean
appearing close to the same level of Go playing skill _per_ _move_ with far
less computational cost. It's the total game outcomes that will fall.

On Sun, Jun 12, 2016 at 3:55 PM, Jim O'Flaherty <jim.oflaherty.jr at gmail.com>
wrote:

> The purpose is to see if there is some sort of "simplification" available
> to the emerged complex functions encoded in the weights. It is a typical
> reductionist strategy, especially where there is an attempt to converge on
> human conceptualization. Given the complexity of the nuances in Go, my
> intuition says that it will show excellent improvement in short term play
> at the cost of nuance in longer term play.
>
> On Sun, Jun 12, 2016 at 6:05 AM, Álvaro Begué <alvaro.begue at gmail.com>
> wrote:
>
>> I don't understand the point of using the deeper network to train the
>> shallower one. If you have enough data to train a model with many
>> parameters, you have enough to train a model with fewer parameters.
>>
>> Álvaro.
>>
>>
>> On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka
>> <michael.markefka at gmail.com> wrote:
>>
>>> Might be worthwhile to try the faster, shallower policy network as a
>>> MCTS replacement if it were fast enough to support enough breadth.
>>> Could cut down on some of the scoring variations that confuse rather
>>> than inform the score expectation.
>>>
>>> On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
>>> <skaitschick at gmail.com> wrote:
>>> > I don't know how the added training compares to direct training of the
>>> > shallow network.
>>> > It's prob. not so important, because both should be much faster
>>> > than the training of the deep NN.
>>> > Accuracy should be slightly improved.
>>> >
>>> > Together, that might not justify the effort. But I think the fact
>>> > that you can create the mimicking NN, after the deep NN has been
>>> > refined with self play, is important.
>>> >
>>> > On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen
>>> > <petri.t.pitkanen at gmail.com> wrote:
>>> >>
>>> >> Would the expected improvement be reduced training time or improved
>>> >> accuracy?
>>> >>
>>> >>
>>> >> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick
>>> >> <stefan.kaitschick at hamburg.de>:
>>> >>>
>>> >>> If I understood it right, the playout NN in AlphaGo was created by
>>> >>> using the same training set as the one used for the large NN that
>>> >>> is used in the tree. There would be an alternative though. I don't
>>> >>> know if this is the best source, but here is one example:
>>> >>> https://arxiv.org/pdf/1312.6184.pdf
>>> >>> The idea is to teach a shallow NN to mimic the outputs of a deeper
>>> >>> net. For one thing, this seems to give better results than direct
>>> >>> training on the same set. But also, more importantly, this could be
>>> >>> done after the large NN has been improved with selfplay.
>>> >>> And after that, the selfplay could be restarted with the new playout
>>> >>> NN.
>>> >>> So it seems to me, there is real room for improvement here.
>>> >>>
>>> >>> Stefan
>>> >>>
>>> >>> _______________________________________________
>>> >>> Computer-go mailing list
>>> >>> Computer-go at computer-go.org
>>> >>> http://computer-go.org/mailman/listinfo/computer-go
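
The mimic idea from the start of the thread (teach a shallow NN to
reproduce the outputs of a deeper net) can be sketched roughly as below.
This is only a toy illustration under loud assumptions: the "teacher" is a
randomly initialised deep MLP standing in for a trained policy network, the
inputs are random vectors rather than Go positions, and the student is
fitted to the teacher's pre-softmax outputs with a squared-error loss, which
is my reading of the linked paper and not necessarily what AlphaGo did. The
names init_layers, forward and train_mimic are hypothetical.

```python
import numpy as np


def init_layers(sizes, rng):
    """Random (weight, bias) pairs for a fully connected net with these layer sizes."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(x, layers):
    """tanh MLP; returns (logits, list of inputs seen by each layer)."""
    acts = [x]
    for W, b in layers[:-1]:
        acts.append(np.tanh(acts[-1] @ W + b))
    W, b = layers[-1]
    return acts[-1] @ W + b, acts


def train_mimic(x, teacher_out, hidden=24, steps=300, lr=0.3, seed=1):
    """Fit a one-hidden-layer student to the teacher's logits (squared error)."""
    rng = np.random.default_rng(seed)
    student = init_layers([x.shape[1], hidden, teacher_out.shape[1]], rng)
    losses = []
    for _ in range(steps):
        (W1, b1), (W2, b2) = student
        out, (_, h) = forward(x, student)
        err = out - teacher_out
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size            # gradient of the mean squared error
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)   # backpropagate through tanh
        # Plain full-batch gradient descent step on both layers.
        student = [(W1 - lr * (x.T @ g_h), b1 - lr * g_h.sum(axis=0)),
                   (W2 - lr * (h.T @ g_out), b2 - lr * g_out.sum(axis=0))]
    return student, losses


rng = np.random.default_rng(0)
x = rng.normal(size=(512, 16))                    # toy inputs, not real Go features
teacher = init_layers([16, 32, 32, 32, 9], rng)   # stand-in for the trained deep net
teacher_out, _ = forward(x, teacher)              # soft targets for the student
student, losses = train_mimic(x, teacher_out)
print(f"mimic loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Since the targets come from the teacher and not from game records, the
same loop could in principle be rerun on fresh positions after the deep net
has been refined, which is the point Stefan makes about selfplay.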