[Computer-go] Creating the playout NN

Álvaro Begué alvaro.begue at gmail.com
Sun Jun 12 04:05:11 PDT 2016


I don't understand the point of using the deeper network to train the
shallower one. If you have enough data to train a model with many
parameters, you have enough to train a model with fewer parameters.

Álvaro.


On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka <
michael.markefka at gmail.com> wrote:

> Might be worthwhile to try the faster, shallower policy network as an
> MCTS replacement if it were fast enough to support enough breadth.
> Could cut down on some of the scoring variations that confuse rather
> than inform the score expectation.
>
> On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
> <skaitschick at gmail.com> wrote:
> > I don't know how the added training compares to direct training of the
> > shallow network.
> > It's probably not so important, because both should be much faster than
> > the training of the deep NN.
> > Accuracy should be slightly improved.
> >
> > Together, that might not justify the effort. But I think the fact that you
> > can create the mimicking NN, after the deep NN has been refined with
> > self-play, is important.
> >
> > On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen
> > <petri.t.pitkanen at gmail.com> wrote:
> >>
> >> Would the expected improvement be reduced training time or improved
> >> accuracy?
> >>
> >>
> >> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick
> >> <stefan.kaitschick at hamburg.de>:
> >>>
> >>> If I understood it right, the playout NN in AlphaGo was created by using
> >>> the same training set as the one used for the large NN that is used in the
> >>> tree. There would be an alternative though. I don't know if this is the best
> >>> source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
> >>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
> >>> For one thing, this seems to give better results than direct training on the
> >>> same set. But also, more importantly, this could be done after the large NN
> >>> has been improved with self-play.
> >>> And after that, the self-play could be restarted with the new playout NN.
> >>> So it seems to me, there is real room for improvement here.
> >>>
> >>> Stefan
> >>>
> >>> _______________________________________________
> >>> Computer-go mailing list
> >>> Computer-go at computer-go.org
> >>> http://computer-go.org/mailman/listinfo/computer-go
> >>
> >
>
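
For reference, the mimic training discussed in the quoted thread (teach a
shallow NN to reproduce the outputs of a deeper net) could be sketched
roughly as below. This is only a toy illustration under stated assumptions,
not AlphaGo's pipeline: the "teacher" is a fixed random MLP standing in for
an already-trained deep net, the student regresses on the teacher's
pre-softmax outputs with a squared-error loss, and all names, sizes and
learning rates (make_teacher, train_mimic, 8 inputs, 4 outputs, and so on)
are hypothetical.

```python
import numpy as np

def make_teacher(rng, sizes):
    """Fixed random weights standing in for an already-trained deep net (assumption)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def teacher_logits(x, layers):
    """Forward pass of the deep net; returns pre-softmax outputs."""
    h = x
    for W, b in layers[:-1]:
        h = np.tanh(h @ W + b)
    W, b = layers[-1]
    return h @ W + b

def student_logits(x, params):
    """Forward pass of the shallow net (one hidden layer)."""
    W1, b1, W2, b2 = params
    return np.tanh(x @ W1 + b1) @ W2 + b2

def train_mimic(x, targets, hidden=32, lr=0.05, epochs=500, seed=0):
    """Fit the shallow net to the teacher's outputs by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_in, n_out = x.shape[1], targets.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, n_out))
    b2 = np.zeros(n_out)
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ W1 + b1)
        err = h @ W2 + b2 - targets            # student output minus teacher output
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size           # gradient of the mean squared error
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # backprop through tanh
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (x.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (W1, b1, W2, b2), losses

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(512, 8))              # stand-in for encoded positions
    teacher = make_teacher(rng, [8, 16, 16, 4])
    t = teacher_logits(x, teacher)             # targets come from the teacher, not from labels
    student, losses = train_mimic(x, t)
    agree = np.mean(student_logits(x, student).argmax(1) == t.argmax(1))
    print(f"mimic loss {losses[0]:.3f} -> {losses[-1]:.3f}, top-move agreement {agree:.2f}")
```

The relevant design point for the thread is that the targets are the
teacher's outputs rather than the original labels, so the shallow net can be
refitted whenever the deep net changes, for example after it has been
refined with self-play, and on any positions the teacher can evaluate.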