[Computer-go] AlphaGo Zero

Álvaro Begué alvaro.begue at gmail.com
Thu Oct 19 18:41:51 PDT 2017


Yes, residual networks are awesome! I learned about them at ICML 2016 (
http://kaiminghe.com/icml16tutorial/index.html). Kaiming He's exposition
was fantastically clear. I used them in my own attempts at training neural
networks for move prediction. It's fairly easy to train something with 20
layers with residual networks, even without using batch normalization. With
batch normalization apparently you can get to hundreds of layers without
problems, and the models do perform better on the test data for vision
tasks. But I didn't implement that part, and the additional computational
cost probably makes this not worth it for go.
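
A minimal sketch of the skip connection discussed in this thread, assuming
fully connected layers in NumPy instead of convolutions and no batch
normalization. The names residual_block, plain_block, w1 and w2 are
illustrative only and are not taken from AlphaGo Zero or any program
mentioned here:

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def plain_block(x, w1, w2):
    """Two stacked layers with no shortcut: y = relu(relu(x @ w1) @ w2)."""
    return relu(relu(x @ w1) @ w2)


def residual_block(x, w1, w2):
    """Same two layers, but the input is added back unscaled before the
    final nonlinearity: y = relu(x + F(x)), with F(x) = relu(x @ w1) @ w2."""
    f = relu(x @ w1) @ w2
    return relu(x + f)


rng = np.random.default_rng(0)
width = 8
# Non-negative activations, as they would be after an earlier ReLU.
x = relu(rng.normal(size=(4, width)))
zeros = np.zeros((width, width))

# With all-zero weights the residual block is the identity on x,
# while the plain block wipes the signal out entirely.
res_out = residual_block(x, zeros, zeros)
plain_out = plain_block(x, zeros, zeros)

# Stacking 20 such blocks still passes x through unchanged.
h = x
for _ in range(20):
    h = residual_block(h, zeros, zeros)
```

The point of the toy case is that a block which has learned nothing
still forwards its input, so a feature computed early stays available
to every later block without extra weights having to carry it.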

Álvaro.




On Thu, Oct 19, 2017 at 8:51 PM, Brian Sheppard via Computer-go <
computer-go at computer-go.org> wrote:

> So I am reading that residual networks are simply better than normal
> convolutional networks. There is a detailed write-up here:
> https://blog.waya.ai/deep-residual-learning-9610bb62c355
>
> Summary: the residual network has a fixed connection that adds (with no
> scaling) the output of the previous level to the output of the current
> level. The point is that once some layer learns a concept, that concept is
> immediately available to all downstream layers, without need for learning
> how to propagate the value through a complicated network design. These
> connections also provide a fast pathway for tuning deeper layers.
>
> -----Original Message-----
> From: Computer-go [mailto:computer-go-bounces at computer-go.org] On Behalf
> Of Gian-Carlo Pascutto
> Sent: Wednesday, October 18, 2017 4:33 PM
> To: computer-go at computer-go.org
> Subject: Re: [Computer-go] AlphaGo Zero
>
> On 18/10/2017 19:50, cazenave at ai.univ-paris8.fr wrote:
> >
> > https://deepmind.com/blog/
> >
> > http://www.nature.com/nature/index.html
>
> Select quotes that I find interesting from a brief skim:
>
> 1) Using a residual network was more accurate, achieved lower error, and
> improved performance in AlphaGo by over 600 Elo.
>
> 2) Combining policy and value together into a single network slightly
> reduced the move prediction accuracy, but reduced the value error and
> boosted playing performance in AlphaGo by around another 600 Elo.
>
> These gains sound very high (much higher than previous experiments with
> them reported here), but are likely due to the joint training.
>
> 3) The raw neural network, without using any lookahead, achieved an Elo
> rating of 3,055. ... AlphaGo Zero achieved a rating of 5,185.
>
> The increase of 2000 Elo from tree search sounds very high, but this may
> just mean the value network is simply very good - and perhaps relatively
> better than the policy one. (They previously had problems there, in that
> SL > RL for the policy network guiding the tree search - but I'm not sure
> there's any relation.)
>
> 4) History features X_t, Y_t are necessary because Go is not fully
> observable solely from the current stones, as repetitions are forbidden.
>
> This is a weird statement. Did they need 17 planes just to check for ko?
> It seems more likely that history features are very helpful for the
> internal understanding of the network as an optimization. That sucks though
> - it's annoying for analysis and position setup.
>
> Lastly, the entire training procedure is actually not very complicated at
> all, and it's encouraging that the training is "faster" than previous
> approaches - but many things look fast if you can throw 64 GPU workers at
> a problem.
>
> In this context, the graphs of the differing network architectures causing
> huge strength discrepancies are both good and bad. Make a better pick and
> you can get massively better results; make a bad pick and you won't come
> close.
>
> --
> GCP
> _______________________________________________
> Computer-go mailing list
> Computer-go at computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go