[Computer-go] Converging to 57%
fotland at smart-games.com
Tue Aug 23 23:17:05 PDT 2016
I train using approximately the same training set as AlphaGo, but so far without the augmentation with rotations and reflections. My target is about 55.5%, since that's what AlphaGo got on their training set without reinforcement learning.
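For reference, the rotation and reflection augmentation mentioned above amounts to generating the 8 symmetries of the square board for each position. A minimal sketch, assuming a board stored as a list of rows; the function names are made up for illustration and are not from any particular Go program:

```python
def rotate90(board):
    """Rotate a square board (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def reflect(board):
    """Mirror a board left to right."""
    return [list(row[::-1]) for row in board]


def symmetries(board):
    """Return the 8 symmetries: 4 rotations, each with and without a mirror."""
    out = []
    current = [list(row) for row in board]
    for _ in range(4):
        out.append(current)
        out.append(reflect(current))
        current = rotate90(current)
    return out
```

The move label has to be transformed the same way as the input, for example by storing the target move as a one-hot plane and passing it through the same function.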
I find I need 5x5 filters in the first layer, at least 12 layers, and at least 96 filters per layer to get over 50%. My best net reaches 55.3% with 18 layers of 96 filters. I use simple SGD with a minibatch size of 64, no momentum, and a learning rate of 0.01 until accuracy flattens out, then 0.001. I have two 980 Ti GPUs, and the best nets take about 5 days to train (about 20 epochs on about 30M positions). The last few percent is just trial and error. Sometimes making the net wider or deeper makes it weaker. Perhaps it's just variation from one training run to another. I haven’t tried training the same net more than once.
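The training schedule described above (plain SGD, 0.01 until accuracy flattens out, then 0.001) could be sketched roughly as follows. The `patience` and `min_gain` thresholds are assumptions added for illustration, since the post does not say how "flattens out" is judged, and the class and function names are hypothetical:

```python
class PlateauStepSchedule:
    """Step through fixed learning rates when validation accuracy stalls."""

    def __init__(self, rates=(0.01, 0.001), patience=2, min_gain=0.001):
        self.rates = rates          # 0.01 then 0.001, as in the post
        self.patience = patience    # assumed: stalled epochs before dropping
        self.min_gain = min_gain    # assumed: smallest gain that counts
        self.stage = 0
        self.best = float("-inf")
        self.stale = 0

    def update(self, accuracy):
        """Record one epoch's accuracy and return the rate for the next epoch."""
        if accuracy > self.best + self.min_gain:
            self.best = accuracy
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience and self.stage < len(self.rates) - 1:
                self.stage += 1
                self.stale = 0
        return self.rates[self.stage]


def sgd_step(weights, grads, lr):
    """Plain SGD update with no momentum: w <- w - lr * g."""
    return [w - lr * g for w, g in zip(weights, grads)]
```

In a real training loop, `update` would be called once per epoch with the held-out prediction accuracy, and the returned rate passed to whatever optimizer the framework provides.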
> -----Original Message-----
> From: Computer-go [mailto:computer-go-bounces at computer-go.org] On Behalf
> Of Gian-Carlo Pascutto
> Sent: Tuesday, August 23, 2016 12:42 AM
> To: computer-go at computer-go.org
> Subject: Re: [Computer-go] Converging to 57%
> On 23-08-16 08:57, Detlef Schmicker wrote:
> > So, if somebody is sure, it is measured against GoGod, I think a
> > number of other go programmers have to think again. I heard them
> > reaching 51% (e.g. posts by Hiroshi on this list)
> I trained a 128 x 14 network for Leela 0.7.0 and this gets 51.1% on
> Something I noticed from the papers is that the prediction percentage
> keeps going upwards with more epochs, even if slowly, but still clearly
> In my experience my networks converge rather quickly (like >0.5% per
> epoch after the first), get stuck, get one more 0.5% gain if I lower the
> learning rate (by a factor of 5 or 10) and don't gain any more regardless
> of what I do thereafter.
> I do use momentum. IIRC I tested without momentum once and it was worse,
> and much slower.
> I did not find any improvement in playing strength from doing Facebook's
> 3 move prediction. Perhaps it needs much bigger networks than 128 x 12.
> Adding ladder features also isn't good enough to (consistently) keep the
> network from playing into them. (And once it's played the first move,
> you're totally SOL because the resulting positions aren't in the
> training set and you'll get 99% confidence for continuing the losing
> ladder moves)
> I'm currently doing a more systematic comparison of all methods (and
> GoGoD vs KGS+GoGoD) on 128 x 12, and testing the resulting strength
> (rather than looking at prediction %). I'll post the results here, if
> anything definite comes out of it.