[Computer-go] Converging to 57%

Robert Waite winstonwaite at gmail.com
Wed Aug 24 00:30:33 PDT 2016

@Detlef It is comforting to hear that GoGoD data seemed to converge towards
51% in your testing. When I ran KGS data it definitely converged more
quickly, but I stopped those runs short. It would all make sense if figure 5
of the DarkForest paper shows convergence on KGS data. The paper isn't
explicit, but looking at it now, they are comparing against Maddison et al.,
so it makes sense that they would report numbers for the same dataset.

@GCP The three move-strength graphs looked shaky to me; it doesn't seem
like a clear change in strength. As for the ladder issue, I think MCTS plus a
value network or fast rollout network is how AlphaGo overcame weaknesses like
that. The fast rollout network is actually the vaguest part to me. I have
read some of the ancestor papers, and I can see that people in the field
mostly know what is being described, but I don't know where to begin to
reproduce the pattern counts listed in the AlphaGo tables at the end of the
paper.

@David Have you matched your network against GNU Go? I think accuracy and
loss are indicators of model health, but playing strength seems to be a
different matter. The AlphaGo paper only mentions beating Pachi at 100k
rollouts with the RL network (not the SL network), at an 85% win rate. The
DarkForest paper shows more win-rate data: the KGS-trained network won ~23%
of games against Pachi 10k, but the GoGoD-trained one won ~59%. They also
tacked on extended features and 3-step prediction, so who knows.

I am actually feeling a million times better about 51% being the typical
plateau for GoGoD data. It makes my graphs make more sense.

Graphs now: (inline images were scrubbed from the plain-text archive)

I'm going to keep going with the magenta and black lines; I figure I can get
to 48 percent. I can run 10 million pairs in a day, so the graph width is one
week. I'll be very happy if 57% isn't expected on GoGoD; 51% looks fine and
approachable on my graphs.

For the game-phase batched data, the DarkForest paper explicitly calls out
that they got stuck in poor local minima without it. I figured pure
randomness was fine, but you could definitely get skews, like no opening
moves at all in a minibatch of size 16 as in AlphaGo. Their paper didn't
elaborate, but it did mention 16 threads. To generate a pair, I select one
random game from all of the available SGF files and split the game into 16
sections, sampling one position per section. I am using threading too, so
there is more to it, but basically 16 sets of 16 makes for a 256 minibatch,
like the DarkForest team's.
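That sampling scheme can be sketched roughly like this (a minimal sketch of my understanding, not my actual code; a "game" here is just a list of moves, and the function names are made up for illustration):

```python
import random

def phase_sections(num_moves, num_sections=16):
    """Split move indices [0, num_moves) into contiguous game-phase sections."""
    bounds = [round(i * num_moves / num_sections) for i in range(num_sections + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(num_sections)]

def sample_minibatch(games, num_sections=16, games_per_batch=16):
    """Draw one position from each phase of each sampled game: 16 x 16 = 256 pairs."""
    batch = []
    for _ in range(games_per_batch):
        game = random.choice(games)  # one random game from all available SGFs
        for section in phase_sections(len(game), num_sections):
            if len(section) > 0:
                move_idx = random.choice(section)  # one position per game phase
                batch.append((game, move_idx))
    return batch

# Toy usage: 100 fake games of 200 moves each.
games = [list(range(200)) for _ in range(100)]
batch = sample_minibatch(games)
print(len(batch))  # 256
```

The point of stratifying by phase is that every minibatch is guaranteed to contain opening, middle-game, and endgame positions, instead of whatever a uniform draw happens to hit.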

I think the only way to beat Zen or CrazyStone is to get the value network or
fast-rollout network working with MCTS. Of course, CrazyStone is evolving
too, so maybe that's a moving target.

On Tue, Aug 23, 2016 at 11:17 PM, David Fotland <fotland at smart-games.com> wrote:

> I train using approximately the same training set as AlphaGo, but so far
> without the augmentation with rotations and reflections. My target is about
> 55.5%, since that's what AlphaGo got on their training set without
> reinforcement learning.
>
> I find I need 5x5 in the first layer, at least 12 layers, and at least 96
> filters to get over 50%. My best net is 55.3%, 18 layers by 96 filters. I
> use simple SGD with a minibatch of 64, no momentum, and a 0.01 learning
> rate until it flattens out, then 0.001. I have two 980 Tis, and the best
> nets take about 5 days to train (about 20 epochs on about 30M positions).
> The last few percent is just trial and error. Sometimes making the net
> wider or deeper makes it weaker. Perhaps it's just variation from one
> training run to another. I haven't tried training the same net more than
> once.
>
> David
> > -----Original Message-----
> > From: Computer-go [mailto:computer-go-bounces at computer-go.org] On Behalf
> > Of Gian-Carlo Pascutto
> > Sent: Tuesday, August 23, 2016 12:42 AM
> > To: computer-go at computer-go.org
> > Subject: Re: [Computer-go] Converging to 57%
> >
> > On 23-08-16 08:57, Detlef Schmicker wrote:
> >
> > > So, if somebody is sure it is measured against GoGoD, I think a
> > > number of other Go programmers will have to think again. I heard of
> > > them reaching 51% (e.g. posts by Hiroshi on this list).
> >
> > I trained a 128 x 14 network for Leela 0.7.0 and this gets 51.1% on
> > GoGoD.
> >
> > Something I noticed from the papers is that the prediction percentage
> > keeps going upwards with more epochs, even if slowly, but still clearly
> > up.
> >
> > In my experience my networks converge rather quickly (like >0.5% per
> > epoch after the first), get stuck, get one more 0.5% gain if I lower the
> > learning rate (by a factor 5 or 10) and don't gain any more regardless
> > of what I do thereafter.
> >
> > I do use momentum. IIRC I tested without momentum once and it was worse,
> > and much slower.
> >
> > I did not find any improvement in playing strength from doing Facebook's
> > 3 move prediction. Perhaps it needs much bigger networks than 128 x 12.
> >
> > Adding ladder features also isn't good enough to (consistently) keep the
> > network from playing into them. (And once it's played the first move,
> > you're totally SOL because the resulting positions aren't in the
> > training set and you'll get 99% confidence for continuing the losing
> > ladder moves)
> >
> > I'm currently doing a more systematic comparison of all methods (and
> > GoGoD vs KGS+GoGoD) on 128 x 12, and testing the resulting strength
> > (rather than looking at prediction %). I'll post the results here, if
> > anything definite comes out of it.
> >
> > --
> > GCP
> > _______________________________________________
> > Computer-go mailing list
> > Computer-go at computer-go.org
> > http://computer-go.org/mailman/listinfo/computer-go
