[Computer-go] Source code (Was: Reducing network size? (Was: AlphaGo Zero))
sheppardco at aol.com
Wed Oct 25 18:02:10 PDT 2017
I think it uses the champion network. That is, the training periodically generates a candidate, and there is a playoff against the current champion. If the candidate wins more than 55% of the games, then a new champion is declared.
Keeping a champion is an important mechanism, I believe. It creates a competitive coevolution dynamic, where the network is evolving to learn how to beat the best network so far, not just the most recent one. Without that dynamic, the training process can drift up and down in strength.
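The gating mechanism described above can be sketched roughly as follows. This is a minimal illustration, not DeepMind's actual code; the 400-game match length and the network representation are assumptions, and play_game is a placeholder for a full game of Go:

```python
import random

PROMOTION_THRESHOLD = 0.55  # candidate must win more than 55% (per the thread)
EVAL_GAMES = 400            # hypothetical match length, not from the source

def play_game(champion, candidate):
    """Placeholder match: returns True if the candidate wins.
    A real implementation would play out a full game of Go."""
    total = candidate["strength"] + champion["strength"]
    if total == 0:
        return False
    return random.random() < candidate["strength"] / total

def gate(champion, candidate):
    """Promote the candidate only if it clearly beats the current champion."""
    wins = sum(play_game(champion, candidate) for _ in range(EVAL_GAMES))
    if wins / EVAL_GAMES > PROMOTION_THRESHOLD:
        return candidate  # new champion declared
    return champion       # keep training against the old best

champion = {"name": "net-041", "strength": 1.0}
candidate = {"name": "net-042", "strength": 1.5}
new_champion = gate(champion, candidate)
```

The 55% bar (rather than 50%) means a candidate must be clearly stronger, not just lucky over a finite match, before self-play data starts coming from it.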
From: Computer-go [mailto:computer-go-bounces at computer-go.org] On Behalf Of uurtamo .
Sent: Wednesday, October 25, 2017 6:07 PM
To: computer-go <computer-go at computer-go.org>
Subject: Re: [Computer-go] Source code (Was: Reducing network size? (Was: AlphaGo Zero))
Does the self-play step use the most recent network for each move?
On Oct 25, 2017 2:23 PM, "Gian-Carlo Pascutto" <gcp at sjeng.org <mailto:gcp at sjeng.org> > wrote:
On 25-10-17 17:57, Xavier Combelle wrote:
> Is there some way to distribute learning of a neural network ?
Learning as in training the DCNN: not really, unless there are high-bandwidth
links between the machines (AFAIK - unless the state of the
Learning as in generating self-play games: yes. Especially if you update
the network only every 25 000 games.
My understanding is that this task is much more bottlenecked on game
generation than on DCNN training, until you get quite a few machines
generating games.
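The reason self-play distributes so well is that each worker only needs a read-only copy of the current network; games are independent until the next weight update. A minimal sketch of that pattern, with hypothetical names (self_play_game is a stand-in for an MCTS-guided game):

```python
from concurrent.futures import ThreadPoolExecutor
import itertools

GAMES_PER_UPDATE = 25_000  # from the thread: network refreshed every 25 000 games

def self_play_game(weights_version):
    """Placeholder: a real worker would run MCTS guided by the network
    and return the game record plus training targets."""
    return {"weights_version": weights_version, "moves": []}

def generate_batch(weights_version, n_games, n_workers=8):
    """Fan game generation out across workers; no gradients are exchanged,
    so only the (read-only) weights need to be shipped to each worker."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(self_play_game,
                             itertools.repeat(weights_version, n_games)))

# One training cycle: workers fill a batch, then the trainer consumes it
# and (after GAMES_PER_UPDATE games) publishes new weights.
games = generate_batch("net-v1", n_games=100)  # small demo batch
```

In a real setup the workers would be separate machines pulling weights from a parameter server, but the key point stands: game generation parallelizes trivially, while gradient training does not.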
Computer-go mailing list
Computer-go at computer-go.org <mailto:Computer-go at computer-go.org>