[Computer-go] AlphaZero tensorflow implementation/tutorial

Dani dshawul at gmail.com
Sun Dec 9 17:34:22 PST 2018


Thanks for the tutorial! I have some questions about training:

a) Do you use Dirichlet noise during training? If so, is it limited to the
first 30 or so plies (roughly the opening phase in chess)?
The AlphaZero paper is not clear about it.

b) Do you need to shuffle batches if you are doing one epoch? Also, after
generating positions from each game,
do you shuffle those positions? I found the latter to be very important to
avoid overfitting.

c) Do you think there is a problem with using the Adam optimizer instead of
SGD with learning rate drops?
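
A rough Python sketch of the three training details asked about in (a) to (c), to make the questions concrete. It is not taken from the tutorial's code. All function names, the noise_plies cutoff, and the default numbers (alpha, epsilon, base_lr, drops) are illustrative assumptions, and only the standard library is used.

```python
import random


def add_root_noise(priors, ply, alpha=0.3, epsilon=0.25,
                   noise_plies=None, rng=random):
    """Mix Dirichlet(alpha) noise into the root move priors.

    noise_plies is a hypothetical cutoff: None applies noise on every
    move, 30 would restrict it to the first 30 plies (question a).
    """
    if noise_plies is not None and ply >= noise_plies:
        return list(priors)
    # A Dirichlet sample is a vector of Gamma(alpha, 1) draws, normalised.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    if total == 0.0:  # guard against numerical underflow
        return list(priors)
    return [(1.0 - epsilon) * p + epsilon * g / total
            for p, g in zip(priors, gammas)]


def shuffled_batches(games, batch_size, rng=random):
    """Pool positions from all games, shuffle once, then cut into batches.

    Shuffling across games keeps consecutive (highly correlated)
    positions of one game out of the same batch (question b).
    """
    positions = [pos for game in games for pos in game]
    rng.shuffle(positions)
    return [positions[i:i + batch_size]
            for i in range(0, len(positions), batch_size)]


def step_learning_rate(step, base_lr=0.2,
                       drops=(100_000, 300_000, 500_000), factor=0.1):
    """SGD-style schedule: multiply the rate by `factor` at each drop step.

    The drop points are placeholders, not tuned values (question c).
    """
    n_drops = sum(1 for d in drops if step >= d)
    return base_lr * factor ** n_drops
```

For example, add_root_noise(priors, ply, noise_plies=30) leaves the priors untouched from ply 30 onward, while the default noise_plies=None perturbs the root on every move. Which of the two matches the paper is exactly what question (a) asks.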

Daniel

On Sun, Dec 9, 2018 at 6:23 PM cody2007 via Computer-go <
computer-go at computer-go.org> wrote:

> Thanks for your comments.
>
> >Looks like you made it work on 7x7. 19x19 would probably give better results,
> especially against yourself if you are a complete novice.
> I'd expect that'd make me win even more against the algorithm since it
> would explore a far smaller amount of the search space, right?
> Certainly something I'd be interested in testing, though. I would just
> expect it to take many more months of training, but it would be
> interesting to see how much performance falls apart, if at all.
>
> >To avoid cheating against GNU Go, use its --play-out-aftermath
> parameter.
> Yep, I evaluate with that parameter. The problem is more that I only play
> 20 turns per player per game. And the network seems to like placing stones
> in territories "owned" by the other player. My scoring system then no
> longer counts that area as owned by the player. Probably playing more turns
> out and/or using a more sophisticated scoring system would fix this.
>
> >If I'm not mistaken, a competitive AI would need a lot more training, as
> Leela Zero does: https://github.com/gcp/leela-zero
> Yeah, I agree more training is probably the key here. I'll take a look at
> leela-zero.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Sunday, December 9, 2018 7:41 PM, Xavier Combelle <
> xavier.combelle at gmail.com> wrote:
>
> Looks like you made it work on 7x7. 19x19 would probably give better results,
> especially against yourself if you are a complete novice.
>
> To avoid cheating against GNU Go, use its --play-out-aftermath parameter.
>
> If I'm not mistaken, a competitive AI would need a lot more training, as
> Leela Zero does: https://github.com/gcp/leela-zero
> On 10/12/2018 at 01:25, cody2007 via Computer-go wrote:
>
> Hi all,
>
> I've posted an implementation of the AlphaZero algorithm and brief
> tutorial. The code runs on a single GPU. While performance is not that
> great, I suspect it has mostly been limited by hardware (my
> training and evaluation have been on a single Titan X). The network can beat
> GNU Go about 50% of the time, although it "abuses" the scoring a little
> bit, which I talk a little more about in the article:
>
>
> https://medium.com/@cody2007.2/alphazero-implementation-and-tutorial-f4324d65fdfc
>
> -Cody
>
> _______________________________________________
> Computer-go mailing list
> Computer-go at computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go