[Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

"Ingo Althöfer" 3-Hirn-Verlag at gmx.de
Wed Dec 6 11:41:51 PST 2017

> The AlphaZero paper shows it out-performs AlphaGoZero, but they are
> comparing to the 20-block, 3-day version. Not the 40-block, 40-day
> version that was even stronger.
> As papers rarely show failures, can we take it to mean they couldn't
> out-perform their best go bot, do you think? ...
> In other words, do you think the changes they made from AlphaGo Zero to
> Alpha Zero have made it weaker ...

Just some speculation:

The article on AlphaGo Zero is in NATURE.
Perhaps they did the AlphaZero research at the same time,
and when they ran into problems getting it accepted by a journal (like NATURE)
they decided to publish a preprint on AlphaZero on arXiv.
So perhaps the 40-block, 40-day experiment had not yet been done when
they wrote the AlphaZero paper.

Just speculating...

