[Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

Gian-Carlo Pascutto gcp at sjeng.org
Thu Dec 7 01:12:58 PST 2017


On 06-12-17 22:29, Brian Sheppard via Computer-go wrote:
> The chess result is 64-36: a 100 rating point edge! I think the
> Stockfish open source project improved Stockfish by ~20 rating points in
> the last year.

It's about 40-45 Elo FWIW.

> AZ would dominate the current TCEC. 

I don't think you'll get to 80 knps with a regular 22 core machine or
whatever they use. Remember that AZ hardware is about 16 x 1080 Ti's.
You'll lose that (70 - 40 = 30 Elo) advantage very, very quickly.

IMHO this makes it all the more clear how silly it is that so much
attention is given to TCEC with its completely arbitrary hardware choice.

> The Stockfish team will have some self-examination going forward for
> sure. I wonder what they will decide to do.

Probably the same the Zen team did. Ignore a large part of the result
because people's actual computers - let alone mobile phones - can't run
a neural net at TPU speeds.

The question is if resizing the network makes the resulting program more
competitive, enough to overcome the speed difference. And, aha, in which
direction are you going to try to resize? Bigger or smaller?

-- 
GCP


More information about the Computer-go mailing list