[Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

Hiroshi Yamashita yss at bd.mbn.or.jp
Wed Dec 6 01:24:28 PST 2017


Hi,

DeepMind has made the strongest chess and shogi programs using the AlphaGo Zero method.

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
https://arxiv.org/pdf/1712.01815.pdf

AlphaZero(Chess) outperformed Stockfish after 4 hours of training,
AlphaZero(Shogi) outperformed elmo after 2 hours of training.

The search is MCTS (Monte Carlo tree search).

AlphaZero(Chess) searches     80,000 positions/sec.
Stockfish        searches 70,000,000 positions/sec.
AlphaZero(Shogi) searches     40,000 positions/sec.
elmo             searches 35,000,000 positions/sec.
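
For scale, the figures above work out to the same ratio in both games. Here is a quick sketch of the arithmetic; the dictionary layout and the speed_ratio helper are just illustrative names of mine, not anything from the paper.

```python
# Search speeds in positions per second, as quoted in this post.
speeds = {
    "chess": {"AlphaZero": 80_000, "Stockfish": 70_000_000},
    "shogi": {"AlphaZero": 40_000, "elmo": 35_000_000},
}


def speed_ratio(game):
    """Return how many times more positions/sec the other engine searches
    compared with AlphaZero for the given game."""
    engines = speeds[game]
    alphazero = engines["AlphaZero"]
    # Each game lists exactly one engine besides AlphaZero.
    other = next(v for name, v in engines.items() if name != "AlphaZero")
    return other / alphazero


for game in speeds:
    print(f"{game}: {speed_ratio(game):.0f}x")
# prints:
# chess: 875x
# shogi: 875x
```

So in both chess and shogi AlphaZero looks at roughly 1/875 of the positions per second that the conventional engine does.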

Thanks,
Hiroshi Yamashita


