[Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

Richard Lorentz richard.lorentz at csun.edu
Wed Dec 6 09:50:01 PST 2017


One chess result stood out for me, namely, just how much easier it was 
for AlphaZero to win with white (25 wins, 25 draws, 0 losses) rather 
than with black (3 wins, 47 draws, 0 losses).

Maybe we should not give up on the idea of White to play and win in chess!
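
To put numbers on that gap, here is a minimal sketch that turns the two records quoted above into score fractions, counting a draw as half a point. The helper names are hypothetical, and the Elo conversion uses the standard logistic rating model, which is an assumption added here and not something taken from the paper.

```python
import math


def score_fraction(wins, draws, losses):
    """Fraction of available points scored, with a draw worth half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_gap(score):
    """Rating gap implied by a score fraction under the logistic Elo model.

    Assumption: expected score = 1 / (1 + 10 ** (-gap / 400)).
    Only defined for 0 < score < 1.
    """
    return 400.0 * math.log10(score / (1.0 - score))


# Records from the message above: (wins, draws, losses) for AlphaZero.
white = score_fraction(25, 25, 0)
black = score_fraction(3, 47, 0)
overall = score_fraction(25 + 3, 25 + 47, 0)

for label, score in (("white", white), ("black", black), ("overall", overall)):
    print(f"{label:8s} score {score:.2f}  implied gap {elo_gap(score):+.0f} Elo")
```

With only 50 games per colour, the implied rating gaps are rough indications rather than precise measurements.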

On 12/06/2017 01:24 AM, Hiroshi Yamashita wrote:
> Hi,
>
> DeepMind has built the strongest chess and shogi programs using the
> AlphaGo Zero method.
>
> Mastering Chess and Shogi by Self-Play with a General Reinforcement 
> Learning Algorithm
> https://arxiv.org/pdf/1712.01815.pdf
>
>
> AlphaZero(Chess) outperformed Stockfish after 4 hours,
> AlphaZero(Shogi) outperformed elmo after 2 hours.
>
> The search is MCTS (Monte Carlo tree search).
> AlphaZero(Chess) searches     80,000 positions/sec.
> Stockfish        searches 70,000,000 positions/sec.
> AlphaZero(Shogi) searches     40,000 positions/sec.
> elmo             searches 35,000,000 positions/sec.
>
> Thanks,
> Hiroshi Yamashita
>
> _______________________________________________
> Computer-go mailing list
> Computer-go at computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
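
The search speeds quoted in the original message can be compared directly. The short sketch below only divides the quoted positions-per-second figures; the dictionary and variable names are hypothetical.

```python
# Positions searched per second, as quoted in the message above.
positions_per_sec = {
    "AlphaZero (chess)": 80_000,
    "Stockfish": 70_000_000,
    "AlphaZero (shogi)": 40_000,
    "elmo": 35_000_000,
}

# How many positions the conventional engine searches for each one
# that AlphaZero searches.
chess_ratio = positions_per_sec["Stockfish"] / positions_per_sec["AlphaZero (chess)"]
shogi_ratio = positions_per_sec["elmo"] / positions_per_sec["AlphaZero (shogi)"]

print(f"chess: Stockfish searches {chess_ratio:.0f}x as many positions per second")
print(f"shogi: elmo searches {shogi_ratio:.0f}x as many positions per second")
```

Both ratios come out to 875, so in each game AlphaZero searched roughly three orders of magnitude fewer positions per second than its opponent.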
