[Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

Álvaro Begué alvaro.begue at gmail.com
Thu Feb 4 12:43:06 PST 2016


I just want to find some definition of the error that gives 0.5 for the
initial position on the board.

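One possibility is that 0=loss, 1=win, and the number they are quoting is
sqrt(average((prediction-outcome)^2)), i.e. the square root of the MSE. A
prediction of 0.5 for the empty board would then score exactly 0.5.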
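
As a quick check of this reading, here is a minimal sketch (my own
illustration, not taken from the paper) that scores an uninformed prediction
at the empty board under both outcome encodings discussed in this thread.
The helper names mse and rmse, the sample size, and the even 50/50 split of
game results are assumptions made for the example.

```python
import math


def mse(predictions, outcomes):
    """Mean squared error between predictions and game outcomes."""
    total = sum((p - o) ** 2 for p, o in zip(predictions, outcomes))
    return total / len(outcomes)


def rmse(predictions, outcomes):
    """Square root of the mean squared error."""
    return math.sqrt(mse(predictions, outcomes))


# Assumption: from the empty board, each side wins half of the games.
n = 1000
outcomes_01 = [0.0, 1.0] * (n // 2)   # encoding: 0 = loss, 1 = win
outcomes_pm = [-1.0, 1.0] * (n // 2)  # encoding: -1 = loss, +1 = win

# An uninformed predictor outputs the midpoint of the encoding.
pred_01 = [0.5] * n
pred_pm = [0.0] * n

results = {
    "MSE,  0/1 encoding": mse(pred_01, outcomes_01),
    "RMSE, 0/1 encoding": rmse(pred_01, outcomes_01),
    "MSE,  -1/+1 encoding": mse(pred_pm, outcomes_pm),
    "RMSE, -1/+1 encoding": rmse(pred_pm, outcomes_pm),
}

for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Of these four numbers, only the RMSE under the 0/1 encoding comes out at
0.50 (the others are 0.25, 1.00 and 1.00), which is why that definition would
fit the start of the curve.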


On Thu, Feb 4, 2016 at 3:40 PM, Hideki Kato <hideki_katoh at ybb.ne.jp> wrote:

> I think the error is defined as the difference between the
> output of the value network and the average output of the
> simulations done by the policy network (RL) at each position.
>
> Hideki
>
> Michael Markefka: <CAJg7PAN9G2_htRs0mfKuFi82yef7gNFCsouE4ez4f37_pK=KQw at mail.gmail.com>:
> >That sounds like it'd be the MSE as classification error of the
> >eventual result.
> >
> >I'm currently not able to look at the paper, but couldn't you use a
> >softmax output layer with two nodes and take the probability
> >distribution as winrate?
> >
> >On Thu, Feb 4, 2016 at 8:34 PM, Álvaro Begué <alvaro.begue at gmail.com> wrote:
> >> I am not sure how exactly they define MSE. If you look at the plot in
> >> figure 2b, the MSE at the very beginning of the game (where you can't
> >> possibly know anything about the result) is 0.50. That suggests it's
> >> something else than your [very sensible] interpretation.
> >>
> >> Álvaro.
> >>
> >> On Thu, Feb 4, 2016 at 2:24 PM, Detlef Schmicker <ds2 at physik.de> wrote:
> >>>
> >>> >> Since all positions of all games in the dataset are used, winrate
> >>> >> should be distributed from 0% to 100%, or -1 to 1, not 1. Then, the
> >>> >> number 70% could be wrong. MSE is 0.37 just means the average
> >>> >> error is about 0.6, I think.
> >>>
> >>> 0.6 in the range of -1 to 1,
> >>>
> >>> which means -1 (e.g. lost by B) games -> typical value -0.4
> >>> and +1 games -> typical value +0.4 of the value network
> >>>
> >>> If I rescale -1 to +1 to 0 - 100% (e.g. winrate for B), then I get about
> >>> 30% for games lost by B and 70% for games won by B?
> >>>
> >>> Detlef
> >>>
> >>> On 04.02.2016 at 20:10, Hideki Kato wrote:
> >>> > Detlef Schmicker: <56B385CE.4080804 at physik.de>:
> >>> > Hi,
> >>> >
> >>> > I am trying to reproduce numbers from section 3: training the value
> >>> > network
> >>> >
> >>> > On the test set of KGS games the MSE is 0.37. Is it correct that
> >>> > the results are represented as +1 and -1?
> >>> >
> >>> >> Looks correct.
> >>> >
> >>> > This means that in a typical board position you get a value of
> >>> > 1-sqrt(0.37) = 0.4  --> this would correspond to a win rate of 70%
> >>> > ?!
> >>> >
> >>> >> Since all positions of all games in the dataset are used, winrate
> >>> >> should be distributed from 0% to 100%, or -1 to 1, not 1. Then, the
> >>> >> number 70% could be wrong. MSE is 0.37 just means the average
> >>> >> error is about 0.6, I think.
> >>> >
> >>> >> Hideki
> >>> >
> >>> > Is it really true that a typical KGS 6d+ position is judged with
> >>> > such a high win rate (even though it is overfitted, so the test
> >>> > set number is too bad!), or do I misinterpret the MSE calculation?!
> >>> >
> >>> > Any help would be great,
> >>> >
> >>> > Detlef
> >>> >
> >>> > On 27.01.2016 at 19:46, Aja Huang wrote:
> >>> >>>> Hi all,
> >>> >>>>
> >>> >>>> We are very excited to announce that our Go program, AlphaGo,
> >>> >>>> has beaten a professional player for the first time. AlphaGo
> >>> >>>> beat the European champion Fan Hui by 5 games to 0. We hope
> >>> >>>> you enjoy our paper, published in Nature today. The paper and
> >>> >>>> all the games can be found here:
> >>> >>>>
> >>> >>>> http://www.deepmind.com/alpha-go.html
> >>> >>>>
> >>> >>>> AlphaGo will be competing in a match against Lee Sedol in
> >>> >>>> Seoul, this March, to see whether we finally have a Go
> >>> >>>> program that is stronger than any human!
> >>> >>>>
> >>> >>>> Aja
> >>> >>>>
> >>> >>>> PS I am very busy preparing AlphaGo for the match, so
> >>> >>>> apologies in advance if I cannot respond to all questions
> >>> >>>> about AlphaGo.
> >>> >>>>
> >>> >>>> _______________________________________________
> >>> >>>> Computer-go mailing list
> >>> >>>> Computer-go at computer-go.org
> >>> >>>> http://computer-go.org/mailman/listinfo/computer-go
> --
> Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>