[Computer-go] Aya reaches pro level on GoQuest 9x9 and 13x13

Roel van Engelen ich.bun.ut at gmail.com
Sat Nov 19 08:53:46 PST 2016


Hi Detlef

My bot is not pro yet, but I built Gosu Games
<https://play.google.com/store/apps/details?id=nl.ingele.gosugames>
(similar to waltheri.net <http://ps.waltheri.net/>), and I found certain
"odd" positions, each occurring in 200+ games, where over 80% of the pros
choose move A, yet the player who picks move A goes on to lose 90% of
those games.

This suggests that pro players choose a "suboptimal" move in certain positions.
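
A minimal sketch of how such positions could be flagged from a game
database. The record layout (position key, move, whether the mover won) and
the function name are assumptions for illustration, not the Gosu Games
implementation; the thresholds mirror the numbers above.

```python
from collections import Counter, defaultdict


def find_odd_positions(records, min_games=200, min_share=0.8, max_win_rate=0.1):
    """Return (position, move, share, win_rate) for popular but losing moves.

    records: iterable of (position_key, move, mover_won) tuples. This layout
    is a hypothetical one chosen for the example.
    """
    played = defaultdict(Counter)  # position -> move -> times played
    won = defaultdict(Counter)     # position -> move -> games won by the mover
    for position, move, mover_won in records:
        played[position][move] += 1
        if mover_won:
            won[position][move] += 1

    odd = []
    for position, moves in played.items():
        total = sum(moves.values())
        if total < min_games:
            continue  # too few games to say anything about this position
        for move, n in moves.items():
            share = n / total                   # how often this move is chosen
            win_rate = won[position][move] / n  # how often choosing it wins
            if share >= min_share and win_rate <= max_win_rate:
                odd.append((position, move, share, win_rate))
    return odd
```

A low win rate alone does not prove the move is bad, since the position
itself may already be lost; comparing against the win rate of the
alternative moves in the same position would be a natural next check.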

It seems to me that the influence of these "suboptimal" moves is
diminished by using reinforcement learning for a limited time.
Unfortunately, my implementation is not ready enough to verify this.
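
For scale, the 80% winrate and the 250 Elo figure in the quoted message
below can be related with the standard logistic Elo model. This is only a
sketch under that assumption; the function names are illustrative.

```python
import math


def elo_gap(win_rate):
    """Elo difference implied by a win rate under the logistic Elo model."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def expected_win_rate(elo_diff):
    """Inverse of elo_gap: expected score for a given Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


print(round(elo_gap(0.80)))  # 241
```

Under this model an 80% winrate is about 241 Elo, and a 250 Elo gap is
about an 81% winrate, so the two figures are consistent.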

Roel

On 19 November 2016 at 09:07, Detlef Schmicker <ds2 at physik.de> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Hiroshi,
>
> thanks a lot for your info.
>
> I think you did not try reinforcement learning. Do you have any idea
> why it would make the policy network 250 Elo stronger, as mentioned
> in the AlphaGo paper (80% winrate)?
>
> Are pros playing that badly?
>
> Do you think playing strength would be better if one only took into
> account the moves of the winning player?
>
> Detlef
>
> Am 19.11.2016 um 05:18 schrieb Hiroshi Yamashita:
> > Hi,
> >
> >> Did you not find a benefit from a larger value network? Too
> >> little data and too much overfitting? Or more benefit from more
> >> frequent evaluation?
> >
> > I did not find that a larger value network is better. But I think I
> > need more training data and stronger selfplay. I have not found
> > overfitting so far, and I did not try more frequent evaluation.
> >
> >>> Policy + Value vs Policy, 1000 playouts/move, 1000 games. 9x9, komi 7.0
> >>> 0.634  using game result. 0 or 1
> >>
> >> I presume this is a winrate, but over what base? Policy network?
> >
> > Yes. Policy network (root node only) + value network vs. policy
> > network (root node only).
> >
> >> How do you handle handicap games? I see you excluded them from
> >> the KGS dataset. Can your value network deal with handicap?
> >
> > I excluded handicap games. My value network cannot handle handicaps.
> > It is only for komi 7.5.
> >
> > Thanks, Hiroshi Yamashita
> >
> > _______________________________________________ Computer-go mailing
> > list Computer-go at computer-go.org
> > http://computer-go.org/mailman/listinfo/computer-go
> >
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.22 (GNU/Linux)
>
> iQIcBAEBAgAGBQJYMAhCAAoJEInWdHg+Znf4M+4P/RcgEbK7TpyPOf3BKdEEaw1u
> hGkCFYRDhTKHyqCDtlCTKAyoi8sUl0fCMCNOvzV17Cg46uZwNgS3PDqkPFVDuD7I
> GBZQgNDXmc9+80Vn0KdDbbBAwGhsH0emzKLndwcN9oshk6cylpIiwB73JC7kvijY
> uZb9iA+nOQNBbAvDDNxJNiTVz0qe3XPYSIZOaYa/HTwdnG3aFAkiC8bom3vs8Bn4
> h45NkY5YkcScQug4hWP7g9IWa3wEdbVPVKtE/B1SxcjOm5aksuOkJvoFFJwEsId1
> tifcT81JzThGJt1TgFpotgbA8QgDRGc6z3BXNggw5AuIU32zonqbljHiynG6Uz7I
> djxywrngr9Xif8KYlteSYVViA9cJZRwbE+nHFT1Fn8lc3BDk2lypG++IaMq0QwWM
> UmEn8U9TKhD4um8HcFSJGvrqUZBnsO8bcp9rUTFssqFm5ZGsoY0nwRt8EezKZ/Sh
> jZqbqplmYDIBoZ6f/VwQfe3OtPLSzmDtCYpx7lh4eXBTLQ74gr8NxksyE9JGXHk4
> tQ5bfRq4gobCkFuwHf2ypIhw8TNRvzq9QI4B3Hin7XcR6KKE27zqh3pNChH9VnXN
> jv5Elre4y71HCYlc5pZdeu6WK8RS+ju3nwsWJhfgZGsu5J0apFlt5XSzW2UnL+I5
> 0p6AUG2zTq7iuxuAZlaO
> =jpkW
> -----END PGP SIGNATURE-----