[Computer-go] Aya reaches pro level on GoQuest 9x9 and 13x13
gcp at sjeng.org
Fri Nov 18 00:49:55 PST 2016
On 17/11/2016 22:38, Hiroshi Yamashita wrote:
> Features are 49 channels.
> Value Net is 32 Filters, 14 Layers.
> 32 5x5 x1, 32 3x3 x11, 32 1x1 x1, fully connect 256, fully connect tanh 1
> Features are 50 channels.
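The quoted value-net shape is small enough to sanity-check by counting parameters. A sketch in Python, assuming a 9x9 board and taking the quoted layer list literally (padding, batch norm and other details are unknown, so this is only a rough lower bound):

```python
# Rough parameter count for the quoted value net: 50 input planes,
# one 5x5 conv, eleven 3x3 convs, one 1x1 conv (32 filters each),
# then a fully-connected 256 layer and a single tanh output.
# The 9x9 board size is an assumption; channel/filter counts are quoted.

def conv_params(in_ch, out_ch, k):
    """Weights plus one bias per output filter."""
    return in_ch * out_ch * k * k + out_ch

def fc_params(n_in, n_out):
    return n_in * n_out + n_out

BOARD = 9  # assumed board size

layers = [conv_params(50, 32, 5)]               # 32 5x5 x1
layers += [conv_params(32, 32, 3)] * 11         # 32 3x3 x11
layers += [conv_params(32, 32, 1)]              # 32 1x1 x1
layers += [fc_params(32 * BOARD * BOARD, 256)]  # fully connect 256
layers += [fc_params(256, 1)]                   # fully connect tanh 1

total = sum(layers)
print(total)  # roughly 0.8M parameters
```

At well under a million weights this is tiny by AlphaGo standards, which fits the theme of the thread: what works at this scale with limited data.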
Thank you for this information. It takes a long time to train these
networks, so knowing which experiments did not work is very valuable.
Did you not find any benefit from a larger value network? Was there too
little data and too much overfitting, or did the faster evaluation of a
smaller network help more?
> Policy + Value vs Policy, 1000 playouts/move, 1000 games. 9x9, komi 7.0
> 0.634 using game result. 0 or 1
I presume this is a winrate, but against what baseline? The policy network alone?
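If the quoted 0.634 is indeed a winrate over 1000 independent games, its statistical margin is easy to check. A quick sketch (the function names are just for illustration, not from the original mail):

```python
import math

# Sanity-check on the quoted 0.634 result, assuming it is a winrate
# measured over n = 1000 independent games.

def winrate_stderr(p, n):
    """Standard error of a binomial winrate estimate."""
    return math.sqrt(p * (1.0 - p) / n)

def winrate_to_elo(p):
    """Convert a winrate into an approximate Elo difference."""
    return 400.0 * math.log10(p / (1.0 - p))

p, n = 0.634, 1000
se = winrate_stderr(p, n)
print(f"0.634 +/- {1.96 * se:.3f} (95% CI)")  # roughly +/- 0.030
print(f"about {winrate_to_elo(p):.0f} Elo")   # roughly +95 Elo
```

So the improvement is well outside the noise for 1000 games, whatever the baseline turns out to be.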
> I also made 19x19 Value net. 19x19 learning positions are from KGS 4d over,
> GoGoD, Tygem and 500 playouts/move selfplay. 990255 games. 32 positions
> are selected from a game. Like Detlef's idea, I also use game result.
> I trust B+R and W+R games with komi 5.5, 6.5 and 7.5. In other games,
> If B+ and 1000 playouts at final position is over +0.60, I use it.
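The quoted selection rule can be written out as a short filter. This is a hypothetical sketch of my reading of it; the mail only states the B+ case for counted games, so the W+ handling below is a guess (rejecting is the conservative choice):

```python
# Sketch of the quoted position-selection rule. Thresholds and komi
# values are from the mail; function and field names are hypothetical.

TRUSTED_KOMI = {5.5, 6.5, 7.5}

def use_game(result, komi, final_eval=None):
    """Decide whether a game's positions enter the training set.

    result     -- SGF-style result string, e.g. "B+R", "W+R", "B+2.5"
    final_eval -- Black winrate from 1000 playouts at the final
                  position (only needed for counted games)
    """
    if result in ("B+R", "W+R"):
        # Resignations are trusted only at normal komi.
        return komi in TRUSTED_KOMI
    if result.startswith("B+"):
        # Counted Black wins must be confirmed by the search.
        return final_eval is not None and final_eval > 0.60
    # The mail does not state the counted W+ case; reject by default.
    return False

print(use_game("B+R", 6.5))         # trusted resignation
print(use_game("B+2.5", 6.5, 0.55)) # rejected: search disagrees
```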
How do you handle handicap games? I see you excluded them from the KGS
dataset. Can your value network deal with handicap at all?
At least in the KGS ruleset, handicap stones are added into the score
calculation, so the network needs to know the exact handicap.
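To illustrate why the exact handicap matters to a value net: under area scoring with handicap compensation, the same final board scores differently depending on how many free stones Black received. A minimal sketch, assuming one compensation point per handicap stone (the exact compensation varies by ruleset, so this is an assumption, not the KGS specification):

```python
# Why the value net must know the handicap: under area scoring with
# handicap compensation, White's score is adjusted for Black's free
# stones. comp_per_stone = 1.0 is an assumed value, not the KGS spec.

def final_margin(black_area, white_area, komi, handicap,
                 comp_per_stone=1.0):
    """Black's winning margin under area scoring with compensation."""
    compensation = handicap * comp_per_stone
    return black_area - white_area - komi - compensation

# The identical board position flips from a win to a loss for Black
# once the handicap compensation is counted:
print(final_margin(44, 37, 0.5, 0))  # 6.5  -- Black wins
print(final_margin(44, 37, 0.5, 9))  # -2.5 -- Black loses
```

A network trained without a handicap input would see the same position in both cases and could not predict both outcomes correctly.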