[Computer-go] AGZ Policy Head
remi.coulom at free.fr
Fri Dec 29 04:50:16 PST 2017
I also wonder about this. A purely convolutional approach would save a lot of weights. The output for pass can be set to be a single bias parameter, connected to nothing. Setting pass to a constant might work, too. I don't understand the reason for such a complication.
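To make the idea concrete, here is a minimal sketch of what such a purely convolutional policy head could look like, in plain Python. It is an illustration under stated assumptions, not the AGZ implementation: the function names, the single shared 1x1 filter, and the plain-list feature layout are all hypothetical, and the 256-channel figure used in the usage note is assumed to be the width of the trunk feeding the head.

```python
import math

BOARD_POINTS = 19 * 19  # 361 intersections on a 19x19 board


def conv_policy_head(features, weights, conv_bias, pass_bias):
    """Hypothetical purely convolutional policy head.

    features : list of C channel planes, each a list of 361 floats
    weights  : C weights of a single 1x1 filter, shared by all points
    Returns 362 probabilities: 361 board moves followed by pass.
    """
    # A 1x1 convolution is the same weighted sum over channels at every point.
    logits = [
        conv_bias + sum(w * plane[p] for w, plane in zip(weights, features))
        for p in range(BOARD_POINTS)
    ]
    # The pass logit is a lone bias parameter, connected to nothing.
    logits.append(pass_bias)

    # Numerically stable softmax over all 362 moves.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_parameter_count(channels):
    """Filter weights, plus the filter bias, plus the pass bias."""
    return channels + 1 + 1
```

Under these assumptions, `head_parameter_count(256)` gives 258 parameters for the whole head, against the 2*361*362 weights of the fully connected layer discussed in the quoted message.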
----- Original Message -----
From: "Andy" <andy.olsen.tx at gmail.com>
To: "computer-go" <computer-go at computer-go.org>
Sent: Friday, December 29, 2017 06:47:06
Subject: [Computer-go] AGZ Policy Head
Is there some particular reason AGZ uses two 1x1 filters for the policy head instead of one?
They could also have used more, but I guess that would be expensive? By my calculation the fully connected layer has 2*361*362 weights, where 2 is the number of filters.
By comparison, the value head has only a single 1x1 filter, but it feeds a hidden layer of 256 units. That gives 1*361*256 weights. Why not use two 1x1 filters here? Maybe it isn't needed because the final output is only a single scalar?
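The weight counts quoted above can be checked with a few lines of Python. This is only the arithmetic from the message; the helper name `fc_weights` is made up for illustration, and bias terms are left out, as they are in the figures above.

```python
BOARD_POINTS = 19 * 19    # 361 intersections
MOVES = BOARD_POINTS + 1  # 361 board moves plus pass = 362


def fc_weights(filters, outputs):
    """Weights in a fully connected layer fed by `filters` 19x19 planes."""
    return filters * BOARD_POINTS * outputs


policy_fc = fc_weights(2, MOVES)           # 2*361*362 = 261364
value_fc = fc_weights(1, 256)              # 1*361*256 = 92416
value_fc_two_filters = fc_weights(2, 256)  # 2*361*256 = 184832

print(policy_fc, value_fc, value_fc_two_filters)
```

So a second 1x1 filter in the value head would double its first fully connected layer to 184832 weights, which is still fewer than the policy head's 261364.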