[Computer-go] AGZ Policy Head
andy.olsen.tx at gmail.com
Thu Dec 28 21:47:06 PST 2017
Is there some particular reason AGZ uses two 1x1 filters for the policy
head instead of one?
They could also have used more than two, but I guess that would be expensive?
By my calculation the policy head's fully connected layer has 2*361*362
weights: 2 filters times 361 board points in, and 362 outputs (361 moves plus
pass).
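
As a rough sanity check on that count, here is a small sketch of how the size of the policy head's fully connected layer scales with the number of 1x1 filters. It assumes a 19x19 board, 362 outputs (361 moves plus pass), and ignores biases; the function and constant names are just illustrative, not from any AGZ code.

```python
# Assumptions: 19x19 board, biases not counted, names are hypothetical.
BOARD_POINTS = 19 * 19              # 361 intersections
POLICY_OUTPUTS = BOARD_POINTS + 1   # 361 moves plus pass


def policy_fc_weights(num_filters: int) -> int:
    """Weights in the dense layer that follows the 1x1 convolution.

    Each 1x1 filter yields one 19x19 plane, so the flattened input to
    the dense layer has num_filters * 361 values.
    """
    return num_filters * BOARD_POINTS * POLICY_OUTPUTS


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(f"{k} filter(s): {policy_fc_weights(k):,} weights")
```

The count grows linearly with the number of filters, so each extra filter adds another 361*362 weights to this one layer.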
By comparison the value head has only a single 1x1 filter, but it goes to a
hidden layer of 256. That gives 1*361*256 weights. Why not use two 1x1
filters here? Maybe since the final output is only a single scalar it's not