[Computer-go] Aya reaches pro level on GoQuest 9x9 and 13x13

valkyria at phmp.se
Mon Nov 21 06:35:19 PST 2016


Yes, I think the important job of the value function is to detect 
moves that are very bad, so that for many variations MC-eval does not 
need more than one sample.

If the evaluation function were trained on pro moves only, it would not 
know what a bad move looks like. At least the evaluation function would 
not be able to see the difference between "very bad", "never good" and 
"sometimes possible".

Magnus

On 2016-11-21 15:22, Gian-Carlo Pascutto wrote:
> For the Value Network indeed the procedure is as described, with one
> move at time U being uniformly sampled from {1,361} until it is legal. I
> think it's because we're not interested (only) in playing good moves,
> but also analyzing as diverse as possible positions to learn whether
> they're won or lost. Throwing in one totally random move vastly
> increases the diversity and the number of odd positions the network
> sees, while still not leading to totally nonsensical positions.
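
The "uniformly sampled from {1,361} until it is legal" step quoted above is plain rejection sampling, and a rough sketch may make it concrete. This is only an illustration under assumptions: the name sample_random_legal_move, the is_legal callback and the max_tries guard are all hypothetical and not taken from any actual program, and points are simply numbered 1..361 on a 19x19 board.

```python
import random

BOARD_POINTS = 361  # 19x19 board, points numbered 1..361 (assumed numbering)


def sample_random_legal_move(is_legal, rng=random, max_tries=100_000):
    """Draw points uniformly from {1, ..., 361} until one is legal.

    is_legal is a hypothetical callback: point -> bool.
    max_tries is an added safety guard so the loop cannot run forever
    when no legal point exists; it is not part of the quoted procedure.
    """
    for _ in range(max_tries):
        point = rng.randint(1, BOARD_POINTS)  # both ends inclusive
        if is_legal(point):
            return point
    raise ValueError("no legal move found within max_tries draws")


# Toy usage: pretend points 1..3 are occupied and everything else is legal.
occupied = {1, 2, 3}
move = sample_random_legal_move(lambda p: p not in occupied, random.Random(0))
```

Because every rejected draw is simply thrown away, the accepted move is uniformly distributed over the legal points, which is what makes the position at time U so diverse.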
