[Computer-go] CNN for winrate and territory
ds2 at physik.de
Sun Feb 8 02:22:29 PST 2015
I am working on a CNN for winrate and territory:
- input: 2 layers for b and w stones
- 1. output: 1 layer territory (0.0 for owned by white, 1.0 for owned
by black; I used SIGMOID because I missed TANH in the first place)
- 2. output: label for -60 to +60 points of territory lead by black
Both outputs contribute to the loss that is trained.
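
A minimal sketch of how the two losses could be combined. The post does
not say which loss functions were used, so this assumes per-point binary
cross-entropy on the sigmoid territory output and softmax cross-entropy
over the 121 lead labels (-60 to +60). The function name, the weights
and the label offset of 60 are illustrative assumptions.

```python
import numpy as np

def combined_loss(territory_logits, territory_target, lead_logits,
                  lead_black, w_territory=1.0, w_lead=1.0):
    """Hypothetical joint loss for the territory and lead outputs.

    territory_logits: (n, n) raw network outputs, one per board point
    territory_target: (n, n) targets in [0, 1] (0.0 white, 1.0 black)
    lead_logits:      (121,) raw outputs for leads -60 .. +60
    lead_black:       integer lead of black in [-60, 60]
    """
    eps = 1e-12  # avoids log(0)
    # Sigmoid maps each board point into (0, 1).
    p = 1.0 / (1.0 + np.exp(-territory_logits))
    bce = -np.mean(territory_target * np.log(p + eps)
                   + (1.0 - territory_target) * np.log(1.0 - p + eps))
    # Numerically stable log-softmax over the 121 lead classes.
    z = lead_logits - np.max(lead_logits)
    log_probs = z - np.log(np.sum(np.exp(z)))
    label = int(lead_black) + 60  # assumed mapping: -60 -> 0, +60 -> 120
    ce = -log_probs[label]
    return w_territory * bce + w_lead * ce

# Made-up inputs: an untrained network that outputs all zeros.
loss = combined_loss(np.zeros((19, 19)), np.ones((19, 19)),
                     np.zeros(121), 7)
```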
The idea is that this way I do not have to put komi into the input;
instead I derive the winrate from the statistics of the trained label:
e.g. komi 6.5: I sum the probabilities from +7 to +60 and get something
like a winrate
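
The summation described above can be sketched as follows, assuming the
second output is a probability distribution over the 121 labels with
index 0 corresponding to -60. The function name and the index layout
are assumptions for illustration.

```python
def winrate_from_lead_probs(probs, komi, min_lead=-60):
    """Sum the probabilities of all leads by black that exceed komi.

    probs[i] is the predicted probability that black leads by
    (min_lead + i) points.
    """
    winrate = 0.0
    for i, p in enumerate(probs):
        lead = min_lead + i
        if lead > komi:  # black wins only if the lead beats komi
            winrate += p
    return winrate

# Made-up example: a uniform distribution over the 121 labels.
uniform = [1.0 / 121] * 121
black_winrate = winrate_from_lead_probs(uniform, 6.5)  # sums +7 .. +60
```

Because komi only enters at this step, the same network output can be
reused for any komi without retraining.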
I trained with 800000 positions, with territory information obtained
from 500 oakfoam playouts each, which I symmetrized using the 8
transformations, leading to >6000000 positions. (It is expensive to
produce the positions due to the playouts...)
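
A sketch of the 8-fold symmetrization (4 rotations, each with and
without a mirror), assuming positions are stored as NumPy arrays of
shape (2, n, n) for the stone layers and (n, n) for the territory
target. The array layout and function name are assumptions. The lead
label needs no change, since rotating or mirroring the board does not
alter the score.

```python
import numpy as np

def eight_symmetries(stones, territory):
    """Return the 8 rotated/mirrored copies of one training position.

    stones:    (2, n, n) array, black and white stone layers
    territory: (n, n) array, territory target
    """
    out = []
    for k in range(4):
        # Rotate the board axes by k * 90 degrees.
        s = np.rot90(stones, k, axes=(1, 2))
        t = np.rot90(territory, k)
        out.append((s, t))
        # Mirror the rotated board left-right.
        out.append((np.flip(s, axis=2), np.flip(t, axis=1)))
    return out

# Made-up 4x4 example with a single black stone off every symmetry axis.
stones = np.zeros((2, 4, 4))
stones[0, 0, 1] = 1.0
territory = stones[0].copy()
augmented = eight_symmetries(stones, territory)
```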
The layers are the same as the large network from Christopher Clark
<http://arxiv.org/find/cs/1/au:+Clark_C/0/1/0/all/0/1>, Amos Storkey
I get reasonable territory predictions from this network (compared to
500 playouts of oakfoam), but the winrates seem to be overestimated.
Anyway, it looks like it is worth doing some more work on it.
The idea is that at some point I can do the equivalent of, let's say,
1000 playouts with a call to the CNN for the cost of 2 playouts...
Now I am trying to make a soft transition from conventional playouts to
CNN-predicted winrates within the MC framework.
I do have some ideas, but I am not happy with them.
Maybe you have better ones :)
Thanks a lot