[Computer-go] Detlef's DCNN data

Hiroshi Yamashita yss at bd.mbn.or.jp
Fri Sep 18 11:08:53 PDT 2015

```
Hi,

I tried Detlef's DCNN learning data with Aya.
http://computer-go.org/pipermail/computer-go/2015-April/007573.html
I tested selfplay at 10000 playouts/move, and Aya with DCNN got around
a 90% winrate.

DCNN returns a probability for each move. I multiply it by 1000, and
multiply each move's rating by the result. (r *= 1000 means multiply by 1000).

Each setting has fewer than 100 test games, but it seems the multiplication
constant has no effect. A 90% winrate is about +400 Elo. But this is selfplay,
and the playout does not understand semeai (capturing race). So I guess +50
or +100 Elo against humans.

10000 playout Aya with DCNN vs 10000 playout Aya without DCNN.
(1 thread, selfplay, Xeon W3680 3.3GHz, GTS 450)

winrate  wins/games  multiplier
0.943      83/88     r *= 1000
0.897      78/87     r *= 500
0.913      84/92     r *= 200
0.932      82/88     r *= 100
0.914      85/93     r *= 50

Select the move with the maximum ucb_rave.
MM_gamma is each move's rating from Remi's Elo rating paper.
---------------------------------------------------------------
r = result_DCNN(pos(x,y));
if ( r < 0.001 ) r = 0.001;
r *= 1000;
MM_gamma *= r;

C = 0.31;
ucb   = moveWins/moveCount + C * sqrt( log(moveSum+1) / moveCount );
rave  = raveWins/raveCount + C * sqrt( log((moveSum+1)*175) / ((moveSum+1)*0.48) );

W1 = (1.0 / 0.9);  // from fuego
W2 = (1.0 / 20000);
beta = raveCount / (raveCount + moveCount * (W1 + W2 * raveCount));

K = 1200;
bias = 0.01 * log(1 + MM_gamma) * sqrt( K / (K + moveCount));

ucb_rave = beta * rave + (1 - beta) * ucb + bias;
---------------------------------------------------------------

Aya calls DCNN when a node is created. Aya makes 900 nodes in 10000
playouts. GTS 450 needs 17.4 ms per position, so 900 * 17.4 ms = 15.6 sec
is needed. Aya needs 5 sec for 10000 playouts without DCNN, and
20.6 sec with DCNN. So it is 4 times slower.

I heard HiraBot jumped from 2d to 3d by using Detlef's data. He uses
DCNN only in the root node. HiraBot's prediction rate without DCNN is 38.5%.
MC_ark jumped from 2k to 1d by using Detlef's data. MC_ark uses DCNN
only in the root node and the root's children. Aya's prediction rate is 38.8%,
and Detlef's DCNN is 44%.

Time for one position

         time       CUDA cores  clock
GTS 450   17.4 ms        192     783MHz
GTX 970    1.6 ms      1,664    1050MHz

*CPU     235.0 ms   ... Xeon W3680 3.3GHz one thread.

GTX 970 is 11 times faster than GTS 450.
Maybe this equals the CUDA core ratio (8.6) x the clock ratio (1.3).
I also use caffe. Installing caffe was the most difficult part...
And thank you Detlef for publishing your data!

My test code and Makefile.
http://yss-aya.com/20150907detlef_test.zip

Regards,
Hiroshi Yamashita

```
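
The selection formula in the post can be sketched in Python as below. This is only a transcription of the pseudocode, not Aya's actual source. The function names `scaled_gamma` and `ucb_rave` and the snake_case argument names are made up for illustration. The sketch assumes both `move_count` and `rave_count` are greater than zero, which the post does not discuss.

```python
import math

C = 0.31            # exploration constant from the post
W1 = 1.0 / 0.9      # RAVE weight, "from fuego" per the post
W2 = 1.0 / 20000    # RAVE weight from the post
K = 1200            # bias decay constant from the post


def scaled_gamma(mm_gamma, dcnn_prob, scale=1000.0, floor=0.001):
    """Multiply a move's MM_gamma rating by the floored, scaled DCNN probability."""
    r = max(dcnn_prob, floor) * scale
    return mm_gamma * r


def ucb_rave(move_wins, move_count, rave_wins, rave_count, move_sum, mm_gamma):
    """Selection value as written in the post (assumes both counts are > 0)."""
    ucb = move_wins / move_count + C * math.sqrt(
        math.log(move_sum + 1) / move_count)
    rave = rave_wins / rave_count + C * math.sqrt(
        math.log((move_sum + 1) * 175) / ((move_sum + 1) * 0.48))
    beta = rave_count / (rave_count + move_count * (W1 + W2 * rave_count))
    # The prior bias fades as the move accumulates real playouts.
    bias = 0.01 * math.log(1 + mm_gamma) * math.sqrt(K / (K + move_count))
    return beta * rave + (1 - beta) * ucb + bias
```

With all counts equal, a move whose DCNN probability is higher gets a larger `mm_gamma` and therefore a larger bias term, so it is searched earlier. The bias shrinks as `move_count` grows relative to `K`.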
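
The "90% winrate is about +400 Elo" estimate and the observation that the multiplier seems to have no effect can be checked with a short sketch. The Elo conversion is the standard logistic model. The Wilson score interval is an assumption added here to show how wide the uncertainty is with fewer than 100 games, and it is not something the post itself uses. The function names are hypothetical.

```python
import math


def elo_diff(winrate):
    """Elo difference implied by a winrate under the logistic Elo model."""
    return 400.0 * math.log10(winrate / (1.0 - winrate))


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a winrate."""
    p = wins / games
    denom = 1.0 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games
                         + z * z / (4 * games * games)) / denom
    return center - half, center + half


if __name__ == "__main__":
    print(round(elo_diff(0.9)))          # Elo difference for a 90% winrate
    # wins/games rows from the table in the post
    for label, wins, games in [("r *= 1000", 83, 88), ("r *= 500", 78, 87),
                               ("r *= 200", 84, 92), ("r *= 100", 82, 88),
                               ("r *= 50", 85, 93)]:
        lo, hi = wilson_interval(wins, games)
        print(f"{label:10s} {wins / games:.3f}  [{lo:.3f}, {hi:.3f}]")
```

The intervals for the best and worst rows overlap, which is consistent with the post's reading that these game counts cannot separate the multipliers.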
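
The timing arithmetic in the post can be reproduced as follows. This is a minimal sketch using only the numbers quoted above. It assumes one DCNN call per created node and no overlap between GPU and CPU work. The function names are hypothetical.

```python
def dcnn_seconds(nodes, ms_per_position):
    """Total DCNN time in seconds, assuming one call per created node."""
    return nodes * ms_per_position / 1000.0


def slowdown(base_seconds, nodes, ms_per_position):
    """Ratio of search time with DCNN to search time without it."""
    return (base_seconds + dcnn_seconds(nodes, ms_per_position)) / base_seconds


if __name__ == "__main__":
    print(dcnn_seconds(900, 17.4))      # DCNN time on GTS 450
    print(slowdown(5.0, 900, 17.4))     # roughly 4 times slower on GTS 450
    print(slowdown(5.0, 900, 1.6))      # same search on GTX 970
    print(17.4 / 1.6)                   # measured GTX 970 speedup
    print((1664 / 192) * (1050 / 783))  # CUDA core ratio x clock ratio
```

Under the same assumptions, the GTX 970 figure suggests the DCNN overhead would drop to well under the 5 sec playout time.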