[Computer-go] mini-max with Policy and Value network

Gian-Carlo Pascutto gcp at sjeng.org
Mon May 22 11:22:53 PDT 2017


On 22-05-17 17:47, Erik van der Werf wrote:
> On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto <gcp at sjeng.org>
> wrote:
> 
> Well, I think that's fundamental; you can't be wide and deep at the same
> time, but at least you can choose an algorithm that (eventually) explores
> all directions.

Right. But I'm uncomfortable with the current setup, because many
options won't get explored at all in practical situations. It would seem
logical that some minimum amount of more widely spread search effort
would plug enough holes to stop bad blunders, but finding a way to do
that while preserving strength has been elusive so far.
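
One simple way to picture "a minimum amount of more spread search effort" is to blend the policy network's output with a uniform distribution before the search uses it, so that every legal move keeps a small floor. The sketch below only illustrates that idea and is not what Leela does; `spread_prior` and `eps` are names made up here, and the move probabilities are invented.

```python
def spread_prior(priors, eps=0.05):
    """Blend a policy prior with a uniform distribution.

    priors: dict mapping move -> probability (assumed to sum to roughly 1).
    eps:    hypothetical share of the prior mass spread evenly over all moves.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must be in [0, 1]")
    n = len(priors)
    total = sum(priors.values())
    # Renormalise the network output, then mix in the uniform floor eps / n.
    return {move: (1.0 - eps) * p / total + eps / n for move, p in priors.items()}


# Invented example: one dominant move and three near-zero alternatives.
raw = {"A": 0.97, "B": 0.02, "C": 0.009, "D": 0.001}
mixed = spread_prior(raw, eps=0.05)
```

The floor is not free: with `eps = 0.05`, five percent of the prior mass moves away from the moves the network prefers, which is the strength trade-off described above.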

> BTW I'm a bit surprised that you are still able to find 'big tactical
> holes' with Leela now playing as 8d KGS

I've attached an example. It's not the prettiest one (one side has a 0.5
pt advantage in the critical variation so exact komi is an issue), but
it's a recent one from my mailbox.

This is with Leela 0.10.0 so you can follow along:

Leela: loadsgf tactics2.sgf 261
=


Passes: 0            Black (X) Prisoners: 7
White (O) to move    White (O) Prisoners: 19

   a b c d e f g h j k l m n o p q r s t
19 . X O O . . . . . . . O O X X X O X . 19
18 . X X O O . . . . O O O X X X O O . O 18
17 . . X X O . . . O O X O O X . X O O . 17
16 . . X O O . . O O X X X X . . X X X . 16
15 . . X O . . O X X X X . . . . . X . X 15
14 . . . X O . O X O O O X . X X O O X(X)14
13 . . . X O O O O O X X X X O O X X X O 13
12 . . . . X X . O X X O O X O . O O X O 12
11 . . . . . . . O O X O X X O . . O O O 11
10 . . . X . X X O O O O O X O O O O . . 10
 9 . . . . X . X X O . O X O X X X O . .  9
 8 . . . . X . X O . . O X O O X O . . .  8
 7 X X X X O X X O . O O X O X X O O . .  7
 6 O O O O O O O . . . O X O . X X O . .  6
 5 X O O . X O O O O O X X X X X O . . .  5
 4 X X O O O X X O . O X X . X O O . . .  4
 3 X . X X X X O . O . O X . . X O . . .  3
 2 X O X . . O O . O . O O X X X O O . .  2
 1 . X . O O . . O O . O X X . X X O . .  1
   a b c d e f g h j k l m n o p q r s t

Hash: 106A3898CEC94132 Ko-Hash: 67E390C41BF2577

Black time: 00:30:00
White time: 00:30:00

Leela: heatmap
=

94.46% G11
 4.20% C1
 1.31% E2
 0.03% all other moves together

Note that O16 P17 N15 wins immediately. It's not that Leela is
completely blind to it, because that sequence shows up in some
variations. But here, O16 won't get searched for a long time (it is
actually the 4th-rated move) due to the skewed probabilities.
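
To make the effect of the skewed probabilities concrete, here is a rough sketch of a PUCT-style selection rule (value estimate plus a prior-weighted exploration bonus) that counts how many visits go elsewhere before a low-prior move is tried once. This is an assumption-laden toy, not Leela's actual selection formula: the function name, `c_puct`, the constant value estimate `q`, and the 0.01% prior given to O16 (a guess at its share of the 0.03% reported for all other moves together) are all hypothetical.

```python
import math


def visits_before_first_try(priors, target, c_puct=1.0, q=0.5, max_visits=100_000):
    """Count the visits a PUCT-style selector spends before it first picks `target`.

    Simplifying assumption: every move keeps the same value estimate `q`,
    so only the prior and the visit counts decide the selection.
    Returns None if `target` is not picked within `max_visits`.
    """
    counts = {move: 0 for move in priors}
    for spent in range(max_visits):
        parent_term = math.sqrt(spent + 1)  # +1 keeps the bonus non-zero at the start

        def score(move):
            return q + c_puct * priors[move] * parent_term / (1 + counts[move])

        best = max(priors, key=score)
        if best == target:
            return spent
        counts[best] += 1
    return None


# Priors taken from the heatmap above; the value for O16 is an assumed
# split of the 0.03% that the output reports for all other moves together.
priors = {"G11": 0.9446, "C1": 0.0420, "E2": 0.0131, "O16": 0.0001}
wasted = visits_before_first_try(priors, "O16")
```

Under these assumptions the count comes out on the order of 1/p for a move with prior p, so a move holding a small fraction of a percent waits for thousands of visits. A real search differs, since value estimates change as the tree grows and can pull a move forward or push it back.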

Leela: play w g11
=

Leela: heatmap
=

99.9% F11
 0.1% C1
   0% all other moves together

Leela: genmove b
....

Score looks bad. Resigning.
= resign

https://timkr.home.xs4all.nl/chess2/resigntxt.htm

If Black plays O16 here instead, he wins by 0.5 points.

-- 
GCP
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tactics2.sgf
Type: application/x-go-sgf
Size: 1667 bytes
Desc: not available
URL: <http://computer-go.org/pipermail/computer-go/attachments/20170522/067bbfab/attachment.sgf>

