[Computer-go] mini-max with Policy and Value network

Erik van der Werf erikvanderwerf at gmail.com
Mon May 22 08:47:40 PDT 2017


On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto <gcp at sjeng.org> wrote:

> On 22-05-17 11:27, Erik van der Werf wrote:
> > On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto <gcp at sjeng.org> wrote:
> >
> >     ... This heavy pruning
> >     by the policy network OTOH seems to be an issue for me. My program
> >     has big tactical holes.
> >
> >
> > Do you do any hard pruning? My engines (Steenvreter, Magog) always had a
> > move predictor (a.k.a. policy net), but I never saw the need to do hard
> > pruning. Steenvreter uses the predictions to set priors, and it is very
> > selective, but with infinite simulations eventually all potentially
> > relevant moves will get sampled.
>
> With infinite simulations everything is easy :-)
>
> In practice moves with, say, a prior below 0.1% aren't going to get
> searched, and I still regularly see positions where they're the winning
> move, especially with tactics on the board.
>
> Enforcing the search to be wider without losing playing strength appears
> to be hard.
>
>
Well, I think that's fundamental; you can't be wide and deep at the same
time, but at least you can choose an algorithm that (eventually) explores
all directions.
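
To make the trade-off concrete, here is a minimal sketch of prior-guided
selection at a single node. It is not Steenvreter's or Leela's actual code:
the PUCT-style formula, the constant c_puct, the choice of 0.0 as the value
of an unvisited move, and the toy priors and values are all assumptions made
for illustration. Each simulation simply returns a fixed value per move, so
the run is deterministic. Nothing is hard-pruned, so the 0.1% move is reached
eventually, but only after a budget far larger than a short search allows.

```python
import math


def puct_select(priors, visits, value_sums, c_puct=1.0):
    """Pick the move maximizing Q + U, where U is scaled by the prior."""
    total = sum(visits)
    best_move, best_score = None, -math.inf
    for move, prior in enumerate(priors):
        # Assumption: an unvisited move starts with Q = 0.0.
        q = value_sums[move] / visits[move] if visits[move] else 0.0
        u = c_puct * prior * math.sqrt(total + 1) / (1 + visits[move])
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move


def first_visit(priors, values, target, max_sims, c_puct=1.0):
    """Return the simulation number at which `target` is first selected,
    or None if the budget runs out before that happens."""
    visits = [0] * len(priors)
    value_sums = [0.0] * len(priors)
    for sim in range(1, max_sims + 1):
        move = puct_select(priors, visits, value_sums, c_puct)
        if move == target:
            return sim
        visits[move] += 1
        value_sums[move] += values[move]  # toy model: fixed result per move
    return None


# Hypothetical position: move 2 is the winning move but has a 0.1% prior.
PRIORS = [0.6, 0.399, 0.001]
VALUES = [0.5, 0.4, 0.9]

if __name__ == "__main__":
    print("10k sims:", first_visit(PRIORS, VALUES, 2, max_sims=10_000))
    print("1M sims: ", first_visit(PRIORS, VALUES, 2, max_sims=1_000_000))
```

With these toy numbers the exploration term of move 2 has to exceed the 0.5
average of move 0, which needs sqrt(N) > 500, i.e. more than 250,000
simulations at the node. So both statements hold at once: the move is never
excluded, and in practice it is not searched.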

BTW I'm a bit surprised that you are still able to find 'big tactical
holes' with Leela now playing as 8d on KGS.

Best,
Erik