[Computer-go] mini-max with Policy and Value network
Erik van der Werf
erikvanderwerf at gmail.com
Mon May 22 08:47:40 PDT 2017
On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto <gcp at sjeng.org> wrote:
> On 22-05-17 11:27, Erik van der Werf wrote:
> > On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto <gcp at sjeng.org> wrote:
> > > ... This heavy pruning
> > > by the policy network OTOH seems to be an issue for me. My program
> > > has big tactical holes.
> > Do you do any hard pruning? My engines (Steenvreter, Magog) always had a
> > move predictor (a.k.a. policy net), but I never saw the need to do hard
> > pruning. Steenvreter uses the predictions to set priors, and it is very
> > selective, but with infinite simulations eventually all potentially
> > relevant moves will get sampled.
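For concreteness, "using the predictions to set priors" can look like the
PUCT selection rule popularized by AlphaGo. A minimal sketch in Python (my
illustration, not Steenvreter's actual code; the Child fields and the
c_puct value are assumptions):

import math

class Child:
    def __init__(self, prior):
        self.prior = prior      # policy-net probability for this move
        self.visits = 0         # simulations routed through this child
        self.value_sum = 0.0    # accumulated outcomes, giving the Q estimate

def select(children, c_puct=1.5):
    """No hard pruning: a low prior only shrinks the exploration bonus U,
    so every legal move keeps a positive chance of eventually being tried."""
    n_parent = sum(ch.visits for ch in children)
    def q_plus_u(ch):
        q = ch.value_sum / ch.visits if ch.visits else 0.0
        u = c_puct * ch.prior * math.sqrt(n_parent + 1) / (1 + ch.visits)
        return q + u
    return max(children, key=q_plus_u)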
> With infinite simulations everything is easy :-)
> In practice moves with, say, a prior below 0.1% aren't going to get
> searched, and I still regularly see positions where they're the winning
> move, especially with tactics on the board.
> Forcing the search to be wider without losing playing strength appears
> to be hard.
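To put a number on "a prior below 0.1% isn't going to get searched": with
the U term above, an unvisited move is only selected once its bonus beats
the best sibling's Q + U. A back-of-the-envelope check (my assumed
constants, not GCP's):

# Rival move sitting at Q ~ 0.5, candidate move with prior p = 0.001:
# selection requires c_puct * p * sqrt(N) > 0.5 (ignoring the rival's
# own shrinking U term).
c_puct, p, rival_q = 1.5, 0.001, 0.5
visits_needed = (rival_q / (c_puct * p)) ** 2
print(f"~{visits_needed:,.0f} parent visits")   # ~111,111 before one try

So even a few hundred thousand playouts barely touch such moves, which is
consistent with tactical refutations going unseen.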
Well, I think that's fundamental; you can't be wide and deep at the same
time, but at least you can choose an algorithm that (eventually) explores
all potentially relevant moves.
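One common way to get that eventual width (a sketch of a standard fix, not
necessarily what any of the engines above do) is to mix the policy output
with a uniform distribution, so no legal move's prior is driven to zero:

def widen(priors, eps=0.03):
    """Floor the priors: every move keeps prior >= eps / len(priors),
    so its exploration bonus grows without bound as the parent visit
    count climbs, and it is eventually sampled. eps is an assumed knob."""
    n = len(priors)
    return [(1.0 - eps) * p + eps / n for p in priors]

print(widen([0.96, 0.03, 0.009, 0.001]))  # no entry can fall below 0.0075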
BTW I'm a bit surprised that you are still able to find 'big tactical
holes' with Leela now playing at 8d on KGS.