[Computer-go] mini-max with Policy and Value network

Gian-Carlo Pascutto gcp at sjeng.org
Mon May 22 06:56:43 PDT 2017


On 22-05-17 11:27, Erik van der Werf wrote:
> On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto
> <gcp at sjeng.org> wrote:
> 
>     ... This heavy pruning
>     by the policy network OTOH seems to be an issue for me. My program has
>     big tactical holes.
> 
> 
> Do you do any hard pruning? My engines (Steenvreter, Magog) always had a
> move predictor (a.k.a. policy net), but I never saw the need to do hard
> pruning. Steenvreter uses the predictions to set priors, and it is very
> selective, but with infinite simulations eventually all potentially
> relevant moves will get sampled.

With infinite simulations everything is easy :-)

In practice, moves with, say, a prior below 0.1% aren't going to get
searched, and I still regularly see positions where one of them is the
winning move, especially with tactics on the board.
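
To make that concrete, here's a rough sketch of AlphaGo-style PUCT
selection. The numbers are purely illustrative, not any engine's
actual code or constants:

    import math

    C_PUCT = 1.5  # illustrative exploration constant, not a tuned value

    def puct_score(prior, visits, total_value, parent_visits):
        # AlphaGo-style selection score: Q(s,a) + U(s,a), where the
        # policy prior scales the exploration term U.
        q = total_value / visits if visits > 0 else 0.0
        u = C_PUCT * prior * math.sqrt(parent_visits) / (1 + visits)
        return q + u

    parent_visits = 10000
    # Unvisited move with a 0.1% prior: score is its exploration term.
    low = puct_score(0.001, 0, 0.0, parent_visits)     # 0.15
    # Sibling with a 40% prior, 500 visits, and a 50% win rate.
    high = puct_score(0.4, 500, 250.0, parent_visits)  # ~0.62
    print(low, high)
    # Solving 1.5 * 0.001 * sqrt(N) > 0.5 gives N > ~111,000 parent
    # visits before the low-prior move gets even its first playout.

The prior multiplies the exploration term, so the winning move stays
buried for any realistic visit budget.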

Forcing the search to be wider without losing playing strength appears
to be hard.
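
The obvious knob, sketched below with made-up parameter values rather
than anything any engine is known to use, is to flatten the policy
output before turning it into priors:

    import numpy as np

    def flatten_priors(priors, temperature=1.3, uniform_mix=0.03):
        # Sketch only: both parameter values are made up, not tuned.
        # A temperature > 1 flattens the distribution, and the uniform
        # mix guarantees every legal move a floor of uniform_mix/len(p).
        p = np.asarray(priors, dtype=np.float64) ** (1.0 / temperature)
        p /= p.sum()
        return (1.0 - uniform_mix) * p + uniform_mix / len(p)

    print(flatten_priors([0.6, 0.3, 0.09, 0.009, 0.001]))
    # the 0.1% move now gets a prior of roughly 1%

Of course the extra width is paid for in every position, tactical or
not, which is presumably where the strength loss comes from.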

-- 
GCP

