[Computer-go] mini-max with Policy and Value network
gcp at sjeng.org
Mon May 22 06:56:43 PDT 2017
On 22-05-17 11:27, Erik van der Werf wrote:
> On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto
> <gcp at sjeng.org> wrote:
> ... This heavy pruning
> by the policy network OTOH seems to be an issue for me. My program has
> big tactical holes.
> Do you do any hard pruning? My engines (Steenvreter, Magog) always had a
> move predictor (a.k.a. policy net), but I never saw the need to do hard
> pruning. Steenvreter uses the predictions to set priors, and it is very
> selective, but with infinite simulations eventually all potentially
> relevant moves will get sampled.
With infinite simulations everything is easy :-)
In practice moves with, say, a prior below 0.1% aren't going to get
searched, and I still regularly see positions where they're the winning
move, especially with tactics on the board.
Forcing the search to be wider without losing playing strength appears
to be hard.
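
To make the 0.1% figure concrete, here is a minimal sketch of prior-weighted
(PUCT-style) selection at a single root node. Everything in it is an
assumption for illustration and is not taken from any particular engine: the
function name puct_visits, the constant c_puct = 1.0, the exact form of the
exploration term, the choice of 0 as the value of an unvisited move, and the
made-up priors and win rates. The values are held fixed per move, which a real
search would not do.

```python
import math

def puct_visits(priors, values, simulations, c_puct=1.0):
    """Simulate PUCT-style selection at one node and return visit counts.

    priors: move probabilities from a (hypothetical) policy network.
    values: fixed win rate returned every time a move is searched
            (a simplification; real evaluations change as the tree grows).
    """
    n = len(priors)
    visits = [0] * n
    value_sum = [0.0] * n
    for _ in range(simulations):
        total = sum(visits)
        best, best_score = 0, -math.inf
        for i in range(n):
            # Mean value so far; unvisited moves are assumed to start at 0.
            q = value_sum[i] / visits[i] if visits[i] else 0.0
            # Exploration bonus, scaled by the prior.
            u = c_puct * priors[i] * math.sqrt(total + 1) / (1 + visits[i])
            if q + u > best_score:
                best, best_score = i, q + u
        visits[best] += 1
        value_sum[best] += values[best]
    return visits

# Third move: prior 0.1%, but it is actually the winning move.
priors = [0.6, 0.399, 0.001]
values = [0.5, 0.45, 0.9]
visits = puct_visits(priors, values, simulations=1000)
print(visits)  # the third entry is 0: the winning move is never searched
```

Under these assumptions the low-prior move is not pruned explicitly, yet it
behaves as if it were: its exploration term alone only exceeds the 0.5 average
of the top move after on the order of 250,000 simulations. Raising c_puct or
flattening the priors gets it searched sooner, but spends more simulations on
every other weak move as well.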