[computer-go] low-hanging fruit - yose
Álvaro Begué
alvaro.begue at gmail.com
Thu Dec 13 12:56:16 PST 2007
At the end of a playout there is probably some code that says something
like
reward = (score > komi) ? 1.0 : 0.0;
You can just replace it with
reward = 1 / (1 + exp(- K * (score - komi)));
A huge value of K will reproduce the old behaviour, a tiny value will result
in a program that tries to maximize expected score, and values in the middle
will blend both things nicely. Of course you would precompute this in a
table.
This seems elegant and simple to me. Now we only need to know how it affects
performance. I bet there are values of K that would make everyone happy (no
measurable loss in strength, still play good-looking moves even if the game
is decided).
Álvaro.
On Dec 13, 2007 3:42 PM, Chris Fant <chrisfant at gmail.com> wrote:
> On Dec 13, 2007 3:33 PM, Chris Fant <chrisfant at gmail.com> wrote:
> > Seems like the final solution to this would need to build out the
> > search tree to the end of the game, finding a winning line. And then
> > search again with a different evaluation function (one based on
> > points). If the second search cannot find a line that wins bigger
> > than the first search did, just play the move returned by the first
> > search. And you could get more clever by allowing the second search
> > to start with some information from the first search. Note that when
> > I say "winning line", I mean all the way to the end. No MC here.
> >
>
>
> Actually, I suppose it need not be to the absolute end of the game.
> As long as all MC sims that finish out the game prior to scoring lead
> to a win, then you can consider the tree portion a guaranteed winning
> line and try the second search to maximize points.