Re[4]: [computer-go] UCT vs MC
Dmitry Kamenetsky
dimkadimon at mail.ru
Wed Feb 21 15:40:11 PST 2007
Thank you Don and Sylvain. I now understand this issue completely.
One more question. Line 23 states: for i:=node.size()-2 to 0 do. The leaf node should be stored in node[node.size()-1], so why do we start at node.size()-2? Is it not necessary to update the value of the leaf node?
-----Original Message-----
From: Don Dailey <drd at mit.edu>
To: Dmitry Kamenetsky <dimkadimon at mail.ru>, computer-go <computer-go at computer-go.org>
Date: Wed, 21 Feb 2007 12:54:43 -0500
Subject: Re: Re[2]: [computer-go] UCT vs MC
>
> On Wed, 2007-02-21 at 16:56 +0300, Dmitry Kamenetsky wrote:
> > Thank you for your answer. However, I am even more confused now. I
> > understand that "-" is for negamax, but I don't understand why it
> > became "1-". I am trying to implement your algorithm and I just want
> > to know what lines 7, 16 and 26 should be?
>
> I'm not sure this is what you are looking for, but in negamax, scores
> can be negative or positive. The scores are always adjusted so that
> you can view positive numbers as "good" and negative as "bad" from the
> point of view you are referencing. So to get the score from the
> "other" point of view you simply negate it.
>
> But in UCT, we don't deal with negative numbers. A score is between
> 0 and 1, so 0.001 is almost losing and 0.999 is almost winning for
> example.
>
> To change 0.99 to the other player's point of view in this system, where
> scores must be between 0 and 1, you must negate it and add 1. So 0.99
> becomes: 1 - 0.99 = 0.01
>
> I hope that is what you are asking about and that this explains it.
>
> - Don
>
>
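
For anyone following along, here is a minimal Python sketch of the point-of-view flip Don describes above. It is not taken from the pseudocode under discussion; the function names (flip_negamax, flip_uct, to_win_rate) are made up for illustration, and the [-1, 1] range for negamax scores is an assumption chosen only to make the comparison concrete.

```python
def flip_negamax(score: float) -> float:
    """Negamax convention: scores are signed, so the opponent's view is the negation."""
    return -score


def flip_uct(win_rate: float) -> float:
    """UCT convention: scores are win rates in [0, 1], so the opponent's view is 1 - score."""
    return 1.0 - win_rate


def to_win_rate(score: float) -> float:
    """Map a signed score in [-1, 1] (assumed range) onto a win rate in [0, 1]."""
    return (score + 1.0) / 2.0


if __name__ == "__main__":
    # Don's example: 0.99 from one side is about 0.01 from the other side.
    print(flip_uct(0.99))

    # The two conventions agree: negating and then rescaling gives the same
    # result as rescaling and then taking "1 -".
    s = 0.98
    print(to_win_rate(flip_negamax(s)), flip_uct(to_win_rate(s)))
```

Under that assumed mapping, "1 - x" is just ordinary negamax negation expressed on the 0 to 1 scale, which is why "-" turns into "1-".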