[Computer-go] action-value Q for unexpanded nodes

Álvaro Begué alvaro.begue at gmail.com
Sun Dec 3 12:30:28 PST 2017


The text in the appendix has the answer, in a paragraph titled "Expand and
evaluate (Fig. 2b)":
  "[...] The leaf node is expanded and each edge (s_t, a) is
initialized to {N(s_t, a) = 0, W(s_t, a) = 0, Q(s_t, a) = 0, P(s_t, a) =
p_a}; [...]"
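
To make that concrete, here is a minimal Python sketch of how I read that
paragraph. It is an illustration, not the paper's code: the names Edge,
expand, select and backup are made up for this example, and c_puct = 1.0 is
an arbitrary placeholder. The point is that an edge with N = 0 simply reports
the stored Q = 0, so the 1/N average is never evaluated for it.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Statistics stored on one edge (s, a); names are illustrative."""
    prior: float               # P(s, a), taken from the network's policy output
    visits: int = 0            # N(s, a)
    total_value: float = 0.0   # W(s, a)

    @property
    def q(self) -> float:
        # Q(s, a) = W / N once the edge has been visited;
        # before that, the initial value 0 from the quoted paragraph.
        return self.total_value / self.visits if self.visits else 0.0


def expand(priors):
    """Create one zero-initialized edge per move when a leaf is expanded."""
    return {move: Edge(prior=p) for move, p in priors.items()}


def select(edges, c_puct=1.0):
    """Return the move maximizing Q + U (c_puct here is a placeholder value)."""
    total_visits = sum(e.visits for e in edges.values())

    def score(e):
        u = c_puct * e.prior * math.sqrt(total_visits) / (1 + e.visits)
        return e.q + u

    return max(edges, key=lambda move: score(edges[move]))


def backup(edge, value):
    """Record one simulation result on an edge."""
    edge.visits += 1
    edge.total_value += value


edges = expand({"a": 0.6, "b": 0.3, "c": 0.1})
backup(edges["a"], -0.5)   # one simulation went through "a" and came back bad
best = select(edges)       # "b" was never visited, yet it has a usable Q + U
```

One quirk of this sketch: right after expansion every visit count is zero, so
U is zero for all edges and the first move in the dictionary wins the tie.
How a real implementation breaks that tie is not something the quoted
paragraph settles.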



On Sun, Dec 3, 2017 at 11:27 AM, Andy <andy.olsen.tx at gmail.com> wrote:

> Figure 2a shows two bolded Q+U max values. The second one is going to a
> leaf that doesn't exist yet, i.e. not expanded yet. Where do they get that
> Q value from?
>
> The associated text doesn't clarify the situation: "Figure 2: Monte-Carlo
> tree search in AlphaGo Zero. a Each simulation traverses the tree by
> selecting the edge with maximum action-value Q, plus an upper confidence
> bound U that depends on a stored prior probability P and visit count N for
> that edge (which is incremented once traversed). b The leaf node is
> expanded..."
>
> 2017-12-03 9:44 GMT-06:00 Álvaro Begué <alvaro.begue at gmail.com>:
>
>> I am not sure where in the paper you think they use Q(s,a) for a node s
>> that hasn't been expanded yet. Q(s,a) is a property of an edge of the
>> graph. At a leaf they only use the `value' output of the neural network.
>>
>> If this doesn't match your understanding of the paper, please point to
>> the specific paragraph that you are having trouble with.
>>
>> Álvaro.
>>
>>
>>
>> On Sun, Dec 3, 2017 at 9:53 AM, Andy <andy.olsen.tx at gmail.com> wrote:
>>
>>> I don't see the AGZ paper explain what the mean action-value Q(s,a)
>>> should be for a node that hasn't been expanded yet. The equation for Q(s,a)
>>> has the term 1/N(s,a) in it because it's supposed to average over N(s,a)
>>> visits. But in this case N(s,a)=0 so that won't work.
>>>
>>> Does anyone know how this is supposed to work? Or is it another detail
>>> AGZ didn't spell out?
>>>
>>>
>>>
>>> _______________________________________________
>>> Computer-go mailing list
>>> Computer-go at computer-go.org
>>> http://computer-go.org/mailman/listinfo/computer-go
>>>
>>
>>
>>
>
>
>