[Computer-go] FYI KL-UCB

Hideki Kato hideki_katoh at ybb.ne.jp
Tue Jul 23 01:39:02 PDT 2013


Hi,

That's an error :).  Olivier Cappe (one of the authors) replied very 
quickly and gave me another link to the correct version, 
http://jmlr.org/proceedings/papers/v19/garivier11a/garivier11a.pdf .  
Also note that there is a typo (a misplaced inf sign) in Eqs. (1) and 
(2).
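
For anyone who wants to try the algorithm before reading the paper, here is 
a rough sketch of the KL-UCB index for Bernoulli (win/loss) rewards, as I 
understand it from the paper: the largest q in [0, 1] such that 
N * kl(mean, q) <= log(t) + c * log(log(t)).  The function names, the 
bisection tolerance and the clamping constants are my own assumptions, not 
taken from the paper or from any Go program.

```python
import math

def bernoulli_kl(p, q, eps=1e-15):
    """KL divergence between Bernoulli(p) and Bernoulli(q).

    Both arguments are clamped away from 0 and 1 (an assumption of this
    sketch) so that the logarithms stay finite.
    """
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def kl_ucb_index(reward_sum, pulls, t, c=0.0, tol=1e-9):
    """Upper confidence bound for one arm after `pulls` plays at time `t`.

    Finds, by bisection, the largest q >= mean with
    pulls * kl(mean, q) <= log(t) + c * log(log(t)).
    log(log(t)) is floored at 0 for small t (a guard added for this sketch).
    """
    mean = reward_sum / pulls
    budget = (math.log(t) + c * math.log(max(math.log(t), 1.0))) / pulls
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid   # mid still satisfies the constraint, move up
        else:
            hi = mid   # mid is too optimistic, move down
    return lo
```

At each step one would play the arm with the largest index, e.g. 
kl_ucb_index(wins, visits, total_visits) per child node.  The default c=0 
here is only a convenience choice for experimenting.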

Hideki

Łukasz Lew: <CAPXT8E4ODjD07Qwci+eOuZ-Eozthjpcf2XM=wgMPF-a=re0_rQ at mail.gmail.com>:
>On Tue, Jul 23, 2013 at 8:50 AM, Hideki Kato <hideki_katoh at ybb.ne.jp> wrote:
>
>> Thanks Lukasz,
>>
>> For introducing such an interesting paper.
>>
>> I have a question, though.  The second algorithm in Figures 1, 2 and 3
>> is termed UCB2 but is apparently called MOSS in Sections 5 (and 1).  Do
>> you know which algorithm is actually used in the numerical
>> experiments?
>>
>
>I don't know, but you might mail the author.
>
>
>>
>> BTW, I guess that for MC Go programs the least "risky" algorithm may
>> be the best in practice, don't you think?
>>
>
>I won't speculate. Only experiments can tell.
>
>
>>
>> Hideki
>>
>> Łukasz Lew: <
>> CAPXT8E4pMwmvkiiTuyHHpBVavgeUPGQLNnODyJoAmFGo0uOo_g at mail.gmail.com>:
>> >KL-UCB algorithm
>> >http://arxiv.org/pdf/1102.2490v4.pdf
>> >
>> >"Thus, KL-UCB is optimal for Bernoulli distributions and strictly
>> >dominates α-UCB for any bounded reward distributions."
>> >http://www.princeton.edu/~sbubeck/SurveyBCB12.pdf (page 18)
>> --
>> Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>
>> _______________________________________________
>> Computer-go mailing list
>> Computer-go at dvandva.org
>> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>>
-- 
Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>


