[Computer-go] FYI KL-UCB
hideki_katoh at ybb.ne.jp
Tue Jul 23 01:39:02 PDT 2013
That's an error :). Olivier Cappe (one of the authors) replied very
quickly and gave me another link to the correct version,
Also note that there is a typo (misplaced inf sign) in Eq. (1) and
Łukasz Lew: <CAPXT8E4ODjD07Qwci+eOuZ-Eozthjpcf2XM=wgMPF-a=re0_rQ at mail.gmail.com>:
>On Tue, Jul 23, 2013 at 8:50 AM, Hideki Kato <hideki_katoh at ybb.ne.jp> wrote:
>> Thanks, Lukasz, for introducing such an interesting paper.
>> I have a question, though. The second algorithm in Figures 1, 2 and 3
>> is termed UCB2 but is apparently called MOSS in Sections 5 (and 1). Do
>> you know which algorithm is actually used in the numerical
>I don't know, but you might mail the author.
>> BTW, I guess that for MC Go programs the least "risky" algorithm is
>> possibly the best in practice, isn't it?
>I won't speculate. Only experiments can tell.
>> Łukasz Lew: <
>> CAPXT8E4pMwmvkiiTuyHHpBVavgeUPGQLNnODyJoAmFGo0uOo_g at mail.gmail.com>:
>> >KL-UCB algorithm
>> >"Thus, KL-UCB is optimal for Bernoulli distributions and strictly
>> >dominates α-UCB for any
>> >bounded reward distributions."
>> >http://www.princeton.edu/~sbubeck/SurveyBCB12.pdf (page 18)
>> Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>
Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>