[Computer-go] FYI KL-UCB

Hideki Kato hideki_katoh at ybb.ne.jp
Mon Jul 22 23:50:07 PDT 2013

Thanks Lukasz,

For introducing such an interesting paper.

I have a question, though.  The second algorithm in Figures 1, 2 and 3 
is termed UCB2 but is apparently called MOSS in Section 5 (and Section 1).  Do 
you know which algorithm is actually used in the numerical experiments?

BTW, I guess that for MC Go programs the least "risky" algorithm would 
possibly be the best in practice, wouldn't it?


Łukasz Lew: <CAPXT8E4pMwmvkiiTuyHHpBVavgeUPGQLNnODyJoAmFGo0uOo_g at mail.gmail.com>:
>KL-UCB algorithm
>"Thus, KL-UCB is optimal for Bernoulli distributions and strictly dominates
>α-UCB for any bounded reward distributions."
>http://www.princeton.edu/~sbubeck/SurveyBCB12.pdf (page 18)
Hideki Kato <mailto:hideki_katoh at ybb.ne.jp>
