I have a question, though.  The second algorithm in Figures 1, 2 and 3 
is termed UCB2 but is apparently called MOSS in Section 5 (and Section 1).  Do 
you know which algorithm is actually used in the numerical experiments?

BTW, I guess that for MC Go programs the least "risky" algorithm would 
possibly be the best in practice, wouldn't it?


>KL-UCB algorithm
>"Thus, KL-UCB is optimal for Bernoulli distributions and strictly dominates
>α-UCB for any
>bounded reward distributions."
>http://www.princeton.edu/~sbubeck/SurveyBCB12.pdf (page 18)
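
For reference, here is a rough sketch of how I understand the KL-UCB index for Bernoulli rewards: the largest q >= empirical mean whose KL divergence from the mean, scaled by the number of pulls, still fits within a log(t) exploration budget. The function names, the constant c, and the bisection tolerance are my own choices for illustration, not something taken from the survey, so please treat this as an assumption-laden sketch rather than a reference implementation.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, c=0.0, tol=1e-6):
    """Largest q in [mean, 1] with pulls * kl(mean, q) <= log(t) + c * log(log(t)).

    mean  : empirical mean reward of the arm (in [0, 1])
    pulls : number of times the arm has been played (>= 1)
    t     : current round (>= 1)
    c     : hypothetical tuning constant; c = 0 is a common practical choice
    """
    log_t = math.log(t)
    budget = (log_t + c * math.log(max(log_t, 1.0))) / pulls
    lo, hi = mean, 1.0
    # kl(mean, q) is increasing in q on [mean, 1], so bisection finds the boundary.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid  # mid still satisfies the constraint, move up
        else:
            hi = mid  # mid violates the constraint, move down
    return lo
```

At each round one would play the arm with the largest index; an arm with few pulls gets a wider budget and hence a higher index than an equally good arm with many pulls.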
