[computer-go] Improvement of UCT search algorithm

sylvain.gelly at m4x.org
Mon Oct 9 01:21:09 PDT 2006


On Sunday 08 October 2006 15:37, Łukasz Lew wrote:
> I would like to add that in my experiments with a small number of
> playouts (~30000), without the all-as-first heuristic, the common
> situation was that a bot explored the tree more or less properly, but in
> the end it finished with a wrong move, because the last refutation found
> didn't manage to make a significant change to the root decision.
>
> How do You deal with this problem?
> How do You choose the final move?

Hello,

I don't know whether the "You" meant only Don or everyone who uses UCT.

If the question was addressed to everyone, I would first answer that it is
indeed a problem. MoGo sometimes chooses a bad move simply because, at the
very end of the search, that move happens to have the highest value. This is
common when the values are decreasing because the search is finding a
refutation for each move in turn.
We tried choosing the move which maximizes
value - 1/sqrt(nbSimulationsForThisMove), i.e. taking the lower bound of a
confidence interval. This did not give a significant improvement, so we
currently (and from the beginning) always choose the move with the highest
value.
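
To make the two selection rules concrete, here is a minimal sketch in Python.
It is not MoGo's code: the MoveStats class, the function names, and the
assumption that "value" is a plain win rate are all hypothetical and only
serve the illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Hypothetical per-move statistics at the root (not MoGo's structures)."""
    move: str
    wins: float
    simulations: int

    @property
    def value(self) -> float:
        # Assumption: the value of a move is its observed win rate.
        return self.wins / self.simulations if self.simulations else 0.0


def pick_highest_value(children):
    """Rule described as the one currently used: highest value wins."""
    return max(children, key=lambda c: c.value)


def pick_lower_bound(children):
    """Rule that was tried: maximize value - 1/sqrt(nbSimulationsForThisMove)."""
    def lower_bound(c):
        if c.simulations == 0:
            return float("-inf")  # never pick a move that was not sampled
        return c.value - 1.0 / math.sqrt(c.simulations)
    return max(children, key=lower_bound)


# Made-up numbers: "A" looks better but is barely sampled.
children = [
    MoveStats("A", wins=8, simulations=10),      # value 0.8
    MoveStats("B", wins=600, simulations=1000),  # value 0.6
]
by_value = pick_highest_value(children)
by_bound = pick_lower_bound(children)
```

With these made-up numbers the highest-value rule picks "A", while the
lower-bound rule picks "B", because the penalty 1/sqrt(10) is much larger
than 1/sqrt(1000).
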
The solution given by Don (giving more time until the best move and the most
sampled move are the same) seems quite good, but we have not tried it (no
time).
Don, did this solution give you significant improvements?
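
As I understand the idea, it could be sketched as below. This follows only
the one-line description above, not Don's actual implementation; the function
names, the stats layout, the run_more callback, and the max_extensions cap
are all assumptions made for the illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical layout: move -> (value, number of simulations)
Stats = Dict[str, Tuple[float, int]]


def best_and_most_sampled(stats: Stats) -> Tuple[str, str]:
    """Return the highest-valued move and the most-sampled move."""
    best = max(stats, key=lambda m: stats[m][0])
    most = max(stats, key=lambda m: stats[m][1])
    return best, most


def choose_with_extension(stats: Stats,
                          run_more: Callable[[Stats], None],
                          max_extensions: int = 5) -> str:
    """Extend the search while the two candidates disagree.

    run_more stands in for "search a bit longer" and updates stats in place.
    The cap on extensions is an assumption so that the loop always ends.
    """
    for _ in range(max_extensions):
        best, most = best_and_most_sampled(stats)
        if best == most:
            return best
        run_more(stats)
    # Out of extra time: fall back to the highest-value move.
    return best_and_most_sampled(stats)[0]


def fake_search(stats: Stats) -> None:
    # Stand-in for extra playouts: "A" gets sampled more, a refutation is
    # found, and its value drops.
    value, n = stats["A"]
    stats["A"] = (value - 0.2, n + 500)


stats = {"A": (0.7, 50), "B": (0.6, 1000)}
chosen = choose_with_extension(stats, fake_search)
```

In this made-up run "A" starts with the highest value but few simulations, so
the search is extended once. The value of "A" then falls below that of "B",
both criteria agree, and "B" is played.
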

Sylvain


