[computer-go] Improvement of UCT search algorithm
sylvain.gelly at m4x.org
Mon Oct 9 01:21:09 PDT 2006
On Sunday 08 October 2006 15:37, Łukasz Lew wrote:
> I would like to add that in my experiments with a small number of
> playouts (~30000), without the all-as-first heuristic, the common
> situation was that a bot explored the tree more or less properly, but in
> the end it finished with a wrong move, because the last refutation found
> didn't manage to make a significant change to the root decision.
>
> How do You deal with this problem?
> How do You choose the final move?
Hello,
I don't know whether the "You" meant only Don or everyone who uses UCT.
If the question was addressed to everyone, my first answer is that it is
indeed a problem. MoGo sometimes chooses a bad move simply because that move
happens to have the largest value at the very end of the search (this is
common when the values are decreasing, because the search finds a refutation
for each move in turn).
We tried choosing the move that maximizes
value - 1/sqrt(nbSimulationsForThisMove), i.e. the lower bound of a
confidence interval. This did not give a significant improvement, so we
currently (as we have from the beginning) always choose the move with the
highest value.
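
To make the two selection rules concrete, here is a minimal sketch in Python.
It is not MoGo's code: the function names, the `stats` layout (move ->
(value, number of simulations)) and the numbers are made up for illustration,
and it assumes every root move has at least one simulation.

```python
import math

def pick_highest_value(stats):
    # stats maps move -> (value, n_simulations); hypothetical layout.
    return max(stats, key=lambda m: stats[m][0])

def pick_lower_bound(stats):
    # Maximizes value - 1/sqrt(n), the lower-bound rule described above.
    # Assumes n >= 1 for every move.
    return max(stats, key=lambda m: stats[m][0] - 1.0 / math.sqrt(stats[m][1]))

# Invented numbers: "A" has a high value but very few simulations.
stats = {"A": (0.62, 40), "B": (0.58, 9000), "C": (0.40, 500)}
print(pick_highest_value(stats))  # A
print(pick_lower_bound(stats))    # B
```

With these invented numbers the two rules disagree: the lower bound
penalizes "A" for its small sample, which is the case the rule was meant to
guard against.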
The solution given by Don (allowing more time until the best move and the
most sampled move are the same) seems quite good, but we have not tried it
(no time).
Don, did this solution give you significant improvements?
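
As I understand Don's rule, it could be sketched as below. This is only an
illustration under assumptions, not Don's implementation: `run_more` is a
hypothetical callback that runs extra simulations and updates `stats` in
place, `max_extensions` is an invented cap on the extra time, and falling
back to the highest-value move when the cap is reached is an arbitrary choice.

```python
def search_until_agreement(stats, run_more, max_extensions=10):
    # stats maps move -> (value, n_simulations); hypothetical layout.
    for _ in range(max_extensions):
        best = max(stats, key=lambda m: stats[m][0])          # highest value
        most_sampled = max(stats, key=lambda m: stats[m][1])  # most simulations
        if best == most_sampled:
            return best
        run_more(stats)  # give the search more time, then check again
    # Cap reached without agreement: fall back to highest value (assumption).
    return max(stats, key=lambda m: stats[m][0])
```

The cap matters in practice because the two moves may never agree within the
time available for the game.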
Sylvain