[computer-go] .. if Monte-Carlo programs would play infinite strong
Jacques Basaldúa
jacques at dybot.com
Sat Nov 25 02:05:30 PST 2006
Maybe I did not explain my point well enough.
The problem with infinite simulations is that we get a better
approximation to a wrong value.
With few simulations we get that value with an error of, say, 1/10.
With an astronomical number of simulations we get the same value with
an error of 1e-200, but it is still wrong! It is proved that simulating
a go position converges, but it does not converge to the value the
position would have if it were played by perfect players; it only
converges to the asymptotic limit of random play.
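To make the gap between the two limits concrete, here is a minimal
sketch on a toy game tree. The tree, the move names "A" and "B", and the
helper functions are all hypothetical and only for illustration; nothing
here comes from a real go program. Leaves are 1 (win) or 0 (loss) for the
player at the root, and the opponent moves first inside each subtree.

```python
import random

# Hypothetical toy tree (an assumption for illustration only).
# After move "A" the opponent picks one of four leaves.
# After move "B" the opponent picks a reply, then we pick a leaf.
TREE = {
    "A": [1, 1, 1, 0],
    "B": [[1, 0, 0], [1, 0, 0]],
}

def minimax(node, maximizing):
    """Exact value of a node under perfect play by both sides."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def random_playout(node, rng):
    """Play uniformly random moves until a leaf is reached."""
    while not isinstance(node, int):
        node = rng.choice(node)
    return node

def mc_estimate(node, n, rng):
    """Average result of n random playouts from this node."""
    return sum(random_playout(node, rng) for _ in range(n)) / n

rng = random.Random(0)
results = {}
for move, subtree in TREE.items():
    # The opponent is to move after our move, so the subtree is minimizing.
    exact = minimax(subtree, maximizing=False)
    estimate = mc_estimate(subtree, 20_000, rng)
    results[move] = (exact, estimate)
    print(f"move {move}: perfect play = {exact}, random play ~ {estimate:.3f}")
```

In this sketch the random-play average of "A" tends to 3/4 and that of
"B" to 1/3, so more playouts only confirm "A", while perfect play loses
after "A" and wins after "B".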
I am not an MC developer, but as far as I know, UCT keeps a limited
(i.e. n-ply) tree in memory and intentionally unbiases the nodes to make
the convergence faster. Assuming constant tree size, that does not
change anything.
A simple test:
1: after 100 simulations, choose the highest number in (0.96, 2.1, 3.18)
2: after 1e9 simulations, choose it in (0.9999999, 2.0000001, 3.000001)
You choose the same value (= same move) both times.
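The test above can be sketched in a few lines. This is only an
illustration under stated assumptions: the three limits (1, 2, 3) are
taken from the example, each simulation is modelled as the limit plus
Gaussian noise of standard deviation 1, and 100,000 samples stand in for
the 1e9 simulations to keep the run short. The function names are
hypothetical.

```python
import random

# Assumed limits of random play for three candidate moves.
TRUE_VALUES = (1.0, 2.0, 3.0)

def estimate(true_value, n, rng):
    """Mean of n noisy samples around true_value (noise model is an assumption)."""
    return sum(rng.gauss(true_value, 1.0) for _ in range(n)) / n

def choose_move(n, rng):
    """Return the index of the highest estimate, plus the estimates themselves."""
    estimates = [estimate(v, n, rng) for v in TRUE_VALUES]
    best = max(range(len(estimates)), key=estimates.__getitem__)
    return best, estimates

rng = random.Random(2006)
few_choice, few_estimates = choose_move(100, rng)
many_choice, many_estimates = choose_move(100_000, rng)

print("100 simulations:    ", few_estimates, "-> move", few_choice)
print("100000 simulations: ", many_estimates, "-> move", many_choice)
```

The estimates get tighter with more samples, but the index of the
highest one is the same in both runs, so the chosen move is unchanged.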
That's why, I insist, if you don't increase the size of the tree and
only get a better approximation to a wishful but frequently
misconceived value (the limit of random play), which is *not* a good
evaluation of the game, you don't significantly improve your play. Of
course, if you increase the tree, you reach perfect play, but that's
not the point.
Jacques.