[computer-go] Monte-Carlo Go Misnomer?
Sylvain Gelly
sylvain.gelly at m4x.org
Thu Feb 8 09:06:02 PST 2007
Hello,
> Is there any known (by theory or tests) function of how much an increase
> in the strength of the simulation policy increases the strength of the
> MC/UCT Program as a whole?
I think that is a very interesting question.
In our work on MoGo we found that using a stronger simulation policy
can decrease the strength of the MC/UCT program. That is why MoGo
relies more on the "sequence idea" than on the "strength idea". Our
best simulation policy is quite weak compared to others we tested.
We also have further experiments, from work with David Silver of the
University of Alberta. We found that the relation "strong
simulation policy" <=> "strong MC program" fails at a much larger
scale. So the "intransitivity" holds even with much, much stronger
simulation policies.
Of course there is the simple counterexample of a deterministic
player. But our results hold even if we randomise the much stronger
policy (in many different ways, tuning the parameters as best we
can).
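
To make the deterministic counterexample concrete, here is a minimal
sketch. It is not MoGo code: the toy game (single-pile Nim, take 1 to 3
stones, last stone wins) and every name below are assumptions chosen
for illustration. It shows only the mechanism: with a deterministic
simulation policy every playout from a position is identical, so a
thousand playouts carry no more information than one.

```python
import random

def rollout(stones, policy, rng):
    """Play the toy game to the end; return 1 if the player to move first wins."""
    player = 0  # 0 is the player to move at the start of the rollout
    while True:
        stones -= policy(stones, rng)
        if stones == 0:
            return 1 if player == 0 else 0  # whoever took the last stone wins
        player = 1 - player

def random_policy(stones, rng):
    # Uniformly random legal move (hypothetical "weak" policy).
    return rng.randint(1, min(3, stones))

def deterministic_policy(stones, rng):
    # Hypothetical deterministic rule: leave a multiple of 4 when possible,
    # otherwise take 1. It ignores rng, so every playout is the same.
    r = stones % 4
    return r if r else 1

def mc_estimate(stones, policy, n, seed=0):
    """Return (mean outcome, number of distinct outcomes) over n rollouts."""
    rng = random.Random(seed)
    outcomes = [rollout(stones, policy, rng) for _ in range(n)]
    return sum(outcomes) / n, len(set(outcomes))

det_mean, det_distinct = mc_estimate(10, deterministic_policy, 1000)
rnd_mean, rnd_distinct = mc_estimate(10, random_policy, 1000)
print("deterministic:", det_mean, det_distinct)
print("random:       ", rnd_mean, rnd_distinct)
```

In this toy game the deterministic rule happens to be optimal, so its
single repeated outcome is correct. In Go no simulation policy is
perfect, and a deterministic one would repeat the same mistakes in
every playout, so averaging cannot wash them out.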
I have some theory about this phenomenon in general, but it is not
"polished" enough for the moment. I really think that understanding
this experimental evidence deeply, beyond mere intuition, would help
us go further.
But maybe someone already has.
Sylvain