[Computer-go] parameter optimization (clop & others)
sheppardco at aol.com
Thu Feb 7 11:11:54 PST 2013
I use Clop for optimizing a lot of Pebbles parameters, but I can't quantify
performance in the way you describe, because I don't actually know where the
true optimum is.
I can relate some observations from using it:
0) Many parameters are easier to optimize using other systems. E.g., Remi
uses minorization-maximization to tune pattern values, Erica's pattern
weights are tuned via TD, you would use ID3 for decision trees, etc. I
would use Clop only if there weren't a strong theoretical model for how the
parameters ought to be set, i.e., when the only way to measure "good" is by
playing games.
1) It is very important to limit the number of parameters concurrently
optimized. E.g., to 4 or fewer. The optimum might be 3. I tend to optimize
parameters that are functionally related, and I rotate attention among such
groups in order to optimize the whole set (currently ~60 in the whole set
optimized by Clop). The factor that hurts performance at large numbers of
parameters: cross-product terms generate a lot of spurious theories about
the location of the optimum.
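The rotation scheme in (1) can be sketched as a round-robin over small parameter groups; this is only an illustration, and `run_clop` is a hypothetical stand-in for a real CLOP session:

```python
# Sketch of rotating attention among functionally related parameter
# groups, tuning <= 4 parameters at a time while the rest stay frozen.
# `run_clop` is hypothetical; a real version would launch CLOP games.

def run_clop(group, params):
    # Placeholder: a real run would play games and return tuned values.
    return {name: params[name] for name in group}

def rotate_groups(params, groups, n_rounds):
    """Round-robin over parameter groups, updating one group per run."""
    for _ in range(n_rounds):
        for group in groups:
            tuned = run_clop(group, params)
            params.update(tuned)  # freeze the rest, update this group
    return params
```

With ~60 parameters split into groups of 3-4, one group per overnight run cycles through the whole set in a few weeks, which matches the schedule described in (5).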
2) Optimizing multiple parameters would work a lot better if the number of
cross-product terms could be limited. E.g., start with only an X + X^2
model, and slowly add cross-product terms prioritized by the correlation
between XY and the residual error. At some point I will implement this, because
the off-diagonal coefficients of the model are almost all very close to
zero.
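The selection rule in (2) could look like the following (an illustrative sketch, not CLOP's actual code; the function names are mine):

```python
import numpy as np

# Fit a diagonal quadratic model (no cross terms), then rank candidate
# cross-product terms X_i * X_j by their correlation with the residual.

def fit_diagonal_quadratic(X, y):
    """Least-squares fit of y ~ 1 + sum_i (a_i x_i + b_i x_i^2)."""
    n, d = X.shape
    A = np.hstack([np.ones((n, 1)), X, X**2])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, y - A @ coef  # coefficients and residual error

def rank_cross_terms(X, residual):
    """Sort candidate (i, j) cross terms by |corr(x_i * x_j, residual)|."""
    d = X.shape[1]
    scores = []
    for i in range(d):
        for j in range(i + 1, d):
            xy = X[:, i] * X[:, j]
            c = np.corrcoef(xy, residual)[0, 1]
            scores.append(((i, j), abs(c)))
    return sorted(scores, key=lambda s: -s[1])
```

Only the top-ranked cross terms would then be admitted into the model, keeping the number of coefficients small even with many parameters.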
3) You have to work to create the "inverted bowl" shape that Clop works well
on. For instance, you have to use a lot of opponents, and a lot of starting
positions, and some self-play. If you don't do this then parameters that
differ by epsilon can have very different performance. It is OK to have a
noisy estimate, but not OK to have a function that is inherently
discontinuous or jagged.
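The smoothing idea in (3) amounts to averaging results over many opponents and starting positions so that nearby parameter settings get similar scores. A minimal sketch, with `play_game` as a hypothetical placeholder for running an actual engine:

```python
import random

# Estimate strength as the mean win rate over many (opponent, position)
# pairs, which smooths out the jaggedness a single matchup would show.
# `play_game` is hypothetical: 1 = win, 0 = loss.

def play_game(params, opponent, position, rng):
    return rng.random() < 0.5  # placeholder coin flip

def smoothed_winrate(params, opponents, positions, games_per_pair, seed=0):
    rng = random.Random(seed)
    wins = total = 0
    for opp in opponents:
        for pos in positions:
            for _ in range(games_per_pair):
                wins += play_game(params, opp, pos, rng)
                total += 1
    return wins / total
```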
4) You have to hope that optimization conducted at rapid speeds will be
relevant at slow speeds. Pebbles is tuning at about 2% of its full speed. I
have no idea whether this is a good choice.
5) An overnight run is fine for me. E.g., 3 parameters tuned per run means
that 60 parameters could be tweaked every 3 weeks. (I don't actually
optimize every night.)
From: computer-go-bounces at dvandva.org
[mailto:computer-go-bounces at dvandva.org] On Behalf Of Olivier Teytaud
Sent: Thursday, February 07, 2013 7:00 AM
Subject: [Computer-go] parameter optimization (clop & others)
Like many of you, I guess, I often try to optimize the parameters of strategies.
In games, if I optimize against a fixed opponent, this is a noisy
optimization problem: I want the parameters with the best success rate
against the chosen opponent,
which can be evaluated thanks to repeated (time-consuming) games.
The state of the art in noisy optimization is slightly unreadable.
For most tools there are properties such as
<< log( distance to optimum ) ~ - C log( number of evaluations) >>
for some positive C.
In some cases, C=1/2 (or something close to 1/2, depending on differentiability
conditions), but this is restricted to easy problems.
For other families of functions, under some technical assumptions aimed
at ruling out overly simple objective functions,
one basically gets rates such as C=1/4 for a quadratic objective
function. However, the upper bound is C<=1/2, so there is still a gap.
So basically one can get nearly any result depending on the assumptions :-)
I'd like to know which rates you get on your favorite optimization problems.
Maybe many people here don't care about noisy optimization
from a maths point of view, but I'm pretty sure that people here work on
real noisy optimization problems, and if they plot curves such as the equation
above, it will provide interesting information.
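One cheap way to produce such a curve: record (number of evaluations, distance to optimum) pairs and fit the slope on a log-log scale. A minimal sketch, assuming you can measure or approximate the distance to the optimum:

```python
import numpy as np

def estimate_rate(n_evals, dist_to_opt):
    """Fit log(distance) ~ -C * log(evaluations); return the rate C."""
    slope, _ = np.polyfit(np.log(n_evals), np.log(dist_to_opt), 1)
    return -slope
```

For example, data following distance = n^(-1/2) should recover C close to 1/2.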
Thanks for any feedback :-)