[Computer-go] CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning
Brian Sheppard
sheppardco at aol.com
Tue Oct 4 09:54:17 PDT 2011
Hi, Remi. I have a question about the "burn-in" process for CLOP.
Normally you need a lot of data to make a decent regression function. For
example, if you have N arguments in your function, then CLOP
(Correlated-All) needs 1 + N * (N+3) / 2 parameters. So if you want 10
observations per parameter, then you need 10 + 5N(N+3) samples.
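As a quick check of that arithmetic, here is a minimal sketch in Python. The function names are made up for illustration, and the split into intercept, linear, and quadratic terms is an assumption about how the 1 + N(N+3)/2 count arises for a full quadratic model; it is not taken from the CLOP source.

```python
def quadratic_param_count(n: int) -> int:
    """Coefficients in a full quadratic in n variables (assumed breakdown):
    1 intercept + n linear + n(n+1)/2 squared and cross terms = 1 + n(n+3)/2.
    """
    # n * (n + 3) is always even, so integer division is exact.
    return 1 + n * (n + 3) // 2


def samples_needed(n: int, per_param: int = 10) -> int:
    """Samples required for `per_param` observations per coefficient."""
    return per_param * quadratic_param_count(n)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, quadratic_param_count(n), samples_needed(n))
```

For the 4-dimensional test functions this gives 15 parameters and 150 samples at 10 observations per parameter.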
But even getting *one* sample can be tricky, because the 'logit' for a
sample is +INF if the sample wins all of its games, and -INF if the sample
loses all of its games. So you need a sample that has some wins and some
losses. If the true value of the function is near 0.5, then the average
number of trials required to obtain a sample is around 3, which is fine.
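The figure of about 3 can be reproduced with a short calculation. The sketch below assumes a "sample" means replaying one parameter point, with independent games of win probability p, until it has at least one win and one loss. The function name is hypothetical. Conditioning on the first game gives an expected count of 1 + p/(1-p) + (1-p)/p, which simplifies to 1/p + 1/(1-p) - 1.

```python
def expected_trials_until_mixed(p: float) -> float:
    """Expected number of independent games at one point until both a win
    and a loss have been observed, for true win probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    q = 1.0 - p
    # After the first game, wait for the opposite outcome (geometric wait).
    return 1.0 / p + 1.0 / q - 1.0


if __name__ == "__main__":
    for p in (0.5, 0.1, 0.01, 0.001):
        print(p, expected_trials_until_mixed(p))
```

Under these assumptions the cost grows roughly like 1/p as p approaches 0, which is why a function that is nearly 0 over most of the domain is so expensive to sample this way.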
But some of the test functions in your paper are very different. For
example, the Correlated2 function is nearly 0 for most of the domain
[-1,1]^4. When I sample randomly, it takes ~5K samples (that is, ~20K
trials) to turn up enough samples to fit a regression line.
I tried initializing my win/loss counters to epsilon instead of zero. But
that technique was not robust, because any reasonable epsilon is actually
larger than the value of Correlated2 over most of the domain. Consequently, the "reduce
the weights" step does not reduce enough weights, and the logistic
regression ends up fitting epsilon, rather than Correlated2.
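The epsilon problem can be illustrated with a small sketch. This is not the code described above: the helper names and the choice eps = 0.5 are assumptions made only for illustration. Starting both counters at eps keeps the logit finite, but a point that has lost every game then reports a win rate set almost entirely by eps.

```python
import math


def smoothed_rate(wins: int, losses: int, eps: float) -> float:
    """Win-rate estimate when both counters start at eps instead of zero."""
    return (wins + eps) / (wins + losses + 2.0 * eps)


def smoothed_logit(wins: int, losses: int, eps: float) -> float:
    """Logit of the smoothed estimate; finite whenever eps > 0."""
    return math.log((wins + eps) / (losses + eps))


if __name__ == "__main__":
    eps = 0.5  # assumed value, for illustration only
    # A point that lost all 3 of its games: the estimate is driven by eps,
    # whatever the true (possibly tiny) win probability is.
    print(smoothed_rate(0, 3, eps), smoothed_logit(0, 3, eps))
```

With these numbers a point with 0 wins in 3 games is assigned a win rate of 0.125, far above a true value that is nearly 0, so the regression sees the prior rather than the function.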
So I cannot get a valid measurement with fewer than 20K trials before the
first regression step. But your paper shows regret curves that start out at
10 trials.
What am I missing?
Thanks,
Brian