[Computer-go] CLOP: Confident Local Optimization forNoisyBlack-Box Parameter Tuning
Remi.Coulom at free.fr
Tue Oct 4 12:18:04 PDT 2011
On 4 oct. 2011, at 18:54, Brian Sheppard wrote:
> Hi, Remi. I have a question about the "burn-in" process for CLOP.
> Normally you need a lot of data to make a decent regression function. For
> example, if you have N arguments in your function, then CLOP
> (Correlated-All) needs 1 + N * (N+3) / 2 parameters. So if you want 10
> observations per parameter, then you need 10 + 5N(N+3) samples.
> But even getting *one* sample can be tricky, because the 'logit' for a
> sample is +INF if the sample wins all of its games, and -INF if the sample
> loses all of its games. So you need a sample that has some wins and some
> losses. If the true value of the function is near 0.5, then the average
> number of trials required to obtain a sample is around 3, which is fine.
I deal with +INF/-INF with a prior: the Gaussian prior regularizes the regression, so its tends to remain flat and close to 0.5 when very few samples have been collected.
> But some of the test functions in your paper are very different. For
> example, the Correlated2 function is nearly 0 for most of the domain
> [-1,1]^4. When I sample randomly, it takes ~5K samples (that is, ~20K
> trials) to turn up enough samples to fit a regression line.
I am not sure I understand what you mean. If you use regularization, you can perform regression even with zero samples. Of course, it is very inaccurate. But if you are careful to take confidence intervals into consideration, you can still do statistics with very few samples, and determine with some significance that an area is bad.
> I tried initializing my win/loss counters to epsilon instead of zero. But
> that technique was not robust, because any reasonable epsilon is actually
> larger than Correlated2 for most of the domain. Consequently, the "reduce
> the weights" step does not reduce enough weights, and the logistic
> regression ends up fitting epsilon, rather than Correlated2.
> So I cannot get a valid measurement with less than 20K trials before the
> first regression step. But your paper shows regret curves that start out at
> 10 trials.
> What am I missing?
I am not sure what you are missing.
In the case of Correlated2: In the beginning CLOP will sample uniformly at random (if you run the algorithm in the paper with N=0, then w(x)=1 everywhere). As soon at it find its first win, it will start focusing around that first win. You should be able to easily run CLOP on Correlated2. Just edit DummyExperiment.clop and DummyScript.py. You can also take a look at Gian-Carlo's chess data: it is a bit similar, as most games are lost in the beginning.
One important aspect of CLOP is the use of the confidence interval. It does not matter if the regression is very inaccurate. Even with an inaccurate regression, it can get confident that some areas of the search space are below average, so they should not be sampled.
If you sample uniformly at random until you get an accurate regression, then, yes, it will take forever. Maybe what you are missing is that CLOP does not need an accurate regression at all to already focus its sampling on a promising region.
More information about the Computer-go