[computer-go] 10k UCT bots

Don Dailey drdailey at cox.net
Tue May 13 11:47:56 PDT 2008



Jason House wrote:
> On May 13, 2008, at 1:51 PM, Mark Boon <tesujisoftware at gmail.com> wrote:
>
>>
>> On 13-mei-08, at 14:10, Álvaro Begué wrote:
>>> What others do is the right thing to do. Your method will introduce
>>> some biases.
>>
>> Could you elaborate what bias it could lead to? I also do the same as 
>> Jason. I did consider the possibility of a bias but couldn't 
>> immediately think of one.
>
> And I was thinking "let's not repeat this topic"...
>
> The probability of picking a point is proportional to 1 + number of 
> illegal points before it.
>
> In practice, illegal moves are pretty rare until endgame. At that 
> point, it's a trade off between bias and speed. Random number 
> generation is not cheap. I have yet to see empirical proof that the 
> pick and scan is bad. I've tried both methods in the past and saw no 
> measurable difference in strength.
I may perhaps take a look at this myself.    I think there is a way to 
deal with this issue and get the best of both worlds or at least 
approach it.   If this method does not degrade the quality, it's a moot 
point.   Otherwise, here is an idea:

   1. For each move list size, pre-compute N tables of move list 
traversal orderings.
   2. At move selection time alternative between them.

So you would have something in the C language like:  

     int *order[S][C] 

Your first move is selected randomly, then from then on you use the 
array returned to form an offset from this. 

This would pay off if you really wanted something very close to uniform 
random and your random number generator was clearly expensive. 

I think there is a great deal of chaos that would make this "almost" 
perfectly uniform.   The first move would be selected as you already do, 
using the random number generator but after this you traverse the list 
in random order as specified by this pre-computed table.     Even though 
the bias is still there for any given move,  it's basically 
non-predictable.   On the very next move you will using a different 
table which you choose from in round robin fashion.  

Don't know how flawed this idea is.   I believe it is would come close 
to uniformly random but wouldn't be perfect.  There would be a minor 
bias when measured over a lot of games,  it would probably occur that a 
certain move was being chosen slightly more often,  but nothing like the 
bias in the current method.  

Is it worth it?   Probably not,  but just a crazy idea.   Likely there 
are simpler ways to eliminate most of the bias while still minimizing 
the number of random move generations. 

- Don
  



>
>
>> What good does moving it to the end of the list do? Next time around 
>> it's just as likely to be picked as when left in place. Or do you 
>> only process the 'end' after the 'front' moves are all tried?
>
> The range of the random number is reduced by one after each failed 
> lookup. Shuffled data has no impact on future use of the array of 
> empty points.
>
>
>>
>>
>> Mark
>>
>> _______________________________________________
>> computer-go mailing list
>> computer-go at computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/


More information about the computer-go mailing list