[computer-go] 10k UCT bots
Don Dailey
drdailey at cox.net
Tue May 13 11:47:56 PDT 2008
Jason House wrote:
> On May 13, 2008, at 1:51 PM, Mark Boon <tesujisoftware at gmail.com> wrote:
>
>>
>> On 13-mei-08, at 14:10, Álvaro Begué wrote:
>>> What others do is the right thing to do. Your method will introduce
>>> some biases.
>>
>> Could you elaborate what bias it could lead to? I also do the same as
>> Jason. I did consider the possibility of a bias but couldn't
>> immediately think of one.
>
> And I was thinking "let's not repeat this topic"...
>
> The probability of picking a point is proportional to 1 + number of
> illegal points before it.
>
> In practice, illegal moves are pretty rare until endgame. At that
> point, it's a trade off between bias and speed. Random number
> generation is not cheap. I have yet to see empirical proof that the
> pick and scan is bad. I've tried both methods in the past and saw no
> measurable difference in strength.
I may perhaps take a look at this myself. I think there is a way to
deal with this issue and get the best of both worlds or at least
approach it. If this method does not degrade the quality, it's a moot
point. Otherwise, here is an idea:
1. For each move list size, pre-compute N tables of move list
traversal orderings.
2. At move selection time alternative between them.
So you would have something in the C language like:
int *order[S][C]
Your first move is selected randomly, then from then on you use the
array returned to form an offset from this.
This would pay off if you really wanted something very close to uniform
random and your random number generator was clearly expensive.
I think there is a great deal of chaos that would make this "almost"
perfectly uniform. The first move would be selected as you already do,
using the random number generator but after this you traverse the list
in random order as specified by this pre-computed table. Even though
the bias is still there for any given move, it's basically
non-predictable. On the very next move you will using a different
table which you choose from in round robin fashion.
Don't know how flawed this idea is. I believe it is would come close
to uniformly random but wouldn't be perfect. There would be a minor
bias when measured over a lot of games, it would probably occur that a
certain move was being chosen slightly more often, but nothing like the
bias in the current method.
Is it worth it? Probably not, but just a crazy idea. Likely there
are simpler ways to eliminate most of the bias while still minimizing
the number of random move generations.
- Don
>
>
>> What good does moving it to the end of the list do? Next time around
>> it's just as likely to be picked as when left in place. Or do you
>> only process the 'end' after the 'front' moves are all tried?
>
> The range of the random number is reduced by one after each failed
> lookup. Shuffled data has no impact on future use of the array of
> empty points.
>
>
>>
>>
>> Mark
>>
>> _______________________________________________
>> computer-go mailing list
>> computer-go at computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
More information about the computer-go
mailing list