[computer-go] Pay-as-you-go cluster?
Magnus Persson
magnus.persson at phmp.se
Mon Oct 2 07:18:26 PDT 2006
Quoting Darren Cook <darren at dcook.org>:
>> One UCT per each CPU? then, merge results?
> P.S. Since I started typing this I see a couple more ideas have come
> through... I think the above is similar to what Remi is saying, but I'm
> suggesting minimax fixed depth, instead of UCT on the master node. UCT
> may be more efficient.
I am afraid that what people are proposing here sofar is only efficient with a
few processors. The tree built by a single processor UCT is in many positions
very selective. The branching may occur at the root but also very deep in the
tree. For example, if there are two main variations most branching may
occur at
ply 3-n for both variations. This very true for 7x7 wheras for 19x19 this is
less of a problem.
A second problem is that UCT opearates on single simulations. With a awful lot
of processors I think they will keep the master processor saturated with
updating the tree. Thus the tree might need to be distributed as well and then
there will be no simple way of doing this.
A perhaps in principle more serious problem is that the evaluation of
every node
has no absolute meaning. This means that if tree is grown in a distributed
fashion it might no longer be possible to tell which move is best so far. With
sequential UCT the search adapts itself after every single simulation, and if
this is important then an efficient massively parallell implementation of UCT
might be impossible.
There is also a huge problem of memory here. UCT fills memory very quick on a
normal PC, which means one might need to sacrifice precision to save memory
even with just a few extra processors, unless the tree itself is distributed
over several machines.
A different way of looking at this problem is that I guess it is very hard to
efficiently search a *game* tree, no matter what kind of search algorithm you
use. If it is hard for alph-beta search it is also hard for UCT becuase of the
nature of the problem.
If one finds a good way to paralellize a search algorithm, it is probably
because it was inefficient from the very beginning. For example if I turn off
all go knowledge in Valkyria, it will search less selective and thus be much
easier to parallelize efficiently than it would be with the knowledge. But the
"inefficient" version will probably still play much better.
---
Above I was pessimistic but here is an idea I have. The scheduler of
simulations
should do a monte carlo simulation of UCT! That is the scheduler should
build a
queue of pending simulations generated by speculation. Instead of playing a
random game at the terminal leaves, a result is picked randomly using the win
rate of some parent node that has been searched a safe number of times
and then
update the tree *as if* a random game had been played. This is repeated
as long
as all slave processors is working. If new results comes in then the jobs in
the queue is sent back to the now idle slave process and the scheduler starts
restoring all nodes that has been changed by the speculation process and
updates the tree with the new real results.
But after writing this I think the idea is flawed. My hope was that the
batches
generated this way would be unbiased in the sense that with one slave
processor
the jobs issued could actually be exactly what a normal sequential UCT
might do.
But the problem here is that after sending away a batch, the tree is updated
with the result from a potentially very old batch. And it might be that the
next speculated batch will be very similar to the previous one becuase the
results from the very old batch update obscure parts of the tree.
Anyway I think the idea of generating jobs based on random speculation is
interesting and perhaps someone smarter than me can figure out something
better.
Magnus
More information about the computer-go
mailing list