[computer-go] Pay-as-you-go cluster?

Chrilly c.donninger at wavenet.at
Mon Oct 2 10:32:08 PDT 2006


----- Original Message ----- 
From: "Don Dailey" <drd at mit.edu>
To: "computer-go" <computer-go at computer-go.org>
Sent: Monday, October 02, 2006 6:39 PM
Subject: Re: [computer-go] Pay-as-you-go cluster?


> On Mon, 2006-10-02 at 13:10 -0300, Eduardo Sabbatella wrote:
>>
>> > If one finds a good way to paralellize a search
>> > algorithm, it is probably
>> > because it was inefficient from the very beginning.
>>
>> That is not true.
>
>
> Yes, it is.
>
Oh, yes it is. E.g. I got the code of the massive parallel chess programm 
P-Conners which was in turn based on Zugwang. I could not find the chess 
programm. It was a big communication loop. There was just one call to 
"onidle()". And the onidle() was the chess programm. Zugzwang communicated 
all the time. Just when there was nothing to communicate, a small step in 
the chess programm was done, the  communicator was called again....
Zugzwang had a very good parallel speedup, but the chess engine was at least 
10-times slower then most of the other programms. This was also beneficial 
for the parallel speedup, because there was no mismatch between 
communication latency and search speed. They even distributed the quiescence 
search. In Hydra the minimum distribution depth is 7.

Same holds e.g. for tree pruning. Pruning makes a tree non-uniform. This is 
bad for the parallel speedup. A known trick to improve the parallel speedup 
is to switch off pruning and search extensions. But a singular programm 
without pruning/extensions is far from optimal. I think the upper bound for 
the speedup in "real" Alpha-Beta is about 25. No matter how many processors 
one has. "Real" means measured from the best possible singular-version.

UCT:
Why can one not use the Hydra-trick of  hash-table replication. E.g. one 
broadcasts the result for every node that was visited k-Times? This is the 
only parallel communication. Would be relative trivial to programm and the 
results are shared implictly. At the end one simply takes the result of 
processor 0.

Chrilly



More information about the computer-go mailing list