[Computer-go] Go playing software accelerator development

"Ingo Althöfer" 3-Hirn-Verlag at gmx.de
Wed May 22 02:56:35 PDT 2013


Hi, another piece of information:

Chrilly Donninger, Ulf Lorenz, and a few others had developed a very
strong parallel chess program (Hydra) on FPGA cards, between 2000 and 2004.

DOnninger also tried a traditional go bot (without MC components)
with very fast hardware around 2004. However, he (together with Peter Woitke)
was not really successful.

One anecdote: The sponsor of the Hydra project was a Sheikh from Arabia.
When he lost interest he had a date with Donninger - brainstorming
on other enterprises. Donninger: "We could try the same approach on Go."
Sheikh (with his limited understanding of English): "Where do you want to go?"
Donninger: "Go is a famous and difficult Asian board game."
Sheikh: "I see, I see. But where shall we go?"

Break.

Donninger with a new attempt: "We also might design a strong Poker bot."
Sheikh: "Fantastic. I like it. Wonderful. Fantastic. What a marvelous idea...."
After a few minutes the Sheikh cooled down.
Donninger: "Where has your enthusiasm gone?"
Sheikh: "I am a Muslim. I am not allowed to invest ion Poker."

Ingo.

PS: Ulf Lorenz is the thesis advisor of Lars Schaefers.

> Gesendet: Mittwoch, 22. Mai 2013 um 10:18 Uhr
> Von: "Lars Schäfers" <slars at upb.de>
> An: "computer-go at dvandva.org" <computer-go at dvandva.org>
> Betreff: Re: [Computer-go] Go playing software accelerator development
>
> Nice to see somebody working on accelerator hardware for Go (again).
>
> There actually were some attempts in the past and even publications that
> might be interesting for you [1][2].
>
> I think in principle you have two possible directions when looking at
> accelerating basic MCTS for Go:
> a) Accelerating the computation of single playouts
> b) Computing a large number of playouts in parallel (likely using some
> pipelining technique)
>
> The problem with b) is that N parallel simulations are less valuable
> then N sequential ones, as simulations in MCTS are guided depending on
> former simulation outcomes. See e.g. [3] for according experimental
> results. You would likely end up with something that was called
> leaf-parallelization [4].
>
> a) would be great, but is extremely difficult to achieve. The
> computation of strong playouts doesn't pose to much points for
> fine-grained parallelization and kind as well as amount of computations
> that need to be done for the generation and placement of moves is pretty
> inhomogeneous. Likely your hardware will run at a speed of about 150 to
> 200 Mhz (?), making it difficult to achieve a significant speedup over
> modern CPUs with 8 or more cores each running at more than 3 Ghz.
>
>
> I'm somehow convinced that looking at accelerator hardware for MCTS Go
> might be a bit to early. The algorithm and heuristics used are still
> changing a lot from year to year and adapting your HW design is likely
> more involved than adapting software. But that's just my personal point
> of view. Good luck!
>
> - Lars
>
>
> [1]
> Gao, Haiying, Wang, Fuming, Lei, Wei, and Lin, Yun: Monte Carlo
> simulation of 9x9 Go game on FPGA , IEEE International Conference on
> Intelligent Computing and Intelligent Systems, 865–869, October 2010
>
> [2]
> Koizumi, Kenichi, Inaba, Mary, Hiraki, Kei, Ishii, Yasuo, Miyoshi,
> Takefumi, and Yoshizoe, Kazuki: Triple Line-Based Playout for Go - An
> Accelerator for Monte Carlo Go, Proceedings of the 2009 International
> Conference on Reconfigurable Computing and FPGAs, 161–166, 2009
>
> [3]
> Segal, Richard B.: On the Scalability of Parallel UCT, International
> Conference on Computer and Games, 36–47, 2010
>
> [4]
> Chaslot, Guillaume M.J-B., Winands, Mark H.M., and van den Herik, H.
> Jaap: Parallel Monte-Carlo Tree Search, Conference on Computers and
> Games, 60–71, 2008
>
>
>
> On Wed, 2013-05-22 at 06:35 +0000, Рождественский Дмитрий wrote:
> > Incredible, 100 nanoseconds is only about 300 instructions of a CPU.
> > Are you talking about 19x19? And 1 microsecond for my design will
> > probably be a worst-case (as I calculate freedom and capture
> > iteratively). When almost all stones have free places around it will
> > be down to ~100 nanoseconds.
> > As to the number of possible accelerators on-chip - it varies upon
> > price. I think it can be 5-250, for the price $250-$5000. So the cost
> > of a single simple accelerator will be $20-$50.
> >
> > Dmitry
> >
> > 21.05.2013, 23:13, "Mark Boon" <tesujisoftware at gmail.com>:
> > > Sounds interesting. But 1 microsecond for a move is not particularly
> > > fast. There are already implementations that do that in the 100-300
> > > nanoseconds range on one core. 1 microsecond is probably considered
> > > as 'semi-light' playout. I suppose the question then becomes, how
> > > many of these could your accelerator do in parallel?
> > >
> > > Mark
> > >
> > >
> > > On Tue, May 21, 2013 at 8:06 AM, Alexander Kozlovsky
> > > <alexander.kozlovsky at gmail.com> wrote:
> > >         Я тоже кстати из ЛИАПа, с четвертого факультета, может и
> > >         пересекались :)
> > >
> > >         
> > >         On Tue, May 21, 2013 at 7:02 PM, Рождественский Дмитрий
> > >         <divx4log at yandex.ru> wrote:
> > >                 Hi all,
> > >
> > >                 I have got an idea to create a hardware accelerator
> > >                 for Go playing software. It will probably be a USB
> > >                 (or maybe PCI-Express) device that will be able to
> > >                 do some basic, but very time-consuming for
> > >                 general-purpose CPU calculations very fast. For
> > >                 example load a goban layout, make a number of random
> > >                 moves (as used in Monte-Carlo algorithm) and unload
> > >                 result back to a computer.
> > >
> > >                 As long as it will be a hardware, it will be able to
> > >                 do specified calculations only, but the speed will
> > >                 be very high. For example, making just a copy of the
> > >                 particular goban layout will require typically about
> > >                 10 nanoseconds only (one internal clock cycle).
> > >                 Calculation of the validity and results of a
> > >                 particular move (including a check for ko and
> > >                 captured stones) will probably take 1 microsecond.
> > >                 This as usual may vary during debugging, but the
> > >                 current move calculation engine draft I've started
> > >                 to develop is about this figures.
> > >
> > >                 My nearest aims here are:
> > >                 - to understand a demand from go playing software
> > >                 developers, and
> > >                 - to understand what particular calculation chains
> > >                 are most demanded for hardware acceleration.
> > >
> > >                 Dmitry
> > >                 _______________________________________________
> > >                 Computer-go mailing list
> > >                 Computer-go at dvandva.org
> > >                 http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
> > >      
> > >         _______________________________________________
> > >         Computer-go mailing list
> > >         Computer-go at dvandva.org
> > >         http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
> > > ,
> > >
> > > _______________________________________________
> > > Computer-go mailing list
> > > Computer-go at dvandva.org
> > > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
> > >
>
> --
> Lars Schaefers
> Computer Engineering Group of Prof. Dr. Marco Platzner
> Paderborn Center for Parallel Computing, University of Paderborn
> Pohlweg 47-49, 33098 Paderborn, Germany
> Tel: +49 (0)5251 60 4341, Fax: +49 (0)5251 60 5377
> Office: Building O 3.119
>
>
>
>
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go



More information about the Computer-go mailing list