[computer-go] XML alternatives to SGF

Gunnar Farnebäck gunnar at lysator.liu.se
Tue Oct 23 13:31:54 PDT 2007


Jason House wrote:
> An XML alternative [1] to SGF has recently come to my attention.  What 
> do others think of this alternative?  Personally, the effect of a tag 
> affecting the previous tag seems kind of strange to me.

For use in GNU Go it would need to have quite compelling benefits to 
become interesting.

Let's look at the numbers. GNU Go 3.7.10 consists of roughly 2.4 MB of C 
code (83000 lines), 1.4 MB of pattern data, 0.45 MB of testcase files, 
1.8 MB of sgf game records (1500 sgf files), and 2 MB of documentation. 
Of the C code, about 2600 lines come from the sgf library.

If we want to use an available C library for XML, expat seems like a 
possible choice. The whole distribution is 2.5 MB, but maybe it's 
possible to get away with the 400 kB (13000 lines) of C code in the lib 
directory. That is five times bigger than our sgf library but 
manageable. (The same cannot be said of libxml2, with some 140000 lines 
of code.)

A potential problem with an XML library is the internal representation 
of the game tree. For debugging purposes it's not unusual to dump 
reading trees containing literally millions of moves, sometimes up to 
the limit of the available RAM. If an XML tree requires more bytes per 
move, the functionality would suffer. Does anybody know how big a node 
would become in expat for a move tag?
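
As a rough illustration of the memory question: expat is a 
stream-oriented parser that reports elements through callbacks and does 
not build a tree of its own, so the bytes per move would be set by 
whatever node structure the caller fills in. The sketch below uses 
Python's xml.parsers.expat binding rather than the C API, and the 
<move> element with its attributes is hypothetical, not taken from the 
proposed format.

```python
import xml.parsers.expat

# Hypothetical XML game record; element and attribute names are
# invented for illustration only.
SAMPLE = b"""<game>
  <move color="B" at="pd"/>
  <move color="W" at="dp"/>
  <move color="B" at="pp"/>
</game>"""


def parse_moves(data):
    """Collect (color, vertex) tuples via expat callbacks.

    No XML tree is built; only the caller's own list grows.
    """
    moves = []

    def start_element(name, attrs):
        # Called by expat once per opening tag.
        if name == "move":
            moves.append((attrs["color"], attrs["at"]))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(data, True)  # True marks this as the final chunk
    return moves


moves = parse_moves(SAMPLE)
```

With this style of parsing, the existing game tree nodes could in 
principle be kept as they are and filled in from the callbacks.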

The next problem is of course the file size of the game records. If they 
are 5 or 10 times as large, we're talking 9 MB or 18 MB for the game 
records. Not a huge amount by itself, but considering the number of 
copies of GNU Go being distributed, it adds up.
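
To get a feeling for the size factor, one can encode the same move list 
both ways and compare byte counts. This is only a sketch: the XML 
layout below is an invented one-tag-per-move scheme, not the proposed 
format, and the ratio depends heavily on tag and attribute names.

```python
def to_sgf(moves):
    """Encode (color, vertex) pairs as a minimal SGF game record."""
    return "(;GM[1]FF[4]SZ[19]" + "".join(f";{c}[{v}]" for c, v in moves) + ")"


def to_xml(moves):
    """Encode the same moves in a hypothetical one-tag-per-move XML layout."""
    body = "".join(f'<move color="{c}" at="{v}"/>' for c, v in moves)
    return f'<game size="19">{body}</game>'


# 250 alternating moves; the vertex is a dummy since only length matters here.
moves = [("B" if i % 2 == 0 else "W", "pd") for i in range(250)]

sgf_size = len(to_sgf(moves).encode("utf-8"))
xml_size = len(to_xml(moves).encode("utf-8"))
ratio = xml_size / sgf_size
```

For this particular layout the XML record comes out a bit more than 
four times the size of the SGF one, before any whitespace or 
indentation is added.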

So what are the benefits? So far I haven't seen anything that is 
relevant for GNU Go. Readability is not really an issue; it's almost 
never possible to visualize a game record without a graphical viewer 
anyway, regardless of coordinate representation, and in the examples 
I've seen, XML has been worse off than sgf on readability. Character 
sets are a non-issue for GNU Go, since information about players is 
simply ignored. Version control conflicts have never happened with game 
records and I don't foresee any in the future.

But I can provide a hint for something I would find useful. If there's 
something I'm missing in today's sgf viewers, it's a good way to dump 
and inspect a transposition table. It's possible to expand the 
transpositions into a big tree with duplicate subtrees, but that makes 
it very difficult to traverse efficiently. Alternatively, the tree is 
cut off when the same position is reached again, but then there's no 
easy way to find where the position was first reached, which is needed 
to follow the continuations.
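
One way to picture the second alternative with the missing piece added: 
expand each position only the first time it is reached, give it an id, 
and emit a back-reference to that id on every later arrival. The sketch 
below assumes the transposition table is available as a mapping from a 
position key to its (move, child key) pairs; the keys, the toy graph and 
the output notation are all made up for illustration.

```python
def dump_with_backrefs(graph, root):
    """Depth-first dump of a position graph with back-references.

    graph maps a position key to a list of (move, child_key) pairs.
    A position is expanded only on first arrival; later arrivals emit
    a pointer to the node id assigned at that first expansion.
    """
    first_seen = {}  # position key -> node id of first expansion
    lines = []

    def visit(key, move, depth):
        indent = "  " * depth
        if key in first_seen:
            lines.append(f"{indent}{move} -> see node {first_seen[key]}")
            return
        node_id = len(first_seen)
        first_seen[key] = node_id
        lines.append(f"{indent}{move} [node {node_id}]")
        for child_move, child_key in graph.get(key, []):
            visit(child_key, child_move, depth + 1)

    visit(root, "root", 0)
    return lines


# Hypothetical toy graph: position "T" is reached by two move orders.
graph = {
    "P0": [("B[aa]", "A"), ("B[bb]", "B")],
    "A": [("W[cc]", "AC")],
    "B": [("W[cc]", "BC")],
    "AC": [("B[bb]", "T")],
    "BC": [("B[aa]", "T")],
    "T": [("W[dd]", "End")],
}

lines = dump_with_backrefs(graph, "P0")
```

A viewer that could jump from "see node N" to node N would make it 
possible to follow the continuations without duplicating subtrees. The 
recursion is kept for brevity; a dump with very deep lines would need an 
explicit stack instead.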

/Gunnar

