fotland at smart-games.com
Mon Jan 2 11:42:30 PST 2017
I think the character set property applies only to the contents of comments and similar text fields. The SGF syntax itself uses only characters that are common to UTF-8 and US-ASCII, so there is no need to assume a character set before reaching the property. If you find the character set property in the root node, it should apply to a root comment, even if the comment comes earlier among the root node's properties.
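A minimal sketch of that idea, in Python: scan the root node as raw bytes, pick up CA wherever it sits among the root properties, and only then decode the whole file. The names `root_properties` and `decode_sgf` are made up for illustration, the UTF-8 fallback is just the default mentioned in this thread, and the scan assumes an ASCII-compatible encoding in which the bytes for `[`, `]` and `\` never occur inside a multi-byte character.

```python
import codecs


def root_properties(raw: bytes) -> dict:
    """Collect the root node's properties from raw bytes, without decoding.

    Assumption: the encoding is ASCII-compatible, so the structural bytes
    ';', '(', ')', '[', ']' and '\\' mean the same thing in every charset.
    """
    props = {}
    i = raw.find(b";") + 1          # position just after the root node's ';'
    if i == 0:
        return props                # no node at all
    current = b""
    n = len(raw)
    while i < n:
        ch = raw[i:i + 1]
        if ch in (b";", b"(", b")"):
            break                   # end of the root node
        if ch == b"[":
            # Read one property value, keeping backslash escapes intact.
            j = i + 1
            value = bytearray()
            while j < n and raw[j:j + 1] != b"]":
                step = 2 if raw[j:j + 1] == b"\\" and j + 1 < n else 1
                value += raw[j:j + step]
                j += step
            props.setdefault(current, []).append(bytes(value))
            i = j + 1
        elif ch.isupper():
            # Property identifier: a run of upper-case ASCII letters.
            j = i
            while j < n and raw[j:j + 1].isupper():
                j += 1
            current = raw[i:j]
            i = j
        else:
            i += 1                  # whitespace or anything else is skipped
    return props


def decode_sgf(raw: bytes, default: str = "utf-8") -> str:
    """Decode the file with the charset named by CA, or with the default."""
    values = root_properties(raw).get(b"CA")
    name = values[0].decode("ascii", "replace").strip() if values else default
    try:
        codecs.lookup(name)         # is this charset supported here?
    except LookupError:
        name = default              # unsupported: fall back to the default
    return raw.decode(name, errors="replace")
```

Because CA is looked up before any text is decoded, a root comment that appears ahead of CA is still decoded with the declared character set.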
From: Computer-go [mailto:computer-go-bounces at computer-go.org] On Behalf Of Clark B. Wierda
Sent: Monday, January 02, 2017 11:35 AM
To: computer-go at computer-go.org
Subject: Re: [Computer-go] SGF
On Fri, Dec 30, 2016 at 3:52 PM, Dave Dyer <ddyer at real-me.net> wrote:
Character encoding (usually UTF8 these days) ought not to be part of
the standard, it ought to be up to the containing file to describe the
encoding at that level. Likewise, nothing in the standard ought to
require support for particular character sets. Rather, if an sgf record
contains an unsupported character set, it will fail at the "reading"
phase, independent of the actual contents of the file.
I've used sgf as a general format for more than 70 different games,
as well as Go, and I only treat it as a rough guide. I use a generic
read/write process that doesn't care about the content, and any sensible
user of the "standard" ought to do likewise.
The details of what properties exist and how they are used are always
going to be a negotiation between content originators and third party
Since character set is a defined property, my main issue in writing a parser is what to assume until that property is found.
I would prefer we define a default that will apply until that property is found. Currently, I use UTF-8, which has worked so far. When I hit that property, I reopen the file with the defined character set (if supported) and restart the parse.
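Roughly, the restart approach looks like the Python sketch below. This is not my actual parser: `parse`, `load`, `RestartWithCharset` and the regular expression are hypothetical stand-ins, and the toy `parse` only picks up the first value of each property. The point is just the control flow: start with the default, and restart once if CA names a different, supported character set.

```python
import codecs
import re


class RestartWithCharset(Exception):
    """Raised by the parser when CA names a different, supported charset."""

    def __init__(self, charset):
        super().__init__(charset)
        self.charset = charset


# Toy stand-in for a real parser: finds IDENT[value] pairs, honouring "\]".
_PROP = re.compile(r"([A-Z]+)\s*\[((?:\\.|[^\]\\])*)\]", re.S)


def _differs(declared, current):
    """True only if 'declared' is supported and is not the current codec."""
    try:
        return codecs.lookup(declared).name != codecs.lookup(current).name
    except LookupError:
        return False    # unsupported charset: keep going with the current one


def parse(text, charset):
    props = []
    for ident, value in _PROP.findall(text):
        if ident == "CA" and _differs(value.strip(), charset):
            raise RestartWithCharset(value.strip())
        props.append((ident, value))
    return props


def load(raw, default="utf-8"):
    """Parse with the default charset, restarting once if CA says otherwise."""
    charset = default
    for _ in range(2):              # first attempt plus at most one restart
        text = raw.decode(charset, errors="replace")
        try:
            return charset, parse(text, charset)
        except RestartWithCharset as restart:
            charset = restart.charset
    raise ValueError("character set changed again after the restart")
```

For example, `load("(;GM[1]CA[ISO-8859-1]C[caf\u00e9])".encode("latin-1"))` first decodes as UTF-8, hits CA, and then reparses the same bytes as ISO-8859-1.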
I'm glad to see that there is still discussion on the format.