[computer-go] Don't cares in Pattern Matching

Eric Boesch ericboesch at hotmail.com
Wed Jun 14 05:35:08 PDT 2006


Pattern matching can require a time/space tradeoff. If the patterns are very 
irregular and full of holes, then an attempt to minimize time usage by 
treating each don't-care case that is tested as three or four different 
patterns could lead inevitably to space usage that grows expontentially with 
the number of patterns, as Chrilly alluded to. Fortunately, actual patterns 
are compact -- there are no useful patterns with the shape {a1, j9, k3, s14} 
-- and everyone indicated that their space usage ratios were manageable. 
Nonetheless, Peter Drake and I had a discussion in this mailing list a while 
back of pattern matching heuristics that aren't biased exclusively towards 
minimal time. (And just as a minimal-time pattern matcher can be pretty 
compact for nicely behaved pattern sets, a pattern matcher that emphasizes 
compactness can be also be pretty fast with those sets.)

I would say that the difference between a Patricia tree and an acyclic DFA 
often can be just the same code viewed two different ways. As to how to 
decide which intersection to test at each decision node/DFA state, two 
reasonable approaches among many others are to minimize the number of 
patterns that match so far that don't care about the intersection tested, or 
to maximize the expected number of bits of information gained by the test 
compared to the amount of time spent performing the test.

Some programmers test against only a fixed set of nesting pattern shapes, 
meaning the order of tests at a given depth in the tree is always the same. 
Depending on the data set, the spiral approach could save space (because the 
test at each node does not need to be stored, reducing space usage per node; 
the patterns can and probably should be stored in a Zobrist hash table 
instead of a tree to reduce storage space even more) or waste it by choosing 
the wrong intersections to test on. The spiral approach is just one example 
of testing against all of a set of nesting pattern shapes, all of which use 
time proportional to the size of the largest pattern. Testing against all of 
a set of non-nesting pattern shapes can at worst take time proportional to 
the total size of all the different pattern shapes (but considerably less 
than the total size of all patterns if you use any reasonable 
implementation).

That should give a partial summary of the pattern matching issues that have 
been discussed in the last few years in this mailing list. There's been lots 
more discussion also, such as about what kinds of rules people use in 
addition to the "this intersection is black/white/empty/off the board" 
variety.

>From: "Chrilly" <c.donninger at wavenet.at>
>Reply-To: computer-go <computer-go at computer-go.org>
>To: "computer-go" <computer-go at computer-go.org>
>Subject: [computer-go] Do'nt cares in Pattern Matching
>Date: Tue, 13 Jun 2006 18:13:45 +0200
>
>A possibility for a pattern matcher is to distribute the patterns in bins. 
>This method is decribed in "A Pattern Matcher for Goliath" by Mark Boon.
>Mark used the center-point and the 4 Adjacent stones as the bin-index. 
>There are 243 different combinations of these 5 points. Gives 243 bins.
>When searching a matching pattern for a point x, one needs only to search 
>the much shorter lists of patterns in the corressponding bin.
>
>The problems are do'nt cares. Mark "solved" the problem by not allowing 
>do'nt cares on these 5 points. Thats not really nice, at least the Suzie Go 
>expert has protested against this restriction.
>
>A brute-force solution is to replace one don't care by 3 different 
>patterns. 2 do'nt cares, would be already 9. (Although my expert is already 
>happy with 1 don't care).
>
>Is there any better solution?
>
>There are numerous other searching techniques. E.g. Martin Mueller uses 
>according his thesis a Patricia-Tree. How does one solve the do'nt care 
>problem in Patricia-Trees?
>Gnu-Go uses DFA's. Has a DFA a problem with Don't cares?
>
>Personally I like the Mark Boon approach most. Hashing is usually fast and 
>the method is simple. But the don't care is a little bit nasty. Any other 
>approaches around?
>
>Chrilly
>
>_______________________________________________
>computer-go mailing list
>computer-go at computer-go.org
>http://www.computer-go.org/mailman/listinfo/computer-go/




More information about the computer-go mailing list