[computer-go] State of the art of pattern matching

Santiago Basaldúa jacques at dybot.com
Thu Mar 27 06:13:44 PDT 2008


Mark Boon wrote:

> What I have now takes 10-15 microseconds on a 2Ghz Intel computer to  
> find all the patterns on a board (on average for 9x9, for 19x19 it's  
> more like 15-20 usec) 

>From your difference between 9x9 and 19x19 I assume that you are updating
the patterns of the cells after a stone was placed, (else the difference 
should be 361/81 times) not a computation from scratch. I make this sure 
so that we compare apples to apples. As far as the patterns computing is
concerned, (i.e. excluding the board part verifying captures, etc.) for
a pattern of 40 neighbors I get times easily below 500 nanoseconds 
(even on an old P4 3.4 GHz) for updating 40 hashes. I explained that about
June last year, when I wrote it. Since it passed my tests (June 07) I 
have never changed a character in the code that writes the 67090 asm
lines. It is just a black box, that works 100%. Each board cell has an 
updating function that knows where the board limits are, the resulting
code (for hash 32 bit mode) is something like an endless:

           lea     ebx, [edx + ecx*2 - SizeOf_bccCell]
           mov     eax, 0967f6762h
           bt      dword ptr [ebx], bidx_StoneBit
           cmovc   eax, esi
           xor     [ebx + 4], eax
           lea     ebx, [edx + ecx*2]
           mov     eax, 0d83b6518h
           bt      dword ptr [ebx], bidx_StoneBit
           cmovc   eax, esi
           xor     [ebx + 4], eax
           lea     ebx, [edx + ecx*2 + SizeOf_bccCell]
           mov     eax, 0121001f7h
           bt      dword ptr [ebx], bidx_StoneBit
           cmovc   eax, esi
           xor     [ebx + 4], eax

The longest of these functions has 206 lines, the typical about 120. There
is not a single conditional jump, no vars in RAM, etc.


> It's written in Java so making it in C would possibly give a two-fold
> speedup.

I hate the Java war but, when I said that it is taking a 10 year handicap, 
at least for patterns that is true with no doubt. My code running on 1998 
hardware (it is compatible) outperforms Java code for updating patterns.


> Somehow smaller sets don't go much faster, but larger sets do slow down, 
> every ten-fold increase in number of patterns seems to double the time.

What I wrote above updates the hashes (or the masks, because the board 
has many modes) but does not search the hash to get urgency. For searching
in a sorted list, I use the second best language after automated asm ..

.. manually written asm, of course ;-)

;; One iteration:
;; --------------
CompStep	MACRO	name

		mov	eax, ebx
		add	eax, ecx
		shr	eax, 1
		and	al,  0fch       ;; Align to a valid pointer
		cmp	edx, [eax]
		jz	&name&out
		cmovc   ecx, eax	;; This is true if [eax] > value
		cmovnc  ebx, eax	;; This is true if the previous is false

		ENDM

Also (almost) without conditional jumps. The only conditional jump is selected
once when the hash is found and exits. 10 steps of that can search in a 1024
long list, 20 steps in 1024^2. A doubling in table size is just adding one step,
(8 instructions, say 64 clock cycles).

I hope this helps.

Jacques.



More information about the computer-go mailing list