[computer-go] Optimal explore rates for plain UCT

Petr Baudis pasky at ucw.cz
Mon Mar 10 16:12:59 PDT 2008


On Mon, Mar 10, 2008 at 03:40:53PM -0700, Christoph Birk wrote:
> On Mon, 10 Mar 2008, Petr Baudis wrote:
>>  With 110k playouts per move and no domain knowledge in the playouts,
>> the ratings are now:
>>
>> 	c=0.2  (pachi1-p0.2-light)	ELO 1627 (285 games)
>> 	c=1.0  (pachi1-p1.0-light)	ELO 1590 (120 games)
>> 	c=0.05 (pachi1-p0.05-light)	ELO 1531 (286 games)
>> 	c=2.0  (pachi1-p2.0-light)	ELO 1511 (118 games)
>
> I have two "light" UCT bots on CGOS:
> Name              #playouts         c (*)         CGOS-ELO
> myCtest-V-0003    50000             0.25          1508
> myCtest-10k-UCT   10000             0.25          1246
>
> (*): I use c=0.5 outside the sqrt()
>
> What is your 'create-new-node' threshold? I use 50.

I actually added that to my mail at the last minute: "Also, I expand UCT
leaves at the second hit. This keeps memory usage conservative, but it
is important for strength - I saw a huge strength increase when I lowered
this to 2 from the original value of 5."
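
For readers unfamiliar with the expansion threshold, here is a minimal
sketch of the idea in Python. It is not Pachi's code: the names (Node,
simulate, EXPAND_THRESHOLD, EXPLORE_C) are hypothetical, and it assumes
that "second hit" means the leaf gets its children during its second
visit, that all children are created at once, that c sits outside the
sqrt(), and that results are scored from a single point of view (a real
Go program would alternate sides).

```python
import math

EXPAND_THRESHOLD = 2   # expand a leaf on its second visit (value from the mail)
EXPLORE_C = 0.2        # exploration coefficient; placed outside the sqrt (assumption)


class Node:
    def __init__(self):
        self.visits = 0
        self.wins = 0.0
        self.children = []   # stays empty until the node is expanded


def ucb1(parent, child, c=EXPLORE_C):
    """Plain UCB1 value of a child as seen from its parent."""
    if child.visits == 0:
        return float("inf")  # try unvisited children first
    mean = child.wins / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def simulate(root, branching, playout, threshold=EXPAND_THRESHOLD):
    """One simulation: descend by UCB1, maybe expand the leaf, play out, back up."""
    node = root
    path = [root]
    while node.children:
        parent = node
        node = max(parent.children, key=lambda ch: ucb1(parent, ch))
        path.append(node)
    # 'node' is a leaf; give it children once this visit reaches the threshold.
    if node.visits + 1 >= threshold:
        node.children = [Node() for _ in range(branching)]
    result = playout()       # hypothetical playout returning 1.0 (win) or 0.0 (loss)
    for n in path:
        n.visits += 1
        n.wins += result


def count_nodes(node):
    """Number of nodes in the tree, as a rough proxy for memory use."""
    return 1 + sum(count_nodes(ch) for ch in node.children)
```

Running the same number of simulations with threshold=2 and threshold=5
and comparing count_nodes() gives a feel for the memory versus tree-depth
trade-off described above.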

Even with a threshold of just 2, for 110k playouts I need only about
20 MB of memory for the tree, so I'm actually wondering how to improve
my program by spending more memory usefully. ;-)

-- 
				Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it.	-- J. W. von Goethe

