[Computer-go] 7.0 Komi and weird deep search result

terry mcintyre terrymcintyre at yahoo.com
Thu Apr 7 11:18:57 PDT 2011


From: Brian Sheppard <sheppardco at aol.com>


> Measuring small differences is a big problem for me. I would like to have
> better tools here.

> For instance, I am trying to measure whether a particular rule is an
> improvement, where with the rule it wins 60.5%, and without 60.0%. You need a
> staggering number of games to establish confidence. Yet this is the small, 5 to
> 10 Elo gain that Don referred to.

> I hoped to isolate cases where the *move* differs between versions, and then
> analyze (perhaps using a standard oracle like Fuego) whether those moves are
> plusses or minuses. But this is MCTS, and the program does not always play the
> same way even in the same position.

A very tough problem! How many is "a staggering number", just out of curiosity?
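For a rough sense of scale: distinguishing 60.0% from 60.5% with a standard two-proportion z-test at 95% confidence and 80% power works out to on the order of 150,000 games per version. A back-of-the-envelope sketch (the function name and the normal-approximation formula are mine, not from the thread):

```python
import math

def games_needed(p1, p2, z_alpha=1.959964, z_beta=0.841621):
    """Games per version to separate win rates p1 and p2 with a
    two-sided two-proportion z-test (normal approximation).
    Defaults: 95% confidence (z_alpha) and 80% power (z_beta)."""
    p_bar = (p1 + p2) / 2           # pooled win rate
    delta = abs(p1 - p2)            # effect size to detect
    n = ((z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar)) / delta ** 2
    return math.ceil(n)

# Brian's case: 60.0% vs 60.5% -- on the order of 150,000 games per version
print(games_needed(0.600, 0.605))
```

The quadratic dependence on the effect size is what hurts: halving the Elo gain you want to detect quadruples the number of games.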

I believe at least one developer is using a network of idle workstations to run 
tests. Is anybody using Amazon or some other cloud service? I recently read 
that a firm rented 10,000 cores for 8 hours for $8,000 - a princely sum, but it 
does scale down as well as up.

Sadly, Fuego (or any existing program) may not be a very good "oracle" to 
determine whether move A or move B is best in a given situation.

Does anybody have experience with testing particular "hard cases", rather than 
"1000 random games from scratch"?

That is, based on past experience, program X did move A in situation Y, which 
turned out to be a disaster. Strong players suggest that B, C, or D would be 
better. 
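That kind of "hard case" testing could be scripted as a small regression suite over recorded positions. A hypothetical sketch - the engine interface (a callable from position to move), the case format, and all field names are invented here, not something from the thread:

```python
def run_hard_cases(engine_move, cases):
    """Replay known-problem positions through an engine.

    engine_move: callable(position) -> move (assumed interface, e.g.
                 a GTP wrapper that loads the position and asks for a move).
    cases: list of dicts, each with:
        'position'  - an opaque position key (e.g. an SGF snippet)
        'blunder'   - the move that turned out to be a disaster
        'preferred' - set of moves strong players suggest instead

    Returns (number of cases passed, total cases). A case passes when
    the engine avoids the recorded blunder and picks a preferred move.
    """
    passed = 0
    for case in cases:
        move = engine_move(case["position"])
        if move != case["blunder"] and move in case["preferred"]:
            passed += 1
    return passed, len(cases)
```

One caveat the thread already raises: an MCTS engine is stochastic, so each case would likely need several runs (or a fixed seed) before a pass/fail verdict means much.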


There are more than a few such "what was the player thinking?" instances in the 
archives.

