[computer-go] Hybrid theory

terry mcintyre terrymcintyre at yahoo.com
Fri Feb 1 13:18:21 PST 2008


UCT is based on the theory of the multi-armed bandit, where there is uncertain knowledge about which "arms" would be most productive. Is it possible to graft various sources of knowledge into a sort of meta-bandit algorithm?
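
To make the question concrete, here is a rough Python sketch of one way knowledge might be grafted onto a bandit: each arm is seeded with "virtual" playouts that reflect a prior estimate, and the usual UCB1 rule is applied on top. This is only an illustration, not how any existing program necessarily does it; the names ucb1_with_prior, priors, and prior_weight are all made up for the example.

```python
import math

def ucb1_with_prior(stats, priors, c=1.4, prior_weight=10):
    """Pick the arm with the highest UCB1 score after adding virtual playouts.

    stats:  {move: (wins, visits)} gathered from real playouts
    priors: {move: p}, p in [0, 1], a hypothetical knowledge-based win estimate
    prior_weight: number of virtual playouts each prior is worth (must be > 0)
    """
    totals = {}
    for move, (wins, visits) in stats.items():
        p = priors.get(move, 0.5)  # no knowledge: assume an even chance
        # The prior counts as `prior_weight` playouts won at rate p.
        totals[move] = (wins + p * prior_weight, visits + prior_weight)

    n_total = sum(v for _, v in totals.values())

    def score(move):
        w, v = totals[move]
        # Mean win rate plus the standard UCB1 exploration bonus.
        return w / v + c * math.sqrt(math.log(n_total) / v)

    return max(totals, key=score)
```

With no real playouts the prior decides the choice; as real visits accumulate, the fixed number of virtual playouts matters less and the observed win rate takes over.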

As to fusing top-level knowledge with random playouts, I love the idea, and am trying to imagine how to implement it. One idea is this: certain moves in certain situations might trigger a forced reply. Playing one of a pair of miai, for instance, should result in a high probability of the matching move being played in response, especially if failure to do so would result in the death of a group.
Analysis of a group could conclude that its life depends on certain external liberties, or on the ability to play one of two alternate moves, yadda yadda; threats against those would then trigger the appropriate automatic responses with high probability.
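
A minimal Python sketch of what such a playout policy might look like, assuming some earlier analysis has already produced a table of forced replies (for a miai pair, each point maps to the other). The function name, the forced_replies table, and the 0.9 default are all hypothetical choices for illustration.

```python
import random

def choose_playout_move(legal_moves, last_move, forced_replies,
                        reply_prob=0.9, rng=random):
    """Pick the next playout move.

    legal_moves:    list of moves currently legal
    last_move:      the opponent's previous move (or None)
    forced_replies: {move: reply}, a hypothetical table built by group analysis
    reply_prob:     chance of playing the forced reply when one applies
    """
    reply = forced_replies.get(last_move)
    # Answer the threat with high probability, but only if the reply is legal.
    if reply in legal_moves and rng.random() < reply_prob:
        return reply
    # Otherwise fall back to an ordinary uniformly random playout move.
    return rng.choice(legal_moves)

# Example: A and B are miai, so a move at either point is answered at the other.
miai = {"A": "B", "B": "A"}
move = choose_playout_move(["B", "C", "D"], "A", miai)
```

Keeping reply_prob below 1 leaves some randomness in the playouts, so a mistaken analysis does not completely hide the alternative moves.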
 
Regarding the scalability study, the results are trickling in more slowly now, I think. Is the number of machines in use the same as before? I'm very curious about that flat spot for Mogo-16, 17, and 18. (http://cgos.boardspace.net/study/index.html)


Terry McIntyre <terrymcintyre at yahoo.com> 






