[computer-go] Hybrid theory
terry mcintyre
terrymcintyre at yahoo.com
Fri Feb 1 13:18:21 PST 2008
UCT is based on a theory of a multi-armed bandit, with uncertain knowledge about which "arms" would be most productive. Is it possible to graft various sources of knowledge into a sort of meta-bandit algorithm?
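To make the question concrete, here is a minimal sketch of one way outside knowledge could be grafted onto a bandit rule: standard UCB1 plus a bonus from a heuristic prior that fades as an arm gets visited. The function name, the `prior` scores, and the `prior_weight` constant are all assumptions for illustration, not something taken from UCT itself or from any particular program.

```python
import math

def ucb1_with_prior(stats, prior, c=1.4, prior_weight=10.0):
    """Return the index of the arm to try next.

    stats: list of (wins, visits) per arm.
    prior: list of heuristic scores in [0, 1] per arm (hypothetical
           knowledge source, e.g. a pattern matcher or life-and-death reader).
    """
    # Try unvisited arms first, in order of how much the prior likes them.
    unvisited = [i for i, (_, visits) in enumerate(stats) if visits == 0]
    if unvisited:
        return max(unvisited, key=lambda i: prior[i])

    total = sum(visits for _, visits in stats)
    best, best_score = 0, -math.inf
    for i, (wins, visits) in enumerate(stats):
        mean = wins / visits
        explore = c * math.sqrt(math.log(total) / visits)  # UCB1 exploration term
        # Assumed form of the knowledge bonus: it shrinks as real results accumulate.
        bonus = prior_weight * prior[i] / (visits + 1)
        score = mean + explore + bonus
        if score > best_score:
            best, best_score = i, score
    return best
```

With equal statistics the prior breaks the tie; with many visits the observed win rate dominates, so bad advice from the knowledge source is eventually overridden.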
As to fusing top-level knowledge with random playouts, I love the idea, and am trying to imagine how to implement it. One idea is this: certain moves in certain situations might trigger a forced reply. Playing one of a pair of miai, for instance, should result in a high probability of the matching move being played in response, especially if failing to do so would get a group killed.
Analysis of a group could conclude that its life depends on certain external liberties, or on the ability to play one of two alternative moves, and so on; threats against those points would then trigger the appropriate automatic responses with high probability.
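The forced-reply idea could be sketched roughly as follows. This assumes some earlier analysis has already produced a table of "if the opponent plays X, answer at Y" pairs; the names `miai_table`, `choose_playout_move`, `forced_replies`, and the 0.9 probability are all hypothetical, and moves are plain tuples rather than any real engine's move type.

```python
import random

def miai_table(pairs):
    """Build a reply table from miai pairs: playing either point triggers the other."""
    table = {}
    for a, b in pairs:
        table[a] = b
        table[b] = a
    return table

def choose_playout_move(legal_moves, last_move, forced_replies,
                        p_forced=0.9, rng=random):
    """Pick a playout move, answering a triggering move with high probability.

    legal_moves: list of currently legal moves.
    last_move: the opponent's most recent move (or None).
    forced_replies: dict mapping a triggering move to its expected reply.
    """
    reply = forced_replies.get(last_move)
    if reply is not None and reply in legal_moves and rng.random() < p_forced:
        return reply
    # Otherwise fall back to the usual uniformly random playout move.
    return rng.choice(legal_moves)
```

Keeping the reply probabilistic rather than certain leaves the playouts some room to discover cases where the analysis was wrong.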
Regarding the scalability study, the results are trickling in more slowly now, I think. Is the number of machines in use the same as before? I'm very curious about that flat spot for Mogo-16, 17, and 18 (http://cgos.boardspace.net/study/index.html).
Terry McIntyre <terrymcintyre at yahoo.com>