[Computer-go] learning patterns for mc go
jason.james.house at gmail.com
Wed May 19 20:13:15 PDT 2010
I mostly skimmed it, but here's what I got from it: In a simulation,
pick moves based on the leaf node's RAVE values, but discount moves
whose follow-up moves have already been taken.
The tiling simply tracks how effective a move is when combined
with a specific follow-up move. Near the start of a simulation, this
would match RAVE values. Deep in a simulation, it's highly situational
and based on which follow-up moves remain open.
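To make that concrete, here is a minimal sketch of such a playout policy in Python. Everything beyond the source's description is an assumption: the `rave`, `tiles`, and `played` structures, the multiplicative `discount` factor, and proportional sampling are all hypothetical choices, not taken from the paper.

```python
import random

def tile_score(move, rave, tiles, played, discount=0.5):
    """Score a candidate simulation move (hypothetical scheme):
    start from its RAVE value, then discount it once for every
    tracked (move, follow_up) tile whose follow-up move has
    already been taken in this simulation."""
    s = rave.get(move, 0.5)            # default prior for unseen moves
    for follow_up in tiles.get(move, ()):
        if follow_up in played:
            s *= discount              # assumed discount factor
    return s

def pick_move(legal, rave, tiles, played, rng=random):
    """Pick a playout move with probability proportional to its score."""
    weights = [max(tile_score(m, rave, tiles, played), 1e-9) for m in legal]
    return rng.choices(legal, weights=weights, k=1)[0]
```

Early in a simulation `played` is nearly empty, so the scores reduce to plain RAVE values; deep in a simulation many follow-ups are gone and the discounts dominate, matching the "highly situational" behavior described above.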
I hope that helps!
Sent from my iPhone
On May 19, 2010, at 7:50 PM, Darren Cook <darren at dcook.org> wrote:
>>> My problem is that I can't find many papers about learning of MC
>>> policies, in particular patterns.
>> A just published paper about learning MC policies:
>> It works quite well for Havannah (not tested on hex I think).
> I struggled with this paper ("Multiple Overlapping Tiles for
> Monte Carlo Tree Search"), as it wasn't clear to me what a "tile" was.
> Specifically I couldn't work out if they were 2d patterns of
> black/white/empty, or are they are a sequence of moves (e.g. joseki,
> forcing moves, endgame sente/gote sequences, etc. in go)? Or perhaps
> something else altogether?
> While I wear the dunce's cap and stand in the corner, is someone kindly
> able to explain the idea in go terms?
> Darren Cook, Software Researcher/Developer
> http://dcook.org/gobet/ (Shodan Go Bet - who will win?)
> http://dcook.org/work/ (About me and my work)
> http://dcook.org/blogs.html (My blogs and articles)
> Computer-go mailing list
> Computer-go at dvandva.org