[computer-go] monte carlo transposition reuse (MCTR)
Chris Fant
chrisfant at gmail.com
Fri Sep 14 07:27:12 PDT 2007
Does it defeat it based on number of samples taken or time allotted per turn?
On 9/14/07, Jason House <jason.james.house at gmail.com> wrote:
> I know I'm only wading in the kiddie pool of computer go with my 1-ply bots,
> but I think I may have found a useful enhancement to monte carlo.
>
> HouseBot supports three 1-ply search modes:
> 1plyMC - Uniform sampling
> 1plyShuffle - Uniform sampling with monte carlo transposition reuse
> 1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
>
> Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably expects.
> What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
> time. I'm basic this on self-play data from CGOS. Currently,
> http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> shows 10 matches between housebot-617-UCB has played housebot-618-shuff.
> housebot-617-UCB (1plyUCT) lost every time.
>
> While tricky, it should be possible to combine UCT and MCTR for an even
> stronger bot. MCTR can be thought of as a low bias alternative to the AMAF
> heuristic. Rather than using all moves, MCTR takes only the top N moves,
> where N is computed based on which moves were played in the random game.
> From an open board position MCTR uses about 1/3 of the moves that AMAF
> would. Computation of the resulting winning percentage must also be
> weighted based on the probabilities of duplicating results (roughly
> speaking, it's 1/N).
>
> As a result of using MCTR, winning rates are no longer integers as one would
> expect. Here's the estimated winning rates for all three algorithms when
> asked for a white response to black G3:
>
> 1plyMC: 781 / 1272
> 1plyShuffle: 140.15 / 231.75
> 1plyUCT: 936 / 1515
>
> 1plyShuffle is slower because of the extra work information tracking, but
> the variance in estimates should be far lower than the numbers would
> indicate. I have yet to do the computations, but a sample size of 231.75
> has an estimation error of around 6000 normal MC runs for that position.
> That is why my implementation of MCTR is defeating my (1ply) implementation
> of UCT.
>
> _______________________________________________
> computer-go mailing list
> computer-go at computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
More information about the computer-go
mailing list