[computer-go] Improvement of UCT search algorithm
Łukasz Lew
lukasz.lew at gmail.com
Tue Oct 10 08:51:28 PDT 2006
On 10/10/06, sylvain.gelly at m4x.org <sylvain.gelly at m4x.org> wrote:
> On Monday 09 October 2006 17:34, Don Dailey wrote:
> > I would like to know the results if you do some tests on this.
> Hello,
>
> here are the results comparing the methods of choosing the best move. I have
> not yet tested the more complicated solution of giving more time if
> necessary.
>
> The benchmark:
> - current MoGo with 70000 simulations/move (exactly, no more time even if
> necessary).
> - gnugo 3.6 all by default (level 8 I think).
> - 9x9 with 7.5 komi
>
> Results: (number of wins / number of games with MoGo playing black, then with
> MoGo playing white, then percentage over all the games).
> * Choosing the move with the highest value: 338/425(b),352/425(w) (81.2%/850)
> * Choosing the move with the highest (value-(standard
> deviation)/sqrt(simulations)): 332/400(b),326/400(w) (82.2%/800)
> * Choosing the move with the highest number of simulations:
> 322/400(b),341/400(w) (82.9%/800)
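For reference, the three final-move rules compared above can be sketched as follows. This is only an illustration, not MoGo's code: the MoveStats record, the function names, and the numbers in the example are all made up, and the sketch assumes each root move stores its mean value, the standard deviation of its results, and its simulation count.

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Hypothetical per-move statistics kept at the root."""
    value: float  # mean simulation result for the move, in [0, 1]
    std: float    # standard deviation of the simulation results
    sims: int     # number of simulations through the move


def best_by_value(stats):
    """Rule 1: the move with the highest mean value."""
    return max(stats, key=lambda m: stats[m].value)


def best_by_lower_bound(stats):
    """Rule 2: the move with the highest value - std / sqrt(sims)."""
    def bound(m):
        s = stats[m]
        if s.sims == 0:
            return -math.inf  # never pick an unvisited move
        return s.value - s.std / math.sqrt(s.sims)
    return max(stats, key=bound)


def best_by_simulations(stats):
    """Rule 3: the move with the highest number of simulations."""
    return max(stats, key=lambda m: stats[m].sims)


# Made-up example: "A" has the best mean but very few simulations.
root = {
    "A": MoveStats(value=0.62, std=0.49, sims=50),
    "B": MoveStats(value=0.60, std=0.49, sims=5000),
    "C": MoveStats(value=0.55, std=0.50, sims=2000),
}
print(best_by_value(root), best_by_lower_bound(root), best_by_simulations(root))
```

The rules only disagree when a rarely sampled move happens to have a high mean, which is the case the second and third rules guard against.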
Correct me if I'm wrong.
UCT explores the move m with the highest
avg_m + c * sqrt( log(n) / n_m )
so those values are kept at almost the same level across siblings.
n is the same for all siblings, so the child with the highest avg_m
also tends to have the highest n_m.
BTW, has anyone tried removing the log from the equation?
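To make the question concrete, here is a minimal sketch of the selection step, assuming the standard UCB1 form avg_m + c * sqrt(log(n) / n_m). The function names, the use_log switch, and the exact "no log" variant (replacing log(n) with n) are my own assumptions for illustration, not anything taken from MoGo.

```python
import math


def ucb_score(avg, n_child, n_parent, c=1.0, use_log=True):
    """Exploration score of one child; unvisited children go first."""
    if n_child == 0:
        return math.inf
    # Assumed "no log" variant: use n itself instead of log(n).
    numerator = math.log(n_parent) if use_log else n_parent
    return avg + c * math.sqrt(numerator / n_child)


def select_child(children, c=1.0, use_log=True):
    """children maps move -> (avg_m, n_m); n is the sum of all n_m."""
    n_parent = sum(n for _, n in children.values())
    return max(
        children,
        key=lambda m: ucb_score(children[m][0], children[m][1],
                                n_parent, c, use_log),
    )


# Made-up example: "b" has a lower average but far fewer visits.
children = {"a": (0.6, 100), "b": (0.5, 10)}
print(select_child(children))
```

Without the log the exploration term grows much faster with n, so the search would spread its simulations more evenly over the siblings.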
Lukasz
>
> So indeed choosing the move with the highest number of simulations seems a
> little better, although the difference is not statistically very significant
> (I could try with more games, but 800 is already quite a lot :-)).
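As a rough check of the significance remark, the first and third results can be compared with a two-proportion z-test. This sketch assumes the games are independent and that the normal approximation is adequate. The function name is hypothetical, and the win counts are the ones reported above.

```python
import math


def two_proportion_z(wins_a, games_a, wins_b, games_b):
    """z statistic for the difference between two win rates (pooled)."""
    p_a = wins_a / games_a
    p_b = wins_b / games_b
    pooled = (wins_a + wins_b) / (games_a + games_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    return (p_b - p_a) / se


# Highest value (690/850) versus highest number of simulations (663/800).
z = two_proportion_z(338 + 352, 850, 322 + 341, 800)
print(f"z = {z:.2f}")
```

A value of |z| below about 1.96 means the difference is not significant at the usual 5% level. Here z comes out below 1.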
>
> > But it's hard to imagine this making MoGo even stronger :-)
> :-)
> I have to find a way, because I have a bet with Yizao that MoGo will beat him
> one day in 9x9 without handicap. If I succeed, he buys me a meal in a
> restaurant :-)).
> OK, it will not be this time :-/ :-).
>
> > On Mon, 2006-10-09 at 16:53 +0200, sylvain.gelly at m4x.org wrote:
> > > On Monday 09 October 2006 16:35, Don Dailey wrote:
> > > > On Mon, 2006-10-09 at 10:21 +0200, sylvain.gelly at m4x.org wrote:
> > > > > The solution given by Don (giving more time until the best move and
> > > > > the most sampled are the same) seems quite good. But we didn't try
> > > > > (no time).
> > > > > Don, did this solution give you significant improvements?
> > > >
> > > > I never did a hard core test but I'm quite sure it was a significant
> > > > improvement. Let me be more specific on what I do:
> > >
> > > Thank you Don for the details. I
> > > have thought about a similar method, but I wasn't very motivated to do it
> > > because I thought that it would not be significant. But if you think it
> > > would be quite a bit better, I'll try :-).
> > > I'll let you know if this is successful.
> > >
> > > Sylvain