[Computer-go] UCB-1 tuned policy
fotland at smart-games.com
Wed Apr 15 22:37:59 PDT 2015
I didn’t notice a difference. Like everyone else, once I had RAVE implemented and added biases to the tree move selection, I found the UCT term made the program weaker, so I removed it.
> -----Original Message-----
> From: Computer-go [mailto:computer-go-bounces at computer-go.org] On Behalf Of
> Igor Polyakov
> Sent: Tuesday, April 14, 2015 3:37 AM
> To: computer-go at computer-go.org
> Subject: [Computer-go] UCB-1 tuned policy
> I implemented UCB1-tuned in my basic UCB-1 Go player, but it doesn't seem
> like it makes a difference in self-play.
> It seems like it's able to run 5-25% more simulations, which means it's
> probably exploiting deeper (and has fewer steps until it runs out of room to
> play legal moves), but I have yet to see any strength improvements on
> 9x9 boards.
> As far as I understand, the only thing that's different is the formula.
> Has anyone actually seen any difference between the two algorithms?
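
For reference, the difference in formula that the question above refers to can be sketched as follows. This is a minimal sketch of the two selection scores as defined by Auer, Cesa-Bianchi and Fischer (2002), not code from either poster's program. It assumes rewards in [0, 1] (for example 1 for a win, 0 for a loss) and that each child tracks its sum of squared rewards. The function and parameter names are hypothetical.

```python
import math


def ucb1(mean, n_child, n_parent, c=math.sqrt(2)):
    """UCB1 score: mean reward plus a fixed-width exploration bonus.

    With the default c = sqrt(2), the bonus equals sqrt(2 * ln(n) / n_j).
    """
    return mean + c * math.sqrt(math.log(n_parent) / n_child)


def ucb1_tuned(mean, sum_sq, n_child, n_parent):
    """UCB1-tuned score: the bonus is scaled by an upper bound on the variance.

    sum_sq is the sum of squared rewards seen at this child (an assumed
    bookkeeping field; for 0/1 rewards it equals the number of wins).
    """
    log_term = math.log(n_parent) / n_child
    # Empirical variance plus a confidence term, as in the UCB1-tuned definition.
    variance_bound = sum_sq / n_child - mean * mean + math.sqrt(2.0 * log_term)
    # 1/4 is the largest possible variance of a reward confined to [0, 1].
    return mean + math.sqrt(log_term * min(0.25, variance_bound))


if __name__ == "__main__":
    # Hypothetical child: 5 wins in 10 visits, parent visited 100 times.
    print(ucb1(0.5, 10, 100))
    print(ucb1_tuned(0.5, 5.0, 10, 100))
```

Because the variance factor is capped at 1/4, the UCB1-tuned bonus is never larger than the UCB1 bonus with c = sqrt(2), so the tuned version explores less and concentrates more simulations on the moves that currently look best.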