[Computer-go] MC in games of imperfect information
ojflewis at yahoo.co.uk
Thu May 20 01:08:07 PDT 2010
Before trying to bias / weight the MC playouts, it would be worth trying a
pure-MC approach. As you've described it below, this would be "Give all
players random cards, then play the game out randomly". If you have access
to the rules-based bot, that is ideal, as you have a fixed-strength opponent
you can test against. Although pure-MC in Go has been left behind by MCTS,
it should be a good place to start to validate the approach. The fact that
the players will respond with bad moves most of the time doesn't invalidate
the approach (at least in Go). I wouldn't go down the route of playing out
with deterministic rules as the choice of these could have a major influence
on the validity of the playout results.
The problem of imperfect information reminds me a bit of backgammon, where
there is perfect information, but the dice rolls mean a huge branching
factor and similar uncertainty about possible future moves. A neural
network that did TD learning through self-play was a successful approach in
that domain (google TD Gammon to find the paper by Tesauro).
On Fri, May 14, 2010 at 4:30 PM, Isaac Deutsch <ibd at gmx.ch> wrote:
> Hi all,
> I'm thinking about creating a computer player for Tichu (
> http://en.wikipedia.org/wiki/Tichu), a game that is rather widespread here
> in Switzerland. However, to my knowledge there exists no official bot that
> plays it. Someone I know has created a bot that plays using rules only (if
> X, play cards YZ), but his findings were that the bot plays pretty weak.
> Of course, the game is solvable when there are only 2 players left because
> then, the distribution of cards is clear and the best strategy can be
> With 3 or even all 4 players still in the game, it is clear that it is not
> clear which player has which cards. :) It is a game of imperfect
> information. I was wondering if Monte Carlo (MC) can and should be used at
> this stage of the game. It seems that there has been some success of MC in
> If yes, how would MC be used? When a move is to be made, should all legal
> moves be generated, and should a number of playouts be simulated for each
> (1-ply MC)? What do the playouts look like? Give all players (weighted)
> random cards, then play the game out with deterministic rules? Give all
> players (weighted) random cards, then play the game out (weighted) randomly?
> I'm pretty much just trying to think of something based on what 'works' in
> Go. :) Because of the uncertainty in the distribution of cards gives so many
> possible combinations, it seems impracticable to create a tree with all
> possible distributions.
> If no, what alternatives are there? Neural networks?
> Computer-go mailing list
> Computer-go at dvandva.org
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Computer-go