[Computer-go] MC in games of imperfect information

Oliver Lewis ojflewis at yahoo.co.uk
Thu May 20 01:08:07 PDT 2010


Before trying to bias / weight the MC playouts, it would be worth trying a
pure-MC approach.  As you've described it below, this would be "Give all
players random cards, then play the game out randomly".  If you have access
to the rules-based bot, that is ideal, as you have a fixed-strength opponent
you can test against.  Although pure-MC in Go has been left behind by MCTS,
it should be a good place to start to validate the approach.  The fact that
the players will respond with bad moves most of the time doesn't invalidate
the approach (at least in Go).  I wouldn't go down the route of playing out
with deterministic rules as the choice of these could have a major influence
on the validity of the playout results.

The problem of imperfect information reminds me a bit of backgammon, where
there is perfect information, but the dice rolls mean a huge branching
factor and similar uncertainty about possible future moves.  A neural
network that did TD learning through self-play was a successful approach in
that domain (google TD Gammon to find the paper by Tesauro).

Oliver


On Fri, May 14, 2010 at 4:30 PM, Isaac Deutsch <ibd at gmx.ch> wrote:

> Hi all,
>
> I'm thinking about creating a computer player for Tichu (
> http://en.wikipedia.org/wiki/Tichu), a game that is rather widespread here
> in Switzerland. However, to my knowledge there exists no official bot that
> plays it. Someone I know has created a bot that plays using rules only (if
> X, play cards YZ), but his findings were that the bot plays pretty weak.
>
> Of course, the game is solvable when there are only 2 players left because
> then, the distribution of cards is clear and the best strategy can be
> calculated.
>
> With 3 or even all 4 players still in the game, it is clear that it is not
> clear which player has which cards. :) It is a game of imperfect
> information. I was wondering if Monte Carlo (MC) can and should be used at
> this stage of the game. It seems that there has been some success of MC in
> poker.
> If yes, how would MC be used? When a move is to be made, should all legal
> moves be generated, and should a number of playouts be simulated for each
> (1-ply MC)? What do the playouts look like? Give all players (weighted)
> random cards, then play the game out with deterministic rules? Give all
> players (weighted) random cards, then play the game out (weighted) randomly?
> I'm pretty much just trying to think of something based on what 'works' in
> Go. :) Because of the uncertainty in the distribution of cards gives so many
> possible combinations, it seems impracticable to create a tree with all
> possible distributions.
>
> If no, what alternatives are there? Neural networks?
>
> Greetings,
> Isaac
> _______________________________________________
> Computer-go mailing list
> Computer-go at dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://computer-go.org/pipermail/computer-go/attachments/20100520/e8dd5842/attachment.html>


More information about the Computer-go mailing list