[Computer-go] Utilizing multiple parametersets/bias systems/bots
leandromarcolino at gmail.com
Mon Apr 27 17:07:14 PDT 2015
You may want to take a look at my papers, they are kind of related to these
ideas (although I didn't yet change the "team" on the fly, as you are
Concerning changing the settings dynamically, my most recent work may help
(when using voting):
Let me know if you have questions/comments... :)
On Sun, Apr 26, 2015 at 7:59 AM, Marc Landgraf <mahrgell87 at gmail.com> wrote:
> I lately tried to think about, whether it would be possible to combine the
> strengths of different bots, or at least different parameter sets/bias
> systems for one bot in some way. They may shine at different
> situations/phases during the game, but how to figure out, which one is
> currently the better one?
> What I now came up with, was the following:
> For simplicity we assume for now, that our different bots are using the
> same playouts, but different approaches during the tree phase. So maybe
> they use different ways to bias nodes, different selection formulas etc.
> Gonna focus on them using different bias systems.
> Now you split up your playouts in percentiles:
> 25%: Bot1 selects the white moves, Bot2 the black ones
> 25%: the other way around
> 0<x%<50%: Bot1 selects for both.
> 50%-x%: Bot2 selects for both.
> You track the win rates of those first 2 quarters separately and also
> calculate winrate of Bot1 vs Bot2.
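A minimal sketch of the split and the head-to-head statistic described above, assuming the two mixed quarters are tracked as plain win counts. The names split_playouts and bot1_head_to_head and the group labels are hypothetical, not taken from any existing bot.

```python
def split_playouts(total, x):
    """Divide `total` playouts into the four groups from the post.

    x is the fraction (0 < x < 0.5) in which Bot1 selects for both colours;
    Bot2 selects for both in the remaining 0.5 - x.
    """
    if not 0.0 < x < 0.5:
        raise ValueError("x must lie strictly between 0 and 0.5")
    quarter = total // 4
    bot1_both = round(total * x)
    # Whatever is left after the two mixed quarters and Bot1's share.
    bot2_both = total - 2 * quarter - bot1_both
    return {
        "bot1_white_bot2_black": quarter,
        "bot2_white_bot1_black": quarter,
        "bot1_both": bot1_both,
        "bot2_both": bot2_both,
    }


def bot1_head_to_head(white_wins_a, n_a, black_wins_b, n_b):
    """Win rate of Bot1 against Bot2 over the two mixed quarters.

    Group a: Bot1 picks White's moves, so White wins count for Bot1.
    Group b: Bot1 picks Black's moves, so Black wins count for Bot1.
    """
    games = n_a + n_b
    if games == 0:
        return 0.5  # no data yet: treat the bots as equal
    return (white_wins_a + black_wins_b) / games
```

For example, split_playouts(1000, 0.3) assigns 250 playouts to each mixed group, 300 to Bot1 alone and 200 to Bot2 alone.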
> Now if the bots are identical obviously both should win 50%. But if the
> bots are different, you may see different results.
> E.g. when Bot1 wins 55% of his games, his move selection is probably
> better than Bot2's move selection. Here you have to be careful about wrong
> conclusions, because if you set up a depth-first bot vs a width-first
> you would certainly also get win rates heavily in favor of the depth-first.
> But where this could shine is, when using different bias systems. Because
> it actually tells you, which bias system is doing better in the current
> board situation.
> Now you can use that knowledge to calculate x. E.g. if either bot wins
> 60%+ he gains all 50% of the remaining playouts, and let the balance slide
> linearly, if the win rate is 40-60% for the bots against each other. (use
> whatever formula here, open for testing)
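One possible reading of the linear slide above, as a sketch only: the 40% and 60% thresholds come from the post, while the function name bot1_share and its parameters are hypothetical. The clamped ends return exactly 0 or 0.5, following "gains all 50%".

```python
def bot1_share(winrate, low=0.4, high=0.6, remaining=0.5):
    """Fraction x of all playouts in which Bot1 selects for both colours.

    winrate is Bot1's head-to-head win rate against Bot2. At or below `low`
    Bot1 gets nothing, at or above `high` it gets all of `remaining`, and in
    between the share slides linearly.
    """
    if winrate <= low:
        return 0.0
    if winrate >= high:
        return remaining
    return remaining * (winrate - low) / (high - low)
```

With these defaults an even 50% head-to-head result gives each bot a quarter of all playouts, and Bot2 receives remaining - bot1_share(winrate).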
> This should enable you to figure out on the fly, which bias system is
> doing a better job at the current situation, while doing playouts. You are
> just tracking some additional stats.
> Of course, there are pros and cons to this method:
> + In general, switching selection method in the tree should not cost any
> time, and tracking those additional stats also costs close to no time. Only
> additional time used comes from using a second bias function or similar
> (because now you have to calculate the bias twice, for most nodes)
> - At the same time, the amount of data is actually increasing, because you
> have to track the stats for the different bots. This may cause memory
> issues! (but when using a distributed memory solution anyway, it does not
> create additional data, if each memory/thread unit is assigned to one of
> those percentiles of playouts)
> +- In general the cost increase depends on how different the 2 bots are.
> If they are the same, there would be basically no cost.
> + Allows you to figure out, which bot/bias/selection is doing better right
> - but may lead to false conclusions, like the depth vs width case mentioned above
> +- As long as both bots are of similar strength, you should not lose from
> using this kind of system. Worst case is, that you play the wrong bot's
> move, if you had above mentioned false conclusions. But when they are close
> in strength, that is not worse than using just one bot all the time. Of
> course, if one bot is dominating the other one in all situations, you are
> losing quality, when figuring out again and again, that this is the case.
> (because in 50% of the playouts, half of the moves were selected by the
> worse bot)
> So some quick ideas how it could be modified further:
> - all percentages are obviously placeholders and could be adjusted (even
> - assuming you have a low cost and a very high time cost bias function,
> you can actually check, if it is worth using the high cost bias function,
> or if the board situation is simple enough to churn out more playouts using
> the cheaper function.
> - identifying certain game situations: you may know, that one bias is
> doing much better in corner fights or liberty races, so if that one is
> dominating, you are probably in such a situation and can now adjust further
> (e.g. modify playouts, add additional routines, whatever)
> - using more than 2 routines, possibly in conjunction with the idea above
> to identify situations.
> Of course, one could now also think how to expand this idea, with using
> different playouts, but I'm not sure how to judge, which playouts are the
> better ones, when using different ones. Only because Bot1's playouts tell
> me, I'm winning 60%, it does not mean, that his playouts are better than
> Bot2's, who is only giving me 40%. (Even though I would wish so :D) So
> right now I need the same playouts, to judge my selection routine. Maybe
> someone else has an idea, how to judge playouts?
> What do you guys think? Is there anything with potential in those ideas?
> Or is the cost and danger of wrong conclusions too high for the possible
> Sadly can't test it with my own bot, as my versions strictly dominate each
> other, and with its current strength it would probably be not conclusive