[Computer-go] Idea how to improve RAVE
ajahuang at gmail.com
Mon Apr 1 12:08:55 PDT 2013
Your idea is interesting. If I understand correctly, it is similar
to Fuego's "weight RAVE updates"; see the description below:
Weight RAVE updates.
Gives more weight to moves that are closer to the position for which the
RAVE statistics are stored. The weighting function is linearly decreasing
from 2 to 0 with the move number from the position for which the RAVE
statistics are stored to the end of the simulated game.
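
The weighting described above can be sketched roughly as follows. This is only an illustration of the quoted description, not Fuego's actual code; the function name, the parameters, and the exact endpoint handling (weight 2 at the stored position, weight 0 at the end of the simulated game) are assumptions.

```python
def rave_update_weight(offset: int, remaining: int) -> float:
    """Weight of a RAVE update for a move played `offset` moves after the
    position that stores the statistics, in a simulated game that ends
    `remaining` moves after that position (hypothetical parameters)."""
    if remaining <= 0:
        raise ValueError("remaining must be positive")
    if not 0 <= offset <= remaining:
        raise ValueError("offset must lie between 0 and remaining")
    # Linear decrease: 2 at the stored position, 0 at the end of the game.
    return 2.0 * (1.0 - offset / remaining)
```

With this sketch, a move played halfway through the remaining game gets weight 1, so moves close to the stored position count roughly twice as much as a mid-game move.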
2013/3/31 Alexander Kozlovsky <alexander.kozlovsky at gmail.com>
> I have an idea how to improve RAVE, but it is still rough.
> So, I want to describe it here in the hope that it leads to some
> interesting discussion. I hope my not-so-good English
> allows me to describe the idea adequately.
> If you don't want to read all the details, you can first scroll down
> to "Use cases" to read when this improvement may be useful.
> --- Current RAVE statistics implementation ---
> Let's say we want to accumulate RAVE statistics, and
> we have four arrays for this: black_rave_total, black_rave_wins,
> white_rave_total, white_rave_wins.
> We increment black_rave_total[intersection] += 1
> if, during the last simulation, black was the first to play on this intersection.
> We increment black_rave_wins[intersection] += 1
> if, during the last simulation, black was the first to play on this intersection
> and the simulation result is "black win".
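
A minimal sketch of the bookkeeping described above, assuming a simulation is given as a list of (color, point) pairs in playing order. The class name, the `update` method, and the input format are hypothetical; only the four counter names come from the message.

```python
from collections import defaultdict


class RaveStats:
    """Per-node RAVE counters, keyed by intersection name (e.g. "B4")."""

    def __init__(self):
        self.black_rave_total = defaultdict(int)
        self.black_rave_wins = defaultdict(int)
        self.white_rave_total = defaultdict(int)
        self.white_rave_wins = defaultdict(int)

    def update(self, moves, winner):
        """Record one simulation.

        moves:  list of (color, point) pairs in playing order (assumed format)
        winner: "black" or "white"
        """
        # Only the first player to occupy an intersection is credited.
        first_player = {}
        for color, point in moves:
            first_player.setdefault(point, color)

        for point, color in first_player.items():
            if color == "black":
                self.black_rave_total[point] += 1
                if winner == "black":
                    self.black_rave_wins[point] += 1
            else:
                self.white_rave_total[point] += 1
                if winner == "white":
                    self.white_rave_wins[point] += 1
```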
> --- New proposal ---
> What if we add two arrays: black_rave_win_move_sum
> and white_rave_win_move_sum?
> These arrays accumulate the sum of the move numbers at which black
> (or white) was the first to play on the intersection and won the simulation.
> Concrete example:
> Let's say we have already done ten simulations for the current node.
> In six simulations, black was the first to play on the B4 intersection.
> Black won three of these simulations.
> In the first of these three simulations, the move number at which black
> played on B4 was 20 (this move number is counted from the
> start of the random simulation, not from the start of the game).
> In the second of the three simulations, the move number for B4 was 25.
> In the third simulation where black played on B4 and won,
> the move number was 72.
> In this case, black_rave_win_move_sum for B4 will be
> 20 + 25 + 72 = 117
> This number allows us to calculate the average move number
> for B4 when the simulation result was successful for black:
> 117 / 3 = 39.
> I denote this as black_rave_avg_win_move_num:
> black_rave_avg_win_move_num[pos] =
> black_rave_win_move_sum[pos] / black_rave_wins[pos]
> In current RAVE, we use the winrate to determine the "best" move:
> black_rave_winrate[pos] = black_rave_wins[pos] / black_rave_total[pos]
> I propose to use a "weighted winrate" instead:
> black_weighted_rave_winrate[pos] =
> black_rave_winrate[pos] / black_rave_avg_win_move_num[pos]
> In the current example, the winrate for B4 is 3/6 = 0.5 and the
> weighted winrate is (3/6) / (117/3) = 0.0128205
> The weighted winrate will be bigger for successful moves which must be played
> early in the simulation. Good endgame moves will have a low weighted winrate.
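
The proposed formula and the B4 example can be checked with a short sketch. The function name and the handling of intersections with no data (returning 0) are assumptions, and move numbers are assumed to be counted from 1 so that the average is never zero.

```python
def weighted_rave_winrate(total: int, wins: int, win_move_sum: int) -> float:
    """RAVE winrate divided by the average move number in won simulations.

    total:        simulations where this color played the point first
    wins:         how many of those simulations this color won
    win_move_sum: sum of the move numbers of the point in the won simulations
    """
    if total == 0 or wins == 0:
        return 0.0  # assumption: no data or no wins gives weight 0
    winrate = wins / total
    avg_win_move_num = win_move_sum / wins
    return winrate / avg_win_move_num


# B4 from the example: 6 simulations, 3 wins, played at moves 20, 25 and 72.
b4 = weighted_rave_winrate(total=6, wins=3, win_move_sum=20 + 25 + 72)
print(f"{b4:.7f}")  # prints 0.0128205
```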
> --- Use cases ---
> 1) Let's say we have two moves with a good RAVE winrate: E5 and A4.
> A4 has the bigger winrate, because A4 is inside safe territory, and each
> successful simulation contains A4. E5 is critical, and must be played
> very early for the result to be successful. Each simulation with E5 also
> contains A4, but some simulations without E5 were also successful
> because of dumb opponent play during the simulation.
> So, A4 has the bigger RAVE winrate. But E5 has the bigger
> weighted winrate, because A4 can be played at any time during the
> simulation, while E5 must be played early or it will be useless.
> Using the weighted RAVE winrate, we can determine that
> E5 is more important than A4, despite the fact that A4 has the bigger
> RAVE winrate.
> 2) Let's say black must make three moves during the simulation
> in order to win: B2, B3, C3, in exactly this order. Without
> these moves black cannot win the simulation.
> All of these moves have the same winrate, because the
> simulation is successful for black only if all three moves
> are played during the simulation.
> So, if we use the simple RAVE winrate, we can have problems
> determining the correct move order.
> But B2 has a bigger weighted winrate than B3 and C3
> (and B3 has a bigger weighted winrate than C3), because
> in all successful simulations B2 is played before B3,
> and hence the average move number for B2 is strictly less
> than the average move number for B3 and C3. So, when using
> the weighted winrate, we can determine the correct move order.
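
Use case 2 can be illustrated with made-up numbers. The move numbers and the shared winrate of 0.6 below are purely hypothetical; the sketch only shows that, with equal RAVE winrates, sorting by the weighted winrate recovers the B2, B3, C3 order.

```python
# Hypothetical move numbers of B2, B3 and C3 in three simulations won by black.
winning_sims = [
    {"B2": 1, "B3": 5, "C3": 9},
    {"B2": 3, "B3": 7, "C3": 21},
    {"B2": 1, "B3": 11, "C3": 13},
]

# Hypothetical plain RAVE winrate, identical for all three moves.
same_winrate = 0.6


def weighted_winrate(point: str) -> float:
    """Shared winrate divided by the point's average move number in wins."""
    avg_move_num = sum(sim[point] for sim in winning_sims) / len(winning_sims)
    return same_winrate / avg_move_num


# The plain winrate cannot separate the moves; the weighted one can.
ordered = sorted(["C3", "B2", "B3"], key=weighted_winrate, reverse=True)
print(ordered)  # prints ['B2', 'B3', 'C3']
```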
> What do you think, am I missing something?