[computer-go] The effect of the UCT-constant on Valkyria

Magnus Persson magnus.persson at phmp.se
Sat May 3 09:48:20 PDT 2008


Quoting David Fotland <fotland at smart-games.com>:

> So I'm curious then.  With simple UCT (no rave, no priors, no progressive
> widening), many people said the best constant was about 0.45.  What are the
> new concepts that let you avoid the constant?
>
> Is it RAVE, because the information gathered during the search lets you
> focus the search accurately without the UCT term?  Many people have said
> that RAVE has no benefit for them.

Yes, it is RAVE, and more specifically RAVE as it was recently
presented here on the mailing list by the Mogo team, not as it was
originally presented in the Mogo paper. There may also be several
minor details that are peculiar to my implementation. Actually, I did
not understand some aspects of the Mogo method mailed here and just
guessed some details. It suddenly worked and I could feel that the
search was unusually strong and selective, and since then I have just
adjusted some parameters.

I used to do progressive widening but that is now turned off. RAVE is  
free to pick any move that is not pruned right away.

Currently I believe that RAVE is only effective if one gets the other
parameters right. For me it meant changing the UCT parameter from 0.8
to 0.1. I also know of many pathological situations where Valkyria
currently will not find the best move, but rather the second best. It
is possible that other programs suffer even more than Valkyria from
similar problems, and that this is partly because the nature of the
playouts may interfere with AMAF. For example, Valkyria either plays
forced moves or picks uniformly at random among the moves that are not
pruned. Other programs may rely on patterns to pick all moves in the
playouts, and this might be bad for AMAF (this is wild speculation).
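
As a rough illustration of what the UCT constant does, here is a
minimal sketch of the usual exploration bonus and of move selection at
one node. The function names and the example numbers are made up for
illustration and are not taken from Valkyria; the value estimate per
move stands in for whatever the program uses (for example a blend of
winrate and AMAF).

```python
import math

def exploration_bonus(parent_visits, child_visits, c):
    """Standard UCT exploration term, scaled by the constant c."""
    return c * math.sqrt(math.log(parent_visits) / child_visits)

def select(children, c):
    """Pick the move with the highest value estimate plus bonus.

    children: list of (name, value_estimate, visits), visits > 0.
    """
    parent_visits = sum(visits for _, _, visits in children)
    best = max(
        children,
        key=lambda ch: ch[1] + exploration_bonus(parent_visits, ch[2], c),
    )
    return best[0]

# Hypothetical node: A looks better and has most of the visits.
children = [("A", 0.55, 900), ("B", 0.45, 100)]
print(select(children, 0.8))  # B: the large constant forces exploration
print(select(children, 0.1))  # A: the small constant keeps the search selective
```

With a constant this small the exploration term barely matters, so the
move estimates themselves have to steer the search.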

> Do most of the strongest programs use RAVE?  I think from Crazystone's
> papers, that it does not use RAVE.  Gnugomc does not use rave.

You might not need it if you have strong pattern matching priors for
the tree part, similar to Crazystone. RAVE makes it possible to ignore
most bad moves in a given position. The weakness is that some good
moves (with a chance of being the best possible move) are often also
ignored completely.

> Is it the prior values from go knowledge, like opening books, reading
> tactics before the search etc?  Do all of the top programs have opening
> books now?  I know mogo does.

Valkyria has just 4 moves in a hardcoded opening book. Previous
versions used a book with several thousand positions that was both
self-learned and modified by hand, but as long as the program keeps
changing the book tends to become inaccurate. So right now I do not
use it, and I am planning to write something more efficient than the
old one, which kept each position as a file on the hard drive.

> Do most of the top programs read tactics before the search?  I know Aya
> does.

Valkyria only does some simple tactics in the playouts. It is stronger  
than anything I ever programmed (on 9x9 at least) so currently I  
cannot see how to integrate precomputed tactical results in the later  
search. I think Aya is special because it was very strong doing search  
before it went MC.

> Does it matter how prior values are used to guide the search?  I think mogo
> uses prior knowledge to initialize the RAVE values.  Do other programs
> include it some other way, by initializing the FPU value, or by initializing
> the UCT visits and confidence, or some extra, "prior" term in the equation?

Right now Valkyria sets priors for AMAF so that moves that are a good  
local response to the last move have a prior 100% winrate with 20-100  
visits depending on the priority of the triggered pattern. I think  
Mogo has a fixed number of visits for the priorities but modifies the  
winrate, but I never saw this described in a way that made it clear.
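
A minimal sketch of seeding AMAF statistics this way, under some loud
assumptions: the priority scale (0 to 10) and the linear mapping onto
20-100 virtual visits are hypothetical, since the text only gives the
range, and the class and function names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class AmafStats:
    wins: float = 0.0
    visits: float = 0.0

    def winrate(self):
        # 0.5 is an arbitrary neutral value for a move with no data.
        return self.wins / self.visits if self.visits else 0.5

    def update(self, won):
        """Record one real AMAF sample from a playout."""
        self.wins += 1.0 if won else 0.0
        self.visits += 1.0

def prior_visits(priority, lo=20, hi=100, max_priority=10):
    """Hypothetical mapping: pattern priority -> number of virtual visits."""
    priority = max(0, min(priority, max_priority))
    return lo + (hi - lo) * priority / max_priority

def seed_amaf(stats, priority):
    """Add virtual visits that are all wins, i.e. a 100% prior winrate."""
    n = prior_visits(priority)
    stats.wins += n
    stats.visits += n

stats = AmafStats()
seed_amaf(stats, priority=0)   # weakest pattern: 20 virtual wins
for _ in range(80):            # 80 real AMAF samples, all losses
    stats.update(won=False)
print(stats.winrate())         # 0.2
```

The number of virtual visits controls how many real samples it takes
to overrule the pattern.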

Previously I biased the UCT values after everything else was computed
but found that this led to some bad behavior. By biasing the AMAF
values instead, these biases become less influential as the search
proceeds, since the true winrate gets more weight than the AMAF scores.
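
To make the fading of an AMAF bias concrete, here is a small sketch
that blends the true winrate with the AMAF score. The weight schedule
sqrt(k / (3n + k)) is one published choice for RAVE and is only an
assumption here, as are the value of k and the function names; the
text does not say which schedule Valkyria uses.

```python
import math

def rave_weight(visits, k=1000.0):
    """Weight given to the AMAF score; equals 1 at 0 visits, then decays."""
    return math.sqrt(k / (3.0 * visits + k))

def combined(true_winrate, visits, amaf_winrate, k=1000.0):
    """Blend of the real winrate and the (possibly biased) AMAF score."""
    beta = rave_weight(visits, k)
    return (1.0 - beta) * true_winrate + beta * amaf_winrate

# A move whose AMAF score is pushed to 1.0 by a prior,
# but whose true winrate turns out to be 0.4.
for n in (0, 100, 1000, 10000):
    print(n, round(combined(0.4, n, 1.0), 3))
```

The printed estimate starts at the biased AMAF score and moves toward
0.4 as the number of real visits grows.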


> Are there other techniques (not RAVE) that people are using to get
> information from the search to guide the move ordering?  I think crazystone
> estimates ownership of each point and uses it to set prior values in some
> way.

I used to do that a long time ago in Viking (the precursor to
Valkyria), which used alpha-beta + MC-eval. As I remember it, it had a
great impact on move ordering, which was otherwise quite bad (or even
nonexistent) in Viking.

I have tried it in Valkyria but was never able to see an improvement.  
But I did not try hard enough to tell for sure. Both ownership and  
AMAF use the same information (playouts), so trying to use it twice is  
perhaps partially a waste of effort.
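
For reference, a minimal sketch of an ownership estimate computed from
the final positions of the playouts. The input format (one string per
playout, with 'B' or 'W' per point) and the function name are
assumptions made for the example.

```python
def ownership_map(final_positions):
    """Fraction of playouts in which Black owns each point at the end.

    final_positions: non-empty list of equal-length sequences, one per
    playout, where each entry is 'B' or 'W' for that point's final owner.
    """
    n = len(final_positions)
    points = len(final_positions[0])
    return [
        sum(1 for pos in final_positions if pos[p] == "B") / n
        for p in range(points)
    ]

# Two playouts on a hypothetical 3-point board.
print(ownership_map(["BBW", "BWW"]))  # [1.0, 0.5, 0.0]
```

Like AMAF, this is computed from the same playouts, which is the
overlap mentioned above.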

-Magnus


