[computer-go] cgos
Łukasz Lew
lukasz.lew at gmail.com
Mon Sep 18 11:51:54 PDT 2006
On 9/18/06, Don Dailey <drd at mit.edu> wrote:
> CGOS has 2 numbers associated with each player, a rating and an
> uncertainty value. I'm probably doing the same thing "TrueSkill" is
> doing.
If you are referring to K as the uncertainty measure, then there are
plenty of such systems. Some were created ad hoc, and some are based on
statistical analysis, like Elo (although its K factor is added ad hoc).
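To make the point about K concrete, here is a minimal sketch of the standard Elo update. The expected score comes from the logistic model; K only scales the step and is not derived from that model. The function names and the default K = 32 are my assumptions for illustration, not what CGOS uses.

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return the new (r_a, r_b) after one game.

    score_a is 1.0 for a win by A, 0.0 for a loss, 0.5 for a draw.
    k is the ad hoc step size (assumed value); larger k means faster,
    noisier rating changes.
    """
    delta = k * (score_a - elo_expected(r_a, r_b))
    # Zero-sum: whatever A gains, B loses.
    return r_a + delta, r_b - delta
```

Because the update is zero-sum with one K for both players, an underrated newcomer takes points from its opponents while it climbs.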
I recommend TrueSkill because:
- it is heavily tested
- it was developed at Microsoft Research and is used on Xbox Live
- its evaluation of both rating and uncertainty is theoretically supported,
  while in Elo only the rating update is based on a theoretical model
- the TrueSkill site gives explicit equations for rating updates in the
  two-player case, so it should be relatively straightforward to implement.
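As a rough sketch of what the two-player update looks like, here is the win/loss case with no draws and no dynamics term. The names `Rating` and `update_two_player` are hypothetical, and the defaults (mu = 25, sigma = 25/3, beta = 25/6) are the commonly published TrueSkill defaults, assumed here and not tuned for CGOS.

```python
import math
from dataclasses import dataclass

MU0 = 25.0          # assumed default mean
SIGMA0 = MU0 / 3    # assumed default uncertainty
BETA = MU0 / 6      # assumed performance noise per game


@dataclass
class Rating:
    mu: float = MU0
    sigma: float = SIGMA0


def _pdf(t: float) -> float:
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t: float) -> float:
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def update_two_player(winner: Rating, loser: Rating, beta: float = BETA):
    """Return new (winner, loser) ratings after a decisive game.

    Sketch only: no draw margin, no dynamics term, and no guard
    against underflow for absurdly large rating gaps.
    """
    c2 = 2.0 * beta ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # mean correction, large for surprising results
    w = v * (v + t)         # variance correction, between 0 and 1

    def shrink(sigma: float) -> float:
        return math.sqrt(sigma ** 2 * (1.0 - sigma ** 2 / c2 * w))

    # Each player's step is scaled by their own variance, so an
    # uncertain player moves a lot and a well-established one barely moves.
    new_winner = Rating(winner.mu + winner.sigma ** 2 / c * v, shrink(winner.sigma))
    new_loser = Rating(loser.mu - loser.sigma ** 2 / c * v, shrink(loser.sigma))
    return new_winner, new_loser
```

For example, `update_two_player(Rating(), Rating(30.0, 1.0))` moves the new player far more than the established one, which is the property that would keep new strong bots from draining points from their opponents.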
>
> The uncertainty probably changes too fast, and I can improve the early
> rating estimates significantly - I will make those improvements in the
> next CGOS.
>
> I could fix some things now - but I have too much to do and I want to
> focus the time I spend for this on the new CGOS. I probably will make
> the one change you request, to show ALL the matches in the cross-tables.
That would be great for me!
By the way, I would like to support the feature request to send the
opponent's name and version via GTP.
>
> I get a lot of requests, usually by private email to change things and
> people don't realize this. The requests are often conflicting - a lot
> of this is a matter of personal taste and judgment.
>
> The changes I make will improve the rating drift situation. But even
> the current CGOS will eventually correct itself - it's just a little
> sluggish at doing so. This will be improved with better ways of
> getting initial rating estimates in the new CGOS.
I'm afraid that new versions of players may appear too fast for that.
Moreover, the drift itself is not a big problem. The problem is its effect:
ratings of programs that played in different environments are not comparable.
For instance, ZG1bot-MC-100k left just after the deflation started and was
affected only slightly. The stronger ZG1bot-MC-200k played during those happy
deflation times and got a lower rating than the 100k version.
Valkyria UCT3 vs UCT4 is probably experiencing a similar problem.
This way the value of CGOS as a tool for evaluating programs and
motivating programmers is diminishing, especially for strong
programs, where the Anchor has no direct effect.
I vote for fixing the rating of some strong GNU Go version, but CPU is
probably a problem.
Lukasz
>
> - Don
>
>
> On Mon, 2006-09-18 at 15:11 +0200, Łukasz Lew wrote:
> > Another solution is to implement the TrueSkill rating system.
> > The main difference is that it has two numbers per player:
> > its strength and the uncertainty about it.
> > This way MoGo, Valkyria_UCT3/4 etc. would still have a large uncertainty,
> > which would solve both problems: they would climb faster to their
> > "destination rating" and would not drain points from their opponents so
> > badly until they got there.
> > increase their rating faster, and not
> >
> >
>
>