[computer-go] cgos

Łukasz Lew lukasz.lew at gmail.com
Mon Sep 18 11:51:54 PDT 2006


On 9/18/06, Don Dailey <drd at mit.edu> wrote:
> CGOS has 2 numbers associated with each player,  a rating and an
> uncertainty value.   I'm probably doing the same  thing "TrueSkill" is
> doing.

If you are referring to K as the uncertainty measure, then there are
plenty of such systems. Some were created ad hoc, and some are based
on statistical analysis, like Elo (although its K factor is added ad hoc).
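
To make the point about K concrete, here is a minimal sketch of the
standard Elo update for one game. The logistic expected-score formula
with the 400-point scale is the usual Elo model. The function names and
the K values used below are illustrative assumptions, and this is not
a description of how CGOS actually picks K.

```python
def elo_expected(r_a, r_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k):
    """Return the new (r_a, r_b) after one game.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k is the step size; the model itself does not say how to choose it,
    which is the ad hoc part.
    """
    delta = k * (score_a - elo_expected(r_a, r_b))
    return r_a + delta, r_b - delta


# Same game result, two hypothetical K values: a large K for a new
# player (high uncertainty) moves the rating much more than a small K.
new_player = elo_update(1500.0, 1500.0, 1.0, k=32.0)
settled_player = elo_update(1500.0, 1500.0, 1.0, k=8.0)
```

Note that one K is applied to both players here, so an uncertain newcomer
moves its established opponent's rating by the same amount.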

I recommend TrueSkill because:
- it is heavily tested
- it was developed at Microsoft Research and is used in Xbox Live
- the estimates of both rating and uncertainty are theoretically supported,
  while in Elo only the rating update is based on a theoretical model
- the TrueSkill site gives explicit equations for rating updates in the
  two-player case, so it should be relatively straightforward to implement.
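
As a rough illustration of the two-player case, here is a sketch of the
TrueSkill update as I understand the published equations, for a game
with a winner and a loser and no draws. The function name and the
(mu, sigma) tuple layout are my own choices. The default beta = 25/6 is
the commonly quoted default for the scale mu = 25, sigma = 25/3. The
dynamics term (tau) and the draw margin are left out to keep it short,
so treat this as a simplification and not a full implementation.

```python
import math


def _pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def trueskill_update(winner, loser, beta=25.0 / 6.0):
    """Update (mu, sigma) for both players after a decisive game.

    winner and loser are (mu, sigma) tuples; returns the two new tuples.
    Assumes no draws and no dynamics noise (simplifications).
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    # Total variance of the performance difference.
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    # v and w are the mean and variance correction factors.
    # For a huge upset, _cdf(t) can underflow to 0; a real
    # implementation would need to guard against that.
    v = _pdf(t) / _cdf(t)
    w = v * (v + t)
    new_mu_w = mu_w + (sigma_w ** 2 / c) * v
    new_mu_l = mu_l - (sigma_l ** 2 / c) * v
    new_sigma_w = sigma_w * math.sqrt(1.0 - (sigma_w ** 2 / c ** 2) * w)
    new_sigma_l = sigma_l * math.sqrt(1.0 - (sigma_l ** 2 / c ** 2) * w)
    return (new_mu_w, new_sigma_w), (new_mu_l, new_sigma_l)


# A new program (large sigma) beats an established one (small sigma):
# the newcomer's mu moves a lot, the established player's only a little.
newcomer, established = trueskill_update((25.0, 25.0 / 3.0), (30.0, 1.0))
```

Each player's step is scaled by its own variance, so a program with a
large uncertainty moves quickly without dragging its well-established
opponents with it.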



>
> The uncertainty probably changes too fast, and I can improve the early
> rating estimates significantly - I will make those improvements in the
> next CGOS.
>
> I could fix some things now - but I have too much to do and I want to
> focus the time I spend for this on the new CGOS.    I probably will make
> the one change you request, to show ALL the matches in the cross-tables.

That is so great for me!

BTW, I want to support the feature request to send the opponent's name
and version over GTP.


>
> I get a lot of requests, usually by private email to change things and
> people don't realize this.  The requests are often conflicting - a lot
> of this is a matter of personal taste and judgment.
>
> The changes I make will improve the rating drift situation.  But even
> the current CGOS will eventually correct itself - it's just a little
> sluggish at doing so.   This will be improved with better ways of
> getting initial rating estimates in the new CGOS.

I'm afraid that new versions of players may appear too fast for that.

Moreover, the drift itself is not a big problem. The problem is its effect:
ratings of programs that played in different environments are not comparable.

For instance, ZG1bot-MC-100k escaped just after the deflation started
and was affected only slightly. The stronger ZG1bot-MC-200k played
during those happy deflation times and got a lower rating than the
100k version.

Valkyria UCT3 vs UCT4 is probably experiencing a similar problem.
This way the value of CGOS as a tool for evaluating programs and
motivating programmers is diminishing, especially for strong
programs, on which the Anchor has no direct effect.

I vote for fixing the rating of some strong gnugo version, but CPU
is probably a problem.

Lukasz

>
> - Don
>
>
> On Mon, 2006-09-18 at 15:11 +0200, Łukasz Lew wrote:
> > Another solution is to implement TrueSkill rating system.
> > The main difference is that it has two numbers per player
> > - it's strength and uncertainty about it.
> > This way MoGo, Valkyria_UCT3/4 etc would have still large uncertainty
> > what would solve both problems: would grow faster to the "destination
> > rating" and would not drain points from their opponents so badly until
> > they would get to the destination.
> > increase their rating faster, and not
> >
> >
>
>

