[Computer-go] Practical significance?
dailey.don at gmail.com
Mon Nov 26 15:09:59 PST 2012
On Mon, Nov 26, 2012 at 3:46 PM, Mark Boon <tesujisoftware at gmail.com> wrote:
> I would imagine that "practical significance" depends on the absolute
> level. Between two beginners 23% means little as it can be overtaken
> by a day's study. Between top-professionals it probably means the
> difference between a legendary 9p winning many top-title tournaments
> and a 9p who never wins a top title in his life.
True, but you are making a statement about the stability of the rating or
strength of the player. I am assuming a reliable and stable rating
difference, not an ELO guess. You would not be able to claim a practical
superiority over anyone if you only played 5 or 10 games in your life as a
But I do get your point - it's a matter of perception. In the case of a
long-time pro the superiority is stable, but in the case of a beginner, even
if the superiority is the same, it is much more subject to change over time.
Another way to see this is that if you are a 40-year-old 2 dan player
and you have even chances against an 8-year-old prodigy, he is already
going to be perceived as the superior player, because he surely will be
within a few weeks or months.
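
On the question in the original post of whether ELO ratings would help: a
rough sketch of the conversion is below. It assumes the standard logistic ELO
formula (a 400-point scale) and assumes transitivity, so that two systems
measured against the same opponent B can be compared by subtracting their
implied differences. The function names are made up for illustration, and 99%
is used in place of 100% because a perfect score implies an infinite rating
difference.

```python
import math


def elo_diff(win_rate: float) -> float:
    """Elo difference implied by a win rate, under the logistic Elo model."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def elo_gap_via_common_opponent(rate_a1: float, rate_a2: float) -> float:
    """Implied Elo gap between A1 and A2, both measured against the same B.

    This is only meaningful if skill is (nearly) transitive.
    """
    return elo_diff(rate_a1) - elo_diff(rate_a2)


# Hypothetical pairs of win rates against B, not measured data.
for low, high in [(0.30, 0.35), (0.30, 0.53), (0.95, 0.99)]:
    gap = elo_gap_via_common_opponent(high, low)
    print(f"{low:.0%} -> {high:.0%}: {gap:6.1f} Elo")
```

Under these assumptions, going from 30% to 35% is worth about 40 ELO, while
going from 95% to 99% is worth close to 290, which is one way to put a number
on the intuition that equal percentage differences are not equally large.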
> On Mon, Nov 26, 2012 at 4:13 AM, Don Dailey <dailey.don at gmail.com> wrote:
> > On Mon, Nov 26, 2012 at 4:05 AM, "Ingo Althöfer" <3-Hirn-Verlag at gmx.de>
> > wrote:
> >> One general comment:
> >> Ratings are not transitive. For instance,
> >> A1 may score 25 % against B,
> >> and A2 may score 22 % against B.
> >> Then it can not be concluded that A1 will score more than 50 %
> >> in direct duel with A2.
> >> It is rather easy to construct triples of "semi-simple" agents A, B, C
> >> for some "normal" game where
> >> A scores 95+ percent against B,
> >> B scores 95+ percent against C,
> >> C scores 95+ percent against A.
> > Hi Ingo,
> > The ELO system, which tries to model game playing skill mathematically,
> > makes some assumptions that are not completely true, but are approximations
> > to the reality. One assumption made by the ELO system is that skill IS
> > transitive. It works quite well because in practice human skill and
> > program skill is nearly transitive. So it has proven to be a very
> > model indeed.
> > As you say it is not difficult to artificially construct classes of
> > players who do not have transitive relationships between each other. One
> > very simple way to do this is to take 3 equal players, and give them each a
> > different opening book such that the book will get them quickly into
> > losing or winning situations against each other. You can create your own
> > "rock/paper/scissors" non-transitive relationship this way.
> > You can also do it with the playing algorithm, but it's a bit more
> > difficult, though certainly possible. You give one program a serious
> > weakness that one of the other 2 can easily exploit but that the other
> > program cannot - so each program has a unique exploitable weakness that
> > only one of the other 2 programs can exploit.
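
A small sketch of why such a triple cannot be fitted by any single-number
rating. The 95% figures are the hypothetical ones from Ingo's example, and the
conversion again assumes the standard logistic ELO formula. In a transitive
model the three pairwise rating differences around the cycle must sum to zero,
since (Ra - Rb) + (Rb - Rc) + (Rc - Ra) = 0.

```python
import math


def elo_diff(win_rate: float) -> float:
    """Elo difference implied by a win rate, under the logistic Elo model."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# Hypothetical rock/paper/scissors triple: P(first player beats second).
win_prob = {("A", "B"): 0.95, ("B", "C"): 0.95, ("C", "A"): 0.95}

# Rating difference each pairing would imply if taken on its own.
implied = {pair: elo_diff(p) for pair, p in win_prob.items()}

# A transitive rating system needs this sum to be zero.
cycle_sum = sum(implied.values())

for (winner, loser), diff in implied.items():
    print(f"{winner} over {loser}: {diff:+.1f} Elo")
print(f"sum around the cycle: {cycle_sum:+.1f} (a transitive model needs 0)")
```

Each pairing on its own implies a gap of roughly 500 ELO, so the sum around
the cycle is far from zero and no assignment of three ratings can reproduce
all three results.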
> > Don
> >> Ingo.
> >> -------- Original Message --------
> >> > Date: Sun, 25 Nov 2012 17:03:33 -0800
> >> > From: Leandro Marcolino <sorianom at usc.edu>
> >> > To: computer-go at dvandva.org
> >> > Subject: [Computer-go] Practical significance?
> >> > Hello all!..
> >> >
> >> > I am currently doing research on Computer Go. I can't give the details
> >> > yet, but I will post them here after (if) my paper is accepted...
> >> >
> >> > In my research I compare many systems (An), playing against a fixed
> >> > strong adversary (B). So A1 would have a percentage of victory x1
> >> > against B, while A2 would have a percentage of victory x2, etc... Then
> >> > I compare the percentages of victory, and for most cases I can show
> >> > that one system is better than another with 95% confidence. However,
> >> > my adviser is asking me about not only the STATISTICAL significance of
> >> > the results, but also their PRACTICAL significance. I mean, if one
> >> > system is, for example, only 1% better than another, with 99%
> >> > confidence, the result would have statistical significance, but
> >> > wouldn't really matter in a practical sense.
> >> >
> >> > In my case, the difference between the systems can range from about 4%
> >> > to about 23%. That doesn't seem to be enough to argue that one system
> >> > would be one handicap stone better than another. But what would be the
> >> > minimum difference for me to argue that one system is significantly
> >> > better than another, in a practical sense? (Or are they not, in the
> >> > end?..) Would calculating ELO ratings help me in answering this
> >> > question?
> >> >
> >> > I think it gets even more complex if we think that, let's say, changing
> >> > the percentage of victory from 95% to 100% seems to be much more
> >> > significant (in a practical sense) than changing from 30% to 35%, even
> >> > though the difference between the two systems is still only 5%. In my
> >> > case, I am dealing with percentages of victory that range from around
> >> > 30% to around 53%.
> >> >
> >> > What do you guys think?..
> >> >
> >> > Thanks for your help!..
> >> >
> >> > Regards,
> >> > Leandro
> >> _______________________________________________
> >> Computer-go mailing list
> >> Computer-go at dvandva.org
> >> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go