[computer-go] Odd results on 19x19

Rémi Coulom Remi.Coulom at univ-lille3.fr
Sun Jan 6 10:24:41 PST 2008


David Fotland wrote:
> The styles of CS (CS-9-17-10k-1CPU), MFGO (mfgo12exp-15), and GNUGO
> (gnugo3.7.10_10) are different, and it's generating some odd results.
>
> Many Faces beats GnuGo 70%.  There are not many games, but this is
> consistent with over 100 test games I've run.
> CS beats GnuGo 55%.  Over 100 games played.
> CS beats Many Faces 90%.  Only 20 games, but consistent with earlier
> results.
>
> If we look at results against GnuGo, Many Faces seems stronger than CS, but
> in games against CS, Many Faces is much weaker.
>
> Many Faces plays a fighting style, and CS plays a territorial style, but I'm
> still surprised at the difference.
>
> David

I noticed that too. My feeling is that this is because MF is a classical 
program with a global search, GNU a classical program with no global 
search, and Crazy Stone a MC program. MF beats GNU thanks to global 
search. But MF's strength without the global search (whatever that would 
mean) is inferior to that of GNU. CS also has a global search, so MF's 
global-search advantage does not work against CS.
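
A rough check of how far these three results are from any single rating 
scale, assuming the standard logistic Elo formula with its 400-point scale 
(the function names below are only illustrative). It converts the two 
win rates against GNU into rating differences and asks what CS should 
then score against MF:

```python
import math

def elo_diff(p):
    """Rating difference implied by win rate p under the logistic Elo model."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_win_prob(diff):
    """Win probability implied by a rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Win rates quoted in David's message.
mf_minus_gnu = elo_diff(0.70)   # MF beats GNU 70%: about +147 Elo
cs_minus_gnu = elo_diff(0.55)   # CS beats GNU 55%: about +35 Elo

# On a one-dimensional scale, differences must add up.
cs_minus_mf = cs_minus_gnu - mf_minus_gnu   # about -112 Elo
predicted_cs_vs_mf = elo_win_prob(cs_minus_mf)

print(f"predicted CS vs MF: {predicted_cs_vs_mf:.2f}, observed: 0.90")
```

Under that assumption CS would be expected to win about 34% of its games 
against MF, not 90%. With only 20 games the 90% figure is noisy, but the 
gap is large enough that a single number per program fits these results 
poorly.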

I guess that KCC Igo had the same problem as MF against Crazy Stone.

I thought about a model for multi-dimensional Elo ratings once (don't 
give only one value to each player, but two or three, with an 
appropriate formula for predicting game outcome). Maybe I'll try it on 
CGOS data when I have time. This would not rate players along a 
one-dimensional line. Here is a reference to a similar idea:

http://dx.doi.org/10.1016/j.jspi.2004.05.008


Abstract:

The Bradley–Terry model is widely and often beneficially used to rank 
objects from paired comparisons. The underlying assumption that makes 
ranking possible is the existence of a latent linear scale of merit or 
equivalently of a kind of transitiveness of the preference. However, in 
some situations such as sensory comparisons of products, this assumption 
can be unrealistic. In these contexts, although the Bradley–Terry model 
appears to be significantly interesting, the linear ranking does not 
make sense. Our aim is to propose a 2-dimensional extension of the 
Bradley–Terry model that accounts for interactions between the compared 
objects. From a methodological point of view, this proposition can be 
seen as a multidimensional scaling approach in the context of a logistic 
model for binomial data. Maximum likelihood is investigated and 
asymptotic properties are derived in order to construct confidence 
ellipses on the diagram of the 2-dimensional scores. It is shown by an 
illustrative example based on real sensory data on how to use the 
2-dimensional model to inspect the lack-of-fit of the Bradley–Terry model.
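
To make the idea concrete, here is a minimal sketch of one possible 
multi-dimensional rating. It is an assumption chosen for illustration, 
not the formula from the paper above and not a tested model: each 
program gets a scalar strength plus a 2-D "style" vector, and the style 
vectors interact through an antisymmetric cross product, so that 
P(a beats b) + P(b beats a) = 1 still holds. The parameters are picked 
by hand so that the model reproduces the three win rates from David's 
message exactly; a real fit would estimate them by maximum likelihood 
from game records.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def win_prob(a, b):
    """P(a beats b); each player is (strength, (x, y)). Hypothetical model."""
    (ra, (ax, ay)), (rb, (bx, by)) = a, b
    cross = ax * by - ay * bx          # antisymmetric style interaction
    return 1.0 / (1.0 + math.exp(-(ra - rb + cross)))

# Hand-picked parameters (not fitted). GNU is the origin of the scale.
gnu = (0.0, (0.0, 0.0))
mf = (logit(0.70), (0.0, 1.0))
# Style term needed so that CS beats MF 90% despite its lower strength.
cs_style = logit(0.90) - (logit(0.55) - logit(0.70))
cs = (logit(0.55), (cs_style, 0.0))

for name, a, b in [("MF vs GNU", mf, gnu), ("CS vs GNU", cs, gnu), ("CS vs MF", cs, mf)]:
    print(f"{name}: {win_prob(a, b):.2f}")
```

Three programs and three pairings can always be matched this way, so this 
only shows that the extra dimensions can express a non-transitive 
result; it says nothing about whether the model would predict well on 
real data.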

Rémi


