Hi.
In my master's thesis I have to compare two clusterings. There are a
number of measures one can use to do this, the most common probably
being the Rand index. The major problem I have with the Rand index is
that I don't know what critical value I should compare it to. Some
people suggest 0.7 as a rule of thumb without any theoretical
explanation.
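For reference, by Rand index I mean the fraction of pairs of items on
which the two clusterings agree (both put the pair in the same cluster,
or both put it in different clusters). Below is a minimal Python sketch
of how I understand it, assuming each clustering is given as a list of
labels of equal length; the function name rand_index is only
illustrative, not taken from any library.

```python
from collections import Counter
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which the two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        raise ValueError("need at least two items")

    # Pairs placed together by both clusterings (cells of the contingency table).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together by each clustering on its own.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Pairs separated by both clusterings, by inclusion-exclusion.
    apart_both = total_pairs - together_a - together_b + together_both

    return (together_both + apart_both) / total_pairs


if __name__ == "__main__":
    # Same partition with renamed labels: every pair agrees.
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # "Crossed" partitions: only 2 of the 6 pairs agree.
    print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The value lies between 0 and 1 and does not depend on how the clusters
are labelled, which is why I would like a principled threshold for it.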
I found a paper
http://www.quantlet.de/scripts/compstat2002_wh/paper/full/C_06_saporta.pdf
which claims that the Rand index is asymptotically normal and gives a
formula for its variance, but this produces weird results in my case (a
sample of size 2000, 75 classes).
Can somebody give me some tips on the subject?
--
Alexander Sirotkin