No, you do not have a metric. You have no idea how each subject has
mentally scaled the values between a 1 and a 9. You cannot construct a
Euclidean distance measure from these values. Whatever you decide to do,
you may be disappointed in the results. It is possible for a subject to
rate Concepts 1 and 2 as very similar; Concepts 2 and 3 as very similar; and
Concepts 1 and 3 as not very similar.
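One way to see how often a subject does this is to count the triples of concepts whose ratings break the triangle inequality. The following is only a rough screening sketch in Python. It assumes a coding of 9 = most similar, it treats the ratings as plain numbers (which the ordinal scale does not really justify), and the names `ratings` and `triangle_violations` are made up for illustration.

```python
from itertools import combinations

def triangle_violations(dissim, concepts):
    """Return the triples (a, b, c) whose longest 'side' exceeds the sum of the other two."""
    def d(x, y):
        # Pairs may be stored in either order.
        return dissim[(x, y)] if (x, y) in dissim else dissim[(y, x)]

    bad = []
    for a, b, c in combinations(concepts, 3):
        sides = sorted([d(a, b), d(b, c), d(a, c)])
        if sides[2] > sides[0] + sides[1]:
            bad.append((a, b, c))
    return bad

# Hypothetical ratings from ONE subject; assumed coding: 9 = most similar.
ratings = {(1, 2): 9, (2, 3): 9, (1, 3): 1}

# Flip the scale so that 0 = most similar and 8 = least similar.
dissim = {pair: 9 - r for pair, r in ratings.items()}

violations = triangle_violations(dissim, [1, 2, 3])
print(violations)
```

A subject with many such triples is a warning sign that no low-dimensional Euclidean picture will fit that subject's ratings well.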
If you construct a frequency table for each of your k(k-1)/2 = 435 possible
pairings, with the values 1-9 as one dimension and the 435 pairs as the
other, with entries the number of responses for each value 1-9 for each
pair, you can quickly visually scan and find strange pairings.
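Such a table could be tallied along the following lines. This is a minimal Python sketch, not a recipe for SPSS. It assumes each subject's responses are held in a dict keyed by concept pair (i, j) with i < j, and the function name and the tiny data set are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_frequency_table(responses, k, scale=range(1, 10)):
    """responses: one dict per subject, mapping (i, j) with i < j to a rating.
    Returns {pair: [count of rating 1, ..., count of rating 9]}."""
    pairs = list(combinations(range(1, k + 1), 2))  # k(k-1)/2 pairs
    counts = {pair: Counter() for pair in pairs}
    for subject in responses:
        for pair, rating in subject.items():
            counts[pair][rating] += 1
    return {pair: [counts[pair][v] for v in scale] for pair in pairs}

# Made-up example: 3 subjects, k = 3 concepts, hence 3 pairs.
responses = [
    {(1, 2): 9, (1, 3): 2, (2, 3): 8},
    {(1, 2): 8, (1, 3): 1, (2, 3): 9},
    {(1, 2): 1, (1, 3): 2, (2, 3): 9},
]

table = pair_frequency_table(responses, k=3)
for pair, row in table.items():
    print(pair, row)  # one row per pair, one column per rating value
```

With k = 30 the table has 435 rows. A row whose counts pile up at both ends of the scale, like pair (1, 2) in the example, marks a pairing on which the subjects disagree.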
You may wish either to rethink your methodology or to consider some
non-Euclidean form of multidimensional scaling (see Kruskal's work).
You are in the U.K. You should have easy access to the Clustan clustering
programs; also see the online StatSoft textbook. I am not familiar with the
SPSS or SAS implementations.
Be careful in defining your proximities as "similarities" (9=most similar
and 1=least similar) or "dissimilarities" (9=least similar and 1=most
similar). No, it is not O.K. to allow your computer program to construct
another similarity matrix. Your subjects have already done so. Your
proximity matrix should be a k x k concepts matrix, not an N x N subjects matrix
(you are clustering the concepts, not the people; the people are the
"variables" in this instance).
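As a rough illustration of the shape such a matrix takes, the Python sketch below collapses the subjects' ratings into one symmetric k x k dissimilarity matrix over the concepts. Two things here are assumptions of mine and not something stated above: the coding 9 = most similar, and the use of the median across subjects as the summary. The function name and data are hypothetical.

```python
from statistics import median

def concept_dissimilarity_matrix(responses, k, max_rating=9):
    """Build a symmetric k x k dissimilarity matrix over concepts.

    responses: one dict per subject, mapping (i, j) with i < j to a rating,
    where a higher rating means more similar (assumed coding).
    """
    m = [[0.0] * k for _ in range(k)]  # zero diagonal: a concept vs. itself
    for i in range(1, k + 1):
        for j in range(i + 1, k + 1):
            vals = [s[(i, j)] for s in responses if (i, j) in s]
            if not vals:
                raise ValueError(f"no ratings for pair {(i, j)}")
            # Median across subjects (an assumed choice), flipped to a dissimilarity.
            dis = float(max_rating - median(vals))
            m[i - 1][j - 1] = m[j - 1][i - 1] = dis
    return m

# Made-up example: 3 subjects, k = 3 concepts.
responses = [
    {(1, 2): 9, (1, 3): 2, (2, 3): 8},
    {(1, 2): 8, (1, 3): 1, (2, 3): 9},
    {(1, 2): 1, (1, 3): 2, (2, 3): 9},
]

matrix = concept_dissimilarity_matrix(responses, k=3)
for row in matrix:
    print(row)
```

The rows and columns are concepts, so for 30 concepts the result is 30 x 30 however many subjects took part. A matrix of this shape is what a clustering or scaling program should read as its proximities.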
>From: Ufuk Yildirim <[log in to unmask]>
>Reply-To: "Classification, clustering, and phylogeny estimation"
> <[log in to unmask]>
>To: [log in to unmask]
>Subject: Help on HCA and MDS
>Date: Tue, 10 Jun 2003 11:57:57 +0100
>
>Hi everyone,
>I have a couple of questions on (hierarchical) cluster analysis and
>Multidimensional scaling. As part of my research, I collected data using a
>method called 'similarity rating' on a scale of 1 to 9. There are 30
>variables (30 concepts from physics to be exact). I want to find out how
>people organise these concepts. The software I am using is SPSS 11, because
>SPSS is the only one I know how to use and one of the two statistical
>packages available in university computers (I think the other one is SAS).
>I should add that I am not very familiar with the theoretical background of
>these analyses, though trying my best to get as much information as I
>can/need. For example, I have been reading a lot lately on MDS and HCA, but
>I still do not know what the basic assumptions are for MDS and HCA. I need
>to find a good book which explains things conceptually, with little
>mathematical notation.
>Now my real problem, as I enter the data in SPSS, I use the subjects'
>ratings of the pairwise similarities for the 30 concepts. I want to know
>which of these is the appropriate statistical analysis for my analysis. I
>am confused with the metric/nonmetric distinction. My data is nonmetric I
>think. Can I use HCA with nonmetric data? If I can, and if HCA is
>appropriate, what is the best method? Ward's? Between-groups linkage? or
>within-groups linkage? etc. Since my original data is already a proximity
>matrix (or at least I think it is), what HCA is doing seems to be wrong. It
>tries to create a proximity matrix again. Is this ok? When I run the analysis
>as it is, it seems fine, but when I change the syntax so that it uses the
>original data matrix in /MATRIX IN ('filename.sav'), a totally different
>clustering is produced. Which one is correct? Is there a clearly written
>book on multivariate analysis using SPSS?
>
>For MDS, I have a similar problem. What are the things I need to do to get a
>clear picture of how people organise these 30 concepts? Because the stress
>value with low dimensions is quite low, I have to increase the number of
>dimensions. By the way, in the SPSS results there are a lot of stress values:
>normalized raw stress, Stress-I, Stress-II and S-Stress. Which of these
>should I use to interpret my results? Also, what are "Dispersion Accounted
>For (D.A.F.)" and "Tucker's Coefficient of Congruence" used for? What is
>the difference between Simplex and Torgerson in initial configuration
>options?
>
>I know this is a lot, but as I mentioned earlier there isn't any book on
>multivariate statistics using SPSS as far as I know. Many books on
>multivariate statistics explain things to make life more difficult. If you
>could help me, I would be very happy.
>
>Thank you very much for your interest and help in advance.
>
>Sincerely,
>
>Ufuk YILDIRIM
