CLASS-L Archives

September 2008

CLASS-L@LISTS.SUNYSB.EDU

Subject:
From:
Stephen McTaggart <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Thu, 4 Sep 2008 15:08:19 +1200
Hi, I am using correspondence analysis to examine degrees of homogamy/social distance in society, using the occupations of husbands and wives as markers of social position. I have done this at five historical points using New Zealand census data (1981-2001). I'm using the dimension scores (1 and 2) from the CA to obtain a ranked scale of homogamy/social interaction. The order of the ranking is expected to be similar to the ranking of occupations in the issco model. This is indeed the case for three of the time periods. At two points in time, however, the ranking is inverted. Has anyone got tips on how to explain this or switch it around? I believe that the 'best fit model' in correspondence analysis can be a little nebulous. Greenacre talks about 'rotating the axis.' Will this work, and how might I do this in SAS?
Any help will be useful.
Cheers, Stephen
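
A note on the inversion described above: the sign of each correspondence analysis dimension is arbitrary, so multiplying all scores on an axis by -1 (a reflection, which is simpler than a general rotation) leaves the distances between points and the explained inertia unchanged. The sketch below is only an illustration in Python rather than SAS. The scores, the reference ranking, and the function names are hypothetical and are not taken from the census data. It flips an axis whenever its scores run opposite to an assumed reference ranking.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def align_axis(scores, reference):
    """Reflect the scores (multiply by -1) if they run opposite to the reference."""
    if covariance(scores, reference) < 0:
        return [-s for s in scores]
    return list(scores)


# Hypothetical dimension-1 scores for five occupation groups (illustration only).
reference_rank = [1, 2, 3, 4, 5]             # assumed external occupational ranking
scores_1981 = [-0.9, -0.4, 0.1, 0.5, 0.8]    # already runs in the reference direction
scores_1991 = [0.8, 0.5, -0.1, -0.3, -0.9]   # inverted axis

aligned_1981 = align_axis(scores_1981, reference_rank)
aligned_1991 = align_axis(scores_1991, reference_rank)
```

The same reflection can be applied in any package by multiplying the saved dimension scores by -1 for the affected years. If the ordering differs in more than direction, a reflection alone will not reconcile it.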

________________________________
From: Classification, clustering, and phylogeny estimation [mailto:[log in to unmask]] On Behalf Of Liza Rovniak
Sent: Thursday, 4 September 2008 10:40 a.m.
To: [log in to unmask]
Subject: cluster analysis validation technique

Hi,

I am hoping someone here can help me with a "how to" question on running McIntyre and Blashfield's (1980) nearest-centroid evaluation procedure to validate the stability of my cluster analysis solution. I am a newbie to cluster analysis, so this is my first time running this procedure.

I have a sample of about 900 observations and have randomly split the sample in two (Sample A and Sample B). I conducted hierarchical cluster analysis and then calculated the centroid vectors for a 3-cluster solution on each of these two subsamples (i.e., steps 1 through 4 of McIntyre and Blashfield's evaluation technique).

Step 5 of McIntyre and Blashfield's technique is to calculate "the squared Euclidean distance for each of Sample B's objects from each of the centroids of Sample A," and Step 6 is to assign "each object in Sample B to the closest centroid vector." At this point, I am not sure what buttons to press in SPSS to complete the analysis. One possibility I tried is to use K-means cluster analysis to achieve these two steps, but K-means uses simple Euclidean distance (not squared Euclidean distance, as recommended by McIntyre and Blashfield) to assign the observations to clusters. Is this okay? (Someone told me it was, but I just want to double-check.) I would greatly appreciate any guidance on what buttons to press in SPSS, or appropriate syntax, to complete steps 5 and 6 of this analysis.
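
Steps 5 and 6 as quoted above can be sketched outside SPSS as follows. This is a minimal Python illustration, not SPSS syntax. The centroids, the Sample B observations, and the function names are made up for the example. It also checks that plain Euclidean distance picks the same nearest centroid as squared Euclidean distance, since squaring a non-negative distance does not change which one is smallest.

```python
import math


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_nearest(objects, centroids, distance=squared_euclidean):
    """Step 5: distance from each object to each centroid.
    Step 6: label each object with the index of its closest centroid."""
    labels = []
    for obj in objects:
        distances = [distance(obj, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical Sample A centroids for a 3-cluster solution (two variables).
centroids_a = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
# Hypothetical Sample B observations.
sample_b = [[0.5, 1.0], [4.0, 6.0], [9.0, -1.0], [7.0, 1.0]]

labels = assign_to_nearest(sample_b, centroids_a)


def euclidean(a, b):
    """Plain Euclidean distance, used only to compare assignments."""
    return math.sqrt(squared_euclidean(a, b))


labels_plain = assign_to_nearest(sample_b, centroids_a, distance=euclidean)
```

In this sketch `labels` and `labels_plain` agree. In a K-means routine, the assignment matches only if the Sample A centroids are held fixed and not updated during the run.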

Thank you.

Liza Rovniak

Liza S. Rovniak, PhD, MPH
Adjunct Assistant Professor
Center for Behavioral Epidemiology & Community Health
Graduate School of Public Health, San Diego State University
San Diego, CA 92123
Phone: 858-505-4770, ext. 152; Fax: 858-505-8614
Email: [log in to unmask]

----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l
