Hi,
I am hoping someone here can help me with a “how to”
question on running McIntyre and Blashfield’s (1980) nearest-centroid
evaluation procedure to validate the stability of my cluster analysis solution.
I am a newbie to cluster analysis, so this is my first time running this
procedure.
I have a sample of about 900 observations and have
randomly split the sample in two (Sample A and Sample B). I conducted
hierarchical cluster analysis and then calculated the centroid vectors for a
3-cluster solution on each of these two subsamples (i.e., steps 1 through 4 of
McIntyre and Blashfield’s evaluation technique).
Step 5 of McIntyre and Blashfield’s technique is to
calculate “the squared Euclidean distance for each of Sample B’s
objects from each of the centroids of Sample A,” and Step 6 is to assign
“each object in Sample B to the closest centroid vector.” At
this point, I am not sure what buttons to press in SPSS to complete the
analysis. One possibility I tried is to use K-means cluster analysis to achieve
these two steps, but K-means uses simple Euclidean distance (not squared
Euclidean distance as recommended by McIntyre and Blashfield) to assign the
observations to clusters. Is this okay? (Someone told me it was, but I just
want to double-check.) I would greatly appreciate any guidance on what
buttons to press in SPSS/appropriate syntax to complete steps 5 and 6 of this
analysis.
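To make steps 5 and 6 concrete, here is a minimal sketch in plain Python (not SPSS syntax) of what the two steps compute. The function names, the two-variable centroids, and the Sample B values are all hypothetical and only for illustration. It runs a single assignment pass with no centroid updating, and it also checks the point behind my question: because squaring is monotonic for non-negative distances, the closest centroid should be the same whether simple or squared Euclidean distance is used.

```python
import math


def squared_euclidean(x, c):
    """Squared Euclidean distance between an object x and a centroid c."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def assign_to_nearest_centroid(objects, centroids, distance=squared_euclidean):
    """Steps 5 and 6: compute each object's distance to every centroid,
    then return the index of the closest centroid for each object."""
    assignments = []
    for obj in objects:
        distances = [distance(obj, c) for c in centroids]
        assignments.append(distances.index(min(distances)))
    return assignments


# Hypothetical centroid vectors from the 3-cluster solution on Sample A
centroids_a = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]

# Hypothetical objects from Sample B (same variables, same scaling)
sample_b = [[1.0, 1.0], [4.0, 6.0], [9.0, -1.0], [6.0, 2.0]]

# Assignment using squared Euclidean distance
labels_squared = assign_to_nearest_centroid(sample_b, centroids_a)

# Assignment using simple (unsquared) Euclidean distance
labels_simple = assign_to_nearest_centroid(
    sample_b,
    centroids_a,
    distance=lambda x, c: math.sqrt(squared_euclidean(x, c)),
)

print(labels_squared)
print(labels_simple)
```

In this sketch the two label lists come out identical, which is what I would expect in general. The resulting assignments for Sample B could then be compared with Sample B's own cluster solution.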
Thank you.
Liza Rovniak
Liza S. Rovniak, PhD, MPH
Adjunct Assistant Professor
Center for Behavioral Epidemiology & Community Health
Graduate School of Public Health, San Diego State University
San Diego, CA 92123
Phone: 858-505-4770, ext. 152; Fax: 858-505-8614