CLASS-L Archives

October 2000


"Classification, clustering, and phylogeny estimation" <[log in to unmask]>
Clare Guse <[log in to unmask]>
Fri, 13 Oct 2000 14:08:50 -0500
Hi all,

Thanks to everyone who responded to my query.

For anyone who may be interested, I will try to summarize the responses I
have received to my original message (shown below).  I make no assertions as
to the appropriateness of any of these suggestions.  Here is some
additional background on the study:  males and a small number of females
were asked to rate, on a 5-point scale (1 = never, 5 = always), their
responses to a partner using violence against them.  There are three
behavioral responses (stop their aggression, increase their aggression,
laugh at partner's effort) and five feelings (angry, afraid, amused,
insulted, and threatened).  The most highly correlated pair is laugh and
amused.  The sample size is 61.

The reason I'm concerned about correlation is that Aldenderfer and
Blashfield (1984) state that using highly correlated variables amounts to
implicitly weighting them.  However, they don't define "high" correlation.
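
To illustrate what implicit weighting means, here is a minimal sketch of the
extreme case, where one variable is an exact copy of another (correlation of
1).  The two subjects and their scores are made up, and squared Euclidean
distance is assumed as the dissimilarity measure; with a correlation such as
0.696 the effect would be only partial.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Two made-up subjects rated on two variables, x and y.
subject_1 = [1, 2]
subject_2 = [3, 3]
base = squared_euclidean(subject_1, subject_2)  # (1-3)^2 + (2-3)^2

# Add a third variable that is an exact copy of x (correlation of 1 with x).
with_copy = squared_euclidean(subject_1 + [1], subject_2 + [3])

# The same value results from keeping two variables and giving x a weight of 2.
weighted = 2 * (1 - 3) ** 2 + (2 - 3) ** 2

print(base, with_copy, weighted)  # prints: 5 9 9
```

In other words, including the copy has the same effect on the distance as
counting x twice.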

The suggestions I received were:

1) use Mahalanobis distance
2) do principal components analysis first
3) drop one of the correlated pair
4) replace the most highly correlated pair with their average
5) it's not a problem
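
Suggestions 3 and 4 could be sketched roughly as follows.  This is only an
illustration: the ratings are made up for six hypothetical subjects (not the
study data), and the helper names (pearson, most_correlated_pair,
drop_variable, average_pair) are invented for this example.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def most_correlated_pair(data):
    """Return (name_a, name_b, r) for the pair with the largest |r|."""
    return max(
        ((a, b, pearson(data[a], data[b])) for a, b in combinations(data, 2)),
        key=lambda t: abs(t[2]),
    )

def drop_variable(data, name):
    """Suggestion 3: remove one variable of the correlated pair."""
    return {k: v for k, v in data.items() if k != name}

def average_pair(data, a, b, new_name):
    """Suggestion 4: replace the pair with the average of the two ratings."""
    merged = {k: v for k, v in data.items() if k not in (a, b)}
    merged[new_name] = [(x + y) / 2 for x, y in zip(data[a], data[b])]
    return merged

# Made-up 1-5 ratings for six hypothetical subjects.
ratings = {
    "laugh":  [1, 2, 4, 5, 3, 1],
    "amused": [1, 3, 4, 5, 3, 2],
    "angry":  [5, 1, 3, 2, 4, 3],
    "afraid": [2, 4, 1, 3, 5, 2],
}

a, b, r = most_correlated_pair(ratings)
dropped = drop_variable(ratings, b)
averaged = average_pair(ratings, a, b, a + "_" + b)
print(a, b, round(r, 3))
print(sorted(dropped), sorted(averaged))
```

Each of the resulting data sets could then be clustered separately and the
cluster memberships compared with those obtained from all of the variables.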

I think that I may end up doing 3 and/or 4 and comparing the results to
including all the variables.  I'm reluctant to use principal components
since I'm not familiar with the technique and it would seem to complicate
final interpretation.  My reading of Aldenderfer and Blashfield (1984)
would suggest that using Mahalanobis distance would be a good way to handle
this situation, but unfortunately I don't have that option in the
statistical software that I am using.
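
One possible workaround, assuming the software clusters on ordinary
Euclidean distance and accepts transformed columns as input: transform the
data first so that Euclidean distance on the new columns equals Mahalanobis
distance on the original ones.  The sketch below does this with a Cholesky
factor of the sample covariance matrix.  The data are made up and the
function names are hypothetical; it also assumes the covariance matrix is
positive definite (no variable is an exact linear combination of the
others).

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def cholesky(S):
    """Lower-triangular L with L * L^T = S (S must be positive definite)."""
    p = len(S)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = sqrt(s) if i == j else s / L[j][j]
    return L

def whiten(rows):
    """Map each row x to z = L^-1 x by forward substitution."""
    L = cholesky(covariance_matrix(rows))
    out = []
    for r in rows:
        z = []
        for i in range(len(r)):
            z.append((r[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
        out.append(z)
    return out

def euclidean(a, b):
    """Ordinary Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up scores on two correlated variables for five hypothetical subjects.
rows = [[1, 1], [2, 3], [4, 4], [5, 4], [3, 2]]
Z = whiten(rows)

# Euclidean distance on the transformed rows; this equals the Mahalanobis
# distance between the first two original rows.
d_first_two = euclidean(Z[0], Z[1])
print(d_first_two)
```

The transformed columns in Z have unit variance and zero correlation with
each other, so they no longer correspond one-to-one to the original
variables, which may complicate interpretation in much the same way as
principal components.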


> I am beginning to perform a cluster analysis with 7 variables reflecting
> a subject's behaviors and feelings in reaction to a partner's use of
> violence against them.  However, some of these variables are correlated
> (correlations range from 0.010 to 0.696) and I'm not sure how to handle
> this situation.  What level of correlation is a problem?  Should one of
> the pair of correlated variables be removed from consideration and, if so,
> how does one choose?

Clare Guse, MS
Dept. of Family & Community Medicine
Division of Research
Center for Practice-Based Research (CPBR)
Medical College of Wisconsin
[log in to unmask]
[log in to unmask]
414-456-6522  (FAX)