CLASS-L Archives

September 2003


From: Bob Green
Reply To: Classification, clustering, and phylogeny estimation
Date: Sun, 21 Sep 2003 07:37:17 +1000

I am interested in whether pooling data from the same individuals into a
single variable, which would violate the assumption of independent
observations in multiple regression, is also problematic in cluster
analysis.

Briefly, I have data collected at baseline and 4 follow-up time points
asking whether someone smoked and the reasons why. At each time point an
individual might give 1-3 responses, which could range from a single word
to a sentence. These open-ended responses have been coded. There are
therefore up to 15 coded responses per individual (5 time periods x up to
3 responses).

I have received advice that it is acceptable to pool these data into 1
variable, and I have run the analysis using the cluster option in a content
analysis software program. The results were interpretable and made sense
(the analysis used the default options: a similarity matrix, average
linkage and the Jaccard coefficient). However, my reading and enquiries to
date have not turned up substantive support for this approach. Any advice
or references in relation to this question would be appreciated.
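
To make the procedure concrete, here is a minimal Python sketch of what
pooling followed by Jaccard / average-linkage clustering might look like.
It is an illustration only, not the software's actual algorithm: it assumes
the program clusters the codes by how often they co-occur across
individuals, which is not stated above. The respondent IDs, the reason
codes ("stress", "habit", "social", "taste") and the function names are all
hypothetical.

```python
from itertools import combinations

# Hypothetical coded data: individual -> 5 time points -> set of reason codes.
responses = {
    "p1": [{"stress"}, {"stress", "habit"}, set(), {"habit"}, {"stress"}],
    "p2": [{"habit"}, {"stress"}, {"stress"}, set(), set()],
    "p3": [{"social"}, {"social", "taste"}, {"taste"}, set(), {"social"}],
    "p4": [{"taste"}, set(), {"social"}, {"social"}, set()],
    "p5": [{"stress"}, {"social"}, set(), set(), set()],
}

def pool(data):
    """Pool codes over all time points into one set per individual."""
    return {pid: set().union(*waves) for pid, waves in data.items()}

def code_profiles(pooled):
    """For each code, the set of individuals who ever gave it."""
    profiles = {}
    for pid, codes in pooled.items():
        for code in codes:
            profiles.setdefault(code, set()).add(pid)
    return profiles

def jaccard(a, b):
    """Jaccard coefficient: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of codes using average linkage."""
    clusters = [[code] for code in sorted(profiles)]

    def avg_sim(c1, c2):
        # Mean Jaccard similarity over all cross-cluster pairs of codes.
        sims = [jaccard(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(sims) / len(sims)

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

pooled = pool(responses)
profiles = code_profiles(pooled)
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)
```

In this sketch each individual contributes to a code at most once after
pooling, however many time points the code appeared in. Whether that is an
adequate treatment of the repeated measures is exactly the open question.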


Bob Green