Reply To: Classification, clustering, and phylogeny estimation
Date: Mon, 22 Sep 2003 10:18:30 -0400
There are several ways to explore this kind of data.
I would base the within-case vector of variables on the set of possible
responses, coded as a set of multiple dichotomies (one yes/no variable
per response category).
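As a rough illustration of the multiple-dichotomies coding, here is a minimal Python sketch. The category names, the `to_dichotomies` helper, and the example case are all hypothetical; the real categories would come from the coders' scheme.

```python
# Hypothetical coded reasons for smoking; these names are illustrative only.
CATEGORIES = ["stress", "habit", "social", "enjoyment"]

def to_dichotomies(responses, categories=CATEGORIES):
    """Return a 0/1 vector: 1 if the category was mentioned, else 0."""
    mentioned = set(responses)
    return [1 if c in mentioned else 0 for c in categories]

# One case = one person at one time point, with 1-3 coded responses.
case = {"person": 1, "time": 0, "responses": ["stress", "habit"]}
vector = to_dichotomies(case["responses"])
print(vector)
```

Each case then carries the same fixed-length vector regardless of how many responses the person gave.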
To start, I would try leaving each time point as a separate case for the
cluster analyses. I would then also cluster each subset of cases
separately and crosstab the results to look for consistency.
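The cluster-then-crosstab check might be sketched as follows in plain Python. It uses the Jaccard coefficient and average linkage because those are the defaults mentioned in the original question, but it is a naive illustration, not what any particular content analysis package does. The data, the choice of two clusters, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for two 0/1 vectors; 0.0 if both are all zeros."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either

def average_linkage(vectors, n_clusters):
    """Naive agglomerative clustering; returns one cluster label per vector."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            dists = [jaccard_distance(vectors[i], vectors[j])
                     for i in clusters[p] for j in clusters[q]]
            avg = sum(dists) / len(dists)
            if best is None or avg < best[0]:
                best = (avg, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Hypothetical cases: (person, time point, 0/1 vector over four coded reasons).
cases = [
    (1, 0, [1, 1, 0, 0]),
    (1, 1, [1, 1, 0, 0]),
    (2, 0, [0, 0, 1, 1]),
    (2, 1, [0, 0, 1, 0]),
    (3, 0, [1, 0, 0, 0]),
    (3, 1, [0, 0, 1, 1]),
]

# Pooled solution: every person-by-time record is its own case.
pooled = average_linkage([v for _, _, v in cases], n_clusters=2)

# Subset solution: cluster only the cases from one time point.
subset_idx = [i for i, (_, t, _) in enumerate(cases) if t == 0]
subset = average_linkage([cases[i][2] for i in subset_idx], n_clusters=2)

# Crosstab of pooled label against subset label for the shared cases.
crosstab = Counter((pooled[i], s) for i, s in zip(subset_idx, subset))
for (pooled_label, subset_label), n in sorted(crosstab.items()):
    print(pooled_label, subset_label, n)
```

Cluster labels are arbitrary within each solution, so consistency shows up as counts concentrated in one cell per row of the crosstab rather than as matching label numbers.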
Hope this helps.
Art
Social Research Consultants
University Park, MD USA
(301) 864-5570
Bob Green wrote:
> I am interested in whether pooling data from the same
> individuals into a single variable, which would violate the assumption of
> independence of observations in multiple regression, is problematic in
> cluster analysis.
>
> Briefly, I have data collected at baseline and 4 time points asking whether
> someone smoked and the reasons why. Any individual might give 1-3
> responses, which could range from a single word to a sentence. These
> open-ended responses have been coded by coders. There are therefore 5 time
> periods x potentially 3 responses.
>
> I have received advice that it is acceptable to pool these data into 1
> variable. I ran the analysis using the cluster option in a content
> analysis software program, and the results were both interpretable and made
> sense (the analysis used the default options of a similarity
> matrix, average linkage and the Jaccard coefficient). However, my
> reading and enquiries to date have not been of much help in
> providing substantive support for this approach. Any advice or
> references in relation to this question would be appreciated.
>
> regards
>
> Bob Green
>