Well, basically it is up to you to decide which features are most
important for your application.
Clusters with bad cohesion and separation may well still be useful, but
whether they are depends on what you want to use them for.
If *you* think that cohesion and separation are required here, that's bad
news. But if you don't, why worry about them?
You have all the data-analytic information in place, and I have no
objections to your interpretation of it (one could do more, but whether
this makes sense again depends on what the clusters are used for and how
they should be interpreted). But the responsibility to decide what is
required in your area and in this situation is yours.
I guess you want to know something like whether your clusters are "real".
Well, that's a question to which there is no proper answer, because it
crucially depends on what is meant by a "real" cluster, and this again
depends on your field.
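
For readers who want to see what the average silhouette mentioned in the
quoted message actually measures, here is a minimal sketch in plain Python.
It is not the software used by the original poster: the function names and
the toy data are made up for illustration, and Euclidean distance is
assumed. For each observation, a is the mean distance to the other members
of its own cluster (cohesion), b is the smallest mean distance to the
members of any other cluster (separation), and the silhouette width is
(b - a) / max(a, b).

```python
from math import dist  # Euclidean distance, available from Python 3.8


def silhouette_values(points, labels):
    """Return the silhouette width s(i) for every observation.

    points: list of equal-length numeric tuples (hypothetical input format)
    labels: list of cluster labels, one per point
    """
    # Group observation indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Usual convention: a singleton cluster gets width 0.
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the own cluster (cohesion).
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster (separation).
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def average_silhouette(points, labels):
    """Mean of the silhouette widths over all observations."""
    widths = silhouette_values(points, labels)
    return sum(widths) / len(widths)


# Toy data (invented): two compact groups that are far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]   # labels that follow the two groups
mixed = [0, 1, 0, 1, 0, 1]  # labels that cut across the groups

print("labels following the groups:", round(average_silhouette(points, good), 3))
print("labels cutting across them: ", round(average_silhouette(points, mixed), 3))
```

A low average silhouette says only that the clusters are not compact and
well separated in the chosen distance. Whether that matters still depends
on what the clusters are used for.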
On Wed, 21 Sep 2011, Matthew Pirritano wrote:
> I'm a first time poster. I have data on coping strategies used by couples
> undergoing infertility treatment. I have created clusters of the coping
> strategies keeping male and female scores separate. There are 4 coping
> scores, based on composite scores of 4 subscales (active-avoidance,
> active-confronting, passive-avoidance, meaning-based). So I have 8 variables
> in my cluster analysis. I've started with Hierarchical clustering using
> Ward's method and squared Euclidean distance. I then used those cluster
> centers as the starting centers for a k-means cluster analysis. Based on my
> dendrogram from the hierarchical analysis and the clinical interpretability
> of the k-means solutions I arrived at a 5 cluster solution. These clusters
> predict a number of outcome variables well, such as stress. These
> predictions are well in line with theory and previous research. That's the
> external validity.
> I then went to validate the clusters using the average silhouette. I've
> tested all solutions between 2 and 12 clusters and my average silhouette is
> never greater than .4. I've tried different clustering methods and different
> distance measures, with the same results. The highest average silhouette I
> get is when I multiply men's and women's scores. I've seen this done before,
> but I'm not sure how to interpret the resulting scores. Any ideas? And that
> solution was only for 2 clusters.
> So, is it still possible that I could discuss the original 5 cluster
> solution despite not finding good separation and cohesion with the average
> silhouette? Is all lost, or is there a way to save the situation?
> Any help is much appreciated. Please let me know if you need more info or if
> I've violated any list protocol.
> CLASS-L list.
> Instructions: http://www.classification-society.org/csna/lists.html#class-l
*** --- ***
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
www.homepages.ucl.ac.uk/~ucakche