All,

 

I’m a first-time poster. I have data on coping strategies used by couples undergoing infertility treatment. I have created clusters of the coping strategies, keeping male and female scores separate. There are 4 coping scores per partner, which are composite scores for 4 subscales (active-avoidance, active-confronting, passive-avoidance, meaning-based), so I have 8 variables in my cluster analysis. I started with hierarchical clustering using Ward’s method and squared Euclidean distance. I then used the resulting cluster centers as the starting centers for a k-means cluster analysis. Based on the dendrogram from the hierarchical analysis and the clinical interpretability of the k-means solutions, I arrived at a 5-cluster solution. These clusters predict a number of outcome variables, such as stress, well. The predictions are in line with theory and previous research. That’s the external validity.
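
To make the two-step procedure concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the data are made up (three artificial groups, 8 variables), the function names are my own, and the Ward step is a naive implementation that merges whichever pair of clusters gives the smallest increase in within-cluster sum of squares. It is not the output of the software I actually used.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    # Component-wise mean of a list of points.
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def ward_clusters(data, k):
    """Naive Ward agglomeration down to k clusters (lists of row indices)."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        cents = [centroid([data[i] for i in c]) for c in clusters]
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                na, nb = len(clusters[a]), len(clusters[b])
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * sq_dist(cents[a], cents[b])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def kmeans(data, start_centers, max_iter=100):
    """Lloyd's k-means started from the given centers."""
    centers = [list(c) for c in start_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in data
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = centroid(members)
    return labels, centers

# Made-up data: 45 "couples", 8 variables, three artificial groups.
rng = random.Random(0)
data = [[rng.gauss(m, 0.5) for _ in range(8)]
        for m in (0.0, 5.0, 10.0) for _ in range(15)]

groups = ward_clusters(data, 3)
start = [centroid([data[i] for i in g]) for g in groups]
labels, centers = kmeans(data, start)
print("cluster sizes:", [labels.count(j) for j in range(len(centers))])
```

The k-means step can only move cases between the clusters Ward produced, so the final solution depends on the hierarchical starting point.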

 

I then tried to validate the clusters using the average silhouette. I’ve tested all solutions between 2 and 12 clusters, and my average silhouette is never greater than 0.4. I’ve tried different clustering methods and different distance measures, with the same results. The highest average silhouette I get is when I multiply men’s and women’s scores, and even that solution was only for 2 clusters. I’ve seen this multiplication done before, but I’m not sure how to interpret the resulting scores. Any ideas?
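
For reference, this is how I understand the average silhouette to be computed, written as a minimal plain-Python sketch. The function names and the toy data are my own, and I assume plain Euclidean distance here (other distance measures could be swapped in). For each case, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette is (b - a) / max(a, b).

```python
import math

def euclid(a, b):
    # Euclidean distance between two points (an assumption; any
    # dissimilarity measure could be used instead).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_silhouette(data, labels):
    """Mean silhouette width over all cases for a given labelling."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least 2 clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: cohesion, mean distance to the rest of the case's own cluster.
        a = sum(euclid(data[i], data[j]) for j in own if j != i) / (len(own) - 1)
        # b: separation, mean distance to the nearest other cluster.
        b = min(
            sum(euclid(data[i], data[j]) for j in other) / len(other)
            for other_lab, other in clusters.items() if other_lab != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(data)

# Toy example: two tight, well-separated pairs of points.
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
print(average_silhouette(toy, [0, 0, 1, 1]))  # sensible labelling
print(average_silhouette(toy, [0, 1, 0, 1]))  # deliberately poor labelling
```

In my case the function would be applied to the labels from each solution from 2 to 12 clusters, and the values compared across solutions.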

 

So, is it still possible that I could discuss the original 5-cluster solution despite not finding good separation and cohesion with the average silhouette? Is all lost, or is there a way to save the situation?

 

Any help is much appreciated. Please let me know if you need more info or if I’ve violated any list protocol.

 

Thanks

Matt

 

 

---------------------------------------------- CLASS-L list. Instructions: http://www.classification-society.org/csna/lists.html#class-l