All,

I'm a first-time poster. I have data on coping strategies used by couples
undergoing infertility treatment. I have created clusters of the coping
strategies keeping male and female scores separate. Each partner has 4 coping
scores, one composite score per subscale (active-avoidance,
active-confronting, passive-avoidance, meaning-based), so I have 8 variables
in my cluster analysis. I started with hierarchical clustering using
Ward's method and squared Euclidean distance. I then used those cluster
centers as the starting centers for a k-means cluster analysis. Based on my
dendrogram from the hierarchical analysis and the clinical interpretability
of the k-means solutions, I arrived at a 5-cluster solution. These clusters
predict a number of outcome variables, such as stress, well. These
predictions are in line with theory and previous research. That's the
external validity.
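
For concreteness, the two-stage procedure described above could be sketched
roughly as follows. This is only an illustration, not the actual analysis
code: it assumes Python with NumPy and SciPy, the array `X` is made-up
random data standing in for the 8 coping variables, and the helper name
`ward_then_kmeans` is hypothetical. It also assumes that SciPy's "ward"
linkage on raw observations corresponds to Ward's method with squared
Euclidean distance as offered in other packages.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2

# Made-up stand-in data: 200 couples x 8 coping scores (4 male + 4 female).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))

def ward_then_kmeans(X, k):
    """Ward hierarchical clustering, then k-means seeded with the Ward centroids."""
    Z = linkage(X, method="ward")                          # Ward's minimum-variance linkage
    ward_labels = fcluster(Z, t=k, criterion="maxclust")   # cut the tree; labels run 1..k
    # The centroid of each Ward cluster becomes a k-means starting center.
    start = np.vstack([X[ward_labels == c].mean(axis=0) for c in range(1, k + 1)])
    # minit="matrix" tells kmeans2 to treat `start` as the initial centroids.
    final_centers, labels = kmeans2(X, start, iter=100, minit="matrix")
    return final_centers, labels

centers, labels = ward_then_kmeans(X, k=5)
print(np.bincount(labels, minlength=5))  # cluster sizes after the k-means step
```

Seeding k-means this way makes the result deterministic for a given tree
cut, which is the main reason for combining the two methods.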

I then tried to validate the clusters using the average silhouette. I
tested all solutions between 2 and 12 clusters, and my average silhouette is
never greater than .4. I've tried different clustering methods and different
distance measures, with the same results. The highest average silhouette I
get is when I multiply men's and women's scores. I've seen this done before,
but I'm not sure how to interpret the resulting scores. Any ideas? And that
solution was only for 2 clusters.
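
To make the validation step explicit: for each point i, the silhouette is
s(i) = (b - a) / max(a, b), where a is the mean distance from i to the other
members of its own cluster and b is the smallest mean distance from i to the
members of any other cluster. A rough sketch of the 2-to-12 sweep follows,
again assuming Python with NumPy and SciPy, Euclidean distance, made-up
random data in `X`, and a hypothetical helper `average_silhouette`. It is
not the code actually used for the numbers reported here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def average_silhouette(X, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points (Euclidean distance)."""
    # Full pairwise distance matrix; fine for a few hundred cases.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # a singleton's silhouette is defined as 0
        a = D[i, own].sum() / (n_own - 1)  # D[i, i] = 0, so divide by n_own - 1
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))      # made-up stand-in for the 8 coping variables
Z = linkage(X, method="ward")      # build the tree once, then cut it at each k
for k in range(2, 13):
    labels_k = fcluster(Z, t=k, criterion="maxclust")
    print(k, round(average_silhouette(X, labels_k), 3))
```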

So, is it still possible that I could discuss the original 5-cluster
solution despite not finding good separation and cohesion with the average
silhouette? Is all lost, or is there a way to save the situation?

Any help is much appreciated. Please let me know if you need more info or if
I've violated any list protocol.

Thanks

Matt

----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l