The issue of determining the number of clusters is less of a problem when clustering is done using maximum likelihood via latent class modeling. See, for example, http://www.statisticalinnovations.com/articles/lcclurev.pdf , which appeared as chapter 3 in Hagenaars and McCutcheon (eds.), Applied Latent Class Analysis, Cambridge U. Press, 2002.

----- Original Message -----
From: "Art Kendall" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Sunday, May 09, 2004 5:36 PM
Subject: Re: [Help] Criteria for choosing # of clusters in hierarchical clustering

> Just as there are a number of stopping rules in factor analysis that
> can narrow down the number of solutions to try to interpret, the
> various stopping rules in cluster analysis can narrow down the number
> of solutions to try to interpret.
>
> If you would like some idea of what number of clusters would be
> appropriate, take a look at the TWOSTEP procedure in SPSS. However, this
> approach does not have any necessary relation between the clusterings
> for different numbers of clusters the way hierarchical solutions do
> (i.e., the three-cluster solution is NOT the four-cluster solution with
> two of the groups combined).
>
> "The procedure produces information criteria (AIC or BIC) by numbers of
> clusters in the solution, cluster frequencies for the final clustering,
> and descriptive statistics by cluster for the final clustering."
>
> You can specify whether you want the software to "automatically" pick a
> number of clusters, "automatically" pick a number of clusters up to some
> maximum number, or to find a fixed number of clusters. You can specify
> either the Bayesian Information Criterion (BIC) or the Akaike Information
> Criterion (AIC) to be used as the criterion in the automatic choice of
> the number of clusters.
>
> You can use categorical or continuous variables or both.
>
> Hope this helps.
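[The information-criterion approach described above can be sketched outside SPSS as well. As a minimal illustration only (the library choice, synthetic data, and candidate range 1-6 are my assumptions, not anything from the thread), a Gaussian mixture fitted by maximum likelihood in scikit-learn exposes BIC directly, so the number of clusters can be chosen as the value that minimizes it:]

```python
# Sketch, NOT SPSS TWOSTEP itself: choose the number of clusters by BIC,
# using maximum-likelihood mixture modeling in the spirit of latent class
# analysis. Data are synthetic; three well-separated groups.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in (0.0, 3.0, 6.0)])

# Fit one mixture per candidate k and record its BIC (lower is better).
bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(1, 7)}
best_k = min(bic, key=bic.get)
print(best_k)
```

[Swapping `.bic(X)` for `.aic(X)` gives the AIC-based choice the message also mentions; AIC penalizes model complexity less and so tends to favor more clusters.]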
>
> Art
> [log in to unmask]
> Social Research Consultants
> University Park, MD USA
> (301) 864-5570
>
> Fred wrote:
>
> > Dear all,
> >
> > I am now working on hierarchical clustering methods, and am
> > confused about the following problem.
> >
> > As you know, to form clusters from the hierarchical tree generated
> > from the pairwise distances between the elements, we have to set a
> > threshold value to cut the tree horizontally, such that the vertical
> > links intersecting this horizontal critical value define the final
> > clusters.
> >
> > However, I have not found a very robust criterion for choosing the
> > optimal number of clusters, or for calculating this threshold value,
> > that makes the clustering results good across different pairwise
> > distance (similarity) measures.
> >
> > So does anyone have some thoughts on this problem, or recommended
> > papers or methods?
> >
> > Thanks for your help.
> >
> > Fred
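[The horizontal-cut procedure Fred describes can be sketched concretely. This is an illustration under assumptions of my own (synthetic data, average linkage, and a threshold of 2.0 picked by eye), not a recipe for choosing the threshold robustly: `scipy.cluster.hierarchy.linkage` builds the tree, and `fcluster` with `criterion="distance"` performs exactly the cut at a given height.]

```python
# Sketch of cutting a hierarchical tree at a height threshold.
# The threshold t=2.0 is illustrative only; picking it robustly is
# precisely the open question in the message above.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Synthetic data: three tight, well-separated groups.
X = np.vstack([rng.normal(loc=c, scale=0.2, size=(20, 2))
               for c in (0.0, 5.0, 10.0)])

Z = linkage(X, method="average")                    # the hierarchical tree
labels = fcluster(Z, t=2.0, criterion="distance")   # horizontal cut at height 2.0
print(len(set(labels)))                             # number of resulting clusters
```

[Note that `fcluster(Z, t=k, criterion="maxclust")` instead requests a fixed number of clusters directly, sidestepping the threshold, while leaving open the same question of which k to use.]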