CLASS-L Archives

May 2004

CLASS-L@LISTS.SUNYSB.EDU

From: Jason Niles <[log in to unmask]>
Date: Wed, 12 May 2004 11:10:11 -0400
Sender: "Classification, clustering, and phylogeny estimation" <[log in to unmask]>

The issue of determining the number of clusters is less of a problem when
clustering is done by maximum likelihood via latent class modeling. See,
for example, http://www.statisticalinnovations.com/articles/lcclurev.pdf ,
which appeared as chapter 3 in Hagenaars and McCutcheon (eds.), Applied
Latent Class Analysis, Cambridge University Press, 2002.
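As a minimal sketch of this model-based idea (not the chapter's own code: it assumes scikit-learn is available and uses `GaussianMixture`, a continuous-data analogue of the latent class models discussed above), the number of clusters can be chosen by fitting mixtures of increasing size and keeping the one with the lowest BIC:

```python
# Model-based clustering: pick the number of clusters by BIC.
# Sketch only -- GaussianMixture is a continuous analogue of the
# latent class models above (an assumption, not the chapter's method).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated synthetic clusters.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(100, 2)),
])

# Fit mixtures with 1..5 components; BIC rewards fit but
# penalizes the extra parameters of larger models.
bics = {}
for k in range(1, 6):
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bics[k] = gm.bic(X)

best_k = min(bics, key=bics.get)
print(best_k)
```

Because the likelihood is explicit, the comparison across different numbers of clusters needs no ad hoc threshold, which is the point being made above.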

----- Original Message -----
From: "Art Kendall" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Sunday, May 09, 2004 5:36 PM
Subject: Re: [Help] Criteria for choosing # of clusters in hierarchical
clustering


> Just as there are a number of stopping rules in factor analysis that
> can narrow down the number of solutions to try to interpret, the
> various stopping rules in cluster analysis can narrow down the number
> of solutions to try to interpret.
>
> If you would like some idea of what number of clusters would be
> appropriate, take a look at the TWOSTEP procedure in SPSS.  However,
> this approach does not impose any necessary relation between the
> clusterings for different numbers of clusters the way hierarchical
> solutions do (i.e., the three-cluster solution is NOT the four-cluster
> solution with two of the groups combined).
>
> "The procedure produces information criteria (AIC or BIC) by numbers of
> clusters in the solution, cluster frequencies for the final clustering,
> and descriptive
> statistics by cluster for the final clustering."
>
> You can specify whether you want the software to "automatically" pick a
> number of clusters, "automatically" pick a number of clusters up to some
> maximum, or to find a fixed number of clusters. You can specify the
> Bayesian Information Criterion (BIC) or the Akaike Information Criterion
> (AIC) as the criterion for the automatic choice of the number of
> clusters.
>
> You can use categorical or continuous variables or both.
>
> Hope this helps.
>
> Art
> [log in to unmask]
> Social Research Consultants
> University Park, MD  USA
> (301) 864-5570
>
> Fred wrote:
>
> > Dear all,
> >
> > I am now working on hierarchical clustering methods, and I am
> > confused about the following problem.
> >
> > As you know, to form clusters from the hierarchical tree generated from
> > the pairwise distances between the elements, we have to set a threshold
> > value to cut the tree horizontally, such that the vertical links
> > intersecting this horizontal critical value define the final clusters.
> >
> > However, I have not found a very robust criterion for choosing the
> > optimal number of clusters, or for calculating this threshold value so
> > that the clustering results are good across different pairwise distance
> > (similarity) measures.
> >
> > So does anyone have any thoughts on this problem, or recommended
> > papers or methods?
> >
> > Thanks for your help.
> >
> > Fred
> >
> >
> >
>
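For concreteness, the horizontal cut Fred describes can be sketched with SciPy's hierarchical clustering routines (a hypothetical illustration; the data and the threshold value are invented, and `scipy` is assumed to be installed). `linkage` builds the tree from pairwise distances, and `fcluster` with `criterion='distance'` cuts it at a chosen height; `cut_tree` then shows the nesting property Art mentions:

```python
# Cutting a hierarchical tree at a distance threshold (sketch;
# the data and the threshold 4.0 are illustrative only).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, cut_tree

rng = np.random.default_rng(1)
# Two tight groups far apart, so any cut between the within-group
# and between-group merge heights yields two clusters.
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(8.0, 0.3, size=(20, 2)),
])

Z = linkage(X, method='average')  # agglomerative tree from pairwise distances
labels = fcluster(Z, t=4.0, criterion='distance')  # horizontal cut at height 4.0
print(sorted(set(labels)))  # two clusters at this threshold

# Hierarchical solutions nest: every group in the 4-cluster cut lies
# entirely inside one group of the 3-cluster cut (unlike TWOSTEP).
c3, c4 = cut_tree(Z, n_clusters=[3, 4]).T
```

The choice of `t` is exactly the open problem in the thread; the code only shows the mechanism, not a rule for picking the threshold.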
