CLASS-L Archives

May 2004

CLASS-L@LISTS.SUNYSB.EDU

Subject:
From:
Art Kendall <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Sun, 9 May 2004 17:36:52 -0400
Content-Type:
text/plain
Parts/Attachments:
text/plain (61 lines)
Just as the stopping rules in factor analysis can narrow down the number
of solutions worth trying to interpret, the various stopping rules in
cluster analysis can narrow down the number of cluster solutions worth
interpreting.

If you would like some idea of what number of clusters would be
appropriate, take a look at the TWOSTEP procedure in SPSS. However, with
this approach there is no necessary relation between the clusterings
for different numbers of clusters, the way there is with hierarchical
solutions (i.e., the three-cluster solution is NOT the four-cluster
solution with two of the groups combined).
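
To make the nesting property of hierarchical solutions concrete, here is a
minimal sketch in plain Python. It is only an illustration under stated
assumptions: the 1-D data, the choice of single linkage, and the helper
names (single_linkage, cut, is_nested) are all hypothetical and are not
taken from SPSS or any other package.

```python
from itertools import combinations

def single_linkage(points):
    """Agglomerate 1-D points; return {k: (merge_distance, partition)}."""
    clusters = [frozenset([i]) for i in range(len(points))]
    levels = {len(clusters): (0.0, set(clusters))}
    while len(clusters) > 1:
        # Closest pair of clusters; single linkage uses the nearest members.
        (a, b), d = min(
            (((a, b), min(abs(points[i] - points[j]) for i in a for j in b))
             for a, b in combinations(clusters, 2)),
            key=lambda pair_and_dist: pair_and_dist[1],
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        levels[len(clusters)] = (d, set(clusters))
    return levels

def cut(levels, threshold):
    """Coarsest partition whose merges all happened at distance <= threshold."""
    k = min(k for k, (d, _) in levels.items() if d <= threshold)
    return levels[k][1]

def is_nested(fine, coarse):
    """True if every cluster of `fine` lies inside some cluster of `coarse`."""
    return all(any(c <= big for big in coarse) for c in fine)

# Hypothetical data, chosen only so the merge order is easy to follow.
points = [1.0, 1.2, 5.0, 5.3, 9.0, 15.0]
levels = single_linkage(points)
print(sorted(sorted(c) for c in levels[4][1]))
print(sorted(sorted(c) for c in levels[3][1]))
print(is_nested(levels[4][1], levels[3][1]))
```

In a hierarchical run every k-cluster partition is the (k+1)-cluster
partition with two groups merged, so is_nested holds for each adjacent
pair of levels. A procedure that fits each number of clusters separately
gives no such guarantee.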

"The procedure produces information criteria (AIC or BIC) by numbers of
clusters in the solution, cluster frequencies for the final clustering,
and descriptive statistics by cluster for the final clustering."

You can specify whether you want the software to "automatically" pick a
number of clusters, "automatically" pick a number of clusters up to some
maximum, or find a fixed number of clusters. You can specify either the
Bayesian Information Criterion (BIC) or the Akaike Information Criterion
(AIC) as the criterion for the automatic choice of the number of
clusters.
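
As a rough illustration of choosing the number of clusters by an
information criterion, here is a small plain-Python sketch. It is NOT the
TWOSTEP algorithm or its formula. It assumes 1-D data, a stand-in
clustering step (cutting the sorted values at the widest gaps), and a
simple Gaussian model with one shared variance. All function names and
the data are hypothetical.

```python
import math

def split_at_largest_gaps(values, k):
    """Stand-in clustering for 1-D data: sort, then cut at the k-1 widest gaps."""
    xs = sorted(values)
    by_width = sorted(range(len(xs) - 1),
                      key=lambda i: xs[i + 1] - xs[i], reverse=True)
    clusters, start = [], 0
    for i in sorted(by_width[:k - 1]):
        clusters.append(xs[start:i + 1])
        start = i + 1
    clusters.append(xs[start:])
    return clusters

def information_criterion(clusters, criterion="BIC"):
    """-2 log-likelihood plus penalty, assuming one shared Gaussian variance."""
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    sse = 0.0
    for c in clusters:
        mean = sum(c) / len(c)
        sse += sum((x - mean) ** 2 for x in c)
    variance = sse / n  # ML estimate; assumed > 0 (not every point its own cluster)
    log_lik = sum(len(c) * math.log(len(c) / n) for c in clusters)
    log_lik -= 0.5 * n * (math.log(2 * math.pi * variance) + 1)
    n_params = 2 * k  # k means + (k - 1) cluster proportions + 1 variance
    penalty = n_params * (math.log(n) if criterion == "BIC" else 2)
    return -2 * log_lik + penalty

def choose_k(values, max_k, criterion="BIC"):
    """Score k = 1..max_k and return (best k, all scores); lower is better."""
    scores = {k: information_criterion(split_at_largest_gaps(values, k), criterion)
              for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores

# Hypothetical data: three well-separated groups of five points each.
data = [center + offset
        for center in (0, 10, 20)
        for offset in (-1, -0.5, 0, 0.5, 1)]
best_k, scores = choose_k(data, max_k=5)
print(best_k)
```

The automatic choice amounts to scoring each candidate number of clusters
and keeping the one with the smallest criterion value. Passing
criterion="AIC" swaps the penalty from log(n) per parameter to 2 per
parameter.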

You can use categorical or continuous variables or both.

Hope this helps.

Art
[log in to unmask]
Social Research Consultants
University Park, MD USA
(301) 864-5570

Fred wrote:

> Dear all,
>
> I am now working on hierarchical clustering methods, and am
> confused about the following problem.
>
> As you know, to form clusters from the hierarchical tree generated from
> the pairwise distances between the elements, we have to set a threshold
> value and cut the tree horizontally, such that the vertical links
> intersecting this horizontal cut define the final clusters.
>
> However, I have not found a very robust criterion for choosing the
> optimal number of clusters, or for calculating this threshold value,
> that makes the clustering results good across different pairwise
> distance (similarity) measures.
>
> Does anyone have pointers on this problem, or recommended papers
> or methods?
>
> Thanks for your help.
>
> Fred
