CLASS-L Archives

June 2005

CLASS-L@LISTS.SUNYSB.EDU

Options: Use Monospaced Font
Show Text Part by Default
Show All Mail Headers

Message: [<< First] [< Prev] [Next >] [Last >>]
Topic: [<< First] [< Prev] [Next >] [Last >>]
Author: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Subject:
From:
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Wed, 29 Jun 2005 13:17:30 -0500
Content-Type:
TEXT/PLAIN
Parts/Attachments:
TEXT/PLAIN (115 lines)
Let me restate the problem -- I am thinking there is an easier problem
then what I have been receiving -- though thank you for those.

We have two dendrograms on the same objects and want to find stopping
rules to cut each such thazt the overlap in the partitions is maxium. My
thought (which may make this a more complicated problem) was to generate a
contingency table of the two partitions and reorder the table to put the
largest intersection counts on the diagonal.


Is there a simpler rule to identify which clusters from the two tree have
maximum overlap?

Thanks


Bill
---

		        Biostatistics Consulting Center
		http://ilya.wustl.edu/~shannon/bcc_announcement.pdf


William D. Shannon, Ph.D.

Associate Professor of Biostatistics in Medicine
Division of General Medical Sciences and Biostatistics

Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO   63110

Phone: 314-454-8356
Fax: 314-454-5113
e-mail: [log in to unmask]
web page: http://ilya.wustl.edu/~shannon


On Wed, 29 Jun 2005, Michael Brusco wrote:

> This is a particular type of seriation problem for which optimal solutions
> can typically be obtained for rather large problems.  Depending on computer
> RAM, Dynamic programming can provide an optimal solution for problems up to
> about 25x25 (512MB or RAM).  Branch-and-bound can provide guaranteed
> optimal solutions for even larger problems and is not encumbered by memory
> limitations.  However, the branch-and-bound approach is more sensitive to
> the data entries in the matrix and can require more CPU time.  I've also
> been able to solve some rather large problems using integer
> programming.  There are a number of references that focus on seriation
> problems related to the dominance index you've mentioned, anti-Robinson
> gradient indices, and least-squares unidimensional scaling.  All of these
> are amenable to optimal solution via branch-and-bound or dynamic
> programming.  I can send you some fortran programs if you like.
>
> Citations:
> Hubert, Arabie, & Meulman (2001),  Combinatorial Data Analysis:
> Optimization by Dynamic Programming, SIAM.
> Brusco & Stahl (in press), Branch-and-Bound Applications in Combinatorial
> Data Analysis, Springer.
> Hubert & Golledge (1981), Psychometrika - dynamic programming
> Brusco & Stahl (2001), Psychometrika, pp. 5-24 - dynamic programming
> Brusco (2002) Psychometrika, pp. 459-471 - branch and bound (anti-Robinson
> gradients)
>
> Mike
>
> At 06:53 PM 6/29/2005 +0200, you wrote:
> >If your matrix can be reorganize as an Robinson matrix, you have a lot of
> >good algorithms as I described in the manual of PermutMatrix, a software
> >for seriation and clustering (http://www.lirmm.fr/~caraux/PermutMatrix/).
> >I also give some references.
> >For the general case, I think you have to perform every reordering if you
> >want an exact solution.
> >
> >Gilles
> >
> >
> >Le 18:18 29/06/2005, vous avez écrit:
> > >Mighty mathematicians we have a question.
> > >
> > >We have small matrices of integers that we need to reorder so that the sum
> > >of the diagonal is maximum.
> > >
> > >These matrices are relatively small (say 10x10 max) so we can brute force
> > >our way by performing every reordering, but would rather not.
> > >
> > >Can anyone help we biostatisticians with an algorithm or theorem?
> > >
> > >
> > >Bill
> > >---
> > >
> > >                        Biostatistics Consulting Center
> > >                http://ilya.wustl.edu/~shannon/bcc_announcement.pdf
> > >
> > >
> > >William D. Shannon, Ph.D.
> > >
> > >Associate Professor of Biostatistics in Medicine
> > >Division of General Medical Sciences and Biostatistics
> > >
> > >Washington University School of Medicine
> > >Campus Box 8005, 660 S. Euclid
> > >St. Louis, MO   63110
> > >
> > >Phone: 314-454-8356
> > >Fax: 314-454-5113
> > >e-mail: [log in to unmask]
> > >web page: http://ilya.wustl.edu/~shannon
>
> Michael J. Brusco, Marketing Department, College of Business, Florida State
> University, Tallahassee, FL 32306-1110.  Voice: (850)644-6512, FAX:
> (850)644-4098, email: [log in to unmask]
>

ATOM RSS1 RSS2