CLASS-L Archives

June 2005

CLASS-L@LISTS.SUNYSB.EDU

Options: Use Monospaced Font
Show Text Part by Default
Show All Mail Headers

Message: [<< First] [< Prev] [Next >] [Last >>]
Topic: [<< First] [< Prev] [Next >] [Last >>]
Author: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Subject:
From:
Michael Brusco <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Wed, 29 Jun 2005 15:45:32 -0400
Content-Type:
text/plain
Parts/Attachments:
text/plain (132 lines)
Sorry Bill.   I misread your e-mail the first time.  I read it as a 
reordering of the "rows and columns" of the reordered matrix to maximize
the "above diagonal sum of entries". To maximize the sum "on the main
diagonal", the problem involves reordering of the columns. Although this
is still probably solvable using B&B for some pretty sizable matrices.

Again, my apologies.

Mike

At 01:17 PM 6/29/2005 -0500, you wrote:
>Let me restate the problem -- I am thinking there is an easier problem
>then what I have been receiving -- though thank you for those.
>
>We have two dendrograms on the same objects and want to find stopping
>rules to cut each such thazt the overlap in the partitions is maxium. My
>thought (which may make this a more complicated problem) was to generate a
>contingency table of the two partitions and reorder the table to put the
>largest intersection counts on the diagonal.
>
>
>Is there a simpler rule to identify which clusters from the two tree have
>maximum overlap?
>
>Thanks
>
>
>Bill
>---
>
> Biostatistics Consulting Center
> http://ilya.wustl.edu/~shannon/bcc_announcement.pdf
>
>
>William D. Shannon, Ph.D.
>
>Associate Professor of Biostatistics in Medicine
>Division of General Medical Sciences and Biostatistics
>
>Washington University School of Medicine
>Campus Box 8005, 660 S. Euclid
>St. Louis, MO 63110
>
>Phone: 314-454-8356
>Fax: 314-454-5113
>e-mail: [log in to unmask]
>web page: http://ilya.wustl.edu/~shannon
>
>
>On Wed, 29 Jun 2005, Michael Brusco wrote:
>
> > This is a particular type of seriation problem for which optimal solutions
> > can typically be obtained for rather large problems. Depending on computer
> > RAM, Dynamic programming can provide an optimal solution for problems up to
> > about 25x25 (512MB or RAM). Branch-and-bound can provide guaranteed
> > optimal solutions for even larger problems and is not encumbered by memory
> > limitations. However, the branch-and-bound approach is more sensitive to
> > the data entries in the matrix and can require more CPU time. I've also
> > been able to solve some rather large problems using integer
> > programming. There are a number of references that focus on seriation
> > problems related to the dominance index you've mentioned, anti-Robinson
> > gradient indices, and least-squares unidimensional scaling. All of these
> > are amenable to optimal solution via branch-and-bound or dynamic
> > programming. I can send you some fortran programs if you like.
> >
> > Citations:
> > Hubert, Arabie, & Meulman (2001), Combinatorial Data Analysis:
> > Optimization by Dynamic Programming, SIAM.
> > Brusco & Stahl (in press), Branch-and-Bound Applications in Combinatorial
> > Data Analysis, Springer.
> > Hubert & Golledge (1981), Psychometrika - dynamic programming
> > Brusco & Stahl (2001), Psychometrika, pp. 5-24 - dynamic programming
> > Brusco (2002) Psychometrika, pp. 459-471 - branch and bound (anti-Robinson
> > gradients)
> >
> > Mike
> >
> > At 06:53 PM 6/29/2005 +0200, you wrote:
> > >If your matrix can be reorganize as an Robinson matrix, you have a lot of
> > >good algorithms as I described in the manual of PermutMatrix, a software
> > >for seriation and clustering (http://www.lirmm.fr/~caraux/PermutMatrix/).
> > >I also give some references.
> > >For the general case, I think you have to perform every reordering if you
> > >want an exact solution.
> > >
> > >Gilles
> > >
> > >
> > >Le 18:18 29/06/2005, vous avez écrit:
> > > >Mighty mathematicians we have a question.
> > > >
> > > >We have small matrices of integers that we need to reorder so that
> the sum
> > > >of the diagonal is maximum.
> > > >
> > > >These matrices are relatively small (say 10x10 max) so we can brute
> force
> > > >our way by performing every reordering, but would rather not.
> > > >
> > > >Can anyone help we biostatisticians with an algorithm or theorem?
> > > >
> > > >
> > > >Bill
> > > >---
> > > >
> > > > Biostatistics Consulting Center
> > > > http://ilya.wustl.edu/~shannon/bcc_announcement.pdf
> > > >
> > > >
> > > >William D. Shannon, Ph.D.
> > > >
> > > >Associate Professor of Biostatistics in Medicine
> > > >Division of General Medical Sciences and Biostatistics
> > > >
> > > >Washington University School of Medicine
> > > >Campus Box 8005, 660 S. Euclid
> > > >St. Louis, MO 63110
> > > >
> > > >Phone: 314-454-8356
> > > >Fax: 314-454-5113
> > > >e-mail: [log in to unmask]
> > > >web page: http://ilya.wustl.edu/~shannon
> >
> > Michael J. Brusco, Marketing Department, College of Business, Florida State
> > University, Tallahassee, FL 32306-1110. Voice: (850)644-6512, FAX:
> > (850)644-4098, email: [log in to unmask]
> >

Michael J. Brusco, Marketing Department, College of Business, Florida State
University, Tallahassee, FL 32306-1110. Voice: (850)644-6512, FAX:
(850)644-4098, email: [log in to unmask]

ATOM RSS1 RSS2