The best reference in this area is the classic book:
Time warps, string edits, and macromolecules: the theory and practice of sequence comparison
By: David Sankoff; Joseph B Kruskal
Publisher: Reading, Mass.: Addison-Wesley Pub. Co., 1983.
ISBN: 0201078090
--stephen hirtle
At 01:08 PM 3/9/2005, you wrote:
>I am getting some great input from people. Thank you.
>
>The method I was trying to think of is 'unfolding' which several people
>identified, and Doug Carroll has directed me to multidimensional
>unfolding.
>
>
>Here is the problem that I may be faced with (if funding becomes
>available):
>
>1. Viruses contain sets of genes which, for the sake of argument, we will
>call A, B, C, D, etc.
>
>
>2. We want to infer a relationship among virus 'species'. According to
>the virologists (or at least some of them), 'species' is generally not
>thought to apply to viruses in the strict evolutionary sense, so I use
>the term loosely
>
>
>3. The order of the genes within each virus species is different as a
>result of significant genome rearrangements over time
>
> e.g., ABCD, ABDC, ..., CDBA, etc.
>
>
>4. If we view the arrangement of genes among many viruses, can we identify
>families of viruses where there is less rearrangement within the family
>than between families? (Sounds like clustering.)
>
>
>5. One approach is to define a metric on gene arrangement patterns (e.g.,
>the minimum number of rearrangements needed to have identical gene
>arrangement between two virus species) and proceed using standard
>clustering
>
>
>6. I also thought unfolding is something I need to consider, so thank you
>for your input.
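The metric-plus-clustering idea in point 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: it treats a "rearrangement" as a swap of two adjacent genes (so the metric is the Kendall tau distance, not a biologically realistic rearrangement distance), it uses a naive single-linkage merge rather than any particular clustering package, and the virus names v1..v4 and the order DCBA are made up for the example.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Minimum number of adjacent swaps needed to turn gene order a into b.

    Assumes both orders contain the same genes, each exactly once.
    """
    pos = {gene: i for i, gene in enumerate(b)}
    seq = [pos[gene] for gene in a]
    # Each pair of genes that appears in opposite relative order costs one swap.
    return sum(1 for i, j in combinations(range(len(seq)), 2) if seq[i] > seq[j])


def single_linkage(names, orders, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two closest
    clusters (closest = smallest distance between any two members)."""
    clusters = [[name] for name in names]

    def dist(c1, c2):
        return min(kendall_tau_distance(orders[x], orders[y])
                   for x in c1 for y in c2)

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical gene orders for four virus 'species'.
orders = {"v1": "ABCD", "v2": "ABDC", "v3": "DCBA", "v4": "CDBA"}
families = single_linkage(list(orders), orders, n_clusters=2)
print(families)
```

With these made-up orders, v1 and v2 differ by a single adjacent swap, as do v3 and v4, so the two pairs end up in separate families. A more realistic distance (for example one that counts inversions of whole segments) could be substituted for kendall_tau_distance without changing the clustering step.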
>
>
>Anyone ever approach a problem like this?
>
>
>Bill
>---
>
> Joint Meeting of the Interface and
> Classification Society of North America
>
> http://ilya.wustl.edu/if_csna_2005_meeting/
> Abstracts and Registration Deadline is 4/9/05
>
>
>William D. Shannon, Ph.D.
>
>Associate Professor of Biostatistics in Medicine
>Division of General Medical Sciences and Biostatistics
>
>Washington University School of Medicine
>Campus Box 8005, 660 S. Euclid
>St. Louis, MO 63110
>
>Phone: 314-454-8356
>Fax: 314-454-5113
>e-mail: [log in to unmask]
>web page: http://ilya.wustl.edu/~shannon
>
>
>On Wed, 9 Mar 2005, J. Douglas Carroll wrote:
>
>> Maybe this is the usage of this term in some statistical fields, but the
>> mathematical psychologist Clyde Coombs's (as far as I know original) use of
>> the word "unfolding" implied finding a real valued continuum (a single
>> dimension defined on at least an interval, not merely an ordinal, scale)
>> such that all the input orders can be generated via a model in which each
>> order is inversely related to the order of distances from an "ideal point"
>> on that continuum. Unfolding was later generalized by some students of
>> Coombs's to the multidimensional case, in which the single dimension was
>> generalized to a multidimensional space, and the input orders were assumed
>> to be inversely monotonically related to (Euclidean) distances from a set
>> of ideal points in this multidimensional space. This generalization is
>> referred to as "MULTIDIMENSIONAL unfolding".
>>
>> While it's certainly true that, in Coombs's original unidimensional version
>> of unfolding analysis, an order can be associated with the unidimensional
>> continuum resulting from unfolding analysis, its purpose was NOT to
>> determine an ordering, but to determine an underlying
>> continuum. Furthermore, the order defined by the resulting continuum will
>> generally NOT be in any realistic sense a "consensus order"; in extreme
>> cases its average rank order correlation (calculated by any reasonable rank
>> order correlation coefficient) with the input orders could be zero, in fact.
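A small numerical sketch may make the unidimensional model and the "consensus order" caveat above concrete. It assumes hypothetical item positions on the continuum and two hypothetical ideal points at opposite extremes; each preference order is generated by ranking items from nearest to farthest from the ideal point, and Spearman's coefficient stands in for "any reasonable rank order correlation coefficient". It only generates orders from a known continuum; it does not perform the unfolding analysis itself.

```python
def preference_order(ideal, positions):
    """Order items from most to least preferred: nearest to the ideal point first."""
    return sorted(positions, key=lambda item: abs(positions[item] - ideal))


def spearman(order_a, order_b):
    """Spearman rank correlation between two complete orders of the same items
    (no ties)."""
    n = len(order_a)
    rank_b = {item: r for r, item in enumerate(order_b)}
    d2 = sum((r - rank_b[item]) ** 2 for r, item in enumerate(order_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical positions of four items on the underlying continuum.
positions = {"A": 0.0, "B": 1.0, "C": 2.5, "D": 4.0}

# Two hypothetical ideal points, one beyond each end of the continuum.
ideals = [-1.0, 5.0]
orders = [preference_order(p, positions) for p in ideals]

# The order implied by the continuum itself.
continuum_order = sorted(positions, key=positions.get)

mean_rho = sum(spearman(continuum_order, o) for o in orders) / len(orders)
print(orders, continuum_order, mean_rho)
```

Here the two ideal points produce exactly reversed orders, so the continuum's own order correlates +1 with one and -1 with the other, giving an average of zero. An ideal point in the interior, such as 1.2 in this made-up example, would instead yield the "folded" order B, A, C, D.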
>>
>> Doug Carroll
>>
>>
>> At 10:27 AM 3/9/2005 -0600, shannon wrote:
>> >Yes -- unfolding is the word. Thanks
>> >
>> >
>> >Bill
>> >
>> >
>> >On Wed, 9 Mar 2005, Paul R Swank wrote:
>> >
>> > > Do you mean unfolding?
>> > >
>> > > Paul R. Swank, Ph.D.
>> > > Professor, Developmental Pediatrics
>> > > Medical School
>> > > UT Health Science Center at Houston
>> > >
>> > >
>> > > -----Original Message-----
>> > > From: Classification, clustering, and phylogeny estimation
>> > > [mailto:[log in to unmask]] On Behalf Of shannon
>> > > Sent: Wednesday, March 09, 2005 9:19 AM
>> > > To: [log in to unmask]
>> > > Subject: statistical method
>> > >
>> > >
>> > > What is the name of the statistical method which generates an order from a
>> > > set of orders:
>> > >
>> > > Vote preferences: A > B > C > D
>> > > B > A > C > D
>> > > A > B > C > D
>> > > A > C > D > B
>> > > etc
>> > >
>> > > It is something like peeling?
>> > >
>> > >
>> > > Bill
>> > >
>>
>>
>>
>> ######################################################################
>> # J. Douglas Carroll, Board of Governors Professor of Management and #
>> #Psychology, Rutgers University, Graduate School of Management, #
>> #Marketing Dept., MEC125, 111 Washington Street, Newark, New Jersey #
>> #07102-3027. Tel.: (973) 353-5814, Fax: (973) 353-5376. #
>> # Home: 14 Forest Drive, Warren, New Jersey 07059-5802. #
>> # Home Phone: (908) 753-6441 or 753-1620, Home Fax: (908) 757-1086. #
>> # E-mail: [log in to unmask] #
>> ######################################################################
>>