CLASS-L Archives

March 2003


Reply To:
Classification, clustering, and phylogeny estimation
Sat, 1 Mar 2003 15:09:17 -0600
I have a dataset for which I do not know how to calculate pairwise distances.

In sibpair linkage analysis a unit of analysis is the sibpair (2 brothers,
2 sisters, or a brother and sister). The covariate vector contains
information on each member of the pair (e.g., two ages, two heights, two
weights). There is no natural way to order the two siblings, and hence
these covariates, within a pair.

Consider the two ages on two sibpairs. Let age_ij be the age for the i^th
sibling in the j^th sibpair. The data matrix could be in any of the
following orders:

     Sibpair   Age1      Age2
        1      age_11   age_21
        2      age_12   age_22

or      1      age_21   age_11
        2      age_12   age_22

or      1      age_11   age_21
        2      age_22   age_12

or      1      age_21   age_11
        2      age_22   age_12

We want to calculate the distance between the sibpairs since these are the
units of analysis.
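
To make the problem concrete, here is a rough sketch in Python. The ages
are made up for illustration, and the helper names (euclidean, orderings)
are not from any package. It computes the ordinary Euclidean distance
between two sibpairs under each of the four equally valid layouts of the
data matrix.

```python
import math
from itertools import product

def euclidean(a, b):
    """Ordinary Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def orderings(pair):
    """The two equally valid ways of listing an unordered pair."""
    return [tuple(pair), tuple(reversed(pair))]

# Hypothetical ages, chosen only for illustration.
sibpair_1 = (10, 14)
sibpair_2 = (12, 9)

# One distance per layout of the 2 x 2 data matrix (four layouts in all).
distances = [
    euclidean(a, b)
    for a, b in product(orderings(sibpair_1), orderings(sibpair_2))
]

for d in distances:
    print(round(d, 3))
```

With these numbers the four layouts give only two distinct values, and
which one is obtained depends entirely on how the siblings happen to be
listed.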

We can arbitrarily impose a rule (e.g., the youngest is always the first
age) and use standard distance measures -- but such rules are ad hoc, and
the genetic linkage people are unsatisfied with them (though this is done
routinely).
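
A minimal sketch of the "youngest first" rule, in Python, with made-up
ages and hypothetical function names (canonical, sorted_distance). It is
only meant to show what the routine practice amounts to, not to recommend
it.

```python
import math

def canonical(pair):
    """Apply the ad hoc rule: list the youngest sibling first."""
    return tuple(sorted(pair))

def sorted_distance(pair_a, pair_b):
    """Euclidean distance after putting both sibpairs in canonical order."""
    a, b = canonical(pair_a), canonical(pair_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical ages; the result no longer depends on how each pair is listed.
d1 = sorted_distance((10, 14), (12, 9))
d2 = sorted_distance((14, 10), (9, 12))
print(d1 == d2)
```

The rule does remove the dependence on listing order for a single
covariate, but it says nothing principled about how to order the other
covariates (heights, weights), which is part of why it feels ad hoc.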

This is analogous to the difference between correlation and intraclass
correlation.

Any suggestions?


William D. Shannon, Ph.D.

Assistant Professor of Biostatistics in Medicine
Division of General Medical Sciences and Biostatistics

Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO   63110

Phone: 314-454-8356
Fax: 314-454-5113