## CLASS-L@LISTS.SUNYSB.EDU


Subject: Distance measure
From: shannon
Reply To: Classification, clustering, and phylogeny estimation
Date: Sat, 1 Mar 2003 15:09:17 -0600
```
I have a dataset for which I do not know how to calculate pairwise
distances.

In sibpair linkage analysis a unit of analysis is the sibpair (2 brothers,
2 sisters, or a brother and sister). The covariate vector contains
information on each member of the pair (e.g., two ages, two heights, two
weights). There is no natural way to order these covariates within a pair.

Consider the ages in two sibpairs. Let age_ij be the age of the i^th
sibling in the j^th sibpair. The data matrix could be in any of the
following four equally valid orders:

        Sibpair   Age1     Age2
        1         age_11   age_21
        2         age_12   age_22

or      1         age_21   age_11
        2         age_12   age_22

or      1         age_11   age_21
        2         age_22   age_12

or      1         age_21   age_11
        2         age_22   age_12

We want to calculate the distance between the sibpairs since these are the
units of analysis.

We can arbitrarily invoke a rule (e.g., the youngest is always the first
age) and use standard distance measures -- but such rules are ad hoc, and
the genetic linkage people are unsatisfied with this (though it is done
routinely).

This is analogous to the difference between correlation and intraclass
correlation.

Any suggestions?

Bill
---

William D. Shannon, Ph.D.

Assistant Professor of Biostatistics in Medicine
Division of General Medical Sciences and Biostatistics

Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO   63110

Phone: 314-454-8356
Fax: 314-454-5113