Hi Jim
I have not received any response.
The goal is to cluster the sibpairs and apply advanced genetic linkage
programs to the sibpairs within a cluster. The assumptions are as follows:
1. There are multiple genetic models in the population. If it is treated
as a single homogeneous population, we lose power to detect the genetic
effect due to the mixture of heterogeneous groups.
2. The covariates are distributed differently across the genetic
subgroups. Therefore, clustering on the covariates will result in
homogeneous subsets, and the within-cluster genetic models will have more
power to detect the genetic effect.
It is essential in any partitioning that sibpairs are contained in the
same subset since the genetic linkage models are fit to sibpair data.
We are beginning to discuss this problem in my group, and one suggestion I
made is based on calculating convex hulls. For example, age measured on the
two siblings can be represented by two points, (min(age), max(age)) and
(max(age), min(age)). By representing each sibpair's data as these two
points, we can fit a convex hull to the sibpair.
Similarly we can fit a convex hull around all the data using the two
points for each pair of siblings.
We might be able to measure distance as a function of the two areas
covered by the convex hulls.
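
As a rough illustration of the idea above, here is a minimal Python sketch. It is only one possible reading, not a worked-out method: the function names (sibpair_points, convex_hull, hull_area, hull_dissimilarity) are hypothetical, and the specific choice of "area of the hull around two sibpairs divided by the area of the hull around all the data" is an assumption made for the example. The result does not depend on how the siblings are ordered within a pair, but it is not a true metric: with a single covariate, two different sibpairs whose values have the same sum lie on one line and get dissimilarity 0.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pair = Tuple[float, float]


def sibpair_points(a: float, b: float) -> List[Point]:
    """Represent one sibpair by the two points (min, max) and (max, min)."""
    lo, hi = min(a, b), max(a, b)
    return [(lo, hi), (hi, lo)]


def cross(o: Point, p: Point, q: Point) -> float:
    """Cross product of vectors o->p and o->q (positive for a left turn)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points: Sequence[Point]) -> float:
    """Area of the convex hull via the shoelace formula (0 if degenerate)."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 3:
        return 0.0
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def hull_dissimilarity(pair_a: Pair, pair_b: Pair, all_pairs: Sequence[Pair]) -> float:
    """ASSUMED definition: joint hull area of two sibpairs / pooled hull area."""
    joint = sibpair_points(*pair_a) + sibpair_points(*pair_b)
    pooled = [pt for pair in all_pairs for pt in sibpair_points(*pair)]
    total = hull_area(pooled)
    return hull_area(joint) / total if total > 0 else 0.0


# Made-up ages for three sibpairs, purely for illustration.
ages = [(30, 34), (41, 45), (52, 50)]
d_12 = hull_dissimilarity(ages[0], ages[1], ages)
d_12_swapped = hull_dissimilarity((34, 30), (45, 41), ages)  # siblings relabeled
```

Here d_12 and d_12_swapped agree, because relabeling the siblings within a pair leaves the two symmetric points unchanged.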
We hope to have some ideas developed for CSNA.
Bill

William D. Shannon, Ph.D.
Assistant Professor of Biostatistics in Medicine
Division of General Medical Sciences and Biostatistics
Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO 63110
Phone: 3144548356
Fax: 3144545113
email: [log in to unmask]
web page: http://ilya.wustl.edu/~shannon
On Tue, 11 Mar 2003, F. James Rohlf wrote:
> Have you received any responses yet?
>
> What are the distances based on? Genetic data or the height and weights you
> mention? The answer to your question must depend on what you wish to do with
> these distances. Do you want to cluster the entire matrix or just compute an
> average distance between sibs vs. between families?
>
> Jim
>
> > Original Message
> > From: Classification, clustering, and phylogeny estimation
> > [mailto:[log in to unmask]]On Behalf Of shannon
> > Sent: Saturday, March 01, 2003 4:09 PM
> > To: [log in to unmask]
> > Subject: Distance measure
> >
> >
> > I have a dataset for which I do not know how to calculate pairwise distances.
> >
> > In sibpair linkage analysis a unit of analysis is the sibpair (2 brothers,
> > 2 sisters, or a brother and sister). The covariate vector contains
> > information on each of the pair (e.g., two ages, two heights, two
> > weights). There is no way to order these covariates.
> >
> > Consider the two ages on two sibpairs. Let age_ij be the age for the i^th
> > sibling in the j^th sibpair. The data matrix could be in any of the
> > following orders:
> >
> > Sibpair Age1 Age2
> > 1 age_11 age_21
> > 2 age_12 age_22
> >
> > or 1 age_21 age_11
> > 2 age_12 age_22
> >
> > or 1 age_11 age_21
> > 2 age_22 age_12
> >
> > or 1 age_21 age_11
> > 2 age_22 age_12
> >
> > We want to calculate the distance between the sibpairs since these are the
> > units of analysis.
> >
> > We can arbitrarily invoke a rule (e.g., youngest is always the first age)
> > and use standard distance measures, but these are ad hoc and the genetic
> > linkage people are unsatisfied with this (though it is done routinely).
> >
> > This is analogous to the difference between correlation and intraclass
> > correlation.
> >
> > Any suggestions?
> >
> > Bill
> > 
> >
> > William D. Shannon, Ph.D.
> >
> > Assistant Professor of Biostatistics in Medicine
> > Division of General Medical Sciences and Biostatistics
> >
> > Washington University School of Medicine
> > Campus Box 8005, 660 S. Euclid
> > St. Louis, MO 63110
> >
> > Phone: 3144548356
> > Fax: 3144545113
> > email: [log in to unmask]
> > web page: http://ilya.wustl.edu/~shannon
> >
>
>
