## CLASS-L@LISTS.SUNYSB.EDU


Subject: Re: looking for a distance measure
From: "Shannon, William"
Reply To: Classification, clustering, and phylogeny estimation
Date: Wed, 23 Sep 2009 22:48:33 -0500
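The question in the message that follows can be made concrete with a small Python sketch. It is not the poster's method and nothing in the thread endorses it: it simply encodes the two example patients from the message (plus a hypothetical third patient added for illustration) and compares two candidate distances. `responder_distance` follows the suggestion that failures be ignored and only the drug responded to be compared; `overlap_distance` is an assumed alternative that counts disagreements only on drugs both patients actually received. All function names and the `None` encoding for "not given" are assumptions.

```python
import math

# Columns are drugs A, B, C, D: 1 = effective, 0 = not effective, None = not given.
patient1 = (0, 1, None, None)      # from the message: A failed, B worked
patient2 = (1, None, 0, 0)         # from the message: C and D failed, A worked
patient3 = (None, 1, 0, None)      # hypothetical: C failed, B worked


def responder_distance(x, y):
    """Return 0.0 if both patients responded to the same drug, else 1.0.

    Ineffective drugs and their order are ignored entirely.
    """
    rx = x.index(1) if 1 in x else None   # column of the effective drug, if any
    ry = y.index(1) if 1 in y else None
    return 0.0 if rx is not None and rx == ry else 1.0


def overlap_distance(x, y):
    """Fraction of disagreeing entries among drugs both patients received.

    Returns NaN when the two patients share no drug, since the data then
    say nothing about how similar they are.
    """
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return math.nan
    return sum(a != b for a, b in shared) / len(shared)


def distance_matrix(vectors, dist):
    """Pairwise distance matrix as a list of lists."""
    return [[dist(x, y) for y in vectors] for x in vectors]


if __name__ == "__main__":
    cohort = [patient1, patient2, patient3]
    for name, dist in [("responder", responder_distance),
                       ("overlap", overlap_distance)]:
        print(name)
        for row in distance_matrix(cohort, dist):
            print("  ", row)
```

The two measures can disagree: patients 2 and 3 respond to different drugs, yet they agree on the only drug they share (C failed for both), so the choice between them depends on whether shared failures are considered informative.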
```
A follow-up based on some questions I got from members of the list.

The data for each patient will be a vector of 0's, 1's, and missing values.  Suppose patient 1 received drug A with no effect and then drug B, which was effective -- their data would be (0 1 Missing Missing).  Patient 2 receives drugs C and D with no effect, then A works, and B is never given -- their data would be (1 Missing 0 0).  Etc.

Assume the entries of the vectors correspond to drugs A, B, C, and D, where an entry is 0 if the drug was not effective, 1 if effective, and missing if not given.  Assume also that the order in which the drugs are given is random.

It may be that the order and number of ineffective drugs given should be ignored, and the distance based only on whether two patients responded to the same drug or to different drugs.

Thank you

Bill Shannon, PhD
Associate Prof. of Biostatistics in Medicine
Washington University School of Medicine
Director, Biostatistical Consulting Center
314-454-8356
________________________________________
From: Shannon, William
Sent: Wednesday, September 23, 2009 11:44 AM
Cc: Shannon, William; Farrokh Alemi
Subject: looking for a distance measure

Hi Everyone

I may be working with a data set that has the following structure and will need to develop a distance measure.  I have not had time to think carefully about it but am hoping someone might have already worked with data like this.

Patients present to the doctor with a disease and it is unknown which of four drugs they will respond to (the goal of this project is to improve our ability to predict the response, so that the correct drug can be given first).  MDs treat these patients empirically: give them drug A and see if they respond, if not give them drug B and see if they respond, etc.

We assume a patient either responds or does not, and that there is no carryover or drug-order effect (i.e., if you respond to drug B it is irrelevant whether you had already had drug A).  I also assume there is no set order in which the drugs are given.

The data for each patient will be a vector of 0’s for non response and a 1 for response, with the number of 0’s dependent on how many drugs were given empirically before a response occurred.

How do we calculate a pairwise distance matrix between patients with this data?

Thank you.

Bill Shannon, PhD
Associate Professor of Biostatistics in Medicine
Washington University School of Medicine
St. Louis, MO

314-454-8356