 Subject: Re: looking for a distance measure
Date: Wed, 23 Sep 2009
A follow-up based on some questions I got from members of the list.

The data for each patient will be a vector of 0's, 1's, and missing values.  Suppose patient 1 received drug A with no effect and then drug B, which was effective -- their data would be (0 1 Missing Missing).  Patient 2 receives drugs C and D with no effect, but A works, and B is never given -- their data would be (1 Missing 0 0).  Etc.

Assume the entries of each vector correspond to drugs A, B, C, and D, in that order, where an entry is 0 if the drug was not effective, 1 if it was effective, and missing if it was not given.  Assume also that the order in which the drugs are given is random.

It may be that the order and number of ineffective drugs given should be ignored, and the distance based only on whether two patients responded to the same drug or to different drugs.
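
To make that last idea concrete, here is a minimal sketch in Python/NumPy.  It is only one possible reading of the problem, not a settled answer: the function names (responder_drug, same_drug_distance) and the third patient are made up for illustration, missing values are coded as NaN, and two patients with no recorded response are treated as distance 0 from each other, which is an assumption that may not be appropriate.

```python
import numpy as np

# Columns are drugs A, B, C, D; 1 = effective, 0 = not effective, NaN = not given.
patients = np.array([
    [0.0, 1.0, np.nan, np.nan],   # patient 1: A failed, B worked
    [1.0, np.nan, 0.0, 0.0],      # patient 2: C and D failed, A worked
    [np.nan, 1.0, 0.0, np.nan],   # hypothetical patient 3: C failed, B worked
])

def responder_drug(row):
    """Column index of the drug the patient responded to, or -1 if none recorded."""
    hits = np.flatnonzero(row == 1)   # NaN == 1 is False, so missing entries are skipped
    return int(hits[0]) if hits.size else -1

def same_drug_distance(data):
    """Pairwise matrix: 0 if two patients responded to the same drug, 1 otherwise.

    The order and number of ineffective drugs are ignored entirely.
    Assumption: patients with no recorded response all share the code -1,
    so they are at distance 0 from one another.
    """
    drugs = np.array([responder_drug(r) for r in data])
    return (drugs[:, None] != drugs[None, :]).astype(float)

dist = same_drug_distance(patients)
print(dist)
```

Under this measure patients 1 and 3 are at distance 0 (both responded to B) and each is at distance 1 from patient 2.  It throws away all information in the 0's, so it is really just a grouping by responder drug; whether that is too crude is part of the question.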

Hi Everyone

I may be working with a data set that has the following structure and will need to develop a distance measure.  I have not had time to think carefully about it but am hoping someone might have already worked with data like this.

Patients present to the doctor with a disease, and it is unknown which of four drugs they will respond to (the goal of this project is to improve our ability to predict this, so that the correct drug can be given first).  MDs treat these patients empirically: give them drug A and see if they respond; if not, give them drug B and see if they respond; etc.

We assume a patient either responds or does not, and that there is no carry-over or drug-order effect (i.e., if you respond to drug B, it is irrelevant whether you had already had drug A).  I also assume there is no set order in which the drugs are given.

The data for each patient will be a vector of 0’s for non-response and a 1 for response, with the number of 0’s depending on how many drugs were given empirically before a response occurred.

How do we calculate a pairwise distance matrix between patients with this data?

