Classification, clustering, and phylogeny estimation
Mon, 6 Oct 2008 09:35:08 +0800
I have a question about what classification method to use, PCA or DFA.
The experiment is the following: Total metabolites were measured from
tissue of animals treated with a drug for five different durations and
each duration was repeated five times. The data thus consists of a peak
list with 25 columns (5 durations x 5 replicates) and about 2500
variables, where each variable represents one metabolite.
This experiment was repeated 5 times, thus resulting in 5 series of 25
columns each.
The questions we were asking are the following:
1. Do the drug treatments have a differential effect on the metabolism?
We tried to answer this question by using PCA. In the PCA scores plot,
the different drug treatments cluster strongly, so we would like to
believe that this means the drug treatments have a differential effect.
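
To make question 1 concrete, here is a minimal sketch of one way to check
whether the clustering in a PCA scores plot is stronger than chance. It is
not the analysis we actually ran. The data are synthetic stand-ins (the
treatment effect and noise levels are assumptions), the peak list is
transposed so that samples are rows, and the helper names pca_scores and
between_within_ratio are made up for illustration. The permutation test on
the between/within scatter ratio is a swapped-in technique, not part of PCA
itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ONE series: 25 samples (5 durations x 5 replicates)
# by 2500 metabolite peaks. A real peak list would be loaded and transposed
# so that samples are rows and peaks are columns.
n_durations, n_reps, n_peaks = 5, 5, 2500
labels = np.repeat(np.arange(n_durations), n_reps)
effect = rng.normal(0.0, 1.0, size=(n_durations, n_peaks))  # assumed effect
X = effect[labels] + rng.normal(0.0, 1.0, size=(labels.size, n_peaks))


def pca_scores(X, n_components=2):
    """Return PCA scores and explained-variance fractions via SVD."""
    Xc = X - X.mean(axis=0)  # mean-center each peak
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components]


def between_within_ratio(scores, labels):
    """Ratio of between-group to within-group scatter in score space."""
    grand = scores.mean(axis=0)
    between, within = 0.0, 0.0
    for g in np.unique(labels):
        grp = scores[labels == g]
        centroid = grp.mean(axis=0)
        between += len(grp) * np.sum((centroid - grand) ** 2)
        within += np.sum((grp - centroid) ** 2)
    return between / within


scores, explained = pca_scores(X)
ratio = between_within_ratio(scores, labels)

# Permutation test: shuffle the duration labels and recompute the ratio.
n_perm = 999
perm = np.array([
    between_within_ratio(scores, rng.permutation(labels))
    for _ in range(n_perm)
])
p_value = (1 + np.sum(perm >= ratio)) / (n_perm + 1)
print("explained variance (PC1, PC2):", explained)
print("between/within ratio:", ratio, "permutation p-value:", p_value)
```

A small p-value here would only say that the grouping by duration in the
first two components is unlikely under random labelling. It says nothing
about which metabolites drive the separation.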
2. Are the series reproducible?
Using DFA, we do see that the clusters of each series are
arranged in a very similar manner in the DFA scores plot. Now, does this
mean that our measurement was reproducible?
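
One way question 2 could be phrased as a test, sketched under assumptions:
train on four series and predict the duration of the samples in the fifth.
This is not DFA. It is a nearest-centroid rule in PCA space, used here as a
simple stand-in because 2500 variables against 25 samples per series makes
a plain discriminant analysis easy to overfit. The data are synthetic, the
series-level offset is an assumption, and leave_one_series_out_accuracy is
a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(1)

n_series, n_durations, n_reps, n_peaks = 5, 5, 5, 2500
labels = np.repeat(np.arange(n_durations), n_reps)
effect = rng.normal(0.0, 1.0, size=(n_durations, n_peaks))  # shared effect

# Each synthetic series shares the treatment effect but gets its own
# offset (assumed batch effect) and independent measurement noise.
series = []
for _ in range(n_series):
    batch = rng.normal(0.0, 0.5, size=n_peaks)
    noise = rng.normal(0.0, 1.0, size=(labels.size, n_peaks))
    series.append(effect[labels] + batch + noise)


def leave_one_series_out_accuracy(series, labels, n_components=4):
    """Predict durations in each held-out series from the other series."""
    classes = np.unique(labels)
    accuracies = []
    for held_out in range(len(series)):
        # Center every series on its own mean to remove series offsets
        # (reasonable only because the design is balanced).
        train = np.vstack([
            s - s.mean(axis=0)
            for i, s in enumerate(series) if i != held_out
        ])
        train_labels = np.tile(labels, len(series) - 1)
        test = series[held_out] - series[held_out].mean(axis=0)

        # PCA loadings are estimated from the training series only.
        mean = train.mean(axis=0)
        _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
        W = Vt[:n_components].T
        train_scores = (train - mean) @ W
        test_scores = (test - mean) @ W

        # Assign each held-out sample to the nearest training centroid.
        centroids = np.array([
            train_scores[train_labels == g].mean(axis=0) for g in classes
        ])
        dist = np.linalg.norm(
            test_scores[:, None, :] - centroids[None, :, :], axis=2
        )
        pred = classes[np.argmin(dist, axis=1)]
        accuracies.append(np.mean(pred == labels))
    return np.array(accuracies)


acc = leave_one_series_out_accuracy(series, labels)
print("accuracy per held-out series:", acc)
```

If held-out accuracy stayed high for every series, that would support the
idea that the treatment pattern is reproduced across series. Similar-looking
scores plots alone do not show this, since each plot is fitted to its own
data.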
I am not quite sure what method to use and was told that PCA is not the
way to go, because we have a priori variability in our data due to
the experimental design.
Any help is greatly appreciated!