I am not a hundred percent certain that this bears on your problem,
but let me try nevertheless.
One of the major problems in using ratios of variables could be the fact
that the ratio of two independent zero-mean normal variables is Cauchy
distributed, and the Cauchy distribution is the standard counterexample
to the usual statistical optimality results. For instance, Cauchy
distributed variables do not even have an expected value, and the
arithmetic mean of independent standard Cauchy distributed variables has
the same distribution as one single variable, i.e. the arithmetic mean
does not become any more precise when we increase the sample size.
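To make the point about the arithmetic mean concrete, here is a small simulation sketch in Python. It is only an illustration under assumptions of my own choosing: the function names, the seed, the sample sizes and the number of replicates are all hypothetical, and the variables are taken to be independent standard normals. It compares the spread (interquartile range) of the sample mean for the ratio of two normals with that for a plain normal variable.

```python
import random
import statistics


def ratio_of_normals(rng):
    """Ratio of two independent standard normal draws (standard Cauchy)."""
    return rng.gauss(0.0, 1.0) / rng.gauss(0.0, 1.0)


def iqr_of_sample_means(draw, n, replicates, rng):
    """Interquartile range of the sample mean over replicated samples of size n."""
    means = [
        statistics.fmean(draw(rng) for _ in range(n))
        for _ in range(replicates)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


rng = random.Random(0)   # fixed seed, chosen arbitrarily for reproducibility
replicates = 2000        # illustrative choice, not a recommendation

for n in (1, 100):
    ratio_iqr = iqr_of_sample_means(ratio_of_normals, n, replicates, rng)
    normal_iqr = iqr_of_sample_means(
        lambda r: r.gauss(0.0, 1.0), n, replicates, rng
    )
    print(f"n={n:4d}  IQR of mean: ratio={ratio_iqr:.2f}  normal={normal_iqr:.2f}")
```

For the normal variable the spread of the mean should shrink roughly like one over the square root of n, whereas for the ratio it should stay at about the same level for n = 1 and n = 100. The interquartile range is used instead of the standard deviation because the latter does not exist for the Cauchy distribution.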
I hope this comment is of some help, all the more so because discriminant
analysis often relies on a model in which the variables are taken to be
normally distributed, so that, in my view, taking ratios of these
variables could lead to such problems.
Prof Dr Thomas Augustin
Department of Statistics
University of Munich
Tel +49 89 2180 3520
Fax +49 89 2180 5044
Richard Wright wrote:
> There is a scattered literature on the dangers, or otherwise, of using
> ratios in correlational analyses.
> I have read what looks like a non-obfuscatory paper on this topic by
> Firebaugh and Gibbs "User's Guide to Ratio Variables" from American
> Sociological Review, Vol.50, No.5 (1985) pp.713-722.
> On page 721 the authors state: "Avoid mixed methods (part ratio, part
> component). If Z is controlled by division rather than by
> residualization, all of the other variables should be divided by Z.
> Should only some of the variables be divided by Z, the effect of Z is
> 'controlled' for some variables and not for others, and a defensible
> interpretation of the results is difficult."
> The reason for my interest is that I am trying to evaluate a
> morphometric paper that does linear discriminant analysis on a mixture
> of measurements and ratios derived from those same measurements. For
> example the analysis includes (A) Length as well as Height/Length and
> (B) Height and Breadth as well as Height/Breadth and Height/Length.
> This paper seems to be an example of the 'mixed method' that Firebaugh
> and Gibbs warn against, where data are part ratio, part measurement,
> and spurious correlations are introduced into the data.
> So my first question is whether I am correct in this interpretation.
> My second question also concerns ratios.
> In his Multivariate Statistical Methods, 2nd ed. 1994, B.F.J. Manly
> suggests controlling for the effects of absolute size difference in a
> PCA of pots (goblets) by expressing the measurements as "a proportion
> of the sum of all measurements on that goblet."
> Given that each variable is divided by the same sum, this example of
> the use of ratios seems to be a case that Firebaugh and Gibbs would
> not frown on.
> I shall welcome any comments on these questions and any pointers to
> relevant literature.
> CLASS-L list.
> Instructions: http://www.classification-society.org/csna/lists.html#class-l