CLASS-L Archives

September 2008


Thomas Augustin <[log in to unmask]>
Reply To: Classification, clustering, and phylogeny estimation
Mon, 29 Sep 2008 20:49:55 +0200
Dear Richard,

I am not one hundred percent certain whether this bears on your 
problem, but let me try nevertheless.

One of the major problems in using ratios of variables could be the fact 
that the ratio of two independent, zero-mean normal variables is Cauchy 
distributed, and the Cauchy distribution is the classic counterexample to 
many standard statistical optimality results. For instance, Cauchy 
distributed variables do not even have an expected value, and the 
arithmetic mean of standard Cauchy distributed variables has the same 
distribution as one single variable, i.e. we cannot learn more from the 
data by increasing the sample size.
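
The last point can be checked with a small simulation. The sketch below 
assumes NumPy and independent standard normal numerator and denominator; 
the function names, the seed, and the number of replications are 
illustrative choices only. It compares the interquartile range (IQR) of 
sample means for normal data, which shrinks roughly like 1/sqrt(n), with 
the IQR of sample means for ratios of normals, which should stay near 2, 
the IQR of a standard Cauchy variable.

```python
import numpy as np


def normal(rng, shape):
    # Plain standard normal draws, used as the well-behaved reference case.
    return rng.standard_normal(shape)


def normal_ratio(rng, shape):
    # Ratio of two independent standard normals, i.e. a standard Cauchy draw.
    return rng.standard_normal(shape) / rng.standard_normal(shape)


def iqr_of_sample_means(draw, n, reps, rng):
    """IQR of `reps` sample means, each computed from `n` draws."""
    means = draw(rng, (reps, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25


rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
reps = 20000                    # number of simulated samples per setting

results = {}
for n in (1, 10, 100):
    iqr_normal = iqr_of_sample_means(normal, n, reps, rng)
    iqr_ratio = iqr_of_sample_means(normal_ratio, n, reps, rng)
    results[n] = (iqr_normal, iqr_ratio)
    print(f"n={n:4d}  IQR of mean (normal): {iqr_normal:.3f}  "
          f"IQR of mean (ratio): {iqr_ratio:.3f}")
```

The IQR is used instead of the standard deviation because the variance of 
a Cauchy variable does not exist, so an empirical standard deviation would 
itself be unstable.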

I hope this comment is of some help, all the more so because discriminant 
analysis often relies on a model in which the variables are taken to be 
normally distributed, so that, in my view, taking ratios of these 
variables could lead to such problems.

Best wishes



Prof Dr Thomas Augustin
Department of Statistics
University of Munich
Ludwigstr. 33/II
D-80539 Munich

Tel +49 89 2180 3520
Fax +49 89 2180 5044
[log in to unmask]

Richard Wright wrote:
> There is a scattered literature on the dangers, or otherwise, of using
> ratios in correlational analyses. 
> I have read what looks like a non-obfuscatory paper on this topic by
> Firebaugh and Gibbs "User's Guide to Ratio Variables" from American
> Sociological Review, Vol.50, No.5 (1985) pp.713-722.
> On page 721 the authors state: "Avoid mixed methods (part ratio, part
> component). If Z is controlled by division rather than by
> residualization, all of the other variables should be divided by Z.
> Should only some of the variables be divided by Z, the effect of Z is
> 'controlled' for some variables and not for others, and a defensible
> interpretation of the results is difficult." 
> The reason for my interest is that I am trying to evaluate a
> morphometric paper that does linear discriminant analysis on a mixture
> of measurements and ratios derived from those same measurements. For
> example the analysis includes (A) Length as well as Height/Length and
> (B) Height and Breadth as well as Height/Breadth and Height/Length. 
> This paper seems to be an example of the 'mixed method' that Firebaugh
> and Gibbs warn against, where data are part ratio, part measurement,
> and spurious correlations are introduced into the data.
> So my first question is whether I am correct in this interpretation.
> My second question also concerns ratios.
> In his Multivariate Statistical Methods, 2nd ed. 1994, B.F.J. Manly
> suggests controlling for the effects of absolute size difference in a
> PCA of pots (goblets) by expressing the measurements as "a proportion
> of the sum of all measurements on that goblet."
> Given that each variable is divided by the same sum, this example of
> the use of ratios seems to be a case that Firebaugh and Gibbs would
> not frown on.
> I shall welcome any comments on these questions and any pointers to
> relevant literature.
> Richard
> ----------------------------------------------
> CLASS-L list.