James Rohlf

Thank you for getting those comments from Richard Reyment. I am unsure what alterations are referred to in the sentence 'any alteration introduced into a matrix of compositions changes the sum in a manner that is beyond control'. I obviously need to look at the Geol Soc London book to understand the implications of simplex space. Please pass my thanks to Richard Reyment. I have long admired the clarity and practicality of his book Multivariate Morphometrics.

Richard Wright 
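
The two effects discussed in the quoted message (spurious correlation from a shared divisor, and the constant-sum constraint of compositions) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three "measurements" are simulated as independent uniform variables, and all names (pearson, closure, clr, r_raw, r_ratio, r_closed) are hypothetical and come from no paper cited in this thread. The clr function is the centred log-ratio transform, one of the log-ratio transforms described in the compositional-data literature; it appears here as an example of a transform that respects the simplex, not as a recommendation for any particular data set.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def closure(row):
    """Express each measurement as a proportion of the row sum (sums to 1)."""
    total = sum(row)
    return [v / total for v in row]


def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Assumption: three mutually independent positive measurements per specimen.
rng = random.Random(0)
n = 5000
x = [rng.uniform(10, 20) for _ in range(n)]
y = [rng.uniform(10, 20) for _ in range(n)]
z = [rng.uniform(10, 20) for _ in range(n)]

# Raw x and y are independent, so their correlation should be near zero.
r_raw = pearson(x, y)

# Dividing both by the same z induces a positive correlation between the ratios.
r_ratio = pearson([a / c for a, c in zip(x, z)],
                  [b / c for b, c in zip(y, z)])

# Closing each row to a constant sum induces a negative correlation between parts.
closed = [closure([a, b, c]) for a, b, c in zip(x, y, z)]
r_closed = pearson([p[0] for p in closed], [p[1] for p in closed])

print("raw x vs y:          ", round(r_raw, 3))
print("x/z vs y/z:          ", round(r_ratio, 3))
print("closed x vs closed y:", round(r_closed, 3))
print("clr of first row:    ", [round(v, 3) for v in clr(closed[0])])
```

The measurements are independent by construction, so any correlation that appears after division or closure comes from the transformation and not from the data.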


>Subject: Re: using ratios in MV correlational analysis
>   From: "F. James Rohlf" <[log in to unmask]>
>   Date: Mon, 29 Sep 2008 21:08:43 -0400
>     To: [log in to unmask]
>
>The following are some comments by Richard Reyment who has worked on
>problems in this area:
>
>"Is this useful? Generally speaking, biologists know absolutely nothing
>about the geometry of the simplex, and this is also true of a great many
>statisticians. For the geomathematical fraternity, however, the subject is
>of great importance because it is often connected to analyses involving
>large-scale economic aspects where an inappropriate analysis can waste great
>sums of money.
>
>     G G Simpson was among the first biologists to point out that ratios
>cannot be used in correlation exercises such as indicated in the Course 1
>agenda.
>Originally, it was Karl Pearson who in 1897 proved that ratios induce
>spurious correlations. This was in relation to so-called standardised
>data-vectors.
>
>   Of recent years geomathematicians have taken the subject much further,
>following the results of the statistician John Aitchison, who proved that
>correlation coefficients are not defined in simplex space, that is the space
>in which percentages, frequencies etc lie. This is the outcome of the fact
>that such data have a constant sum and any alteration introduced into a
>matrix of compositions changes the sum in a manner that is beyond control.
>This is not a problem for open-space data of course.
>
>   Ref. John Aitchison: The Statistical Analysis of Compositional Data;
>Chapman and Hall (1986), slightly revised version reprinted in 2003.
>
>   Hence, multivariate analyses involving compositional data must be made
>using the appropriate algebra for distributions on the simplex.  
>Applying the "open-space" standard version can only lead to incorrect
>results.
>
>   Since the original work was published by Aitchison, the Applied
>Mathematicians Professors Vera Pawlowsky-Glahn and Juan José Egozcue have
>raised the bar several levels in that they introduced the concept of a
>finite dimensional Hilbert Space into the analysis of simplicial geometry.
>This leads to very elegant solutions.
>
>   An indispensable reference is the recently published volume edited by A.
>Buccianti, G. Mateu-Figueras and V. Pawlowsky-Glahn
>
>COMPOSITIONAL DATA-ANALYSIS IN THE GEOSCIENCES: FROM THEORY TO PRACTICE
>
>Published by the Geological Society of London, Special Publication No 264,
>2006 (212 pp.)
>
>  http://www.geolsoc.org.uk/bookshop
>
>
>   Best wishes
>
>Richard A. Reyment"
>
>------------------------
>F. James Rohlf, Distinguished Professor
>Ecology & Evolution, Stony Brook University
>www: http://life.bio.sunysb.edu/ee/rohlf
>
>
>> -----Original Message-----
>> From: Classification, clustering, and phylogeny estimation
>> [mailto:[log in to unmask]] On Behalf Of Richard Wright
>> Sent: Saturday, September 27, 2008 2:05 AM
>> To: [log in to unmask]
>> Subject: using ratios in MV correlational analysis
>> 
>> There is a scattered literature on the dangers, or otherwise, of using
>> ratios in correlational analyses.
>> 
>> I have read what looks like a non-obfuscatory paper on this topic by
>> Firebaugh and Gibbs "User's Guide to Ratio Variables" from American
>> Sociological Review, Vol.50, No.5 (1985) pp.713-722.
>> 
>> On page 721 the authors state: "Avoid mixed methods (part ratio, part
>> component). If Z is controlled by division rather than by
>> residualization, all of the other variables should be divided by Z.
>> Should only some of the variables be divided by Z, the effect of Z is
>> 'controlled' for some variables and not for others, and a defensible
>> interpretation of the results is difficult."
>> 
>> The reason for my interest is that I am trying to evaluate a
>> morphometric paper that does linear discriminant analysis on a mixture
>> of measurements and ratios derived from those same measurements. For
>> example the analysis includes (A) Length as well as Height/Length and
>> (B) Height and Breadth as well as Height/Breadth and Height/Length.
>> 
>> This paper seems to be an example of the 'mixed method' that Firebaugh
>> and Gibbs warn against, where data are part ratio, part measurement,
>> and spurious correlations are introduced into the data.
>> 
>> So my first question is whether I am correct in this interpretation.
>> 
>> My second question also concerns ratios.
>> 
>> In his Multivariate Statistical Methods, 2nd ed. 1994, B.F.J. Manly
>> suggests controlling for the effects of absolute size difference in a
>> PCA of pots (goblets) by expressing the measurements as "a proportion
>> of the sum of all measurements on that goblet."
>> 
>> Given that each variable is divided by the same sum, this example of
>> the use of ratios seems to be a case that Firebaugh and Gibbs would
>> not frown on.
>> 
>> I shall welcome any comments on these questions and any pointers to
>> relevant literature.
>> 
>> Richard
>> 
>> ----------------------------------------------
>> CLASS-L list.
>> Instructions: http://www.classification-society.org/csna/lists.html#class-l
>