CLASS-L Archives

June 2010

CLASS-L@LISTS.SUNYSB.EDU

Subject:
From: "F. James Rohlf" <[log in to unmask]>
Reply To: Classification, clustering, and phylogeny estimation
Date: Mon, 7 Jun 2010 07:27:40 -0400
It is a function of the standardization used in correspondence analysis. See
the papers and books by Greenacre, for example Journal of Applied Statistics,
1993, 20:251-269.
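
One way to make the role of the standardization concrete is the usual transition relation between row and column points. The notation here is an assumption introduced for illustration and is not taken from the message: p_{ij} = n_{ij}/n is the correspondence matrix, r_i its row mass, f_{ik} the principal coordinate of row i on axis k, and \gamma_{jk} the standard coordinate of column j on axis k.

```latex
\[
  f_{ik} \;=\; \sum_{j=1}^{J} \frac{p_{ij}}{r_i}\,\gamma_{jk},
  \qquad
  \frac{p_{ij}}{r_i} \ge 0,
  \qquad
  \sum_{j=1}^{J} \frac{p_{ij}}{r_i} = 1 .
\]
```

Under this scaling each row point is a weighted average of the J column points, with the row profile as weights, so it cannot fall outside the simplex they span. With J = 4 columns that simplex is a tetrahedron in 3 dimensions.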

 

----------------------

F. James Rohlf, Distinguished Professor

Dept. Ecology and Evolution, Stony Brook University, NY 11794-5245

 

From: Classification, clustering, and phylogeny estimation
[mailto:[log in to unmask]] On Behalf Of Eric Wajnberg
Sent: Monday, June 07, 2010 3:58 AM
To: [log in to unmask]
Subject: correspondence analysis with strange triangular structure

 

Dear All, 

I am doing a simple correspondence analysis on a contingency table with
more than 5000 rows (representing genes) and 4 columns. I then run a
clustering analysis on the coordinates of all genes on the factorial
axes. The results of the correspondence analysis are quite surprising to me.
The coordinates of all rows on the 3 axes lie strictly within a simplex that
is a perfect tetrahedron. Of course, the vertices of the tetrahedron
correspond to the coordinates of the columns in the factorial space. How
could such a structure arise?
  
I first thought that this was due to a specific structure in my original
dataset, so I decided to repeat the analysis on a randomly drawn
contingency table of the same size (each cell was drawn from a uniform
distribution between 0 and 100). I again obtained such a triangular
structure.
  
It thus seems that a simple correspondence analysis always produces such
a triangular structure in a space with a small number of dimensions, but I
have never heard about this before. I suspect that people usually do not see
this because they have either more dimensions or fewer rows in the original
table, so that the points appear to be spread more homogeneously over the
factorial planes.
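
The random-table experiment described above can be reproduced with a minimal NumPy sketch. It assumes that rows are plotted in principal coordinates and columns in standard coordinates; the message does not say which scaling the software used. The function name `correspondence_analysis` and the seed are hypothetical choices made for illustration. The check at the end tests whether every row point equals the average of the four column points weighted by that row's profile, which would place it inside the tetrahedron.

```python
import numpy as np


def correspondence_analysis(table):
    """Simple CA via SVD of the standardized residuals (illustrative sketch)."""
    table = np.asarray(table, dtype=float)
    P = table / table.sum()          # correspondence matrix
    r = P.sum(axis=1)                # row masses
    c = P.sum(axis=0)                # column masses
    # Standardized residuals: (P - r c^T) scaled by 1/sqrt(r_i c_j)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    k = min(table.shape) - 1         # number of non-trivial axes
    U, sv, V = U[:, :k], sv[:k], Vt[:k].T
    row_principal = (U * sv) / np.sqrt(r)[:, None]   # row principal coordinates
    col_standard = V / np.sqrt(c)[:, None]           # column standard coordinates
    row_profiles = P / r[:, None]                    # each row sums to 1
    return row_principal, col_standard, row_profiles


# Random table of the size described in the message: 5000 rows, 4 columns,
# cells drawn uniformly between 0 and 100 (the seed is arbitrary).
rng = np.random.default_rng(0)
table = rng.uniform(0, 100, size=(5000, 4))

rows, vertices, profiles = correspondence_analysis(table)

# Each row point should be a convex combination of the 4 column vertices,
# with the row profile as barycentric weights.
is_barycentric = np.allclose(rows, profiles @ vertices)
print("rows are weighted averages of column vertices:", is_barycentric)
```

If the check holds, the structure is a consequence of the scaling rather than of the data, which would be consistent with it also appearing for a random table.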


----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l

