It is a function of the
standardization used in correspondence analysis. See the papers and books by Greenacre,
for example Journal of Applied Statistics, 1993, 20:251-269.
----------------------
F. James Rohlf, Distinguished Professor
Dept. Ecology and Evolution,
Stony Brook University, NY 11794-5245
From: Classification, clustering, and phylogeny estimation, on behalf of Eric Wajnberg
Sent: Monday, June 07, 2010 3:58 AM
Subject: correspondence analysis with strange triangular structure
Dear All,
I am doing a simple correspondence analysis on a contingency table with more
than 5000 rows (representing genes) and 4 columns. I then run a
clustering analysis on the coordinates of all genes on the factorial axes. The
results of the correspondence analysis are quite surprising to me. The
coordinates of all rows on the 3 axes lie strictly within a simplex that is a
perfect tetrahedron. Of course, the vertices of the tetrahedron
correspond to the coordinates of the columns in the factorial space. How
could such a structure arise?
I first thought that this was due to a specific structure in my original
dataset, so I decided to repeat the analysis on a randomly drawn
contingency table of the same size (each cell was drawn from a uniform
distribution between 0 and 100). I again obtained such a triangular structure.
It thus seems that a simple correspondence analysis always produces such a
triangular structure in a space with a small number of dimensions, but I have never
heard of this before. I suspect that people usually do not see this
because they have either more dimensions or fewer rows in the original table, so that
the points appear to be spread over the factorial planes in a more homogeneous way.
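
The random-table experiment described above can be sketched in a few lines of
Python with NumPy. This is only an illustration under assumptions, not the
software or scaling actually used in the thread: it assumes rows are plotted in
principal coordinates and columns in standard coordinates, and all variable
names (N, P, F, G, W) are made up for the sketch. Under that scaling each row
point is a weighted average of the four column points, with the row profile as
weights, which would place every row inside the tetrahedron spanned by the
columns.

```python
import numpy as np

# Random 5000 x 4 table, each cell uniform on (0, 100), as in the message.
rng = np.random.default_rng(0)
N = rng.uniform(0, 100, size=(5000, 4))

# Correspondence matrix and row/column masses.
P = N / N.sum()
r = P.sum(axis=1)
c = P.sum(axis=0)

# Standardized residuals and their singular value decomposition.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Keep the 3 non-trivial axes (4 columns give at most 3 dimensions).
k = min(N.shape) - 1
U, sv, Vt = U[:, :k], sv[:k], Vt[:k]

# Assumed scaling: rows in principal, columns in standard coordinates.
F = (U * sv) / np.sqrt(r)[:, None]   # row points, shape (5000, 3)
G = Vt.T / np.sqrt(c)[:, None]       # column points (vertices), shape (4, 3)

# Row profiles: non-negative weights that sum to 1 within each row.
W = P / r[:, None]

# Each row point equals the profile-weighted average of the column points,
# so it lies in the convex hull (tetrahedron) of the four column points.
print(np.allclose(F, W @ G))
```

Because the weights in W are non-negative and sum to 1, the check expresses
the rows as convex combinations of the column points. If both rows and columns
were instead plotted in principal coordinates, the column points would be
pulled toward the origin and would no longer sit at the vertices.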