CLASS-L Archives

September 2004

CLASS-L@LISTS.SUNYSB.EDU

Sender:
"Classification, clustering, and phylogeny estimation" <[log in to unmask]>
Date:
Mon, 6 Sep 2004 09:26:28 -0400
Reply-To:
"Classification, clustering, and phylogeny estimation" <[log in to unmask]>
Subject:
From:
Art Kendall <[log in to unmask]>
In-Reply-To:
<001101c493c9$405ca720$2402a8c0@orchid>
Organization:
Social Research Consultants
It has been many years since I was current on the factor analysis literature.
I have retired and no longer have access to databases of abstracts like
DIALOG, ORBIT, or PsycINFO.  If you have a friend in a university or a
government agency, they might be able to do a search for you.

Since most clustering grew up around grouping cases (rows in the
original data matrix), how is transposing the data matrix and using the
same algorithms problematic in clustering variables (columns)?  Just the
opposite: one of the oldest methods of clustering cases was to
standardize, then transpose the data matrix and factor it.  (This
approach was big in the 1960s and 1970s.)
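
As a rough illustration of the standardize-then-transpose idea, here is a
minimal Python sketch (not SPSS syntax).  The toy data matrix and the helper
names zscore_columns, transpose, and pearson are made up for the example.  It
z-scores each variable, transposes so that cases become columns, and
correlates the cases; that case-by-case correlation matrix is what one would
then hand to a factoring routine.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each variable (column) to mean 0 and SD 1.
    Assumes no variable is constant (SD of 0 would divide by zero)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in rows]

def transpose(rows):
    """Swap rows and columns."""
    return [list(col) for col in zip(*rows)]

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((y - mb) ** 2 for y in b)
    return cov / (ss_a * ss_b) ** 0.5

# Hypothetical data: 4 cases (rows) by 4 variables (columns).
data = [
    [1.0, 2.0, 9.0, 8.0],
    [2.0, 1.0, 8.0, 9.0],
    [9.0, 8.0, 1.0, 2.0],
    [8.0, 9.0, 2.0, 1.0],
]

z = zscore_columns(data)   # standardize the variables first
zt = transpose(z)          # now rows = variables, columns = cases
case_columns = list(zip(*zt))

# Correlations among the columns of the transposed matrix, i.e. among cases.
case_corr = [[pearson(a, b) for b in case_columns] for a in case_columns]

for row in case_corr:
    print(["%6.2f" % r for r in row])
```

In this toy example cases 1 and 2 correlate strongly with each other and
negatively with cases 3 and 4, which is the kind of pattern a factoring of the
transposed matrix would pick up as two groups of cases.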

I have a gut feeling (not a thought-out opinion) that an oblique
solution means that you end up with measures that do not have
discriminant validity.

SPSS has had many varieties of factor analysis for many years.  It has
used 2 kinds of data, 7 kinds of extraction, and 5 kinds of rotation.
(70 different "methods"!) Maybe some of those combinations would meet
your needs.  [For those of us who use methods that others create, it
sure would be nice if someone were to use this framework and produce a
document advising on when to use the options.]

To get details such as algorithms and literature citations, go to
http://support.spss.com/
login as "guest"
password "guest"
<statistics>
<algorithms>
then <catpca> <catreg> <cluster> <discriminant> <factor> <overals>
<proximities> <quick cluster> <twostep cluster>

The ANSWERTREE add-on and the new TREE procedure in the base module may
also be relevant.


Kinds of data: SPSS can work on a correlation matrix or a covariance
matrix.  In psych, the means of variables are usually arbitrary, so
correlations are more common.  However, much of the development of
factoring came from psych and ed.  Perhaps the math psych list would have
people who are more current.
Society for Mathematical Psychology: MPSYCH Listserv
<http://aris.ss.uci.edu/smp/mpsych.html>

Quote from SPSS about the extractions available:
Available methods are principal components, unweighted least squares,
generalized least squares, maximum likelihood, principal axis factoring,
alpha factoring, and image factoring.
End quote.
There are more details in the <help>.

Quote from SPSS <help> about the rotations available:

    *
      Varimax Method. An orthogonal rotation method that minimizes the
      number of variables that have high loadings on each factor. It
      simplifies the interpretation of the factors.
    *
      Direct Oblimin Method. A method for oblique (nonorthogonal)
      rotation. When delta equals 0 (the default), solutions are most
      oblique. As delta becomes more negative, the factors become less
      oblique. To override the default delta of 0, enter a number less
      than or equal to 0.8.
    *
      Quartimax Method. A rotation method that minimizes the number of
      factors needed to explain each variable. It simplifies the
      interpretation of the observed variables.
    *
      Equamax Method. A rotation method that is a combination of the
      varimax method, which simplifies the factors, and the quartimax
      method, which simplifies the variables. The number of variables
      that load highly on a factor and the number of factors needed to
      explain a variable are minimized.
    *
      Promax Rotation. An oblique rotation, which allows factors to be
      correlated. It can be calculated more quickly than a direct
      oblimin rotation, so it is useful for large datasets.

End quote.
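
To make the varimax idea concrete, here is a small Python sketch for the
two-factor case only.  It is not SPSS's algorithm: it uses the raw varimax
criterion (the sum over factors of the variance of the squared loadings,
without Kaiser normalization) and the closed-form angle for a single pair of
factors.  The loadings and the function names are hypothetical.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: for each factor, the variance of the
    squared loadings, summed over factors."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        mean_sq = sum(sq) / p
        total += sum((s - mean_sq) ** 2 for s in sq) / p
    return total

def rotate(loadings, phi):
    """Orthogonal rotation of two-factor loadings by angle phi (radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]

def varimax_angle(loadings):
    """Angle that maximizes the raw varimax criterion for two factors."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2.0 * x * y for x, y in loadings]
    mean_u, mean_v = sum(u) / p, sum(v) / p
    var_u = sum((a - mean_u) ** 2 for a in u) / p
    var_v = sum((b - mean_v) ** 2 for b in v) / p
    cov_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / p
    return math.atan2(2.0 * cov_uv, var_u - var_v) / 4.0

# Hypothetical unrotated loadings: 4 variables on 2 factors.
loadings = [(0.7, 0.5), (0.6, 0.6), (0.5, -0.5), (0.6, -0.4)]

phi = varimax_angle(loadings)
rotated = rotate(loadings, phi)

print("criterion before:", varimax_criterion(loadings))
print("criterion after: ", varimax_criterion(rotated))
for x, y in rotated:
    print("%6.3f %6.3f" % (x, y))
```

Because the rotation is orthogonal, each variable's communality (the sum of
its squared loadings) is unchanged; only the way it is split across the two
factors moves.  With more than two factors, the usual approach is to repeat
this kind of pairwise rotation over all factor pairs until the criterion stops
improving.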

Art
[log in to unmask]
Social Research Consultants
University Park, MD  USA
(301) 864-5570


Wolfgang M. Hartmann wrote:

> Thank you for the nice response,
> I know that in practice transposing the matrix is common, but I do not
> think
> of it as a very valid approach. (Higher order) Factor analysis with
> oblique rotation
> and restrictions penalizing nonzero loadings would sound good to me.
> Would
> you know of any references for such an approach?
> Wolfgang
>
>
>
>     In SPSS all of the few dozen Proximity (similarity) measures can
>     be applied to variables (after the data are transformed and
>     transposed).  The Proximity matrix can then be read into the
>     variety of cluster procedures.  Or the transposed data can be read
>     directly into the CLUSTER or QUICK CLUSTER procedure.  I see no
>     reason (given that you want to cluster variables) that the TWOSTEP
>     cluster could not read a transposed data matrix.
>
>
>
>
>     Of course there are all of the varieties of factor analysis which
>     are more commonly used to group variables.  The CATPCA procedure
>     factors categorical variables.
>
>     When the variables are used to classify or differentiate a
>     categorical variable, there are procedures like DISCRIMINANT  or
>     the various
>
>

