CLASS-L Archives

February 2008


Art Kendall <[log in to unmask]>
Reply To: Classification, clustering, and phylogeny estimation
Mon, 4 Feb 2008 16:44:32 -0500
This could be handled as 33 cases and 78 variables in a TWOSTEP
procedure, defining the variables as nominal.

Another approach would be to reduce the data to 13 rank variables
according to how often each sound was preferred, and then cluster
cases on the 13 variables.
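
A minimal Python sketch of this rank reduction, for one subject. The response coding (1 = first sound preferred, 2 = second sound preferred, 0 = no preference) and the ordering of the 78 pairs are assumptions, not something stated in the thread; ties are simply not counted for either sound, which is one choice among several.

```python
from itertools import combinations

N_SOUNDS = 13
# Assumed pair order: (0,1), (0,2), ..., (11,12), giving 78 pairs.
PAIRS = list(combinations(range(N_SOUNDS), 2))


def preference_counts(responses):
    """Count, for one subject, how often each sound was preferred.

    Assumed coding: 1 = first sound of the pair preferred,
    2 = second sound preferred, 0 = no preference (counted for neither).
    """
    if len(responses) != len(PAIRS):
        raise ValueError("expected one response per pair")
    counts = [0] * N_SOUNDS
    for (first, second), answer in zip(PAIRS, responses):
        if answer == 1:
            counts[first] += 1
        elif answer == 2:
            counts[second] += 1
    return counts


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Made-up subject who always prefers the first sound of each pair.
subject = [1] * len(PAIRS)
counts = preference_counts(subject)
ranks = average_ranks(counts)
```

Applying both functions to each of the 33 subjects would give a 33 x 13 table of ranks, which can then go into whatever clustering procedure is chosen.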

Another way you could construct your input data would be like this.
You have 33 lower triangular matrices with 13 rows and 13 columns.
You can recode your data, e.g., -1 meaning the row sound was
preferred, zero meaning a tie, and +1 meaning the column sound was
preferred.
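
The recoding into one lower triangular matrix per subject could be sketched in Python roughly as follows. The raw coding (1 = first sound preferred, 2 = second sound preferred, 0 = no preference), the pair ordering, and the convention that the first sound of a pair is the lower-numbered one (so it becomes the column) are all assumptions made for illustration.

```python
from itertools import combinations

N_SOUNDS = 13
# Assumed pair order: (0,1), (0,2), ..., (11,12); first index < second index.
PAIRS = list(combinations(range(N_SOUNDS), 2))

# Assumed raw codes mapped to the -1 / 0 / +1 scheme:
# the second sound of a pair is the row, the first sound is the column.
RECODE = {
    1: +1,  # first (column) sound preferred
    2: -1,  # second (row) sound preferred
    0: 0,   # no preference (tie)
}


def lower_triangular(responses):
    """Build a 13 x 13 matrix for one subject, filled below the diagonal only.

    Cells on or above the diagonal stay None because no pair maps to them.
    """
    if len(responses) != len(PAIRS):
        raise ValueError("expected one response per pair")
    matrix = [[None] * N_SOUNDS for _ in range(N_SOUNDS)]
    for (first, second), answer in zip(PAIRS, responses):
        # second > first, so (row=second, column=first) lies below the diagonal.
        matrix[second][first] = RECODE[answer]
    return matrix


# Made-up subject: prefers the second sound in the first pair, ties elsewhere.
example = [2] + [0] * (len(PAIRS) - 1)
matrix = lower_triangular(example)
```

Running this once per subject would produce the 33 matrices; how they need to be laid out for a particular scaling procedure depends on that procedure's input format.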

This sounds more like multidimensional preference scaling (unfolding),
which can be done with PROXSCAL in SPSS.

Art Kendall
Social Research Consultants

Arnaud Trollé wrote:
> Thank you all for your help.
> Sorry, you're right, Art, I've been too vague; I should have begun by defining
> the framework of my study:
> During a listening test, 33 subjects were presented with 78 pairs of sounds (i.e.
> the number of possible pairings of 13 sounds). For each pair, the
> subject is asked to indicate which sound he or she prefers, with three possibilities: "first
> sound preferred", "second sound preferred", and "no preference". So my
> data set consists of 33 cases on 78 categorical variables (all with 3 categories).
> Before any other analysis, my first objective is to find out whether there exist
> any subgroups of subjects with distinct preference logics.
> So, my approach is exploratory. However, if there are any subgroups (each
> of a meaningful size), I'm expecting at most a small number of
> subgroups. Thus, I first turned to partitioning methods such as k-modes,
> of which I've heard. But I have too little experience to judge whether
> this method is among the best suited to my case.
> I hope these details help clarify my initial question.
> Best Regards,
> Arnaud.
> ______________________________________
> From: Classification, clustering, and phylogeny estimation [CLASS-[log in to unmask]]
> on behalf of Art Kendall [[log in to unmask]]
> Sent: Monday, 4 February 2008 18:30
> To: [log in to unmask]
> Subject: Re: About Partitioning Categorical Data ...
> Please tell us more about your application. Are the values ordered? Are
> you trying to find groups of variables or groups of cases (rows,
> subjects, entities)?
> How many cases (rows) do you have?  How many variables?  Do all of the
> variables have 3 values? Are you trying to see how an existing partition
> of cases or variables works with other cases or variables?
> Often it is helpful to us to know the substantive meaning of your
> variables, and what a case represents.
> SPSS is widely available but there are also many specific purpose
> programs around depending on what you are trying to do.
> If SPSS itself does not have a procedure, you can call any R procedures
> from within SPSS. So you might be able to use several procedures.
> If you are partitioning variables into sets, then you might look at
> Categorical Principal Components analysis (CATPCA).
> If you are partitioning cases into sets, then you might look at TWOSTEP,
> which clusters cases based on either/both categorical and continuous
> variables.
> If you have an existing 3-value variable and want to see how the
> cases with each value differ on other variables, TREES, CATREG, and DISCRIMINANT
> might be what you could use.
> If you have three sets of variables, you can confirm how well a three
> factor solution fits in CATPCA by specifying the number of factors you want.
> Art Kendall
> Social Research Consultants
> Arnaud Trollé wrote:
>> Hello,
>> I'd like to cluster categorical data (3 categories) by means of a partitioning
>> method; I'm quite a beginner in that field and I would need to be enlightened.
>> From a literature review I carried out on that topic, it appeared to me
>> that one method is often used: the k-modes method. From experience,
>> could anyone confirm or deny that this is the case? If not, which method
>> could be more "powerful"?
>> Thanks in advance.
>> Best Regards.
>> Arnaud.
>> PhD Student in Acoustics.
>> Lyon, France.
>> ----------------------------------------------
>> CLASS-L list.
>> Instructions:

CLASS-L list.