CLASS-L Archives

March 2004

CLASS-L@LISTS.SUNYSB.EDU

Subject:
From:
Yudi Agusta <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Mon, 22 Mar 2004 09:11:58 +1100

Hi Sandra,

I am currently working on probability-based classification methods. It
sounds as though your dataset could also be analysed with methods of this
kind. If you are willing, I would be happy to try analysing your dataset. The
methods I use, especially the one based on the Minimum Message Length (MML)
principle, are very good for analysing datasets with highly overlapping
groups, and all of them work in an unsupervised way.
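
To give a rough idea of what "probabilistic" and "unsupervised" mean here,
below is a minimal sketch of a two-component Gaussian mixture fitted by EM on
a single variable. It is only an illustration: it uses plain maximum
likelihood, not the Minimum Message Length criterion, and the function names
and the example values are made up for this message. The point is that each
observation receives a membership probability for every group instead of a
hard label, which is what makes this kind of model usable when groups overlap.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(data, n_iter=100):
    """Fit a 1-D, two-component Gaussian mixture by EM (maximum likelihood).

    Hypothetical helper for illustration only; it does not implement MML.
    Returns (weights, means, variances, memberships), where memberships[i][k]
    is the probability that observation i belongs to component k, taken from
    the final E-step.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)

    # Crude starting point: one component at each extreme of the data.
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    memberships = []

    for _ in range(n_iter):
        # E-step: soft assignment of every observation to both components.
        memberships = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            memberships.append([d / total for d in dens])

        # M-step: re-estimate each component from its weighted observations.
        for k in range(2):
            nk = sum(m[k] for m in memberships)
            weights[k] = nk / n
            means[k] = sum(m[k] * x for m, x in zip(memberships, data)) / nk
            var = sum(m[k] * (x - means[k]) ** 2 for m, x in zip(memberships, data)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component

    return weights, means, variances, memberships


# Invented values: two loose groups plus one observation in between.
example_data = [-0.5, 0.1, 0.3, -0.2, 0.4, 1.5, 2.8, 3.1, 3.3, 2.9, 3.5]
weights, means, variances, memberships = fit_two_component_mixture(example_data)
for x, m in zip(example_data, memberships):
    print(f"x={x:5.2f}  P(group 1)={m[0]:.3f}  P(group 2)={m[1]:.3f}")
```

With several variables one would use multivariate components and choose the
number of groups with a model-selection criterion (MML being one option), but
the soft-assignment idea stays the same.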

Kind Regards,
--
Yudi Agusta
PhD candidate in Computer Science
School of CSSE, Monash University,
Victoria, 3800 Australia.
Tel: +61-3-99055190


On Sat, 20 Mar 2004 00:15, you wrote:
> Hi,
>
> I have performed a cluster analysis on a medical dataset consisting of 100
> children measured on 4 variables.
>
> The dendrograms suggested there were three groups, so I did a k-means
> clustering with k=3. I didn't set the initial centroids of the k-means to
> the centres from the hierarchical clustering, and the two types of
> clustering did not reproduce the same partition. Arnold's test for clusters
> proved to be non-significant. YET, I managed to find two groups of children
> who had a very different profile on the 4 clustered variables and a
> similar response on a 5th variable, which was very surprising.
>
> Now, I understand I haven't identified 3 groups of very different children;
> everything so far suggests there are no sharply differing groups. I cannot
> make any inferences from my sample, obviously. But could I say I have found
> some sort of multivariate thresholds on the basis of the matrix of
> distances, which allow me to gain a certain insight into the data? Or is
> it all just a big fluke, not worth the paper it's written on?!
>
> I welcome any comments/suggestions. I am only new to the topic (and the
> list), but I am keen to learn!
>
> Thanks for your time so far
>
> Sandra
>
> Sandra Alba
> University Medicine - Level 7
> Derriford Hospital
> Plymouth
> PL6 8DH
> UK
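
On the point in the quoted message that the hierarchical and k-means
solutions did not give the same partition: one way to put a number on how far
apart two partitions are is the adjusted Rand index (1 means identical up to
relabelling, values near 0 mean agreement no better than chance). The sketch
below is a plain-Python illustration with hypothetical function names. It
also includes a small helper that computes group centres from a set of
labels, which could serve as starting centroids for k-means if one wanted the
two analyses to start from the same place.

```python
from collections import Counter


def pairs(n):
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) // 2


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same observations")
    n = len(labels_a)

    # Pairs placed together in both partitions, in A only, and in B only.
    together_both = sum(pairs(c) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(pairs(c) for c in Counter(labels_a).values())
    together_b = sum(pairs(c) for c in Counter(labels_b).values())

    total_pairs = pairs(n)
    if total_pairs == 0:
        return 1.0  # fewer than two observations: nothing to disagree on
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2.0
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions have one group
    return (together_both - expected) / (maximum - expected)


def centroids_from_labels(points, labels):
    """Mean of each group, e.g. to seed k-means with hierarchical centres."""
    sums, counts = {}, Counter(labels)
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for j, value in enumerate(point):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}
```

For example, `adjusted_rand_index(hier_labels, kmeans_labels)` on the 100
children would summarise the disagreement in one figure, where `hier_labels`
and `kmeans_labels` stand for the two label vectors (names assumed here).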
