I am currently working on probabilistic classification methods. It sounds as
though your dataset could also be analysed with methods of this kind. If you
are willing, I could try analysing it for you. The methods I use, especially
the one based on the Minimum Message Length (MML) principle, are well suited
to datasets with highly overlapping groups, and all of them work in an
unsupervised way.
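
To give a rough idea of what a probabilistic classification looks like in
practice, here is a minimal sketch. It is not an MML method: it fits a plain
two-component Gaussian mixture by maximum-likelihood EM, on a single variable
for brevity (your data have four). The function names and the simulated data
are made up for illustration. The point is only that each observation gets a
membership probability for every group rather than a hard assignment, which
is what makes the approach usable when groups overlap.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(data, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM (maximum likelihood).

    Hypothetical helper, not an MML estimator. Returns
    (weights, means, sds, resp), where resp[i][k] is the estimated
    probability that observation i belongs to component k.
    """
    lo, hi = min(data), max(data)
    spread = (hi - lo) or 1.0
    # Crude, deterministic starting values: one component towards each end.
    weights = [0.5, 0.5]
    means = [lo + 0.25 * spread, lo + 0.75 * spread]
    sds = [spread / 4.0, spread / 4.0]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for each observation.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            if total == 0.0:
                # Both densities underflowed: fall back to an even split.
                resp.append([0.5, 0.5])
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)  # floor stops a component collapsing
    return weights, means, sds, resp


if __name__ == "__main__":
    # Simulated, heavily overlapping groups (an assumption, not real data).
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(50)]
    sample += [rng.gauss(2.5, 1.0) for _ in range(50)]
    weights, means, sds, resp = fit_two_component_mixture(sample)
    print("estimated weights:", weights)
    print("estimated means:  ", means)
    print("membership probabilities of the first observation:", resp[0])
```

A proper analysis would also have to choose the number of groups, which is
where a criterion such as MML comes in; the sketch fixes it at two.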
PhD candidate in Computer Science
School of CSSE, Monash University,
Victoria, 3800 Australia.
On Sat, 20 Mar 2004 00:15, you wrote:
> I have performed a cluster analysis on a medical dataset consisting of 100
> children measured on 4 variables.
> The dendrograms suggested there were three groups, so I did a k-means
> clustering with k=3. I didn't set the initial centroids of the k-means to
> the centres of the hierarchical clustering, and the two types of clustering
> did not reproduce the same partition. Arnold's test for clusters proved to
> be non-significant. YET, I managed to find two groups of children who had a
> very different profile on the 4 variables clustered and a similar response
> on a 5th variable, which was very surprising.
> Now, I understand I haven't identified 3 groups of very different children;
> everything so far suggests there are no sharply differing groups. I cannot
> make any inferences from my sample, obviously. But could I say I have found
> some sort of multivariate thresholds on the basis of the matrix of
> distances, which allow me to gain a certain insight into the data? Or is
> it just all a big fluke, not worth the paper it's written on?!
> I welcome any comments/suggestions. I am only new to the topic (and the
> list), but I am keen to learn!
> Thanks for your time so far
> Sandra Alba
> University Medicine - Level 7
> Derriford Hospital
> PL6 8DH