I'm not sure the problem we are trying to solve is the one that was
actually asked. I saw an email in the last couple of days from the
original poster, and I believe the issue is to generate 3 random samples
(without replacement) from a small number of observations such that each
sample has the same distribution as the original data.

I may be confusing this, so perhaps the original poster could resend a
description of the problem to the list.
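
If my reading is right, a minimal sketch of the idea would be to shuffle
the observations once and deal them out into three disjoint groups. This
is only an illustration in Python using the standard library; the
function name, the example data, and the seed are all made up here and
do not come from the original post. Each group is then a simple random
sample drawn without replacement, so it follows the original
distribution only in expectation. With so few observations, any one
group can still look quite different from the full data.

```python
import random

def split_into_random_samples(observations, n_samples=3, seed=None):
    """Randomly partition observations into n_samples disjoint groups.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    shuffled = list(observations)   # copy, so the input is left untouched
    rng.shuffle(shuffled)           # uniform random permutation
    # Deal out round-robin so group sizes differ by at most one.
    return [shuffled[i::n_samples] for i in range(n_samples)]

# Made-up example data: nine observations split into three samples.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7]
samples = split_into_random_samples(data, n_samples=3, seed=1)
for k, s in enumerate(samples, start=1):
    print(f"sample {k}: {s}")
```

Every observation lands in exactly one group, which is the "without
replacement" part. If the samples also had to match the original data on
some summary (means, correlations), one would need to repeat the split
and keep a good one, or stratify, and that is a different problem.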
Bill Shannon
> I agree. I don't yet see the point of why this should be done.
> >
> > >2) On clustering with R1=R2=R3=R. k-means clustering implicitly assumes
> > > clusters to have a unit correlation matrix. So transforming the data to
> > > unit covariance and then applying 3-means will give clusters with
> > > approximately R1=R2=R3=R.
> >
> > R1=R2=R3, maybe but =R???
> >
> > Surely it is most unlikely that the overall correlation structure
> > would mirror the within-cluster structure? It is also hard to think
> > why that might be desirable. If it were, then an obvious way to
> > achieve it would be to randomly allocate the data points to the
> > three clusters.
Murray Jorgensen
> >
> > > Maybe even better with a Gaussian mixture
> > > model where covariance matrices of the clusters are restricted to cI,
> > > where I is unit matrix and c may depend on the cluster. This again has
> > > to be applied to data which is sphered, i.e. transformed to unit
> > > covariance first. I hope this "covariance model" can be found
> > > in mclust, mentioned previously in this discussion.
Christian Hennig