Dear all,
thank you for all your suggestions, which I greatly appreciate and will take
into account in my work.
Some of you asked about the motivation for such a partitioning: to be more
concrete, it concerns the cross-validation
of some econometric regression models.
In other words we want to split the data into three subsamples that should
be as homogeneous as possible in terms of the correlation patterns among
all variables (dependent and regressors).
We need to do this because our modelling tool uses out-of-sample
forecasting ability to suggest the best model.
I hope this helps you to better understand the purpose of my
query.
We have already tried random allocation of the data, but we want to
implement routines in our tool that do this in a more objective and
optimal way.
This is because the random solution almost always results in
unsatisfactory partitions when the data set is very small (e.g. 50 obs).
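To make the goal more concrete, here is a rough sketch of one possible
criterion, not what our tool currently does. It draws many random
three-way splits and keeps the one whose subsample correlation matrices
are closest to the full-sample correlation matrix. The function names, the
Frobenius-norm distance and the number of trials are all assumptions chosen
for illustration; it also assumes no variable is constant within a
subsample, since the correlations would then be undefined.

```python
import numpy as np


def correlation_distance(data, idx, full_corr):
    """Frobenius distance between a subsample's correlation matrix
    and the full-sample correlation matrix (hypothetical criterion)."""
    sub_corr = np.corrcoef(data[idx], rowvar=False)
    return np.linalg.norm(sub_corr - full_corr)


def best_random_partition(data, n_parts=3, n_trials=2000, seed=0):
    """Try n_trials random splits of the rows of `data` into n_parts
    subsamples and return the split with the smallest worst-case
    correlation distance. `data` is (observations x variables) and
    holds the dependent variable and the regressors together."""
    rng = np.random.default_rng(seed)
    n_obs = data.shape[0]
    full_corr = np.corrcoef(data, rowvar=False)

    best_score, best_parts = np.inf, None
    for _ in range(n_trials):
        # Random allocation: shuffle row indices, cut into near-equal parts.
        perm = rng.permutation(n_obs)
        parts = np.array_split(perm, n_parts)
        # Score a split by its least homogeneous subsample.
        score = max(correlation_distance(data, p, full_corr) for p in parts)
        if score < best_score:
            best_score, best_parts = score, parts
    return best_parts, best_score
```

This is still a random search rather than a true optimum; a swap-based
local search starting from the best split found would be a natural
refinement under the same criterion.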
I apologise if the matter is not strictly related to cluster analysis.
Maybe it can be considered an allocation/optimization problem.
