I am not sure if this is exactly the reference you need, but the following may be
of interest because it shows one method for avoiding the computation of an entire
distance matrix. It was intended for quickly finding near neighbors, but it
should also be useful for finding distant neighbors (I looked for points in
the same partition, whereas you would be interested in points not in the same
partition). The main limitation for my application was that it works best for
large numbers of points in very low-dimensional spaces.
Rohlf, F. J. 1977. A probabilistic minimum spanning tree algorithm.
Information Processing Letters, 7:44-48.
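To illustrate the partition idea (this is only a hedged sketch, not the algorithm from the paper): points can be bucketed into grid cells so that distance comparisons are restricted to a cell and its adjacent cells, avoiding the full N x N distance matrix. The cell size and the neighbor-scan rule below are assumptions chosen for illustration; like the paper's method, it works best in low-dimensional spaces, since the number of adjacent cells grows as 3^d.

```python
from collections import defaultdict
from itertools import product
import math

def near_pairs(points, cell_size):
    """Yield (i, j, d) for pairs of points with distance <= cell_size,
    checking only points in the same or adjacent grid cells."""
    cells = defaultdict(list)
    for i, p in enumerate(points):
        # assign each point to the grid cell containing it
        key = tuple(int(math.floor(c / cell_size)) for c in p)
        cells[key].append(i)
    dim = len(points[0])
    seen = set()
    for key, members in cells.items():
        # compare against this cell and all 3^d - 1 adjacent cells
        for offset in product((-1, 0, 1), repeat=dim):
            neighbor = tuple(k + o for k, o in zip(key, offset))
            for i in members:
                for j in cells.get(neighbor, ()):
                    if i < j and (i, j) not in seen:
                        d = math.dist(points[i], points[j])
                        if d <= cell_size:
                            seen.add((i, j))
                            yield i, j, d
```

Only pairs that could possibly be within `cell_size` of each other are ever compared, so well-separated points never contribute a distance evaluation.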
> Original Message
> From: Classification, clustering, and phylogeny estimation
> [mailto:[log in to unmask]]On Behalf Of Wolfgang M. Hartmann
> Sent: Wednesday, April 17, 2002 9:45 AM
> To: [log in to unmask]
> Subject: Name and Reference of Algorithm
>
>
> Hi:
> For some time I've been working on a problem of sampling a set of K
> observations (cases) from a large data set with N >> K cases so that
> the selected observations are as "different as possible". In more
> mathematical terms, I'm interested in locating those K cases which
> will result in a (not necessarily Euclidean) distance matrix in which
> the smallest off-diagonal entry d_ij is as large as possible.
>
> I have developed an algorithm which seems to work very well and
> generates sets which are either optimal or close to optimality without
> computing the entire distance matrix. However, I'm thinking more
> and more that this may be a known problem to people who work in
> Cluster Analysis, MDS, or classification. I wonder if anybody on
> this list could point me to some references about this search
> problem.
>
> Thanks, Wolfgang Hartmann
>
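The problem described in the quoted message is often called max-min dispersion. A standard baseline for it, sketched below, is the greedy "farthest point" heuristic; this is NOT Hartmann's algorithm, just a common reference point, and like his approach it avoids computing the full distance matrix, needing only about N*K distance evaluations.

```python
import math

def farthest_point_sample(points, k, dist=math.dist):
    """Greedily pick k indices so that each new point is as far as
    possible from the points already chosen, which tends to make the
    smallest pairwise distance among the chosen set large.
    Starting from point 0 is an arbitrary choice."""
    chosen = [0]
    # min distance from every point to the chosen set so far
    mind = [dist(p, points[0]) for p in points]
    while len(chosen) < k:
        # the point farthest from the current selection
        nxt = max(range(len(points)), key=lambda i: mind[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = dist(p, points[nxt])
            if d < mind[i]:
                mind[i] = d
    return chosen
```

The `dist` parameter can be any distance function, matching the "not necessarily Euclidean" requirement in the quoted message.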
