CLASS-L Archives

June 2000

CLASS-L@LISTS.SUNYSB.EDU

Subject:
From:
Ognian Asparoukhov <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Sat, 24 Jun 2000 09:10:32 +0300
At 11:34 23.6.2000, you wrote:
>Hi everyone,
>can someone tell me the limits on the number of variables in relation to
>sample size? Are there any good references on this topic? Thanks in advance,
>Winnie

A good empirical rule for discriminant analysis (classification) is:

N>=p, where

N is the number of observations (sample size);
p is the number of variables.
However, this rule applies to the two-class case.

In general, the minimum sample size depends on the procedure you will use.
Parametric statistical procedures require a smaller N,
while nonparametric ones require a larger N.
Still, you should use as many training observations as possible,
unless your data set is already very large.

In any case, you need an unbiased estimate of
the classification accuracy (cross-validation, leave-one-out, or a
test sample) in order to assess the particular classifier's
efficiency.
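
To make the leave-one-out idea concrete, here is a minimal sketch in Python.
The nearest-centroid classifier and the toy data are only assumptions for
illustration (any discriminant procedure could take its place); the point is
that each observation is classified by a rule fitted without it.

```python
import math


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean vector is closest (Euclidean distance).

    This classifier is a stand-in chosen for brevity, not a recommendation.
    """
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y):
    """Hold out each observation in turn, fit on the rest, and score the hit rate."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: N = 6 observations, p = 2 variables, two classes.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
     [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
y = ["a", "a", "a", "b", "b", "b"]

print(leave_one_out_accuracy(X, y))
```

With real data the estimate will usually be lower than the accuracy measured
on the training observations themselves, which is exactly why it is needed.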

And the most important questions are:
a) selection of variables (the best subset)
        Even if you have many variables and a moderate N,
        you can use various variable selection procedures
        to decrease p
b) choice of an appropriate discriminant procedure
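
As one possible illustration of point a), the sketch below ranks variables
for a two-class problem and keeps the k best. The scoring rule (squared
difference of the class means divided by the sum of the class variances), the
function names, and the toy data are all assumptions made for this example;
stepwise or best-subset procedures are other options.

```python
from statistics import mean, pvariance


def separation_scores(X, y):
    """Score each variable by (difference of class means)^2 / (sum of class variances)."""
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch handles the two-class case only")
    first, second = labels

    scores = []
    for j in range(len(X[0])):
        col_first = [row[j] for row, lab in zip(X, y) if lab == first]
        col_second = [row[j] for row, lab in zip(X, y) if lab == second]
        gap = (mean(col_first) - mean(col_second)) ** 2
        spread = pvariance(col_first) + pvariance(col_second)
        if spread > 0:
            scores.append(gap / spread)
        elif gap > 0:
            scores.append(float("inf"))  # classes separated with no within-class spread
        else:
            scores.append(0.0)           # constant variable, carries no information
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k best-scoring variables and the reduced data."""
    scores = separation_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in X]
    return keep, reduced


# Hypothetical toy data: p = 3 variables; the middle one has equal class means.
X = [[1.0, 5.0, 2.0], [1.2, 3.0, 2.5], [0.8, 4.0, 1.5],
     [4.0, 4.5, 3.0], [4.2, 3.5, 3.5], [3.8, 4.0, 2.5]]
y = [0, 0, 0, 1, 1, 1]

keep, X_reduced = select_top_k(X, y, k=2)
print(keep)
```

Note that a univariate ranking like this ignores correlations between
variables, and that the selection should be repeated inside each
cross-validation fold so the accuracy estimate stays unbiased.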

Ognian Asparoukhov

--
Ognian Asparoukhov                        Phone:  ++(359) 2 700-528
Centre of Biomedical Engineering                  ++(359) 2 700-326
Bulgarian Academy of Sciences             Fax:    ++(359) 2 723-787
Acad. Georgi Bonchev Street, Bl. 105      E-mail: [log in to unmask]
1113 Sofia, BULGARIA                           [log in to unmask]

