I am starting to play around with nonlinear discriminant analysis (kernel method) for categorizing people based on their EEG data.  

Some questions:
1) I have more variables than people (at least in the training data set). What are good methods of data reduction? I do not want to use principal components or related methods, because one goal of this analysis (at least at this point) is to see which variables are important.

2) I am using SAS PROC DISCRIM. When I use only one variable, the results are more or less as expected. But when I use multiple variables, I get this warning:

         The ellipsoid centered at an observation in TESTDATA= data set does not 
         contain any training set observations in DATA= data set or BY group.  This     
         observation is classified into group "Other"

This applies to ALL the observations in the TESTDATA= data set. Any ideas on what is causing this? It happens even when only a few variables are used.

3)  Is there any way to assess the importance of the different variables to the nonlinear discriminant function?

Some notes:
Many of my variables are highly correlated.
I have 600 variables.
I have about 1,000 people total, divided into training and test data sets.


Peter L. Flom, PhD
Brainscope, Inc.
212 263 7863 (MTW)
212 845 4485 (Th)
917 488 7176 (F)
