CLASS-L Archives

May 2012


Marc Lalancette <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Tue, 15 May 2012 21:05:15 +0000

I was referred to this list by someone on sci.stat.math (!topic/sci.stat.math/-bDEys5WTjk).  I apologize if this is not the right forum for this type of question.  I have limited stats knowledge and I've been doing some research to find a "good" solution to my problem.  I'll first describe what I want to do and then what I came up with based on some more or less fruitful research.  I'd appreciate some suggestions, tips, references to similar or "better" methods, etc.

I have 3-d motion data for 3 markers stuck on a person's head while he/she is trying to stay still.  The first step is making sure the markers did not fall off, so I calculate the 3 between-marker distances across time.  In what follows I basically treat these as a 3-d Euclidean vector, even though a triple of distances is not really a point in Euclidean space; I'm not sure how else to combine the 3 distances.  (Later, I could ask the same questions about each marker's position to see if the person moved, and in that case the data do live in Euclidean 3-d space.)  I want to detect changes and get a "good" partition of the time series into intervals based on when changes occurred, i.e. a list of roughly stationary intervals with an associated position for each interval.  I'm assuming there won't be many changes and that they will mostly be fast (i.e. a step-like time series), but they could also be slow, in which case I'd still want to detect the change and split it into chunks that are "mostly still" relative to the measurement error.
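To make the first step concrete, here is a minimal sketch of the between-marker distance calculation.  The array layout (time, marker, xyz), the function name and the synthetic data are assumptions made for illustration only; they are not taken from my actual recordings.

```python
import numpy as np

def inter_marker_distances(markers):
    """Distances between the 3 marker pairs at every time sample.

    markers: array of shape (T, 3, 3) = (time, marker, xyz)  [assumed layout]
    returns: array of shape (T, 3) for the pairs (0,1), (0,2), (1,2)
    """
    markers = np.asarray(markers, dtype=float)
    pairs = [(0, 1), (0, 2), (1, 2)]
    return np.stack(
        [np.linalg.norm(markers[:, i] - markers[:, j], axis=1) for i, j in pairs],
        axis=1,
    )

# Synthetic, noise-free example: marker 2 shifts by 1 unit along y at t = 50,
# as if it had come loose.
base = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
markers = np.tile(base, (100, 1, 1))
markers[50:, 2, 1] += 1.0
d = inter_marker_distances(markers)
```

Each row of `d` is the distance triple that I then treat as a 3-d vector; a marker coming loose shows up as a step in the two columns that involve that marker.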

After some research, here's what I came up with.  My first idea was to use the 3-d distances between samples adjacent in time to estimate the measurement error, i.e. to approximate the distribution I would see if there were no movement (that's why the assumption of few changes is important).  Comparing that distribution with the distribution of all inter-point distances (or a subset: maybe all distances from the first point) would tell me if there was any movement.  I was thinking of using the Kolmogorov-Smirnov (KS) test for this.  Then I thought I could use a hierarchical clustering method based on the same idea, divisive since I expect few clusters.  I would recursively look for the boundary point in time that maximizes the KS test p-value for the intervals on both sides (a one-sided test, since intervals that are unusually stationary would be "ok").  Then I looked for something that would account for model complexity (thinking of reduced chi-square) and found the AIC.  Maybe I could interpret the combined KS p-values as a likelihood for that particular partition and use the AIC to decide when to stop dividing the intervals.
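Here is a rough sketch of the KS comparison and of a single divisive step, under several assumptions: it uses SciPy's `ks_2samp`, takes distances from the first sample of each interval as the "subset", and combines the two p-values by summing their logs, which is just one possible choice.  The function names, `min_len` and the synthetic data are hypothetical.  Note also that the KS test assumes independent samples, which these distances are not, so the p-values should be read as a score rather than as true probabilities.

```python
import numpy as np
from scipy.stats import ks_2samp

def movement_pvalue(x):
    """One-sided two-sample KS p-value for a segment x of shape (n, 3).

    Small values suggest that distances from the segment's first sample
    tend to be larger than adjacent-sample distances, i.e. something moved.
    """
    x = np.asarray(x, dtype=float)
    adjacent = np.linalg.norm(np.diff(x, axis=0), axis=1)  # noise estimate
    from_first = np.linalg.norm(x[1:] - x[0], axis=1)      # includes any drift
    # alternative="less": the CDF of from_first lies below the CDF of
    # adjacent somewhere, i.e. from_first tends to take larger values.
    return ks_2samp(from_first, adjacent, alternative="less").pvalue

def best_split(x, min_len=10):
    """Boundary b that maximizes log p(x[:b]) + log p(x[b:]).

    Summing log p-values is an assumed combination rule, not a derived one.
    """
    best_b, best_score = None, -np.inf
    for b in range(min_len, len(x) - min_len + 1):
        p_left = movement_pvalue(x[:b])
        p_right = movement_pvalue(x[b:])
        # Clip to avoid log(0) when a p-value underflows.
        score = np.log(max(p_left, 1e-300)) + np.log(max(p_right, 1e-300))
        if score > best_score:
            best_b, best_score = b, score
    return best_b

# Synthetic step-like series: noise of 0.05 and a step of 1.0 at t = 100.
rng = np.random.default_rng(0)
x = rng.normal(scale=0.05, size=(200, 3))
x[100:, 0] += 1.0

p_whole = movement_pvalue(x)  # should be very small: the series moved
b = best_split(x)             # should land at or somewhat after t = 100
```

With distances taken from the first sample only, a boundary placed slightly after the true step leaves just a few "moved" samples in the left interval, which the KS test hardly notices, so the boundary found this way is approximate.  Using all inter-point distances within each interval would likely sharpen it, at a higher computational cost.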

This is what I came up with based on what I found in my research.  Almost all of the concepts and methods I mention I didn't know about a week ago, so I assume the resulting amalgamation has quite a few "weaknesses" even though it might work.  I'd be happy to hear what knowledgeable people would have to say about this.  Feel free to contact me directly by email.  I'll also monitor the list for replies.


Marc Lalancette
Research MEG Lab Project Manager
Program in Neurosciences and Mental Health, Department of Diagnostic Imaging, The Hospital for Sick Children, Toronto, Canada


