CLASS-L Archives

February 2007


"Classification, clustering, and phylogeny estimation" <[log in to unmask]>
Teemu Roos <[log in to unmask]>
Tue, 27 Feb 2007 04:45:56 -0500
Dear All,

*The Pascal Challenge on Computer-Assisted Stemmatology* evaluates methods
for reconstructing the family tree of a group of related documents. Such a
family tree corresponds to a) a clustering hierarchy, where joined subgroups
make subtrees; b) a causal/graphical model of interdocument dependencies;
c) a network of information flow among the documents; d) a phylogenetic
tree; etc.

Many of the applicable techniques are typically applied in unsupervised
scenarios. The Challenge presents an opportunity to compare different
methods and approaches in a supervised and objective fashion (and to show
that your approach is the best!).

A prototypical example illustrating the problem is as follows: A top AI
researcher has finally concluded (after, what, 20 years?) that the statement
"Tweety is a bird" is true. The rumour of this fact spreads around like
wildfire, becoming distorted along the way. After a while, a set of
scientists report respectively that: "Sweety is a bird"; "Sweety has a
bird"; and "Tweety is a nerd". Can we deduce how the information spread
among the scientists? (No, we are not interested in whether Tweety can
fly... Sorry.)
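
To make the toy example concrete, here is a minimal sketch of one naive
baseline: compute character-level edit distances between the four
statements and connect them with a minimum spanning tree. This is only an
illustration, not the Challenge's evaluation procedure or any recommended
method. It assumes, unrealistically, that the original statement is known
and is among the observed texts; the function names levenshtein and
spanning_tree are made up for this sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute; unit cost)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or match
        prev = curr
    return prev[-1]


def spanning_tree(texts, root=0):
    """Prim's algorithm on the complete graph of pairwise edit distances.

    Assumption for this sketch: texts[root] is the known original.
    Returns a list of (parent_text, child_text, distance) edges.
    """
    in_tree = {root}
    edges = []
    while len(in_tree) < len(texts):
        # Pick the cheapest edge from the tree to a text not yet in it.
        dist, parent, child = min(
            (levenshtein(texts[p], texts[c]), p, c)
            for p in in_tree
            for c in range(len(texts))
            if c not in in_tree
        )
        in_tree.add(child)
        edges.append((texts[parent], texts[child], dist))
    return edges


texts = [
    "Tweety is a bird",    # the original statement (assumed known here)
    "Sweety is a bird",
    "Sweety has a bird",
    "Tweety is a nerd",
]

edges = spanning_tree(texts)
for parent, child, dist in edges:
    print(f"{parent!r} -> {child!r} (distance {dist})")
```

On this example the sketch links "Sweety is a bird" and "Tweety is a nerd"
directly to the original, and "Sweety has a bird" to "Sweety is a bird".
Real stemmata are harder: the original and intermediate copies are usually
missing, and a copy may draw on more than one source.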

Participation is open to all.

More information can be found on the web-page of the Challenge:

The schedule is as follows:

First-phase data available    October 6, 2006
Second-phase data available   November 30, 2006
Validation data available   * February 20, 2007 *
Submission deadline           March 30, 2007
Results                       April 30, 2007

So as you see, this is a good time to check out the challenge, since we just
made available the validation data set and a correct (and an incorrect)
solution for it:

We invite applications of established and, in particular, novel
approaches to stemmatology, including but of course not restricted to
hierarchical clustering, graphical modeling, link analysis, phylogenetics,
string-matching, etc.

* Teemu Roos, Helsinki Inst. for Information Technology
* Tuomas Heikkilä, Dept. of History, University of Helsinki
* Petri Myllymäki, Dept. of CS, University of Helsinki

On behalf of the organizers,
Teemu Roos
Helsinki Institute for Information Technology
