CLASS-L Archives

June 2001


"Classification, clustering, and phylogeny estimation" <[log in to unmask]>
Wed, 20 Jun 2001 09:43:22 -0400
Mel Janowitz <[log in to unmask]>
 DIMACS Summer School Tutorial on New Frontiers in Data Mining

 Date: August 13 - 17, 2001

 Location: DIMACS Center, Rutgers University, Piscataway, NJ

 Organizers:
 Dimitrios Gunopulos, University of California at Riverside, [log in to unmask]
 Nikolaos Koudas, AT&T Labs - Research, [log in to unmask]

 Related to: Special Focus on Computational Molecular Biology
 and Special Focus on Data Analysis and Mining.

 Focus: This "summer school" tutorial program aims to provide
 background, vocabulary, and theoretical methodology to
 non-specialists in data mining and to others who wish to explore
 the field, and to bring together students, postdocs, and researchers
 working on algorithms for data mining with those working in various
 application areas. More specifically, we aim to introduce attendees
 to the fundamental theoretical and algorithmic issues that arise in
 data mining and its applications.

 Data mining is an exciting new field of computer science research,
 encompassing several diverse techniques for analyzing large datasets.
 The goal of data mining is to obtain new, interesting and actionable
 pieces of information. Vast amounts of data are accumulated in diverse
 application domains, including bioinformatics, epidemiology, business,
 physical sciences, web applications, and networking. Data mining
 research is stimulated by hard real-life problems in analyzing data
 in all those areas. Data mining is fundamentally an interdisciplinary
 field, borrowing and combining techniques from theory, statistics,
 databases and machine learning, and ultimately producing new approaches.
 A goal of this tutorial is to bring together students, postdocs, and
 researchers from the fields of data mining, bioinformatics,
 networking, and the web, to facilitate collaboration between these
 fields, and to introduce data mining to those who are not yet working
 in it, or who are not yet approaching it from an algorithmic point of
 view.

 In the tutorial we concentrate on new research directions that are
 currently emerging in the field: data mining applications in
 bioinformatics, networking, and the web. We will explore new problems
 that come up in these areas, identify common threads among the various
 applications, and consider new paradigms, methods and techniques that
 are being developed to address these problems. We will emphasize the
 algorithmic aspects of analyzing large datasets. There are several
 general ways to approach this problem, such as approximate
 algorithms and data summarization techniques. We will look at new
 techniques for stream processing and online algorithms, and their
 applications to specific problems.

 Biological research is undergoing a major revolution as new technologies,
 such as high-throughput DNA sequencing and DNA microarrays, are creating
 large amounts of data. New techniques for analyzing such data are important
 for understanding biological processes. Many bioinformatics problems
 can be formulated as generalized searching problems in a large space. We
 will look at general lattice search techniques with different constraints,
 as well as new string algorithms. We will also look at applications of
 classification techniques in the area.

 Networking and telecommunications applications produce large amounts of
 data that can be mined for various properties of interest. Time series data
 prevail in such domains, and algorithms for time series matching and
 sequential pattern identification are of great interest. We will concentrate
 on incremental and one-pass algorithms for networking problems and explore
 the connection between these problems and similar incremental and one-pass
 problems arising in the biological sciences.

 The web has emerged as a vast datastore, containing diverse pieces of
 information. We will examine recent approaches to mining information on the
 World Wide Web, including efficient web searching and web site
 personalization efforts. We will also look at data and resource management
 issues in the web environment, with emphasis on bioinformatics and
 telecommunications applications.

 Registration Fee and Procedure: The registration fee is $200 for the week.
 Graduate students, postdocs and DIMACS Members pay $95 for the week. The
 fee, less the registration deposit, will be collected on site by cash, check,
 Visa, or MasterCard. Registration fees cover two meals a day, breaks, and all
 workshop materials. Registration is first come, first served and is limited
 to 60 people. A non-refundable $50 registration deposit will hold your

 for information on how to register.

 Financial Support: Limited financial support for travel, local
 expenses, or registration fees may be available depending upon support
 from funding agencies. Applications for financial support can be found at

 WWW Information:

 DIMACS Center
 Rutgers, The State University of New Jersey
 CoRE Bldg., 96 Frelinghuysen Road
 Piscataway, NJ 08854-8018, USA
 TEL: 732-445-5928
 FAX: 732-445-5932
 EMAIL: [log in to unmask]
 DIMACS is a partnership of Rutgers University, Princeton University,
 AT&T Labs - Research, Bell Laboratories, the NEC Research Institute
 and Telcordia Technologies.

 Christine Spassione                            Tel: (732) 445-4304
 Visitor Coordinator                            Fax: (732) 445-5932
 DCI Program Administrator              [log in to unmask]
 DIMACS Center
 Rutgers University
 96 Frelinghuysen Road
 Piscataway, NJ 08854-8018