Workshop on learning from heterogeneous data | |
Canada - BC - Whistler |
|
Hosted by: | Neural Information Processing Systems Conference |
Dates: | Dec 10, 2005 through Dec 10, 2005 |
Call for Proceedings Presentations: | 2005-09-01 through 2005-11-01 |
Description |
|
Many applications of machine learning require the synthesis of heterogeneous types of data. These may be collections of vector data with varying statistical properties, or more generally, diverse data sets consisting of vectors, strings, graphs, sets of objects, etc. An increasingly important example of such an application domain is computational biology, where the human genome is accompanied by real-valued gene expression data, functional annotation of genes, genotyping information, a graph of interacting proteins, a set of equations describing the dynamics of a system, localization of proteins in a cell, a phylogenetic tree relating species, natural language text in the form of papers describing experiments, partial models that provide priors, and numerous other data sources.

The problem of integration touches on many fundamental problems in decision theory, information theory, statistics, experimental design, and most centrally machine learning. Recent research in the area deploys advanced techniques such as hierarchical Bayesian methods, model averaging, kernels, graphical models, graph diffusions, spectral methods and robustness analysis. In contrast to the majority of the work on ensemble methods that was popular in the early 1990s, current integration research focuses on scalable and robust methods for fusing very complex and heterogeneous data sources and structure representations. It is generally believed that proper integration of these highly complex and inherently noisy information sources is the key to numerous breakthroughs in functional genomics and systems biology.

In addition to focusing on how to make things work, we encourage the submission of manuscripts that analyze why a method works. For example, it is useful to distinguish between integration resulting from prior knowledge of mechanisms, entities, and relations on one hand, and integration based on general inductive methods on the other.
While most successful integration methods involve a bit of both, researchers rarely explicate why the approach was successful (e.g., the success may be due to a specific type of prior knowledge and have little to do with the particular method used).

The goal of this workshop is to present emerging methods for computational learning from heterogeneous data. We encourage contributions that describe formal representational frameworks, particular algorithmic approaches, or applications of new or existing methods to particularly challenging integration problems in computational biology and other application domains.

Submission instructions

Researchers interested in contributing should send an extended abstract of up to 4 pages in PDF format to noble@gs.washington.edu by November 1, 2005.

Organizers

William Stafford Noble, Department of Genome Sciences, University of Washington
Simon Kasif, Department of Biomedical Engineering, Boston University
Nello Cristianini, Department of Statistics, University of California, Davis
Tommi Jaakkola, Computer Science and Artificial Intelligence Laboratory, MIT
Michael Jordan, EECS Department, University of California, Berkeley
Jean-Philippe Vert, Centre de Geostatistique, Ecole des Mines de Paris |
|
Additional Information | |
Event URL: | http://noble.gs.washington.edu/hdata |
ISCB Member Discount: | None |
Contact Person: | Bill Noble |
While ISCB provides for conference and event listings that may be of interest to members and bioinformaticians at large, ISCB is not responsible for the content provided by outside sources. Such listings are not meant as an endorsement by ISCB.