10th Annual Rocky Mountain Bioinformatics Conference

Keynote Speakers

Updated Nov 15, 2012

James Costello, PhD
Howard Hughes Medical Institute
Boston University
Massachusetts - USA

CV: pdf

Title: 
Wisdom of Crowds for Constructing Gene Regulatory Networks and Predicting Drug Sensitivities

Abstract:
'The wisdom of crowds' refers to the phenomenon in which the collective knowledge of a community is greater than the knowledge of any individual. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) project provides a framework to both test this concept and comprehensively benchmark individual methodologies. Reliable construction of gene regulatory networks and accurate prediction of drug sensitivities and synergies are key challenges in basic and biomedical research. Accordingly, these topics were proposed as DREAM challenges. In each case, the community was supplied with genome-scale training data: microarray compendia for inferring transcription factor to target gene relationships, and microarray, exome sequencing, RNA-sequencing, DNA methylation, copy number variation, protein quantification, and dose-response data for predicting both drug sensitivities and synergies. Here, I will discuss the results and insights gained from over 30 methods submitted to infer gene regulatory networks and over 50 methods submitted to predict drug effects. For the network inference challenge, we observed method-specific modifications that separated the top-performing submissions from the rest. Additionally, we were able to show that no single inference method performed optimally across all datasets. In contrast, integration of predictions from multiple inference methods showed robust, high performance across diverse datasets. For the drug sensitivity challenge, we observed that genome-scale data, absent any drug treatment, could be used to predict the effect of drug treatments. We saw a wide range of approaches taken to address this challenge, and we were able to quantify the relative predictive power of each class of data. By assessing these DREAM challenges, we provide insights into methodological efficiencies and limitations and establish a performance benchmark to test algorithms in the future.
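
The integration of predictions from multiple inference methods can be illustrated with a minimal sketch. It assumes that each method reports a confidence score per candidate edge and that predictions are combined by averaging per-method ranks. The abstract does not state which integration scheme was used, so the rank-averaging rule, the function names, the worst-rank penalty for unscored edges, and the toy scores below are all assumptions for illustration.

```python
from collections import defaultdict


def rank_edges(scores):
    """Map each edge to its rank (1 = most confident) under one method's scores."""
    # Ties are broken by edge name so the ranking is deterministic.
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: i + 1 for i, edge in enumerate(ordered)}


def community_ranking(methods):
    """Order edges by their average rank across all methods (best first)."""
    all_edges = set().union(*(m.keys() for m in methods))
    worst = len(all_edges) + 1  # assumed penalty rank for edges a method did not score
    totals = defaultdict(float)
    for scores in methods:
        ranks = rank_edges(scores)
        for edge in all_edges:
            totals[edge] += ranks.get(edge, worst)
    return sorted(all_edges, key=lambda edge: (totals[edge] / len(methods), edge))


# Toy confidence scores for (transcription factor, target gene) edges.
# All names and numbers are invented for this example.
method_a = {("TF1", "G1"): 0.9, ("TF1", "G2"): 0.2, ("TF2", "G1"): 0.5}
method_b = {("TF1", "G1"): 0.7, ("TF1", "G2"): 0.6, ("TF2", "G1"): 0.1}
method_c = {("TF1", "G1"): 0.4, ("TF2", "G1"): 0.8}

consensus = community_ranking([method_a, method_b, method_c])
print(consensus)
```

In this toy case no single method agrees with the others on the full ordering, yet the averaged ranking places the edge that every method scores reasonably well at the top.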

Co-author affiliations:
Columbia University, Biomedical Informatics, US
Columbia University, Biomedical Informatics, US
MIT, CSAIL, US
Pfizer, Computational Sciences, US
IBM, Genomics, US
Boston University, Biomedical Engineering, US
Oregon Health & Science University, Biomedical Engineering, US
Boston University, Biomedical Engineering, US
Oregon Health & Science University, Biomedical Engineering, US
MIT, CSAIL, US
Istituto Italiano di Tecnologia, Genomics, Italy
Columbia University, Biomedical Informatics, US
Boston University, Biomedical Engineering, US
Ludwig-Maximilians University, Informatics, Germany
IBM, Genomics, US
Columbia University, Biomedical Informatics, US
Siemens Corporate Research, Genomics, US

Yuval Itan, PhD
The Rockefeller University
New York - USA

CV: 
Please check back for updates to this page.

Title:  The Human Gene Connectome: A Map of Short Cuts for Morbid Allele Discovery

Abstract: High-throughput genomic and proteomic data obtained by next-generation sequencing, microarray, or mass spectrometry reveal thousands of gene variants per individual, and it is often difficult to decipher which of these variants underlie disease or phenotype in individuals. At the population level, however, there can be some level of phenotypic homogeneity, with alterations of specific physiological pathways underlying the pathogenesis of a particular disease. Here we describe the human gene connectome (HGC) as a new approach facilitating the interpretation of abundant genomic and proteomic data, guiding subsequent experimental investigations. We identify the set of shortest plausible biological distances, routes, and degrees of separation between all pairs of human genes by applying a shortest distance algorithm to the full human gene network. We demonstrate a hypothesis-driven application of the HGC in which we generate a TLR3-specific connectome, which may be useful for the genetic dissection of herpes simplex virus encephalitis of childhood. We also developed the functional genomic alignment (FGA) approach from the HGC. In FGA, the genes are clustered according to their biological proximity (rather than the traditional evolutionary genetic distance), as estimated from the HGC. The HGC and FGA data and computer programs are freely available for non-commercial users at: http://lab.rockefeller.edu/casanova/HGC/, and should facilitate the genome-wide discovery and experimental validation of disease-causing alleles.
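
The idea of computing distances, routes, and degrees of separation from a core gene can be sketched on a toy network. The abstract does not name the shortest distance algorithm, so Dijkstra's algorithm is used here as an assumed stand-in. The placeholder genes (GENE_A to GENE_C), the edge weights, and the definition of degrees of separation as the number of links on the route are likewise assumptions for illustration, not values from the HGC.

```python
import heapq


def shortest_routes(network, source):
    """Dijkstra's algorithm: distance and predecessor for every gene reachable from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, gene = heapq.heappop(heap)
        if d > dist.get(gene, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in network.get(gene, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = gene
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev


def route(prev, source, target):
    """Rebuild the route from source to target using the predecessor map."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy undirected network; weights stand in for biological distances (invented).
edges = [
    ("TLR3", "GENE_A", 1.0),
    ("GENE_A", "GENE_B", 2.0),
    ("TLR3", "GENE_B", 4.0),
    ("GENE_B", "GENE_C", 1.0),
]
network = {}
for u, v, w in edges:
    network.setdefault(u, {})[v] = w
    network.setdefault(v, {})[u] = w

dist, prev = shortest_routes(network, "TLR3")
path = route(prev, "TLR3", "GENE_C")
separation = len(path) - 1  # assumed definition: number of links on the route
print(dist["GENE_C"], path, separation)
```

Running the search once per gene would give all-pairs distances; a gene-specific connectome, such as the TLR3 one mentioned above, corresponds to a single run from that gene.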

Co-author affiliations:
The Rockefeller University, Human Genetics of Infectious Diseases, US
The Rockefeller University, Human Genetics of Infectious Diseases, US
The Rockefeller University, Human Genetics of Infectious Diseases, US
Pasteur Institute, Human Evolutionary Genetics, France
Necker Hospital, Human Genetics of Infectious Diseases, France
The Rockefeller University, Human Genetics of Infectious Diseases, US

Kirk E. Jordan, PhD
Emerging Solutions Executive & Assoc. Program Director
Computational Science Center, IBM T.J. Watson Research
Member, IBM Academy of Technology
Massachusetts - USA

CV: 
http://researcher.ibm.com/researcher/view.php?person=us-kjordan

Title:  The Ins and Outs of Developing Biological Applications on a Massively Parallel Multi-Core System

Abstract:
With IBM’s latest High Performance Computing (HPC) system, the Blue Gene/Q, we see core counts continue to rise. In this talk, I will briefly describe this system and explain how it fits into the road to Exascale. More importantly, I will describe some of the applications we have been working on over the last year in the computational biology area. I will explain some of the complications, and the techniques to overcome them, by describing our experience working on Rosetta, a code that predicts protein structures from amino acid sequences encoded in DNA, and on some other computational biology applications. In addition, and related to our HPC work, I will describe ongoing work to make HPC accessible to a wider audience, eventually targeting healthcare and life science practitioners directly, and explain why this work is important.

Christopher S. Miller, Ph.D.
Assistant Professor
Department of Integrative Biology
University of Colorado, Denver
USA

CV: pdf

Title: Deep Sequencing and Assembly of Targeted Amplicons in Microbial Communities

Abstract:
Targeted amplification followed by high-throughput sequencing allows for extremely deep sampling of genes of interest in microbial communities. However, current sequencing read lengths and the existence of closely related genomes with variable coverage make correct assembly of individual amplicons difficult. We modified EMIRGE (Miller et al. Genome Biology, 2011) for the assembly of mixed populations of 16S ribosomal small subunit genes amplified from a variety of bacterial communities. EMIRGE relies on a database of candidate 16S sequences for templated assembly, and a modified expectation-maximization algorithm iteratively adjusts these candidates to reflect the sample sequenced. In each iteration, reads are aligned and probabilistically attributed to candidate 16S genes, and candidate gene abundances and consensus sequences are then adjusted based on this probabilistic read attribution. Once the iterations terminate, the taxonomic identification provided by assembled full-length 16S genes is reported with the estimated abundance of organisms that represent as little as 0.01% of microbial communities. The method has thus far been applied to characterize microbial communities from a uranium-contaminated aquifer and environments present in a hospital neonatal intensive care unit. With a similar strategy, it should also be possible to reconstruct other functional genes of interest with high sensitivity.
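
The probabilistic read attribution and abundance update can be sketched as a plain expectation-maximization loop. This is a deliberately simplified illustration and not EMIRGE itself: it assumes the per-read alignment likelihoods are already given and fixed, and it omits the consensus-sequence update that the method also performs. The function name, the iteration count, and the toy likelihoods are assumptions.

```python
def em_abundances(likelihoods, n_iter=50):
    """Estimate candidate abundances from fixed per-read alignment likelihoods.

    likelihoods[i][j] is the assumed probability of read i given candidate j.
    """
    n_reads = len(likelihoods)
    n_candidates = len(likelihoods[0])
    abundance = [1.0 / n_candidates] * n_candidates  # start from a uniform prior
    for _ in range(n_iter):
        totals = [0.0] * n_candidates
        for row in likelihoods:
            # E-step: attribute the read to candidates in proportion to
            # abundance * likelihood.
            weighted = [a * p for a, p in zip(abundance, row)]
            norm = sum(weighted)
            for j, w in enumerate(weighted):
                totals[j] += w / norm
        # M-step: new abundance is the average attribution over all reads.
        abundance = [t / n_reads for t in totals]
    return abundance


# Four toy reads against two candidate 16S genes (invented likelihoods).
toy_likelihoods = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.9, 0.1],
    [0.1, 0.9],
]
abundance = em_abundances(toy_likelihoods)
print(abundance)
```

Because each read's attributions sum to one, the estimated abundances always sum to one as well; in this toy input the first candidate ends up with the larger share, since three of the four reads align better to it.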

Co-author affiliations:
Argonne National Lab, Institute for Genomics and Systems Biology, US
University of California, Berkeley, Earth and Planetary Sciences, US
University of California, Berkeley, Earth and Planetary Sciences, US
University of California, Berkeley, Earth and Planetary Sciences, US
University of California, Berkeley, Earth and Planetary Sciences, US

Chris Mungall, Ph.D.
Bioinformatics Scientist
Berkeley Bioinformatics Open-source Projects
Lawrence Berkeley National Laboratory
Berkeley, CA
USA

CV: web page

Title:
Helping Machines to Help Us: New Advances in Biological Ontologies, Databases and Machine Reasoning

Abstract:

We are utterly reliant on automated methods to make sense of the deluge of data flooding in from everywhere, from next-generation sequencing to electronic health records. However, current statistical procedures that process strings of nucleotides or words operate in a "knowledge vacuum", lacking a codification of the rules of the domain in which they operate. For example, gene function prediction algorithms lack knowledge of the cellular context in which gene products are active, leading to incorrect predictions that a biologist would reject.

The next generation of biological ontologies and knowledge-based reasoning algorithms can be used to help fill this machine knowledge gap. I will give an overview of some recent extensions to well-known resources such as the GO, as well as other biological ontologies comprising the Open Biomedical Ontologies (OBO) library, demonstrating how a move from simple structured vocabularies to sophisticated logic-based representations can provide valuable assistance to existing statistical approaches.

Alex Stewart, PhD
Somalogic
Boulder, CO, USA

CV: pdf

Title: 
How to Interpret One Thousand Proteins From (Almost) One Thousand Samples, and What This Can Tell Us About Heart Disease Biology and Individual Risk

Authors: Mike Mehan, Alex Stewart (Somalogic, Boulder, CO), and Shintaro Kato (NEC Corporation of America)

Abstract:
Conventional blood-based biomarker studies for specific diseases have been technically limited to observing relatively few proteins, selected from those already believed to be involved in the biology of the disease of interest. This methodology has the failing that it can never find new biological processes or discover proteins that are not already suspected to be linked to the disease. Proteomics using the SomaScan platform offers the ability to survey over a thousand proteins from a multitude of physiological processes in under 100 microliters of blood plasma. These high-dimensional data can illuminate the entire range of biological processes involved; however, the analysis of such high-dimensional data is more challenging (as this audience is well aware).

We applied the SomaScan platform to a prospective cohort of 987 individuals with stable coronary heart disease (CHD) and a median of 6 years of follow-up. Individuals with stable CHD are at higher than normal risk for cardiovascular (CV) events; however, this risk varies widely, from lower risk than the general population to much higher. Current methods for CV risk assessment are poor, and both overtreatment and undertreatment of individuals are the result. From the thousand proteins measured with SomaScan we derived a simple 10-protein model that yields world-beating performance for predicting the crippling consequences of this disease: death, heart failure, heart attack, and stroke (Figure). Notably, these 10 proteins are new to the field of CV risk stratification.

This presentation is the story of how we learned to combine modern bioinformatics tools and biological knowledge. We believe the commercial product we are developing has the potential to enable targeting of preventive therapies, enhance patient management, enrich enrollment for CV events clinical trials, identify potential targets for therapeutic discovery, allow novel discovery of particular subgroups, and clarify idiopathic etiologies from larger cohorts of individuals. We compare our performance to existing metrics, and look beyond the 10-protein model to a future in which the interpretation of protein biomarkers depends upon who you are.

Olga Troyanskaya, PhD
Assistant Professor
Department of Computer Science and
Lewis-Sigler Institute for Integrative Genomics
Princeton University
USA

CV: pdf

Title:  From Tissue-specific Functional Networks to Understanding Human Disease


Abstract:

The ongoing explosion of new technologies in functional genomics offers the promise of understanding gene function, interactions, and regulation at the systems level. This should enable us to develop comprehensive descriptions of genetic systems of cellular controls, including those whose malfunctioning becomes the basis of genetic disorders, such as cancer, and others whose failure might produce developmental defects in model systems. However, the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it on a systems level, and apply it to the study of specific pathways or genetic disorders. These challenges are further exacerbated by the biological complexity of metazoans, including diverse biological processes, individual tissue types and cell lineages, and by the increasingly large scale of data in higher organisms. I will describe how we address these challenges through the development of bioinformatics frameworks for the study of gene function and regulation in complex biological systems and through close coupling of these methods with experiments, thereby contributing to understanding of human disease. I will specifically discuss how integrated analysis of functional genomics data can be leveraged to study cell-lineage specific gene expression, to identify proteins involved in disease in a way complementary to quantitative genetics approaches, and to direct both large-scale and traditional biological experiments.
