James Costello, PhD Howard Hughes Medical Institute Boston University Massachusetts - USA CV: pdf Title: Wisdom of Crowds for Constructing Gene Regulatory Networks and Predicting Drug Sensitivities Abstract: |
Yuval Itan, PhD The Rockefeller University New York - USA CV: Please check back for updates to this page. Title: The Human Gene Connectome: A Map of Short Cuts for Morbid Allele Discovery Abstract: High-throughput genomic and proteomic data obtained by next-generation sequencing, microarray, or mass spectrometry reveal thousands of gene variants per individual, and it is often difficult to decipher which of these variants underlie disease or phenotype in individuals. At the population level, however, there can be some level of phenotypic homogeneity, with alterations of specific physiological pathways underlying the pathogenesis of a particular disease. Here we describe the human gene connectome (HGC) as a new approach facilitating the interpretation of abundant genomic and proteomic data, guiding subsequent experimental investigations. We identify the set of shortest plausible biological distances, routes, and degrees of separation between all pairs of human genes by applying a shortest distance algorithm to the full human gene network. We demonstrate a hypothesis-driven application of the HGC in which we generate a TLR3-specific connectome, which may be useful for the genetic dissection of herpes simplex virus encephalitis of childhood. We also developed the functional genomic alignment (FGA) approach from the HGC. In FGA, the genes are clustered according to their biological proximity (rather than the traditional evolutionary genetic distance), as estimated from the HGC. The HGC and FGA data and computer programs are freely available for non-commercial users at http://lab.rockefeller.edu/casanova/HGC/ and should facilitate the genome-wide discovery and experimental validation of disease-causing alleles. Co-authors: |
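The abstract above describes computing shortest distances, routes, and degrees of separation between genes, and building a connectome centered on one gene (TLR3). The sketch below illustrates that idea with Dijkstra's algorithm on a toy weighted network. It is not the HGC implementation: the abstract does not name the algorithm used, and the `GENE_A` to `GENE_D` nodes, the edge weights, and all function names here are made-up placeholders for illustration.

```python
import heapq

# Toy undirected network. Weights stand in for "biological distance".
# GENE_A..GENE_D and all weights are hypothetical placeholders.
TOY_EDGES = [
    ("TLR3", "GENE_A", 1.0),
    ("GENE_A", "GENE_B", 2.0),
    ("TLR3", "GENE_C", 4.0),
    ("GENE_B", "GENE_C", 0.5),
    ("GENE_C", "GENE_D", 1.5),
]


def make_undirected(edges):
    """Build an adjacency map {gene: {neighbor: weight}} from an edge list."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def shortest_distances(graph, source):
    """Dijkstra: return (distance, predecessor) maps for all genes reachable from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return dist, prev


def route(prev, source, target):
    """Reconstruct the route from source to target using the predecessor map."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def gene_specific_connectome(graph, core):
    """Rank all reachable genes by distance from the core gene.

    Each row is (gene, distance, degrees_of_separation, route).
    """
    dist, prev = shortest_distances(graph, core)
    rows = []
    for gene in sorted(dist, key=lambda g: (dist[g], g)):
        if gene == core:
            continue
        path = route(prev, core, gene)
        rows.append((gene, dist[gene], len(path) - 1, path))
    return rows


if __name__ == "__main__":
    for row in gene_specific_connectome(make_undirected(TOY_EDGES), "TLR3"):
        print(row)
```

Running `shortest_distances` once per gene would give all-pairs distances; for a network the size of the full human gene set, a dedicated graph library would likely be a better fit than this plain-Python sketch.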
Kirk E. Jordan, PhD Emerging Solutions Executive & Assoc. Program Director Computational Science Center, IBM T.J. Watson Research Member, IBM Academy of Technology Massachusetts - USA CV: http://researcher.ibm.com/researcher/view.php?person=us-kjordan Title: The Ins and Outs of Developing Biological Applications on a Massively Parallel Multi-Core System Abstract: With IBM’s latest High Performance Computing (HPC) system, the Blue Gene/Q, we see core counts continue to rise. In this talk, I will briefly describe this system and explain how it fits into the road to Exascale. More importantly, I will describe some of the applications we have been working on over the last year in the computational biology area. I will explain some of the complications, and the techniques to overcome them, by describing our experience working on Rosetta, a code that predicts protein structures from amino acid sequences encoded in DNA, and on some other computational biology applications. In addition, I will describe related, ongoing work to make HPC accessible to a wider audience, eventually targeting healthcare and life science practitioners directly, and explain why this work is important. |
Christopher S. Miller, Ph.D. Assistant Professor Department of Integrative Biology University of Colorado, Denver USA CV: pdf Title: Deep Sequencing and Assembly of Targeted Amplicons in Microbial Communities Abstract: Targeted amplification followed by high-throughput sequencing allows for extremely deep sampling of genes of interest in microbial communities. However, current sequencing read lengths and the existence of variable-coverage, closely related genomes make correct assembly of individual amplicons difficult. We modified EMIRGE (Miller et al. Genome Biology, 2011) for the assembly of mixed populations of 16S ribosomal small subunit genes amplified from a variety of bacterial communities. EMIRGE relies on a database of candidate 16S sequences for templated assembly, and a modified expectation-maximization algorithm iteratively adjusts these candidates to reflect the sample sequenced. In each iteration, reads are aligned and probabilistically attributed to candidate 16S genes, and candidate gene abundances and consensus sequences are then adjusted based on this probabilistic read attribution. Once the iterations terminate, the taxonomic identification provided by assembled full-length 16S genes is reported with the estimated abundance of organisms that represent as little as 0.01% of microbial communities. The method has thus far been applied to characterize microbial communities from a uranium-contaminated aquifer and environments present in a hospital neonatal intensive care unit. With a similar strategy, it should also be possible to reconstruct other functional genes of interest with high sensitivity. Co-author affiliations: Argonne National Lab Institute for Genomics and Systems Biology, US; University of California, Berkeley Earth and Planetary Sciences, US (four co-authors) |
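The iterative step described in the abstract above (probabilistically attribute reads to candidate genes, then re-estimate candidate abundances) can be sketched as a plain expectation-maximization loop. This is a simplified illustration, not EMIRGE's code: it assumes the per-read likelihoods under each candidate are already given (in practice they would come from read alignment), and it leaves out the consensus-sequence update entirely. The function name `em_abundances` and the toy likelihood matrix are assumptions made for this example.

```python
def em_abundances(likelihoods, n_iter=100, tol=1e-9):
    """Estimate candidate abundances from read likelihoods by EM.

    likelihoods[r][c] is an assumed, precomputed P(read r | candidate c).
    Returns (abundance, responsibilities), where responsibilities[r][c] is
    the probability that read r came from candidate c.
    """
    n_reads = len(likelihoods)
    n_cand = len(likelihoods[0])
    abundance = [1.0 / n_cand] * n_cand  # start from a uniform prior
    resp = [[1.0 / n_cand] * n_cand for _ in range(n_reads)]

    for _ in range(n_iter):
        # E-step: attribute each read to candidates in proportion to
        # (current abundance) x (likelihood of the read under the candidate).
        resp = []
        for row in likelihoods:
            weights = [a * p for a, p in zip(abundance, row)]
            total = sum(weights)
            if total == 0.0:
                raise ValueError("a read has zero likelihood under every candidate")
            resp.append([w / total for w in weights])

        # M-step: a candidate's new abundance is its average share of the reads.
        updated = [sum(r[c] for r in resp) / n_reads for c in range(n_cand)]

        change = max(abs(u - a) for u, a in zip(updated, abundance))
        abundance = updated
        if change < tol:
            break

    return abundance, resp


if __name__ == "__main__":
    # Toy input: three reads match only candidate 0, one matches only
    # candidate 1, and one is equally compatible with both.
    toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    abundance, resp = em_abundances(toy)
    print(abundance)
```

In this toy case the ambiguous read ends up attributed mostly to the more abundant candidate, which is the behavior the probabilistic attribution step is meant to capture.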
Chris Mungall, Ph.D. Bioinformatics Scientist Berkeley Bioinformatics Open-source Projects Lawrence Berkeley National Laboratory Berkeley, CA USA CV: web page Title: Helping Machines to Help Us: New Advances in Biological Ontologies, Databases and Machine Reasoning Abstract: We are utterly reliant on automated methods to make sense of the deluge of data flooding in from sources ranging from next-generation sequencing to electronic health records. However, current statistical procedures that process strings of nucleotides or words operate in a "knowledge vacuum", lacking a codification of the rules of the domain in which they operate. For example, gene function prediction algorithms lack knowledge of the cellular context in which gene products are active, leading to incorrect predictions that a biologist would reject. The next generation of biological ontologies and knowledge-based reasoning algorithms can be used to help fill this machine knowledge gap. I will give an overview of some recent extensions to well-known resources such as the GO, as well as other biological ontologies comprising the Open Biomedical Ontologies (OBO) library, demonstrating how a move from simple structured vocabularies to sophisticated logic-based representations can provide valuable assistance to existing statistical approaches. |
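As a rough illustration of how codified domain rules could assist a statistical predictor, the sketch below filters gene function predictions against a tiny hand-written ontology: a prediction is rejected when the gene product's known cellular location is incompatible with where the predicted process must occur. Everything here is a toy assumption. The terms, the `IS_A` and `OCCURS_IN` axioms, the gene names, and the scores are invented for the example and are not taken from GO or any OBO ontology, and real reasoners handle far richer logic than this subclass check.

```python
# Toy is_a hierarchy over cellular locations: child -> list of parents.
# Hypothetical axioms for illustration only, not copied from GO.
IS_A = {
    "chloroplast": ["plastid"],
    "plastid": ["organelle"],
    "mitochondrion": ["organelle"],
    "organelle": ["cell part"],
}

# Toy rule: each process must occur in (a subclass of) the given location.
OCCURS_IN = {
    "photosynthesis": "plastid",
    "aerobic respiration": "mitochondrion",
}


def ancestors(term):
    """Return the term plus everything it is a subclass of (transitive is_a closure)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def is_consistent(location, process):
    """A process is allowed at a location if the location is_a the required one."""
    required = OCCURS_IN.get(process)
    return required is None or required in ancestors(location)


def filter_predictions(predictions, known_location):
    """Split (gene, process, score) predictions into kept and rejected lists.

    Predictions for genes with no known location are kept, since the
    toy rules cannot contradict them.
    """
    kept, rejected = [], []
    for gene, process, score in predictions:
        location = known_location.get(gene)
        if location is None or is_consistent(location, process):
            kept.append((gene, process, score))
        else:
            rejected.append((gene, process, score))
    return kept, rejected


if __name__ == "__main__":
    predictions = [
        ("gene_x", "photosynthesis", 0.9),
        ("gene_x", "aerobic respiration", 0.8),
        ("gene_y", "aerobic respiration", 0.7),
    ]
    known_location = {"gene_x": "chloroplast"}
    kept, rejected = filter_predictions(predictions, known_location)
    print("kept:", kept)
    print("rejected:", rejected)
```

The design choice worth noting is that the rules act as a post-filter: the statistical scores are left untouched, and the ontology only removes predictions that contradict stated knowledge.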
Alex Stewart, PhD Somalogic Boulder, CO, USA CV: pdf Title: How to Interpret One Thousand Proteins From (Almost) One Thousand Samples, and What This Can Tell Us About Heart Disease Biology and Individual Risk Authors: Mike Mehan, Alex Stewart (Somalogic, Boulder, CO), and Shintaro Kato (NEC Corporation of America) Abstract: Conventional blood-based biomarker studies for specific diseases have been technically limited to observing relatively few proteins, selected from those already believed to be involved in the biology of the disease of interest. This methodology has the failing that it can never find new biological processes or discover proteins that are not already suspected to be linked to the disease. Proteomics using the SomaScan platform offers the ability to survey over a thousand proteins from a multitude of physiological processes in under 100 microliters of blood plasma. Such high-dimensional data can illuminate the entire range of biological processes involved; however, its analysis is more challenging (as this audience is well aware). We applied the SomaScan platform to a prospective cohort of 987 individuals with stable coronary heart disease (CHD) and a median of 6 years of follow-up. Individuals with stable CHD are at higher than normal risk for cardiovascular (CV) events; however, this risk varies widely, from lower than that of the general population to much higher. Current methods for CV risk assessment are poor, and both overtreatment and undertreatment of individuals result. From the thousand proteins measured with SomaScan we derived a simple 10-protein model that yields world-beating performance for predicting the crippling consequences of this disease: death, heart failure, heart attack, and stroke (Figure). Notably, these 10 proteins are new to the field of CV risk stratification. 
This presentation is the story of how we learned to combine modern bioinformatics tools and biological knowledge. We believe the commercial product we are developing has the potential to enable targeting of preventive therapies, enhance patient management, enrich enrollment for CV event clinical trials, identify potential targets for therapeutic discovery, allow novel discovery of particular subgroups, and clarify idiopathic etiologies from larger cohorts of individuals. We compare our performance to existing metrics and look beyond the 10-protein model to a future in which the interpretation of protein biomarkers depends upon who you are. |
Olga Troyanskaya, PhD Assistant Professor Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics Princeton University USA CV: pdf Title: From Tissue-specific Functional Networks to Understanding Human Disease Abstract: The ongoing explosion of new technologies in functional genomics offers the promise of understanding gene function, interactions, and regulation at the systems level. This should enable us to develop comprehensive descriptions of genetic systems of cellular controls, including those whose malfunctioning becomes the basis of genetic disorders, such as cancer, and others whose failure might produce developmental defects in model systems. However, the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it on a systems level, and apply it to the study of specific pathways or genetic disorders. These challenges are further exacerbated by the biological complexity of metazoans, including diverse biological processes, individual tissue types and cell lineages, and by the increasingly large scale of data in higher organisms. I will describe how we address these challenges through the development of bioinformatics frameworks for the study of gene function and regulation in complex biological systems and through close coupling of these methods with experiments, thereby contributing to the understanding of human disease. I will specifically discuss how integrated analysis of functional genomics data can be leveraged to study cell-lineage-specific gene expression, to identify proteins involved in disease in a way complementary to quantitative genetics approaches, and to direct both large-scale and traditional biological experiments. |