Highlights Track Presentations

Highlights, Late Breaking Research and Proceedings Track presentations are grouped by theme.

Data:

Includes data and text-mining, ontologies, databases and machine learning approaches that do not fit in other categories.

Disease:

Includes analysis of mutations, phenotypes, drugs, epidemiology and other clinically relevant areas.

Genes:

Includes work on genes (including non-coding RNA), transcriptomes, genomes and variation.

Proteins:

Includes analysis of proteins and their structures and proteomics.

Systems:

This theme includes higher level systems such as cells, tissues, whole organisms and ecosystems. Includes systems biology, molecular interactions and genetic regulation.

Other:

Research areas that do not fall within the five (5) main thematic areas. The organizers may, at their discretion, move submissions to other thematic areas.

Data
Presenting author: David Gfeller, University of Lausanne, Switzerland
Date: Tuesday, July 14, 10:50 am - 11:10 am
Room: Liffey Hall 2

Additional authors:
Aurelien Grosdidier, Swiss Institute of Bioinformatics, Switzerland
Matthias Wirth, Swiss Institute of Bioinformatics, Switzerland
Antoine Daina, Swiss Institute of Bioinformatics, Switzerland
Olivier Michielin, Swiss Institute of Bioinformatics, Switzerland
Vincent Zoete, Swiss Institute of Bioinformatics, Switzerland

Area Session Chair: Ioannis Xenarios

Presentation Overview:
Large-scale phenotypic screening initiatives increasingly allow researchers to test the functional impact of small molecules in different eukaryotic species. However, for most bioactive compounds the targets are only partially known. Here, we introduce a new computational approach to predict the targets of bioactive small molecules based on a combination of chemical similarity measures [Gfeller et al. Bioinformatics, Dec 2013]. We further investigate the use of target homology to transfer small molecule-target interactions across organisms. Interestingly, when considering orthology and paralogy relationships separately, we find that mapping small molecule interactions among orthologs significantly improves prediction accuracy, while including paralogs leads to lower prediction accuracy. Overall, our work provides a novel approach to accurately predict the targets of small molecules by combining different kinds of chemical similarity measures and, for the first time, integrates target homology to leverage data from different species. The method is accessible at http://www.swisstargetprediction.ch.

Presenting author: Hector Corrada Bravo, University of Maryland, United States
Date: Monday, July 13, 3:50 pm - 4:10 pm
Room: Wicklow Hall 2A

Additional authors:
Florin Chelaru, University of Maryland, United States

Area Session Chair: Robert F. Murphy

Presentation Overview:
Data visualization is an integral aspect of the analysis of epigenomic experimental results. Commonly, the data visualized in these tools is the output of analyses performed in computing environments like Bioconductor. These two essential aspects of data analysis, algorithmic/statistical analysis and visualization, are usually distinct and disjoint but are most effective when used iteratively. We will introduce epigenomics data visualization tools that provide tight-knit integration with computational and statistical modeling and data analysis: Epiviz (http://epiviz.cbcb.umd.edu), a web-based genome browser application, and the Epivizr Bioconductor package that provides interactive integration with R/Bioconductor sessions. This combination of technologies permits interactive visualization within a state-of-the-art functional genomics analysis platform. The web-based design of our tools facilitates the reproducible dissemination of interactive data analyses in a user-friendly platform.

Disease
Presenting author: Martin Kircher, University of Washington, United States
Date: Sunday, July 12, 2:00 pm - 2:20 pm
Room: The Auditorium

Additional authors:
Daniela Witten, University of Washington, United States
Preti Jain, Columbia University, United States
Brian O'Roak, Oregon Health and Science University, United States
Gregory Cooper, HudsonAlpha Institute for Biotechnology, United States
Jay Shendure, University of Washington, United States

Area Session Chair: Yana Bromberg

Presentation Overview:
The interpretation of human genetic variation on a genome-wide scale is a crucial challenge in both research and clinical settings. Available annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. missense changes). We developed a broadly applicable metric that objectively weights and integrates the large, diverse, and otherwise unwieldy collection of annotation data available. Combined Annotation Dependent Depletion (CADD) integrates these annotations by contrasting variants that survived natural selection with simulated mutations. We show that CADD-based scores correlate with allelic diversity, pathogenicity of both coding and non-coding variants, and experimentally measured regulatory effects, and also highly rank causal variants within individual genome sequences. We pre-compute SNV scores for the whole human genome and enable scoring of short InDels (http://cadd.gs.washington.edu). We describe our method and discuss the integration of additional annotations as well as methodological improvements that we have made over the last year.

Presenting author: Mark Leiserson, Brown University, United States
Date: Monday, July 13, 2:00 pm - 2:20 pm
Room: The Auditorium

Additional authors:
Fabio Vandin, Brown University, United States
Hsin-Ta Wu, Brown University, United States
Jason Dobson, Brown University, United States
Alexandra Papoutsaki, Brown University, United States
Beifang Niu, Washington University in St. Louis, United States
Michael McLellan, Washington University in St. Louis, United States
Michael Lawrence, Broad Institute of MIT and Harvard, United States
Abel Gonzalez-Perez, Pompeu Fabra University, Spain
David Tamborero, Pompeu Fabra University, Spain
Gregory Ryslik, Yale University, United States
Yuwei Cheng, Yale University, United States
Nuria Lopez-Bigas, Pompeu Fabra University, Spain
Gad Getz, Broad Institute of MIT and Harvard, United States
Li Ding, Washington University in St. Louis, United States
Benjamin Raphael, Brown University, United States

Area Session Chair: Paul Horton

Presentation Overview:
A key challenge in cancer genomics is to identify mutations that drive cancer in a cohort of tumor samples. These mutations often target genetic regulatory and signaling pathways and protein complexes, each including multiple genes. We present the HotNet2 (diffusion oriented subnetworks) algorithm for identifying significantly mutated subnetworks in a protein interaction network. HotNet2 uses an insulated heat diffusion process to simultaneously encode the local topology of a protein and its mutations when identifying significantly mutated (hot) subnetworks. We applied HotNet2 to The Cancer Genome Atlas Pan-Cancer dataset, including 3110 tumor samples from twelve cancer types. HotNet2 identified significantly mutated subnetworks overlapping well-known cancer pathways, protein complexes with recently characterized roles in cancer (e.g. SWI/SNF and BAP1), and less characterized complexes (including the condensin and cohesin complexes). Our presentation will also include recent extensions and applications of the HotNet2 algorithm.

Presenting author: Jinfeng Liu, Genentech, United States
Date: Tuesday, July 14, 10:50 am - 11:10 am
Room: The Liffey A

Additional authors:
Mark McCleland, Genentech, United States
Eric Stawiski, Genentech, United States
Florian Gnad, Genentech, United States
Oleg Mayba, Genentech, United States
Peter Haverty, Genentech, United States
Steffen Durinck, Genentech, United States
Ying-Jiun Chen, Genentech, United States
Christiaan Klijn, Genentech, United States
Suchit Jhunjhunwala, Genentech, United States
Michael Lawrence, Genentech, United States
Hanbin Liu, Genentech, United States
Yinan Wan, Genentech, United States
Vivek Chopra, Genentech, United States
Murat Yaylaoglu, Genentech, United States
Wenlin Yuan, Genentech, United States
Connie Ha, Genentech, United States
Houston Gilbert, Genentech, United States
Jens Reeder, Genentech, United States
Gregoire Pau, Genentech, United States
Jeremy Stinson, Genentech, United States
Howard Stern, Genentech, United States
Gerard Manning, Genentech, United States
Thomas Wu, Genentech, United States
Richard Neve, Genentech, United States
Frederic de Sauvage, Genentech, United States
Zora Modrusan, Genentech, United States
Somasekar Seshagiri, Genentech, United States
Ron Firestein, Genentech, United States
Zemin Zhang, Genentech, United States

Area Session Chair: Louxin Zhang

Presentation Overview:
Integrative data analysis of genomic and transcriptomic alterations has become critical to our understanding of disease drivers and personalized cancer therapy. Here, we describe the first comprehensive characterization of paired exomes and transcriptomes of 48 primary tumors and 21 cell lines from gastric cancer, the second leading cause of worldwide cancer mortality. We found that more than half of our patient collection could potentially benefit from targeted therapies. We performed systematic analysis of both mutation-dependent aberrant splicing and mutation-independent splicing isoforms in gastric cancer, and identified 55 splice-site mutations accompanied by aberrant splicing products and about 200 genes with differential isoform usage between tumors and normals. Among genes in cancer pathways found to have altered splicing in tumors, we discovered that the long isoform of ZAK kinase was preferentially upregulated in several cancer types, and isoform-specific oncogenic properties of ZAK were subsequently confirmed by functional validation.

Presenting author: Olivier Lichtarge, Baylor College of Medicine, United States
Date: Sunday, July 12, 2:20 pm - 2:40 pm
Room: The Auditorium

Additional authors:
Martin Lisewski, Baylor College of Medicine, United States
Angela Wilkins, Baylor College of Medicine, United States
Panagiotis Katsonis, Baylor College of Medicine, United States

Area Session Chair: Yana Bromberg

Presentation Overview:
Slide 1 will break the problem of computing personalized therapy into steps, each one a paper. Slides 2-4 will discuss the Cell paper: a network compression approach to integrate and analyze structured Big Data from databases, culminating with the discovery of the target and mechanism of the best anti-malarial drug with use for future drug screens. Slides 5-6 will expand integration to unstructured data from the entire literature using AI, with an application to p53 biology. Slides 7-9 will turn to the inclusion of personalized information into the network by accurately scoring individual genome variations. Illustrations will summarize winning performance to predict deleterious mutations at the CAGI blind competition and application to head and neck cancer. Slide 10 will summarize the strategy, key results and future directions.

Presenting author: David Seifert, ETH Zurich, Switzerland
Date: Monday, July 13, 10:50 am - 11:10 am
Room: Liffey Hall 2

Additional authors:
Francesca Di Giallonardo, University Hospital Zurich, Switzerland
Karin J. Metzner, University Hospital Zurich, Switzerland
Huldrych F. Günthard, University Hospital Zurich, Switzerland
Niko Beerenwinkel, ETH Zurich, Switzerland

Area Session Chair: Yves Moreau

Presentation Overview:
QuasiFit is a Bayesian MCMC sampler for inferring intra-host viral fitness landscapes from next-generation sequencing data. To estimate fitness, QuasiFit uses cross-sectional genetic data and assumes the viral quasispecies to be in mutation-selection equilibrium. With the inferred posterior fitness distribution, effects such as epistasis and neutral genotype networks can be determined, which will be helpful in judging which viral strains are highly fit and driving intra-host evolution. We applied QuasiFit to infer the viral fitness landscapes in two HIV-infected patients. By using intra-host data, QuasiFit enables learning of host-specific, personalized viral fitness landscapes.

Presenting author: Donna Slonim, Tufts University, United States
Date: Monday, July 13, 11:40 am - 12:00 pm
Room: Liffey Hall 2

Additional authors:
Heather Wick, Johns Hopkins University, United States
Daniel Kee, Braintree Payments, United States
Keith Noto, Ancestry, Inc, United States
Jill Maron, Tufts University, United States
Donna Slonim, Tufts University, United States
Jisoo Park, Tufts University, United States

Area Session Chair: Yves Moreau

Presentation Overview:
Experiences during early development can affect lifelong health and disease risk. In our study, we have identified significant and surprising links between diseases and several tissue-specific developmental processes. Our work relies on a novel approach whose strength comes from pooling disease genes across related diseases, overcoming problems posed by limited information about gene-disease associations. We demonstrate the efficacy of the pooling method by evaluation on withheld data. We further validate the links between developmental processes and disease by demonstrating that our results, collectively, recover expected connections, such as those between heart development and cardiovascular disorders. We also describe some of the more surprising connections we found, several of which are consistent with other molecular evidence or recent literature. Finally, we present a web-based application that enables users to perform the same analysis for any set of genes of interest, and includes a visualization tool for exploration of the results.

Presenting author: Robert Kueffner, Helmholtz Center Munich, Germany
Date: Monday, July 13, 2:00 pm - 2:20 pm
Room: Liffey Hall 2

Additional authors:
Zach Neta, Prize4Life, Israel
Gustavo Stolovitzky, IBM, United States

Area Session Chair: Knut Reinert

Presentation Overview:
We developed a crowdsourced DREAM Challenge to predict ALS disease progression using clinical trial data. The data are complex and non-uniform as they were measured by different laboratories. Therefore, an important step was the harmonization of the different data sets. On these clinical data, tree-based ensemble regression techniques proved to be most effective for machine learning. Based on the accuracy of the winning algorithms, we will present a simulation model to estimate the expected reduction in the number of patients needed for a clinical trial. The best performing submissions also outperformed the predictions of a group of world leading clinicians. One important outcome of the challenge was the identification of novel predictors of progression rate, potentially offering novel insights about disease mechanisms. We will also discuss our registrant survey, in which we determined factors that motivated or discouraged potential solvers from participating.

Presenting author: Olivier Gevaert, Stanford University, United States
Date: Tuesday, July 14, 10:10 am - 10:30 am
Room: The Liffey A

Area Session Chair: Louxin Zhang

Presentation Overview:
Aberrant DNA methylation is an important mechanism that contributes to oncogenesis. Yet, few algorithms exist that exploit this vast dataset to identify hypo- and hyper-methylated genes in cancer. We developed a novel computational algorithm called MethylMix to identify differentially methylated genes that are also predictive of transcription. We apply MethylMix to twelve individual cancer sites, and additionally combine all cancer sites in a pancancer analysis. We discover pancancer hypo- and hyper-methylated genes and identify novel methylation-driven subgroups with clinical implications. MethylMix analysis on combined cancer sites reveals ten pancancer clusters reflecting new similarities across malignantly transformed tissues.

Genes
Presenting author: Volodymyr Kuleshov, Stanford University, United States
Date: Sunday, July 12, 10:10 am - 10:30 am
Room: The Liffey A

Additional authors:
Dan Xie, Stanford University, United States
Rui Chen, Stanford University, United States
Dmitry Pushkarev, Stanford University, United States
Zhihai M, Stanford University, United States
Tim Blauwkamp, Illumina, Inc., United States
Michael Kertesz, Stanford University, United States
Serafim Batzoglou, Stanford University, United States
Michael Snyder, Stanford University, United States

Area Session Chair: Siu Ming Yiu

Presentation Overview:
New synthetic long read technologies are finally offering us tools for studying unresolved aspects of the human genome such as structural variation and genomic phase. In this talk, we will present a new phasing technology based on Tru-seq synthetic long reads that places more than 99% of human genomic variants into highly accurate, megabase-long haplotypes. Its low cost and excellent performance bring haplotyping one step closer to being a routine tool for studying allele-specific phenomena such as differential methylation.

At the core of this technology is a novel phasing algorithm called Prism that augments long-read phasing with statistical methods; this idea dramatically reduces sequencing requirements, increases haplotype length by almost 10x, and supports any long-read phasing technology. More generally, we will demonstrate through this as well as other ongoing work in metagenomics and de-novo assembly how synthetic long reads combined with sophisticated algorithms can help solve important problems in genomics.

Presenting author: Jeroen de Ridder, Delft University of Technology, Netherlands
Date: Sunday, July 12, 2:40 pm - 3:00 pm
Room: The Liffey B

Additional authors:
Sepideh Babaei, Delft University of Technology, Netherlands
Waseem Akhtar, Netherlands Cancer Institute, Netherlands
Johann de Jong, Netherlands Cancer Institute, Netherlands
Marcel Reinders, Delft University of Technology, Netherlands

Area Session Chair: Janet Kelso

Presentation Overview:
Genomically distal mutations can contribute to deregulation of cancer genes by engaging in chromatin interactions. To study this, we overlay viral cancer-causing insertions obtained in a murine retroviral insertional mutagenesis screen with genome-wide chromatin conformation capture data. In this talk, we show that insertions tend to cluster in 3D hotspots within the nucleus. The identified hotspots are significantly enriched for known cancer genes, and bear the expected characteristics of bona-fide regulatory interactions, such as enrichment for transcription factor binding sites. Additionally, we observe a striking pattern of mutually exclusive integration. This is an indication that insertions in these loci target the same gene, either in their linear genomic vicinity or in their 3D spatial vicinity. Our findings shed new light on the repertoire of targets obtained from insertional mutagenesis screening and underline the importance of considering the genome as a 3D structure when studying effects of genomic perturbations.

Presenting author: Brendan Frey, University of Toronto, Canada
Date: Sunday, July 12, 3:50 pm - 4:10 pm
Room: The Liffey B

Additional authors:
Hui Xiong, University of Toronto, Canada
Babak Alipanahi, University of Toronto, Canada
Leo Lee, University of Toronto, Canada
Hannes Bretschneider, University of Toronto, Canada
Daniele Merico, University of Toronto, Canada
Ryan Yuen, University of Toronto, Canada
Yimin Hua, University of Toronto, Canada
Serge Gueroussov, University of Toronto, Canada
Hamed Najafabadi, University of Toronto, Canada
Tim Hughes, University of Toronto, Canada
Quaid Morris, University of Toronto, Canada
Yoseph Barash, University of Toronto, Canada
Adrian Krainer, University of Toronto, Canada
Nebojsa Jojic, University of Toronto, Canada
Steve Scherer, University of Toronto, Canada
Ben Blencowe, University of Toronto, Canada

Area Session Chair: Janet Kelso

Presentation Overview:
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.

Presenting author: Amit Deshwar, University of Toronto, Canada
Date: Tuesday, July 14, 12:20 pm - 12:40 pm
Room: The Auditorium

Additional authors:
Shankar Vembu, University of Toronto, Canada
Christina Yung, Ontario Institute for Cancer Research, Canada
Gun Ho Jang, Ontario Institute for Cancer Research, Canada
Lincoln Stein, Ontario Institute for Cancer Research, Canada
Quaid Morris, University of Toronto, Canada

Area Session Chair: Niko Beerenwinkel

Presentation Overview:
Tumors often contain multiple, genetically diverse subpopulations. Reconstructing the genotype of these subpopulations by determining which of the somatic tumor-associated mutations they contain is a problem of considerable interest to aid in the understanding of tumor development and treatment response. While there has been considerable progress in automated methods for reconstruction, many fundamental questions about this problem remain unanswered. Many subclonal reconstruction methods, including ours, attempt to reconstruct the evolutionary history of the tumor as a means to assign complete genotypes to each subpopulation. I will discuss the current state of the field and our latest work on this problem. I will introduce PhyloWGS, a Bayesian method that is the first to use CNVs and SNVs to perform phylogenetic subclonal reconstruction. PhyloWGS returns a distribution over possible subclonal reconstructions, enabling the identification of portions of the reconstruction that are highly certain and those that are not.

Presenting author: James Zou, Microsoft Research, United States
Date: Monday, July 13, 2:40 pm - 3:00 pm
Room: The Liffey A

Area Session Chair: Uwe Ohler

Presentation Overview:
In epigenome-wide association studies, cell-type composition often differs between cases and controls, yielding associations that simply tag cell type rather than reveal fundamental biology. Current solutions require actual or estimated cell-type composition, information not easily obtainable for many samples of interest. We propose a method, FaST-LMM-EWASher, that automatically corrects for cell-type composition without the need for explicit knowledge of it, and then validate our method by comparison with the state-of-the-art approach.

Presenting author: Amin Ardeshirdavani, KU Leuven, Belgium
Date: Monday, July 13, 11:40 am - 12:00 pm
Room: The Liffey A

Additional authors:
Erika Souche, KU Leuven, Belgium
Luc Dehaspe, KU Leuven, Belgium
Jeroen Van Houdt, KU Leuven, Belgium
Joris Robert Vermeesch, KU Leuven, Belgium
Yves Moreau, KU Leuven, Belgium

Area Session Chair: Jerome Waldispuhl

Presentation Overview:
As many personal genomes are now being sequenced, collaborative analysis of those genomes has become essential to effectively gain biomedical knowledge from those sequencing efforts. However, analysis of personal genomic data raises important confidentiality issues. We propose a methodology called NGS-Logistics for federated analysis of sequence variants from personal genomes that helps alleviate those problems. Our method allows querying the genome for both a set of samples to which the user has authorized direct access (active data set) and for the whole set of samples. The query results are statistics that do not breach data confidentiality but allow further exploration of the data. Relevant samples outside the active data set can be identified through pseudonymous identifiers so that researchers can negotiate access to these samples with the authorized party. This approach minimizes the impact on data confidentiality while enabling powerful data analysis by gaining access to important rare samples.

Presenting author: John Whitaker, Janssen Pharmaceutical Companies of Johnson & Johnson, United States
Date: Monday, July 13, 4:10 pm - 4:30 pm
Room: The Liffey A

Additional authors:
Wei Wang, UCSD, United States
Zhou Chen, UCSD, United States
Kai Zhang, UCSD, United States

Area Session Chair: Uwe Ohler

Presentation Overview:
The epigenome is established and maintained by the site-specific recruitment of chromatin-modifying enzymes and their cofactors. Identifying the cis elements that regulate epigenomic modification is critical for understanding the regulatory mechanisms that control gene expression patterns. We present Epigram, an analysis pipeline that predicts histone modification and DNA methylation patterns from DNA motifs. The identified cis elements represent interactions with the site-specific DNA-binding factors that establish and maintain epigenomic modifications. We cataloged the cis elements in embryonic stem cells and four derived lineages and found numerous motifs that have location preference, such as at the center of H3K27ac or at the edges of H3K4me3 and H3K9me3, which provides mechanistic insight about the shaping of the epigenome.

Presenting author: Deniz Yorukoglu, Massachusetts Institute of Technology, United States
Date: Sunday, July 12, 11:40 am - 12:00 pm
Room: The Liffey A

Additional authors:
Y. William Yu, Massachusetts Institute of Technology, United States
Jian Peng, Massachusetts Institute of Technology, United States
Bonnie Berger, Massachusetts Institute of Technology, United States

Area Session Chair: Siu Ming Yiu

Presentation Overview:
In this presentation, we show how to recover quality information directly from sequence data using the compression tool “Quartz,” rendering such scores redundant and yielding substantially better space and time efficiencies for storage and analysis. Quartz is designed to operate on reads in FASTQ format but can be trivially modified to discard quality scores in other formats with sequence-quality score pairs.

Discarding 95% of quality scores counterintuitively resulted in improved genotyping, implying that compression need not come at the expense of accuracy. In contrast to previous results, we show that although completely discarding quality scores comes at the cost of accuracy and quality score recalibration to improve variant calling accuracy generally decreases compressibility, there is a happy medium at which we can get both good compression and improved accuracy.

Presenting author: Marnix Medema, Wageningen University, Netherlands
Date: Sunday, July 12, 3:30 pm - 3:50 pm
Room: The Liffey A

Additional authors:
Peter Cimermancic, University of California, San Francisco, United States
Jan Claesen, University of California, San Francisco, United States
Kenji Kurita, University of California, Santa Cruz, United States
Eriko Takano, University of Manchester, United Kingdom
Andrej Sali, University of California, San Francisco, United States
Roger Linington, University of California, Santa Cruz, United States
Michael Fischbach, University of California, San Francisco, United States

Area Session Chair: Reinhard Schneider

Presentation Overview:
Bacterial secondary metabolism is of major importance to society, as it is the source of large numbers of antibiotics, anticancer agents, and other important bioactive compounds. The genes encoding the biosynthetic pathways to make these molecules are usually grouped together on the chromosome in so-called biosynthetic gene clusters (BGCs). In our recent paper (Cell 158: 412-421, 2014), we describe a novel algorithm to effectively identify BGCs, and apply this to perform a systematic analysis of BGCs throughout the prokaryotic tree of life. Network analysis of the predicted BGCs revealed numerous large gene cluster families, most of which are uncharacterized. We experimentally characterized the largest of these, which is widespread among bacteria and encodes the biosynthesis of molecules that appear to protect their hosts against oxidative stress. Finally, a detailed evolutionary genomic analysis of all known and predicted BGCs revealed how the astonishing molecular diversity of microbial secondary metabolism continuously evolves.

Presenting author: Iddo Friedberg, Miami University, United States
Date: Sunday, July 12, 3:50 pm - 4:10 pm
Room: The Liffey A

Additional authors:
David Ream, Miami University, United States
Asma Bankapur, Miami University, United States

Area Session Chair: Reinhard Schneider

Presentation Overview:
Gene blocks are genes co-located on the chromosome. In many cases, gene blocks are conserved between bacterial species, sometimes as operons, when genes are co-transcribed. The conservation is rarely absolute: gene loss, gain, duplication, block splitting, and block fusion are frequently observed. An open question in bacterial molecular evolution is that of the formation and breakup of gene blocks, for which several models have been proposed. These models, however, are not generally applicable to all types of gene blocks, and consequently cannot be used to broadly compare and study gene block evolution. To address this problem we introduce an event-based method for tracking gene block evolution in bacteria.

In my talk, I will explain this method and demonstrate a new visualization technique we call phylomatrices. I will show how we can easily gauge operon conservation, and discover interesting clade-based aberrations as well as horizontal gene transfers.

Presenting author: Erez Levanon, Bar-Ilan University, Israel
Date: Monday, July 13, 10:30 am - 10:50 am
Room: The Liffey A

Additional authors:
Hagit Porath, Bar-Ilan University, Israel
Shai Carmi, Columbia University, United States

Area Session Chair: Jerome Waldispuhl

Presentation Overview:
Adenosine-to-inosine editing is one of the most frequent post-transcriptional modifications, manifested as A-to-G mismatches when comparing RNA sequences with their source DNA. Recently, a number of RNA-seq data sets have been screened for the presence of A-to-G editing, and hundreds of thousands of editing sites have been identified. Here we show that existing screens missed the majority of sites by ignoring reads with excessive ('hyper') editing that do not easily align to the genome. We show that careful alignment and examination of the unmapped reads in RNA-seq studies in human reveal numerous new sites, usually many more than originally discovered, and in precisely those regions that are most heavily edited. Specifically, we more than double the number of detected sites in several published screens. We also identify thousands of new sites in mouse, rat, opossum and fly. Our results establish that hyper-editing events account for the majority of editing sites.

Presenting author: Fran Supek, Centre for Genomic Regulation, Spain
Date: Monday, July 13, 12:00 pm - 12:20 pm
Room: The Liffey A

Additional authors:
Belén Miñana, Centre For Genomic Regulation, Barcelona, Spain
Juan Valcárcel, Centre For Genomic Regulation, Barcelona, Spain
Toni Gabaldón, Centre For Genomic Regulation, Barcelona, Spain
Ben Lehner, Centre For Genomic Regulation, Barcelona, Spain
Anita Kriško, Mediterranean Institute for Life Sciences, Split, Croatia
Tea Copić, Mediterranean Institute for Life Sciences, Split, Croatia

Area Session Chair: Jerome Waldispuhl

Presentation Overview:
Synonymous mutations do not change the encoded amino acids, but are known to have subtle effects on protein translation and on regulation of splicing. We have examined the prevalence of synonymous mutations among somatic changes catalogued across ~3800 human cancer genomes (Supek et al, Cell 2014). Oncogenes harbor an excess of synonymous mutations when compared against the broader gene set and intronic mutation rates. Such mutations were likely to alter exonic splicing enhancer/silencer motifs; RNA-Seq data indicated this leads to aberrantly spliced transcripts. Next, we analyzed the synonymous codon usage biases in 910 prokaryotic genomes (Krisko et al, Genome Biol 2014). Here, we found associations of codon biases within orthologous gene clusters to environmental preferences of microbes, and used this to predict the adaptive value of genes for aerobic, hot, or hypersaline environments. Out of 200 novel functional annotations for COG groups thus obtained, we experimentally validated 35/44 tested predictions.

Presenting author: Jason Ernst, University of California Los Angeles, United States
Date: Monday, July 13 2:20 pm - 2:40 pm
Room: The Liffey A

Additional authors:
Manolis Kellis, Massachusetts Institute of Technology, United States

Area Session Chair: Uwe Ohler

Presentation Overview:
With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression trees. We impute 4,315 high-resolution signal maps, of which 26% are also experimentally observed. Imputed signal tracks show overall similarity to observed signals and surpass experimental datasets in consistency, recovery of gene annotations and enrichment for disease-associated variants. We use the imputed data to detect low-quality experimental datasets, to find genomic sites with unexpected epigenomic signals, to define high-priority marks for new experiments and to delineate chromatin states in 127 reference epigenomes spanning diverse tissues and cell types. Our imputed datasets provide the most comprehensive human regulatory region annotation to date, and our approach and the ChromImpute software constitute a useful complement to large-scale experimental mapping of epigenomic information.

Presenting author: Pavel Sumazin, Baylor College of Medicine, United States
Date: Monday, July 13 10:50 am - 11:10 am
Room: The Liffey A

Area Session Chair: Jerome Waldispuhl

Presentation Overview:
We introduce a method for simultaneous prediction of microRNA-target interactions and their mediated competitive endogenous RNA (ceRNA) interactions. Using high-throughput validation assays in breast cancer cell lines, we show that our integrative approach significantly improves on microRNA-target prediction accuracy as assessed by both mRNA and protein level measurements. Our biochemical assays support nearly 500 microRNA-target interactions with evidence for regulation in breast-cancer tumors. Moreover, these assays constitute the most extensive validation platform for computationally inferred networks of microRNA-target interactions in breast-cancer tumors, providing a useful benchmark to ascertain future improvements.

Presenting author: Lukas Chavez, German Cancer Research Institute, Germany
Date: Sunday, July 12 4:10 pm - 4:30 pm
Room: Liffey Hall 2

Area Session Chair: Janet Kelso

Presentation Overview:
A characteristic feature of asthma is the aberrant accumulation, differentiation or function of memory CD4(+) T cells that produce type 2 cytokines (TH2 cells). By mapping genome-wide histone modification profiles for subsets of T cells isolated from peripheral blood of healthy and asthmatic individuals, we identified enhancers with known and potential roles in the normal differentiation of human TH1 and TH2 cells. We discovered disease-specific enhancers in T cells that differ between healthy and asthmatic individuals. Enhancers that gained the histone H3 Lys4 dimethyl (H3K4me2) mark during TH2 cell development showed the highest enrichment for asthma-associated single nucleotide polymorphisms (SNPs), which supported a pathogenic role for TH2 cells in asthma. In silico analysis of cell-specific enhancers revealed transcription factors, microRNAs and genes potentially linked to human TH2 cell differentiation. Our results establish the feasibility and utility of enhancer profiling in well-defined populations of specialized cell types involved in disease pathogenesis.

Presenting author: Dietlind Gerloff, Foundation for Applied Molecular Evolution, United States
Date: Monday, July 13 3:50 pm - 4:10 pm
Room: Liffey Hall 2

Additional authors:
Jonathan Magasin, University of California Santa Cruz, United States

Area Session Chair: Knut Reinert

Presentation Overview:
A metagenomic sample can contain billions of cells and thousands of different genomes. The set of sequencing reads derived from it will be sparse by comparison and underrepresent this complexity by orders of magnitude. Additionally, metagenome annotation is confounded by short reads that capture only small fragments of genes, and by the small fraction of known microbes represented in sequence databases, often described as "the culturable 1%". The resulting difficulties, which include distinguishing known from novel species, often affect the majority of reads in a data set.
In our paper, we demonstrate quantitatively how careful assembly of marine metagenomic pyrosequencing reads within, but also across, datasets can alleviate annotation problems. Our results outline exciting prospects for data sharing in the metagenomics community. In follow-on work, we have developed a new "geographic profiling" approach that allows us to use chimeric contigs obtained through pooled assembly for (low-cost) discovery of new species in old data.

Presenting author: Oliver Stegle, EMBL European Bioinformatics Institute, United Kingdom
Date: Sunday, July 12 10:30 am - 10:50 am
Room: The Liffey A

Area Session Chair: Siu Ming Yiu

Presentation Overview:
Many key biological processes are driven by differences in the regulatory landscape between single cells. Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new, and physiologically relevant, sub-populations of cells can be found. A key bioinformatics challenge in analyzing these data is to comprehensively account for the different sources of variation between cells such that biologically relevant signatures can be reliably identified.

To address this, we here develop a computational approach to dissect single-cell transcriptome variation data, accounting for known and hidden sources of variation. We validate this latent variable model on single-cell data from labeled cellular states before applying it to study data generated from asynchronously differentiating T cells. By accounting for cell-to-cell correlations due to the cell cycle, we show how single-cell RNA-Seq data can be used to place individual cells on the trajectory between undifferentiated and differentiated cells.

Presenting author: Feng Yue, The Pennsylvania State University, United States
Date: Sunday, July 12 2:20 pm - 2:40 pm
Room: The Liffey A

Additional authors:
Yong Cheng, Stanford University, United States
Alessandra Breschi, Centre for Genomic Regulation and UPF, Spain
Jeff Vierstra, University of Washington, United States
Weisheng Wu, Computational Medicine & Bioinformatics, United States
Tyrone Ryba, New College of Florida, United States
Ricard Sandstrom, University of Washington, United States
Zhihai Ma, Stanford University, United States
Carrie Davis, Cold Spring Harbor Laboratory, United States
Benjamin Pope, Florida State University, United States
Yin Shen, University of California San Diego, United States
John Stamatoyannopoulos, University of Washington, United States
Michael Snyder, Stanford University, United States
Roderic Guigo, Centre for Genomic Regulation and UPF, Spain
Thomas Gingeras, Cold Spring Harbor Laboratory, United States
David Gilbert, Florida State University, United States
Ross Hardison, The Pennsylvania State University, United States

Area Session Chair: Reinhard Schneider

Presentation Overview:
As the premier model organism in biomedical research, the laboratory mouse shares the vast majority of protein-coding genes with humans, but significant differences exist between the two mammals, posing considerable challenges in the modeling of human diseases. The mouse ENCODE consortium produced more than 1000 coordinated datasets, including transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains, in over 100 mouse cell types and tissues. By comparative analysis with the data from human ENCODE, we found that although the majority of gene expression and cis-regulatory elements are conserved between the two species, a large fraction of gene regulatory elements appear to be species-specific. These species-specific elements are enriched for genes involved in certain pathways such as immune system and metabolic process, suggesting that different gene pathways evolve at distinct rates. Our work also provides a great resource for research into mammalian biology and mechanisms of human disease.

Presenting author: Emmanuel Barillot, Institut Curie, France
Date: Monday, July 13 4:10 pm - 4:30 pm
Room: Liffey Hall 2

Additional authors:
Anne Biton, University of California, United States
Emmanuel Barillot, Institut Curie, France
Francois Radvanyi, Institut Curie, France
Andrei Zinovyev, Institut Curie, France

Area Session Chair: Knut Reinert

Presentation Overview:
Large-scale projects are generating massive amounts of molecular profiles for tumoural samples. It remains a challenge to unravel their complexity into the action of relatively few independent signals. This ambitious task can be approached by blind source separation methods such as Independent Component Analysis (ICA). We analysed data on nine different cancers from 21 patient cohorts and 6671 tumours and identified their commonalities, as well as the cancer type-specific characteristics. By careful interpretation of ICA results, we managed to distinguish the signals coming from tumoural cells from those coming from the tumour microenvironment, and clearly identified biases associated with technology and with different treatments of tumour tissue. We showed that the information captured in independent components is also reflected in anatomopathological staining microscopy images. Analysis of one of the bladder cancer-specific ICA components led to the formulation of a new hypothesis on the role of the PPARG gene, which was experimentally verified.

Proteins
Presenting author: Rachel Kolodny, University of Haifa, Israel
Date: Sunday, July 12 12:20 pm - 12:40 pm
Room: The Liffey B

Additional authors:
Nir Ben-Tal, George S. Wise Faculty of Life Sciences, Tel Aviv University, Israel
Sergey Nepomnyachiy, Polytechnic Institute of New York University, United States

Area Session Chair: Anna Tramontano

Presentation Overview:
To globally explore protein space, we represent all similarities among a representative set of domains as networks. In the “domain network” edges connect domains that share “motifs,” i.e., significantly sized segments of similar sequence and structure, and in the “motif network” edges connect recurring motifs that appear in the same domain. These networks offer a way to organize protein space, and examine how the definition of “evolutionary relatedness” among domains influences their structure. At excessively strict thresholds the networks fall apart; for very lax thresholds, there are network paths between virtually all domains. Interestingly, at intermediate thresholds the network comprises two regions: “discrete” versus “continuous.” The discrete region consists of isolated islands, each generally corresponding to a fold; the continuous region is dominated by domains with alternating alpha and beta elements. The networks can also suggest evolutionary paths between domains, and be used for protein search and design.

Presenting author: Andrei Lupas, Max-Planck-Institute for Developmental Biology, Germany
Date: Tuesday, July 14 2:40 pm - 3:00 pm
Room: The Liffey B

Additional authors:
Iuliia Boichenko, Max-Planck-Institute for Developmental Biology, Germany
Mateusz Korycinski, Max-Planck-Institute for Developmental Biology, Germany
Hongbo Zhu, Max-Planck-Institute for Developmental Biology, Germany
Murray Coles, Max-Planck-Institute for Developmental Biology, Germany
Fabio Zanini, Max-Planck-Institute for Developmental Biology, Germany
Marcus Hartmann, Max-Planck-Institute for Developmental Biology, Germany
Birte Hernandez Alvarez, Max-Planck-Institute for Developmental Biology, Germany

Area Session Chair: Donna Slonim

Presentation Overview:
In the public perception, thalidomide mainly evokes children with stunted limbs. Less known is its ongoing importance for treating multiple myeloma and leprosy. Interest in its further pharmacological development thus remains high, but is hindered by the limited understanding of its teratogenic side-effects. Even the main target of thalidomide in the human body, cereblon, was unknown until recently. Given the intractability of cereblon for biochemical studies, we analyzed the evolution of its thalidomide-binding domain and used sequence-structure relationships to identify a prokaryotic model system, which we validated in vitro and in vivo, in a zebrafish fin development assay. In computational and experimental searches we identified uridine as the first biological, universally available ligand. We also found that a surprisingly large number of pharmacologically important substances with known teratogenic effects act through the same binding site as thalidomide, identifying cereblon as a gateway for teratogenicity in the human body.

Presenting author: Charlotta Schaerfe, University of Tübingen/Harvard Medical School, Germany
Date: Tuesday, July 14 12:00 pm - 12:20 pm
Room: The Liffey B

Additional authors:
Thomas Hopf, Technische Universität München/Harvard Medical School, Germany
João Rodrigues, Utrecht University, Netherlands
Anna Green, Harvard Medical School, United States
Oliver Kohlbacher, University of Tübingen, Germany
Chris Sander, Memorial Sloan Kettering Cancer Center, United States
Alexandre Bonvin, Utrecht University, Netherlands
Debora Marks, Harvard Medical School, United States

Area Session Chair: Francisco Melo Ledermann

Presentation Overview:
The interactions of proteins with other biomolecules are essential for all biological activity, and thus the accurate prediction of protein-protein interaction partners and interface residues has been of great interest to the scientific community. Here we present a method, EVcomplex, that predicts such data from the evolutionary sequence record alone by making use of residue coevolution between proteins.
This method can have stark implications for various topics from the determination of the actual binding partners and binding sites in large protein complexes to whole genome interactome predictions. In the presentation I will show that the evolutionary record allows us to predict novel protein-protein interactions as well as alternate binding conformations without additional external knowledge of the protein’s 3D structure.

Presenting author: Igor Jurisica, Princess Margaret Cancer Centre, Canada
Date: Tuesday, July 14 2:20 pm - 2:40 pm
Room: The Liffey B

Additional authors:
Max Kotlyar, UHN, Canada
Chiara Pastrello, UHN, Canada
Flavia Pivetta, CRO, Italy
A Losardo, CRO, Italy
Christian A. Cumbaa, UHN, Canada
Han Li, SLRI, Canada
Z Ding, MDA, United States
Tania Naranian, SLRI, Canada
Yun Niu, Nanjing University, China
F Vafaee, USW, Australia
Julia Petschnigg, UCL, United Kingdom
Gordon Mills, MDA, United States
Andrea Jurisicova, SLRI, Canada
Igor Stagljar, U Toronto, Canada
Roberta Maestro, CRO, Italy

Area Session Chair: Donna Slonim

Presentation Overview:
Protein interaction networks represent an essential infrastructure for systems biology. However, about 20% of human proteins have no known interactions and another 33% have five or fewer. Many of these proteins play important roles in disease and are potential drug targets. To reduce this “disease-related sparseness” of the human interactome, we introduced a data mining-based method, FpClass, and predicted 250,452 high-confidence PPIs among 10,529 proteins, including 1,089 interactome orphans. Compared to previous methods, FpClass achieved better agreement with experimentally detected PPIs. Using three bioassays we validated 137 of 233 tested predictions, including 5 involving orphans now shown to interact with P53. Overall, validation achieved 74% sensitivity with 53% FDR. To better understand why some proteins have few known interactions we investigated their properties and discovered that they are significantly younger, more tissue specific, and more likely to be extracellular than other proteins. However, additional challenges prevent systematic study of these proteins.

Presenting author: Julia Shifman, Hebrew University of Jerusalem, Israel
Date: Tuesday, July 14 3:30 pm - 3:50 pm
Room: The Liffey B

Additional authors:
Yonatan Aizner, Hebrew University, Israel
Jason Shirian, Hebrew University, Israel
Oz Sharabi, Hebrew University, Israel

Area Session Chair: Donna Slonim

Presentation Overview:
We developed an in silico saturation mutagenesis protocol that allows us to scan any binding interface with all amino acids and to predict changes in free energy of binding due to all single mutations, thereby constructing binding landscapes for various protein-protein interactions (PPIs). We tested the performance of the in silico saturation mutagenesis protocol in two evolutionarily different classes of PPIs, high-affinity and multispecific PPIs, and demonstrated that their binding landscapes are remarkably different. Wild-type sequences of high-affinity complexes are nearly optimized for binding and contain only a handful of mutations that enhance binding affinity further. In contrast, sequences of multispecific proteins lie far from the fitness maximum, presenting multiple possibilities for improvement. In both examples we show that our computational predictions agree well with experimental results and allow for successful identification of affinity- and specificity-enhancing mutations and cold-spot positions where mutations to several amino acids produce affinity improvement.

Presenting author: Michael Tress, Spanish National Cancer Research Centre (CNIO), Spain
Date: Sunday, July 12 10:30 am - 10:50 am
Room: The Liffey B

Additional authors:
Alfonso Valencia, Spanish National Cancer Research Centre (CNIO), Spain
David Juan, Spanish National Cancer Research Centre (CNIO), Spain
Iakes Ezkurdia, Centro Nacional de Investigaciones Cardiovasculares, CNIC, Spain
Jesus Vazquez, Centro Nacional de Investigaciones Cardiovasculares, CNIC, Spain
Adam Frankish, Wellcome Trust Sanger Institute, United Kingdom
Jennifer Harrow, Wellcome Trust Sanger Institute, United Kingdom
Mark Diekhans, University of California Santa Cruz (UCSC), United States
Jose Manuel Rodriguez, Spanish National Cancer Research Centre (CNIO), Spain

Area Session Chair: Anna Tramontano

Presentation Overview:
In this paper we mapped peptides from 7 large-scale proteomics studies to protein coding genes from the human genome. While we identified peptides for more than 96% of genes that evolved before bilateria, we did not find peptides for primate-specific genes, for genes without protein-like features or for genes with poor cross-species conservation. We described a set of 2,001 genes that were potentially non-coding based on features such as weak conservation, a lack of protein features, or ambiguous annotations from major databases, all of which correlated with low peptide detection across the seven experiments. We show that many of these genes behave more like non-coding genes than protein-coding genes, and suggest that many may not code for proteins. Their inclusion in the human protein coding gene catalogue is being revised as part of the ongoing human genome annotation effort.

Presenting author: Michal Linial, The Hebrew University of Jerusalem, Israel
Date: Sunday, July 12 11:40 am - 12:00 pm
Room: The Liffey B

Area Session Chair: Anna Tramontano

Presentation Overview:
Protein translation is the most expensive cellular operation; its speed and allocation of resources are therefore tightly controlled. In this study we show that the entire proteomes of yeast, fly, human, worm, plant and cow do not show unique properties at the N-terminal segment, whereas a signal is associated with signal peptide (SP)-containing proteins. We found a pattern in the N-terminal segment that slows down the translation rate for the SP proteome. We critically analyze these observations from statistical and evolutionary perspectives. We generalize our observation to other groups of proteins that are governed by such “speed controls”. Specifically, the pattern of codons and their prevalence was tested for GPI-anchored and mitochondrial transit peptide-containing proteins. In all cases, a “speed control” pattern is recorded for all tested organisms. We conclude that tuning the translation of a nascent protein is essential for coping with the constraints imposed by proteins’ cellular fate.

Systems
Presenting author: Mathieu Clément-Ziza, University of Cologne, Germany
Date: Monday, July 13 10:50 am - 11:10 am
Room: The Liffey B

Additional authors:
Francesc X Marsellach, University College London, United Kingdom
Sandra Codlin, University College London, United Kingdom
Manos A Papadakis, Technical University of Denmark, Denmark
Susanne Reinhardt, Technische Universität Dresden, Germany
Maria Rodriguez-Lopez, University College London, United Kingdom
Stuart Martin, University College London, United Kingdom
Samuel Marguerat, Imperial College London, United Kingdom
Alexander Schmidt, University of Basel, Switzerland
Eunhye Grace Lee, University College London, United Kingdom
Christopher T Workman, Technical University of Denmark, Denmark
Jürg Bähler, University College London, United Kingdom
Andreas Beyer, University of Cologne, Germany

Area Session Chair: Hidde de Jong

Presentation Overview:
Our current understanding of how natural genetic variation affects gene expression beyond well-annotated coding genes is still limited. The use of deep sequencing technologies for the study of expression quantitative trait loci (eQTLs) has the potential to close this gap. Here, we generated the first recombinant strain library for fission yeast and conducted an RNA-seq-based QTL study of the coding, non-coding, and antisense transcriptomes. We show that the frequency of distal effects (trans-eQTLs) greatly exceeds that of local effects (cis-eQTLs) and that non-coding RNAs are as likely to be affected by eQTLs as protein-coding RNAs. We identified a genetic variation of swc5 that modifies the levels of many RNAs, with effects on both sense and antisense transcription, and downstream effects on the histone composition at promoters. The strains, methods, and datasets generated here provide a rich resource for future studies.

Presenting author: Steven Brenner, University of California, Berkeley, United States
Date: Monday, July 13 2:40 pm - 3:00 pm
Room: The Liffey B

Additional authors:
Liana Lareau, University of California, Berkeley, United States

Area Session Chair: Russell Schwartz

Presentation Overview:
Ultraconserved elements, unusually long regions of perfect sequence identity, are found in genes encoding numerous RNA-binding proteins including SR splicing factors. Expression of these genes is regulated via alternative splicing of the ultraconserved regions to yield mRNAs that are degraded by nonsense-mediated mRNA decay (NMD), a process termed unproductive splicing. We have found that unproductive splicing affects all human SR genes, but rather than being ancestral, it arose independently in nearly every case. We demonstrate that unproductive splicing of the splicing factor SRSF5 is conserved among all animals and even observed in fungi; this is a rare example of alternative splicing conserved between kingdoms, yet its effect is to trigger mRNA degradation. As the gene duplicated, the ancient unproductive splicing was lost in paralogs, and distinct unproductive splicing evolved rapidly and repeatedly to take its place.

Presenting author: Tijana Milenkovic, University of Notre Dame, United States
Date: Sunday, July 12 12:00 pm - 12:20 pm
Room: Liffey Hall 2

Additional authors:
Fazle Faisal, University of Notre Dame, United States
Han Zhao, University of Notre Dame, United States

Area Session Chair: Igor Jurisica

Presentation Overview:
Studying human aging is of societal importance. Analyses of gene expression or sequence data have been indispensable for studying human aging, but these typically ignore interconnectivities between genes (proteins). Since proteins interact to keep us alive, and since this is what biological networks (BNs) model, BN research will further our understanding of aging. Because different data types can give complementary biological insights, we integrate current static BNs with aging-related expression data to form dynamic, age-specific BNs. Then, we study cellular changes with age from such BNs to identify key players in aging. Also, analogous to sequence alignment, we use BN alignment to transfer aging-related knowledge from well-studied model species to the poorly studied human between conserved network regions. In the process, we propose a novel, superior BN alignment method. We validate the aging-related candidates resulting from our integrative, dynamic, and comparative BN analyses by linking them to aging-related cellular processes and diseases.

Presenting author: Mohammed AlQuraishi, Harvard Medical School, United States
Date: Monday, July 13 12:00 pm - 12:20 pm
Room: The Liffey B

Additional authors:
Grigoriy Koytiger, Harvard Medical School, United States
Anne Jenney, Harvard Medical School, United States
Gavin MacBeath, Harvard Medical School, United States
Peter Sorger, Harvard Medical School, United States

Area Session Chair: Hidde de Jong

Presentation Overview:
Functional interpretation of genomic variation is critical to understanding human disease, but it remains difficult to predict the effects of specific mutations on protein interaction networks and the phenotypes they regulate. We describe an analytical framework based on multiscale statistical mechanics that integrates genomic and biophysical data to model the human SH2-phosphoprotein network in normal and cancer cells. We apply our approach to data in The Cancer Genome Atlas (TCGA) and test model predictions experimentally. We find that mutations mapping to phosphoproteins often create new interactions but that mutations altering SH2 domains result almost exclusively in loss of interactions. Some of these mutations eliminate all interactions, but many cause more selective loss, thereby rewiring specific edges in highly connected subnetworks. Moreover, idiosyncratic mutations appear to be as functionally consequential as recurrent mutations. By synthesizing genomic, structural and biochemical data, our framework represents a new approach to the interpretation of genetic variation.

Presenting author: Inna Kuperstein, Institut Curie – U900 INSERM - Mines ParisTech, France
Date: Monday, July 13 12:20 pm - 12:40 pm
Room: The Liffey B

Additional authors:
Maia Chanrion, Institut Curie-CNRS UMR 144, France
David Cohen, Institut Curie – U900 INSERM - Mines ParisTech, France
Emmanuel Barillot, Institut Curie – U900 INSERM - Mines ParisTech, France
Daniel Louvard, Institut Curie-CNRS UMR 144, France
Sylvie Robine, Institut Curie-CNRS UMR 144, France
Andrei Zinovyev, Institut Curie – U900 INSERM - Mines ParisTech, France

Area Session Chair: Hidde de Jong

Presentation Overview:
Epithelial-to-mesenchymal transition (EMT) initiates metastases in cancer; however, the key players in the process are still debated. We constructed a comprehensive map of the EMT signaling network and performed a structural analysis that highlighted the network's organization principles and reduced its complexity down to core regulatory routes. Using the reduced network, we compared combinations of single and double mutants for achieving the EMT phenotype, predicted that a combination of p53 knock-out and overexpression of Notch would induce EMT, and suggested the molecular mechanism. This prediction led to the generation of a colon cancer mouse model with metastases in distant organs. We confirmed in invasive human colon cancer samples that EMT markers are associated with modulation of Notch and p53 gene expression in a similar manner as in the mouse model, supporting a synergy between these genes to permit EMT induction. The computational and experimental approaches led to the discovery of a new metastasis mechanism in colon cancer.

Presenting author: Ho-Ryun Chung, Max-Planck-Institut F. Molekulare Genetik, Germany
Date: Monday, July 13 2:00 pm - 2:20 pm
Room: The Liffey B

Additional authors:
Juliane Perner, Max Planck Institute for Molecular Genetics, Germany
Julia Lasserre, Max Planck Institute for Molecular Genetics, Germany
Sarah Kinkley, Max Planck Institute for Molecular Genetics, Germany
Martin Vingron, Max Planck Institute for Molecular Genetics, Germany
Ho-Ryun Chung, Max Planck Institute for Molecular Genetics, Germany

Area Session Chair: Russell Schwartz

Presentation Overview:
Chromatin modifiers and histone modifications form chromatin-signaling networks that regulate and drive transcription. In many cases, interactions between chromatin modifiers and histone modifications have only been studied in vitro or are based on the analysis of a few genes. Due to the biased nature of these experimental approaches and the dynamic complexity of chromatin-signaling networks, many interactions remain undisclosed. To recover novel interactions between chromatin modifiers and histone modifications, we applied computational methods to genome-wide ChIP-Seq data. The identified chromatin-signaling network recovered several previously described interactions and revealed as yet unknown interactions. We experimentally verified two of these interactions, linking H4K20me1 with members of the Polycomb Repressive Complexes 1 and 2. These findings demonstrate that our computational method identifies interactions with experimental support and leads to novel biological insights, underlining its power in unraveling the connectivity of highly dynamic chromatin-signaling networks.

Presenting author: Christian Jungreuthmayer, Austrian Centre of Industrial Biotechnology, Austria
Date: Sunday, July 12 10:50 am - 11:10 am
Room: Liffey Hall 2

Additional authors:
Matthias Gerstl, Austrian Centre of Industrial Biotechnology, Austria
Juergen Zanghellini, Austrian Centre of Industrial Biotechnology, Austria

Area Session Chair: Igor Jurisica

Presentation Overview:
In this presentation, we will introduce the theoretical concept behind our novel approach, discuss the main aspects of its numerical implementation, and illustrate its biological relevance. We will then give a brief demonstration of our toolkit, which is open-source software freely available from our website. Our presentation will go beyond published work by showing that the number of relevant pathways can be reduced even further: using a novel method based on linear programming, we show that only small subsets of all pathways can simultaneously carry a thermodynamically feasible flux.
We identify these phenotypically relevant subsets in a medium-scale E. coli model and show that they are characterized by their ability to maximize biomass and ATP production, consistent with evolutionary interpretations of cell behavior.

Presenting author: Gustavo Stolovitzky, IBM Research / Mt Sinai Hospital, United States
Date: Monday, July 13 2:20 pm - 2:40 pm
Room: Liffey Hall 2

Additional authors:
Andrea Califano, Columbia University, United States
James Costello, University of Colorado, United States
Mukesh Bansal, Columbia University, United States

Area Session Chair: Knut Reinert

Presentation Overview:
Recent therapeutic successes have renewed interest in drug combinations, but experimental screening approaches are costly and often identify only small numbers of synergistic combinations. The DREAM consortium launched an open challenge to foster the development of in silico methods to rank 91 compound pairs, from the most synergistic to the most antagonistic, based on gene-expression profiles of human B cells treated with individual compounds at multiple time points and concentrations. Using scoring metrics based on experimental dose-response curves, we assessed 32 methods, four of which performed significantly better than random guessing. We highlight similarities between the methods. Although the accuracy of predictions was not optimal, we find that computational prediction of compound-pair activity is possible, and that community challenges can be useful to advance the field of in silico compound-synergy prediction.

Presenting author: Aaron Wong, Princeton University, United States
Date: Sunday, July 12 12:20 pm - 12:40 pm
Room: Liffey Hall 2

Additional authors:
Arjun Krishnan, Princeton University, United States
Casey Greene, Dartmouth, United States
Emanuela Ricciotti, University of Pennsylvania, United States
Rene Zelaya, Dartmouth, United States
Daniel Himmelstein, University of California, San Francisco, United States
Ran Zhang, Princeton University, United States
Boris Hartmann, Icahn School of Medicine at Mount Sinai, United States
Elena Zaslavsky, Icahn School of Medicine at Mount Sinai, United States
Stuart Sealfon, Icahn School of Medicine at Mount Sinai, United States
Daniel Chasman, Brigham and Women's Hospital and Harvard Medical School, United States
Garret FitzGerald, University of Pennsylvania, Perelman School of Medicine, United States
Kara Dolinski, Princeton University, United States
Tilo Grosser, University of Pennsylvania, United States
Olga Troyanskaya, Princeton University, United States

Area Session Chair: Igor Jurisica

Presentation Overview:
Tissue and cell-type identity lie at the core of human physiology and disease. Understanding the genetic underpinnings of complex tissues and individual cell lineages is crucial for developing improved diagnostics and therapeutics. We present genome-wide functional interaction networks for 144 human tissues and cell types developed using a data-driven Bayesian methodology that integrates thousands of diverse experiments spanning tissue and disease states. Tissue-specific networks predict lineage-specific responses to perturbation and reveal genes’ changing functional roles across tissues. We introduce NetWAS, which combines genes with nominally significant GWAS p-values and tissue-specific networks to identify disease-gene associations more accurately than GWAS alone. Our webserver, GIANT (http://giant.princeton.edu), provides an interface to human tissue networks through multi-gene queries, network visualization, analysis tools including NetWAS, and downloadable networks. GIANT enables systematic exploration of the landscape of interacting genes across more than one hundred human tissues and cell types.