Highlights Track Presentation Schedule

Highlights Track: HL01
Sunday, July 17: 10:45 a.m. - 11:10 a.m.
Analysis and design of RNA sequencing experiments for identifying isoform regulation
Room: Hall A
Presenting author: Yarden Katz, Massachusetts Institute of Technology, United States
Additional authors:
Eric Wang, Harvard/MIT, United States
Edoardo Airoldi, Harvard, United States
Area Session Chair: Janet Kelso
Presentation Overview: Through alternative splicing, most human genes express multiple isoforms that often differ in function. To infer isoform regulation from high-throughput sequencing of cDNA fragments (RNA-seq), we developed the mixture-of-isoforms (MISO) model, a statistical model that estimates expression of alternatively spliced exons and isoforms and assesses confidence in these estimates. Incorporation of mRNA fragment length distribution in paired-end RNA-seq greatly improved estimation of alternative-splicing levels. MISO also detects differentially regulated exons or isoforms. Application of MISO implicated the RNA splicing factor hnRNP H1 in the regulation of alternative cleavage and polyadenylation, a role that was supported by UV cross-linking-immunoprecipitation sequencing (CLIP-seq) analysis in human cells. Our results provide a probabilistic framework for RNA-seq analysis, give functional insights into pre-mRNA processing and yield guidelines for the optimal design of RNA-seq experiments for studies of gene and isoform expression.

Highlights Track: HL02
Sunday, July 17: 11:15 a.m. - 11:40 a.m.
The Central Human Proteome
Room: Hall A
Presenting author: Jacques Colinge, Ce-M-M- Center for Molecular Medicine of the Austrian Academy of Science, Austria
Additional authors:
Keiryn Bennett, Ce-M-M- Center for Molecular Medicine of the Austrian Academy of Science, Austria
Giulio Superti-Furga, Ce-M-M- Center for Molecular Medicine of the Austrian Academy of Science, Austria
Area Session Chair: Janet Kelso
Presentation Overview: We have obtained a first unbiased estimation of the repertoire of proteins commonly expressed by human cells through proteomics analysis of several cell lines. The bioinformatics analysis of this central human proteome (CHP) shows it has several features that confer on it an augmented flexibility to adapt to multiple environments (more exons, more interactions, etc.). We shall discuss these results, extend them, and relate them to findings by other authors to show that the CHP is not a static machine but participates in specialized tasks and can be “recruited” by diseases on top of its fundamental housekeeping tasks. Considering the central human interactome spanned by the CHP, we shall show that it has global properties that synchronize translation with other biological processes and that its topology supports a global presence facilitating interactions with any specialized process.

Highlights Track: HL03
Sunday, July 17: 11:45 a.m. - 12:10 p.m.
From revealing new insights into Human Tissue Development to Minimum Curvilinearity
Room: Hall A
Presenting author: Carlo Cannistraci, King Abdullah University for Science and Technology (KAUST), Saudi Arabia
Additional authors:
Timothy Ravasi, King Abdullah University for Science and Technology (KAUST), Saudi Arabia
Area Session Chair: Janet Kelso
Presentation Overview: We will focus on the data-mining exploration of 32 human tissues determined by 1321 transcription factor (TF) expressions.
Integrating the expressions with the physical TF interactions and performing machine learning (ML) analysis, we selected 6 expression-weighted-interactions (a homeobox-sub-network) as the best discriminating features, which unfolded the presence of the three developmental tissue germ-layer-classes (ectoderm, mesoderm, endoderm) with 82% accuracy. Then, we will reveal how, starting only from the expressions, it was possible to provide a bi-dimensional data visualization that, evaluated by clustering, offered 84% accuracy. This was achieved by means of two unsupervised and parameter-free MLs: minimum-curvilinear-embedding for nonlinear-dimension-reduction, and minimum-curvilinear-affinity-propagation for non-spherical-clustering. We will conclude with our two recent results: the presence of the germ-layers-classes is conserved in the exploration of several other human and mouse gene-expression-datasets; and a novel unsupervised ML using just the expressions is able to identify a discriminative homeobox-sub-network that extends the one previously proposed.

Highlights Track: HL04
Sunday, July 17: 12:15 p.m. - 12:40 p.m.
Five-Vertebrate ChIP-seq Reveals the Evolutionary Dynamics of Transcription Factor Binding
Room: Hall A
Presenting author: Benoit Ballester, European Bioinformatics Institute (EMBL-EBI), United Kingdom
Additional authors:
Petra Schwalie, EBI-EMBL, United Kingdom
Paul Flicek, EBI-EMBL, United Kingdom
Dominic Schmidt, CRI, United Kingdom
Michael Wilson, CRI, United Kingdom
Duncan Odom, CRI, United Kingdom
Claudia Kutter, CRI, United Kingdom
Stephen Watt, CRI, United Kingdom
Aileen Marshall, CRI, Cambridge Hepatobiliary Service, United Kingdom
Celia Martinez-Jimenez, Biomedical Sciences Research Center Alexander Fleming, Greece
Iannis Talianidis, Biomedical Sciences Research Center Alexander Fleming, Greece
Sarah Mackay, CRI, United Kingdom
Area Session Chair: Janet Kelso
Presentation Overview: Transcription factors (TFs) direct gene expression by binding to DNA regulatory regions. To explore the evolution of gene regulation, we used chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) to determine experimentally the genome-wide occupancy of two TFs, CCAAT/enhancer-binding protein alpha and hepatocyte nuclear factor 4 alpha, in the livers of five vertebrates. Although each TF displays highly conserved DNA binding preferences, most binding is species-specific, and aligned binding events present in all five species are rare. Regions near genes with expression levels that are dependent on a TF are often bound by the TF in multiple species yet show no enhanced DNA sequence constraint. Binding divergence between species can be largely explained by sequence changes to the bound motifs. Among the binding events lost in one lineage, only half are recovered by another binding event within 10 kilobases. Our results reveal large interspecies differences in transcriptional regulation and provide insight into regulatory evolution.

Highlights Track: HL05
Sunday, July 17: 2:30 p.m. - 2:55 p.m.
Protein Complexes are Central in the Yeast Genetic Landscape
Room: Hall A
Presenting author: Magali Michaut, University of Toronto, Canada
Additional authors:
Anastasia Baryshnikova, University of Toronto, Canada
Michael Costanzo, University of Toronto, Canada
Chad L Myers, University of Minnesota, United States
Brenda J Andrews, University of Toronto, Canada
Charles Boone, University of Toronto, Canada
Gary D Bader, University of Toronto, Canada
Area Session Chair: Burkhard Rost
Presentation Overview: Genetic interactions indicate functional dependencies between genes and are a powerful tool to predict gene function. Functionally related genes tend to have similar profiles of genetic interactions. Recently, global-scale mapping of quantitative (positive and negative) genetic interactions has been performed. These data clearly show groups of genes connected by predominantly positive or negative interactions, termed monochromatic groups. These groups often correspond to functional modules, such as biological processes or protein complexes, or connections between modules, but it is not yet known how these patterns globally relate to known functional modules. Here we systematically evaluate the monochromatic nature of known biological processes and their connections in the yeast Saccharomyces cerevisiae. We find that 10% of biological processes and less than 1% of inter-process connections are monochromatic. Further, we show that protein complexes are responsible for a surprisingly large fraction of these monochromatic groups.

Highlights Track: HL06
Sunday, July 17: 3:00 p.m. - 3:25 p.m.
Comparative Genomics Reveals Birth and Death of Fragile Regions in Mammalian Evolution
Room: Hall A
Presenting author: Max Alekseyev, University of South Carolina, United States
Additional authors:
Pavel Pevzner, University of California, San Diego, United States
Area Session Chair: Burkhard Rost
Presentation Overview: An important question in genome evolution is whether there exist fragile regions (rearrangement hotspots) where chromosomal rearrangements happen over and over again. The existence of fragile regions in mammalian genomes is postulated by the Fragile Breakage Model (FBM), proposed in 2003 as a replacement for the then widely accepted Random Breakage Model (RBM). While the rebuttal of RBM initially caused a controversy, nearly all recent studies support FBM. However, the most comprehensive phylogenomic analysis of mammals (Ma et al., 2006. Genome Res. 16: 1557–1565) revealed only a few fragile regions shared between different lineages. Our study (2010, Genome Biology 11(11): R117) provided a refinement of FBM, reconciling it with the observed features of mammalian evolution. The newly proposed Turnover Fragile Breakage Model (TFBM) postulates that fragile regions are subject to a "birth and death" process, implying that fragility has a limited evolutionary lifespan. TFBM further implies that fragile regions migrate to different locations in different mammals, explaining why only a few fragile regions are shared between different lineages. The "birth and death" of fragile regions reinforces the recently proposed hypothesis that rearrangements are promoted by matching segmental duplications (Zhao and Bourque, 2009, Genome Res. 19: 934-942) and suggests putative locations of the currently active fragile regions in the human genome.

Highlights Track: HL07
Sunday, July 17: 3:30 p.m. - 3:55 p.m.
Survival of the Friendly - the Importance of Protein-Protein Interactions in the Evolution of Bacterial Genomes
Room: Hall A
Presenting author: Yanay Ofran, Bar Ilan University, Israel
Additional authors:
Uri Gophna, Tel Aviv University, Israel
Area Session Chair: Burkhard Rost
Presentation Overview: Bacterial genomes include many genes (up to 30% of the genome, by some accounts) that were not inherited vertically from ancestors, but were acquired laterally from the environment. This phenomenon of individual genes being incorporated into an existing genome poses an evolutionary and biological puzzle. Biological functions are not implemented by single genes but by complex and tightly regulated networks of interactions between multiple species of molecules. How could one element that was ripped out of such a complex machine become functional on its own? Moreover, how can it be incorporated into an already well-controlled module in the new host? The widely accepted answer to these questions is the decade-old “complexity hypothesis”, which postulates that lateral gene transfer (LGT) occurs mostly in genes with low complexity, that is, in genes that act alone and don't have many interactions with other elements in the genome. Hundreds of follow-up studies reiterated this hypothesis. In our study, however, we introduce evidence that the opposite is true: genes with many interactions are actually more likely to be transferred than genes that have only a few interactions in their pre-transfer host. We show that proteins with more interactions have more interaction sites on their surfaces. Their sticky, or “friendly”, surface makes them more likely to establish new functional interactions after the transfer. These results underline the importance of interactions in the design of bacterial genomes throughout evolution. They may provide useful principles for the attempt to design novel modules and genomes.

Highlights Track: HL08
Sunday, July 17: 4:00 p.m. - 4:25 p.m.
Universal epitope prediction for class II MHC
Room: Hall A
Presenting author: Andrew Bordner, Mayo Clinic, United States
Additional authors:
Hans Mittelmann, Arizona State University, United States
Area Session Chair: Burkhard Rost
Presentation Overview: Predicting peptide-class II MHC binding affinities is a challenging problem due to MHC diversity and multiple binding modes but has many biomedical applications. We recently developed a structure-based approach using peptide docking and machine learning to predict peptide-MHC binding affinities. Unlike popular sequence-based methods, it is applicable to any MHC type because it relies on universal physical interactions rather than limited experimental data for specific MHC types. Using a model trained only on DRB1*0101 binding data we were able to accurately predict peptide binding affinities for all human class II MHC loci (HLA-DP, DQ, and DR) and for two murine MHC types. This provides the first demonstration that a single prediction model can be applied to diverse MHC types with completely different binding specificities. In addition, we will review our RTA sequence-based prediction method, which outperformed more complicated competing methods, and discuss recent work.

Highlights Track: HL09
Monday, July 18: 10:45 a.m. - 11:10 a.m.
Topological network alignment uncovers biological function and phylogeny
Room: Hall E1
Presenting author: Natasa Przulj, Imperial College London, United Kingdom
Additional authors:
Tijana Milenkovic, University of Notre Dame, United States
Area Session Chair: Erik Bongcam-Rudloff
Presentation Overview: There are thousands of genes in the human genome. However, genes are just a means to an end: they produce different protein types that interact in complex networked ways and make our cells work. Thus, network connectivity provides additional biological insight, over and above sequences of individual proteins.
Hence, analogous to tools for aligning genetic sequences that have revolutionized biological understanding, network alignment tools are likely to have a similar groundbreaking impact. We introduce a topology-based network alignment algorithm that exposes surprisingly large regions of network similarity even in distant species. Substantial improvements are achieved when additional data sources (including sequence) are integrated with topology: surprisingly, 77.7% of yeast proteins participate in a connected subnetwork that is fully contained in the human network, suggesting broad similarities in cellular wiring across all life on Earth. Furthermore, we show that topology is a successful predictor of new cancer genes in melanogenesis-related pathways.

Highlights Track: HL10
Monday, July 18: 10:45 a.m. - 11:10 a.m.
Systematic planning of genome-scale experiments in poorly studied species.
Room: Hall A
Presenting author: Casey Greene, Princeton University, United States
Area Session Chair: Nir Ben-Tal
Presentation Overview: The planning of genome-scale experiments in poorly studied species is in general based on the intuition of experts or heuristic trials. We propose that computational and systematic approaches can be applied to drive the experiment planning process in poorly studied species based on available data and knowledge in closely related model organisms. To this end, we use the data-rich functional genomics compendium of the model organism to quantify the accuracy of each dataset in predicting each specific biological process and the overlap in such coverage between different datasets. Our approach uses an optimized combination of these quantifications to recommend an ordered list of experiments for accurately annotating most proteins in the poorly studied related organisms to most biological processes, as well as a set of experiments that target each specific biological process.
This experiment-planning framework could readily be adapted to the design of other types of large-scale experiments.

Highlights Track: HL11
Monday, July 18: 11:15 a.m. - 11:40 a.m.
A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing.
Room: Hall E1
Presenting author: Erez Levanon, Bar-Ilan University, Israel
Additional authors:
Alexander Wait Zaranek, Harvard Medical School, United States
Tomer Zecharia, Compugen LTD, Israel
Tom Clegg, Scalable Computing Experts, United States
George Church, Harvard Medical School, United States
Area Session Chair: Erik Bongcam-Rudloff
Presentation Overview: Most biomedical, genomic research begins with the painstaking assembly of a “reference genome” for the organism of interest. Implicit in this process is an assumption that genomic information is constant throughout an organism. There are enzymes, however, that can change, or “edit,” genomic information so that variations from the reference can exist within a single organism. In this work, we analyze the raw data used to assemble the reference genomes of ten organisms to discover evidence for editing. We found candidates for DNA and RNA editing as well as a sequencing error that has become incorporated into commonly used genomic resources. Our analysis demonstrates the utility of raw genomic data for the discovery of some editing events and sets the stage for further analysis as sequencing costs continue to decrease exponentially.

Highlights Track: HL12
Monday, July 18: 11:15 a.m. - 11:40 a.m.
Pi Release From Myosin: A Simulation Analysis of Possible Pathways
Room: Hall A
Presenting author: Marco Cecchini, University of Strasbourg, France
Additional authors:
Martin Karplus, Harvard University, United States
Yuri Alexeev, Institute of Food Research, United Kingdom
Area Session Chair: Nir Ben-Tal
Presentation Overview: The release of phosphate (Pi) is an important element in actomyosin function that has been shown to be accelerated by the binding of myosin to actin. To provide information about the structural elements important for Pi release, possible escape pathways from various isolated myosin II structures have been determined by molecular dynamics simulations designed for studying such slow processes. The residues forming the pathways were identified and their role evaluated by mutant simulations. Pi release is slow in the pre-powerstroke structure, an important element in preventing the powerstroke prior to actin binding, and is much more rapid for Pi modeled into the post-rigor and rigor-like structures. The backdoor route suggested by Yount et al. is dominant in the pre-powerstroke and post-rigor states, while a different path is most important in the rigor-like state. This finding suggests a novel mechanism for the actin-activated acceleration of Pi release.

Highlights Track: HL13
Monday, July 18: 11:45 a.m. - 12:10 p.m.
Initial steps towards a production platform for DNA sequence analysis on the grid
Room: Hall E1
Presenting author: Barbera Van Schaik, Academic Medical Center, Netherlands
Additional authors:
Angela Luyf, Academic Medical Center, Netherlands
Michel de Vries, Academic Medical Center, Netherlands
Frank Baas, Academic Medical Center, Netherlands
Antoine van Kampen, Academic Medical Center, Netherlands
Silvia Olabarriaga, Academic Medical Center, Netherlands
Area Session Chair: Erik Bongcam-Rudloff
Presentation Overview: Next generation sequencing confronts bioinformaticians with new challenges regarding data storage and analysis.
A switch to other IT infrastructures is therefore needed. Grid and workflow technology can help to handle the data more efficiently and facilitate collaborations. In this study we reused a platform that was developed for the analysis of medical images. Data transfer, workflow execution and monitoring are operated from one interface. We developed workflows for two sequence alignment tools for which the analysis time was significantly reduced. All workflows are available for the members of two Dutch virtual organizations and all components are open source. The availability of in-house expertise and tools facilitates the usage of grids by new users. Our first results indicate that this is a practical, powerful and scalable solution. We currently use this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/

Highlights Track: HL14
Monday, July 18: 11:45 a.m. - 12:10 p.m.
The imprint of codons on protein structure
Room: Hall A
Presenting author: Charlotte Deane, Oxford University, United Kingdom
Area Session Chair: Nir Ben-Tal
Presentation Overview: The central dogma of molecular biology describes the unidirectional flow of interpretable data from genetic sequence to protein sequence. This has led to the idea that a protein’s structure is dependent only on its amino acid sequence. Analysing the input (mRNA) and output (protein) of translation, we find that local protein structure information is encoded in the mRNA nucleotide sequence. Using a detailed mapping between over 4000 solved protein structures and their mRNA we have carried out a comprehensive analysis of codon usage across many organisms. We found no evidence that domain boundaries are enriched with slow codons. In fact, genes seemingly avoid slow codons around structurally defined domain boundaries. Translation speed, however, does decrease at the transition into secondary structure.
These results support the premise that codons encode more information than merely amino acids and give insight into the role of translation in protein folding.

Highlights Track: HL15
Monday, July 18: 12:15 p.m. - 12:40 p.m.
SlideSort: Fast and exact algorithm for Next Generation Sequencing data analysis
Room: Hall E1
Presenting author: Kana Shimizu, National Institute of Advanced Industrial Science and Technology, Japan
Additional authors:
Koji Tsuda, National Institute of Advanced Industrial Science and Technology, Japan
Area Session Chair: Erik Bongcam-Rudloff
Presentation Overview: Next Generation Sequencing (NGS) technology calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of data. In this study, we designed and implemented SlideSort, an exact algorithm that finds all similar pairs whose edit distance does not exceed a given threshold in NGS data. This helps many important analyses, such as de novo genome assembly, identification of frequently appearing sequence patterns and accurate clustering. Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on a single k-mer, our method is more effective in reducing the number of edit-distance calculations. In comparison to state-of-the-art methods, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single link clustering, which is useful in summarizing NGS data for further processing.

Highlights Track: HL16
Monday, July 18: 12:15 p.m. - 12:40 p.m.
Predicting genetic modifier loci using functional gene networks
Room: Hall A
Presenting author: Insuk Lee, Yonsei University, Korea, Rep
Additional authors:
Ben Lehner, EMBL-CRG Systems Biology Research Unit, Spain
Tanya Vavouri, EMBL-CRG Systems Biology Research Unit, Spain
Junha Shin, Yonsei University, Korea, Rep
Andrew Fraser, University of Toronto, Canada
Edward Marcotte, University of Texas at Austin, United States
Area Session Chair: Nir Ben-Tal
Presentation Overview: Most phenotypes are genetically complex with contributions from mutations in many different genes. Mutations in more than one gene can combine synergistically to cause phenotypic change, and systematic studies in model organisms show that these genetic interactions are pervasive. However, in human association studies such non-additive genetic interactions are very difficult to identify because of a lack of statistical power: simply put, the number of potential interactions is too vast. One approach to resolve this is to predict candidate modifier interactions between loci, and then to specifically test these for associations with the phenotype. Here we describe a general method for predicting genetic interactions based on the use of integrated functional gene networks. We show that in both S. cerevisiae and C. elegans a single high coverage, high quality functional network can successfully predict genetic modifiers for the majority of genes. We demonstrate how it is possible to rapidly expand the number of modifier loci known for a gene, predicting and validating new genetic interactions for each of three signal transduction genes. We propose that this approach, termed network-guided modifier screening, provides a general strategy for predicting genetic interactions. This work thus suggests that a high quality integrated human gene network will provide a powerful resource for modifier locus discovery in many different diseases.

Highlights Track: HL17
Monday, July 18: 2:30 p.m. - 2:55 p.m.
Genome3D: a viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome
Room: Hall E1
Presenting author: W Zheng, Medical University of South Carolina, United States
Additional authors:
Thomas Asbury, Sequenta, Inc, United States
Matt Mitman, Maxgaming Technologies, Inc, United States
Jijun Tang, University of South Carolina, United States
Area Session Chair: Ivo Hofacker
Presentation Overview: We have created the first model-view framework of a eukaryotic genome, Genome3D, to enable integration and visualization of genomic and epigenomic data in a three-dimensional space. Our model of the physical genome implicitly contains all levels of structure and hierarchy, and provides our underlying platform for integrating multi-scale structural and genomic information within three dimensions. The viewer is designed to display data from multiple scales and uses a hierarchical model of the relative positions of all nucleotide atoms in the cell nucleus, i.e., the physical genome. Genome3D is not intended to replace the UCSC genome browser but rather works with it, complementing its functionality by visualizing structural and epigenomic information in 3D space. Genome3D can significantly advance genome research in inferring epigenomic knowledge, studying long-range inter- and intra-chromosome interactions, and analyzing structural features of genetic variations, and will have a profound impact on genome information integration and analysis.

Highlights Track: HL18
Monday, July 18: 2:30 p.m. - 2:55 p.m.
The Impact of Multifunctional Genes on "Guilt by Association" Analysis
Room: Hall A
Presenting author: Jesse Gillis, University of British Columbia, Canada
Additional authors:
Paul Pavlidis, University of British Columbia, Canada
Area Session Chair: Michal Linial
Presentation Overview: Many previous studies have shown that by using variants of “guilt-by-association”, gene function predictions can be made with high statistical confidence. In these studies, it is assumed that the “associations” in the data (e.g., protein interactions) of a gene are necessary in establishing “guilt”. Here we show that gene multifunctionality, rather than association, is a primary driver of gene function prediction. We first show that knowledge of the degree of multifunctionality alone can produce astonishingly strong performance when used as a predictor of gene function. We then demonstrate how multifunctionality is encoded in gene interaction data and feeds forward into function prediction. We find that high-quality gene function predictions can be made using data that possesses no information on which gene interacts with which. We suggest that this bias due to multifunctionality is important to control for, with widespread implications for the interpretation of genomics studies.

Highlights Track: HL19
Monday, July 18: 3:00 p.m. - 3:25 p.m.
Structure determination of genomes and genomic domains by satisfaction of spatial restraints.
Room: Hall E1
Presenting author: Marc A. Marti-Renom, Prince Felipe Research Center, Spain
Additional authors:
Davide Bau, Prince Felipe Research Center, Spain
Area Session Chair: Ivo Hofacker
Presentation Overview: The three-dimensional (3D) organization of the genome plays important yet poorly understood roles in gene regulation. Chromosomes assume multiple distinct conformations in relation to the expression status of resident genes and undergo dramatic alterations in higher order structure through the cell cycle.
Despite advances in microscopy, a general technique to determine the 3D conformation of chromatin has been lacking. We developed a new method for the determination of the 3D conformation of chromatin domains in the interphase nucleus, which combines 5C experiments with the computational Integrative Modeling Platform (IMP). The general approach of our method, which has been applied to study the 3D conformation of the α-globin domain in the human genome [1] and the Caulobacter crescentus whole genome, opens the field for comprehensive studies of the 3D conformation of chromosomal domains and contributes to a more complete characterization of genome regulation.
[1] D. Baù et al. Nat Struct Mol Biol 18 (2011) 107.

Highlights Track: HL20
Monday, July 18: 3:00 p.m. - 3:25 p.m.
Bringing order to protein disorder through comparative genomics and genetic interactions
Room: Hall A
Presenting author: Philip Kim, University of Toronto, Canada
Additional authors:
Jeremy Bellay, University of Minnesota, United States
Sangjo Han, University of Toronto, Canada
Magali Michaut, University of Toronto, Canada
Taehyung Kim, University of Toronto, Canada
Michael Costanzo, University of Toronto, Canada
Charles Boone, University of Toronto, Canada
Gary Bader, University of Toronto, Canada
Chad Myers, University of Minnesota, United States
Area Session Chair: Michal Linial
Presentation Overview: Intrinsically disordered regions are widespread, especially in proteomes of higher eukaryotes, and have been associated with a plethora of different cellular functions. Here, we attempt to better understand the different roles of disorder using a novel analysis that leverages both comparative genomics and genetic interactions.
Strikingly, we find that disorder can be partitioned into three biologically distinct phenomena: regions where disorder is conserved but with quickly evolving amino acid sequences (“flexible disorder”), regions of conserved disorder that also have highly conserved amino acid sequences (“constrained disorder”) and, lastly, non-conserved disorder. Flexible disorder is closest to canonical protein disorder and is associated with signaling pathways and multi-functionality. Conversely, constrained disorder has markedly different functional attributes and is involved in RNA binding and protein chaperones. Finally, non-conserved disorder appears largely non-functional. These distinctions both provide an informative division of disorder functionality and imply common underlying mechanisms that support these functions.

Highlights Track: HL21
Monday, July 18: 3:30 p.m. - 3:55 p.m.
A Unifying Theory for GC3 Biology in Plants and Animals
Room: Hall E1
Presenting author: Tatiana Tatarinova, University of Glamorgan, United Kingdom
Additional authors:
Nickolai Alexandrov, Ceres, United States
John Bouck, Ceres, United States
Kenneth Feldmann, University of Arizona, United States
Area Session Chair: Ivo Hofacker
Presentation Overview: There is a well-documented bias for cytosine and guanine at the third position in a subset of transcripts within a single organism; it is present in some plant species and warm-blooded vertebrates. We demonstrated that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess certain transcription factor binding sites, (4) are predominant in certain classes of genes and (5) have a GC3 content that increases from 5' to 3'.
These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses and later extend it to other species. High levels of GC3 typify a class of genes regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion.

Highlights Track: HL22
Monday, July 18: 3:30 p.m. - 3:55 p.m.
Model-Based Learning for All SCOP Families
Room: Hall A
Presenting author: Stefan Kramer, TU Muenchen, Germany
Additional authors: Tobias Hamp, TU Muenchen, Germany; Fabian Buchwald, TU Muenchen, Germany; Fabian Birzele, Roche Deutschland GmbH, Germany
Area Session Chair: Michal Linial
Presentation Overview: As the automated annotation of genomic and proteomic data is becoming increasingly important, big community efforts aim at computationally predicting their manual classification, as found for example in SCOP. These methods fall into two categories: instance-based (e.g. alignments of a target against templates, followed by the assignment of the class of the best template to the target) and model-based (e.g. a neural network) methods. In this context, model-based algorithms have supposedly been unfit for a full-scale application due to the large presence of extremely small classes. Only integration with instance-based methods could enable their universal use. In the talk, we show that it is, for SCOP, effectively impossible to find an integration that is guaranteed to outperform the instance-based-only counterpart. Further, we show that model-based-only classifiers can be applied to arbitrary class sizes and exhibit the best accuracy reported so far for predicting the SCOP family of a protein.

Highlights Track: HL23
Monday, July 18: 4:00 p.m. - 4:25 p.m.
Strengths and limitations of the federal guidance on synthetic DNA
Room: Hall E1
Presenting author: Jean Peccoud, Virginia Tech, United States
Additional authors: Laura Adam, Virginia Tech, United States; Michael Kozar, Virginia Tech, United States; Gaelle Letort, Virginia Tech, United States; Olivier Mirat, Virginia Tech, United States; Arunima Srivastava, Virginia Tech, United States; Tyler Steward, Virginia Tech, United States; Mandy Wilson, Virginia Tech, United States
Area Session Chair: Ivo Hofacker
Presentation Overview: An implementation of the sequence screening method recommended by the U.S. Government to prevent the misuse of gene synthesis highlights improvements over the protocols proposed by the industry. Since it does not rely on a database of curated sequences, its deployment is fast and inexpensive. Without resulting in an unacceptable computational cost, breaking sequences into 200 bp fragments translated in six frames precludes the hiding of sequences of concern within longer, benign sequences. A standardized dictionary of keywords used to interpret alignment results and a realistic suite of annotated test sequences are still needed to assess the performance of the screening software implementations. Beyond its biosecurity application, this screening algorithm can be used to enforce other policies and regulations affecting the biotechnology industry. It is also likely to find a variety of other applications, such as partitioning sequencing reads by species in metagenomic samples, forensics, or clinical diagnostics.

Highlights Track: HL24
Monday, July 18: 4:00 p.m. - 4:25 p.m.
Next-generation genome alignment with LAST
Room: Hall A
Presenting author: Paul Horton, AIST, Computational Biology Research Center, Japan
Additional authors: Martin Frith, AIST, Japan; Raymond Wan, The University of Tokyo, Japan; Kengo Sato, The University of Tokyo, Japan; Szymon Kielbasa, Max Planck Institute for Molecular Genetics, Germany
Area Session Chair: Michal Linial
Presentation Overview: We present LAST, an open-source software package to replace BLAST. BLAST and related sequence similarity tools are arguably the most successful of all bioinformatics applications. However, they are not fully adequate for important tasks such as mammalian genome-genome alignment and tera-scale mapping of sequence reads. While BLAST searches are based on fixed-length exact-match "seeds", LAST employs the concept of adaptive-length seeds. We show that adaptive seeds are robust to highly repetitive (e.g. mammalian) and biased-composition (e.g. malaria) genomes. LAST also introduces improved methods for xeno-mapping, e.g. of mammoth reads to an elephant genome. For the task of genome vs genome alignment, LAST is often 10-100 times faster than BLAST for similar levels of sensitivity. In fact, LAST is the first method that can sensitively compare giga-scale, repeat-rich sequences: all previous methods either have low sensitivity (e.g. DNA read mappers) or must heavily suppress repeats (e.g. BLASTZ).

Highlights Track: HL25
Tuesday, July 19: 10:45 a.m. - 11:10 a.m.
A scalable approach for discovering conserved active subnetworks across species
Room: Hall E1
Presenting author: Raamesh Deshpande, University of Minnesota-Twin Cities, United States
Additional authors: Shikha Sharma, University of Minnesota-Twin Cities, United States; Wei-Shou Hu, University of Minnesota-Twin Cities, United States; Catherine Verfaillie, Catholic University Leuven, Belgium; Chad Myers, University of Minnesota-Twin Cities, United States
Area Session Chair: Yanay Ofran
Presentation Overview: Overlaying differential changes in gene expression on protein interaction networks has proven to be a useful approach to interpreting the cell's dynamic response to a changing environment. We have extended this idea to enable the discovery of active subnetworks across species. Specifically, we present a scalable, cross-species network search algorithm, neXus (Network - cross(X)-species - Search), that discovers conserved, active subnetworks based on parallel differential expression studies in multiple species. We applied our approach to identify conserved modules that are differentially active in stem cells relative to differentiated cells based on analogous gene expression studies and functional linkage networks from mouse and human. We find hundreds of conserved subnetworks enriched for stem cell-associated functions such as cell cycle, DNA repair, and chromatin modification processes. Furthermore, we demonstrate that a comparative approach to subnetwork discovery has many statistical advantages over the single-species formulation, which can enable more reliable module discovery.

Highlights Track: HL26
Tuesday, July 19: 10:45 a.m. - 11:10 a.m.
Benchmarking Ontologies: Bigger or Better?
Room: Hall A
Presenting author: Lixia Yao, Columbia University, United States
Additional authors: Andrey Rzhetsky, University of Chicago, United States; Anna Divoli, University of Chicago, United States; Ilya Mayzus, University of Chicago, United States; James Evans, University of Chicago, United States
Area Session Chair: Paul Horton
Presentation Overview: An ontology represents the concepts and their interrelation within a knowledge domain. Many ontologies have been developed in biomedicine, providing standardized vocabularies to describe genes and proteins, anatomical structures, physiological phenotypes or diseases, and many other phenomena. Scientists use them to encode observations and experimental results, and to perform integrative analysis to discover new knowledge. A remaining challenge is to evaluate how well an ontology represents the underlying knowledge domain. We introduce a family of metrics, including breadth and depth, to capture the conceptual and relational coverage and parsimony of an ontology. We test these measures using four commonly used medical ontologies and seven popular English thesauri (ontologies of synonyms) with respect to text from medicine, news and novels. Results demonstrate that both medical ontologies and English thesauri have a small overlap in concepts and relations, and suggest further efforts to tighten the fit between ontologies and the biomedical knowledge domain.

Highlights Track: HL27
Tuesday, July 19: 11:15 a.m. - 11:40 a.m.
Network Modeling Identifies Molecular Functions Targeted by miR-204 to Suppress Head and Neck Tumor Metastasis and Mechanisms of Therapeutic Resistance
Room: Hall E1
Presenting author: Yves Lussier, The University of Chicago, United States
Additional authors: Mark Gerstein, Yale, United States; Rosie H Xing, University of Chicago, United States; Younghee Lee, University of Chicago, United States; Xinan "Holly" Yang, University of Chicago, United States; Yong Huang, University of Chicago, United States; Qingbei Zhang, University of Chicago, United States; Jianrong Li, University of Chicago, United States; Hanli Fan, University of Chicago, United States; Rifat Hasina, University of Chicago, United States; Mark Lingen, University of Chicago, United States; Chao Cheung, Yale University, United States; Ralph Weichselbaum, University of Chicago, United States
Area Session Chair: Yanay Ofran
Presentation Overview: Relevance: Accurately modeling microRNA regulation of oncogenic phenotype via its targets is relevant to a broad audience because of heightened interest in microRNA-directed therapy and the computational innovations that range from microRNA network models to the genetics of acquired oncogenic phenotypes. Previous studies lack depth, since only a few genes are biologically confirmed as microRNA targets in vitro and rarely in vivo. Additionally, key biological systems perturbed by altered microRNA functions in the context of cancer remain to be identified. This paper demonstrates how to bioinformatically integrate genetics knowledge, gene expression, and molecular network properties to uncover previously unknown connections between microRNAs, their regulated genes, and their dynamics for streamlined and comprehensive biological validations.

Highlights Track: HL28
Tuesday, July 19: 11:15 a.m. - 11:40 a.m.
miRGator v2.0 and the construction of miRNA-disease network
Room: Hall A
Presenting author: Wankyu Kim, Ewha Womans University, Korea, Rep
Additional authors: Sooyoung Cho, Ewha Womans University, Korea, Rep; Yukyung Jun, Ewha Womans University, Korea, Rep; Minjeong Ko, Ewha Womans University, Korea, Rep; Sanghyuk Lee, Ewha Womans University, Korea, Rep
Area Session Chair: Paul Horton
Presentation Overview: miRGator is developed as an integrated database of microRNA-associated gene expression, target prediction, disease association and genomic annotation, in order to facilitate functional investigation of miRNAs (http://miRGator.kobic.re.kr). It contains (i) human miRNA expression profiles under various conditions, (ii) paired expression profiles of both mRNAs and miRNAs, (iii) gene expression profiles under miRNA perturbation (e.g. miRNA knockout and overexpression), (iv) known/predicted miRNA targets and (v) miRNA-disease associations. In total, >8000 miRNA expression profiles, ~300 miRNA-perturbed gene expression profiles and ~2000 mRNA expression profiles are compiled with manual annotations on disease, tissue type and perturbation. Additionally, disease signature genes were extracted from ~12,000 gene expression profiles for ~100 human diseases. By integrating these data sets, a series of novel associations between human diseases and miRNAs is extracted by systematically comparing disease and target signature genes from various sources. Our approach correctly predicted known disease-miRNA associations with high accuracy as well as novel associations.

Highlights Track: HL29
Tuesday, July 19: 11:45 a.m. - 12:10 p.m.
Mutation Impact Mining using SADI Semantic Web Services
Room: Hall E1
Presenting author: Christopher Baker, University of New Brunswick, Canada
Additional authors: Alexandre Riazanov, University of New Brunswick, Canada; Jonas Laurila, National Food Administration, Sweden
Area Session Chair: Yanay Ofran
Presentation Overview: We report on a platform for mining and integration of mutation impacts from the literature. Core features of this infrastructure are: a GATE pipeline for extracting impacts of mutations on proteins and populating an OWL-DL mutation impact ontology, a semantic database for storing the results of text mining, and the SADI framework as a medium for publishing mutation impact software and data. Through multiple case studies we demonstrate the utility of SADI (a set of conventions for creating web services with semantic descriptions that facilitate automatic service discovery and workflow orchestration) to facilitate ad-hoc knowledge discovery through a single SPARQL interface (SHARE) to a registry of SADI services. We illustrate integration of mutation impact services with external SADI services providing information about related biological entities, such as proteins, pathways, SNPs and drugs. SADI provides an effective way of exposing our mutation impact data for reuse by a variety of stakeholders.

Highlights Track: HL30
Tuesday, July 19: 11:45 a.m. - 12:10 p.m.
Fast and Efficient Dynamic Nested Effects Models
Room: Hall A
Presenting author: Holger Fröhlich, University of Bonn, Bonn-Aachen International Center for IT, Germany
Additional authors: Paurush Praveen, University of Bonn, Bonn-Aachen International Center for IT, Germany; Achim Tresch, Ludwig-Maximilians-University Muenchen, Gene Center Munich, Germany
Area Session Chair: Paul Horton
Presentation Overview: Reverse engineering of biological networks is key to understanding biological systems.
The exact knowledge of interdependencies between proteins in the living cell is crucial for the identification of drug targets for various diseases. However, due to the complexity of the system, a complete picture with detailed knowledge of the behavior of individual proteins is still out of reach. Nonetheless, the advent of gene perturbation techniques like RNA interference (RNAi) opened new perspectives for network reconstruction by boosting the ability to subject organisms to well-defined interventions. Nested Effects Models (NEMs; Markowetz et al., Bioinformatics, 2005) have been introduced as a statistical approach to estimate the upstream signal flow from the downstream nested subset structure of high-dimensional perturbation effects (measured e.g. on microarrays). The method was substantially extended later on by a number of authors and successfully applied to various datasets (Markowetz et al., Bioinformatics, 2005; Tresch & Markowetz, Stat. Appl. Genome Biol., 2007; Froehlich et al., BMC Bioinformatics 2007; Froehlich et al., Bioinformatics, 2008; Froehlich et al., Biometrical Journal, 2009; Zeller et al., EURASIP J. on Bioinf. and Syst. Biol. 2009; Anchang et al., PNAS, 2009). The connection of NEMs to Bayesian Networks and factor graph models has been highlighted (Zeller et al., EURASIP J. on Bioinf. and Syst. Biol. 2009; Vaske et al., PLOS Comp. Biol., 2009). Here we introduce a computationally attractive extension of NEMs that enables the analysis of perturbation time series data (measured e.g. on microarrays). It thus complements the attempt of Anchang et al. (PNAS, 2009) to extend static NEMs to the modeling of perturbation time series measurements. Most importantly, this allows for the resolution of feedback loops in the signaling cascade, as well as for the discrimination of direct and indirect signaling. In contrast to Anchang et al., the key idea in our model is to unroll the signal flow over time.
This allows for a computation showing some similarity to Dynamic Bayesian Networks and naturally extends the classical NEM formulation. Our model circumvents the need for time-consuming Gibbs sampling, which also makes it computationally attractive. We performed extensive simulations of our model (also compared to a static NEM) to investigate its dependency on the length of time series, the sizes and architectures of the networks to be learned, and on the amount of available data. Our results indicate a very high specificity together with a good sensitivity of our method. The high specificity can be attributed to a special network structure prior favoring sparse networks here, but more generally could also incorporate prior beliefs on specific edges. We applied our model to data investigating self-renewal in murine embryonic stem cell development (Ivanova et al., Nature, 2006). We found a good accordance of our estimated network between 6 key proteins (5 transcription factors) and the biological literature. Moreover, our result generally agrees with the previously published one by Anchang et al., although it is sparser. In summary, we believe that our approach can serve as a useful tool to generate data-driven hypotheses about signaling and/or transcriptional networks based on high-dimensional perturbation effects.

Highlights Track: HL31
Tuesday, July 19: 12:15 p.m. - 12:40 p.m.
The chicken or the egg problem in epidemiology: untangling the mutual impact of transmission network dynamic and epidemic spread.
Room: Hall E1
Presenting author: Christel Kamp, Paul-Ehrlich-Institut, Germany
Area Session Chair: Yanay Ofran
Presentation Overview: Network models are well established in computational biology as they acknowledge that system dynamics is rarely determined by components alone but by interactions between components.
In epidemiology, not individual behaviour but rather the structure and dynamics of the transmission network between hosts strongly influence the time scale and patterns of epidemics. Within a mathematical framework we further show that the reverse is also true and quantify the impact that epidemics have on the way contacts are made among healthy and infected individuals. This information is relevant for epidemic control, as will be shown in a case study on HIV epidemics: the clustering and mixing of infected individuals depend both on the transmission network and on the stage of the epidemic, as do the contributions of primarily and latently infected individuals to epidemic spread. In conclusion, the flexibility of the framework will be highlighted in combination with potential applications in epidemiology and beyond.

Highlights Track: HL32
Tuesday, July 19: 12:15 p.m. - 12:40 p.m.
C(alpha)-trace model of the transmembrane domain of human copper transporter 1, motion and functional implications
Room: Hall A
Presenting author: Nir Ben-Tal, Tel-Aviv University, Israel
Additional authors: Yariv Barkan, Tel-Aviv University, Israel; Turkan Haliloglu, Bogazici University, Turkey; Nir Ben-Tal, Tel-Aviv University, Israel
Area Session Chair: Paul Horton
Presentation Overview: The human copper transporter 1 (hCTR1) is essential for copper uptake and is implicated in sensitivity to chemotherapy drugs. Using the hCTR1 cryoelectron microscopy (cryoEM) map and evolutionary data, we constructed a Cα-trace model of the membrane region. Investigating the model's global dynamics through elastic network models, hCTR1’s MxxxM and GxxxG motifs were shown to have significant roles in the two slowest modes of motion. For example, in one of these modes the glycine residues of the GxxxG motif appeared to serve as hinge points and the copper-binding methionine residues of the MxxxM motif manifested cooperative rotational motion, possibly reflecting activation at the pore’s extracellular entrance.
We suggest a molecular mechanism of copper transport in which this motif serves both as a gate and as a selectivity filter. We also suggest residues that are responsible for pH activation.

Highlights Track: HL33
Tuesday, July 19: 2:30 p.m. - 2:55 p.m.
Organization of mammalian genomes with respect to the nuclear lamina
Room: Hall E1
Presenting author: Wouter Meuleman, Netherlands Cancer Institute / Delft University of Technology, Netherlands
Additional authors: Daan Peric-Hupkes, Netherlands Cancer Institute, Netherlands; Marcel Reinders, Delft University of Technology, Netherlands; Lodewyk Wessels, Netherlands Cancer Institute, Netherlands; Bas van Steensel, Netherlands Cancer Institute, Netherlands
Area Session Chair: Terry Gaasterland
Presentation Overview: The three-dimensional organization of chromosomes within the nucleus is largely unknown. We present high-resolution maps of the interaction of human and mouse genomes with the nuclear lamina, providing detailed views of the spatial organization of interphase chromosomes. These maps reveal substantial refolding of chromosomes during differentiation, involving hundreds of genes that collectively determine cellular identity. We illustrate the involvement of lamina-genome interactions in the control of gene expression programs. We find that a substantial portion of the spatial chromosome organization is identical across all assayed cell types. This is even the case between species, despite extensive chromosomal rearrangements in mouse and human. Using the genomic sequence alone, we can accurately predict these lamina-genome interactions in both mouse and human. This indicates that the organization of the genome is, to a large extent, hard-coded in the primary sequence. Based on this, we conclude with a mechanistic explanation of lamina-genome interactions.

Highlights Track: HL34
Tuesday, July 19: 2:30 p.m. - 2:55 p.m.
Multi-species integrative biclustering
Room: Hall A
Presenting author: Peter Waltman, New York University, United States
Area Session Chair: Donna Slonim
Presentation Overview: A key challenge in the analysis of functional genomics data is the identification of modules of genes with similar regulatory controls, a non-trivial problem due to the complexity of regulatory networks. Recent works that compare functional genomics datasets for closely related species reveal that many co-regulated gene groups are conserved across several species. This suggests that comparative analysis of multiple-species functional genomics datasets could prove powerful in accurately identifying those conserved modules. We describe an extension of our cMonkey algorithm that allows for the simultaneous biclustering of heterogeneous data collections spanning multiple species. We present results from the multi-species biclustering of a group of Gram-positive bacteria containing Bacillus subtilis, Bacillus anthracis, and Listeria monocytogenes. We identify conserved groups of orthologous genes, yielding evolutionary insights into the formation and surprisingly high degree of conservation of regulatory modules across these three species. We report a temporal difference between the two Bacillus species in the expression of a conserved bicluster of metabolic genes required for spore formation. In addition, we discuss the unexpected identification of a highly expressed flagellum assembly bicluster in non-motile B. anthracis. Analysis of biclusters obtained revealed a large number of gene groups with conserved modularity and high biological significance as judged by several measures of cluster quality. We show that the method provides a framework that allows data and insights from well-studied organisms to complement the analysis of related but less well-studied organisms.

Highlights Track: HL35
Tuesday, July 19: 3:00 p.m. - 3:25 p.m.
Model-based method for transcription factor target identification with limited data
Room: Hall E1
Presenting author: Magnus Rattray, University of Sheffield, United Kingdom
Additional authors: Neil Lawrence, University of Sheffield, United Kingdom; Antti Honkela, University of Helsinki, Finland; Eileen Furlong, EMBL Heidelberg, Germany; Charles Girardot, EMBL Heidelberg, Germany; Hilary Gustafson, EMBL Heidelberg, Germany; Ya-Hsin Liu, National Cheng Kung University, Taiwan; Jonatan Ropponen, Aalto University, Finland; Pei Gao, University of Cambridge, United Kingdom
Area Session Chair: Terry Gaasterland
Presentation Overview: A fundamental problem in systems biology is uncovering the structure and dynamics of gene regulatory networks. A first step is to identify the targets of regulatory molecules such as transcription factors (TFs). We introduce a computational method for inferring targets of transcription factors from gene expression time-series data. Our approach works by fitting a simple model of transcriptional regulation to gene expression data for each putative target gene. The model is used to identify targets of transcription factors of interest. We apply the method to identify targets of TFs regulating Drosophila mesoderm development. Using only wild-type microarray time course data we make predictions that are validated with high significance using ChIP-chip data from the same system. Our model-based approach gives better target predictions than knock-out data for the same TFs and we show how spatial expression data can improve the accuracy of predictions. The method is available as a Bioconductor package allowing easy parallel implementation.

Highlights Track: HL36
Tuesday, July 19: 3:00 p.m. - 3:25 p.m.
Inference of Signaling Network Architecture and Dynamics from High-throughput Combinatorial RNAi Screens
Room: Hall A
Presenting author: Chris Bakal, The Institute of Cancer Research, United Kingdom
Additional authors: Oaz Nir, Massachusetts Institute of Technology, United States; Norbert Perrimon, Harvard Medical School, United States; Bonnie Berger, Massachusetts Institute of Technology, United States
Area Session Chair: Donna Slonim
Presentation Overview: Biological signaling networks are highly complex systems that act to regulate fundamental cellular processes such as growth, proliferation, differentiation and migration. Because the actions of these networks underpin both health and disease, describing their architecture and dynamics is a fundamental challenge in biology. High-content image-based RNAi screens provide a wealth of data regarding the contribution of single genes to different phenotypes, but mapping cellular networks from this data is a daunting challenge. Here we describe a systematic computational framework based on a classification model for network inference using high-dimensional single-cell morphological data. We applied this method to a dataset composed of images generated in the course of a double RNAi screen and generated a network of RhoGTPases and RhoGTPase Activating Proteins, which are essential for cell shape, migration and motility in all eukaryotic cells. Furthermore, we have recently extended these methods to model signaling network dynamics.

Highlights Track: HL37
Tuesday, July 19: 3:30 p.m. - 3:55 p.m.
Quantitative analysis of the Drosophila segmentation regulatory network using pattern generating potentials
Room: Hall E1
Presenting author: Saurabh Sinha, University of Illinois Urbana-Champaign, United States
Additional authors: Saurabh Sinha, University of Illinois Urbana-Champaign, United States; Majid Kazemian, University of Illinois Urbana-Champaign, United States; Charles Blatti, University of Illinois Urbana-Champaign, United States; Adam Richards, University of Massachusetts Medical School, United States; Michael McCutchan, Arizona State University, United States; Noriko Wakabayashi-Ito, University of Massachusetts Medical School, United States; Ann Hammonds, Lawrence Berkeley National Laboratory, United States; Susan Celniker, Lawrence Berkeley National Laboratory, United States; Sudhir Kumar, Arizona State University, United States; Scot Wolfe, University of Massachusetts Medical School, United States; Michael Brodsky, University of Massachusetts Medical School, United States
Area Session Chair: Terry Gaasterland
Presentation Overview: The developmental program specifying segmentation along the anterior-posterior axis of the Drosophila embryo is encoded in DNA segments called cis-regulatory “modules”. Previous work has identified some of these modules along with their related transcription factors. We present a novel computational framework that turns a qualitative and fragmented understanding of modules and factor-module interactions into a quantitative, systems-level view. Our model utilizes experimentally characterized binding specificities of transcription factors and gene expression patterns to describe how multiple transcription factors act together in a module to determine its regulatory activity. This logistic regression-based model can explain the expression patterns of known modules, infer factor-module interactions and identify novel modules of the regulatory network by quantifying their potential to drive a gene's expression.
As databases of binding motifs and gene expression patterns grow, this new approach provides a general method to decode transcriptional regulatory sequences and networks.