ISMB/ECCB 2011 Highlights

19th Annual International Conference on
Intelligent Systems for Molecular Biology and
10th European Conference on Computational Biology

Highlights Track Presentation Schedule

Highlights Track: HL01 Sunday, July 17: 10:45 a.m. - 11:10 a.m.

Analysis and design of RNA sequencing experiments for identifying isoform regulation
Room: Hall A
Presenting author: Yarden Katz, Massachusetts Institute of Technology, United States

Additional authors:
Eric Wang, Harvard/MIT, United States
Edoardo Airoldi, Harvard, United States

Area Session Chair: Janet Kelso

Presentation Overview:
Through alternative splicing, most human genes express multiple isoforms that often differ in function. To infer isoform regulation from high-throughput sequencing of cDNA fragments (RNA-seq), we developed the mixture-of-isoforms (MISO) model, a statistical model that estimates expression of alternatively spliced exons and isoforms and assesses confidence in these estimates. Incorporation of mRNA fragment length distribution in paired-end RNA-seq greatly improved estimation of alternative-splicing levels. MISO also detects differentially regulated exons or isoforms. Application of MISO implicated the RNA splicing factor hnRNP H1 in the regulation of alternative cleavage and polyadenylation, a role that was supported by UV cross-linking-immunoprecipitation sequencing (CLIP-seq) analysis in human cells. Our results provide a probabilistic framework for RNA-seq analysis, give functional insights into pre-mRNA processing and yield guidelines for the optimal design of RNA-seq experiments for studies of gene and isoform expression.

Highlights Track: HL02 Sunday, July 17: 11:15 a.m. - 11:40 a.m.

The Central Human Proteome
Room: Hall A
Presenting author: Jacques Colinge, Ce-M-M- Center for Molecular Medicine of the Austrian Academy of Science, Austria

Additional authors:
Keiryn Bennett, Ce-M-M- Center for Molecular Medicine of the Austrian Academy of Science, Austria
Giulio Superti-Furga, Ce-M-M- Center for Molecular Medicine of the Austrian Academy of Science, Austria

Area Session Chair: Janet Kelso

Presentation Overview:
We have obtained a first unbiased estimate of the repertoire of proteins commonly expressed by human cells through proteomics analysis of several cell lines. The bioinformatics analysis of this central human proteome (CHP) shows that it has several features that give it greater flexibility to adapt to multiple environments (more exons, more interactions, etc.). We shall discuss these results, extend them, and relate them to findings by other authors to show that the CHP is not a static machine: it participates in specialized tasks and can be “recruited” by diseases on top of its fundamental housekeeping tasks. Considering the central human interactome spanned by the CHP, we shall show that it has global properties that synchronize translation with other biological processes and that its topology supports a global presence, facilitating interactions with any specialized process.

Highlights Track: HL03 Sunday, July 17: 11:45 a.m. - 12:10 p.m.

From revealing new insights into Human Tissue Development to Minimum Curvilinearity
Room: Hall A
Presenting author: Carlo Cannistraci, King Abdullah University for Science and Technology (KAUST), Saudi Arabia

Additional authors:
Timothy Ravasi, King Abdullah University for Science and Technology (KAUST), Saudi Arabia

Area Session Chair: Janet Kelso

Presentation Overview:
We will focus on the data-mining exploration of 32 human tissues characterized by the expression of 1321 transcription factors (TFs). Integrating the expression data with the physical TF interactions and performing machine learning (ML) analysis, we selected 6 expression-weighted interactions (a homeobox sub-network) as the best discriminating features; these revealed the presence of the three developmental germ-layer tissue classes (ectoderm, mesoderm, endoderm) with 82% accuracy. Then, we will show how, starting only from the expression data, it was possible to provide a two-dimensional data visualization that, evaluated by clustering, offered 84% accuracy. This was achieved by means of two unsupervised and parameter-free ML methods: minimum-curvilinear-embedding for nonlinear dimension reduction, and minimum-curvilinear-affinity-propagation for non-spherical clustering.
We will conclude with two recent results: the presence of the germ-layer classes is conserved in several other human and mouse gene-expression datasets; and a novel unsupervised ML method using just the expression data is able to identify a discriminative homeobox sub-network that extends the one previously proposed.

Highlights Track: HL04 Sunday, July 17: 12:15 p.m. - 12:40 p.m.

Five-Vertebrate ChIP-seq Reveals the Evolutionary Dynamics of Transcription Factor Binding
Room: Hall A
Presenting author: Benoit Ballester, European Bioinformatics Institute (EMBL-EBI), United Kingdom

Additional authors:
Petra Schwalie, EBI-EMBL, United Kingdom
Paul Flicek, EBI-EMBL, United Kingdom
Dominic Schmidt, CRI, United Kingdom
Michael Wilson, CRI, United Kingdom
Duncan Odom, CRI, United Kingdom
Claudia Kutter, CRI, United Kingdom
Stephen Watt, CRI, United Kingdom
Aileen Marshall, CRI, Cambridge Hepatobiliary Service, United Kingdom
Celia Martinez-Jimenez, Biomedical Sciences Research Center Alexander Fleming, Greece
Iannis Talianidis, Biomedical Sciences Research Center Alexander Fleming, Greece
Sarah Mackay, CRI, United Kingdom

Area Session Chair: Janet Kelso

Presentation Overview:
Transcription factors (TFs) direct gene expression by binding to DNA regulatory regions. To explore the evolution of gene regulation, we used chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) to determine experimentally the genome-wide occupancy of two TFs, CCAAT/enhancer-binding protein alpha and hepatocyte nuclear factor 4 alpha, in the livers of five vertebrates. Although each TF displays highly conserved DNA binding preferences, most binding is species-specific, and aligned binding events present in all five species are rare. Regions near genes with expression levels that are dependent on a TF are often bound by the TF in multiple species yet show no enhanced DNA sequence constraint. Binding divergence between species can be largely explained by sequence changes to the bound motifs. Among the binding events lost in one lineage, only half are recovered by another binding event within 10 kilobases. Our results reveal large interspecies differences in transcriptional regulation and provide insight into regulatory evolution.

Highlights Track: HL05 Sunday, July 17: 2:30 p.m. - 2:55 p.m.

Protein Complexes are Central in the Yeast Genetic Landscape
Room: Hall A
Presenting author: Magali Michaut, University of Toronto, Canada

Additional authors:
Anastasia Baryshnikova, University of Toronto, Canada
Michael Costanzo, University of Toronto, Canada
Chad L Myers, University of Minnesota, United States
Brenda J Andrews, University of Toronto, Canada
Charles Boone, University of Toronto, Canada
Gary D Bader, University of Toronto, Canada

Area Session Chair: Burkhard Rost

Presentation Overview:
Genetic interactions indicate functional dependencies between genes and are a powerful tool to predict gene function. Functionally related genes tend to have similar profiles of genetic interactions. Recently, global-scale mapping of quantitative (positive and negative) genetic interactions has been performed. These data clearly show groups of genes connected by predominantly positive or negative interactions, termed monochromatic groups. These groups often correspond to functional modules, such as biological processes or protein complexes, or connections between modules, but it is not yet known how these patterns globally relate to known functional modules. Here we systematically evaluate the monochromatic nature of known biological processes and their connections in the yeast Saccharomyces cerevisiae. We find that 10% of biological processes and less than 1% of inter-process connections are monochromatic. Further, we show that protein complexes are responsible for a surprisingly large fraction of these monochromatic groups.

Highlights Track: HL06 Sunday, July 17: 3:00 p.m. - 3:25 p.m.

Comparative Genomics Reveals Birth and Death of Fragile Regions in Mammalian Evolution
Room: Hall A
Presenting author: Max Alekseyev, University of South Carolina, United States

Additional authors:
Pavel Pevzner, University of California, San Diego, United States

Area Session Chair: Burkhard Rost

Presentation Overview:
An important question in genome evolution is whether there exist fragile regions (rearrangement hotspots) where chromosomal rearrangements happen over and over again. The existence of fragile regions in mammalian genomes is postulated by the Fragile Breakage Model (FBM), proposed in 2003 as a replacement for the then widely accepted Random Breakage Model (RBM). While the rebuttal of RBM initially caused controversy, nearly all recent studies support FBM. However, the most comprehensive phylogenomic analysis of mammals (Ma et al., 2006. Genome Res. 16: 1557–1565) revealed only a few fragile regions shared between different lineages.

Our study (2010, Genome Biology 11(11): R117) provided a refinement of FBM, reconciling it with the observed features of mammalian evolution. The newly proposed Turnover Fragile Breakage Model (TFBM) postulates that fragile regions are subject to a "birth and death" process, implying that fragility has a limited evolutionary lifespan. TFBM further implies that fragile regions migrate to different locations in different mammals, explaining why only a few fragile regions are shared between different lineages.

The "birth and death" of fragile regions reinforces the recently proposed hypothesis that rearrangements are promoted by matching segmental duplications (Zhao and Bourque, 2009, Genome Res. 19: 934-942) and suggests putative locations of the currently active fragile regions in the human genome.

Highlights Track: HL07 Sunday, July 17: 3:30 p.m. - 3:55 p.m.

Survival of the Friendly - the Importance of Protein-Protein Interactions in the Evolution of Bacterial Genomes
Room: Hall A
Presenting author: Yanay Ofran, Bar Ilan University, Israel

Additional authors:
Uri Gophna, Tel Aviv University, Israel

Area Session Chair: Burkhard Rost

Presentation Overview:
Bacterial genomes include many genes (up to 30% of the genome, by some accounts) that were not inherited vertically from ancestors, but were acquired laterally from the environment. This phenomenon of individual genes being incorporated into an existing genome poses an evolutionary and biological puzzle. Biological functions are not implemented by single genes but by complex and tightly regulated networks of interactions between multiple species of molecules. How could one element that was ripped out of such a complex machine become functional on its own? Moreover, how can it be incorporated into an already well-controlled module in the new host? The widely accepted answer to these questions is the decade-old “complexity hypothesis”, which postulates that lateral gene transfer (LGT) occurs mostly in genes with low complexity, that is, in genes that act alone and don't have many interactions with other elements in the genome. Hundreds of follow-up studies reiterated this hypothesis. In our study, however, we introduce evidence that the opposite is true: genes with many interactions are actually more likely to be transferred than genes that have only a few interactions in their pre-transfer host. We show that proteins with more interactions have more interaction sites on their surfaces. Their sticky, or “friendly”, surface makes them more likely to establish new functional interactions after the transfer. These results underline the importance of interactions in the design of bacterial genomes throughout evolution. They may provide useful principles for the attempt to design novel modules and genomes.

Highlights Track: HL08 Sunday, July 17: 4:00 p.m. - 4:25 p.m.

Universal epitope prediction for class II MHC
Room: Hall A
Presenting author: Andrew Bordner, Mayo Clinic, United States

Additional authors:
Hans Mittelmann, Arizona State University, United States

Area Session Chair: Burkhard Rost

Presentation Overview:
Predicting peptide-class II MHC binding affinities is a challenging problem due to MHC diversity and multiple binding modes but has many biomedical applications. We recently developed a structure-based approach using peptide docking and machine learning to predict peptide-MHC binding affinities. Unlike popular sequence-based methods, it is applicable to any MHC type because it relies on universal physical interactions rather than limited experimental data for specific MHC types. Using a model trained only on DRB1*0101 binding data we were able to accurately predict peptide binding affinities for all human class II MHC loci (HLA-DP, DQ, and DR) and for two murine MHC types. This provides the first demonstration that a single prediction model can be applied to diverse MHC types with completely different binding specificities. In addition, we will review our RTA sequence-based prediction method, which outperformed more complicated competing methods, and discuss recent work.

Highlights Track: HL09 Monday, July 18: 10:45 a.m. - 11:10 a.m.

Topological network alignment uncovers biological function and phylogeny
Room: Hall E1
Presenting author: Natasa Przulj, Imperial College London, United Kingdom

Additional authors:
Tijana Milenkovic, University of Notre Dame, United States

Area Session Chair: Erik Bongcam-Rudloff

Presentation Overview:
There are thousands of genes in the human genome. However, genes are just a means to an end: they produce different protein types that interact in complex networked ways and make our cells work. Thus, network connectivity provides additional biological insight, over and above sequences of individual proteins. Hence, analogous to tools for aligning genetic sequences that have revolutionized biological understanding, network alignment tools are likely to have a similar groundbreaking impact. We introduce a topology-based network alignment algorithm that exposes surprisingly large regions of network similarity even in distant species. Substantial improvements are achieved when additional data sources (including sequence) are integrated with topology: surprisingly, 77.7% of yeast proteins participate in a connected subnetwork that is fully contained in the human network, suggesting broad similarities in cellular wiring across all life on Earth. Furthermore, we show that topology is a successful predictor of new cancer genes in melanogenesis-related pathways.

Highlights Track: HL10 Monday, July 18: 10:45 a.m. - 11:10 a.m.

Systematic planning of genome-scale experiments in poorly studied species.
Room: Hall A
Presenting author: Casey Greene, Princeton University, United States

Area Session Chair: Nir Ben-Tal

Presentation Overview:
The planning of genome-scale experiments in poorly studied species is in general based on the intuition of experts or heuristic trials. We propose that computational and systematic approaches can be applied to drive the experiment planning process in poorly studied species based on available data and knowledge in closely related model organisms. To this end, we use the data-rich functional genomics compendium of the model organism to quantify the accuracy of each dataset in predicting each specific biological process and the overlap in such coverage between different datasets. Our approach uses an optimized combination of these quantifications to recommend an ordered list of experiments for accurately annotating most proteins in the poorly studied related organisms to most biological processes, as well as a set of experiments that target each specific biological process. This experiment-planning framework could readily be adapted to the design of other types of large-scale experiments.

Highlights Track: HL11 Monday, July 18: 11:15 a.m. - 11:40 a.m.

A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing.
Room: Hall E1
Presenting author: Erez Levanon, Bar-Ilan University, Israel

Additional authors:
Alexander Wait Zaranek, Harvard Medical School, United States
Tomer Zecharia, Compugen LTD, Israel
Tom Clegg, Scalable Computing Experts, United States
George Church, Harvard Medical School, United States

Area Session Chair: Erik Bongcam-Rudloff

Presentation Overview:
Most biomedical, genomic research begins with the painstaking assembly of a “reference genome” for the organism of interest. Implicit in this process is an assumption that genomic information is constant throughout an organism. There are enzymes, however, that can change, or “edit,” genomic information so that variations from the reference can exist within a single organism. In this work, we analyze the raw data used to assemble the reference genomes of ten organisms to discover evidence for editing. We found candidates for DNA and RNA editing as well as a sequencing error that has become incorporated into commonly used genomic resources. Our analysis demonstrates the utility of raw genomic data for the discovery of some editing events and sets the stage for further analysis as sequencing costs continue to decrease exponentially.

Highlights Track: HL12 Monday, July 18: 11:15 a.m. - 11:40 a.m.

Pi Release From Myosin: A Simulation Analysis of Possible Pathways
Room: Hall A
Presenting author: Marco Cecchini, University of Strasbourg, France

Additional authors:
Martin Karplus, Harvard University, United States
Yuri Alexeev, Institute of Food Research, United Kingdom

Area Session Chair: Nir Ben-Tal

Presentation Overview:
The release of phosphate (Pi) is an important element in actomyosin function that has been shown to be accelerated by the binding of myosin to actin. To provide information about the structural elements important for Pi release, possible escape pathways from various isolated myosin II structures have been determined by molecular dynamics simulations designed for studying such slow processes. The residues forming the pathways were identified and their role evaluated by mutant simulations. Pi release is slow in the pre-powerstroke structure, an important element in preventing the powerstroke prior to actin binding, and is much more rapid for Pi modeled into the post-rigor and rigor-like structures. The backdoor route suggested by Yount et al. is dominant in the pre-powerstroke and post-rigor states, while a different path is most important in the rigor-like state. This finding suggests a novel mechanism for the actin-activated acceleration of Pi release.

Highlights Track: HL13 Monday, July 18: 11:45 a.m. - 12:10 p.m.

Initial steps towards a production platform for DNA sequence analysis on the grid
Room: Hall E1
Presenting author: Barbera Van Schaik, Academic Medical Center, Netherlands

Additional authors:
Angela Luyf, Academic Medical Center, Netherlands
Michel de Vries, Academic Medical Center, Netherlands
Frank Baas, Academic Medical Center, Netherlands
Antoine van Kampen, Academic Medical Center, Netherlands
Silvia Olabarriaga, Academic Medical Center, Netherlands

Area Session Chair: Erik Bongcam-Rudloff

Presentation Overview:
Next generation sequencing confronts bioinformaticians with new challenges regarding data storage and analysis, so a switch to other IT infrastructures is needed. Grid and workflow technology can help to handle the data more efficiently and facilitate collaborations.

In this study we reused a platform that was developed for the analysis of medical images. Data transfer, workflow execution and monitoring are operated from one interface. We developed workflows for two sequence alignment tools for which the analysis time was significantly reduced. All workflows are available for the members of two Dutch virtual organizations and all components are open source.

The availability of in-house expertise and tools facilitates the usage of grids by new users. Our first results indicate that this is a practical, powerful and scalable solution. We now use this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available at http://www.bioinformaticslaboratory.nl/

Highlights Track: HL14 Monday, July 18: 11:45 a.m. - 12:10 p.m.

The imprint of codons on protein structure
Room: Hall A
Presenting author: Charlotte Deane, Oxford University, United Kingdom

Area Session Chair: Nir Ben-Tal

Presentation Overview:
The central dogma of molecular biology describes the unidirectional flow of interpretable data from genetic sequence to protein sequence. This has led to the idea that a protein’s structure is dependent only on its amino acid sequence. Analysing the input (mRNA) and output (protein) of translation, we find that local protein structure information is encoded in the mRNA nucleotide sequence. Using a detailed mapping between over 4000 solved protein structures and their mRNA we have carried out a comprehensive analysis of codon usage across many organisms. We found no evidence that domain boundaries are enriched with slow codons. In fact, genes seemingly avoid slow codons around structurally defined domain boundaries. Translation speed, however, does decrease at the transition into secondary structure. These results support the premise that codons encode more information than merely amino acids and give insight into the role of translation in protein folding.

Highlights Track: HL15 Monday, July 18: 12:15 p.m. - 12:40 p.m.

SlideSort: Fast and exact algorithm for Next Generation Sequencing data analysis
Room: Hall E1
Presenting author: Kana Shimizu, National Institute of Advanced Industrial Science and Technology, Japan

Additional authors:
Koji Tsuda, National Institute of Advanced Industrial Science and Technology, Japan

Area Session Chair: Erik Bongcam-Rudloff

Presentation Overview:
Next Generation Sequencing (NGS) technology calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of data. In this study, we designed and implemented an exact algorithm, SlideSort, that finds all similar pairs whose edit distance does not exceed a given threshold in NGS data. This supports many important analyses, such as de novo genome assembly, identification of frequently appearing sequence patterns and accurate clustering.
Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on a single k-mer, our method is more effective in reducing the number of edit-distance calculations. In comparison to state-of-the-art methods, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single-link clustering, which is useful in summarizing NGS data for further processing.
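For orientation, the problem that SlideSort addresses (reporting every pair of reads within a given edit distance) can be sketched as a brute-force baseline. This is not the SlideSort algorithm: it performs exactly the all-pairs edit-distance calculations that the k-mer chaining is meant to avoid. The function names `edit_distance` and `similar_pairs` are hypothetical and do not come from the SlideSort software.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                       # deletion from a
                curr[j - 1] + 1,                   # insertion into a
                prev[j - 1] + (char_a != char_b),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def similar_pairs(reads, max_distance):
    """Return index pairs (i, j), i < j, whose edit distance is <= max_distance.

    Naive O(n^2) baseline for illustration only; it does not scale to
    millions of reads.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(reads), 2):
        # A length difference larger than the threshold already rules a pair out.
        if abs(len(a) - len(b)) > max_distance:
            continue
        if edit_distance(a, b) <= max_distance:
            pairs.append((i, j))
    return pairs
```

Calling `similar_pairs(["ACGT", "ACGA", "TTTT"], 1)` reports only the first two reads as a similar pair.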

Highlights Track: HL16 Monday, July 18: 12:15 p.m. - 12:40 p.m.

Predicting genetic modifier loci using functional gene networks
Room: Hall A
Presenting author: Insuk Lee, Yonsei University, Korea, Rep

Additional authors:
Ben Lehner, EMBL-CRG Systems Biology Research Unit, Spain
Tanya Vavouri, EMBL-CRG Systems Biology Research Unit, Spain
Junha Shin, Yonsei University, Korea, Rep
Andrew Fraser, University of Toronto, Canada
Edward Marcotte, University of Texas at Austin, United States

Area Session Chair: Nir Ben-Tal

Presentation Overview:
Most phenotypes are genetically complex with contributions from mutations in many different genes. Mutations in more than one gene can combine synergistically to cause phenotypic change and systematic studies in model organisms show that these genetic interactions are pervasive. However, in human association studies such non-additive genetic interactions are very difficult to identify because of a lack of statistical power — simply put, the number of potential interactions is too vast. One approach to resolve this is to predict candidate modifier interactions between loci, and then to specifically test these for associations with the phenotype. Here we describe a general method for predicting genetic interactions based on the use of integrated functional gene networks. We show that in both S. cerevisiae and C. elegans a single high coverage, high quality functional network can successfully predict genetic modifiers for the majority of genes. We demonstrate how it is possible to rapidly expand the number of modifier loci known for a gene, predicting and validating new genetic interactions for each of three signal transduction genes. We propose that this approach, termed network-guided modifier screening, provides a general strategy for predicting genetic interactions. This work thus suggests that a high quality integrated human gene network will provide a powerful resource for modifier locus discovery in many different diseases.

Highlights Track: HL17 Monday, July 18: 2:30 p.m. - 2:55 p.m.

Genome3D: a viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome
Room: Hall E1
Presenting author: W Zheng, Medical University of South Carolina, United States

Additional authors:
Thomas Asbury, Sequenta, Inc, United States
Matt Mitman, Maxgaming Technologies, Inc, United States
Jijun Tang, University of South Carolina, United States

Area Session Chair: Ivo Hofacker

Presentation Overview:
We have created the first model-view framework for the eukaryotic genome, Genome3D, to enable integration and visualization of genomic and epigenomic data in a three-dimensional space. Our model of the physical genome implicitly contains all levels of structure and hierarchy, and provides our underlying platform for integrating multi-scale structural and genomic information within three dimensions. The viewer is designed to display data from multiple scales and uses a hierarchical model of the relative positions of all nucleotide atoms in the cell nucleus, i.e., the physical genome. Genome3D is not intended to replace the UCSC Genome Browser but rather works with it, complementing its functionality by visualizing structural and epigenomic information in 3D space. Genome3D can significantly advance genome research in inferring epigenomic knowledge, studying long-range inter- and intra-chromosome interactions, and analyzing structural features of genetic variations, and will have a profound impact on genome information integration and analysis.

Highlights Track: HL18 Monday, July 18: 2:30 p.m. - 2:55 p.m.

The Impact of Multifunctional Genes on "Guilt by Association" Analysis
Room: Hall A
Presenting author: Jesse Gillis, University of British Columbia, Canada

Additional authors:
Paul Pavlidis, University of British Columbia, Canada

Area Session Chair: Michal Linial

Presentation Overview:
Many previous studies have shown that by using variants of “guilt-by-association”, gene function predictions can be made with high statistical confidence. In these studies, it is assumed that the “associations” in the data (e.g., protein interactions) of a gene are necessary in establishing “guilt”. Here we show that gene multifunctionality, rather than association, is a primary driver of gene function prediction. We first show that knowledge of the degree of multifunctionality alone can produce astonishingly strong performance when used as a predictor of gene function. We then demonstrate how multifunctionality is encoded in gene interaction data and feeds forward into function prediction. We find that high-quality gene function predictions can be made using data that possesses no information on which gene interacts with which. We suggest that this bias due to multifunctionality is important to control for, with widespread implications for the interpretation of genomics studies.

Highlights Track: HL19 Monday, July 18: 3:00 p.m. - 3:25 p.m.

Structure determination of genomes and genomic domains by satisfaction of spatial restraints.
Room: Hall E1
Presenting author: Marc A. Marti-Renom, Prince Felipe Research Center, Spain

Additional authors:
Davide Bau, Prince Felipe Research Center, Spain

Area Session Chair: Ivo Hofacker

Presentation Overview:
The three-dimensional (3D) organization of the genome plays important, yet poorly understood, roles in gene regulation. Chromosomes assume multiple distinct conformations in relation to the expression status of resident genes and undergo dramatic alterations in higher order structure through the cell cycle. Despite advances in microscopy, a general technique to determine the 3D conformation of chromatin has been lacking. We developed a new method for the determination of the 3D conformation of chromatin domains in the interphase nucleus, which combines 5C experiments with the computational Integrative Modeling Platform (IMP). The general approach of our method, which has been applied to study the 3D conformation of the α-globin domain in the human genome [1] and the Caulobacter crescentus whole genome, opens the field for comprehensive studies of the 3D conformation of chromosomal domains and contributes to a more complete characterization of genome regulation.

[1] D. Baù et al. Nat Struct Mol Biol 18 (2011) 107.

Highlights Track: HL20 Monday, July 18: 3:00 p.m. - 3:25 p.m.

Bringing order to protein disorder through comparative genomics and genetic interactions
Room: Hall A
Presenting author: Philip Kim, University of Toronto, Canada

Additional authors:
Jeremy Bellay, University of Minnesota, United States
Sangjo Han, University of Toronto, Canada
Magali Michaut, University of Toronto, Canada
Taehyung Kim, University of Toronto, Canada
Michael Costanzo, University of Toronto, Canada
Charles Boone, University of Toronto, Canada
Gary Bader, University of Toronto, Canada
Chad Myers, University of Minnesota, United States

Area Session Chair: Michal Linial

Presentation Overview:
Intrinsically disordered regions are widespread, especially in proteomes of higher eukaryotes, and have been associated with a plethora of different cellular functions. Here, we attempt to better understand the different roles of disorder using a novel analysis that leverages both comparative genomics and genetic interactions. Strikingly, we find that disorder can be partitioned into three biologically distinct phenomena: regions where disorder is conserved but with quickly evolving amino acid sequences (“flexible disorder”), regions of conserved disorder that also have highly conserved amino acid sequences (“constrained disorder”) and, lastly, non-conserved disorder. Flexible disorder is closest to canonical protein disorder and is associated with signaling pathways and multi-functionality. Conversely, constrained disorder has markedly different functional attributes and is involved in RNA binding and protein chaperones. Finally, non-conserved disorder appears largely non-functional. These distinctions both provide an informative division of disorder functionality and imply common underlying mechanisms that support these functions.

Highlights Track: HL21 Monday, July 18: 3:30 p.m. - 3:55 p.m.

A Unifying Theory for GC3 Biology in Plants and Animals
Room: Hall E1
Presenting author: Tatiana Tatarinova, University of Glamorgan, United Kingdom

Additional authors:
Nickolai Alexandrov, Ceres, United States
John Bouck, Ceres, United States
Kenneth Feldmann, University of Arizona, United States

Area Session Chair: Ivo Hofacker

Presentation Overview:
There is a well-documented bias for cytosine and guanine at the third codon position in a subset of transcripts within a single organism; it is present in some plant species and warm-blooded vertebrates. We demonstrated that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess certain transcription factor binding sites, (4) are predominant in certain classes of genes and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses and later extend it to other species. High levels of GC3 typify a class of genes regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion.
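As an illustration of the quantity discussed above, GC3 can be computed as the fraction of codons whose third (wobble) base is G or C. The sketch below assumes the input is an in-frame coding sequence starting at the first codon position; the function name `gc3_content` is hypothetical and is not taken from the authors' software.

```python
def gc3_content(cds: str) -> float:
    """Fraction of codons in an in-frame coding sequence whose third base is G or C.

    Trailing bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop an incomplete final codon
    third_bases = cds[2:usable:3]         # bases at positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For example, in `ATGGCCAAA` the codons ATG, GCC and AAA end in G, C and A, so two of the three wobble positions are G or C.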

Highlights Track: HL22 Monday, July 18: 3:30 p.m. - 3:55 p.m.

Model-Based Learning for All SCOP Families
Room: Hall A
Presenting author: Stefan Kramer , TU Muenchen, Germany

Additional authors:
Tobias Hamp, TU Muenchen, Germany
Fabian Buchwald, TU Muenchen, Germany
Fabian Birzele, Roche Deutschland GmbH, Germany

Area Session Chair: Michal Linial

Presentation Overview:
As the automated annotation of genomic and proteomic data is becoming increasingly important, big community efforts aim at computationally predicting their manual classification, as found for example in SCOP. These methods fall into two categories: instance-based (e.g. alignments of a target against templates, followed by the assignment of the class of the best template to the target) and model-based (e.g. a Neural Network) methods. In this context, model-based algorithms have been considered unfit for full-scale application because of the large number of extremely small classes. Only integration with instance-based methods was thought to enable their universal use. In the talk, we show that it is, for SCOP, effectively impossible to find an integration that is guaranteed to outperform the instance-based-only counterpart. Further, we show that model-based-only classifiers can be applied to arbitrary class sizes and achieve the best accuracy reported so far for predicting the SCOP family of a protein.

Highlights Track: HL23 Monday, July 18: 4:00 p.m. - 4:25 p.m.

Strengths and limitations of the federal guidance on synthetic DNA
Room: Hall E1
Presenting author: Jean Peccoud , Virginia Tech, United States

Additional authors:
Laura Adam, Virginia Tech, United States
Michael Kozar, Virginia Tech, United States
Gaelle Letort, Virginia Tech, United States
Olivier Mirat, Virginia Tech, United States
Arunima Srivastava, Virginia Tech, United States
Tyler Steward, Virginia Tech, United States
Mandy Wilson, Virginia Tech, United States

Area Session Chair: Ivo Hofacker

Presentation Overview:
An implementation of the sequence screening method recommended by the U.S. Government to prevent the misuse of gene synthesis highlights improvements over the protocols proposed by the industry. Since it does not rely on a database of curated sequences, its deployment is fast and inexpensive. Without resulting in an unacceptable computational cost, breaking sequences into 200 bp fragments translated in six frames precludes the hiding of sequences of concern within longer, benign sequences. A standardized dictionary of keywords used to interpret alignment results and a realistic suite of annotated test sequences are still needed to assess the performance of screening software implementations. Beyond its biosecurity application, this screening algorithm can be used to enforce other policies and regulations affecting the biotechnology industry. It is also likely to find a variety of other applications, such as partitioning sequencing reads by species in metagenomic samples, forensics, or clinical diagnostics.
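The fragmentation and six-frame translation step mentioned in this abstract can be sketched as follows. This is only an illustration, not the authors' implementation: the function names are hypothetical, non-overlapping windows are an assumption (the abstract gives only the 200 bp fragment size), and the subsequent alignment of the translated fragments against a sequence database is omitted.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def fragments(seq: str, size: int = 200) -> list[str]:
    """Split seq into consecutive, non-overlapping windows (assumption)."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate whole codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame(seq: str) -> list[str]:
    """Return the three forward and three reverse-strand translations."""
    seq = seq.upper()
    return [translate(strand[offset:])
            for strand in (seq, reverse_complement(seq))
            for offset in range(3)]


if __name__ == "__main__":
    order = "ATGGCC" * 75  # hypothetical 450 bp order
    for fragment in fragments(order):
        print(len(fragment), six_frame(fragment)[0][:10])
```

Each translated fragment would then be searched on its own, which is what prevents a short sequence of concern from being masked by a long benign context.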

Highlights Track: HL24 Monday, July 18: 4:00 p.m. - 4:25 p.m.

Next-generation genome alignment with LAST
Room: Hall A
Presenting author: Paul Horton , AIST, Computational Biology Research Center, Japan

Additional authors:
Martin Frith, AIST, Japan
Raymond Wan, The University of Tokyo, Japan
Kengo Sato, The University of Tokyo, Japan
Szymon Kielbasa, Max Planck Institute for Molecular Genetics, Germany

Area Session Chair: Michal Linial

Presentation Overview:
We present LAST, an open-source software package to replace BLAST.

BLAST and related sequence similarity tools are arguably the most successful of all bioinformatics applications. However, they are not fully adequate for important tasks such as mammalian genome-genome alignment and tera-scale mapping of sequence reads.

While BLAST searches are based on fixed-length exact-match "seeds", LAST employs the concept of adaptive-length seeds. We show that adaptive seeds are robust to highly repetitive (e.g. mammalian) and biased-composition (e.g. malaria) genomes. LAST also introduces improved methods for xeno-mapping, e.g. of mammoth reads to an elephant genome.
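The abstract does not define adaptive seeds. One way to picture the idea, assumed here for illustration, is to lengthen an exact match until it becomes rare enough in the target, so that repetitive regions automatically get longer seeds. The sketch below uses a naive scan and hypothetical names (`adaptive_seed`, `max_hits`); it is not LAST's implementation, and a practical tool would use an index rather than rescanning the target.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count possibly overlapping occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def adaptive_seed(query: str, start: int, target: str, max_hits: int = 10):
    """Return the shortest query[start:end] that occurs in target at least
    once and at most max_hits times, or None if no such match exists.

    A fixed-length seed would always return query[start:start + k];
    here the length depends on how repetitive the target is.
    """
    for end in range(start + 1, len(query) + 1):
        candidate = query[start:end]
        hits = count_occurrences(target, candidate)
        if hits == 0:
            return None        # extending further cannot create a match
        if hits <= max_hits:
            return candidate   # rare enough to be a useful seed
    return None


if __name__ == "__main__":
    target = "ACACACGTAC"      # toy sequence with a short AC repeat
    print(adaptive_seed("ACGT", 0, target, max_hits=1))
```

In the toy example the seed has to grow past the repeated "AC" before it becomes unique, which mirrors why such seeds cope with repeat-rich genomes.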

For the task of genome vs genome alignment, LAST is often 10-100 times faster than BLAST for similar levels of sensitivity. In fact, LAST is the first method that can sensitively compare giga-scale, repeat-rich sequences -- all previous methods either have low sensitivity (e.g. DNA read mappers) or must heavily suppress repeats (e.g. BLASTZ).

Highlights Track: HL25 Tuesday, July 19: 10:45 a.m. - 11:10 a.m.

A scalable approach for discovering conserved active subnetworks across species
Room: Hall E1
Presenting author: Raamesh Deshpande , University of Minnesota-Twin Cities, United States

Additional authors:
Shikha Sharma, University of Minnesota-Twin Cities, United States
Wei-Shou Hu, University of Minnesota-Twin Cities, United States
Catherine Verfaillie, Catholic University Leuven, Belgium
Chad Myers, University of Minnesota-Twin Cities, United States

Area Session Chair: Yanay Ofran

Presentation Overview:
Overlaying differential changes in gene expression on protein interaction networks has proven to be a useful approach to interpreting the cell's dynamic response to a changing environment. We have extended this idea to enable the discovery of active subnetworks across species. Specifically, we present a scalable, cross-species network search algorithm, neXus (Network - cross(X)-species - Search), that discovers conserved, active subnetworks based on parallel differential expression studies in multiple species. We applied our approach to identify conserved modules that are differentially active in stem cells relative to differentiated cells based on analogous gene expression studies and functional linkage networks from mouse and human. We find hundreds of conserved subnetworks enriched for stem cell-associated functions such as cell cycle, DNA repair, and chromatin modification processes. Furthermore, we demonstrate that a comparative approach to subnetwork discovery has many statistical advantages over the single-species formulation, which can enable more reliable module discovery.

Highlights Track: HL26 Tuesday, July 19: 10:45 a.m. - 11:10 a.m.

Benchmarking Ontologies: Bigger or Better?
Room: Hall A
Presenting author: Lixia Yao , Columbia University, United States

Additional authors:
Andrey Rzhetsky, University of Chicago, United States
Anna Divoli, University of Chicago, United States
Ilya Mayzus, University of Chicago, United States
James Evans, University of Chicago, United States

Area Session Chair: Paul Horton

Presentation Overview:
An ontology represents the concepts and their interrelation within a knowledge domain. Many ontologies have been developed in biomedicine, providing standardized vocabularies to describe genes and proteins, anatomical structures, physiological phenotypes or diseases, and many other phenomena. Scientists use them to encode observations and experimental results, and to perform integrative analysis to discover new knowledge. A remaining challenge is to evaluate how well an ontology represents the underlying knowledge domain. We introduce a family of metrics, including breadth and depth, to capture the conceptual and relational coverage and parsimony of an ontology. We test these measures using four commonly used medical ontologies and seven popular English thesauri (ontologies of synonyms) with respect to text from medicine, news and novels. The results demonstrate that both medical ontologies and English thesauri have a small overlap in concepts and relations, and suggest that further efforts are needed to tighten the fit between ontologies and the biomedical knowledge domain.

Highlights Track: HL27 Tuesday, July 19: 11:15 a.m. - 11:40 a.m.

Network Modeling Identifies Molecular Functions Targeted by miR-204 to Suppress Head and Neck Tumor Metastasis and Mechanisms of Therapeutic Resistance
Room: Hall E1
Presenting author: Yves Lussier , The University of Chicago, United States

Additional authors:
Mark Gerstein, Yale, United States
Rosie H Xing, University of Chicago, United States
Younghee Lee, University of Chicago, United States
Xinan "Holly" Yang, University of Chicago, United States
Yong Huang, University of Chicago, United States
Qingbei Zhang, University of Chicago, United States
Jianrong Li, University of Chicago, United States
Hanli Fan, University of Chicago, United States
Rifat Hasina, University of Chicago, United States
Mark Lingen, University of Chicago, United States
Chao Cheung, Yale University, United States
Ralph Weichselbaum, University of Chicago, United States

Area Session Chair: Yanay Ofran

Presentation Overview:
Relevance: Accurately modeling microRNA regulation of oncogenic phenotypes via microRNA targets is of interest to a broad audience because of heightened interest in microRNA-directed therapy and because the computational innovations range from microRNA network models to the genetics of acquired oncogenic phenotypes. Previous studies lack depth, since only a few genes have been biologically confirmed as microRNA targets in vitro, and rarely in vivo. Additionally, key biological systems perturbed by altered microRNA functions in the context of cancer remain to be identified. This paper demonstrates how to bioinformatically integrate genetics knowledge, gene expression, and molecular network properties to uncover previously unknown connections between microRNAs, their regulated genes, and their dynamics for streamlined and comprehensive biological validations.

Highlights Track: HL28 Tuesday, July 19: 11:15 a.m. - 11:40 a.m.

miRGator v2.0 and the construction of miRNA-disease network
Room: Hall A
Presenting author: Wankyu Kim , Ewha Womans University, Korea, Rep

Additional authors:
Sooyoung Cho, Ewha Womans University, Korea, Rep
Yukyung Jun, Ewha Womans University, Korea, Rep
Minjeong Ko, Ewha Womans University, Korea, Rep
Sanghyuk Lee, Ewha Womans University, Korea, Rep

Area Session Chair: Paul Horton

Presentation Overview:
miRGator is developed as an integrated database of microRNA-associated gene expression, target prediction, disease association and genomic annotation, in order to facilitate functional investigation of miRNAs (http://miRGator.kobic.re.kr). It contains (i) human miRNA expression profiles under various conditions, (ii) paired expression profiles of both mRNAs and miRNAs, (iii) gene expression profiles under miRNA-perturbation (e.g. miRNA knockout and overexpression), (iv) known/predicted miRNA targets and (v) miRNA-disease associations. In total, >8000 miRNA expression profiles, ~300 miRNA-perturbed gene expression profiles and ~2000 mRNA expression profiles are compiled with manual annotations on disease, tissue type and perturbation. Additionally, disease signature genes were extracted from ~12,000 gene expression profiles for ~100 human diseases. By integrating these data sets, a series of novel associations between human diseases and miRNAs is extracted by systematically comparing disease and target signature genes from various sources. Our approach correctly predicted known disease-miRNA associations with high accuracy as well as novel associations.

Highlights Track: HL29 Tuesday, July 19: 11:45 a.m. - 12:10 p.m.

Mutation Impact Mining using SADI Semantic Web Services
Room: Hall E1
Presenting author: Christopher Baker , University of New Brunswick, Canada

Additional authors:
Alexandre Riazanov, University of New Brunswick, Canada
Jonas Laurila, National Food Administration, Sweden

Area Session Chair: Yanay Ofran

Presentation Overview:
We report on a platform for mining and integration of mutation impacts from the literature. Core features of this infrastructure are a GATE pipeline for extracting impacts of mutations on proteins and populating an OWL-DL mutation impact ontology, a semantic database for storing the results of text mining, and the SADI framework as a medium for publishing mutation impact software and data. Through multiple case studies we demonstrate the utility of SADI (a set of conventions for creating web services with semantic descriptions that facilitate automatic service discovery and workflow orchestration) to facilitate ad-hoc knowledge discovery through a single SPARQL interface (SHARE) to a registry of SADI services. We illustrate integration of mutation impact services with external SADI services providing information about related biological entities, such as proteins, pathways, SNPs and drugs. SADI provides an effective way of exposing our mutation impact data for reuse by a variety of stakeholders.

Highlights Track: HL30 Tuesday, July 19: 11:45 a.m. - 12:10 p.m.

Fast and Efficient Dynamic Nested Effects Models
Room: Hall A
Presenting author: Holger Fröhlich , University of Bonn, Bonn-Aachen International Center for IT, Germany

Additional authors:
Paurush Praveen, University of Bonn, Bonn-Aachen International Center for IT, Germany
Achim Tresch, Ludwig-Maximilians-University Muenchen, Gene Center Munich, Germany

Area Session Chair: Paul Horton

Presentation Overview:
Reverse engineering of biological networks is key to understanding biological systems. Exact knowledge of the interdependencies between proteins in the living cell is crucial for identifying drug targets for various diseases. However, due to the complexity of the system, a complete picture with detailed knowledge of the behavior of individual proteins is still out of reach. Nonetheless, the advent of gene perturbation techniques such as RNA interference (RNAi) has opened new perspectives for network reconstruction by boosting the ability to subject organisms to well-defined interventions.

Nested Effects Models (NEMs; Markowetz et al., Bioinformatics, 2005) have been introduced as a statistical approach to estimate the upstream signal flow from the downstream nested subset structure of high-dimensional perturbation effects (measured e.g. on microarrays). The method was substantially extended later on by a number of authors and successfully applied to various datasets (Markowetz et al., Bioinformatics, 2005; Tresch & Markowetz, Stat. Appl. Genome Biol., 2007; Froehlich et al., BMC Bioinformatics 2007; Froehlich et al., Bioinformatics, 2008; Froehlich et al., Biometrical Journal, 2009; Zeller et al., EURASIP J. on Bioinf. and Syst. Biol. 2009; Anchang et al., PNAS, 2009). The connection of NEMs to Bayesian Networks and factor graph models has been highlighted (Zeller et al., EURASIP J. on Bioinf. and Syst. Biol. 2009; Vaske et al., PLOS Comp. Biol., 2009).

Here we introduce a computationally attractive extension of NEMs that enables the analysis of perturbation time series data (measured e.g. on microarrays). It thus complements the attempt of Anchang et al. (PNAS, 2009) to extend static NEMs to the modeling of perturbation time series measurements. Most importantly, this allows for the resolution of feedback loops in the signaling cascade, as well as for the discrimination of direct and indirect signaling. In contrast to Anchang et al., the key idea in our model is to unroll the signal flow over time. This allows for a computation showing some similarity to Dynamic Bayesian Networks and naturally extends the classical NEM formulation. Our model circumvents the need for time-consuming Gibbs sampling, which also makes it computationally attractive.

We performed extensive simulations of our model (also compared to a static NEM) to investigate its dependency on the length of time series, the sizes and architectures of the networks to be learned, and on the amount of available data. Our results indicate a very high specificity together with a good sensitivity of our method. The high specificity can be attributed to a special network structure prior favoring sparse networks here, but more generally could also incorporate prior beliefs on specific edges.

We applied our model to data investigating self-renewal in murine embryonic stem cells (Ivanova et al., Nature, 2006). We found good agreement between our estimated network of 6 key proteins (5 transcription factors) and the biological literature. Moreover, our result generally agrees with the one previously published by Anchang et al., although it is sparser.

In summary we believe that our approach can serve as a useful tool to generate data driven hypotheses about signaling and/or transcriptional networks based on high-dimensional perturbation effects.

Highlights Track: HL31 Tuesday, July 19: 12:15 p.m. - 12:40 p.m.

The chicken or the egg problem in epidemiology: untangling the mutual impact of transmission network dynamic and epidemic spread.
Room: Hall E1
Presenting author: Christel Kamp , Paul-Ehrlich-Institut, Germany

Area Session Chair: Yanay Ofran

Presentation Overview:
Network models are well established in computational biology as they acknowledge that system dynamics is rarely determined by components alone but by interactions between components. In epidemiology, not individual behaviour but rather the structure and dynamics of the transmission network between hosts strongly influence the time scale and patterns of epidemics. Within a mathematical framework we further show that also the reverse is true and quantify the impact that epidemics have on the way contacts are made among healthy and infected individuals. This information is relevant for epidemic control as will be shown in a case study on HIV epidemics: The clustering and mixing of infected individuals depend both on the transmission network and stage of epidemic – as do the contributions of primarily and latently infected individuals to epidemic spread. In conclusion, the flexibility of the framework will be highlighted in combination with potential applications in epidemiology and beyond.

Highlights Track: HL32 Tuesday, July 19: 12:15 p.m. - 12:40 p.m.

C(alpha)-trace model of the transmembrane domain of human copper transporter 1, motion and functional implications
Room: Hall A
Presenting author: Nir Ben-Tal , Tel-Aviv University, Israel

Additional authors:
Yariv Barkan, Tel-Aviv University, Israel
Turkan Haliloglu, Bogazici University, Turkey
Nir Ben-Tal, Tel-Aviv University, Israel

Area Session Chair: Paul Horton

Presentation Overview:
The human copper transporter 1 (hCTR1) is essential for copper uptake and is implicated in sensitivity to chemotherapy drugs. Using the hCTR1 cryoelectron microscopy (cryoEM) map and evolutionary data, we constructed a Cα-trace model of the membrane region. Investigating the model's global dynamics through elastic network models, hCTR1’s MxxxM and GxxxG motifs were shown to have significant roles in the two slowest modes of motion. For example, in one of these modes the glycine residues of the GxxxG motif appeared to serve as hinge points and the copper-binding methionine residues of the MxxxM motif manifested cooperative rotational motion, possibly reflecting activation at the pore’s extracellular entrance. We suggest a molecular mechanism of copper transport in which this motif serves both as a gate and as a selectivity filter. We also suggest residues that are responsible for pH activation.

Highlights Track: HL33 Tuesday, July 19: 2:30 p.m. - 2:55 p.m.

Organization of mammalian genomes with respect to the nuclear lamina
Room: Hall E1
Presenting author: Wouter Meuleman , Netherlands Cancer Institute / Delft University of Technology, Netherlands

Additional authors:
Daan Peric-Hupkes, Netherlands Cancer Institute, Netherlands
Marcel Reinders, Delft University of Technology, Netherlands
Lodewyk Wessels, Netherlands Cancer Institute, Netherlands
Bas van Steensel, Netherlands Cancer Institute, Netherlands

Area Session Chair: Terry Gaasterland

Presentation Overview:
The three-dimensional organization of chromosomes within the nucleus is largely unknown. We present high-resolution maps of the interaction of human and mouse genomes with the nuclear lamina, providing detailed views of the spatial organization of interphase chromosomes. These maps reveal substantial refolding of chromosomes during differentiation, involving hundreds of genes that collectively determine cellular identity. We illustrate the involvement of lamina-genome interactions in the control of gene expression programs.

We find that a substantial portion of the spatial chromosome organization is identical across all assayed cell types. This is even the case between species, despite extensive chromosomal rearrangements in mouse and human. Using the genomic sequence alone, we can accurately predict these lamina-genome interactions in both mouse and human. This indicates that the organization of the genome is, to a large extent, hard-coded in the primary sequence. Based on this, we conclude with a mechanistic explanation of lamina-genome interactions.

Highlights Track: HL34 Tuesday, July 19: 2:30 p.m. - 2:55 p.m.

Multi-species integrative biclustering
Room: Hall A
Presenting author: Peter Waltman , New York University, United States

Area Session Chair: Donna Slonim

Presentation Overview:
A key challenge in the analysis of functional genomics data is the identification of modules of genes with similar regulatory controls, a non-trivial problem due to the complexity of regulatory networks. Recent works that compare functional genomics datasets for closely related species reveal that many co-regulated gene groups are conserved across several species. This suggests that comparative analysis of multiple-species functional genomics datasets could prove powerful in accurately identifying those conserved modules.
We describe an extension of our cMonkey algorithm that allows for the simultaneous biclustering of heterogeneous data collections spanning multiple species. We present results from the multi-species biclustering of a group of Gram-positive bacteria containing Bacillus subtilis, Bacillus anthracis, and Listeria monocytogenes. We identify conserved groups of orthologous genes, yielding evolutionary insights into the formation and surprisingly high degree of conservation of regulatory modules across these three species. We report a temporal difference between the two Bacillus species in the expression of a conserved bicluster of metabolic genes required for spore formation. In addition, we discuss the unexpected identification of a highly expressed flagellum assembly bicluster in non-motile B. anthracis. Analysis of the biclusters obtained revealed a large number of gene groups with conserved modularity and high biological significance as judged by several measures of cluster quality. We show that the method provides a framework that allows data and insights from well-studied organisms to complement the analysis of related but less well-studied organisms.

Highlights Track: HL35 Tuesday, July 19: 3:00 p.m. - 3:25 p.m.

Model-based method for transcription factor target identification with limited data
Room: Hall E1
Presenting author: Magnus Rattray , University of Sheffield, United Kingdom

Additional authors:
Neil Lawrence, University of Sheffield, United Kingdom
Antti Honkela, University of Helsinki, Finland
Eileen Furlong, EMBL Heidelberg, Germany
Charles Girardot, EMBL Heidelberg, Germany
Hilary Gustafson, EMBL Heidelberg, Germany
Ya-Hsin Liu, National Cheng Kung University, Taiwan
Jonatan Ropponen, Aalto University, Finland
Pei Gao, University of Cambridge, United Kingdom

Area Session Chair: Terry Gaasterland

Presentation Overview:
A fundamental problem in systems biology is uncovering the structure and dynamics of gene regulatory networks. A first step is to identify the targets of regulatory molecules such as transcription factors (TFs). We introduce a computational method for inferring targets of transcription factors from gene expression time-series data. Our approach works by fitting a simple model of transcriptional regulation to gene expression data for each putative target gene. The model is used to identify targets of transcription factors of interest. We apply the method to identify targets of TFs regulating Drosophila mesoderm development. Using only wild-type microarray time course data we make predictions that are validated with high significance using ChIP-chip data from the same system. Our model-based approach gives better target predictions than knock-out data for the same TFs and we show how spatial expression data can improve the accuracy of predictions. The method is available as a Bioconductor package allowing easy parallel implementation.

Highlights Track: HL36 Tuesday, July 19: 3:00 p.m. - 3:25 p.m.

Inference of Signaling Network Architecture and Dynamics from High-throughput Combinatorial RNAi Screens
Room: Hall A
Presenting author: Chris Bakal , The Institute of Cancer Research, United Kingdom

Additional authors:
Oaz Nir, Massachusetts Institute of Technology, United States
Norbert Perrimon, Harvard Medical School, United States
Bonnie Berger, Massachusetts Institute of Technology, United States

Area Session Chair: Donna Slonim

Presentation Overview:
Biological signaling networks are highly complex systems that act to regulate fundamental cellular processes such as growth, proliferation, differentiation and migration. Because the actions of these networks underpin both health and disease, describing their architecture and dynamics is a fundamental challenge in biology. High-content image-based RNAi screens provide a wealth of data regarding the contribution of single genes to different phenotypes, but mapping cellular networks from these data is a daunting challenge. Here we describe a systematic computational framework based on a classification model for network inference using high-dimensional single-cell morphological data. We applied this method to a dataset comprising images generated in the course of a double RNAi screen and generated a network of RhoGTPases and RhoGTPase Activating Proteins, which are essential for cell shape, migration and motility in all eukaryotic cells. Furthermore, we have recently extended these methods to model signaling network dynamics.

Highlights Track: HL37 Tuesday, July 19: 3:30 p.m. - 3:55 p.m.

Quantitative analysis of the Drosophila segmentation regulatory network using pattern generating potentials
Room: Hall E1
Presenting author: Saurabh Sinha , University of Illinois Urbana-Champaign, United States

Additional authors:
Saurabh Sinha, University of Illinois Urbana-Champaign, United States
Majid Kazemian, University of Illinois Urbana-Champaign, United States
Charles Blatti, University of Illinois Urbana-Champaign, United States
Adam Richards, University of Massachusetts Medical School, United States
Michael McCutchan, Arizona State University, United States
Noriko Wakabayashi-Ito, University of Massachusetts Medical School, United States
Ann Hammonds, Lawrence Berkeley National Laboratory, United States
Susan Celniker, Lawrence Berkeley National Laboratory, United States
Sudhir Kumar, Arizona State University, United States
Scot Wolfe, University of Massachusetts Medical School, United States
Michael Brodsky, University of Massachusetts Medical School, United States

Area Session Chair: Terry Gaasterland

Presentation Overview:
The developmental program specifying segmentation along the anterior-posterior axis of the Drosophila embryo is encoded in DNA segments called cis-regulatory “modules”. Previous work has identified some of these modules along with their related transcription factors. We present a novel computational framework that turns a qualitative and fragmented understanding of modules and factor-module interactions into a quantitative, systems-level view. Our model utilizes experimentally characterized binding specificities of transcription factors and gene expression patterns to describe how multiple transcription factors act together in a module to determine its regulatory activity. This logistic regression-based model can explain the expression patterns of known modules, infer factor-module interactions and identify novel modules of the regulatory network by quantifying their potential to drive a gene's expression. As databases of binding motifs and gene expression patterns grow, this new approach provides a general method to decode transcriptional regulatory sequences and networks.
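The general shape of a logistic regression model of this kind can be sketched as below. This is a generic illustration rather than the authors' model: the function name, the use of one binding-strength feature per transcription factor, and the example weights are all assumptions. Positive weights stand for activators and negative weights for repressors.

```python
import math


def expression_probability(tf_features, weights, bias=0.0):
    """Logistic model: probability that a module drives expression, given
    one feature per transcription factor (e.g. a binding score, assumed).
    """
    if len(tf_features) != len(weights):
        raise ValueError("need exactly one weight per transcription factor")
    score = bias + sum(w * x for w, x in zip(weights, tf_features))
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # Hypothetical module bound by one activator and one repressor.
    weights = [2.0, -2.0]
    print(expression_probability([1.0, 0.0], weights))  # activator only
    print(expression_probability([0.0, 1.0], weights))  # repressor only
```

In practice the weights would be fitted to known module expression patterns, and the fitted model could then be used to score candidate sequences.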