Poster numbers will be assigned May 30th.
If you cannot find your poster below, that probably means you have not yet confirmed that you will be attending ISMB/ECCB 2015. To confirm your poster, find the poster acceptance email, which contains a confirmation link. Click on it and follow the instructions.

If you need further assistance please contact and provide your poster title or submission ID.

Category O - 'Systems Biology and Networks'
O001 - Flux analysis: Genomic island etiology by visualization
Louis Cronje, Student, South Africa
Short Abstract: Oscillations of gene exchange in the prokaryotic world have enormous consequences for bacterial evolution and outbreaks of new diseases. Current research in the detection of genomic islands (GIs) allows estimating relative insertion times and inferring donor-recipient relationships, thereby enabling the simulation of gene flux networks across a multitude of species. The Pre_GI database contains 26,744 GIs predicted by the SWGIS program in 2407 bacterial chromosomes and plasmids. To study the layers of horizontal gene transfer and the extent of gene spread within the Pre_GI database, we implement Markov clustering among GI groups on a beta version of the web-based platform developed for this project. GI groups were derived from GIs in individual organisms sharing significant similarity in oligonucleotide patterns and equivalent insertion times, to counteract the fragmentation of transferred regions after integration. Additionally, we constructed gene networks from successive BLAST matches among GI coding regions, assigning each a functional description by PSI-BLAST searches against the COG database. The web platform enables the visualization of gene sharing relationships among selected organisms of interest and of links suggested by functional categories from the gene networks. Lastly, we investigate the associations between groups of GIs and the occurrence of shared gene matches among them. We observed multiple gene matches most frequently between groups of GIs in rather distant organisms belonging to Bacillus cereus, Staphylococcus aureus, Clostridium botulinum and Francisella tularensis. The possible role of the identified gene fluxes between taxonomically diverse microorganisms, many of which are pathogens, is discussed.
O002 - Analyses of the effect of loops on the dynamics of small pathway-like gene networks
Eugenio Azpeitia, INRIA, France
Nathan Weinstein, UNAM, Mexico
Daniel Gonzalez-Tokman, INECOL, Mexico
Stalin Muñoz, UNAM, Mexico
Elena Alvarez-Buylla, UNAM, Mexico
David Rosenblueth, UNAM, Mexico
Luis Mendoza, UNAM, Mexico
Short Abstract: Recently, several works have found that the presence of different regulatory motifs, such as feedback and feed-forward loops, is pervasive in gene regulatory networks. Importantly, these motifs generate networks that are neither hierarchical nor unidirectional. Nevertheless, explicitly or not, many of the current experimental analyses and some of the theoretical analyses rely on the assumption of a simple unidirectional and hierarchical (pathway-like) organization. Note that the presence of motifs may modify the results expected from a pathway-like network topology. In this work, we use a Boolean formalism to study the effect of including feedback and feed-forward motifs on the dynamical behavior of pathway-like networks. We show that the inclusion of such motifs increases the dynamical diversity of networks, measured as the size and number of attractors. At the same time, the dynamical redundancy increases, because a large number of networks are generated with the same type of dynamical responses. Moreover, the redundant responses coincide with the type of results commonly expected experimentally. However, these results are not distinguishable with many of the current analyses. Hence, we look for putative properties that better characterize the dynamical behavior of the networks. Our results suggest that the loops’ functionality, and the relation of such functionality to the network topology, constrain the set of biologically meaningful networks, pointing to possible future directions for creating more accurate and efficient analyses of gene regulatory networks.
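As an illustration of the Boolean formalism mentioned in this abstract, the following minimal sketch enumerates the attractors of a small synchronous Boolean network by exhaustive simulation. The three-gene pathway, the added negative feedback loop, and the function name find_attractors are hypothetical choices made for this example; they are not taken from the authors' work.

```python
from itertools import product

def find_attractors(update_rules):
    """Return the attractors of a synchronous Boolean network.

    update_rules maps each gene name to a function that takes the current
    state (a dict of gene -> 0/1) and returns the gene's next value.
    Each attractor is returned as a tuple of states (a fixed point has length 1).
    """
    names = sorted(update_rules)
    attractors = set()
    for start_state in product((0, 1), repeat=len(names)):
        state = start_state
        seen = {}        # state -> position in the trajectory
        trajectory = []
        while state not in seen:
            seen[state] = len(trajectory)
            trajectory.append(state)
            values = dict(zip(names, state))
            state = tuple(int(update_rules[n](values)) for n in names)
        cycle = trajectory[seen[state]:]
        # Rotate the cycle to a canonical start so each attractor is stored once.
        first = cycle.index(min(cycle))
        attractors.add(tuple(cycle[first:] + cycle[:first]))
    return attractors

# Hypothetical pathway-like network: input A activates B, B activates C.
pathway = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}

# Same pathway with an assumed negative feedback loop from C onto B.
with_feedback = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

for label, network in (("pathway", pathway), ("with feedback", with_feedback)):
    sizes = sorted(len(a) for a in find_attractors(network))
    print(label, "attractor sizes:", sizes)
```

In this toy case the plain pathway has only fixed points, while the feedback variant also has a cyclic attractor, which is one way in which attractor size can change when a loop is added.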
O003 - CausalR – Extracting mechanistic sense from genome scale data
Glyn Bradley, GSK,
Short Abstract: Utilisation of causal interaction data enables mechanistic rather than descriptive interpretation of genome scale data. Here we present CausalR, the first open source biological causal reasoning platform. We describe the construction of substrate causal graphs from public and commercial data sources and apply CausalR to immunological transcriptomic datasets. Novel techniques to reverse engineer regulatory causal networks, define compound mechanism of action, identify new targets for disease and reposition existing drugs are discussed. CausalR is available from Bioconductor.
O004 - The regulatory network controlling B cell differentiation
Akram Mendez, Universidad Nacional Autónoma de México, Mexico
Luis Mendoza, Universidad Nacional Autónoma de México, Mexico
Short Abstract: B lymphocyte differentiation is an essential process for the adaptive immune response in vertebrates. It depends on the concerted action of multiple transcription factors, acting in response to antigen recognition and extracellular signals, that control the establishment of lineage-specific gene expression programs and restrict the differentiation options of progenitor cells, thus resulting in the formation of antibody-producing cells that protect the organism against foreign molecules. While there is a wealth of experimental data regarding the molecular and cellular signals involved in this process, the structure and dynamical properties of the underlying regulatory network are not well understood. In this work, we present a dynamical model of the regulatory network controlling B cell differentiation. The structure of the network was inferred from experimental data available in the literature, and its dynamical behavior was analyzed by modeling the network as a continuous dynamical system in the form of a coupled set of differential equations. The steady states of this model are consistent with the known activation patterns of B cells at multiple differentiation stages. Moreover, the model is able to describe the differentiation process from the common lymphoid precursor (CLP) to any of the pre-B, pro-B, Naive, GC, Mem, or PC cell types in response to a specific set of extracellular signals. The model presented in this work constitutes the largest reconstruction to date of the regulatory network controlling terminal B cells.
O005 - Constraints imposed by the Gene Regulatory Network explain phenotype heterogeneity and plasticity in Granulocyte-Monocyte progenitor derived cells
Carlos Ramirez, UNAM, Mexico
Luis Mendoza, UNAM, Mexico
Deni Esther Espinosa, UNAM, Mexico
Short Abstract: Blood cell formation, or hematopoiesis, is a rather complicated process that shares with other biological differentiation processes characteristics such as heterogeneity, plasticity and hierarchy. Furthermore, it is probably the most studied tissue differentiation process in mammals. To deal with this huge complexity, it is necessary to build models that take advantage of the qualitative knowledge gained from experimental observations. Genetic Regulatory Network (GRN) models have been shown to reproduce some dynamic behaviors of differentiation processes. We assessed whether the characteristics observed in a subset of blood cells, the Granulocyte-Monocyte Progenitor (GMP) derived cells, arise from the dynamic behavior of the GRN module that is active in the process.
Results: A GRN for GMP derived cells was formalized as a Boolean Network model. Dynamic simulations were able to recover expression profiles for monocytes, mast cells and some granulocyte lineages. Additionally, the model was able to reproduce phenotype heterogeneity and plasticity in these GMP derived cell lineages.
O006 - A Minimal Regulatory Network of Extrinsic and Intrinsic Factors Recovers Observed Patterns of CD4+ T Cell Differentiation and Plasticity
Mariana Martínez Sánchez, Universidad Nacional Autónoma de México, Mexico
Carlos Villarreal, Universidad Nacional Autónoma de México, Mexico
Luis Mendoza, Universidad Nacional Autónoma de México, Mexico
Elena Alvarez-Buylla, Universidad Nacional Autónoma de México, Mexico
Short Abstract: CD4+ T cells orchestrate adaptive immune responses in vertebrates. These cells differentiate into several types depending on environmental signals and immunological challenges, as well as the lineage to which they belong. Once these cells are committed to a particular fate, they can alter their expression pattern and switch to different cell types, thus exhibiting cell plasticity that enables the immune system to dynamically adapt to novel challenges. How such plasticity is attained, and what role intracellular versus extracellular molecular components and environmental cues play, is still not well understood. We integrated the experimental data available on the molecular components involved in CD4+ T cell differentiation into a large network. We then used formal methods to reduce it to a smaller network that constitutes an intracellular regulatory core, includes transcription factors and signaling molecules, and is able to attain the configurations characteristic of most CD4+ T cell types, as well as their transition patterns in response to various signals. Our model provides a formal test of the insufficiency of the regulatory interactions among the core transcription factors, or intrinsic factors, to recover CD4+ T cell types. Using this regulatory network model, we also recovered CD4+ T cell transition maps under contrasting micro-environments. Finally, we identified key components for cell differentiation and plasticity under contrasting immunogenic conditions. Our models may be useful to further explore the mechanisms underlying the immune system and may also be a useful tool for biomedical applications.
O007 - Integrated network analysis for identification of functional modules in cancer
Michaela Bayerlova, University Medical Center Göttingen, Germany
Annalen Bleckmann, University Medical Center Göttingen, Germany
Florian Klemm, University Medical Center Göttingen, Germany
Frank Kramer, University Medical Center Göttingen, Germany
Tim Beißbarth, University Medical Center Göttingen, Germany
Short Abstract: Transcriptomic data offer comprehensive profiles of expression changes between different conditions, but their complexity can make them challenging to interpret. As biological processes are often regulated by coordinated effects of multiple interacting molecules, the analysis of transcriptomic data in the frame of biological networks can ease their interpretation. Thus, we investigated integration approaches for signalling, protein-protein interaction (PPI) and co-expression networks.
We used three different integrative network approaches to analyse an RNA-Seq data set from breast cancer cell lines with 2068 differentially expressed target genes. 1) We integrated the targets into a newly constructed signalling network of the WNT pathway containing 465 genes. The minimal spanning tree of target nodes was identified, and tree nodes were used to extract an induced sub-graph which represented a target-specific WNT signalling sub-network. 2) We combined the targets with the BioGrid database interactome, resulting in a PPI network of 1712 nodes. To gain further insights we clustered the PPI network into densely interconnected communities and identified their key nodes. 3) We constructed a co-expression network comprising 13356 genes connected by 8 million edges to investigate correlation patterns over all conditions. The network was clustered into modules which we tested for enrichment in KEGG pathways and investigated for co-expression patterns.
Our three approaches demonstrated the utility of different biological networks for discovering meaningful and coordinated expression changes. Therefore, by integrative network analysis we identified functional modules important for understanding of transcriptional processes in cancer.
O008 - Dynamic networks reveal key players in aging
Fazle Faisal, University of Notre Dame, United States
Tijana Milenkovic, University of Notre Dame, United States
Short Abstract: Motivation: Because susceptibility to diseases increases with age, studying aging gains importance. Analyses of gene expression or sequence data, which have been indispensable for investigating aging, have been limited to studying genes and their protein products in isolation, ignoring their connectivities. However, proteins function by interacting with other proteins, and this is exactly what biological networks (BNs) model. Thus, analyzing the proteins' BN topologies could contribute to the understanding of aging. Current methods for analyzing systems-level BNs deal with their static representations, even though cells are dynamic. For this reason, and because different data types can give complementary biological insights, we integrate current static BNs with aging-related gene expression data to construct dynamic age-specific BNs. Then, we apply sensitive measures of topology to the dynamic BNs to study cellular changes with age.

Results: While global BN topologies do not significantly change with age, local topologies of a number of genes do. We predict such genes to be aging-related. We demonstrate credibility of our predictions by (i) observing significant overlap between our predicted aging-related genes and 'ground truth' aging-related genes; (ii) observing significant overlap between functions and diseases that are enriched in our aging-related predictions and those that are enriched in 'ground truth' aging-related data; (iii) providing evidence that diseases which are enriched in our aging-related predictions are linked to human aging; and (iv) validating our high-scoring novel predictions in the literature.
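The abstract above does not spell out how the static network and the expression data are combined. One plausible reading, assumed here purely for illustration, is that each age-specific network keeps only the interactions whose two proteins are both expressed at that age; a local topological property (node degree, as a simple stand-in for the more sensitive measures mentioned) can then be tracked across ages. All names and data in this sketch are hypothetical.

```python
def age_specific_networks(static_edges, expressed_by_age):
    """Build one network per age from a static network.

    Assumption for this sketch: an edge is kept at a given age only if
    both of its endpoints are in the set of genes expressed at that age.
    """
    networks = {}
    for age, expressed in expressed_by_age.items():
        networks[age] = [
            (a, b) for a, b in static_edges
            if a in expressed and b in expressed
        ]
    return networks

def degree(edges, node):
    """Number of edges touching the node (a very simple local measure)."""
    return sum(1 for a, b in edges if node in (a, b))

# Hypothetical static interaction network and expression calls.
static_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
expressed_by_age = {
    20: {"P1", "P2", "P3", "P4"},
    60: {"P1", "P3", "P4"},
    80: {"P3", "P4"},
}

networks = age_specific_networks(static_edges, expressed_by_age)
for age in sorted(networks):
    print(age, "degree of P3:", degree(networks[age], "P3"))
```

A gene whose local topology changes markedly across the age-specific networks, as P3 does in this toy data, would be the kind of candidate the abstract describes.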
O009 - Functional and co-evolutionary analysis of chromatin and cytosine modification networks in mouse embryonic stem cells
Juliane Perner, , Germany
Enrique Carrillo de Santa Pau, Spanish National Cancer Research Center, CNIO, Spain
David Juan, Spanish National Cancer Research Center, CNIO, Spain
Simone Marsili, Spanish National Cancer Research Center, CNIO, Spain
David Ochoa, European Molecular Biology Laboratory,
Ho-Ryun Chung, Max Planck Institute for Molecular Genetics, Germany
Daniel Rico, Spanish National Cancer Research Center, CNIO, Spain
Martin Vingron, Max Planck Institute for Molecular Genetics, Germany
Alfonso Valencia, Spanish National Cancer Research Center, CNIO, Spain
Short Abstract: The cell-type specific regulatory state of the genome can be observed by investigating the local chromatin environment. Specific combinations of histone modifications and cytosine modifications reveal the current chromatin state. With these combinations, individual regulatory elements, e.g. promoters or enhancers, and their regulatory states can be identified. However, the functional impact of the different combinations is not yet clear. In particular, their implications for the recruitment of chromatin modifiers to specific chromatin environments and their relationships to transcription factors are mostly unknown.
We have integrated publicly available epigenomic data of mouse embryonic stem cells (mESC) combining 139 experiments (ChIP-Seq and MeDIP) of 77 epigenomic features to investigate the interactions between chromatin modifications and chromatin modifiers. We applied network reconstruction methods to the genome-wide location data and compared the resulting interactions across various chromatin states. Using this novel approach we identified modules of specific interactions characterizing particular chromatin states. Our comparative analysis results in data-driven, novel hypotheses on the regulatory mechanisms defining the various chromatin states. Most importantly, we discovered a novel role of the recently discovered hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) in the chromatin signaling network.
O010 - Human disease related mouse phenotype prediction system with PageRank algorithm
Young Seek Lee, Hanyang University, Korea, Rep
Soo Young Cho, Seoul National University, Korea, Rep
SooJun Park, ETRI, Korea, Rep
Doo-Sik Kim, Yonsei University, Korea, Rep
Je Kyung Seong, Seoul National University, Korea, Rep
Short Abstract: Genetically Engineered Mouse (GEM) models are used in high-throughput phenotyping screens for understanding genotype-phenotype associations and their relevance to human disease. However, not all mutant mouse lines with detectable phenotypes are associated with human disease. Here we propose the Target gene selection system for Genetically engineered mouse models (TarGo). Using a combination of human disease descriptions, network topology and genotype-phenotype relations, novel genes potentially related to human disease are suggested. We constructed a gene interaction network using Protein-Protein Interaction (PPI), molecular pathway and co-expression data. The information on human disease related genes was obtained from several repositories of human disease signatures. We calculated disease- or phenotype-specific gene rankings using network topology and disease signatures. TarGo provides many novel features for gene function prediction.
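The abstract does not describe how PageRank is applied in TarGo. As a rough sketch of the general idea of ranking genes from network topology plus a disease signature, the example below runs a personalized PageRank by power iteration on an undirected gene network, with the restart probability concentrated on seed genes. The gene names, the seed set, and the damping value are assumptions for illustration only.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=100):
    """Personalized PageRank on an undirected graph by power iteration.

    edges: iterable of (gene_a, gene_b) pairs.
    seeds: genes that receive the restart probability (e.g. a hypothetical
           disease signature).
    Returns a dict gene -> score; the scores sum to 1.
    """
    seeds = set(seeds)
    nodes = sorted({n for edge in edges for n in edge} | seeds)
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            if neighbors[n]:
                # Spread this node's score evenly over its neighbors.
                share = damping * rank[n] / len(neighbors[n])
                for m in neighbors[n]:
                    new_rank[m] += share
            else:
                # Isolated node: send its score back to the seed genes.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * restart[m]
        rank = new_rank
    return rank

# Hypothetical chain of interacting genes with one disease-signature gene.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]
scores = personalized_pagerank(edges, seeds={"G1"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 3))
```

Genes that are topologically close to the seed gene receive higher scores than distant ones, which is the behavior a disease-specific ranking would rely on.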
O011 - Integrative Analysis of Transcriptomic Data into Genome-Scale Model of Metabolism Results in Identification of Metabolic Engineering Targets
Minsuk Kim, Seoul National University, Korea, Rep
Jeong Sang Yi, Seoul National University, Korea, Rep
Byung-Gee Kim, Seoul National University, Korea, Rep
Short Abstract: In recent years, in silico strain design algorithms based on the genome-scale model of metabolism (GEM) have become the most prominent tools for identifying nonobvious metabolic engineering targets. Currently, the majority of the most successful strain design algorithms first generate a reference state for the host strain using flux balance analysis (FBA) and then find sets of regulatory changes required to increase the product yield. However, due to the underlying assumptions of FBA, the use of the current algorithms is limited to cells behaving optimally. To overcome this limitation, integrative analysis of transcriptomic data can serve as an alternative to FBA for generating the reference state. In this study, we present the first strain design algorithm that makes use of transcriptomic data for generating the reference state. Our new algorithm uses the integrative metabolic analysis tool (iMAT) (Zur et al. (2010) Bioinformatics) to integrate transcriptomic data into the GEM and determines sets of active and inactive reactions. Subsequently, the algorithm identifies gene overexpression targets from the set of inactive reactions to increase the product yield. To investigate the power of the new algorithm, we applied it to Streptomyces coelicolor to design an antibiotic overproducer. Three gene overexpression targets related to the NADPH regeneration process were identified and experimentally validated for their effectiveness in antibiotic production. In short, by employing transcriptomic data in an integrative manner, we can widen the applicability of strain design algorithms to secondary metabolite overproducer design.
O012 - Development of M-path platform for Navigating Potential Metabolic Pathways
Teppei Ogawa, Mitsui Knowledge Industry Co.,Ltd., Japan
Hiroki Makiguchi, Mitsui Knowledge Industry Co.,Ltd., Japan
Masahiko Nakatsui, Dept. of Chem. Sci. and Eng., Grad. Sch. of Eng., Kobe Univ., Japan
Robert Cox III, Dept. of Chem. Sci. and Eng., Grad. Sch. of Eng., Kobe Univ., Japan
Akihiko Kondo, Dept. of Chem. Sci. and Eng., Grad. Sch. of Eng., Kobe Univ., Japan
Michihiro Araki, Org. of Advanced Sci. and Tech., Kobe Univ., Japan
Short Abstract: Recent developments in synthetic biology and metabolic engineering have led to the construction of synthetic metabolic pathways for efficient production of various natural and non-natural chemicals. We have developed a computational tool, M-path, to find potential synthetic metabolic pathways for these chemicals.
M-path makes use of chemical structure information and enzymatic reaction rules, and proceeds as follows. Step 1: M-path first calculates the pathway feature from the start compound to the target compound. Step 2: Random subsets of reactions are extracted from the entire reaction data. Step 3: Combinations of reaction features that fill in the pathway feature are calculated using linear programming. Step 4: The combinations of reaction features are rearranged and intermediates are assigned. Step 5: Pathway scores are calculated by comparing chemical structures, and pathway candidates are output. A web-based platform was also developed to inspect the resulting metabolic pathways.
We further constructed the M-path database. In the M-path database, 25,000 amino acids and 20,000 keto acids in the PubChem database, and over 30,000 compounds in the ChEBI database were selected as target compounds for predicting possible synthetic pathways. Of these compounds, we found more than 120,000, 140,000 and 270,000 potential metabolic pathways including compounds for 5,129 amino acids, 3,808 keto acids and 6,496 ChEBI compounds, respectively.
O013 - Global versus local biological network alignment: which one is better?
Lei Meng, University of Notre Dame, United States
Aaron Striegel, University of Notre Dame, United States
Tijana Milenkovic, University of Notre Dame, United States
Short Abstract: Biological network alignment (NA) aims to find regions of topological and functional similarities between molecular networks of different species. Just like genomic sequence alignment, NA can be local (LNA) or global (GNA). LNA methods (e.g., NetworkBLAST, NetAligner, AlignNemo, and AlignMCL) aim to find small highly-conserved network regions and produce many-to-many mappings between nodes of the compared networks. On the other hand, GNA methods (e.g., NETAL, GHOST, MAGNA, and WAVE) aim to find large conserved subgraphs and produce one-to-one node mappings. Given the different goals and outputs of LNA and GNA, when a new NA method is proposed, it is typically compared only against existing methods from the same category (i.e., LNA or GNA). Instead, we introduce the first ever systematic evaluation of the two NA method categories, along with new measures of both topological and biological alignment quality that allow for fair comparison of the different LNA and GNA output types. We evaluate the prominent LNA and GNA methods (listed above) on both synthetic and real-world biological networks. We thoroughly study the effect on alignment quality of using only topological information to compute node similarities across different networks versus also using sequence information for this purpose. When using only topological information to construct alignments, GNA outperforms LNA in terms of both topological and biological alignment quality. When sequence information is also included during alignment construction, GNA is superior in terms of topological alignment quality, while LNA is superior in terms of biological quality. Our results provide guidelines for future NA method development.
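To make the notion of topological alignment quality concrete, the sketch below computes edge correctness, a commonly used measure for one-to-one (global) alignments: the fraction of edges in the first network that are mapped onto edges of the second. This is not one of the new measures introduced in the abstract, which are not described there; the networks, node names and mapping are hypothetical.

```python
def edge_correctness(edges_a, edges_b, mapping):
    """Fraction of edges of network A conserved in network B under a
    one-to-one node mapping (dict: node of A -> node of B).

    Edges are treated as undirected; unmapped nodes conserve nothing.
    """
    target_edges = {frozenset(edge) for edge in edges_b}
    conserved = sum(
        1
        for u, v in edges_a
        if u in mapping
        and v in mapping
        and frozenset((mapping[u], mapping[v])) in target_edges
    )
    return conserved / len(edges_a)

# Hypothetical toy networks: a triangle aligned to a three-node path.
network_a = [("a", "b"), ("b", "c"), ("c", "a")]
network_b = [("x", "y"), ("y", "z")]
alignment = {"a": "x", "b": "y", "c": "z"}

print(edge_correctness(network_a, network_b, alignment))
```

Here two of the three edges of the triangle land on edges of the path, so the score is two thirds. A many-to-many local alignment does not fit this definition directly, which is part of why comparing LNA and GNA fairly requires different measures.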
O014 - NaviCell Web Service for network-based data visualization and analysis
Inna Kuperstein, Institut Curie, France
Eric Bonnet, Institut Curie, France
Eric Viara, Institut Curie, France
Laurence Calzone, Institut Curie, France
David Cohen, Institut Curie, France
Emmanuel Barillot, Institut Curie, France
Andrei Zinovyev, Institut Curie, France
Short Abstract: NaviCell Web Service is a tool for network-based visualization of “omics” data which implements several data visualization methods. NaviCell Web Service uses Google Maps and semantic zooming to browse large biological network maps, represented in various formats, together with different types of molecular data mapped on top of them. The input data for NaviCell Web Service are various omics data such as mRNA, microRNA or protein expression, mutation landscapes, and copy-number genomic profiles. NaviCell Web Service is also suitable for computing aggregate values for sample groups and protein families and mapping these data onto the maps. A table with sample annotations can be uploaded in order to define biologically or clinically relevant groups of samples. The tool provides standard heatmaps, barplots and glyphs as well as the novel map staining technique for grasping large-scale trends in numerical values (such as the whole transcriptome) projected onto a pathway map. The web service provides a server mode, which allows automating visualization tasks and retrieving data from maps via RESTful (standard HTTP) calls. Bindings to different programming languages are provided (Python, R, Java). The novelty of NaviCell Web Service is in the combination of these flexible features, which provides an opportunity to adjust the modes of visualization to the data type and achieve the most meaningful picture. The features of the tool are illustrated in case studies using pathway maps created by different research groups, in which data visualization provides new insights into molecular mechanisms involved in systemic diseases such as cancer and neurodegenerative diseases.
O015 - Investigating Drug-Resistance in Cancer with a new Mathematical Framework for the Inference of Cancer Progression Models and Vector Integration Sites Data
Giulio Spinozzi, The San Raffaele Telethon Institute for Gene Therapy (hSR-TIGET), Italy
Andrea Calabria, The San Raffaele Telethon Institute for Gene Therapy (hSR-TIGET), Italy
Giulio Caravagna, University of Milano-Bicocca, Department of Informatics, Systems and Communication (DISCo), Italy
Alex Graudenzi, University of Milano-Bicocca, Department of Informatics, Systems and Communication (DISCo), Italy
Daniele Ramazzotti, University of Milano-Bicocca, Department of Informatics, Systems and Communication (DISCo), Italy
Marco Antoniotti, University of Milano-Bicocca, Department of Informatics, Systems and Communication (DISCo), Italy
Giancarlo Mauri, University of Milano-Bicocca, Department of Informatics, Systems and Communication (DISCo), Italy
Eugenio Montini, The San Raffaele Telethon Institute for Gene Therapy (hSR-TIGET), Italy
Short Abstract: Lentiviral vectors (LVs), when properly modified, might integrate near specific genes, alter their expression and induce cancer or anticancer drug resistance (ACDR). The analysis of vector-cellular genomic junctions in tumor or ACDR cells allows identifying genes causative of the selected phenotype. Indeed, genomic regions targeted at significantly higher frequency than expected under a random distribution are defined as Common Insertion Sites (CIS), a hallmark of insertional mutagenesis. Bioinformatics tools to infer cancer progression models, in terms of selective advantage relations among relevant genomic alterations, would allow identifying specific combinations of targeted drugs to overcome the occurrence of resistance. However, mathematical methods for hypothesis testing of genes involved in ACDR are still missing, as none of the existing methods is able to handle vector integration site (IS) data.
We developed an integrated bioinformatics workflow composed of: (i) an automated procedure to identify IS, (ii) a step for processing CIS and (iii) a new statistical inference technique to infer selective advantage relations among various mutational events related to drug resistance. The model is based on probabilistic causation and is able to reconstruct cancer progression models as Directed Acyclic Graphs.
Applying the method to our IS datasets from two cell lines, we were able to generate progression models involving CIS and to confirm the role of the PIK3CA-PIK3CB genes in ACDR. The analyses of additional IS datasets from other insertional mutagenesis projects aimed at inducing ACDR in different tumor types are ongoing and will allow us to validate or identify novel cancer progression models and possible combinatorial therapies.
O016 - The regulatory network controlling natural killer cell differentiation
Adhemar José Liquitaya Montiel, Instituto de Investigaciones Biomédicas, UNAM, Mexico
Luis Mendoza, Instituto de Investigaciones Biomédicas, UNAM, Mexico
Short Abstract: Natural Killer (NK) cells constitute a subset of lymphocytes that are able to eliminate tumor cells and to regulate the immune response through the release of cytokines. They differentiate from hematopoietic stem cells due to the concerted action of extracellular signals, signaling pathway activity, and transcriptional regulators. This work integrates published experimental data on NK cell differentiation into a regulatory network model, which is analyzed as a discrete dynamical system. Simulations recover two steady states, one with an expression pattern that can be interpreted as mature NK cells, and the other representing early T lymphocyte differentiation. The system responds to the simulated addition of IL-15 by inducing the appearance of mature NK cells, just as has been experimentally reported. We performed a systematic mutant analysis of the model, and the steady states found agree with the reported phenotypes. We compared our results against those obtained from 1000 random networks with similar characteristics to our model, and observed that the expression pattern of our model is strongly determined by the network architecture.
O017 - A Network-based Functional Validation Method for Protein Sets
Malte Luecken, University of Oxford,
Matthew Page, UCB,
Gesine Reinert, University of Oxford,
Charlotte Deane, University of Oxford,
Short Abstract: Detecting functional modules in Protein Interaction Networks (PINs) is an important problem in biology. It is a challenging problem because of the high error rates and systemic biases prevalent in PINs. A particularly prominent bias arises because certain proteins and pathways are better studied than others. This type of “experimenter bias” has the effect that certain well-studied regions of any PIN contain fewer false-positive interactions and more, or more specific, functional annotations. These more specific functional annotations mean that functional similarities computed between proteins are likely to be higher. Thus, there is a bias towards finding proposed modules significant by functional similarity metrics in regions of a PIN which are well-studied.

We have developed a novel method which evaluates the functional significance of proposed protein modules based on functional annotations and the local network structure. This method is less affected by “experimenter bias” and can thus find significant modules in regions of the PIN with fewer or less specific annotations.

We tested our method on PINs from the HINT and BIOGRID protein interaction databases using the community detection methods Modularity Maximization (Reichardt and Bornholdt, 2006), BigCLAM (Yang and Leskovec, 2013) and Link Clustering (Ahn et al., 2010). Our method changes which protein communities represent the most promising candidates for functional modules. We go on to show that we can detect functionally significant communities in poorly annotated regions of the PIN.
O018 - The Developmental Transcriptome for Lytechinus variegatus
Emily Speranza, Boston University, United States
John Hogan, Boston University, Bioinformatics Program, United States
Jessica Keenan, Boston University, Bioinformatics Program, United States
Lingqi Luo, Boston University, Bioinformatics Program, United States
Akhil Saji, Boston University, Biology Department, United States
Mary Ann Sundermeyer, Boston University, Biology Department, United States
Daphne Schatzberg, Boston University, Biology Department, United States
Michael Piacentino, Boston University, Molecular and Cellular Biology and Biochemistry Program, United States
Daniel Zuch, Boston University, Program in Molecular and Cellular Biology and Biochemistry, United States
Amanda Core, Boston University, Biology Department, United States
Jose Horacio Grau, Dahlem Center for Genome Research and Medical Systems Biology, Germany
Bernd Timmermann, Sequencing Core Facility, Max Planck Institute for Molecular Genetics, Germany
Albert Poustka, Dahlem Center for Genome Research and Medical Systems Biology, Germany
Cynthia Bradham, Boston University, Biology Department, United States
Short Abstract: Embryonic development is arguably the most complex process an organism undergoes during its lifetime. Understanding development is best approached with a systems-level perspective. The sea urchin has become a valuable model organism for understanding developmental specification, morphogenesis, and evolution. As a non-chordate deuterostome, the sea urchin occupies an important evolutionary niche between protostomes and vertebrates. Lytechinus variegatus (Lv) is an Atlantic Ocean species that has been studied for a number of years, and has provided important insights into signal transduction, patterning, and morphogenetic changes during embryonic/larval development. The Pacific Ocean species, Strongylocentrotus purpuratus (Sp), is well-studied particularly for gene regulatory networks and cis-regulatory analyses. A well-annotated genome and transcriptome for Sp are available, but similar resources have not been developed for Lv. Here, we provide analysis of the Lv transcriptome at 11 time points during embryonic/larval development. Based on analysis for the expression of a conserved set of genes, we find that the late pluteus larval stage most closely matches the phylotypic vertebrate pharyngula stage, suggesting that conservation of this temporal gene expression pattern predates the appearance of the chordates. Using principal component analysis, we show that the major transitions in variation of embryonic transcription divide the developmental time series into four temporally sequential groups, which is corroborated by k-means cluster analysis, specification network analysis, and metabolic network analysis. Together, these analyses indicate that sea urchin development includes sequential intervals of relatively stable gene expression states punctuated by more abrupt transitions.
O019 - A systems biology characterization of the anti-cancer compound Vorinostat.
Christopher Woelk, University of Southampton,
Cory White, UCSD, United States
Harvey Johnston, University of Southampton,
Celsa Spina, UCSD, United States
Douglas Richman, UCSD, United States
Spiro Garbis, University of Southampton,
Nadejda Beliakova-Bethell, UCSD, United States
Short Abstract: Vorinostat is a histone deacetylase inhibitor (HDACi) used to treat refractory cutaneous T-cell lymphoma (CTCL) and is being investigated as a component of “shock and kill” strategies to cure HIV. Vorinostat inhibits deacetylation, leading to the acetylation of histones and the relaxation of chromatin. However, little is known about other mechanisms of action or the off-target effects of this compound. Therefore, the effects of Vorinostat on primary CD4 T cells were evaluated in a systems biology approach. Cells were isolated from 10 healthy donors and treated with 1µM of Vorinostat for 24 hours or left untreated. Protein extracts from 4 donors were subjected to iTRAQ labeling and characterized by two-dimensional liquid chromatography-mass spectrometry quantitative proteomics. RNA was isolated from 6 donors and subjected to transcriptomic analysis (Illumina HT12 v4 microarrays). Differentially expressed genes (DEGs) and proteins (DEPs), as well as differentially expressed phosphorylated (DPPs) and acetylated (DAPs) proteins were identified using Limma. Data integration was primarily facilitated by using all four data types to construct a single protein interaction network. The addition of proteomic data revealed a much more detailed protein interaction network with the inclusion of many nodes not regulated at the transcriptional level but at the post-translational level. In addition, HMGA1 was differentially expressed at the transcript, protein, and acetylated protein levels. This protein is of particular interest since it may repress transcription from the HIV promoter and thus may limit the effectiveness of Vorinostat in HIV cure strategies.
O020 - GREAT: GRaphlet Edge-based network AlignmenT
Joseph Crawford, University of Notre Dame, United States
Tijana Milenkovic, University of Notre Dame, United States
Short Abstract: Network alignment aims to find regions of topological or functional similarity between networks. In computational biology, it can be used to transfer biological knowledge from a well-studied species to a poorly studied species between aligned network regions. Typically, existing network aligners first compute similarities between nodes in different networks (via a node cost function) and then aim to find a high-scoring alignment (node mapping between the networks) with respect to "node conservation", typically the total node cost function over all aligned nodes. Only after an alignment is constructed do the existing methods evaluate its quality with respect to an alternative measure, such as "edge conservation". Thus, we recently aimed to directly optimize edge conservation while constructing an alignment, which improved alignment quality. Here, we pursue the novel idea of maximizing both node and edge conservation, and we approach it from a novel perspective: we first optimally align edges between the networks in order to improve the node cost function that is then used to align nodes between the networks. In the process, unlike the existing measures of edge conservation that treat each conserved edge the same, we favor conserved edges that are topologically similar over conserved edges that are topologically dissimilar. We show that our novel method, which we call GRaphlet Edge AlignmenT (GREAT), improves upon state-of-the-art methods that aim to optimize node conservation only or edge conservation only.
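To make the notion of edge conservation concrete, the following is a minimal Python sketch of one common unweighted measure (the fraction of edges in the first network that map onto edges of the second). It is not GREAT's own measure, which additionally weights conserved edges by topological similarity; the function name, the toy networks and the alignment are all hypothetical.

```python
def edge_conservation(edges1, edges2, mapping):
    """Fraction of edges in network 1 whose aligned endpoints form an edge in network 2.

    edges1, edges2: lists of undirected edges given as 2-tuples of node labels.
    mapping: dict sending nodes of network 1 to nodes of network 2 (the alignment).
    """
    # Store network 2 edges as frozensets so that (u, v) and (v, u) compare equal.
    e2 = {frozenset(e) for e in edges2}
    conserved = 0
    for u, v in edges1:
        # An edge is conserved only if both endpoints are aligned
        # and their images are connected in network 2.
        if u in mapping and v in mapping:
            if frozenset((mapping[u], mapping[v])) in e2:
                conserved += 1
    return conserved / len(edges1)


# Toy example (hypothetical data): a triangle aligned onto a path.
g1 = [("a", "b"), ("b", "c"), ("c", "a")]
g2 = [(1, 2), (2, 3), (3, 4)]
alignment = {"a": 1, "b": 2, "c": 3}
score = edge_conservation(g1, g2, alignment)
print(score)
```

In this toy case two of the three triangle edges land on path edges, so the score is 2/3. A weighted variant in the spirit of the abstract would replace the unit increment with a per-edge similarity weight.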
O021 - Diffusion-based network analysis approach for Metabolomics data
Sergio Picart, Centre for Biomedical Engineering Research, ESAII, UPC; CIBERbbn, Spain
Francesc Fernández, Centre for Biomedical Engineering Research, ESAII, UPC; CIBERbbn, Spain
Maria Vinaixa, Centre for Omics Sciences, Rovira i Virgili University; CIBERDEM, Spain
Miguel A. Rodríguez, Centre for Omics Sciences, Rovira i Virgili University; CIBERDEM, Spain
Suvi Aivio, Institute for Research in Biomedicine, Barcelona, Spain
Travis H. Stracker, Institute for Research in Biomedicine, Barcelona, Spain
Oscar Yanes, Centre for Omics Sciences, Rovira i Virgili University; CIBERDEM, Spain
Alexandre Perera, Centre for Biomedical Engineering Research, ESAII, UPC; CIBERbbn, Spain
Short Abstract: We propose a generic mathematical method for the enrichment of omic datasets. This method starts from the altered elements measured in an experiment and simulates a diffusion process through a network representation of current knowledge. Particularly, the method contains a null model of this diffusion process, allowing us to build a statistical test for every node in the network and providing a measure of significance through a p-value.

First, we have represented the current knowledge in Metabolomics using KEGG, Kyoto Encyclopedia of Genes and Genomes, a manually curated database. We have selected some of its categories regarding their roles: compounds (also known as metabolites), reactions, enzymes, modules and metabolic pathways. Our network takes into account the connections between these entities, according to KEGG.

Afterwards, we have applied our method to study the effect of a gene knockout case-control experiment, measured through Liquid Chromatography-Mass Spectrometry. The dataset contains five cultures per condition and 279 quantified metabolites for all the samples, out of which 41 appear significantly affected between both conditions. Using this information, we are able to extract an affected subnetwork including predictions in terms of interesting biological entities, namely reactions, enzymes, modules and metabolic pathways. Particularly, the method suggests how the enzymes are connected to the input in the context of the relevant pathways. Our results are validated through Nuclear Magnetic Resonance analyses of the same samples through flux analysis by tracking of isotopic labels.
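The general idea of diffusing from the affected nodes and testing each node against a null model can be illustrated with a small Python sketch. The abstract does not specify the diffusion process or the null model, so everything here is an assumption made for illustration: the diffusion is a random walk with restart, the null model enumerates every input set of the same size, and the six-node path network, the function names and the parameter `alpha` are hypothetical.

```python
from itertools import combinations


def diffuse(adj, sources, alpha=0.5, iters=200):
    """Random-walk-with-restart scores (an assumed stand-in for the diffusion step).

    adj: dict mapping node index -> list of neighbour indices (undirected graph).
    sources: set of node indices that carry the input signal.
    """
    n = len(adj)
    s0 = [1.0 if i in sources else 0.0 for i in range(n)]
    s = s0[:]
    for _ in range(iters):
        # Restart term keeps the signal anchored at the input nodes.
        nxt = [(1 - alpha) * s0[i] for i in range(n)]
        # Each node passes a share of its score equally to its neighbours.
        for j in range(n):
            share = alpha * s[j] / len(adj[j])
            for i in adj[j]:
                nxt[i] += share
        s = nxt
    return s


def node_p_values(adj, sources, alpha=0.5):
    """Per-node p-value: fraction of same-size input sets scoring at least as high."""
    n = len(adj)
    observed = diffuse(adj, sources, alpha)
    hits = [0] * n
    total = 0
    # Exhaustive null model, feasible only for tiny networks; sampling would be
    # needed at realistic scale.
    for null_sources in combinations(range(n), len(sources)):
        null = diffuse(adj, set(null_sources), alpha)
        total += 1
        for i in range(n):
            if null[i] >= observed[i]:
                hits[i] += 1
    return [h / total for h in hits]


# Hypothetical path network 0-1-2-3-4-5 with the signal placed on nodes 0 and 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
p_values = node_p_values(path, {0, 1})
print(p_values)
```

Nodes near the input receive low p-values, whereas the far end of the path scores no higher than under any alternative input set and therefore gets a p-value of 1.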
O022 - An integrative approach to unravel the human–Schistosoma mansoni interactome: Who, when and where
Yesid Astroz, Fiocruz, Brazil
Alberto Santos, Novo Nordisk Foundation Center for Protein Research, Denmark
Lars Jensen, Novo Nordisk Foundation Center for Protein Research, Denmark
Guilherme Oliveira, Fiocruz, Brazil
Short Abstract: The study of molecular host–parasite interactions is essential to understand parasite infection and local adaptation within the host. Recent efforts use several strategies to identify inter-species protein–protein interactions (PPIs) between the host and parasites, viruses and bacteria. Here, we investigate the inferred PPI network between human and S. mansoni, one of the parasites causing Schistosomiasis, a neglected tropical disease. To this end, we propose an integrative approach that gives context to the interactions according to the parasite’s life cycle and subcellular localization of the proteins. We use a homology-based method to predict interactions by looking at intra-species interactions among all organisms within the closest ancestral group common to both human and S. mansoni, and use conservation of interactions as a measure of confidence. In addition, we used publicly available datasets of domain-domain interactions to identify possible PPIs based on common domains. To contextualize the interactions, we limit the interactions to human membrane proteins expressed in tissues that support the parasite’s tropism (skin, blood, lung, liver and intestine). Our approach predicted 34,586 PPIs, which show crosstalk between parasite and host proteins enriched in metabolic and tissue-specific secretory pathways essential in the life cycle of the parasite. An initial manual curation of some of the interactions revealed tissue-specific interactions that are also stage-specific according to expression data available for S. mansoni. We believe that applying this systems biology approach will certainly help uncover targetable mechanisms for the therapy of Schistosomiasis, and also opens the possibility for the analyses of any host-parasite pair.
O023 - Knowledge-Guided Fuzzy Logic Network Modeling to Detect Alterations in Cancer Signaling Pathways
Jie Zheng, Nanyang Technological University, Singapore
Hui Liu, Nanyang Technological University, Singapore
Shital Kumar Mishra, Nanyang Technological University, Singapore
Fan Zhang, Nanyang Technological University, Singapore
Shuigeng Zhou, Fudan University, China
Short Abstract: This poster is based on Proceedings Submission 115.

Abnormal alteration in signaling pathways is a key characteristic of cancer cells. As drug-induced rewiring of signaling networks is a major strategy of anticancer treatment, accurate prediction of cellular responses to drugs is a crucial but challenging task. Our prior knowledge about the mechanisms of signal transduction is too limited to accurately predict the actual cellular responses to perturbations. Despite encouraging success, data-driven methods have their limitations, including the requirement of large datasets that may not be available and the difficulty of interpreting the results. Hybrid methods integrating prior knowledge with data-driven inference are therefore highly desirable. In this paper, we propose a fuzzy logic network model integrating prior knowledge and data-driven inference to detect signaling pathway alterations. In particular, we add to the least-squares error between experimental and predicted data a regularizer that encodes the penalty against both model complexity and structural divergence between the prior and learned networks. We formulate the knowledge-guided fuzzy logic network model as a constrained nonlinear integer programming problem that can be efficiently solved by a genetic algorithm. The proposed method is evaluated on a synthetic dataset and three real phosphoproteomic datasets, and the experimental results demonstrate that our method can not only effectively uncover the topological structure and logical gates of the network, but also infer signaling pathway alterations that are not included in the prior knowledge network but are supported by data.
O024 - DNA methylation-dependent transcription regulatory networks elucidate dynamics of transcription regulatory circuitry in cancers
Xuerui Yang, Tsinghua University, China
Yu Liu, Tsinghua University, China
Yang Liu, Tsinghua University, China
Zhengtao Xiao, Tsinghua University, China
Shengcheng Dong, Tsinghua University, China
Short Abstract: Context-dependent DNA methylation plays a critical role in regulating gene transcription, thereby serving as an important epigenetic marker or regulator in many biological processes and complex diseases such as cancer. However, DNA methylation has rarely been taken into account as a significant factor in de novo reconstructions of cancer type-specific transcription regulatory networks. The present study set out to systematically assess the involvement of DNA methylation in transcription regulatory circuitry in cancer. We took advantage of the multi-dimensional profiling data of DNA methylation and gene expression in tumors of different cancers from The Cancer Genome Atlas consortium, and developed an integrative analysis pipeline based on conditional mutual information to quantify the cooperative regulatory effects of CpG site methylation and transcription factor activity on gene expression. Our genome-wide analysis shows that DNA methylation and transcription factors indeed cooperate to control gene expression. To map the interplay between these two major defining factors of gene expression, a DNA Methylation-dependent Transcription Regulatory Network (MeTRN), the first of its kind, was assembled for each of 19 major cancer types, and broadly validated using public ChIP-seq and DNaseI-seq data. Comparison of these networks across cancer types showed that context-specificity of transcriptional circuits can be largely attributed to the context-dependent nature of DNA methylation patterns. In summary, MeTRN recapitulates an epigenetic scheme that implements dynamics of transcription regulatory circuitry across cancers via context-dependent DNA methylation marks, and thereby serves as a new basis for further mechanistic studies of gene expression dysregulation in cancers.
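As an illustration of the quantity named in this abstract, the sketch below estimates conditional mutual information I(X; Y | Z) from discretized samples with a simple plug-in (count-based) estimator. Which variables the MeTRN pipeline conditions on, and how it discretizes or estimates them, is not stated in the abstract; treating X as transcription factor activity, Y as target gene expression and Z as CpG methylation state is an assumption, and the function name and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(samples):
    """Plug-in estimate of I(X; Y | Z) in bits from a list of (x, y, z) tuples."""
    n = len(samples)
    c_xyz = Counter(samples)
    c_xz = Counter((x, z) for x, y, z in samples)
    c_yz = Counter((y, z) for x, y, z in samples)
    c_z = Counter(z for x, y, z in samples)
    cmi = 0.0
    for (x, y, z), c in c_xyz.items():
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts.
        cmi += (c / n) * log2(c * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)]))
    return cmi


# Hypothetical binary data: (TF activity, gene expression, methylation state).
# Expression follows TF activity exactly within each methylation state.
dependent = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
# Expression is unrelated to TF activity within each methylation state.
independent = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

cmi_dependent = conditional_mutual_information(dependent)
cmi_independent = conditional_mutual_information(independent)
print(cmi_dependent, cmi_independent)
```

The first toy dataset yields one bit of conditional mutual information and the second yields zero. Real expression and methylation data are continuous, so a practical analysis would need binning or a dedicated estimator, plus a significance test.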
O025 - The GeneDataAtlas: Visualizing gene relations in multidimensional data spaces using Voronoi maps
Piet Molenaar, Bioinformatician, Netherlands
Jan Koster, Postdoc, Netherlands
Short Abstract: Defects in the genetic regulatory processes that govern the life cycle of cells can cause diseases, including cancer. Often, these processes are drawn as neatly ordered diagrams. However, gene duplication is one of the driving mechanisms of evolution, resulting in shared properties. As such, genes participate in more or less specific sets of processes governing (transitions between) stable states of biological systems.

High throughput data as obtained from microarrays, proteomics and the like, captures the fuzzy nature of these processes. However, visualizing all of this fuzziness in a comprehensive way is still a challenge.

We hypothesized that capturing existing structured knowledge of genes in a 2D map, a GeneDataAtlas, and superimposing experimental data upon that map would help biologists with the discovery of new patterns in their genomics data.

As a proof of principle we converted the 3 Gene Ontology realms into semantic similarity matrices. These were subsequently hierarchically clustered and the resulting binary trees were laid out as modified 2D VoronoiTreeMaps. As such, the relative positions of genes reflect their similarity in the underlying knowledge space. Experimental data can subsequently be mapped on these fixed maps.

Robustness of the map was verified by comparing relative positions of genes between different versions of the Gene Ontology.

The GeneDataAtlas maps are implemented within the R2 genomics analysis platform, where they can be used for visualization by highlighting genes from analysis results.
O026 - ClustEval - An Integrated Online Framework for the Standardization and Evaluation of Popular Bioinformatics Clustering Tools
Christian Wiwie, University of Southern Denmark, Denmark
Richard Röttger, University of Southern Denmark, Denmark
Jan Baumbach, University of Southern Denmark, Denmark
Short Abstract: In recent years the amount of biological data produced in large-scale experiments has grown rapidly. The increasing number and size of datasets requires computational analyses that search for structure in data. Clustering is a popular unsupervised learning technique to identify patterns in unlabeled inputs by arranging similar entities together in groups. An extensive number of clustering methods has been developed, each providing at least one parameter. Also, more and more cluster validity indices are proposed to evaluate significance of clusterings. These factors continuously increase the overall complexity of cluster studies.
We developed ClustEval, an integrative clustering evaluation framework for objective cluster analysis. It enables the scientist to easily carry out cluster studies and provides unbiased and reproducible results. Many clustering methods can be applied to sets of datasets in an automatized way and comparative results are visualized on a website. ClustEval automatizes the identification of good parameter values for the clustering methods. Applicability of methods to datasets is ensured by introducing standard formats. ClustEval can generate synthetic datasets to evaluate clustering methods on inputs with defined characteristics.
O027 - Comparison of heuristic methods for the identification of switched hybrid systems
Susana Vinga, IDMEC, Portugal
Andras Hartmann, IDMEC/IST-UL, Portugal
João M. Lemos, INESC-ID/IST-UL, Portugal
Short Abstract: One limiting assumption of many mathematical models describing dynamic systems is that the parameters are time-invariant during the observation period. However, this premise does not necessarily hold for many biological systems; for example, intra-individual variability is typical of biological and medical systems.
Hybrid time-varying parameter models comprise both discrete and continuous elements, and provide a suitable framework for biological data modeling. Typically, while the states of the system are modeled with continuous dynamics, the parameters that can be subject to changes exhibit discrete states.
This work considers the problem of parameter identification for switched hybrid systems. The identification of such systems typically results in non-convex optimization problems, where finding the globally optimal solution exhibits exponential computational complexity in the size of the input. Such complexity may, however, not be tractable even for medium-sized problems. Another approach involves heuristics in order to deliver estimates in tractable time, with the trade-off that the estimates are only approximate solutions.
Three recently proposed algorithms for switched ARX system identification are compared. We consider segmentation with regularization, Expectation Maximization with Particle Filtering, and identification using sum-of-norms. Statistical measures are introduced in order to quantitatively compare the performance of the different methods on a simulated one-dimensional example. The individual behavior of the methods is also analyzed together with the computational complexity.
These methods can be successfully applied to the identification of several types of biological systems, from pharmacokinetic/pharmacodynamics (PK/PD) models to metabolic networks and cell growth dynamics.
O028 - Multiscale mathematical modelling recapitulates breast cancer invasion phenotypes
Arnau Montagud, Institut Curie, France
Andrei Zinovyev, Institut Curie, France
Emmanuel Barillot, Institut Curie, France
Short Abstract: Understanding tumour invasion mechanisms is crucial to improve prognosis and develop new cancer treatment strategies, but this is hindered by the lack of understanding of the detailed molecular determinants of this process and of the interactions among them that lead to the different ways cancer cells invade the surrounding tissues. Tumour invasion varies from individual to collective cell movement, in whether migrating cells are mesenchymal- or amoeboid-like, and in whether they use proteases to facilitate their migration. In the past years several efforts have been made to systematise the different mechanisms of cell migration and to understand their underlying causes.
We devised a multi-scale mathematical model that incorporates information of a series of traits, cellular and environmental, that output in a set of invasion modes. For this, the model incorporates different intracellular and signalling pathways and the resulting influence network has been translated into a mathematical model using discrete logical modelling.
We have taken advantage of continuous time Boolean modelling based on a Markovian stochastic process defined on the model state transition graph to simulate intracellular molecular processes determining individual cellular properties. We have embedded this Boolean model in a lattice-free individual cell population model to cope with interaction between cells and microenvironment affecting cell properties, leading to various patterns of collective cell behaviour. The model has been tuned to phenotypes observed in existing experimental data on tumours, cell lines and cell spheroids. The present work is part of a collaborative effort to model tumour invasion in order to identify treatment strategies and to understand underlying properties of metastasis.
O029 - Cellular environment shapes tissue-specificity of cancer genes
Martin Schaefer, Centre for Genomic Regulation, Spain
Luis Serrano, Centre for Genomic Regulation, Spain
Short Abstract: One of the biggest mysteries in cancer research remains why mutations in certain genes cause cancer only at specific sites in the human body. The poor correlation between the expression level of a cancer gene and the tissues in which it causes malignant transformations raises the question of which factors determine the tissue-specific effects of a mutation. Here, we explore why some cancer genes are associated only with few different cancer types (i.e., are specific), while others are found mutated in a large number of different types of cancer (i.e., are general). We do so by computationally contrasting functions of general and specific cancer genes and by investigating properties of cancer genes in host-pathogen and toxicogenomic networks. By doing so, we identify mechanisms by which the differential exposure to environmental mutagens modulates the effect of a disease mutation across tissues and thereby can explain the different associations between cancer genes and tissues.
O030 - Inexact Multiple Network Alignment using Parallelized Ant Colony Optimization
Nicolas Alcaraz, University of Southern Denmark, Denmark
Frederik G. Alkaersig, University of Southern Denmark, Denmark
Simon Larsen, University of Southern Denmark, Denmark
Katrine L. Staehr, University of Southern Denmark, Denmark
Jan Baumbach, University of Southern Denmark, Denmark
Short Abstract: We have arrived at the post-genomic era where high-throughput OMICs technologies together with the emergence of systems biology are exponentially increasing the quantity of biological interaction information stored in public databases. To be able to exploit this new wealth of information and to produce novel biological insights, sophisticated computational methods are required.
Here we deal with one of the central problems in systems biology: multiple network alignment of large biological networks. Solving this problem has the potential of providing great aid in the functional understanding of complex biological pathways and their evolution, which can result in important applications such as the discovery of novel biomarkers against diseases. However, current graph alignment methodology is unable to cope with the size and number of today’s biological networks, where aligning only two networks is already a proven NP-hard optimization problem. In addition, one needs to cope with noisy and incomplete interaction information.
We propose tackling an inexact version of the multiple network alignment problem by developing a powerful dedicated metaheuristic approach based on Ant Colony Optimization, and combining it with massive CPU and GPU parallelization.
O031 - Modeling and analysis of atherosclerosis process using time Petri nets
Marcin Radom, Institute of Computing Science, Poznan University of Technology, Poland
Dorota Formanowicz, Department of Clinical Biochemistry and Laboratory Medicine, Poznan University of Medical Sciences, Poland
Piotr Formanowicz, Institute of Computing Science, Poznan University of Technology; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poland
Short Abstract: Atherosclerosis is a common phenomenon about which our knowledge, e.g. of its causes, is still insufficient. Therefore its treatment, or even the prevention of its clinical consequences, is still not fully effective. Enormous progress has been made in understanding the causes of atherosclerosis, and it is now well known that this process is caused by many different factors, e.g. bad nutritional habits, low physical activity, hypertension, inflammation and oxidative stress. The last two are actively involved in lipid peroxidation. Lipids modified in this way are trapped by macrophages without limitation and become a substrate for the atherosclerotic plaque. In this study we propose a systems approach to study this complex process, and for this purpose a time Petri net based model has been constructed. For such a model an analysis based on invariants, MCT-sets and clusters is possible, as it is for classical Petri nets, but the time factor provides new important data that must be considered. Combining the time information associated with the net transitions with the classical analysis based on invariants allows us to draw new, biologically interesting conclusions about the modeled process of atherosclerosis.
O032 - A comparative analysis of the lipids metabolism changes in patients suffering from chronic kidney disease and healthy controls – a Petri net approach
Adam Kozak, Institute of Computing Science, Poznan University of Technology, Poland
Dorota Formanowicz, Department of Clinical Biochemistry and Laboratory Medicine, Poznan University of Medical Sciences, Poland
Piotr Formanowicz, Institute of Computing Science, Poznan University of Technology; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poland
Short Abstract: Dyslipidemias are very common complications of chronic kidney disease (CKD). Disturbances in lipoprotein metabolism are evident even at the early stages of CKD and usually worsen with the deterioration of renal function. The characteristic lipid abnormalities seen in CKD are elevated triglycerides, normal/reduced total cholesterol (TC) and decreased High Density Lipoprotein (HDL). Low Density Lipoprotein (LDL) levels are not markedly raised, but the LDL particles tend to be more atherogenic. These changes are influenced by many traditional and non-traditional cardiovascular risk factors, whose mutual networks of links justify a systems approach to the analyzed phenomenon. In this work such a systems approach to the study of this complex phenomenon is presented. For this purpose Petri net based models of the lipid metabolism disturbances found among CKD patients and healthy controls have been built. Afterwards these models were analyzed by generating MCT-sets and t-clusters, calculated on the basis of a set of t-invariants, followed by the use of an appropriately matched clustering method. Then, comparisons between the models have been made, which allowed us to draw interesting biological conclusions.
O033 - Exploring patterns of chromatin modification and binding proteins to identify Cis-regulatory modules (CRMs) driving the restricted expression of genes in dorsal cells of D. melanogaster embryos
Calixto Dominguez, Fundación Ciencia y Vida, Chile
David Medina, Universidad de Talca, Chile
Verónica Cambiazo, Laboratorio de Bioinformatica y Expresión Génica, INTA-Universidad de Chile. Fondap Center for Genome Regulation (CGR), Chile
Tomas Perez-Acle, Computational Biology Lab (DLab), Fundación Ciencia y Vida, Chile
David S Holmes, Center for Bioinformatics and Genome Biology, Fundación Ciencia y Vida. Faculty of Biological Sciences, Andres Bello University, Chile
Short Abstract: Patterning in multicellular organisms has been studied intensively and more recently modeled at several levels. Although the primary DNA sequence encodes the regulatory program for each cell type, epigenetic modifications, including chromatin modifications, can modulate the interpretation of this program. These chromatin modifications, in conjunction with the driving control of CRMs on gene expression, help orchestrate the diversity of fly cell phenotypes. We focused on dorsal cells during the early stages of D. melanogaster embryogenesis, where the Decapentaplegic (Dpp) ligand controls their patterning. An unsupervised learning algorithm based on a multivariate hidden Markov model was used to model chromatin segments. To develop this model, we used modENCODE chromatin immunoprecipitation tracks, transcription factor binding data and chromatin accessibility tracks at different developmental stages in fly embryogenesis. A multistate model was obtained that allowed us to recover a group of known CRMs restricted to the dorsal cells of fly embryos. Notably, 3 new CRMs were discovered that drive the expression of genes controlled by the Dpp pathway.
O034 - A novel generation of drug pathway elucidates new indication of existing drugs
Min Oh, Gachon Univ., Korea, Rep
Youngmi Yoon, Gachon Univ., Korea, Rep
Short Abstract: Revealing drug mechanism is limited by our incomplete knowledge about drug action such as its targets and their downstream pathways. This limitation also restricts the opportunity to identify drug repositioning candidates and to understand side effects of drugs. Here we developed a novel method to identify drug pathways by linking 5 types of genes relevant to drug response that include target genes, variant genes, differentially expressed genes, side-effect genes and disease genes. Based on the assumption that drug targets initiate a cascade of signaling network that ultimately affects disease phenotype, our method connects drug targets to disease genes through the drug-response-relevant genes and highly reliable interactions. We built an integrated gene interaction network to draw drug pathway using protein-protein interactions, metabolic interactions and transcriptional interactions. The drug pathway is constructed for each drug and shows significant enrichment in known drug response pathway. Furthermore, we used the drug pathways to predict new indications and possible side-effects for existing drugs. The correlation between drug pathways illuminates functionally related drug clusters. In addition, drug pathway-based classifier proposes novel drug repositioning candidates and possible side-effects.
O035 - Network-Regularization Improves Prediction of Influenza Vaccination Response
Stefan Avey, Yale University, United States
Steven Kleinstein, Yale University, United States
Short Abstract: Seasonal influenza viruses cause thousands of deaths annually worldwide and result in widespread disease and health care burden. While recommended for most individuals, the efficacy of the seasonal influenza vaccine is relatively low and host determinants of successful antibody response are poorly understood. Thus, identifying clinically relevant signatures of response is crucial to improve both vaccine delivery and design. Advances in high-throughput technologies over the last decade have resulted in large repositories of biological knowledge often summarized in gene interaction networks. Few others have attempted to utilize this prior biological knowledge for prediction as it remains unclear how to best leverage this a priori knowledge. We apply a network-regularized sparse partial least squares algorithm to incorporate prior knowledge from gene networks into prediction of vaccination response from baseline gene expression data. Models trained using biological networks outperform random networks on an independent test set. Furthermore, the use of prior knowledge improves the interpretability of the gene signatures, as demonstrated by increased pathway enrichment. In this study, we propose a framework for incorporating prior biological knowledge into prediction of influenza vaccination response and show that this improves both the accuracy and interpretability of the resulting model.
O036 - Identification of novel targets for the inhibition of Ets-1 related cancer progression
Guillaume BRYSBAERT, CNRS UMR 8576 UGSF, France
Short Abstract: The Ets-1 oncoprotein is a transcription factor that promotes target gene expression in specific biological processes. Most of the time Ets-1 activity is low in healthy cells, but elevated levels of expression have been found in cancerous cells, specifically related to tumor invasion. In this context, we have recently identified two DNA repair enzymes, PARP-1 and DNA-PK, as novel interaction partners and identified their domains of interaction. We showed that interactions between Ets-1 and these DNA repair enzymes are important for cancer cell survival, which may therefore constitute therapeutic targets. Nevertheless nothing is known of the molecular details of these interactions or of the interaction and regulatory networks of these proteins.

Our goal is to characterize these interactions at an atomic level (to inhibit them afterward), and to identify new therapeutic targets for the inhibition of Ets-1 related cancer progression. We have identified proteins that contain domains homologous to the interaction domains of Ets-1 and these DNA-repair enzymes. Using a protein-protein docking approach, we have characterized at the atomic level the interaction between Ets-1 and new partners, and between some homologs. We have also constructed the protein-protein interaction and regulatory networks of Ets-1, its partners and homologs, then mapped some public expression data onto these networks.

Our work provides understanding of the role of Ets-1 in cellular signaling, specifically in relation to dysregulation of target genes and tumor invasion, and allows the identification of novel targets for the inhibition of Ets-1 related cancer progression.
O037 - Analyzing T helper 17 cell differentiation dynamics using a novel integrative modeling framework for time-course RNA sequencing data
Jukka Intosalmi, Aalto University, Finland
Helena Ahlfors, The Babraham Institute,
Sini Rautio, Aalto University, Finland
Zhi Jane Chen, Turku Centre for Biotechnology, University of Turku and Åbo Akademi, Finland
Riitta Lahesmaa, Turku Centre for Biotechnology, University of Turku and Åbo Akademi, Finland
Brigitta Stockinger, Division of Molecular Immunology, Medical Research Council National Institute for Medical Research,
Harri Lähdesmäki, Aalto University, Finland
Short Abstract: The differentiation of naive CD4+ helper T (Th) cells into effector Th17 cells is steered by extracellular cytokine signals that activate and control the lineage specific transcriptional program. Recent experimental studies provide a plethora of information about the Th17 lineage specific regulatory network but precise mechanistic understanding of the transcription factor dynamics is yet to be attained. In this study, we construct a detailed description for the dynamics of the core network driving the Th17 cell differentiation and use this description to implement alternative quantitative models in the form of ordinary differential equations (ODEs). The ODE models consist of two lineage specific inducing cytokine signals (TGFβ and IL6) as well as mRNA and protein levels for three key genes (STAT3, RORγt, and FOXP3). Further, we combine the ODE models with time-course RNA-seq measurements using a novel statistical framework designed specifically for sequencing data and, based on rigorous statistical modeling, quantify the evidence for alternative models. Our results show significant evidence, for instance, for inhibitory mechanisms between the transcription factors and also confirm that our description of dynamics is on a feasible level to explain the data. Besides these findings related to our application to T cell biology, we discuss the role of our statistical framework which is based on the well-established characterization of sequencing count data as well as the state-of-the-art computational and statistical methods, including population based MCMC and thermodynamic integration, that are used to obtain the results.
O038 - Detection of Composite Communities in Multiplex Biological Networks through Mathematical Programming
Sophia Tsoka, King's College London,
Laura Bennett, University College London,
Aristotelis Kittas, King's College London,
Lazaros Papageorgiou, University College London,
Short Abstract: The detection of community structure in complex networks is a widely accepted means of investigating the principles governing the organisation of biological systems. Recent efforts are exploring ways in which multiple data sources can be integrated to generate a more comprehensive model of cellular interactions, leading to the detection of more biologically relevant communities (1). Previously, we have shown that mathematical programming is an efficient and flexible means of modeling such network analysis tasks (2, 3). Here, we present a mathematical programming model that aims to cluster multiplex biological networks, i.e. multiple network slices, each with a different interaction type, to determine a single representative partition of composite communities. Our method is evaluated through its application to yeast networks of physical, genetic and co-expression interactions. Comparative analyses involving partitions of the individual networks, partitions of aggregated networks and partitions generated by similar methods from the literature highlight the ability of our model to identify functionally enriched modules. It is further shown that our method offers enhanced results when compared to existing approaches, without the need to train on known cellular interactions.
1. P. J. Mucha, et al. “Community structure in time-dependent, multiscale, and multiplex networks”, Science, 328, 876–878, 2010.
2. L. Bennett, et al. “Community Structure Detection for Overlapping Modules through Mathematical Programming in Protein Interaction Networks”, PLoS ONE, 9(11): e112821, 2014.
3. L. Bennett, et al. “Detection of Disjoint and Overlapping Modules in Weighted Complex Networks”, Advances in Complex Systems, 15(5), 1150023-1, 2012.
O039 - Pathway relevance ranking for tumor samples through network-based data integration
Lieven Verbeke, Ghent University / iMinds / IBCN, Belgium
Jimmy Van den Eynden, Ghent University / iMinds / IBCN, Belgium
Piet Demeester, Ghent University / iMinds / IBCN, Belgium
Kathleen Marchal, / iMinds / IBCN, Belgium
Jan Fostier, / iMinds / IBCN, Belgium
Short Abstract: We present a new pathway relevance ranking method that is able to prioritize pathways according to the information contained in any combination of tumor related omics datasets. Key to the method is the conversion of all available data into a single network representation containing not only genes but also individual patient samples. Additionally, all data are linked through a network of previously identified molecular interactions. The performance of the new method is demonstrated by applying it to breast and ovarian cancer datasets from The Cancer Genome Atlas. By integrating gene expression, copy number, mutation and methylation data, the method’s potential to identify key pathways involved in breast cancer development shared by different molecular subtypes, is illustrated. Interestingly, certain pathways were ranked equally important for different subtypes, even when the underlying (epi)-genetic disturbances were diverse. The pathway ranking method was also able to identify subtype-specific pathways. Often the score of a pathway could only be explained by a combination of genetic and epi-genetic disturbances, stressing the need for a network-based data-integration approach. The analysis of ovarian tumors, as a function of survival-based subtypes, demonstrated the method’s ability to correctly identify key pathways, irrespective of tumor subtype. A differential analysis of survival-based subtypes revealed several pathways with higher importance for the bad-outcome patient group than for the good-outcome patient group. Many of the pathways exhibiting higher importance for the bad-outcome patient group could be related to ovarian tumor proliferation and survival.
O040 - A systems biology approach towards understanding severe bacterial soft tissue infection through network visualization
Erno Lindfors, LifeGlimmer GmbH, Germany
Santhosh Mukundan, University of North Dakota, Grand Forks, North Dakota, United States
Karthickeyan Chella Krishnan, University of North Dakota, Grand Forks, North Dakota, United States
Suba Nookala, University of North Dakota, Grand Forks, North Dakota, United States
Vítor Martins dos Santos, Laboratory of Systems and Synthetic Biology, Wageningen University, Dreijenplein 10, Building 316, 6703 HB Wageningen, The Netherlands
Carolyn Ming Chi Lam, LifeGlimmer GmbH, Germany
Malak Kotb, University of North Dakota, Grand Forks, North Dakota, United States
Short Abstract: Background
Toxic shock syndrome and necrotizing fasciitis (NF), also called "the flesh-eating disease", are highly morbid and often fatal sequelae of invasive infections by Group A Streptococcus (GAS) or Staphylococcus aureus bacteria. Understanding the mechanisms related to the disease progression of NF is a focus of the EU project INFECT. We present a systems biology approach to investigate disease-specific signaling pathways based on quantitative trait loci (QTLs), differentially expressed genes, and available omics data.

We are developing a visualization framework in Cytoscape integrating significant QTL genes with quantitative gene expression, transcriptomics, or proteomics data on top of mammalian signaling pathways (TRANSPATH®) and associating these interactions with superpathways (WikiPathways) to facilitate better analysis of large-scale heterogeneous disease data. Our framework also calculates their statistical enrichment in superpathways for an overall understanding of their biological roles.

As an illustration of investigating pathways associated with skin infection, we used expression data and genes under QTLs modulating survival of a mouse GAS sepsis model (Abdeltawab et al. 2008 PLoS Pathog. 4(4):e1000042) to identify further signaling interactions. Detailed protein-protein/gene interactions plus enrichment analysis showed complex signaling networks for, e.g., Hspa5, Il1a and Psmd5. MAPK signaling, apoptosis, and oxidative damage pathways were enriched for both resistant and susceptible mice, whereas the eicosanoid synthesis pathway was enriched for susceptible mice. Our framework can contribute significantly to advancing knowledge of this disease and assist the discovery of new treatments, especially when omics data are strategically integrated with pathway information to explore new facets of the disease mechanisms.
O041 - flowcatchR: A user-friendly workflow solution for the analysis of time-lapse cell flow imaging data
Federico Marini, Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Germany
Johanna Mazur, Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Germany
Harald Binder, Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Germany
Short Abstract: Automated bioimage analysis is required for reproducible and efficient extraction of information out of time-lapse microscopy data when investigating the in vivo dynamics of complex biological processes.

We developed a comprehensive workflow solution, based on our R/Bioconductor package flowcatchR. Our solution specifically addresses the challenges of blood cell flow data, where cells show dynamic behaviors, classified e.g. into states such as flowing, rolling and adhering. Analysis of the corresponding fast movements is further complicated by cells entering and leaving the field of view, and transitions in and out of focus.

Subject matter knowledge is incorporated to make the analysis feasible. Specifically, we developed a penalty function for a cell tracking algorithm that takes into account the directionality of the flowing cells.
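The idea of a direction-aware penalty can be illustrated with a minimal Python sketch. This is not the flowcatchR implementation (which is an R package); the function names, the assumed flow direction along the x axis and the weight `penalty_weight` are hypothetical choices made only for illustration. The cost of linking two detections grows with their distance and is inflated when the displacement points against the assumed flow.

```python
import math

def directional_cost(prev, curr, flow_direction=(1.0, 0.0), penalty_weight=2.0):
    """Hypothetical linking cost: distance, inflated for motion against the flow."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        return 0.0
    fx, fy = flow_direction
    # Cosine of the angle between the displacement and the flow direction
    cos_angle = (dx * fx + dy * fy) / (distance * math.hypot(fx, fy))
    # cos = 1 (with the flow) adds nothing; cos = -1 (against it) adds the full weight
    return distance * (1.0 + penalty_weight * (1.0 - cos_angle) / 2.0)

def best_link(prev, candidates, **kwargs):
    """Pick the candidate position in the next frame with the lowest cost."""
    return min(candidates, key=lambda c: directional_cost(prev, c, **kwargs))

# A nearby candidate against the flow loses to a farther one along the flow
chosen = best_link((0.0, 0.0), [(-1.0, 0.0), (1.5, 0.0)])
```

With these assumed settings, the upstream candidate at distance 1 costs 3.0 while the downstream candidate at distance 1.5 costs 1.5, so the tracker would prefer the downstream link.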

The construction of our workflow solution, based on an R package implementing the algorithms, a Shiny App, and Jupyter notebooks, may also serve as a good example of how to bridge the gap between sophisticated analysis tools available and end-user requirements in other bioimaging applications.

In particular, we also address deployment to cooperation partners by custom-made Docker containers that provide a fully operative environment where the necessary libraries and dependencies are already provided and ready for use, in a way that is easily accessible to a broad range of life scientists.
O042 - MAGNA++: Maximizing Accuracy in Global Network Alignment via both node and edge conservation
Vipin Vijayan, University of Notre Dame, United States
Vikram Saraph, Brown University, United States
Tijana Milenkovic, University of Notre Dame, United States
Short Abstract: Biological network alignment aims to identify similar regions between networks of different species. Hence, it can be used for across-species transfer of biological knowledge. Existing methods aim to find high-scoring alignments with respect to overall node similarity (or node conservation). However, the accuracy of the alignments is then evaluated using some other measure, such as the number of conserved edges. Thus, the existing methods align similar nodes between networks hoping to conserve many edges, but only after the alignment is constructed. Instead, we introduce MAGNA to directly optimize edge conservation while the alignment is constructed, without decreasing the quality of node mapping. In systematic evaluations against state-of-the-art methods (IsoRank, MI-GRAAL and GHOST), on both synthetic and real-world biological networks, MAGNA outperforms all of the existing methods, in terms of both node and edge conservation as well as both topological and biological alignment accuracy. Our more recent MAGNA++ framework further improves alignment quality by: 1) simultaneously maximizing any one of three different measures of edge conservation (including our recent superior S3 measure) and any desired measure of node conservation, which yields better alignments compared to maximizing only node conservation (as existing methods do) or only edge conservation (as MAGNA does), 2) speeding up the original MAGNA algorithm by parallelizing it to automatically use all available resources, as well as by re-implementing the edge conservation measures more efficiently, 3) providing a friendly GUI for easy use by domain (e.g., biological) scientists, and 4) offering source code for easy extensibility by computational scientists.
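To make the notion of edge conservation concrete, the following Python sketch scores a given alignment. The abstract names only S3; the other two measures shown here (EC and ICS) and all three formulas follow their usual definitions in the network alignment literature and are an assumption about what is meant, and the function and variable names are hypothetical. The sketch evaluates a fixed injective node mapping between simple undirected graphs; it does not perform MAGNA's optimization.

```python
def edge_conservation_scores(edges1, edges2, mapping):
    """Score an injective node mapping from graph 1 to graph 2 (G1 must have edges)."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    # Edges of G1 carried over to G2 node labels by the alignment
    mapped = {frozenset((mapping[u], mapping[v])) for u, v in e1}
    conserved = len(mapped & e2)
    # Edges of G2 whose endpoints are both images of G1 nodes (induced subgraph)
    image = set(mapping.values())
    induced = sum(1 for e in e2 if e <= image)
    return {
        "EC": conserved / len(e1),
        "ICS": conserved / induced if induced else 0.0,
        "S3": conserved / (len(e1) + induced - conserved),
    }

# Toy example: a path a-b-c aligned onto a triangle 1-2-3 with a pendant node 4
scores = edge_conservation_scores(
    [("a", "b"), ("b", "c")],
    [(1, 2), (2, 3), (1, 3), (3, 4)],
    {"a": 1, "b": 2, "c": 3},
)
```

In this toy case both path edges are conserved, so EC is 1, while ICS and S3 are both 2/3 because the aligned region of the second network contains one extra edge. Under these definitions S3 penalizes mismatches on both sides, whereas EC and ICS each look at one network only.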
O043 - Understanding Leishmania Development and Drug Resistance using an Integrative Omics Compendium
Bart Cuypers, University Of Antwerp, Belgium
Pieter Meysman, University Of Antwerp, Belgium
Maya Berg, Institute Of Tropical Medicine, Belgium
Manu Vanaerschot, Institute Of Tropical Medicine, Belgium
Jean-Claude Dujardin, Institute Of Tropical Medicine, Belgium
Kris Laukens, University Of Antwerp, Belgium
Short Abstract: Leishmania donovani causes visceral leishmaniasis (VL), a disease which is lethal without treatment. With only four drugs available and rapidly emerging drug resistance, knowledge about the parasite's resistance mechanisms is essential to boost the development of new drugs. However, little is known about Leishmania's gene regulation, and the few findings indicate major differences from known gene expression systems.

Integration of different 'omics data could shed light on these gene regulatory mechanisms, but there has been little integration effort so far. Therefore, we developed an easy-to-use tool able to collect and connect all the existing L. donovani 'omics experiments. Genomics, epigenomics, transcriptomics, proteomics, metabolomics and phenotypic data were collected and added to a MySQL database compendium, further complemented with publicly available data. Relations between the different 'omics levels were explicitly defined and assigned a level of confidence. Python scripts were developed to preprocess, import and access the data.

Alongside this vast data source, a set of integrative data-analysis tools was developed based on data mining strategies. For example, one tool uses frequent pattern mining algorithms to find which proteins and metabolites frequently behave in the same way under different conditions. Another tool converts several 'omics datasets to a network format that can be opened in Cytoscape and can thus be the basis for network analysis.

Using the compendium, we characterized development and drug resistance in a systems biology context (all 'omics). The compendium and its scripts could be used for other organisms with only minor changes.
O044 - Protein interaction abnormalities in leukaemia: Integrative Systems Biology approach from network to 3D molecular structures
Sun Sook Chung, King's College London,
N Shaun B Thomas, King's College London,
Franca Fraternali, King's College London,
Short Abstract: Systems biology has played a significant role in boosting high-throughput biological applications. This is achieved not only by systematic integration of new incoming biological information, but also by bridging gaps between a wide range of large-scale biological data with the aim of rationalising their functions. In particular, protein-protein interaction networks (PPINs) have been widely applied to discover novel functional associations playing a crucial role in fundamental biological processes. To understand the governing rules of PPINs, our recent study highlighted short loop network motifs as an essential topological feature of PPINs for extracting meaningful biological functions (Chung et al, 2015). One of these motifs was selected because it contained proteins enriched in specific cellular functions such as mRNA metabolic processing and cell cycle activity.
We will apply our integrative systems biology approach to investigate molecular implications of patient-specific variations in acute myeloid leukaemia (AML) by using PPINs and 3D structural complexes. Human PPINs from high-confidence sources are integrated with experimentally verified proteomic analysis of human T lymphocytes during cell-cycle entry performed in our laboratory (Orr et al, 2012), AML related gene mutations extracted from COSMIC (Forbes et al, 2011) and the corresponding 3D protein structures (Lu et al, 2013). The functional effects of the mutations are analysed in PPINs by short loop profiling, predicted by structural modelling and docking, and probed by targeted siRNA silencing experiments.
O045 - Analysis of the Organisation of Interactome using Dominating Sets: a Case Study on Cell Cycle Interaction Networks
Haiying Wang, Ulster University,
Huiru Zheng, Ulster University,
Chaoyang Wang, University of Edinburgh,
Short Abstract: The significance of understanding the organisation of protein interaction networks (PINs) has been well recognized. Recently, scientists have started to examine the dynamics and structure of networks using control theory, aiming to determine a set of key proteins in the control of the underlying interaction networks.

In this study, we focus on the analysis of critical and redundant proteins identified using the Minimum Dominating Sets (MDS) in the analysis of PINs in both yeast and human cell cycles. Based on the integration of the latest, high quality PINs and information on cell-cycle-regulated gene expression, we firstly constructed cell cycle specific PINs in both organisms. A Cytoscape plugin has been developed to determine critical and redundant nodes using the MDS for a given network. A total of 132 yeast genes and 129 human proteins have been identified as critical nodes while 950 in yeast and 980 in human have been categorized as redundant nodes. A clear distinction between critical and redundant proteins was observed when examining their topological parameters including betweenness centrality, suggesting the central role of critical proteins in the PINs. The differences between the two sets of proteins in terms of genomic essentiality, gene coexpression, functional similarity, and the impact on resilience of a network have been assessed. To investigate whether critical proteins predicted to be playing an important role in the control of a PIN carry biological significance, we examined the enrichment level of essential and disease genes in those genes. Detailed results will be presented at the Conference.
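The critical/redundant distinction can be illustrated on a toy network with a short Python sketch. It assumes the usual convention that a critical node belongs to every minimum dominating set and a redundant node belongs to none; this convention, the brute-force enumeration and the function names are assumptions for illustration, not the authors' Cytoscape plugin, and exhaustive search is only feasible for very small graphs.

```python
from itertools import combinations

def minimum_dominating_sets(adj):
    """Enumerate all minimum dominating sets of a small undirected graph."""
    nodes = sorted(adj)
    # Closed neighbourhood: a node dominates itself and its neighbours
    closed = {v: {v} | set(adj[v]) for v in nodes}
    for size in range(1, len(nodes) + 1):
        found = [
            set(combo)
            for combo in combinations(nodes, size)
            if set().union(*(closed[v] for v in combo)) >= set(nodes)
        ]
        if found:
            return found  # the smallest size with any solution is the minimum
    return []

def classify_nodes(adj):
    """Return (critical, redundant): nodes in every MDS and nodes in no MDS."""
    sets = minimum_dominating_sets(adj)
    if not sets:
        return set(), set()
    critical = set.intersection(*sets)
    redundant = set(adj) - set.union(*sets)
    return critical, redundant

# Star graph: the hub is the only minimum dominating set
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
critical, redundant = classify_nodes(star)
```

On the star graph the hub is critical and the three leaves are redundant. On a four-node path, by contrast, every node appears in some but not all minimum dominating sets, so both groups are empty.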
O046 - Muscle-invasive urothelial cancer network signatures inferred from large-scale tumor gene expression datasets
Ricardo de Matos Simoes, Queens University Belfast,
Sabine Dalleau, Queens University Belfast,
Kate E Williamson, Queens University Belfast,
Frank Emmert-Streib, University of Tampere, Finland
Short Abstract: Urothelial cancer (UC) originates from the epithelial lining of the bladder and can progress from non-invasive (NMI) to more aggressive muscle-invasive (MI) subtypes which penetrate the deeper tissue layers of the bladder. We present a novel method that allows the integration of inferred gene regulatory networks and curated and experimental protein networks for the generation of network-based gene signatures. Our method implements a network-based feature space inflation for each individual patient sample by joint expression averages and expression ratios for gene pairs that are defined by a given network structure. The method subsequently performs an elastic net feature selection procedure on the gene pair features for the generation of NMI/MI expression signatures. We performed our analysis separately for a large-scale oligo and a large-scale bead UC microarray dataset and generated NMI/MI signatures for inferred, curated and experimental gene and protein networks. Network-based signatures from gene regulatory networks greatly improved the performance of unsupervised clustering of NMI/MI samples compared to single-gene-based signatures. The gene pair targets of the signatures that we identified with the most prominent differential joint expression ratios, such as EDNRA/POSTN and KRTAP5-2/SHANK2, represent promising novel putative diagnostic targets for subsequent studies in NMI/MI UC tumors. Our results shed new light on the analysis and integration of network and gene expression data for the identification and development of novel diagnostic targets in UC.
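The feature space inflation step can be sketched in Python as follows. This is an illustrative assumption about the construction, not the authors' code: it assumes log2-scale expression values (so that a ratio becomes a difference), the function name and feature labels are hypothetical, the numbers are made up, and the subsequent elastic net selection is omitted.

```python
def gene_pair_features(expression, network_edges):
    """Build per-sample features for each gene pair connected in the network.

    expression: dict mapping gene symbol to a log2 expression value (assumed).
    network_edges: iterable of (gene1, gene2) pairs from a given network.
    """
    features = {}
    for g1, g2 in network_edges:
        if g1 not in expression or g2 not in expression:
            continue  # skip pairs that are not fully measured in this sample
        x, y = expression[g1], expression[g2]
        features[f"{g1}/{g2}:mean"] = (x + y) / 2.0
        # On the log2 scale, the expression ratio is a difference
        features[f"{g1}/{g2}:log_ratio"] = x - y
    return features

# Made-up values for one sample; KRTAP5-2 is absent, so its pair is skipped
sample = {"EDNRA": 8.0, "POSTN": 6.0, "SHANK2": 5.0}
pairs = [("EDNRA", "POSTN"), ("KRTAP5-2", "SHANK2")]
features = gene_pair_features(sample, pairs)
```

Each network edge contributes two features per sample, so the feature space grows with the number of edges rather than the number of genes, which is why a sparse selection procedure is then applied.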
O047 - Detection of Heterogeneity in Single Particle Tracking Trajectories
Paddy Slator, ,
Nigel Burroughs, Warwick Systems Biology Centre,
Short Abstract: Single particle tracking (SPT) data is fundamentally stochastic, which makes the extraction of robust biological conclusions difficult. This is especially the case when trying to detect heterogeneous movement of molecules in the plasma membrane. This heterogeneity could be due to a number of biophysical processes including: receptor clustering, traversing lipid rafts, binding to the cytoskeleton, or changes in membrane diffusivity.

We aim to build statistically robust methods for analysing SPT trajectories.
Working in a Bayesian framework, we have developed multiple models for heterogeneity, such as confinement in a harmonic potential well, and a hidden Markov model where a particle switches between two states with different diffusion coefficients.

We analyse these models using Markov chain Monte Carlo algorithms, which infer the model parameters and hidden states from single trajectories. We also calculate model selection statistics, such as Bayes factors, to determine the most likely model given the trajectory. Our methodology also accounts for localisation accuracy.

We have applied our algorithms to experimental data sets. Analysis of the membrane receptor LFA-1 shows that 8-20% of trajectories display clear switching between diffusive states. Analysis of lipids in a model membrane system shows transient trapping in harmonic potential wells. We have also demonstrated that allowing for localisation accuracy is essential, as otherwise false detection of heterogeneity may be observed.
O048 - Adding Activations and Inhibitions in Biological Network Motifs Analysis to Investigate Relationships Between Protein Characteristics and Topological Structures.
Alberto Calderone, Institute for Systems Analysis and Computer Science, Italy
Daniele Santoni, Institute for Systems Analysis and Computer Science, Italy
Paola Bertolazzi, Institute for Systems Analysis and Computer Science, Italy
Short Abstract: Natural networks of interacting entities, such as genes or proteins, are complex structures that exhibit several regularities and properties. Besides having specific properties such as long-tailed node degree distributions and characteristic average path length, they exhibit recurring topological motifs, or graphlets. So far, network motif counting has focused on undirected/directed graphs, and some motifs were associated with specific biological processes. On the other hand, signaling databases such as SignaLink and KEGG can be queried to assemble directed networks where edges also carry information about the effect that one protein has on the target protein.
Among the different recurring motifs that occur in a graph, triads and tetrads are of particular interest in biological networks. Feed Forward Loops and Feed Back Loops are triads that have already been analyzed, for instance, in bacteria and yeast. As far as tetrads are concerned, even though some configurations are frequent in biological networks, they have not yet been associated with particular processes.
As more data become available, analysis of network motifs taking into account both directionality and effects, i.e. activation or inhibition, can be used, together with machine learning approaches, to better understand how proteins are involved in specific biological processes that tend to exhibit recurring network motifs. Preliminary results obtained using motif counts, clustering and GO term enrichment confirmed how triads can be used to discriminate specific biological processes. Furthermore, the inclusion of tetrads, besides consolidating known results, provided new insights about the association between biological processes and network motifs.
O049 - Proper Evaluation of Alignment-free Network Comparison Methods
Omer Yaveroglu, University of California Irvine, United States
Tijana Milenkovic, University of Notre Dame, United States
Natasa Przulj, Imperial College London,
Short Abstract: In ECCB'14, a new alignment-free network comparison method, NetDis [1], was introduced, claiming to be the most precise network distance measure. However, NetDis was not properly compared with state-of-the-art network distances, including the Graphlet Correlation Distance (GCD) [2], which is surprising since both methods are based on graphlets (small, connected, non-isomorphic and induced sub-graphs).

To correct this methodological flaw, we evaluate all state-of-the-art alignment-free network comparison methods to assess how well they can group topologically similar networks. By performing these tests on both synthetic and real-world networks from different domains, we show that GCD is the most accurate, noise-tolerant, and computationally efficient alignment-free network comparison method. Furthermore, our study uncovers that the accuracy of NetDis is strongly dependent on the choice of a network null model, unlike the other graphlet-based methods. Since a well-fitting network null model is not known for most real-world networks, this dependence makes NetDis impractical. Finally, we demonstrate that NetDis cannot reconstruct the phylogenetic relationships of different species, as originally claimed. Overall, our study highlights that GCD is superior to all other alignment-free network comparison methods, including NetDis.

[1] W. Ali, T. Rito, G. Reinert, F. Sun, and C. M. Deane, “Alignment-free protein interaction network comparison”, Bioinformatics, vol. 30, no. 17, pp. i430–i437, 2014.

[2] O. N. Yaveroglu, N. Malod-Dognin, D. Davis, Z. Levnajic, V. Janjic, R. Karapandza, A. Stojmirovic, and N. Przulj, “Revealing the hidden language of complex networks”, Scientific Reports, vol. 4, 2014.
O050 - Structure Learning for Stochastic Reaction Networks
Anna Klimovskaia, Swiss Federal Institute Of Te, Switzerland
Manfred Claassen, ETH Zurich, Switzerland
Short Abstract: Development of new deep high-throughput technologies for single cell measurements enables new approaches to modeling biological systems such as signaling cascades, by means of stochastic reaction networks. Most approaches for mechanistic modeling assume a known topology of the reaction network. However, in some systems this situation only partially applies since the network of reactions is not yet known in its entirety.

We propose a method to simultaneously infer the structure and fit the parameters of a biological system described by a stochastic reaction network with unknown topology. This method assumes a time series setting, where for a set of single cells the abundance of a set of species has been measured at a discrete set of time points. We further assume mass action kinetics and consider the system's moment equations to relate moments in the data to the kinetic parameters. This formulation translates to a convex optimization problem for parameter estimation. We apply this formulation to estimate the parameters of a reaction network that enumerates all possible binary and some unary reactions among the system species. We introduce sparsity-inducing penalties to implicitly perform model selection. We demonstrate model selection performance on synthetic data of the apoptotic receptor subunit. We aim to apply this method to learn the reaction network structure of cancer-related signaling pathways such as TRAIL-induced apoptosis.
O051 - Rapamycin treatment of normal human fibroblasts increases the transcriptional abundance of genes involved in cytokine-cytokine receptor signaling
Kimberly MacKay, University of Saskatchewan, Canada
Zoe Gillespie, University of Saskatchewan, Canada
Brett Trost, University of Saskatchewan, Canada
Christopher Eskiw, University of Saskatchewan, Canada
Anthony Kusalik, University of Saskatchewan, Canada
Short Abstract: Background: Rapamycin is an immunosuppressant drug that is currently used to prevent transplant organ rejection. It is additionally being investigated as a potential therapy for many other diseases. The effect it has on cytoplasmic and genomic function has been extensively studied in model organisms. However, it is unclear what effect rapamycin has on gene expression in normal human primary cells.

Objective: To determine the global impact rapamycin has on gene expression in normal human fibroblasts.

Methods: RNA-seq was performed on proliferative and rapamycin-treated human fibroblasts. SeqMonk was used to calculate the fold-change difference in transcriptional abundance by comparing the read counts of the two datasets. A protein interaction network was constructed based on the genes that had at least a 5-fold change in transcriptional abundance using Cytoscape and ReactomeFI. The resultant network was annotated using biological process, molecular function and cellular component terms from the Gene Ontology Consortium as well as pathway annotation terms from the Kyoto Encyclopedia of Genes and Genomes.
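
The fold-change filter described in the Methods can be sketched in a few lines of Python. This is only an illustration of the idea: the pseudocount, the absence of library-size normalization, the function names and the toy counts are assumptions, and the sketch does not reproduce how SeqMonk computes its values.

```python
import math


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return {gene: log2((treated + c) / (control + c))} for genes in both dicts."""
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


def at_least_n_fold(log2fc, n=5.0):
    """Genes whose abundance changed at least n-fold in either direction."""
    threshold = math.log2(n)
    return sorted(g for g, v in log2fc.items() if abs(v) >= threshold)


# Toy read counts (hypothetical genes, not data from the study).
control = {"gene_a": 9, "gene_b": 1000, "gene_c": 99}
treated = {"gene_a": 99, "gene_b": 1100, "gene_c": 9}
selected = at_least_n_fold(log2_fold_changes(control, treated), n=5.0)
```

Genes selected this way would then form the input list for the network construction step.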

Conclusions: Rapamycin treatment of normal human fibroblasts resulted in 537 genes having a 5-fold or greater change in transcriptional abundance. The network analysis revealed a significant enrichment for genes associated with PI3K-AKT signaling, linking our observations to rapamycin’s established cytoplasmic target. The most significant pathway annotation was cytokine-cytokine receptor interaction with many of these genes belonging to the Interleukin-6 signaling pathway. It is possible that prolonged exposure to rapamycin and the production of cytokines like Interleukin-6 could produce sufficient cellular stress to drive normal human primary cells into senescence.
O052 - ssKSR LIVE: A novel algorithm for unravelling signal coordination from large scale phosphorylation kinetic data
Westa Domanova, University of Sydney, Australia
James Krycer, University of Sydney, Australia
Rima Chaudhuri, University of Sydney, Australia
Fatemeh Vafaee, University of Sydney, Australia
David James, University of Sydney, Australia
Zdenka Kuncic, University of Sydney, Australia
Short Abstract: A growing body of evidence is emerging that the temporal behavior of cellular signalling molecules controls biological responses. Phosphorylation, one of the most prevalent signalling modifications, occurs rapidly in response to environmental changes but for the majority of phosphorylation events the kinase is unknown. To elucidate the underlying topology of signaling cascades from high-throughput data, we need to be able to predict kinase substrate relationships. Currently, prediction algorithms ignore the crucial biological context of kinase substrate relationships. To address this we consider temporal behaviours: given that phosphorylation events occur in a coordinated way, with some kinases being active before others, we predict site-specific kinase substrate relationships from large scale in vivo experiments (ssKSR-LIVE). Applying this to an insulin-stimulated phosphorylation screen we were able to distinguish between the substrates of AKT and RPS6KB1, two kinases with the same consensus motif, and identified IRS-1-S270 as a novel putative AKT site. We subsequently used our ssKSR-LIVE algorithm to predict novel substrates for the kinases driving insulin signaling, shedding light on their role in driving insulin-stimulated biological processes. ssKSR-LIVE can be applied to other high-throughput screens of signal transduction, and thus can be used to improve our understanding of complex diseases caused by dysregulated signalling, including cancer and type 2 diabetes.
O053 - Transcriptome analysis reveals thousands of targets of nonsense-mediated mRNA decay that offer clues to the mechanism in different species
Steven Brenner, University of California, Berkeley, United States
Courtney French, University of California, Berkeley, United States
Gang Wei, Fudan University, China
Anna Desai, University of California, Berkeley, United States
James Lloyd, University of California, Berkeley, United States
Angela Brooks, Broad Institute of MIT and Harvard, United States
Thomas Gallagher, Ohio State University, United States
Li Yang, CAS-MPG Partner Institute for Computational Biology, China
Brenton Graveley, University of Connecticut Health Center, United States
Sharon Amacher, Ohio State University, United States
Short Abstract: Nonsense-mediated mRNA decay (NMD) is an RNA surveillance pathway that degrades aberrant transcripts harboring premature termination codons. However, many genes produce physiological alternative isoforms containing premature termination codons degraded by NMD. This provides a mode of regulation wherein a splicing factor can induce splicing of these alternative isoforms to decrease expression levels. In mammals, a premature termination codon is canonically one that is >50nt upstream of an exon-exon junction (‘50nt Rule’). There is evidence that this rule also holds in Arabidopsis, but not in other eukaryotes. There are also reports that a longer 3’ UTR triggers NMD in plants, flies, and mammals.
To survey the targets of NMD genome-wide in human, zebrafish, and fly, we performed RNA-Seq analysis on cells where NMD has been inhibited via knockdown of UPF1, a critical protein. We found thousands of genes produce alternative isoforms degraded by NMD in the three species. Additionally, we found that the 50nt rule is a strong predictor of NMD degradation in human cells, and has an effect in zebrafish and fly. In contrast, we found little correlation between the likelihood of degradation by NMD and 3’ UTR length in any of the three species.
Based on an extensive literature analysis, we have also produced a model of known splice factor regulatory interactions. Protein-RNA interactions are extensive, and most tested cases reflect auto- and cross-regulation through splicing and NMD, yielding a robust network. We see little evidence of a hierarchy with “master regulators” of splicing.
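The ‘50nt Rule’ mentioned above can be written as a small predicate. The sketch below is a minimal Python version under stated assumptions: positions are 1-based transcript (spliced mRNA) coordinates, the distance is measured from the last nucleotide of the stop codon to the last exon-exon junction, and the function name is hypothetical. Conventions for measuring the distance vary between studies.

```python
from itertools import accumulate


def is_predicted_nmd_target(stop_end, exon_lengths, threshold=50):
    """Apply the 50nt rule to one transcript.

    stop_end: transcript coordinate of the last nucleotide of the stop codon.
    exon_lengths: exon lengths in transcript order.
    Returns True if the stop codon lies more than `threshold` nt upstream
    of the last exon-exon junction.
    """
    if len(exon_lengths) < 2:
        return False  # single-exon transcript: no junction to compare against
    # Cumulative exon ends; every end except the final one is a junction.
    junctions = list(accumulate(exon_lengths))[:-1]
    last_junction = junctions[-1]
    return last_junction - stop_end > threshold


exons = [200, 150, 300]  # junctions at positions 200 and 350
early_stop = is_predicted_nmd_target(250, exons)   # 100 nt upstream of the last junction
late_stop = is_predicted_nmd_target(320, exons)    # only 30 nt upstream
normal_stop = is_predicted_nmd_target(500, exons)  # stop codon in the last exon
```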
O054 - On the power of RNA-seq for predicting miRNA-transcript interactions
Azim Dehghani Amirabad, , Germany
Marcel H. Schulz, Max Planck Institute for Informatics - MMCI, Germany
Short Abstract: MicroRNAs (miRNAs) are small non-coding RNAs that play a critical role in a wide range of biological processes via post-transcriptional gene regulation. Identifying miRNA targets is a critical step toward elucidating their functions in different diseases. In recent years, several computational methods based on miRNA-mRNA sequence complementarity information have been developed. However, the expected false positive rate of sequence-based predictions is still large. In addition, many target relationships are context-specific. Therefore, most approaches incorporate miRNA and gene expression levels to improve prediction accuracy. Because microRNAs most often do not target all transcripts of one gene, using the expression level of the gene may be suboptimal. We challenged traditional microRNA target inference methods and used the estimated transcript expression level instead of the gene expression level as input for our models. We formulated miRNA target interaction prediction using different linear regression models (LASSO, Elastic Net) that can deal with the large number of features encountered.
We show that models based on transcript expression levels show improved prediction performance, independent of the regression method used. In general, recall is increased without sacrificing precision, supporting the idea that using transcript annotation is indeed helpful for predicting miRNA-gene interactions. Additionally, transcript-based models can, for the first time, pinpoint which transcript of the gene is regulated by which miRNA on a genome-wide scale. Overall, we conclude that the transcript-based prediction models introduced in this work are more powerful in predicting miRNA-gene interactions from miRNA and mRNA expression data than established approaches.
O055 - Reconstruction of the temporal signaling network in Salmonella-infected human cells
Gungor Budak, Middle East Technical University, Turkey
Oyku Eren Ozsoy, Middle East Technical University, Turkey
Yesim Aydin Son, Middle East Technical University, Turkey
Tolga Can, Middle East Technical University, Turkey
Nurcan Tuncbag, Middle East Technical University, Turkey
Short Abstract: Salmonella enterica is a bacterial pathogen that usually infects the host through food sources. Translocation of pathogen proteins into host cells changes signaling mechanisms by activating or inhibiting host proteins and eventually modifies the host response network. Using high-throughput ‘omic’ technologies, these changes can be quantified at different levels; however, experimental hits are usually too incomplete to represent the whole signaling system, and some driver proteins stay hidden in the experimental data. A more coherent view of the underlying biological processes and signaling networks can be obtained by using a network modeling approach with reverse engineering principles, where a confident region of the protein interactome is found by inferring hits from omic experiments. Here, we used published temporal phosphoproteomic data of Salmonella-infected human cells and reconstructed the signaling network of the host by integrating interactome and phosphoproteomic data. We have combined two well-established network modeling frameworks, the prize-collecting Steiner forest approach and the integer linear programming based edge inference approach. The resulting network conserves the temporal information, direction of interactions and hidden entities in the signaling, while revealing several pathways such as SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis. The Salmonella effectors' targets in the host cell, such as CDC42, RHOA, 14-3-3δ, the Syntaxin family and Oxysterol-binding proteins, were included in the reconstructed network although they were not in the phosphoproteomic data. Integrated approaches, such as the one presented here, have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infection.
O056 - A Gaussian Process Model for Inferring the Dynamic Transcription Factor Activity
Muhammad Rahman, The University of Sheffield,
Neil Lawrence, The University of Sheffield,
Short Abstract: In molecular biology and genetics, a transcription factor is a protein that binds to specific DNA sequences and controls the flow of genetic information from DNA to mRNA. To develop models of cellular processes, quantitative estimation of the regulatory relationship between transcription factors and genes is a basic requirement. Quantitative estimation is difficult for several reasons. Many transcription factors' activities and their own transcription levels are post-transcriptionally modified, and very often the expression levels of the transcription factors are low and noisy. It is therefore useful to infer the activities of the transcription factors from the expression levels of their target genes. Many computational approaches to time series analysis of gene expression data are not well suited to irregularly spaced time points. Even in the commonly used state-space model, time points must occur at regular intervals. On the other hand, gene expression experiments with regular sampling may not be cost-effective or statistically optimal. Models with irregular time points might be more informative if the time points are selected considering some temporal features. Gaussian processes are not restricted to equally spaced time series data, and Gaussian process regression has already been successfully applied to overcome this issue and analyse time series data. Here we design a special covariance function for a Gaussian process to reconstruct the exact transcription factor activities from a combination of mRNA expression levels and DNA-protein binding measurements. Our model overcomes the restriction of temporal sampling with equally spaced time intervals.
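To illustrate why Gaussian process regression copes naturally with irregularly spaced time points, here is a minimal Python sketch of the posterior mean under a standard squared-exponential covariance. It is not the special covariance function designed in this work; the kernel choice, its hyperparameters, the helper names and the toy measurements are all assumptions made for illustration.

```python
import math


def rbf(t1, t2, length=1.0, variance=1.0):
    """Squared-exponential covariance between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior_mean(times, values, query_times, noise=1e-6, length=1.0):
    """Posterior mean k(q, T) (K + noise*I)^-1 y at each query time q."""
    K = [
        [rbf(a, b, length) + (noise if i == j else 0.0) for j, b in enumerate(times)]
        for i, a in enumerate(times)
    ]
    alpha = solve(K, values)
    return [sum(rbf(q, t, length) * a for t, a in zip(times, alpha)) for q in query_times]


# Irregularly spaced toy measurements (not data from the study).
times = [0.0, 0.5, 2.0, 2.3, 5.0]
values = [0.0, 0.4, 1.0, 0.9, -0.2]
fitted = gp_posterior_mean(times, values, times)
interpolated = gp_posterior_mean(times, values, [1.0, 3.5])
```

Nothing in the computation refers to a sampling interval: the covariance depends only on pairwise time differences, so the same code handles regular and irregular designs.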
O057 - Network analyses to identify protein-interactions responsible for lung and brain metastasis differentiation of breast cancer
Emel Sen, Chemical and Biological Engineering, Center for Computational Biology and Bioinformatics, Koc University, Turkey
Farideh Halakou, Computer Engineering, Center for Computational Biology and Bioinformatics, Koc University, Turkey
Özlem Keskin, Chemical and Biological Engineering, Center for Computational Biology and Bioinformatics, Koc University, Turkey
Attila Gürsoy, Computer Engineering, Center for Computational Biology and Bioinformatics, Koc University, Turkey
Short Abstract: Metastases are known to cause the majority of breast cancer deaths. We aim to find novel genes/proteins and specific pathways that play important roles in the brain and lung metastasis phenotypes of breast cancer using protein-protein interaction (PPI) networks. Human PPI subnetworks of breast cancer are constructed starting from experimentally identified seed genes for the two types of metastasis. The interactions are scored relative to the seed genes using a network-topology based prioritization method (GUILDify). Each scored subnetwork is filtered using experimentally verified interactions reported in the STRING database. In each subnetwork, the 1000 highest-ranked protein interactions are further analyzed both topologically and functionally.
The functional enrichment analysis shows that KEGG pathways associated with the immune system and infectious diseases, particularly the chemokine signaling pathway, are important for lung metastasis. On the other hand, pathways related to genetic information processing are more involved in brain metastasis. The topological analysis identified genes such as RPL5, MMP2, CCR5 and DPP4, which are already known to be associated with lung or brain metastasis. Additionally, we found 6 and 9 putative genes that are specific for lung and brain metastasis, respectively. Our analysis suggests that variations in the genes and pathways contributing to these different breast cancer metastasis types may arise from changes in the tissue microenvironment.
O058 - The Systems Toxicology Challenge: How to Leverage Omics Data to Predict Mechanisms of Toxicity?
Carine Poussin, Philip Morris International R&D, Switzerland
Vincenzo Belcastro, Philip Morris International R&D, Switzerland
Stephanie Boue, Philip Morris International R&D, Switzerland
Alain Sewer, Philip Morris International R&D, Switzerland
Bjorn Titz, Philip Morris International R&D, Switzerland
Nikolai Ivanov, Philip Morris International R&D, Switzerland
Manuel C Peitsch, Philip Morris International R&D, Switzerland
Julia Hoeng, Philip Morris International R&D, Switzerland
Short Abstract: Risk assessment in the context of 21st century toxicology relies on the elucidation and understanding of mechanisms of toxicity. For that purpose, datasets generated by high-throughput technologies (e.g., high-throughput/content screening) combined with various omics data types are now generated in vitro to test large and diverse sets of chemicals (e.g. ToxCast). The development of relevant computational approaches for the analysis and integration of these big data remains challenging and requires qualitative and quantitative evaluation. The current scope of sbv IMPROVER (Industrial Methodology for Process Verification in Research) is the verification of methods and concepts in systems biology research via challenges opened to the scientific community. Previous challenges brought new insights on methods and their associated results that address questions about diagnostic signatures, the translatability of biological responses/processes across species, and the relevance of biological causal network models. A new sbv IMPROVER challenge will be introduced that aims at evaluating (i) methodologies for the identification of specific biomarkers of exposure and (ii) the predictability by omics data of toxicity mechanisms when cells/tissues in vitro or whole organisms are exposed to individual chemical molecules or mixtures. Participants will be provided with high-quality data sets to develop predictive models/classifiers. For this challenge, the integration of a priori biological knowledge in the development of computational approaches may be required to enable biological interpretability/understanding of the predictions. The results and post-challenge analyses will be shared with the scientific community, and will open new avenues in the field of systems toxicology.
O059 - Co-expression network analysis to identify pluripotency biomarkers in bovine and porcine embryos
Gianluca Mazzoni, University of Copenhagen, Denmark
Kristine Freude, University of Copenhagen, Denmark
Vanessa Jane Hall, University of Copenhagen, Denmark
Kaveh Mashayekhi, BioTalentum, Hungary
Poul Hyttel, University of Copenhagen, Denmark
Andras Dinnyes, BioTalentum, Hungary
Haja Kadarmideen, University of Copenhagen, Denmark
Short Abstract: Differentiated somatic cells can be reprogrammed into induced pluripotent stem cells (iPSCs), a cell type with great potential in regenerative medicine and in vitro disease modeling. In the pig, we have developed iPSCs, but proper culture conditions for maintaining pluripotency over time are still lacking. Hence, there is a need for a more fundamental dissection of the pluripotency apparatus in the pig as well as in cattle.
The aim of this study is to analyze RNA-seq data to increase knowledge of the biological pathways in porcine and bovine embryonic pluripotent cell populations, using mouse data as a proof of principle. In particular, we studied cell populations from three different stages of pluripotency after fertilization: the inner cell mass, the epithelial epiblast and the gastrulating epiblast.
Read quality was checked with FastQC; the reads were then pre-processed using Prinseq and mapped with the STAR aligner, yielding a minimum of 80% uniquely mapped reads per sample. Post-mapping quality control with Qualimap showed that a minimum of 60% of reads per sample mapped to exonic regions. Finally, the expression levels were estimated using HTSeq.
Gene co-expression will be analyzed using a weighted network-based method to identify highly co-expressed genes (modules) and hub genes. Modules with a potential role in pluripotency will then be identified with an enrichment procedure, and regulator genes will be identified with the LemonTree algorithm. Finally, differential wiring of the modules among species will be evaluated.
ACKNOWLEDGEMENTS: We acknowledge financial support from the EU project PluriSys, HEALTH-2007-B-223485.
O060 - A pipeline for the efficient analysis of E. coli RNA-seq time series data in the RecogNice project
Günter Jäger, Anrede, Germany
Karin Schäferhoff, Institute for Medical Genetics and Applied Genomics, University of Tübingen, Germany
Sven Poths, Institute for Medical Genetics and Applied Genomics, University of Tübingen, Germany
Vanessa Vosseler, Institute for Medical Genetics and Applied Genomics, University of Tübingen, Germany
Joana Simen, Institute for Biochemical Engineering, University of Stuttgart, Germany
Michael Löffler, Institute for Biochemical Engineering, University of Stuttgart, Germany
Ronny Feuer, Institute for System Dynamics, University of Stuttgart, Germany
Oliver Sawodny, Institute for System Dynamics, University of Stuttgart, Germany
Michael Ederer, Institute for System Dynamics, University of Stuttgart, Germany
Georg Sprenger, Institute for Microbiology, University of Stuttgart, Germany
Ralf Takors, Institute for Biochemical Engineering, University of Stuttgart, Germany
Olaf Rieß, Institute for Medical Genetics and Applied Genomics, University of Tübingen, Germany
Short Abstract: Cells in industrial-scale fermenters are frequently exposed to different growth conditions, arising from CNO substrate fluctuations, as they travel through the reactor. This can lead to performance losses. In the BMBF-supported research project “RecogNice”, the development and optimization of production processes is simulated in lab-scale fermenters using the model organism Escherichia coli. By applying oxygen, nitrogen and carbon limitations and monitoring gene expression levels at different zones and time points, the interplay between substrate-gradient based stimuli and the dynamic metabolic and transcriptional response is studied. RNA-seq produces sequenced reads of expressed coding and non-coding transcripts that have to be further processed using bioinformatics approaches. To address this, we have built an automated workflow for data processing, including quality control of the raw sequence reads, read mapping and count table generation. Furthermore, variant detection and annotation is possible during the workflow, providing further insights into gene regulation or loss of gene function. The data processing is followed by an automated statistical analysis procedure for read count normalization, quality control, and filtering as well as statistical evaluation of gene expression levels at different time points. The pipeline is realized by combining state-of-the-art bioinformatics tools into a common framework that is both fast and easy to use, enabling the processing of hundreds of samples in a single run. Because of its generic design, the pipeline can also easily be transferred to a broader range of applications.
O061 - Developing a web-based tool for drug-induced toxicity prediction and analysis in multi-organs
Jinwoo Kim, Kyoungpook National University, Korea, Rep
Erkhembayar Jadamba, Kyoungpook National University, Mongolia
Miyoung Shin, Kyoungpook National University, Korea, Rep
Short Abstract: To understand a compound, it can be of great use to predict its toxic effects on organs such as liver or kidney. For this purpose, we have developed a web-based tool that predicts drug-induced toxicity effects in multiple organs (including liver and kidney) from gene expression profiles. The required inputs are raw CEL files of Rattus norvegicus liver and kidney samples exposed to the target compound. To predict its toxicity, we trained classification models on expression data for 135 drugs obtained from a toxicity-related database. These models then predict, for each given sample, whether a toxic effect would occur in liver or kidney. In addition, our tool can discover pathways that are active in both organs. To identify active pathways, we defined an activity score for each pathway that depends on how strongly the significant genes involved in the pathway are differentially expressed between case and control samples. The identified pathways are reported in a table along with their statistical significance (p-value). They are also visualized as organ-specific colored nodes in a pathway network. An edge is defined between two pathways if their gene sets share one or more significant genes.
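The edge definition for the pathway network (two pathways are linked if their gene sets share at least one significant gene) can be sketched in Python as follows. The function name, the dictionary layout and the toy gene sets are assumptions for illustration and do not come from the tool itself.

```python
from itertools import combinations


def pathway_edges(pathway_genes, significant_genes):
    """Return {(pathway_a, pathway_b): shared significant genes}.

    pathway_genes: {pathway name: iterable of gene identifiers}
    significant_genes: iterable of genes called significant.
    """
    significant = set(significant_genes)
    # Keep only the significant members of each pathway.
    hits = {name: set(genes) & significant for name, genes in pathway_genes.items()}
    edges = {}
    for a, b in combinations(sorted(hits), 2):
        shared = hits[a] & hits[b]
        if shared:  # one or more shared significant genes defines an edge
            edges[(a, b)] = sorted(shared)
    return edges


# Toy input (hypothetical pathways and genes).
pathways = {"P1": ["g1", "g2", "g3"], "P2": ["g2", "g4"], "P3": ["g5"]}
edges = pathway_edges(pathways, significant_genes=["g2", "g5"])
```

Here only P1 and P2 are connected, because g2 is the sole significant gene they have in common; P3 remains an isolated node.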
O062 - Identification of common longevity networks in C. elegans
Cedric Debes, CECAD, Germany
Andreas Beyer, University of Cologne, Germany
Adam Antebi, Max Planck Institute for Biology of Ageing, Germany
Christoph Dieterich, Max Planck Institute for Biology of Ageing, Germany
Manopriya Chokkalingam, University of Cologne, Germany
Yidong Shen, Max Planck Institute for Biology of Ageing, Germany
Roman-Ulrich Muller, University of Cologne, Germany
Short Abstract: Insulin signalling, dietary restriction, mitochondrial respiration, and hypoxia signalling pathways are known to affect longevity in C. elegans. Although these pathways have been extensively studied in isolation, it remains open whether they act on a common molecular endpoint that ultimately modulates lifespan. Here, we present an integrated analysis of these longevity pathways in order to identify common network modules regulating longevity. This study utilizes gene expression data from different C. elegans knock-out strains that simulate three life-extending conditions. Using network smoothing on protein interactions, we found three network modules that are commonly targeted by the different longevity pathways. Those modules implement cellular functions related to reproduction and lipid metabolism.
O063 - TREMPPI - a Toolkit for Reverse Engineering of Molecular Pathways through Parameter Identification
Adam Streck, Freie Universität Berlin, Germany
Heike Siebert, Freie Universität Berlin, Germany
Short Abstract: TREMPPI is a new, visual tool for the construction, validation, and analysis of models of molecular regulation and signalling. The overall aim is to provide assistance for both reverse engineering and experimental design. The main application area is systems with a low number of components that nevertheless exhibit complex, non-linear behaviour, e.g. the circadian clock. Our formalism is based on a generalization of asynchronous Boolean networks, the so-called Thomas networks. This framework provides a very high level of abstraction and is especially applicable when the exact values of the kinetic parameters are unknown. TREMPPI focuses on systems for which the data are not sufficient for the construction of a full mechanistic model. The user is provided with the means for encoding a variety of phenotypic data, e.g. regulatory constraints and time series. Based on these, a pool of models that fit the phenotype is created. These models can additionally be scored by their properties, such as simulation behaviour or strengths of interactions. Lastly, TREMPPI provides tools for comparative statistical analysis of model sets, allowing novel knowledge to be obtained through data mining.
Available at:
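For readers unfamiliar with asynchronous dynamics, the following Python sketch shows the Boolean special case: from a given state, each successor changes exactly one component towards the value demanded by its update rule. It is only an illustration and does not use TREMPPI; Thomas networks additionally allow multi-valued components, and the function name and the two-component example are assumptions.

```python
def async_successors(state, rules):
    """Asynchronous successors of a Boolean state.

    state: {component: 0 or 1}
    rules: {component: function(state) -> 0 or 1}
    Each successor differs from `state` in exactly one component.
    """
    successors = []
    for name in sorted(rules):
        target = rules[name](state)
        if target != state[name]:
            nxt = dict(state)
            nxt[name] = target
            successors.append(nxt)
    return successors


# Toy negative feedback loop: A activates B, B inhibits A.
rules = {
    "A": lambda s: 1 - s["B"],
    "B": lambda s: s["A"],
}

state = {"A": 0, "B": 0}
trajectory = [state]
for _ in range(4):
    state = async_successors(state, rules)[0]
    trajectory.append(state)
```

In this loop every state has a single successor and the trajectory returns to its starting state after four steps, a simple discrete analogue of an oscillator.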
O064 - A critical evaluation of algorithms for tissue-specific metabolic model reconstruction
Miguel Rocha, Universidade do Minho, Portugal
Sara Correia, Universidade do Minho, Portugal
Short Abstract: Recently, genome-scale metabolic models have been used to predict and study the metabolism of several organisms. In particular, human metabolic models have been applied to drug discovery, biomarker identification and the study of diseases such as cancer or Alzheimer's disease.
However, the human organism is composed of several cell types and tissues, which have different metabolic profiles and functions. In this context, it is essential to address the reconstruction of tissue-specific metabolic models to predict metabolic phenotypes of different cell types.
Recently, some approaches have been presented to address this challenge, namely the Model-Building Algorithm (MBA), Metabolic Context specificity Assessed by Deterministic Reaction Evaluation (MCADRE), and Task-driven Integrative Network Inference for Tissues (tINIT). All these methods use a generic metabolic model as a template and integrate evidence from omics data, literature and/or network analysis to reconstruct the tissue-specific metabolic model.
Here, we present a systematic analysis of the results of these methods for the reconstruction of models of distinct tissues and cancer cell lines. Different omics data combinations were used as inputs and their impact on the results has also been evaluated.
The results show that omics data sources have a poor overlap and have a significant impact on the final models. Indeed, these are very dependent on the combination of method and omics data source, but seem to depend more on the latter. This suggests that a priori omics data integration should be conducted before the reconstruction process or, alternatively, methods for this task should be able to take into account diverse types of data in their optimization processes.
O065 - Comparative genome-scale reconstruction of gapless metabolic networks for present and ancestral species
Merja Oja, VTT Technical Research Centre of Finland, Finland
Esa Pitkänen, University of Helsinki, Finland
Peter Blomberg, VTT Technical Research Centre of Finland, Finland
Sandra Castillo, VTT Technical Research Centre of Finland, Finland
Dorothee Barth, VTT Technical Research Centre of Finland, Finland
Greg Medlock, University of Virginia, United States
Anna Blazier, University of Virginia, United States
Jason Papin, University of Virginia, United States
Merja Penttilä, VTT Technical Research Centre of Finland, Finland
Mikko Arvas, VTT Technical Research Centre of Finland, Finland
Short Abstract: Background: We introduce our novel computational approach CoReCo for comparative metabolic reconstruction and present new results for bacterial reconstructions.

Method: Leveraging the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionarily distant species, such as genomes assembled from metagenomics data. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in 13C flux analysis. Our current implementation of CoReCo contains a new, improved database of metabolic reactions, whereas KEGG was used in the original CoReCo [1]. Compounds and reactions from ChEBI, KEGG, Rhea, YMDB and HMDB, and the genome-scale metabolic models of several microorganisms, were combined to create a database of ~11000 electron- and atom-balanced reactions.

Results: We will present results of the latest analysis of our bacterial and fungal models. In particular, experimental verifications for models built at VTT have been carried out at the University of Virginia, and CoReCo models have been found to fit experimental data well.

Conclusion: Our reconstruction method allows comparative reconstruction of an arbitrary number of species from sequence and phylogenetic data. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments.

[1] Pitkänen et al. (2014). PLoS Computational Biology, 10(2), e1003465.
O066 - Integrate everything but the kitchen sink: Data set selection and sensitivity estimation in collective factor models
Marinka Zitnik, University of Ljubljana, Slovenia
Blaz Zupan, University of Ljubljana, Slovenia
Short Abstract: BACKGROUND: Molecular biology data is rich in volume as well as heterogeneity. We can view individual data sets as relations between objects of different types, for example, function annotations describe relationships between genes and functions. We represent a large data compendium with a multiscale and multiplex relation graph. Recently, latent factor models were developed to fuse such representations and collectively infer accurate prediction models (Zitnik & Zupan, IEEE TPAMI 2015). Here, we are interested in how changes in one relation (data set) affect the latent model of another relation in the context of a given collective latent factor model. For example, in a user-movie recommendation system, how would a change of casting affect user's movie preferences? In bioinformatics, how would a change in gene expression data influence prediction of gene-disease associations?

RESULTS: We have developed Forensic, an approach to estimate sensitivity between any two relations within a single run of the inference algorithm. Forensic derives from the theory of Fréchet derivatives and matrix conditioning and can be used with any collective matrix factorization. We applied Forensic to a compendium of 100 experimental protein interaction data sets, whose sheer number increased the likelihood of outlier relations of lower quality. Forensic provided reliable estimates of data relevance and identified inconsistent relations. The estimated sensitivity correlated highly with the changes in relation reconstruction error when inconsistent relations were removed.

CONCLUSIONS: Our results suggest that Forensic could be used to detect low-quality experimental data sets and recommend which data sets should be included in data fusion.
O067 - Identifying active molecular interactions in the disease condition
Yeeok Kang, , Korea, Rep
Hasun Yu, KAIST, Korea, Rep
Doheon Lee, KAIST, Korea, Rep
Short Abstract: Studying disease and drug mechanisms requires molecular interactions in a specific context. However, analyzing only disease-specific molecular interactions is limited in how well it represents the actual molecular network in that context and in how well it explains disease and drug mechanisms and perturbations. Identifying disease-active interactions may therefore help to better understand human molecular networks in the disease condition and to analyze perturbations between a drug and a target. We propose approaches that use biological function information to identify disease-active molecular interactions, not only disease-specific interactions, in a specific tissue. Diseases are caused by changes in biological functions, and these changes are in turn caused by changes in active molecular interactions, so analyzing disease-related biological functions helps to build the disease network and to study the related mechanisms. Based on this assumption, we propose methods for scoring interaction probabilities that extract disease-active interactions from a molecular backbone network using disease-related biological function information. In addition, disease- and tissue-specific gene expression data are used to assign molecular activity scores in the disease condition. With these approaches, disease-active, tissue-specific molecular interactions can be identified. These interactions will become a valuable resource for studying disease and drug mechanisms and can be used to improve computational drug repositioning or drug-like small molecule prediction.
O068 - Estimating Transcription Factor Activity in distinct Regulatory Networks
Christopher Schiefer, , Germany
Saskia Pohl, HU Berlin, Germany
Short Abstract: Comprehending gene regulation at the cellular level has been a major goal of bioinformatics for years. A popular starting point is quantitative transcriptome data and qualitative networks of transcription factor – gene relationships. Numerous methods and algorithms that try to infer the actual regulatory relationships within a sample have been presented. Recently, Schacht et al. [1] proposed a novel approach to estimate the regulatory activity of transcription factors based on their cumulative effects on target genes, reporting remarkable results for predicting levels of gene expression and identifying the most influential regulators of particular genes. To assess the impact of different underlying networks and to investigate the method's capabilities, we re-implemented and extended it to estimate activity on a network-wide scope. We compared the impact of changing the underlying network (from the MetaCoreTM database to an open text mining network developed by Thomas et al. [2]) and tested its robustness to small changes within the network. Using the same set of transcriptome data, we find that predicting gene expression is more accurate using the MetaCoreTM network compared to the text mining network (correlation coefficients r=0.59 and r=0.46, respectively). Narrowing the calculations to expression data from specific tissues yielded similarly significant differences between the two networks. Additionally, we found a notable influence of the network's general topology on the estimated results. Randomization tests showed that the method is also fairly robust to changes in the regulatory network. [1] doi:10.1093/bioinformatics/btu446 [2] doi:10.1093/bioinformatics/btu795
O069 - Pareto Optimization Identifies Diverse Set of Phosphorylation Signatures Predicting Response to Treatment with Dasatinib
Christoph Schaab, Evotec, Germany
Martin Klammer, Evotec, Germany
Nikolaj Dybowski, Evotec, Germany
Daniel Hoffmann, University of Duisburg-Essen, Germany
Tao Xu, Evotec, Germany
Manuela Machatti, Evotec, Germany
Short Abstract: Multivariate biomarkers that can predict the effectiveness of targeted therapy in individual patients are highly desired. Previous biomarker discovery studies have largely focused on the identification of single biomarker signatures, aimed at maximizing prediction accuracy. Here we present a different approach that identifies multiple biomarkers by simultaneously optimizing their predictive power, number of features, and proximity to the drug target in a protein-protein interaction network.

To this end, we incorporated NSGA-II, a fast and elitist multi-objective optimization algorithm that is based on the principle of Pareto optimality, into the biomarker discovery workflow. The method was applied to quantitative phosphoproteome data of 19 non-small cell lung cancer (NSCLC) cell lines from a previous biomarker study. The algorithm successfully identified a total of 77 candidate biomarker signatures predicting response to treatment with dasatinib. Through filtering and similarity clustering, this set was trimmed to four final biomarker signatures, which were then validated on an independent set of breast cancer cell lines. All four candidates reached the same good prediction accuracy (83%) as the originally published biomarker. Although the newly discovered signatures were diverse in their composition and size, the central protein of the originally published signature -- integrin beta 4 (ITGB4) -- was also present in all four Pareto signatures, confirming its pivotal role in predicting dasatinib response in NSCLC cell lines.

In summary, the method presented here allows for a robust and simultaneous identification of multiple multivariate biomarkers that are optimized for prediction performance, size, and relevance.
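The Pareto optimality principle mentioned above can be illustrated with a minimal sketch. This is not NSGA-II itself and not the authors' workflow; it only shows the non-dominated filtering step that such algorithms build on. The three objectives (prediction error, number of features, network distance to the drug target, all minimized) follow the abstract, but the function names and all numeric values are hypothetical.

```python
from typing import List, Tuple

# Each candidate signature is scored on three objectives, all minimized:
# (prediction error, number of features, network distance to drug target).
# The values used below are made up for illustration only.
Objectives = Tuple[float, int, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates = [
    (0.17, 12, 2.0),
    (0.17, 8, 2.5),
    (0.25, 5, 1.0),
    (0.30, 12, 3.0),  # dominated by the first candidate
]
front = pareto_front(candidates)
```

The surviving candidates trade one objective against another, which is why a multi-objective search can return several diverse signatures rather than a single best one.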
O070 - Evolutionary models for transcriptional regulatory networks
Antonio Rosanova, University of Turin, Italy
Michele Caselle, University of Turin, Italy
Short Abstract: In this work we discuss a few general properties of duplication-divergence models of the human transcriptional regulatory network.
We focus in particular on the organization in families of paralogous Transcription Factors (TFs).
We compare the predictions of a generic duplication model with the recent ENCODE data on human repertoire of TFs.
Our main result is that the data are compatible with the model only if one assumes that a relevant fraction of Transcription Factors
is made of "singletons", i.e. TFs whose duplication is selected against.
We then address the topological properties of the resulting networks and again compare them with the ENCODE data. In this way we may fix in a biologically meaningful way
all the parameters of the model and may use it as a null model to identify the relevant signatures of the
evolutionary pressure on the human transcriptional network.
O071 - Integration of transcriptomics and metabolomics data with ONION: application to nutrigenomics studies
Wiktor Jurkowski, The Genome Analysis Centre,
Monika Piwowar, Jagiellonian University, Poland
Short Abstract: We present a new approach that makes use of verified and putative molecular interactions or functional associations to guide Canonical Correlation Analysis (CCA) of omics data sets. The workflow includes dividing the data sets to reach the expected data structure, statistical analysis within groups and interpretation of results. By applying pathway and network analysis, data obtained from various platforms are grouped with moderate stringency to avoid functional bias. As a consequence, classical CCA and other multivariate models can be applied to calculate robust statistics and provide easy-to-interpret associations between metabolites and genes, improving our understanding of the metabolic response.
Effective integration of lipidomics and transcriptomics is demonstrated on murine and human nutrigenomics data sets. We are able to demonstrate that our approach improves detection of genes related to lipid metabolism, in comparison to applying rCCA or PLS statistics alone. This is measured by increased percentage of explained variance (95% vs. 75-80%) and by identifying new metabolite-gene associations explaining the phenotype.
O072 - Ciliacarta: an integrated compendium of ciliary genes to accelerate cilium research and genetic diagnostics of ciliopathies
John van Dam, Radboud University Medical Center Nijmegen, Netherlands
Julie Kennedy, University College Dublin, Ireland
Erik de Vrieze, Radboud University Medical Center, Netherlands
Kirsten A. Wunderlich, Johannes Gutenberg Universitaet , Germany
Suzanne Rix, University College London,
Gerard W. Dougherty, Westfaelische Wilhelms Universitaet , Germany
Robin van der Lee, Radboud University Medical Center, Netherlands
The SYSCILIA consortium, Radboud University Medical Center, Netherlands
Oliver E. Blacque, University College Dublin, Ireland
Uwe Wolfrum, Johannes Gutenberg Universitaet , Germany
Victor Hernandez, University College London,
Martijn A. Huynen, Radboud University Medical Center, Netherlands
Short Abstract: With high throughput screens becoming the norm in biomedical research, finding a way to meaningfully integrate all these data and overcome the shortcomings of individual datasets is paramount. Here we report on our efforts to integrate large datasets generated within the SYSCILIA consortium for the purpose of discovering genes involved in ciliary biology and associated human disease. We have rigorously curated genomic, proteomic, transcriptomic and evolutionary data and integrated these using a naïve Bayesian classifier.

The classifier was trained with our previously defined high quality set of known ciliary genes and a carefully selected set of likely non-ciliary genes. The naïve Bayesian classifier for ciliary genes performs considerably better than the individual datasets and provides an increased coverage of the human genome. A set of 37 genes, randomly selected from the top 283 ranking genes (expected FDR of 25%), was thoroughly validated in Caenorhabditis elegans, mouse, zebrafish and human cell lines, applying several distinct approaches to determine ciliary function.

We discovered 26 novel ciliary human genes, therefore validating our Bayesian classifier. One gene was taken forward for full bio-molecular characterisation, and we show that it localizes to the base of the cilium, redefining what has been published about this previously “non-ciliary” gene. Our Bayesian classifier forms the basis of a comprehensive ciliary compendium, the Ciliacarta, which can be used to objectively prioritize candidate genes in experiments and patient exome sequencing to help in the discovery of novel disease causing genes.
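How a naïve Bayesian classifier combines evidence from several datasets can be sketched as follows. This is a generic illustration under the conditional independence assumption, not the SYSCILIA implementation; the dataset names, the probabilities and the prior are hypothetical.

```python
import math
from typing import Dict


def log_odds(evidence: Dict[str, bool],
             p_given_ciliary: Dict[str, float],
             p_given_non_ciliary: Dict[str, float],
             prior_ciliary: float) -> float:
    """Posterior log-odds that a gene is ciliary, assuming the datasets
    are conditionally independent given the class (the 'naive' assumption)."""
    score = math.log(prior_ciliary / (1.0 - prior_ciliary))
    for dataset, observed in evidence.items():
        p_pos = p_given_ciliary[dataset]      # P(hit in dataset | ciliary)
        p_neg = p_given_non_ciliary[dataset]  # P(hit in dataset | non-ciliary)
        if observed:
            score += math.log(p_pos / p_neg)
        else:
            score += math.log((1.0 - p_pos) / (1.0 - p_neg))
    return score


# Hypothetical per-dataset hit rates, as would be estimated from training sets.
p_cil = {"proteomics": 0.6, "expression": 0.5, "conservation": 0.4}
p_non = {"proteomics": 0.05, "expression": 0.1, "conservation": 0.1}

score_a = log_odds({"proteomics": True, "expression": True, "conservation": True},
                   p_cil, p_non, prior_ciliary=0.05)
score_b = log_odds({"proteomics": False, "expression": False, "conservation": False},
                   p_cil, p_non, prior_ciliary=0.05)
```

Ranking genes by such a score is one way to obtain a prioritized candidate list in which a gene supported by several weak datasets can outrank a gene supported by one.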
O073 - The DOE Systems Biology Knowledgebase (KBase): Progress Toward a System for Collaborative and Reproducible Analysis
Paramvir Dehal, Lawrence Berkeley National Lab, United States
Chris Henry, Argonne National Laboratory, United States
Doreen Ware, Cold Spring Harbor Laboratory, United States
Dylan Chivian, Lawrence Berkeley National Laboratory, United States
David Weston, Oak Ridge National Laboratory, United States
Fernando Perez, Lawrence Berkeley National Laboratory, United States
Robert Cottingham, Oak Ridge National Laboratory, United States
Sergei Maslov, Brookhaven National Laboratory, United States
Rick Stevens, Argonne National Laboratory, United States
Adam Arkin, Lawrence Berkeley National Laboratory, United States
Short Abstract: The U.S. Department of Energy Systems Biology Knowledgebase (KBase) integrates commonly used core tools, reference and experimental data, and overlays them with new capabilities for visualization, exploration, and predictive analysis with KBase-generated recommendations designed to accelerate our understanding of microbes, plants, and communities. The mission of KBase is driven by key DOE stakeholders that operate in team science mode and seek to optimally use and disseminate, both internally and to the wider community, the data and algorithms their programs generate. These include the DOE Bioenergy Research Centers (BRCs) and the DOE Joint Genome Institute (JGI). KBase offers open access to quality-controlled data and high-performance modeling and simulation tools that enable researchers to build new knowledge, interpret missing information necessary for predictive modeling, test hypotheses, design experiments, and share findings. KBase empowers researchers with a variety of microbial resources and analytical services including microbial genomes that are integrated with phenotype experiments, gene expression profiles, regulatory, interaction, and metabolic networks. These data sources can be used as input to KBase analysis tools to build models and generate new hypotheses such as metabolic reconstruction and flux balance analyses. In addition, user-furnished data can be uploaded, analyzed using high-performance bioinformatics tools, and compared with KBase-provided data and models. By enabling members of the community to integrate and use a wide spectrum of analysis tools and datasets, KBase will serve as a catalyst for biological research, accelerating discovery for DOE missions and providing insights and benefits that can ultimately serve numerous application areas.
O074 - STRING-RNA: Integrating non-coding RNA and protein interaction networks
Alexander Junge, Center for non-coding RNA in Technology and Health, University of Copenhagen, Denmark
Jan C. Refsgaard, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
Christian Garde, Center for Biological Sequence Analysis, Technical University of Denmark, Denmark
Xiaoyong Pan, Center for non-coding RNA in Technology and Health, University of Copenhagen, Denmark
Alberto Santos, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
Christian Anthon, Center for non-coding RNA in Technology and Health, University of Copenhagen, Denmark
Ferhat Alkan, Center for non-coding RNA in Technology and Health, University of Copenhagen, Denmark
Christian von Mering, Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Switzerland
Christopher T. Workman, Center for Biological Sequence Analysis, Technical University of Denmark, Denmark
Lars Juhl Jensen, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
Jan Gorodkin, Center for non-coding RNA in Technology and Health, University of Copenhagen, Denmark
Short Abstract: The ability to study protein-protein interactions experimentally, together with co-expression profiling and computational predictions, has enabled the elucidation of large-scale protein association networks. Similar approaches are emerging for non-coding RNAs (ncRNAs) and their interactions with other ncRNAs, mRNAs and proteins, forming ncRNA association networks as an additional layer of interactions. However, relatively little effort has been made to integrate ncRNA and protein association networks. In this work, we project ncRNAs and their associations into the protein-protein interaction networks collected in the STRING database version 10. This was realized by expanding the payload mechanism of STRING to accommodate the addition of ncRNA nodes and their interactions to existing protein association networks.

We collected ncRNA interactions from a wide range of resources: a) curated knowledge, b) experimentally supported interactions, c) predicted microRNA-target interactions, and d) co-occurrences found by text mining Medline abstracts. Each resource was benchmarked by assessing its agreement with a gold standard set of microRNA-target interactions. This allowed the assignment of a reliability score to each interaction.

STRING-RNA aggregates associations and (predicted) interactions of a vast collection of ncRNA classes, including microRNAs and long ncRNAs. After querying the STRING-RNA database for an ncRNA or protein of interest, an interactive interaction network is shown allowing the user to quickly obtain more information about each edge and node in the network. STRING-RNA will provide the research community with an easily accessible web portal fostering the understanding of the cell’s complex interaction network formed by ncRNAs and proteins. STRING-RNA is available at:
O075 - A Probabilistic Graphical Model for Interleukin-1 Signaling in Cancer
Aurora Blucher, , United States
Anupriya Agarwal, Oregon Health & Science University, United States
Jeffrey Tyner, Oregon Health & Science University, United States
Shannon McWeeney, Oregon Health & Science University, United States
Guanming Wu, Oregon Health & Science University, United States
Short Abstract: The interleukin-1 (IL-1) family of cytokines regulates both innate and adaptive immunity by controlling proinflammatory reactions. Upregulation of IL-1 has been found in several types of tumors and is often associated with a poor prognosis for the patient. Probabilistic graphical models provide a concise way of representing a complex distribution of conditional probabilities across many cross-dependent variables, such as those found in biological pathways and networks. These models have successfully been applied in the study of biological systems, such as determining patient-specific pathway activity in cancer genomic samples (Vaske, et al. (2010) Bioinformatics 26: i237). Using an IL-1 signaling network compiled from the literature in work by Ryll, et al. (Ryll, et al. (2011) Mol Biosyst 7:3252), we have developed a probabilistic graphical model for the IL-1 pathway. By using this model, we can integrate several different types of genomic data, including gene expression and copy number variations (CNVs), from tumor samples along with drug sensitivity information and explore how IL-1 pathway activation differs among samples. This work will help to further elucidate the role of IL-1 signaling in tumor development and sensitivity to chemotherapeutic drugs.
O076 - Integrative analysis for outcome-guided gene networks from multiple omics profiles
Hyun-hwan Jeong, Ajou University, Korea, Rep
Garam Lee, Ajou University, Korea, Rep
Kyung-Ah Sohn, Ajou University, Korea, Rep
Short Abstract: Recent advances in sequencing technology have provided high-throughput multi-level omics data from genome level to metabolome level. Integrative analysis of such multi-omics data has gained more significance because it can play a key role in understanding hidden mechanisms underlying disease.
We recently proposed a mutual information network-based integrative analysis framework that constructs outcome-guided gene networks from clinical outcome and omics profiles. An edge in each network implies an association between the gene pair and clinical outcome. The strength of the association is measured as mutual information between the gene pair and the outcome. The outcome-guided network could improve the prediction performance of the network-based Cox-regression in comparison with traditional networks such as a correlation network or static PPI network. Our framework also considers the integration of multiple such networks, which successfully discovered novel outcome-associated sub-networks. However, the network integration method in our previous work was rather limited as it was based on the edge co-occurrence.
To go one step further, we aimed at improving the network integration method. We incorporated the recently proposed similarity network fusion technique into our analysis framework. To demonstrate the utility of the proposed method, we applied it to various types of genomic profiles of ovarian serous cystadenocarcinoma patients from TCGA. We examined functional enrichment test results to assess the biological significance of the results and compared the results with previous methods. Our analysis reveals that the proposed approach has the potential to provide a more comprehensive view of the genomic interactions across multiple levels.
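The edge weight described above, the mutual information between a gene pair and the clinical outcome, can be sketched for discretized data as follows. This is a minimal plug-in estimator written for illustration, not the authors' implementation; the function names and the toy data are hypothetical, and real expression profiles would first need to be discretized.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (px[a] * py[b]))
    return mi


def pair_outcome_mi(g1: Sequence[int], g2: Sequence[int],
                    outcome: Sequence[int]) -> float:
    """I((G1, G2); outcome): treat the gene pair as one joint variable."""
    return mutual_information(list(zip(g1, g2)), outcome)


# Toy data: the outcome is the XOR of the two binarized genes, so neither
# gene is informative alone but the pair determines the outcome.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
outcome = [0, 1, 1, 0]

edge_weight = pair_outcome_mi(g1, g2, outcome)
single_gene = mutual_information(g1, outcome)
```

The toy case shows why scoring pairs jointly can surface associations that single-gene or pairwise correlation networks would miss.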
O077 - ChIP-Array 2: integrating multiple omics data to construct gene regulatory networks
Yiming Qin, The University of Hong Kong, Hong Kong
Panwen Wang, The University of Hong Kong, Hong Kong
Jing Qin, The University of Hong Kong, Hong Kong
Yun Zhu, The University of Hong Kong, Hong Kong
Lily Yan Wang, The University of Hong Kong, Hong Kong
Michael Q. Zhang, The University of Texas at Dallas, United States
Junwen Wang, The University of Hong Kong, Hong Kong
Short Abstract: Transcription factors (TFs) play an important role in gene regulation. The interconnections among TFs, chromatin interactions, epigenetic marks and cis-regulatory elements form a complex gene transcription apparatus. Our previous work, ChIP-Array, combined TF binding and transcriptome data to construct gene regulatory networks (GRNs). Here we present an enhanced version, ChIP-Array 2, which integrates additional types of omics data, including long-range chromatin interaction, open chromatin region, and histone modification data, to dissect more comprehensive GRNs involving diverse regulatory components. Moreover, we substantially extended our motif database for human, mouse, rat, fruit fly, worm, yeast and Arabidopsis, and curated a large amount of omics data for users to select as input or backend support. With ChIP-Array 2, we compiled a library containing regulatory networks of 18 TFs/chromatin modifiers in mouse embryonic stem cells (mESC). The web server and the mESC library are free and publicly accessible at
O078 - Orthogonal nonnegative factorization-based analysis of nineteen protein-RNA interaction CLIP data sets
Tomaz Curk, University of Ljubljana, Slovenia
Martin Stražar, University of Ljubljana, Faculty of Computer and Information Science, Slovenia
Marinka Žitnik, University of Ljubljana, Faculty of Computer and Information Science, Slovenia
Blaž Zupan, University of Ljubljana, Faculty of Computer and Information Science, Slovenia
Jernej Ule, Department of Molecular Neuroscience, UCL Institute of Neurology,
Tomaž Curk, University of Ljubljana, Faculty of Computer and Information Science, Slovenia
Short Abstract: RNA binding proteins (RBPs) regulate different mechanisms of post-transcriptional gene expression, including splicing, RNA editing, polyadenylation, nuclear transport and RNA stability. Contemporary methods for modeling protein-RNA interactions assume precise structural information, such as protein folding, which is available only for a limited number of proteins. The growing body of experimental data from protocols such as CLIP, iCLIP, PAR-CLIP and HITS-CLIP enables a simultaneous, integrative study of different RBPs.

We developed an algorithm for orthogonal non-negative matrix factorization and used it to integrate various omics data on protein-RNA interactions, including gene/transcript sequence, function and structure. The algorithm fuses multiple data sources by joint dimensionality reduction and discovers multiple, non-overlapping modules within the data. Exploiting synergistic relations among data sources improves prediction of interacting sites.

We analyzed protein-RNA interaction data on 19 RBPs. By visualizing and exploring the factor model, we discovered a number of biologically relevant patterns governing individual, cooperative and competitive RBP targeting of RNA sites. Many of the discovered patterns are known features governing protein-RNA interactions, including RNA structure, RNA sequence motifs and positioning, cooperation and competition in binding among proteins to same target RNAs, affinity to interact with specific types of gene regions, and functional annotation of targeted genes. We used these patterns to cluster RBPs into functionally related groups. For example, we find that hnRNP proteins bind to U-rich motifs in introns to regulate splicing (hnRNPs, U2AF2, ELAVL1, TDP-43, TAF15, FUS, QKI). Those that regulate splicing (SR), spliced mRNA (eIF3E3) or 3’UTR (Ago, IGF2BP) mRNA bind to GC-rich motifs.
O079 - Graph-based visualisation and analysis of RNA-seq data
Fahmi Nazarie, The University of Edinburgh,
Tim Angus, The University of Edinburgh,
Sz-Hau Chen, The University of Edinburgh,
Mark Barnett, The University of Edinburgh,
Karsten Klein, Monash University, Australia
Anton Enright, EMBL-European Bioinformatics Institute,
Tom Freeman, The University of Edinburgh,
Short Abstract: RNA-seq is a powerful transcriptome profiling technology enabling transcript discovery and the quantification of RNA abundance. However, the analysis of RNA-seq data remains a significant challenge. Data is large and the tools for its assembly, analysis and visualisation are still under development. Assemblies of reads can be inspected using tools such as the Integrative Genomics Viewer (IGV) where visualisation of results involves ‘stacking’ the reads onto a reference genome. Whilst sufficient for many needs, when the underlying variance of the genome or transcript assemblies is complex, this visualisation method can be limiting; errors in assembly can be difficult to spot and visualisation of splicing events may be challenging.

Here we report on our investigations into the use of a graph-based visualisation method as a complementary approach to understanding transcript diversity and issues with assembly. Visualisation of the resulting graphs is performed using the network analysis tool BioLayout Express3D, which can adequately render the often large and complex graph topologies that result from DNA/RNA sequence assembly. We have also implemented an analysis pipeline for the creation of transcript graphs and developed both a command-line and a web-based interface that allow users to create and visualise such graphs. Here we demonstrate the utility of this approach on RNA-seq data, including the unusual structure of these ‘overlap’ graphs and how they can be used to identify issues in assembly, internal homology in transcripts and splice variants. We believe this approach has the potential to significantly improve appreciation of assemblies of sequence data.
O080 - Exploring disease etiology through a large-scale mapping of deleterious genes to cell types
Alex Cornish, Imperial College London,
Ioannis Filippis, Imperial College London,
Alessia David, Imperial College London,
Michael Sternberg, Imperial College London,
Short Abstract: While the majority of diseases are manifested within a specific anatomical structure, known disease-associated alleles are often inherited and therefore present throughout the body. Understanding how these ubiquitous alleles produce localized disease is key to understanding the mechanisms that drive disease. We have developed a novel approach, called gene set compactness (GSC), that contrasts the relative positions of disease-associated genes on cell type-specific interactomes to identify the cell types most likely to be affected by the alleles. Cell type-specific interactomes were created through the integration of protein-protein interaction (PPI) data and cell type-specific expression data from the FANTOM5 project. We conducted text-mining of the PubMed database to produce an independent map of disease-associated cell types, which we used to validate our method. Our method identifies previously suggested associations, along with associations that warrant further study. These include mast cells and multiple sclerosis (MS), a population of cells that is currently being targeted in an MS phase 2 clinical trial. Furthermore, we used the associations identified by our method to construct a pathogenic cell type-based diseasome, offering insight into diseases linked by common etiology. The dataset produced represents the first large-scale mapping of diseases to their pathogenic cell types. Overall, we demonstrate that the GSC method links disease-associated genes to the phenotypes they produce, one of the key goals of systems biology.
O081 - Uncovering the mechanisms modulating cardiac electrophysiology using systems genetics approaches in recombinant inbred rat strains
Michiel Adriaens, Maastricht University, Netherlands
Michiel Adriaens, Maastricht University, Netherlands
Aida Moreno-Moral, Imperial College London,
Elisabeth Lodder, AMC, Netherlands
Carol Ann Remme, AMC, Netherlands
Rianne Wolswinkel, AMC, Netherlands
Enrico Petretto, Imperial College London, Singapore
Stuart Cook, Imperial College London, Singapore
Connie Bezzina, AMC, Netherlands
Short Abstract: Genome-wide association studies have identified many common genetic variants impacting on susceptibility to cardiac arrhythmias and sudden cardiac death (SCD). But uncovering the underlying disease mechanisms remains a substantial challenge, as the required resources for the human heart are sparse and underpowered. Hence, the only means to paint the full picture is to complement insights derived from human studies with systems genetics approaches in statistically powerful animal models. In this study we use 29 BXH/HXB recombinant inbred (RI) rat strains, a strong model to uncover the mechanisms modulating cardiac electrical function. Prolonged ECG indices of conduction and repolarization are risk factors for cardiac arrhythmias and SCD, and here we combine such indices with genotyping and RNA-seq transcriptomics data. In this data we hunt for quantitative trait loci (QTL): genetic markers associated with changes in a quantitative trait, i.e. an ECG index or gene expression level. Using a Bayesian systems genetics framework, we identified multiple candidate genes and networks. One of these genes is Acbd4: a nearby genetic marker appears to modulate the expression of this gene (eQTL). Additionally, the same marker is associated with PR prolongation (ecgQTL). The protein product of Acbd4 plays a role in vesicle formation, deregulation of which is known to be linked to heart disease. Acbd4’s co-expression network is significantly positively correlated with PR duration and partly conserved in human, suggesting that the underlying mechanism may be of clinical relevance as well. Validation of our findings is currently ongoing.
O082 - Genome-wide ceRNA networks
Mario Flores, University of Texas, United States
Yidong Chen Flores, University of Texas Health Science Center, United States
Yufei Huang, UTSA, United States
Short Abstract: Post-transcriptional regulation of gene expression can be modeled as a competitive endogenous RNA (ceRNA) network in which mRNAs compete for miRNA binding. Previous research shows that this competition maintains and fine-tunes the levels of protein coding genes, and that disruption of the network contributes to phenotypic conditions like cancer. Based on our previous studies, we provided a tool (TraceRNA) for the reconstruction of ceRNA networks around a gene of interest (GoI). The approach used in TraceRNA, although practical and useful for gene-based studies, provides only a partial landscape of the ceRNA mechanisms and phenotypes. In addition, TraceRNA offers an ad-hoc approach to the study of the ceRNA phenomenon. In this work we present a formal genome-wide approach to the study of ceRNA networks. This novel and formal treatment of the ceRNA phenomenon provides new perspectives on the study of ceRNA networks and their specific phenotypes. We divide the study of genome-wide ceRNA networks into three main sections: network construction, analysis of network components by network perturbation, and network stability.
O083 - Chromatin Interactions Predict Co-expression in the Mouse Cortex
Ahmed Mahfouz, Delft University of Technology, Netherlands
Sepideh Babaei, Delft University of Technology, Netherlands
Marc Hulsman, VU University Medical Center, Netherlands
Boudewijn Lelieveldt, Leiden University Medical Center, Netherlands
Jeroen de Ridder, Delft University of Technology, Netherlands
Marcel Reinders, Delft University of Technology, Netherlands
Short Abstract: The three dimensional conformation of the genome in the cell nucleus influences gene expression regulation. These interactions can be measured using Hi-C data, yet understanding how they affect co-regulation remains challenging. To investigate this, we present the first attempt to predict co-expression based on chromatin interactions.

We represent 3D interactions between genes as a network which can be described using its topological properties. We show that characterizing the interaction network using its topology at different scale levels can effectively capture not only direct interactions between gene pairs but also large-scale interactions between chromatin compartments.

We use our model to predict co-expression in the mouse cortex using data from the Allen Mouse Brain Atlas, based on an interaction network of Hi-C data measured in cortical cells.
O084 - Data visualization and modeling using Atlas of Cancer Signaling Network predicts clinical outcome
Inna Kuperstein, Institut Curie –U900 INSERM - Mines ParisTech, France
Inna Kuperstein, Institut Curie –U900 INSERM - Mines ParisTech, France
Eric Bonnet, Institut Curie –U900 INSERM - Mines ParisTech, France
Eric Viara, Institut Curie –U900 INSERM - Mines ParisTech, France
Maia Chanrion, Institut Curie –U900 INSERM - Mines ParisTech, France
Hien-Anh Nguyen, Institut Curie –U900 INSERM - Mines ParisTech, France
David Cohen, Institut Curie –U900 INSERM - Mines ParisTech, France
Laurence Calzone, Institut Curie –U900 INSERM - Mines ParisTech, France
Luca Grieco, Institut Curie –U900 INSERM - Mines ParisTech, France
Christophe Russo, Institut Curie –U900 INSERM - Mines ParisTech, France
Maria Kondratova, Institut Curie –U900 INSERM - Mines ParisTech,
Marie Dutreix, Institut Curie –U900 INSERM - Mines ParisTech, France
Sylvie Robine, Institut Curie –U900 INSERM - Mines ParisTech, France
Emmanuel Barillot, Institut Curie –U900 INSERM - Mines ParisTech, France
Andrei Zinovyev, Institut Curie –U900 INSERM - Mines ParisTech, France
Short Abstract: The successful application of bioinformatics and systems biology methods to the analysis of high-throughput data in cancer research depends on the availability of global and detailed reconstructions of signaling networks amenable to computational analysis. The Atlas of Cancer Signaling Network (ACSN) is an interactive and comprehensive map of molecular mechanisms implicated in cancer that includes tools for map navigation, visualization and analysis of molecular data in the context of signaling network maps. Constructing and updating ACSN involves manual literature curation and the participation of experts in the corresponding fields. The cancer-oriented content of ACSN is original and covers major mechanisms involved in cancer progression. Cell signaling mechanisms are depicted in detail, together creating a seamless ‘geographic-like’ map of molecular interactions frequently deregulated in cancer. The map is browsable through the NaviCell web interface, which uses the Google Maps engine and the principle of semantic zooming. The associated web-blog provides a forum for commenting on and curating the ACSN content. ACSN allows users to upload heterogeneous omics data onto the maps for visualization and functional analysis. We suggest several scenarios for applying ACSN in cancer research to visualize high-throughput data. In addition, we show a study on drug sensitivity prediction using the ACSN. Finally, we describe how the epithelial-to-mesenchymal transition (EMT) signaling network from the ACSN collection has been used to find metastasis inducers in colon cancer through network analysis. ACSN may support data analysis and interpretation, patient stratification, prediction of treatment response and resistance to cancer drugs, and the design of novel treatment strategies.
O085 - Identifying driver genomic alterations in cancers by searching minimum-weight, mutually exclusive sets
Xinghua Lu, University of Pittsburgh, United States
Songjia Lu, University of Pittsburgh, United States
Short Abstract: An important goal of cancer genomic research is to identify the driving pathways underlying disease mechanisms. It is well known that somatic genome alterations (SGAs) affecting the genes that encode the proteins within a common signaling pathway exhibit mutual exclusivity, in which these SGAs usually do not co-occur in a tumor. With some success, this property has been utilized as an objective function to guide the search for driver mutations. However, mutual exclusivity alone is not sufficient to indicate that genes affected by such SGAs are in common pathways. Here, we propose a novel, signal-oriented framework for identifying driver SGAs, in which the mutual exclusivity constraint is applied only to tumors that have SGAs perturbing a common signal (not to all tumors, as in previous methods). We apply this framework to the OV and GBM data from TCGA and perform systematic evaluations. Our results indicate that the signal-oriented approach enhances the ability to find informative sets of driver SGAs that likely constitute signaling pathways.
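As an illustration of the mutual exclusivity property described in this abstract, the following is a minimal sketch of how coverage and exclusivity of a candidate gene set might be scored over a group of tumors. It is not the authors' objective function: the function name, the toy tumor identifiers and the gene names are assumptions made for this example only.

```python
def coverage_and_exclusivity(alterations, gene_set):
    """Score a candidate gene set over a group of tumors.

    alterations: dict mapping a tumor ID to the set of genes altered in it.
    gene_set:    set of candidate driver genes (hypothetical input).
    Returns (coverage, exclusivity):
      coverage    = fraction of tumors with at least one altered gene in the set
      exclusivity = fraction of covered tumors with exactly one such gene
    """
    covered = 0
    exclusive = 0
    for genes in alterations.values():
        hits = len(genes & gene_set)
        if hits >= 1:
            covered += 1
        if hits == 1:
            exclusive += 1
    n = len(alterations)
    coverage = covered / n if n else 0.0
    exclusivity = exclusive / covered if covered else 0.0
    return coverage, exclusivity


# Toy data: four tumors, invented alteration calls.
tumors = {
    "T1": {"TP53"},
    "T2": {"KRAS"},
    "T3": {"TP53", "KRAS"},
    "T4": {"EGFR"},
}
cov, excl = coverage_and_exclusivity(tumors, {"TP53", "KRAS"})
```

To mimic the signal-oriented idea, one could pass only the subset of tumors believed to share a perturbed signal as `alterations`; how that subset is chosen is outside the scope of this sketch.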
O086 - Association of mean telomere length with biomolecular pathway deregulations in lung adenocarcinoma
Lilit Nersisyan, Institute of Molecular Biology NAS RA, Armenia
Anna Hakobyan, Institute of Molecular Biology NAS RA, Armenia
Arsen Arakelyan, Institute of Molecular Biology NAS RA, Armenia
Short Abstract: Telomere length dynamics in lung adenocarcinomas has been extensively studied. However, details on how mean telomere length (MTL) is associated with activity changes in signaling, metabolic and regulatory pathways are scarce. We have studied the connection of MTL with biomolecular pathway deregulations in 26 lung adenocarcinoma cell lines.
MTL was computed from whole genome sequencing data with an in-house program, Computel. The RNA-seq data was analyzed with a Pathway Signal Flow algorithm for assessing the activity profiles of 168 KEGG pathways in the studied samples. We performed partial correlation analysis of MTL with pathway signal flow values. The top 3 pathways positively correlated with telomere length (local FDR p value < 0.05) were the Cytokine-cytokine receptor interaction, Purine metabolism and Glycolysis/Gluconeogenesis pathways (Spearman correlation coefficients 0.46, 0.42 and 0.4). The pathways identified in this study have previously been strongly implicated in lung adenocarcinomas. The cytokine-cytokine receptor interaction pathway has been shown to be altered at the levels of both expression and genetic variation. Additionally, acceleration of purine metabolism and glycolysis/gluconeogenesis was shown to be characteristic of poorly differentiated tumors, such as adenocarcinomas. Moreover, telomerase targeting of tumor cells has been shown to have a strong effect on activation of the glycolytic pathway.
In conclusion, this study for the first time demonstrated the possible association of mean telomere length with activation of metabolic and signaling pathways involved in growth and proliferation of lung adenocarcinoma cells.
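The partial correlation analysis mentioned in this abstract can be illustrated with a small rank-based sketch. The abstract does not state which variables were controlled for, so the single covariate `z` below is purely an assumption, as are the function names. The code uses the standard first-order partial correlation formula applied to ranks.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, controlling for one covariate z."""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` could stand for MTL per cell line, `y` for the signal flow value of one pathway, and `z` for whatever hypothetical covariate one wishes to adjust for; with several covariates, a regression-residual formulation would be needed instead.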
O087 - PhenomeExpress: A refined network analysis of expression datasets by inclusion of known disease phenotypes
Jean-Marc Schwartz, University of Manchester, United Kingdom
Jamie Soul, University of Manchester, United Kingdom
Timothy Hardingham, University of Manchester, United Kingdom
Raymond Boot-Handford, University of Manchester, United Kingdom
Short Abstract: We will present a new method for the analysis of transcriptomic datasets which combines input from both protein-protein interaction and phenotype similarity networks. Our method enables the identification of sub-networks that are significantly enriched in differentially expressed genes and are at the same time related to disease relevant phenotypes. This contrasts with previous active sub-network detection methods, which rely solely on protein-protein interaction networks derived from compounded data of many unrelated biological conditions.

We will show results from several case studies from subchondral bone in osteoarthritis and Pax5 in acute lymphoblastic leukaemia, as well as newly generated datasets comparing cartilage samples from intact and damaged sites on osteoarthritic joints. We will present core disease pathways and compare them with sub-networks detected by other tools. We thereby demonstrate that our algorithm enhances the detection of molecular phenotypes and provides a more detailed context to pathways identified as candidates for disease mechanisms.
O088 - KeyPathwayMiner - De-novo network enrichment by combining multiple OMICS data and biological networks
Jan Baumbach, University of Southern Denmark, Denmark
Nicolas Alcaraz, University of Southern Denmark, Denmark
Josch Pauling, University of Southern Denmark, Denmark
Richa Batra, University of Southern Denmark, Denmark
Eudes Barbosa, University of Southern Denmark, Denmark
Anne G. Christensen, University of Southern Denmark, Denmark
Henrik J. Ditzel, University of Southern Denmark, Denmark
Jan Baumbach, University of Southern Denmark, Denmark
Short Abstract: We tackle the problem of de-novo pathway extraction. Given a biological network and a set of case-control studies, KeyPathwayMiner efficiently extracts and visualizes all maximal connected sub-networks that contain mainly genes that are dysregulated, e.g., differentially expressed, in most cases studied. The exact quantities for "mainly" and "most" are modeled with two easy-to-interpret parameters that allow the user to control the number of outliers (not dysregulated genes/cases) in the solutions. We developed two slightly varying models that fall into the class of NP-hard optimization problems and designed a set of algorithms to tackle the combinatorial explosion of the search space. During the presentation we will demonstrate how to import and process the data, set the parameters for the two models, compute and visualize the key pathways, and judge and statistically evaluate the results, and we will explain the different algorithms. Finally, we will discuss on-going work and future extensions, and present yet unpublished results.
O089 - miRNAs in differential network analysis
Yvonne Mayer, Humboldt Universität zu Berlin, Germany
Berit Haldemann, Humboldt Universität zu Berlin, Germany
Dido Lenze, Campus Benjamin Franklin, Charite Universitätsmedizin Berlin, Germany
Michael Hummels, Campus Benjamin Franklin, Charite Universitätsmedizin Berlin, Germany
Ulf Leser, Humboldt Universität zu Berlin, Germany
Short Abstract: Differential network analysis (DiNA) denotes a novel class of algorithms which focus on the differences in network topologies between two states of a cell, such as healthy and disease, to identify key players in the discriminating biological processes. In contrast to conventional differential analysis, DiNA identifies changes in the interplay between molecules, rather than changes in single molecules or in groups of molecules. This ability is especially important in cases where effectors are changed (e.g., mutated), but their expression is not. In this paper, we study the impact of miRNA, an important class of mostly negative effectors in the regulatory machinery of mammalian cells, on DiNA algorithms when applied to human regulatory networks. We constructed high-quality regulatory networks with and without miRNA and mapped co-expression data for four different cancer types onto them. We then compared the performance of ten DiNA algorithms in recovering genetic key players in each cancer in miRNA-free networks to that in miRNA-enriched networks. We find that the inclusion of miRNA consistently and significantly improves the performance of almost every tested method in almost every cancer. These results are influenced by the on-average higher degree of miRNAs, but strong positive effects also remain when miRNAs are excluded from gold standards. We furthermore show that the underlying network topology has a predominant impact on most DiNA algorithms compared to the specific co-expression values. Our results underline the importance of using comprehensive models of cells for network analysis and of carefully selecting the underlying network.
O090 - Detecting differentially expressed metabolic pathways with adjustments for macronutrient intake
Teal Guidici, University of Michigan, United States
George Michailidis, University of Michigan, United States
Charles Burant, University of Michigan, United States
Amy Rothberg, University of Michigan, United States
Short Abstract: Differential expression testing and set enrichment analysis are commonly used to summarize the results of high-throughput biological experiments, to generate biologically meaningful hypotheses for further analysis, and to aid in the planning of validation experiments.

Conventional approaches to differential expression testing and set enrichment analysis do not usually account for individual variation in relevant background features, in many cases due to lack of pertinent data. These features are especially relevant in the context of metabolomics, where blood metabolite levels can react sensitively and quickly to changes in nutrient intake.

In this project we introduce a network based method for detecting differentially expressed metabolites and metabolic pathways, while adjusting for individual variation in the consumption of relevant macronutrients through the integration of nutrition intake data. We test our method on metabolomic and nutrition intake data from a controlled feeding study featuring two distinct diets (a high polyunsaturated fat diet and a high carbohydrate diet).

Our method yields conclusions which are more biologically relevant and have greater statistical significance than those from a conventional approach to differential expression analysis. Thus, our method and its findings may provide greater clarity when generating hypotheses for future research.
O091 - Computational approaches to identify and dissect the transcriptional influence on metabolism
Kevin Schwahn, MPI-MP, Germany
Zoran Nikoloski, MPI-MP, Germany
Short Abstract: The availability of high-throughput data from transcriptomics and metabolomics technologies necessitates novel statistical approaches to elucidate the transcriptional influences on metabolism. Here we introduce two new approaches to identify transcriptional effects on metabolite levels: the first combines partial correlations with principal component analysis, while the second partials out the covariance of transcript expression levels from the covariance of metabolite levels. Based on these approaches, we also defined and investigated three new concepts: stable correlations, noise-sensitive correlations, and total partialing. Our findings demonstrate that the proposed approaches are effective in pinpointing the metabolic pathways under strong transcriptional influence. The proposed approaches can also be readily employed to extract network-based descriptions of the data sets, which we use in our subsequent enrichment analysis. Using transcriptomic and metabolomic profiles from Escherichia coli under five different environments, we show that the so-extracted networks contain a smaller number of three-cycles in comparison to correlation-based networks; however, the findings from the enrichment analysis remain unaltered. Therefore, the proposed approaches provide a promising extension to widely used techniques in computational systems biology to dissect relationships between components on different levels of cellular organization.
O092 - Shopping for conserved drug responses within compound families
Stefan Naulaerts, University of Antwerp, Belgium
Pieter Meysman, University of Antwerp, Belgium
Bart Goethals, University of Antwerp, Belgium
Wim Vanden Berghe, University of Antwerp, Belgium
Kris Laukens, University of Antwerp, Belgium
Short Abstract: Historically, drug resistance and response have always been investigated by means of case-by-case studies. However, large amounts of high-throughput information on gene expression and protein abundance changes have been stored in large online repositories. Some of the best-known data resources are the Gene Expression Omnibus and ArrayExpress for the gene expression level, as well as PRIDE, ProteomicsDB and several others for the protein level.

In this work, we extract the relevant known information from gene and protein compendia for a large selection of compounds in PubChem. Using unsupervised data mining methods, such as frequent itemset mining, we are able to distinguish several patterns that are typical of distinct chemical groups. Next, resulting pattern lists were mapped on top of a human interactome, which we derived by combining HPRD, MIntAct and BioGRID and limiting evidence to physical associations.

Known information from the DrugBank database was mapped on top of the combined network in a bid to construct a true “drug-target” network. We subsequently fed the fully integrated entity to the classic network analysis pipeline and investigated topological parameters such as average shortest path, neighbourhood connectivity and betweenness centrality. Finally, term enrichment was investigated at the pathway and functional levels using Reactome and Gene Ontology Biological Process IDs, respectively.

Overall, we demonstrate that using frequent itemsets and network analysis tools such as Cytoscape, we are able to identify conserved responses for several structurally similar compounds.
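Frequent itemset mining, the unsupervised technique named in this abstract, can be illustrated with a brute-force sketch. Each "transaction" below is assumed to be the set of genes that respond to one compound; the function name, the size cap and the toy gene labels are hypothetical, and real analyses would normally use an Apriori- or FP-growth-style algorithm rather than full enumeration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset (sorted tuple): support} for itemsets meeting min_support.

    Enumerates every subset of each transaction up to max_size items,
    which is only practical for small toy inputs.
    """
    counts = Counter()
    for items in transactions:
        ordered = sorted(items)
        for size in range(1, max_size + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    n = len(transactions)
    return {
        combo: count / n
        for combo, count in counts.items()
        if count / n >= min_support
    }


# Toy data: responding genes for four invented compounds.
responses = [
    {"geneA", "geneB"},
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneC"},
    {"geneB"},
]
patterns = frequent_itemsets(responses, min_support=0.5)
```

With this toy input, the single genes and the pairs (geneA, geneB) and (geneA, geneC) reach the 50% support threshold, while the pair (geneB, geneC) and the full triple do not.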
O093 - Network-based eQTL analysis in clonal organisms
Kathleen Marchal, University of Ghent, Belgium
Dries De Maeyer, Department of Information Technology, UGent; Department of Plant Biotechnology and Bioinformatics, UGent; Department of Microbial and Molecular Systems, KU Leuven, Belgium
Bram Weytjens, Department of Information Technology, UGent; Department of Plant Biotechnology and Bioinformatics, UGent; Department of Microbial and Molecular Systems, KU Leuven, Belgium
Short Abstract: eQTL analysis of strains with a phenotype of interest offers great potential for trait identification and for studying natural variation. Classical eQTL analysis searches for a statistical association between a certain genetic locus and an expression phenotype. However, in independently evolved clonal systems, such as bacteria, there is no guarantee that exactly the same locus is responsible for the observed phenotype. Rather, mutations in the same pathways will result in the adaptive phenotype. This makes clonal eQTL analysis challenging, but also creates the opportunity to exploit information from independently evolved clones to identify adaptive pathways.
For clonal systems eQTL analysis thus depends on the search for mutational consistency in terms of pathways, rather than in terms of individual mutations. To facilitate clonal eQTL we propose a network-based eQTL method based on probabilistic pathfinding. The method assumes that genes with adaptive mutations obtained from different parallel evolved clones are more tightly ‘connected’ to each other and to the genes involved in a downstream expression phenotype on an interaction network than randomly acquired passenger mutations without relation to the focal phenotype. The interaction network here refers to a comprehensive representation of all known interactions between molecular entities in the organism of interest. Connected components in the network thus prioritize adaptive pathways.
Applying our method to the eQTL data of focal endpoints obtained from two different clonal evolution experiments in E. coli showed that our network-based strategy was able to prioritize true adaptive pathways even in the presence of mutator phenotypes.
O094 - ALEX123 – automated analysis of lipid experiments comprising MS, MS2, and MS3 data
Josch Pauling, University of Southern Denmark, Denmark
Klaus Christiansen, University of Southern Denmark, Denmark
Reinaldo Almeida, University of Southern Denmark, Denmark
Christer Ejsing, University of Southern Denmark, Denmark
Short Abstract: Lipids are the major components of membranes and hence these compounds play vital roles in multiple metabolic processes. Yet, insights into molecular dynamics of lipid compositions in various tissues or cellular compartments subject to changing conditions remain limited. Lipidomics analysis across large sample sets produces high-content mass spectrometry datasets that require dedicated software tools supporting automated lipid identification and quantification, efficient data management and lipidome visualization. After peak identification in mass spectra, corresponding lipid species can be annotated by their sum composition (e.g. phosphatidyl-choline 34:1) or by their more detailed molecular composition (e.g. phosphatidyl-choline 16:0-18:1). However, molecular composition annotation requires the detection of structure-specific fragment ions by MSn (n > 1) experiments. Detecting these fragment ions is essential in elucidating molecular dynamics in a cell’s regulation of lipid metabolism and homeostasis. Here we present a novel software-based platform, named ALEX123 (Analysis of lipid experiments including MS, MS2 and MS3 mass spectra), for streamlined analysis, processing, and management of shotgun lipidomics data. ALEX123 is capable of identifying molecular lipid species by querying Fourier-Transform and Ion-Trap (fragment) spectra. It utilizes detailed information about structure-specific fragment ions accessible via a database. Fragment ion information was compiled by systematic fragmentation analysis of synthetic lipid standards. ALEX123 tracks all experimental data (e.g. polarity, ion activation, measured m/z, intensity) across large sample sets. The compiled results are saved in a universal table format for subsequent use with third-party programs for statistical assessment, quality control and visualization. This approach is generic and extendable to higher MS dimensions (MSn, n > 3).
O095 - Exploring the structure and function of temporal networks with dynamic graphlets
Huili Chen, University of Notre Dame, United States
Yuriy Hulovatyy, University of Notre Dame, United States
Tijana Milenkovic, University of Notre Dame, United States
Short Abstract: With the increasing availability of temporal real-world networks, how can one efficiently study these data? One can model a temporal network as a single aggregate static network, or as a series of time-specific snapshots, each being an aggregate static network over the corresponding time window. Then, one can use established methods for static analysis on the resulting aggregate network(s), but in the process valuable temporal information is lost either completely, or at the interface between different snapshots, respectively. Here, we develop a novel approach for studying a temporal network more explicitly, by capturing inter-snapshot relationships. We base our methodology on well-established graphlets (subgraphs), which have proven themselves in numerous contexts in static network research. We develop new theory to allow for graphlet-based analyses of temporal networks. Our new notion of dynamic graphlets is different from existing dynamic network approaches that are based on temporal motifs (statistically significant subgraphs). The latter have limitations: their results depend on the choice of a null network model that is required to evaluate the significance of a subgraph, and choosing a good null model is non-trivial. Our dynamic graphlets overcome the limitations of the temporal motifs. Also, when we aim to characterize the structure and function of an entire temporal network or of individual nodes, our dynamic graphlets outperform the static graphlets. Clearly, accounting for temporal information helps. We apply dynamic graphlets to temporal age-specific molecular network data to deepen our limited knowledge about human aging.
O096 - Linking GO Functions and Processes to the Bigger Picture
David P Hill, The Jackson Laboratory, Dept. of Bioinformatics and Computational Biology, United States
Peter D'Eustachio, NYU School of Medicine, Dept of Biochemistry & Mol Pharm., United States
Nikolai Renedo, Tufts University, Dept. of Biology, United States
Judith A Blake, The Jackson Laboratory, Dept. of Bioinformatics and Computational Biology, United States
Short Abstract: The Gene Ontology (GO) is a freely available resource that describes the role of gene products in biological systems. The GO uses three interrelated ontologies to assign gene products to terms that describe how they function at a biochemical level, where they function, and the overall biological objective of sets of functions. The GO is designed to be species-agnostic and its design principles attempt to describe processes in the most global manner possible. Recently we have undertaken an effort to reflect current views of biochemical pathways in GO by formally structuring the necessary functions that are required for the execution of those pathways (Hill et al, in preparation). Taking the network of processes by which carbohydrates are converted, often via glucose, to pyruvate as a test case we report on the inspection and assignment of cross-references between GO pathways and existing pathway resources and we discuss the strategies surrounding decisions and implications of creating those cross-references. We provide the pathway representation in GO that has resulted from this exercise, show how it maps onto pathway representations in other informatics pathways resources, and show its value as a prototype for defining and cross-referencing biological processes more generally.
O097 - An integrative Graph-database Approach for Respiratory Disease
Irina Roznovat, CNRS - European Institute for Systems Biology and Medicine, France
Artem Lysenko, Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, United Kingdom
Mansoor Saqi, European Institute for Systems Biology and Medicine (EISBM), CNRS-ENS-UCBL, Campus Charles Mérieux, Université de Lyon, 50 Avenue Tony Garnier, France
Alexander Mazein, European Institute for Systems Biology and Medicine (EISBM), CNRS-ENS-UCBL, Campus Charles Mérieux, Université de Lyon, 50 Avenue Tony Garnier, France
Chris Rawlings, Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, United Kingdom
Charles Auffray, European Institute for Systems Biology and Medicine (EISBM), CNRS-ENS-UCBL, Campus Charles Mérieux, Université de Lyon, 50 Avenue Tony Garnier, France
Short Abstract: A major effort in translational medicine is the identification of molecular signatures that are associated with disease subtypes. Whilst such signatures are useful for patient stratification, in order to obtain a mechanistic understanding of the disease process it is necessary to put these signatures or patterns into a broader biological context. This involves integration of a large number of heterogeneous data sources such as protein-protein interaction, disease-gene association and protein-pathway mapping. Graph databases provide a natural representation for biological information, which is typically highly connected and semi-structured. Information is represented in the form of networks consisting of nodes (concepts) and edges (relationships), and the network neighbourhood of proteins implicated in a disease condition provides biological context. Graph databases are naturally suited to traversal type queries, which can reveal unexpected relationships and can facilitate hypothesis generation. Here, we explore the use of a well-established graph database (Neo4j) for integration of multiple data sources and for providing biological context to genes implicated in two chronic respiratory diseases: asthma and chronic obstructive pulmonary disease (COPD). Specifically, we explore biological similarity of asthma and COPD at gene network level, by applying measures of network overlap on differentially expressed genes from a set of GEO studies on human respiratory diseases. We show how a graph-database approach can provide disease maps that complement traditional pathway diagrams.
O098 - Computational modeling of electrophysiology and molecular variability in single neurons reveals a tunable switch with discrete neuromodulatory response phenotypes
Rajanikanth Vadigepalli, Thomas Jefferson University, United States
Warren Anderson, Thomas Jefferson University, United States
Hirenkumar Makadia, Thomas Jefferson University, United States
Short Abstract: Recent single-cell studies show extensive molecular variability underlying cellular responses. We evaluated the impact of such variability in the expression of molecular components on neurophysiology and neuromodulation. We employed a computational model that integrates neuropeptide receptor-mediated signaling and electrophysiology. We simulated a large population of neurons in which expression levels of a neuropeptide receptor and multiple ion channels were simultaneously varied over a range of physiological levels. We analyzed the effects of variation on the electrophysiological response to a neuropeptide stimulus. Our results revealed distinct response patterns that were associated with low versus high receptor levels and were further tuned by the biophysical state. Neurons with low receptor levels showed increased excitability and neurons with high receptor levels showed reduced excitability. These response patterns were separated by a narrow range of receptor levels forming a separatrix. The position of this separatrix was dependent on the expression level of multiple ion channels. To assess the relative contributions of receptor and ion channel levels to the response profiles, we categorized the responses into five phenotypes based on response kinetics and magnitude. The results showed that ion channel expression variation primarily influenced response kinetics while receptor level variation primarily influenced the steady-state response. Our results show that receptor expression and biophysical state interact to yield discrete neuronal activity phenotypes corresponding to a tunable switch-like behavior. Funding: NIH/NHLBI R01 111621.
O099 - Mixed Graphical Models for Analysis of Multi-modal Genomic and Clinical Variables
Panagiotis Benos, University of Pittsburgh, School of Medicine, United States
Andrew J Sedgewick, University of Pittsburgh, United States
Joseph D Ramsey, Carnegie Mellon University, United States
Rory M Donovan, University of Pittsburgh, United States
Clark Glymour, Carnegie Mellon University, United States
Short Abstract: Graphical models are an important tool for biomedical research because they can intuitively represent the underlying structure of complex, multivariate probability distributions found in biological data. Learned models can be used for classification, biomarker selection, and functional analysis. These models are generally designed to handle only one type of data, however, and this limits their applicability to a large class of biological datasets with both continuous and discrete variables. To address this issue, we developed new methods for directed network recovery over mixed discrete and continuous data, and compared them to existing causal discovery algorithms. Our method first learns an undirected mixed graphical model (MGM) superstructure and then uses that as a starting point for PC-Stable or GES, two well-known causal network search algorithms. We tested our methods on simulated datasets of continuous and categorical variables generated from a variety of network structures, which included both cycles and scale-free topologies. Our methods outperformed existing methods for both overall and directed edge recovery on these simulated data. When applied to breast cancer data from The Cancer Genome Atlas (TCGA), our methods recovered relevant connections between RNA-seq variables and clinical variables for hormone receptor status and PAM50 subtype label.
O100 - Driving network identification for type 1 diabetes from the perspective of dynamic phase transition
Xujing Wang, The National Heart, Lung, and Blood Institute (NHLBI), United States
Short Abstract: In this study we describe our approach to identifying the disease-driving genetic network of type 1 diabetes (T1D) by treating disease onset as a phase transition in a dynamic system and utilizing knowledge of critical phenomena. First, we define a feasible search space in two steps. (1) We identify a comprehensive list of ~5,000 genes, based on (a) candidate disease genes predicted by our mathematical model (Math BioSci, 2006), (b) differential expression in our cross-sectional studies of human T1D in over 240 subjects (Diabetes 2014), (c) genome-wide association studies (GWAS) of over 5,000 cases, and (d) genes that show progressive, time-course changes in at-risk subjects who eventually develop T1D. (2) We narrow the list down to a working search space of ~500 genes based on their functional roles (regulatory, protein coding, etc.) and their network degree and betweenness centrality in protein-interaction and transcription-regulation networks. Next, using a simulated annealing algorithm applied to time-course (6-9 time points) transcriptome profiles of 22 at-risk subjects who progressed to T1D onset, we searched for a subset of genes with the highest increase in autocorrelation as the subjects reached disease onset and the highest inter-synchronization during progression. The results were then evaluated through bootstrapping. We found that the top 50 candidates for the driving network are enriched for the GO biological processes Innate and Adaptive Immunity, which agrees with the current understanding of the disease etiology.
O101 - Unbiased Metabolic Pathway Analysis of Large Networks by Metabolomics Integration
Christian Jungreuthmayer, Austrian Centre of Industrial Biotechnology, Austria
Matthias Gerstl, Austrian Centre of Industrial Biotechnology, Austria
Juergen Zanghellini, Austrian Centre of Industrial Biotechnology, Austria
Short Abstract: In the presentation we will introduce the theoretical concept of our novel approach, discuss the main aspects of its numerical implementation and illustrate the biological relevance. Then, we will give a brief demonstration of our toolkit, which is open source software and freely available for everyone from our website. Our presentation will go beyond published work in that we show that the number of relevant pathways can be reduced even further. By means of a novel method based on linear programming we show that only small subsets of all pathways can simultaneously carry a thermodynamically feasible flux.
We identify these phenotypically relevant subsets in a medium scale E. coli model and show that they are characterized by their ability to maximize biomass and ATP production, consistent with evolutionary interpretations of cell behavior.
O102 - Comprehensive map of molecular interactions involved in response of innate immunity to tumour development: application for omics data analysis
Maria Kondratova, Institut Curie, France
Vassili Soumelis, Laboratoire d'Immunologie Clinique, Institut Curie, France
Emmanuel Barillot, Institut Curie, INSERM U900, Ecole des Mines ParisTech, France
Andrei Zinovyev, Institut Curie, INSERM U900, Ecole des Mines ParisTech, France
Inna Kuperstein, Institut Curie, INSERM U900, Ecole des Mines ParisTech, France
Short Abstract: Variability in the presence of particular innate immunity cellular subsets in the tumor microenvironment is an important prognostic factor for many cancer types. However, not only the cytological composition but also the functional status (activated or inhibited) of those cells is important for creating a prognostic signature. Expression data from tumor samples contain this information, but extracting and interpreting immunological signaling from genome-wide expression profiles requires a systematic and formalized representation of the molecular mechanisms involved in tumor-immunity crosstalk.
Based on experimental data retrieved from the literature by manual curation, we have constructed an integrated signalling network of the innate immune response in the tumor microenvironment. The map compiles information about three major players: macrophages, dendritic cells and natural killer cells. It represents both the intracellular signaling involved in the response of each type of immune cell and the intercellular signalling describing crosstalk between different immune cell types and tumor cells. An interactive version of the map is supported by NaviCell technology (1) and will soon be integrated into the Atlas of Cancer Signaling Networks (
The maps were applied to the visualization and analysis of expression data from different molecular subtypes of breast-cancer tumors and revealed a significant correlation between activation of immunosuppressive signaling modules and tumor aggressiveness. The new resource is a powerful and flexible tool for modelling pathways involved in tumorigenesis and immune response and for analysing expression data.
1. Kuperstein I. et al. NaviCell: a web-based environment for navigation, curation and maintenance of large molecular interaction maps. BMC Syst. Biol. 7, 100 (2013) DOI: 10.1186/1752-0509-7-100
O103 - Modeling Wnt/β-catenin signaling
Annika Jacobsen, VU University Amsterdam, Netherlands
Renée van Amerongen, Swammerdam Institute for Life Sciences, University of Amsterdam, Netherlands
Nika Heijmans, Swammerdam Institute for Life Sciences, University of Amsterdam, Netherlands
Martine J. Smit, Division of Medicinal Chemistry, VU University Amsterdam, Netherlands
Jaap Heringa, Centre for Integrative Bioinformatics (IBIVU), VU University Amsterdam, Netherlands
K. Anton Feenstra, Centre for Integrative Bioinformatics (IBIVU), VU University Amsterdam, Netherlands
Short Abstract: The Wnt/β-catenin signaling pathway is crucial for cell renewal, proliferation and differentiation during early development in stem cells, but should be attenuated in mature cell types. Ongoing Wnt/β-catenin signaling caused by specific mutations plays an important part in oncogenesis. A better understanding of these signaling mechanisms in conjunction with known mutations is therefore crucial. We have constructed a Petri net model of Wnt/β-catenin signaling to capture the behavior of developmental and oncogenic signaling. The main players included are the Wnt receptor and the so-called destruction complex (DC), together with two of its crucial components: AXIN2 and GSK3. The activated receptor sequesters the DC, thereby preventing it from marking β-catenin for degradation. The model includes an important feedback loop in which β-catenin upregulates AXIN2 expression. Simulations of the model with Wnt stimulation or GSK3 inhibition led to the following observations: 1) β-catenin levels under GSK3 inhibition were significantly higher than under Wnt stimulation, while transcriptional activity of AXIN2 was comparable for both conditions; this was validated by western blot and TCF/LEF luciferase reporter assay. This suggests that the low levels of β-catenin from Wnt stimulation are sufficient for transcriptional activation. 2) The feedback from AXIN2 only has a negative effect under Wnt ligand stimulation, where it restores AXIN cytoplasmic levels. 3) Using this model we also predicted the behavior of oncogenic Wnt/β-catenin signaling resulting from different APC mutations found in breast and colorectal cancer, respectively. In summary, our model can be used to explain plausible underlying mechanisms for oncogenic signaling.
O104 - Systems Pharmacology as a tool for future therapy development: a feasibility study on the cholesterol biosynthesis pathway
Joanna Sharman, University of Edinburgh,
Helen Benson, University of Edinburgh,
Steven Watterson, University of Ulster,
Chido Mpamhanga, MRC Technology,
Christopher Southan, University of Edinburgh,
Peter Ghazal, University of Edinburgh,
Short Abstract: Although in its relative infancy, Systems Pharmacology has the potential to inspire a novel range of much needed medical interventions. Public databases such as the IUPHAR/BPS Guide to PHARMACOLOGY ( provide curated, quantitative information on the pharmacological effects of drugs and other chemical substances at their protein targets. Increased understanding of how these components come together in biological systems now affords the opportunity to quantify, predict and model the effects of drug administration on whole systems. Furthermore, we can go on to ask how multiple drugs can be used together in complement to reprogram whole systems, maximising efficacy and minimising side-effects, and examine the potential for therapeutic strategies to be adapted for personal genetic profiles.
Here, we review and explore the feasibility of such an approach at the current point in time. We undertake a systems analysis of the mevalonate branch of the cholesterol biosynthesis pathway and demonstrate that current data from the literature and online databases are not yet sufficient to facilitate a full Systems Pharmacology study (mainly due to their ad-hoc production and the lack of a systematic programme). We describe the limitations in the data (in particular lack of systematic reporting, incomplete coverage and inaccuracy or ambiguity of recording) and suggest future solutions, based on the introduction of systematic programmes and standards acceptable to both the pharmacology and computational biology communities.
O105 - Targeted destruction of receptors of cancer cells by porphyrins
Aram Gyulkhandanyan, Institute of Biochemistry of NAS of Armenia, Armenia
Grigor Gyulkhandanyan, Institute of Biochemistry of NAS of Armenia, Armenia
Short Abstract: The epidermal growth factor receptor (EGFR) is a transmembrane protein whose overexpression alters the state of the cell and leads to tumor growth. Binding of the natural peptide ligands to domains I and III of the extracellular domain of EGFR causes conformational rearrangements that lead to dimerization of the intracellular domains of the receptors, which converts cells to an oncological state. Together with scientists from the University of Nantes, we have shown that some small compounds (the non-peptide compound nitro-benzoxadiazolyl (NBD)) can bind specifically to the dimerization domain of EGFR, which promotes the formation of stable dimers and the launch of the oncological process. Using molecular docking simulations, we showed that NBD has high affinity for different sites of EGFR, including domains I and III. NBD showed the highest affinity for a site between two macromolecules of the extracellular domain of EGFR. Molecular docking also showed high affinity of both EGF and NBD for cationic porphyrins, with formation of the complexes [NBD + porphyrin] and [EGF + porphyrin]. Porphyrins accumulate selectively in tumors and, upon illumination, generate reactive oxygen species that destroy cancer cells. This suggests that when a complex of type [NBD + porphyrin] or [EGF + porphyrin] binds with affinity to the extracellular dimerization domains of EGFR and is illuminated photodynamically, the reactive oxygen species can destroy those EGFR domains, preventing dimerization and the onset of cancer.
O106 - A Cytoscape plugin for the integration and visualization of time-series data in biological networks
Christian Nørskov, University of Southern Denmark, Denmark
Richard Röttger, University of Southern Denmark, Denmark
Jan Baumbach, University of Southern Denmark, Denmark
Short Abstract: Network visualization is often the first step of a network analysis. However, the visualization of biological time-series data still poses a major challenge: such a visualization has to preserve information about the network's topology as well as its changes over time as well as possible. This easily leads to high complexity and reduces interpretability.
To reduce this visual complexity, one can partition the given network into subgraphs which exhibit similar behavior over time. Clustering methods can be used to identify such subgraphs, where nodes of a single subgraph show a predefined time-series expression profile. Entities sharing the same profile might exhibit similar cellular functions and can give clues about the global behavior of the biological system at the given time points.
Here we introduce a new approach, including a Cytoscape plugin, that clusters time-series data into groups of time-series patterns. Further, it identifies patterns that are overrepresented and uses these patterns for visualization. The Cytoscape plugin gives a simplified visualization of the biological networks, highlighting groups that follow the same time-series pattern.
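As a rough illustration of grouping nodes by time-series pattern, the sketch below discretizes each profile into up/down/flat steps and collects nodes that share the same step sequence. The abstract does not describe the plugin's clustering method, so the discretization, the `tolerance` parameter and the function names are all assumptions made for this example.

```python
from collections import defaultdict


def trend_pattern(values, tolerance=0.1):
    """Encode a time series as a string of U (up), D (down) and - (flat) steps."""
    steps = []
    for previous, current in zip(values, values[1:]):
        delta = current - previous
        if delta > tolerance:
            steps.append("U")
        elif delta < -tolerance:
            steps.append("D")
        else:
            steps.append("-")
    return "".join(steps)


def group_by_pattern(profiles, tolerance=0.1):
    """Map each observed pattern to the list of nodes that follow it."""
    groups = defaultdict(list)
    for node, values in profiles.items():
        groups[trend_pattern(values, tolerance)].append(node)
    return dict(groups)
```

Counting the members of each group would then give a simple starting point for asking which patterns are overrepresented.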
O107 - Harnessing a large collection of sparse and noisy gene perturbation data to discover mammalian causal gene regulatory networks
Djordje Djordjevic, Victor Chang Cardiac Research Institute, Australia
Andrian Yang, Victor Chang Cardiac Research Institute, Australia
Shu Lun Shannon Kwan, Victor Chang Cardiac Research Institute, Australia
Joshua W. K. Ho, Victor Chang Cardiac Research Institute, Australia
Short Abstract: Gene regulatory networks (GRNs) play a central role in systems biology. Recent research findings, including ours on mammalian causal GRNs[1], showed that it is virtually impossible to infer causal GRNs in eukaryotes without using gene perturbation data. Our initial work on manually curating over 6,000 experimental results from genetic or molecular perturbation data enabled us to infer >3,000 causal gene regulatory interactions among >1,000 genes across multiple tissues during mammalian embryonic development. This approach has already enabled us to uncover biologically useful causal GRNs for multiple organs. We present our results as web-based community resources for early tooth development (, ocular lens development and cataract formation (, and heart development and the investigation of congenital heart disease (
Realising that a vast amount of gene perturbation data exists in databases such as NCBI GEO, we have developed an automated pipeline to (i) identify studies that have employed a genetic or molecular perturbation experimental design, (ii) accurately associate phenotypic annotation with each sample, and (iii) extract robust differentially expressed genes from each perturbation dataset. We further present a probabilistic model that can combine the diverse and noisy data to infer cell-type-specific or other context-specific causal GRNs. Finally, we present a network-based algorithm to identify the minimum set of upstream regulators spanning a candidate gene set at multiple levels, and demonstrate the biological significance of the results.
[1] Djordjevic et al. (2014) How difficult is inference of mammalian causal gene regulatory networks? PLoS One, 9(11), e111661
O108 - A Logical Model of Hsp27 role in Prostate Cancer progression to Resistance
Elisabeth Remy, I2M, Université d’Aix-Marseille, CNRS, France
Abibatou Mbodj, I2M, Université d’Aix-Marseille, CNRS, France
Anaïs Baudot, I2M, Université d’Aix-Marseille, CNRS, France
Laurence Calzone, Institut Curie, France
Short Abstract: Hsp27 is a chaperone involved in numerous biological processes [1] and in many cancers, such as castration-resistant Prostate Cancers (PC) in which it has been found highly over-expressed. Although inhibiting Hsp27 with an antisense oligonucleotide provokes tumor regression [2], the precise functions and regulations of Hsp27 in PC remain unknown.
We constructed an over-simplified phenomenological model using logical formalism [3], including key genes (Androgen Receptor, AKT, BCL2 and PTEN) necessary to roughly sketch the onset and progression of PC: loss of PTEN (the cells proliferate), drug treatment (the cells die massively by apoptosis) and over-expression of BCL2 (inhibition of the apoptotic pathway, and treatment resistance).
This phenomenological model aims to ease the comprehension and analysis of a more detailed model of Hsp27 regulation. Relying on an extensive literature survey, we devised a network containing dozens of genes and interactions from pathways involved in proliferation, survival and apoptosis, upstream and downstream of the Hsp27 protein. Finally, this detailed model is enriched with mass-spectrometry proteomics profiles recently generated for cell lines representing the different stages of PC progression. Overall, dynamical analyses of the Hsp27 models allow us to propose a sequence of events for the progression from healthy cells to hormone-sensitive cells to drug resistance and uncontrollable proliferation.

[1] Katsogiannou M et al. (2014) MCP 13(12):3585-601.
[2] Rocchi P et al. (2005) Cancer Res. 65:11083-93.
[3] Thomas R. (1991) JTB 57:247-76.
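The three stages named in the abstract (PTEN loss, drug treatment, BCL2 over-expression) can be mimicked with a toy Boolean sketch. The rules below are assumptions chosen only to reproduce those three qualitative statements; they are not the authors' logical model, the Androgen Receptor is left out because its rule is not stated, and the function name is hypothetical.

```python
def toy_prostate_model(pten, drug, bcl2):
    """Evaluate a toy Boolean readout for given input states.

    Assumed rules (illustrative only):
      AKT is active when PTEN is lost.
      Apoptosis occurs under drug treatment unless BCL2 is over-expressed.
      Cells proliferate when AKT is active and they are not dying.
    """
    akt = not pten
    apoptosis = drug and not bcl2
    proliferation = akt and not apoptosis
    return {"AKT": akt, "apoptosis": apoptosis, "proliferation": proliferation}


# Scenarios following the order described in the abstract.
scenarios = {
    "healthy": dict(pten=True, drug=False, bcl2=False),
    "PTEN loss": dict(pten=False, drug=False, bcl2=False),
    "PTEN loss + drug": dict(pten=False, drug=True, bcl2=False),
    "PTEN loss + drug + BCL2": dict(pten=False, drug=True, bcl2=True),
}
for name, inputs in scenarios.items():
    print(name, toy_prostate_model(**inputs))
```

A full logical model would instead define an update rule per component and analyse the resulting state transition graph and its attractors.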
O109 - GXB: A Tool for Expression Analysis and Data Integration
Scott Presnell, Benaroya Research Institute, United States
Kelly Domico, Benaroya Research Institute, United States
Brad Zeitner, Benaroya Research Institute, United States
Anna Bjork, Benaroya Research Institute, United States
David Anderson, Benaroya Research Institute, United States
Elizabeth Whalen, Benaroya Research Institute, United States
Nicole Baldwin, Baylor Institute for Immunology Research, United States
Michael Mason, Benaroya Research Institute, United States
Cate Speake, Benaroya Research Institute, United States
Damien Chaussabel, Sidra Medical and Research Center, Qatar
Charlie Quinn, Benaroya Research Institute, United States
Short Abstract: Systems immunology based experiments generate enormous amounts of data, yet much of the information contained in those data remains untapped. What is needed is the ability to extract further insights from the data and to compare and correlate them with other sources of data. To this end, it is important to develop information systems and tools that enable investigators to integrate the different types of experimental data and to exploit the increasing amount of untapped data that are available both within research institutions and from public data sources.

We present a web-based analysis and visualization tool, designed to facilitate the integration and exploration of gene expression data and associated systems-level experiments. The Gene Expression Browser™ (GXB) provides a detailed evaluation of expression levels from both microarray and RNA-seq based transcriptome experiments. It supports the calculation and interactive display of differentially expressed gene lists, while also integrating results from immune-monitoring experiments. It simplifies the inclusion of clinical, demographic, and other laboratory data. In addition, GXB contains communication tools that facilitate the sharing of ideas and discoveries between investigators, thereby increasing the pace of discovery and research impact.

The Gene Expression Browser is available for download from GitHub as gxbrowser.
O110 - PetriScape - A plugin for discrete Petri net simulations in Cytoscape
Diogo Marinho Almeida, Syddansk Universitet, Denmark
Richa Batra, Syddansk Universitet, Denmark
Vasco Azevedo, Universidade Federal de Minas Gerais, Brazil
Artur Silva, Universidade Federal do Para, Brazil
Jan Baumbach, Syddansk Universitet, Denmark
Short Abstract: Systems biology plays a central role in biological network analysis in the post-genomic era. Cytoscape is the standard bioinformatics tool, offering the community an extensible platform for computational analysis of emerging cellular networks together with experimental omics data sets. However, only a few apps/plugins/tools are available for simulating network dynamics in Cytoscape 3. Many approaches of varying complexity exist, but none of them has yet been integrated into Cytoscape as an app/plugin. Here, we introduce PetriScape, the first Petri net simulator for Cytoscape. Although discrete Petri nets are quite simplistic models, they are capable of modeling global network properties and simulating their behaviour. In addition, they are easy to understand and to visualize. PetriScape comes with the following main functionalities: (1) import of biological networks in SBML format, (2) conversion into a Petri net, (3) visualization as a Petri net, and (4) simulation and visualization of the token flow in Cytoscape. It allows straightforward Petri net model creation, simulation and visualization with Cytoscape, providing clues about the activity of key components in biological networks. PetriScape is publicly available as an app/plugin from the Cytoscape App Store.
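Token flow in a discrete Petri net can be sketched in a few lines. This is a generic illustration, not PetriScape's implementation: the dictionary layout for transitions, the function names and the rule of always firing the first enabled transition are assumptions made for this example.

```python
def is_enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= count
               for place, count in transition["consume"].items())


def fire(marking, transition):
    """Return the new marking after moving tokens through one transition."""
    updated = dict(marking)
    for place, count in transition["consume"].items():
        updated[place] = updated.get(place, 0) - count
    for place, count in transition["produce"].items():
        updated[place] = updated.get(place, 0) + count
    return updated


def simulate(marking, transitions, max_steps):
    """Fire the first enabled transition at each step; stop when none is enabled."""
    history = [dict(marking)]
    for _ in range(max_steps):
        ready = [t for t in transitions if is_enabled(marking, t)]
        if not ready:
            break
        marking = fire(marking, ready[0])
        history.append(marking)
    return history


# Hypothetical two-place net: each firing converts one token of A into one of B.
net = [{"consume": {"A": 1}, "produce": {"B": 1}}]
for state in simulate({"A": 2, "B": 0}, net, max_steps=5):
    print(state)
```

Other firing policies, such as choosing randomly among enabled transitions, fit the same structure by changing only the selection line in `simulate`.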
O111 - Pathway-based integration of time-series omics data using public database knowledge
Tim Beissbarth, University Medical Center Goettingen, Germany
Astrid Wachter, University Medical Center Goettingen, Germany
Short Abstract: Background: The increased generation of omics data on different functional levels of the cell represents a constantly growing challenge for their analysis and interpretation. Time-series data add another dimension of complexity but likewise enable deeper characterization of biological processes.

Method: We developed a straightforward systems biology approach for the integrative analysis of time-series data from different high-throughput technologies based on pathway and interaction models from public databases. Implemented in our software tool pwOmics, this approach performs pathway-based, level-specific data comparison of coupled human proteomic and genomic/transcriptomic data sets. Separate downstream and upstream analyses are performed on the functional levels of pathways, transcription factors and genes/transcripts and integrated in the cross-platform consensus analysis. Via network reconstruction and inference methods (Steiner tree, dynamic Bayesian network inference), consensus graphical networks provide detailed insight into dynamic regulatory processes.

Results: With this approach we investigated a public data set comprising time-course mass-spectrometry and microarray data from EGF signaling in human mammary epithelial cells. We found a moderate intersection of signaling molecules on the different cellular levels, pointing to the inherent biases of the measurement techniques. In addition, regulatory time profiles could be identified that help in understanding complex pathway interdependencies.

Conclusion: Integration of paired high-throughput time-series data enables a comprehensive interpretation of time-dependent signaling. Our approach exploits public database knowledge and paired omics data from different platforms in order to generate hypotheses on the succession of underlying pathway interplay mechanisms.

Keywords: high-throughput data, time-series experiment, cellular level, pathway, public database
O112 - Network Modeling Identifies Personalized Therapeutic Strategies in Glioblastoma
Nurcan Tuncbag, Massachusetts Institute of Technology, United States
Short Abstract: Glioblastoma multiforme (GBM) is the most common and aggressive type of malignant human brain tumor. Molecular profiling experiments using gene expression and proteomics have revealed that these tumors are extremely heterogeneous, and this heterogeneity is one of the principal challenges for developing targeted therapies. We hypothesize that despite the diverse molecular profiles, it might still be possible to identify common signaling changes that could be targeted in some or all tumors. Using a network modeling approach, we reconstruct the altered signaling pathways from tumor-specific phosphoproteomic data and known protein-protein interactions. We then develop a network-based strategy for selecting therapeutic targets that were predicted by the models but not directly observed in the experiments. Among these hidden targets, we show that the mitogen activated protein kinase kinase 1 (MEK1) displays increased phosphorylation in all tumor lines compared to normal cells. By contrast, we show that protein numb homolog (NUMB) is present only in a subset of the tumors. We evaluate clinical data and find that presence of NUMB is directly correlated with the invasiveness of the tumors. Overall, our results demonstrate that despite the heterogeneity of the proteomic data, network models can identify common or tumor specific pathway-level changes. These results represent an important proof of principle that can improve the target selection process for personalized medicine.
O113 - A local method for the evaluation of gene regulatory network inference based on three nodes graphlets
Alberto Jesus Martin, Computational Biology Lab (DLab), Chile
Calixto Dominguez, Bioinformatics and Genome Biology, Chile
Alejandro Bernardin, Computational Biology Lab (DLab), Chile
Tomas Perez-Acle, Computational Biology Lab (DLab), Chile
Short Abstract: Motivation: Networks are abstract representations widely used to depict complex biological systems. In Gene Regulatory Networks (GRNs), vertices or nodes represent genes, whereas the connections among them represent the existence or absence of a regulatory interaction. Importantly, GRNs are composed of basic building units: small induced subgraphs called graphlets. Graphlets denote local interconnectivity patterns describing functional associations of network components. Notably, the performance of methods for the inference of GRNs is commonly estimated without considering graphlets. The prevalent approach relies solely on the existence of single edges, disregarding the network functional units represented by graphlets. Therefore, we propose an assessment method that focuses on the comparison of graphlets occurring in the gold standard as well as in inferred GRNs. Under our approach, the existence or absence of graphlets is treated as a binary classification problem to which common metrics can be applied. Considering the biological importance of graphlets in GRNs, we created a quantitative metric, REC, that measures the rate of graphlet reconstruction.

Results: All metrics were applied to re-evaluate the inference of whole genome GRNs from the DREAM5 experiment. Our data suggest that the evaluation of methods for the inference of GRNs should include REC as a key element to be assessed, since the prevalent metrics tend to overlook the existence of the functional local structures encoded by graphlets. Importantly, our approach could also be used as a quantitative metric to conduct differential analysis of networks representing biological systems under different conditions.
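Treating graphlet presence as a binary classification problem can be sketched as follows. The abstract does not define REC, so this example does not attempt to reproduce it; it only computes precision and recall over connected three-node induced subgraphs, assuming directed edges and assuming a graphlet counts as recovered when the same three genes induce exactly the same edges in both networks. The function names are hypothetical, and the brute-force enumeration is only practical for small networks.

```python
from itertools import combinations


def three_node_graphlets(edges):
    """Map each connected node triple to its induced set of directed edges."""
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = sorted({n for edge in edge_set for n in edge})
    graphlets = {}
    for triple in combinations(nodes, 3):
        induced = frozenset((u, v) for u in triple for v in triple
                            if u != v and (u, v) in edge_set)
        linked_pairs = {frozenset(edge) for edge in induced}
        # Three nodes are connected when at least two of the three pairs are linked.
        if len(linked_pairs) >= 2:
            graphlets[triple] = induced
    return graphlets


def graphlet_precision_recall(gold_edges, inferred_edges):
    """Precision and recall of inferred graphlets against the gold standard."""
    gold = three_node_graphlets(gold_edges)
    inferred = three_node_graphlets(inferred_edges)
    matched = sum(1 for triple, pattern in gold.items()
                  if inferred.get(triple) == pattern)
    precision = matched / len(inferred) if inferred else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall
```

An edge-level evaluation would score each inferred edge independently; this graphlet-level view penalizes an inference that gets individual edges right but alters the local wiring pattern around them.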
