Accepted Posters

If you need assistance please contact submissions@iscb.org and provide your poster title or submission ID.

Track: Network Biology

Session A-192: Network-based identification of adaptive pathways associated with an adaptive phenotype in experimentally evolved populations

COSI: NetBio

Bram Weytjens, Gent University, Belgium
Camilo Perez Romero, Gent University, Belgium
Kathleen Marchal, Gent University, Belgium

Session A-194: HoloNet: Knowledge-Driven Holographic Visualization of Complex Biological Networks in Mixed Reality

COSI: NetBio

Dmitry Korkin, Worcester Polytechnic Institute, USA
Pavel Terentiev, Worcester Polytechnic Institute, USA

Short Abstract: We develop a proof-of-concept in visualization and manipulation of static and dynamically changing holographic molecular networks in 3D space using Microsoft HoloLens mixed reality. We apply our approach to study protein interaction networks centered around complex genetic disorders.

Session A-196: Topology Preservation of Disease-specific Networks via TFmiR

COSI: NetBio

Maryam Nazarieh, Saarland University, Germany
Hema Sekhar Reddy Rajula, Saarland University, Germany
Rahmad Akbar, Saarland University, Germany
Volkhard Helms, Saarland University, Germany

Session A-198: Simulating Differential Network Analysis

COSI: NetBio

Yvonne Lichtblau, Humboldt-Universität zu Berlin, Germany
Ulf Leser, Humboldt-Universität zu Berlin, Germany

Short Abstract: Differential network analysis (DiNA) denotes a recent class of network-based bioinformatics algorithms that focus on the differences in network topologies between two states of a cell, such as healthy and diseased, to identify key players in the discriminating biological processes. One major advantage of DiNA algorithms over conventional differential expression analysis is that they identify changes in the interplay between molecules rather than changes in single molecules. Here, we perform simulation studies to evaluate ten different DiNA algorithms regarding their ability to recover genetic key players. To this end, we construct random scale-free correlation networks of different sizes. In the disease networks we perturb the correlations between varying numbers of genes and their neighbors to varying degrees. Given the covariance structure of the generated networks, we simulate the corresponding expression data using the Cholesky decomposition. For each combination of parameter settings we perform 100 runs for each DiNA algorithm. The best performing algorithms are the local algorithm LS and the hybrid algorithm DiffRank. They have a highly significant recovery rate and are less dependent on the size of the network and the number of genes with changed correlation. We also find that the changed genes mostly do not show a change in their expression, which makes it impossible to find them with conventional differential expression analysis. Our simulations underline the advantages of comprehensive cell models for the analysis of transcriptomics data.
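The simulation step mentioned in this abstract (drawing expression data from a given covariance structure via the Cholesky decomposition) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 3-gene covariance matrix, and the sample count are hypothetical, and the only assumption taken from the abstract is that correlated samples are obtained as x = L z, where L is the Cholesky factor of the covariance matrix and z is standard normal noise.

```python
import math
import random

def cholesky(matrix):
    """Return the lower-triangular L with L * L^T == matrix.

    The input must be symmetric positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def simulate_expression(cov, n_samples, seed=0):
    """Draw n_samples correlated vectors x = L z with z ~ N(0, I)."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    n = len(cov)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Row i of L only has non-zero entries in columns 0..i.
        samples.append([sum(lower[i][k] * z[k] for k in range(i + 1))
                        for i in range(n)])
    return samples

# Hypothetical 3-gene covariance: genes 0 and 1 are correlated,
# gene 2 is independent of both.
toy_cov = [[1.0, 0.8, 0.0],
           [0.8, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
toy_samples = simulate_expression(toy_cov, n_samples=5000)
```

A "disease" condition could then be sketched by changing selected off-diagonal entries of the covariance matrix (keeping it positive definite) and simulating again; the marginal mean of each gene stays at zero, which mirrors the abstract's point that changed correlations need not show up as changed expression.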

Session A-200: Target deconvolution through network analysis of drug induced transcriptome data

COSI: NetBio

Hyesoo Jung, Ewha Womans University, South Korea
Wankyu Kim, Ewha Womans University, South Korea

Short Abstract: In the early drug discovery process, phenotypic screening is considered an alternative to target-based screening. Target-based screening requires rigorous validation of identified targets, which is challenging due to the complexity of the underlying biology. Phenotypic screening directly identifies hit compounds that produce the desired phenotypic response. The next step is to identify the targets underlying the observed pharmacological outcome, a process known as target deconvolution. Here, we applied a network diffusion approach to predict unknown targets of drugs based on a drug-induced transcriptome dataset (CMAP) and several types of gene-gene networks. Our analyses confirm that, in the networks tested, gene expression changes more frequently among genes proximal to the known drug targets than among randomly selected genes. This suggests that a network diffusion approach may be useful for the target deconvolution of phenotypic screening hits.
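The abstract does not say which diffusion scheme was used. As one common instance of network diffusion, the sketch below implements a random walk with restart on a small undirected network. The function name, the restart probability, and the five-node toy network are assumptions made for illustration only.

```python
def network_diffusion(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours.
    seeds: nodes where the walk (re)starts, e.g. genes perturbed by a drug.
    Returns a dict of steady-state visiting probabilities.
    Note: nodes without neighbours do not pass their probability on.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    current = dict(start)
    for _ in range(max_iter):
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            degree = len(adjacency[n])
            if degree == 0:
                continue
            share = (1.0 - restart) * current[n] / degree
            for neighbour in adjacency[n]:
                updated[neighbour] += share
        delta = sum(abs(updated[n] - current[n]) for n in nodes)
        current = updated
        if delta < tol:
            break
    return current

# Hypothetical chain-shaped network: A - B - C - D - E
toy_network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
toy_scores = network_diffusion(toy_network, seeds={"A"})
```

In this toy case the score decays with distance from the seed node, which is the kind of proximity signal the abstract relies on when comparing genes near known targets with randomly selected genes.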

Session A-202: Consistent regulation patterns in signed causal gene networks

COSI: NetBio

Andreas Kraemer, Qiagen Bioinformatics, USA

Short Abstract: Literature-based networks derived from observed activating and inhibiting causal relationships between genes are "signed", i.e. each edge is associated with a sign +1 or -1, determining whether the activation states ("activated" or "inhibited") of connected genes are correlated or anti-correlated. We use the Ingenuity Knowledge Base (QIAGEN) to construct a large-scale signed gene network, and apply a statistical physics approach employing quenched Monte Carlo simulations to discover gene sets ("modules") whose activity patterns are most consistent with the underlying signed network structure. In a second step, these modules are connected to consistent upstream regulators and downstream functions, allowing the construction of high-level context-specific causal networks for exploration and visualization of the underlying biology.
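To make the notion of consistency in a signed network concrete, the sketch below scores an assignment of activation states by the fraction of edges whose sign agrees with the product of the states at its two ends. It then finds the best assignment by exhaustive search, which is only feasible for tiny examples and is a stand-in for, not a reproduction of, the Monte Carlo approach described in the abstract. All names and the two toy networks are hypothetical.

```python
from itertools import product

def consistency(edges, states):
    """Fraction of signed edges (u, v, sign) with states[u] * states[v] == sign.

    States are +1 ("activated") or -1 ("inhibited").
    """
    agreeing = sum(1 for u, v, sign in edges if states[u] * states[v] == sign)
    return agreeing / len(edges)

def best_assignment(edges):
    """Exhaustively search all +1/-1 assignments (tiny networks only)."""
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    best_states, best_score = None, -1.0
    for values in product((+1, -1), repeat=len(nodes)):
        states = dict(zip(nodes, values))
        score = consistency(edges, states)
        if score > best_score:
            best_states, best_score = states, score
    return best_states, best_score

# A triangle whose signs can all be satisfied at once...
balanced = [("A", "B", +1), ("B", "C", -1), ("A", "C", -1)]
# ...and one where at least one edge must always be violated.
frustrated = [("A", "B", +1), ("B", "C", +1), ("A", "C", -1)]

balanced_states, balanced_score = best_assignment(balanced)
frustrated_states, frustrated_score = best_assignment(frustrated)
```

The second triangle shows why the problem resembles those studied in statistical physics: when the edge signs around a cycle multiply to -1, no assignment can satisfy every edge.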

Session A-204: Discovery of Adaptive Resistance Pathways and Anti-Resistance Combination Therapies in Cancer from Phosphoproteomic Data

COSI: NetBio

Augustin Luna, DFCI/HMS, USA
Ozgun Babur, OHSU, USA
Emek Demir, OHSU, USA
Gordon Mills, UT MD Anderson Cancer Center, USA
Chris Sander, DFCI/Harvard Medical School, USA
Anil Korkut, UT MD Anderson Cancer Center, USA

Session A-206: Gene regulatory network inference from time-series gene expression data using boosted trees

COSI: NetBio

Sungjoon Park, Korea University, South Korea
Wonho Shin, Korea University, South Korea
Minji Jeon, Korea University, South Korea
Jaewoo Kang, Korea University, South Korea

Short Abstract: A gene regulatory network (GRN) is a biological network expressing the regulatory relationships between transcription factors (TFs) and their target genes. Identifying GRNs is a crucial problem for understanding biological systems. Using high-throughput techniques such as microarray and RNA-Seq, gene expression data can be easily obtained and used to reconstruct GRNs. However, to more precisely capture the characteristics of a dynamic biological system, analyzing GRNs using time-series data is necessary. Various models that infer GRNs using time-series gene expression data have been proposed. However, each model has been validated on only a limited number of benchmark datasets. We first integrated all the benchmark time-series gene expression datasets from previous studies and reassessed the baseline models. We observed that the bagging-based tree ensemble model GENIE3-time achieved the best performance on the integrated dataset. GENIE3-time computes the regulatory score between a target gene and its TFs using the feature importance scores of a Random Forest (or Extra-Trees). To improve the performance of GENIE3-time, we applied Boosted Trees, powerful boosting-based ensemble tree models, when calculating the feature importance scores. We evaluated our models on the integrated benchmark dataset and achieved the best results on both AUROC (area under the receiver operating characteristic curve) and AUPR (area under the precision-recall curve) scores. Furthermore, we ranked the scores for all datasets and showed the best average rank, demonstrating the robustness of our model.

Session A-208: The New CyREST: Keys to Economically Delivering Complex and Reproducible Network Biologic Workflows

COSI: NetBio

Barry Demchak, UC San Diego, United States
Keiichiro Ono, UC San Diego, United States
David Osatek, UC San Diego, United States
Eric Sage, UC San Diego, United States

Short Abstract: The booming popularity of analytics authoring and delivery systems such as Jupyter and RStudio has enabled bioinformatic programmers to create, distribute and improve novel workflows more quickly and economically than ever before. While languages such as Python and R have access to robust and performant libraries that implement general graph operations, such libraries lack support for network biologic operations such as enrichment, complex clustering, complex layouts and visual styling, publication support, and biologic database access. To date, we have positioned Cytoscape to provide basic network construction, styling and layout capabilities via the CyREST system, which consists of language-specific libraries that broker Cytoscape functions across a REST-based network connection. In our latest work, we have extended the CyREST repertoire to enable access to the large collection of biologically relevant Cytoscape apps thus far available only to interactive users. These include complex clustering, heat propagation, network alignment, pathway analysis, regulatory interaction attributes, enrichment and ontology analysis, among others. Finally, the Cytoscape Cyberinfrastructure enables bioinformaticians to author new network analysis functions in the language of their choice (e.g., Python, golang, C++), deploy them as services in a scalable cluster, and make them available to Cytoscape as apps callable via CyREST. This extends Cytoscape to leverage large memory and CPU farms previously out of reach. By exposing Cytoscape’s app ecosystem and flexible, scalable network-biologic web services, we enable network biologists to author and distribute complex, auditable, and reproducible workflows without first redeveloping Cytoscape functionality, while still leveraging highly capable web services.

Session A-210: Project Survival, characterizing pancreatic cancer through the combination of multiomic screening and AI-reasoning

COSI: NetBio

Eric Milliman, Berg Health, United States
Leonardo Rodrigues, Berg Health, United States
Rangaprasad Sarangarajan, Berg Health, United States
John Crowley, Cancer Research and Biostatistics, United States
Amy Stoll-D’astice, Cancer Research and Biostatistics, United States
Valerie Bussberg, Berg Health, United States
Cindy Nguyun, Berg Health, United States
Kiki Panagopoulos, Berg Health, United States
Fei Gao, Berg Health, United States
Vladimir Tolstikov, Berg Health, United States
Emily Chen, Berg Health, United States
Eric Grund, Berg Health, United States
Vivek Vishnudas, Berg Health, United States
Michael Kiebish, Berg Health, United States
Manual Hidalgo, Beth Israel Deaconess Medical Center and Harvard Medical School, United States
Niven Narain, Berg Health, United States
A. James Moser, Beth Israel Deaconess Medical Center and Harvard Medical School, United States
Viatcheslav Akmaev, Berg Health, United States

Short Abstract: High throughput assays (e.g. sequencing and mass spectrometry) generate large, comprehensive, and rich datasets. Analyzing different ‘omics data streams together provides important insight into the behavior/regulation of a biological system. However, the integration and analysis of multiple ‘omics data remains a challenge and is an active area of research. Probabilistic graphical models (PGMs), e.g. Bayesian networks, are an intuitive way to represent the uncertainty and complexity of biological systems in an unbiased manner. PGMs provide a method to understand the causal relationships between diverse sets of data, such as molecular, phenotypic and clinical data. As such, drivers of outcomes can be analyzed, which represent potential biomarkers and therapeutic targets of a phenotype/disease. Project Survival is a collaborative longitudinal study between Berg, BIDMC, PCRT and CRAB to identify and validate clinical biomarkers and additional diagnostic and therapeutic molecules to improve outcomes for patients with pancreatic adenocarcinoma, which is an unmet medical need. This project’s goal is to discover and implement effective companion diagnostic panels to stratify pancreatic cancer patients based on expected therapy outcomes and, hence, define custom treatment strategies. Currently, we have interrogated the lipidome, metabolome and proteome of 164 patients diagnosed with various forms of pancreatic disease. Using our Bayesian AI platform, bAIcis™, we have assessed associations between molecular and clinical data streams. Molecular drivers of clinical endpoints have been identified and analyzed to rank potential biomarkers, which will be validated in future patients as part of Project Survival.

Session A-212: Orthology-based pathway comparison between human and animal models

COSI: NetBio

Nadezhda T. Doncheva, University of Copenhagen, Denmark
Jan Gorodkin, University of Copenhagen, Denmark
Lars Juhl Jensen, University of Copenhagen, Denmark

Short Abstract: Animal models are very important for the study of human diseases and for the development of new treatment therapies. However, it is challenging to identify the regulatory genes and pathways in an animal model that would be useful to generate reliable hypotheses about a phenotype of interest in human. In order to transfer knowledge from one species to another, it is crucial to take into account the intrinsic differences in regulation between human and animal models as well as the mechanisms underlying the specific phenotype. Here, we present a comprehensive framework for the comparison of pathways in mammalian organisms. Our analysis provides different levels of detail based on the integration of different types of data and network analysis techniques. To identify and visualize the overlap of already existing pathways, we use the orthology relationships between participating genes as indicated by the eggNOG database (http://eggnogdb.embl.de). We further integrate the pathways with tissue expression data from the TISSUES database (http://tissues.jensenlab.org/), which covers several mammalian organisms. In particular, we highlight the differences and similarities of genes involved in KEGG signaling pathways (http://kegg.jp/) between different tissue types for four organisms of interest (human, mouse, rat and pig).

Session A-214: Machine learning in the prediction and construction of protein-protein interaction networks of Cryptococcus spp.

COSI: NetBio

Henrique Toledo, Centro de Pesquisas René Rachou, Brazil
Jeronimo Ruiz, Centro de Pesquisas René Rachou, Brazil

Short Abstract: The etiological agents of cryptococcosis are species of the fungal genus Cryptococcus. The disease typically affects immunocompromised patients and represents a neglected public health problem. It is estimated that approximately 500,000 people die annually of cryptococcosis, most of them with weakened immune systems. The efficiency of cryptococcosis treatment is low and the side effects can be severe. The search for new drug targets that could contribute to cryptococcosis control can be carried out through the analysis of protein-protein interaction (PPI) networks. Nevertheless, PPI data for the genus Cryptococcus are scarce, and laboratory methods for high-throughput PPI identification are very expensive and time-consuming and can produce a high number of false positives and false negatives. In this study we are developing an ab initio approach for PPI prediction in Cryptococcus gattii and Cryptococcus neoformans using machine learning techniques. The Weighted Sparse Representation based Classifier (WSRC) methodology, which has been reported with accuracy estimates of 97%, was used together with the Global Encoding vector to build predictive models. Saccharomyces cerevisiae PPI data were used as the training dataset. The predicted networks were analyzed using Cytoscape, and they have the potential to expand the biological knowledge of Cryptococcus spp. and catalyze new discoveries.

Session A-216: Network analysis prioritizes molecular mechanisms through which mutant p53R172H increases the apoptosis resistance of Pancreatic Ductal Adenocarcinoma

COSI: NetBio

Frederick Kleine, Independent researcher, Germany

Short Abstract: The high lethality of Pancreatic Ductal Adenocarcinoma (PDAC) is partially due to its intrinsic apoptosis resistance. Previous studies demonstrated that PDAC cells with p53R172H alleles show increased apoptosis resistance compared with those with p53 wild-type alleles. To investigate the underlying molecular mechanisms, we inferred the regulatory network through which mutant p53R172H alters the expression of apoptosis genes and prioritized the information flow. Apoptosis genes were identified among the genes differentially expressed between 32 p53 wild-type and 15 p53R172H PDAC samples from a Cre-loxP mouse model. Based on these genes, our published regulatory model of mutant p53R172H, which was reconstructed from the same dataset, was reduced to the apoptosis-relevant regulatory network. Topological and distance-based methods were then employed to prioritize the dysregulated transcription factors and miRNAs by their importance in mediating the expression changes. Our study identified 66 genes implicated in apoptotic processes, and the reconstructed apoptosis network contained a regulatory prediction for 53 of them. The network analysis indicated that the mutant p53R172H-induced dysregulation of p53, Ctnnb1, Sp1 and the miR-297-669 cluster has a strong effect on the apoptosis resistance of p53R172H cells. Analyzing the distribution of apoptosis genes whose expression changes have a positive or negative effect on apoptosis further indicated that the loss of p53 wild-type gene regulation, the modulation of Ctnnb1 and the down-regulation of the miR-297-669 cluster have an apoptosis-suppressing effect.

Session A-218: R package for network-guided Genome-Wide Association Studies

COSI: NetBio

Chloé-Agathe Azencott, CBIO (Mines ParisTech - Institut Curie - INSERM), France
Héctor Climente Gonzalez, CBIO (Mines ParisTech - Institut Curie - INSERM), France

Short Abstract: Identifying the similarities, at the molecular level, between patients who exhibit similar susceptibilities is necessary to understand the differences in disease predisposition between individuals. Over the last decade, genome-wide association studies (or GWAS) have become one of the prevalent tools for detecting genetic variants correlated with a phenotype. GWAS have provided novel insights into the mechanisms of many common human diseases. However, a number of frustrating results have also been reported. Indeed, the genetic variants they have uncovered often fall short of explaining all of the phenotypic variation that is known, from family studies, to be heritable. A key reason for this "missing heritability" is the statistical difficulty of analyzing data with orders of magnitude more features than samples. One way to address this problem is to reduce the dimensionality of the space of solutions by means of prior biological knowledge. Such knowledge can in particular be given by biological networks, which provide a means to take a more holistic view of the problem. In this context, SConES (Selecting CONected Explanatory SNPs) was proposed a few years ago for quantitative phenotypes to look for SNPs that are both informative and connected on an underlying network. Here, we present an R package that facilitates the use of SConES by bioinformaticians. It incorporates statistical tests for the case/control setting and a BIC-based approach for the selection of appropriate hyperparameters that leads to improved performance on simulated data.

Session A-220: Impact of network clustering methods on pathway enrichment analysis tools

COSI: NetBio

Miguel Castresana, Scilifelab, University of Stockholm, Sweden
Christoph Ogris, Scilifelab, University of Stockholm, Sweden
Sam De Meyer, Scilifelab, University of Stockholm, Sweden
Erik Sonnhammer, Scilifelab, University of Stockholm, Sweden

Short Abstract: Pathway enrichment analysis has become a fundamental tool to gain insight into the underlying biological relation between, e.g., differentially expressed genes and biological pathways. Network-based methods have been demonstrated to outperform simpler methods based on gene overlap. However, gene sets derived from experiments are often complex and influenced by noise, decreasing analysis accuracy. The complexity (i.e. that multiple pathways are affected), the incompleteness of known pathways, and noise can lead to failure in detecting true pathways. Therefore, clustering was applied to the gene sets and pathway analysis techniques were performed on the separate modules with the objective of increasing the sensitivity. The impact of clustering was benchmarked with different pathway enrichment analysis tools.
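The idea of running enrichment on separate modules rather than on the whole gene set can be sketched with the simple gene-overlap (hypergeometric) test. Note that this is the "simpler" overlap-based kind of method the abstract contrasts with network-based tools, used here only because it is compact; the gene identifiers, the pathway, and the two modules are hypothetical, and the modules are assumed to come from a clustering step that is not shown.

```python
from math import comb

def hypergeom_pvalue(universe_size, pathway_size, set_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe, pathway, set)."""
    upper = min(pathway_size, set_size)
    total = comb(universe_size, set_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, set_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def enrichment(genes, pathway, universe):
    """Overlap-based enrichment p-value of one gene set for one pathway."""
    return hypergeom_pvalue(len(universe), len(pathway), len(genes),
                            len(genes & pathway))

# Hypothetical data: 100 genes, a 10-gene pathway, and a noisy 20-gene
# experimental gene set that contains 5 pathway genes.
universe = {f"g{i}" for i in range(100)}
pathway = {f"g{i}" for i in range(10)}
gene_set = {f"g{i}" for i in range(5)} | {f"g{i}" for i in range(50, 65)}

# Assumed output of a clustering step that splits the gene set in two.
module_a = {f"g{i}" for i in range(5)} | {"g50"}
module_b = {f"g{i}" for i in range(51, 65)}

p_whole = enrichment(gene_set, pathway, universe)
p_module_a = enrichment(module_a, pathway, universe)
p_module_b = enrichment(module_b, pathway, universe)
```

In this constructed example the pathway signal is concentrated in one module, so that module yields a much smaller p-value than the full gene set. Testing several modules also adds multiple comparisons, which a real analysis would need to correct for.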

Session A-222: Controllability and identifying drug targets of metabolic networks in cancer

COSI: NetBio

Dwitiya Tyagi, Åbo Akademi, Finland
Krishna Kanhaiya, Åbo Akademi, Finland
Ion Petre, Åbo Akademi, Finland

Short Abstract: Genome-scale metabolic models have proven valuable for characterizing cancer and for indicating its severity. However, identifying effective metabolic drug targets for active small-molecule compounds remains difficult, and there are still unmet challenges to solve. In this study, we propose a network analysis of enzyme- and metabolite-centric networks to identify targets in breast cancer data. We have applied topological network analysis to identify clusters that are crucial in controlling the system and that provide useful parameters for defining the relationships between topological features and drug targets. We show that both enzymes and metabolites can be effective targets, and that high-degree metabolites can be driver nodes in the network. We also compare the analysis of cancer networks with that of corresponding random networks to measure how the set of predicted drug targets changes. Furthermore, principal component analysis defines the overall metabolic states in the systems, and correlation analysis identifies the link between drug targets and enzymes. Overall, we show that a better understanding of the metabolic networks of cancer through statistical modeling could be useful in drug-target identification for efficient therapeutic approaches and personalized medicine.

Session A-224: ndexr – an R package to interface with the Network Data Exchange

COSI: NetBio

Florian Auer, Department of Medical Statistics, University Medical Center Göttingen, Germany
Dexter Pratt, Department of Medicine, University of California San Diego, United States
Trey Ideker, Department of Medicine, University of California San Diego, United States
Frank Kramer, Department of Medical Statistics, University Medical Center Göttingen, Germany

Short Abstract: Motivation: Seamless exchange of biological network data enables bioinformatic algorithms to integrate networks as prior knowledge input as well as to document resulting network output. However, the interoperability between pathway databases and various methods and platforms for analysis is currently lacking. NDEx, the Network Data Exchange, is an open-source data commons that facilitates the user-centered sharing and publication of networks of many types and formats. Results: Here, we present a software package that allows users to programmatically connect to and interface with NDEx servers from within R. The network repository can be searched and networks can be retrieved and converted into igraph-compatible objects. These networks can be modified and extended within R and uploaded back to the NDEx servers. Availability: ndexr is a free and open-source R package, available via GitHub (https://github.com/frankkramer-lab/ndexr) and has been submitted to Bioconductor.

Session A-226: Interaction between polyglutamine genes ATN1 and ATXN2 coincides with regions associated to Huntington's Disease

COSI: NetBio

Arlin Keo, LUMC / TU Delft, Netherlands
Ahmad Aziz, LUMC, Netherlands
Oleh Dzyubachyk, LUMC, Netherlands
Jeroen van der Grond, LUMC, Netherlands
Willeke Roon-Mom, LUMC, Netherlands
Boudewijn Lelieveldt, LUMC / TU Delft, Netherlands
Marcel Reinders, TU Delft / LUMC, Netherlands
Ahmed Mahfouz, LUMC / TU Delft, Netherlands

Short Abstract: Expansions of the CAG repeat in nine polyglutamine (polyQ) genes (HTT, ATXN1, ATXN2, ATXN3, CACNA1A, ATXN7, ATN1, AR, and TBP) cause neurodegenerative diseases including Huntington's disease (HD) and spinocerebellar ataxias (SCAs). These polyQ diseases are characterized by different patterns of brain atrophy. The expanded CAG repeat length in the causal gene negatively affects the age-at-onset (AAO). Additionally, the CAG repeat length in polyQ genes other than the causal gene also affects AAO, suggesting functional associations between the polyQ genes. However, there is no detailed assessment of the interactions among polyQ genes in pathologically relevant regions of the brain. We used gene co-expression analysis to study the functional relationships between polyQ genes in different brain regions using the Allen Human Brain Atlas (AHBA), a spatial map of gene expression in the healthy brain. We constructed co-expression networks for seven anatomical structures as well as a region associated with magnetic resonance imaging (MRI). In the HD-associated region, we found that ATN1 and ATXN2 are co-expressed and functionally related through other co-expressed genes. We observed the same association in the frontal lobe, parietal lobe, and striatum, which are structures involved in HD pathology. Across the brain, the two genes also share many co-expressed genes with HTT, forming a triangular relationship. The observed interactions of HTT, ATN1, and ATXN2 may be dysregulated in all polyQ diseases. However, the stronger interaction between ATN1 and ATXN2 observed in the HD-associated region, and even more strongly in the striatum, may be more specific to HD.

Session A-228: A pipeline for the simulation of integrated KEGG pathways

COSI: NetBio

Adva Yeheskel, The Bioinformatics Unit, Faculty of Life Sciences, Tel Aviv University, Israel
Adam Reiter, School of Computer Science, Faculty of Exact Sciences, Tel Aviv University, Israel
Metsada Pasmanik-Chor, The Bioinformatics Unit, Faculty of Life Sciences, Tel Aviv University, Israel
Amir Rubinstein, School of Computer Science, Faculty of Exact Sciences, Tel Aviv University, Israel

Short Abstract: Drugs are designed to target specific genes or pathways. However, researchers and clinicians often observe unexpected severe changes and side effects resulting from unknown drug-gene interactions. Moreover, the molecular events taking part in the response to a drug are not always fully known. Our goal was to implement a pipeline for predicting the molecular effects of drugs on complex networks. We have built a simple, free and intuitive Cytoscape application [1] designed for non-programmer biologists. The pipeline enables importing and merging multiple pathways from the KEGG database [2], loading expression levels from experiments or databases such as GEO (NCBI), and simulating network behaviour using the BioNSi (Biological network simulation) tool [3]. Heregulin (HRG) is known to bind and inhibit ErbB receptors, and is therefore known as a target for cancer therapy. The ErbB signaling, MAPK and EGFR pathways were selected for simulation analysis of the molecular events following the effect of the HRG drug. Microarray gene expression data were obtained for control and for heregulin-treated MCF7 cells (GSE6462 [4]). An in-silico simulation of HRG inhibition was performed on the control expression data. The pipeline’s multi-pathway HRG simulation enables prediction and visualization of network dynamics, and presents central hub genes, many in common with the experimental gene expression results, that suggest a model of the molecular mechanism of the HRG drug effect. References: 1. Shannon P. et al. (2003) Genome Research 13, 2498. 2. Minoru K. et al. (2017) Nucleic Acids Res 45, D353. 3. Rubinstein,A. et al. (2016) J. Proteome Res. 15, 2871. 4. Kim,Y. et al. (2011) Bioinformatics 27, 391.

Session A-230: Semantic modeling of protein-protein interactions

COSI: NetBio

Laleh Kazemzadeh, Insight Centre for Data Analytics, Ireland
Md. Rezaul Karim, Insight Centre for Data Analytics, Ireland
Dietrich Rebholz-Schuhmann, Insight Centre for Data Analytics, Ireland
Frank Barry, Regenerative Medicine Institute (REMEDI), Ireland

Short Abstract: The amount of biomedical data produced by DNA sequencing, by curated knowledge on disease mechanisms and treatments, by pharmaceutical research and by many other data generation studies is growing at a rapid pace. This wealth of biomedical data is a precious resource for integrative research studies, which draw conclusions through the analysis of multiple heterogeneous data sources for knowledge discovery. However, data integration and data interpretation have to overcome a number of hurdles that result from the characteristics of the biomedical data sources, including challenges from data diversity in protein labelling, data consistency, analogy, availability and interoperability. The aim of this work is to harness the capability of big biomedical data by integrating its artefacts through the application of Semantic Web and Linked Data principles in the extraction of potential protein-protein interactions. A semantic model for protein-protein interaction networks has been developed in this work and is used to identify explicit knowledge on protein interactions. The data are exposed through a visual analytics platform, LinkedPPI, which is optimised for intuitive data exploration. A selection of potential protein interactions has then been experimentally validated in order to demonstrate the validity of the predicted interactions. The positive outcomes of the experimental validation demonstrate that the semantically integrated data form an effective means for selecting the most relevant and as yet unknown protein interaction candidates.

Session A-232: Evaluation of network reconstruction methods and their application to early embryonic stem cell dynamics

COSI: NetBio

Robert Sehlke, Max Planck Institute for Biology of Ageing, Germany
Marius Garmhausen, CECAD, University of Cologne, Germany
Peri Tate, Sanger Institute, United Kingdom
Frank Buchholz, TU Dresden, Germany
Francis Stewart, TU Dresden, Germany
Bill Skarnes, Sanger Institute, United Kingdom
Andreas Beyer, CECAD, University of Cologne, Germany

Short Abstract: The reconstruction of gene-regulatory networks from incomplete molecular data remains one of the most challenging tasks in systems biology. Frequently, the only data available are medium- to large-scale gene perturbation assays, resulting in generally underdetermined systems. The problem is compounded further by the tendency of common classifiers to identify spurious edges due to transitivity. To address these issues, regularized regression methods such as LASSO have been widely employed, alongside tools aimed explicitly at the identification of spurious relationships. Here, we systematically evaluate variants of LASSO regression in conjunction with stability selection. To further improve prediction accuracy, we compare recent methods for the discrimination of direct from indirect relationships: regularized partial correlations, distance partial correlation, local transitive reduction, and network deconvolution. We take note of interesting properties of these methods, e.g. a bias of regression-based methods that causes negatively correlated nodes to be discovered at a lower frequency but with much higher precision. Combining these tools into a pipeline, we demonstrate good performance on in-silico gene perturbation data at different levels of sparsity, in the presence of non-linear effects. We then applied this pipeline to an RNAi assay of 116 selected genes in mouse embryonic and epiblast-derived stem cells. After recovering many known and putative players of early lineage determination and their dynamic associations in each condition, we validated them using independent time courses of stem cell differentiation.

Session A-234: DeepGRN: Deciphering gene deregulation in cancer development using deep learning

COSI: NetBio

Roland Mathis, IBM Research Zurich, Switzerland
Matteo Manica, ETH-IMSB//IBM Research, Switzerland
Maria Rodriguez Martinez, IBM, Zurich Research Laboratory, Switzerland

Short Abstract: Understanding gene regulatory networks (GRNs) is key to deciphering gene deregulation in cancer development. We are building on previous efforts to find tissue-specific and disease-specific gene regulatory networks (FANTOM5). While large efforts have been devoted to creating context-specific GRNs for a range of tissues as well as diseases, most currently available cancer GRNs are inferred from unmatched datasets for which only the diseased tissue is available. Our goal is to find disease-specific changes of gene regulation using matched normal and tumor patient data in a cohort-specific fashion. We propose DeepGRN, a deep learning model that enables us to find cohort-specific disease-induced changes in the GRNs of cancer patients by learning the interactions from RNASeq measurements and reported tissue-specific interactions. We apply DeepGRN to two prostate cancer cohorts: TCGA-PRAD and ProCOC (Zurich University Hospital). For each prostate-specific interaction reported in FANTOM5 we use as features the joint RNASeq measurements from the two interacting genes of the patients in a given cohort. We then train a deep learning network on the data from normal patient samples to learn transcription factor-target interactions in the disease-free state. Once the model has been trained on normal samples, we predict the GRNs and their deregulation for the tumor samples, highlighting differences in the regulation process between normal and tumor samples. DeepGRN can be used to generate hypotheses for detection of new biological processes relevant for cancer onset and development and puts forward a novel approach to drive drug discovery and suggest targeted therapies.

Session A-236: Access and Discover Biological Pathway Information from Pathway Commons

COSI: NetBio

Gary Bader, University of Toronto, Canada
Emek Demir, Oregon Health & Science University, United States
Ozgun Babur, Oregon Health and Science University, United States
Chris Sander, cBio Center at the Dana-Farber Cancer Institute and at Harvard Medical School, United States
Igor Rodchenkov, University of Toronto, Canada
Augustin Luna, cBio Center at the Dana-Farber Cancer Institute and at Harvard Medical School, United States
Jeffrey Wong, University of Toronto, Canada

Short Abstract: Pathway Commons (www.pathwaycommons.org/) serves researchers by integrating data from public pathway and interaction databases and disseminating it in a uniform fashion. The knowledge base comprises metabolic pathways, genetic interactions, gene regulatory networks and physical interactions involving proteins, nucleic acids, small molecules and drugs. Alongside attempts to increase the scope and types of data, a major focus has been the creation of user-focused tools and resources that facilitate access, discovery and application of existing pathway information in the day-to-day activities of biological researchers. For those wishing to browse and discover pathways within the collection, we offer a web-based ‘Search’ application that enables users to query by keyword and visualize ranked search results. ‘PCViz’ is a web tool that accepts gene names and returns a customizable interaction network visualization based upon pathway data resources. These complement existing desktop software add-ons linking Pathway Commons to the Cytoscape (CyPath2) network analysis tool and the R (paxtoolsr) programming language. To facilitate analysis and interpretation of experimental data - for instance, enrichment studies that distill pathway alterations from underlying gene expression changes - pathway data file downloads can be directly used in software tools such as Gene Set Enrichment Analysis. For those wishing to learn more about pathway resources and analysis, an online ‘Guide’ includes case studies and guided workflows. Ongoing development of web apps will enhance the accessibility of pathways and integrate support for visualization and interpretation of experimental data.

Session A-238: Network analysis of trans-meQTL using a Bayesian approach to Gaussian Graphical Models

COSI: NetBio

Johann Hawe, Helmholtz Zentrum München, Germany
Benjamin Lehne, Imperial College London, United Kingdom
Melanie Waldenberger, Helmholtz Zentrum München, Germany
Christian Gieger, Helmholtz Zentrum München, Germany
John Chambers, Imperial College London, United Kingdom
Matthias Heinig, Helmholtz Zentrum München, Germany

Short Abstract: Although studied deeply, the identification of regulatory mechanisms from biological data is still challenging, especially across different types of ‘-omics’ data, which are growing in size and complexity. Specifically, we are interested in identifying the regulatory mechanisms underlying trans-acting meQTL hotspots from multi-omics (genotype, gene expression and CpG methylation) data. While the integration of different layers of information in ‘multi-omic’ studies has the potential to yield an almost complete view of the underlying biological system, the statistical methods to fully use this potential are still lagging behind. Here we present an approach based on Bayesian Gaussian Graphical Models (GGMs), which we extend to include non-uniform, data-driven priors to identify regulatory networks from multi-omics data. In general, GGMs impose a sparse graph structure on the underlying data by the use of partial correlations. This Bayesian approach to GGMs is based on a Markov-Chain-Monte-Carlo method, where graphs are scored depending on the data and the given prior information for that graph. Assuming independence between edges, the nature of the algorithm allows us to express graph-specific priors by edge-specific priors. We devise priors for the three different edge types derived from the data: SNP-gene, gene-gene and CpG-gene priors. For the SNP-gene priors, we estimate the probability of observing an expression quantitative trait locus (eQTL) based on the GTEx eQTL data. Similarly, we define priors for CpG-gene links by using expression quantitative trait methylation information (eQTM) reported in an independent study. Gene-gene priors are created by integrating protein-protein interaction (PPI) information from STRING with the GTEx gene expression data.
By implementing non-uniform graph priors, we extended a Bayesian GGM framework which allows for the identification of direct associations between genotypes, gene expression and CpG methylation levels. This approach eases the integration of different kinds of ‘-omics’, thereby making it possible to extract regulatory networks from those data whilst taking into account already established biological knowledge as well as other, independent data. In the end, those networks aid the interpretation of complex multi-omics data and give insights into underlying regulatory mechanisms which might explain studied phenotypes or diseases.
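
The factorisation of a graph prior into edge priors, under the independence assumption stated in the abstract, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `log_graph_prior`, the default prior of 0.5 for edges without prior information, and all example values are assumptions made here.

```python
import math
from itertools import combinations

def log_graph_prior(nodes, edges, edge_prior, default=0.5):
    """Log prior of an undirected graph, assuming independent edges.

    Each possible edge e is present with prior probability p_e, so
    log P(G) = sum over present edges of log(p_e)
             + sum over absent edges of log(1 - p_e).
    All p_e must lie strictly between 0 and 1.
    """
    present = {frozenset(e) for e in edges}
    total = 0.0
    for u, v in combinations(sorted(nodes), 2):
        key = frozenset((u, v))
        p = edge_prior.get(key, default)  # assumed fallback when no prior is known
        total += math.log(p) if key in present else math.log(1.0 - p)
    return total

# Hypothetical example: one SNP, one gene, one CpG site
nodes = ["rs1", "geneA", "cg1"]
priors = {
    frozenset(("rs1", "geneA")): 0.8,   # e.g. an eQTL-derived SNP-gene prior
    frozenset(("cg1", "geneA")): 0.6,   # e.g. an eQTM-derived CpG-gene prior
}
lp = log_graph_prior(nodes, [("rs1", "geneA")], priors)
```

Working in log space keeps the score numerically stable when the graph has many candidate edges, which is the usual choice when such a prior is combined with a likelihood inside an MCMC sampler.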

Session A-240: Protein-protein interaction network in herpes simplex virus type 1

COSI: NetBio

Anna Hernandez Duran, Institute of Structural and Molecular Biology, Birkbeck College, University of London, United Kingdom
Kay Grünewald, Oxford Particle Imaging Centre, Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, United Kingdom
Maya Topf, Institute of Structural and Molecular Biology, Birkbeck College, University of London, United Kingdom

Short Abstract: Over 90% of the worldwide population is infected by at least one species of human herpesvirus, members of the Herpesviridae family. The eight currently known human species differ in infective strategies, but they all cause lifelong infections with sporadic reactivations. Their symptoms range from mild to severe, including encephalitis and neurological disabilities. The high prevalence and possible severity of their infections, coupled with the lack of antiviral therapy capable of eradicating the virus from the host, make these viruses a predominant public health concern. Using network biology, we have been working to extend our understanding of the physical and functional relationships among herpesvirus proteins upon infection [1]. We have computationally reconstructed a protein-protein interaction (PPI) network for the best-studied human herpesvirus species, and archetype species of the whole family, herpes simplex virus type 1 (HSV1). Our pipeline integrates two different data sets: one with PPIs collated from public resources, and another containing computationally predicted PPIs derived using a sequence homology-based method [2]. Computationally predicted PPIs pinpoint potential novel interactions that can be used to assist experimental PPI testing design. All our interaction data are assessed under a common scoring scheme inspired by the standardised MIscore system [3]. Using clustering approaches we highlight functional modules and macromolecular complexes within the reconstructed network, and identify novel functionally meaningful relationships. A web server has been designed to make all our interactomics data publicly available: HVint (http://topf-group.ismb.lon.ac.uk/hvint/) [1]. [1] Ashford, et al. (2016) [2] Yu, et al. (2004) [3] Villaveces, et al. (2015)

Session A-242: Hypergraphlets Give Insight into Multi-Scale Organisation of Molecular Networks.

COSI: NetBio

Thomas Gaudelet, University College London, United Kingdom
Noel Malod-Dognin, University College London, United Kingdom
Jose Lugo-Martinez, Indiana University Bloomington, United States
Predrag Radivojac, Indiana University Bloomington, United States
Natasa Przulj, University College London, United Kingdom

Short Abstract: Graphlets and their statistics have been used to compare biological networks, to uncover their functional organisation principles, and to relate the wiring patterns of genes in these networks with their biological functions. However, modelling molecular data as a network in which edges only capture binary interactions between molecules is limited and disregards the higher level organisation of the interacting molecules. A more comprehensive model is obtained through hypergraph representations, in which hyperedges link all entities that interact in a specific way, e.g. an entire protein complex or a signaling pathway. Inspired by these, we build upon hypergraphlets -- an extension of graphlets to hypernetworks that has been recently defined by Jose Lugo-Martinez et al. (2016) -- to propose: a Hypergraphlet Degree Vector (HDV) defined for each node by the counts of hypergraphlets, an HDV-similarity using the cosine similarity for pairs of nodes, and a Hypergraphlet Correlation Distance (HCD), which provides a measure of distance between two hypergraphs. We apply these new statistics to mine hypergraphs modelling protein-protein interactions, protein complexes, biological pathways, drug-target data, and gene-disease data. On these hypergraphs, we observe that genes having similar functions (according to the semantic similarity between their gene ontology annotations) also have similar wiring patterns in hypergraphs, as measured by HDVs; however, the converse is generally not true. We also note that the set of hypergraphlets in a network gives an indication of the type of data it represents.
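
The HDV-similarity described above is a cosine similarity between two per-node count vectors, which can be sketched as follows. This is an illustration only: the function name `hdv_similarity`, the example vectors, and the convention of returning 0.0 for an all-zero vector are assumptions, and counting the hypergraphlets themselves is not shown.

```python
import math

def hdv_similarity(hdv_a, hdv_b):
    """Cosine similarity between two hypergraphlet degree vectors.

    Each vector holds, for one node, the counts of the hypergraphlets
    (orbits) the node participates in; both must use the same ordering.
    """
    if len(hdv_a) != len(hdv_b):
        raise ValueError("HDVs must have the same length")
    dot = sum(a * b for a, b in zip(hdv_a, hdv_b))
    norm_a = math.sqrt(sum(a * a for a in hdv_a))
    norm_b = math.sqrt(sum(b * b for b in hdv_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for nodes with no hypergraphlet counts
    return dot / (norm_a * norm_b)

# Hypothetical count vectors for two nodes
same_pattern = hdv_similarity([1, 2, 0, 3], [2, 4, 0, 6])
no_overlap = hdv_similarity([1, 0, 0, 0], [0, 0, 5, 0])
```

Because cosine similarity ignores vector length, two nodes with proportional counts (the first pair) score close to 1 even if one takes part in twice as many hypergraphlets as the other.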

Session A-244: Functional subgraphs identification by constraints approach on signaling networks

COSI: NetBio

Bertrand Miannay, Laboratoire des Sciences du Numérique de Nantes (LS2N), France
Florence Magrangeas, Centre de recherche en cancérologie Nantes-Angers, CRCNA, UMR 892 INSERM- 6299 CNRS, 8 quai Moncousu - BP 70721 44007 Nantes cedex 1, France
Stephane Minvielle, Centre de recherche en cancérologie Nantes-Angers, CRCNA, UMR 892 INSERM- 6299 CNRS, 8 quai Moncousu - BP 70721 44007 Nantes cedex 1, France
Olivier Roux, Laboratoire des Sciences du Numérique de Nantes (LS2N), France
Carito Guziolowski, Laboratoire des Sciences du Numérique de Nantes (LS2N), France

Short Abstract: With the advent of high-throughput biological data and knowledge, integration of gene expression profiles (GEPs) and large-scale biological networks (BNs) derived from pathway databases is a subject which is being widely explored. Existing methods are based on significantly measured species and only a small number of them include the directionality and underlying logic existing in BNs. In this study we approach the GEP-network integration problem by considering the network logic without requiring prior selection of genes according to their expression level. Our method aims to model the causality logic in BNs using Logic Programming. This model identifies reachable discrete network states that maximize a notion of harmony between the possible active or inactive states of the molecular species and the directionality of the pathway reactions according to their activator or inhibitor control role. From these states independent components are found, each of them related to a fixed and optimal assignment of active or inactive states and independent of the others. Then we compute the similarity between the states of these subgraphs and the GEPs, which allows us to identify subgraphs specific to a class of GEPs. We applied our method to study the set of possible states derived from a graph from the NCI-PID Pathway Interaction Database. This graph linked multiple myeloma (MM) genes to known receptors for cancer. We identified 15 independent subgraphs and, when these were confronted with 611 MM and 9 healthy GEPs, found one subgraph that more specifically represents the difference between cancer and healthy profiles.

Session A-246: Towards a Systematic Understanding of Combinatorial Drug Perturbations

COSI: NetBio

Michael Caldera, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Felix Müller, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Marco P. Licciardello, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Charles Lardeau, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Anna Ringler, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Stefan Kubicek, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Jörg Menche, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria

Short Abstract: Drug-drug interactions (DDIs), i.e. changes in the effect of a drug when used in combination with another drug, have important implications for clinical applications (e.g. efficacy of a treatment, side effects) as well as for drug development (e.g. combination therapies). Although the concept of DDIs has been known for nearly a century, relatively little is known about general rules and patterns underlying such drug combination effects on a cellular (cell autonomous) level. In order to fill this gap, we have created and analysed a comprehensive DDI network using: (i) high-throughput high-content imaging screens of a representative library of FDA-approved drugs; (ii) a novel methodology based on high-dimensional cell morphology feature vectors allowing us to identify the full extent of reciprocal and joint interactions between drugs; and (iii) an integrative network analysis of the resulting DDI network in the context of molecular and phenotypic networks. Overall, this project represents a first systematic attempt to reveal the fundamental arithmetics of drugs, i.e. a profound, molecularly rooted and predictive understanding of how the effects of individual drugs add up when used in combination. It thus helps us understand how and when DDIs occur, reveal their molecular mechanisms and identify promising combinations for specific diseases.

Session A-248: NICE - Network Informed funCtional Enrichment

COSI: NetBio

Felix Müller, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Michael Caldera, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Jörg Menche, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria

Short Abstract: The interactome, i.e., the integrated network of all physical interactions within the cell, can be interpreted as a map of biological mechanisms. Functional gene annotations (e.g., gene ontology (GO) terms, disease or phenotype associations etc.) allow for local enrichment analyses to explore the functional landscape of the interactome or other biological networks. Network-based approaches can aid in identifying the specific interactome neighborhoods associated with particular biological functions. Here, we present a novel tool called NICE (Network Informed funCtional Enrichment) that characterizes biological networks by identifying functional clusters using topological proximity measures. We show how NICE can be used to explore the functional landscape of the interactome and as a tool to enhance the interpretation of sample gene sets of interest, for example genes associated with a particular disease. More generally, we address the questions of how to properly define biologically meaningful clusters in dense networks and whether different kinds of networks (e.g., protein-protein, co-expression or genetic networks) are more suitable for different kinds of biological annotations.

Session A-250: Database resources and their functionalities: ConsensusPathDB and ToxDB

COSI: NetBio

Gal Barel, MPIMG, Germany
Ralf Herwig, MPIMG, Germany
Christopher Hardt, MPIMG, Germany
Matthias Lienhard, MPIMG, Germany
Atanas Kamburov, Broad Institute of MIT and Harvard, United States

Short Abstract: We present two database resources, ConsensusPathDB and ToxDB, display their functionalities and demonstrate how they can be used to carry out a system-wide approach for the HeCaToS project. ConsensusPathDB consists of a comprehensive collection of molecular interaction data integrated from 32 different public repositories and a web interface featuring a set of computational methods and visualization tools to explore these data. ToxDB was developed for the analysis of the functional consequences of drug treatment at the pathway level, and thus consists of 2,282 pathway concepts as well as numerical response scores for 437 drugs and 7,467 different experimental conditions. For the HeCaToS project we have used both resources and established an integrative approach for the identification and prediction of human liver and heart drug induced toxicity.

Session A-252: A Network-Based Community Expansion Method to Identify Novel Disease Genes within the Interactome-Phenome Space

COSI: NetBio

Pisanu Ize Buphamalai, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria
Tomislav Kokotovic, Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Austria
Vanja Nagy, Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Austria
Jörg Menche, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Austria

Short Abstract: Despite over a decade of post-genomic molecular biology, the functional characterisation of human genes, both in their normal cellular context and in disease states, remains a challenge. In the past, a considerable contribution to our understanding of individual genes and their functions has been derived from the study of rare diseases, as they are often monogenic and thereby offer a clear genotype to phenotype relationship. Given the wealth of unbiased, high-throughput molecular and phenotypic data that is now available on many rare diseases, we can now attempt to go beyond studying one rare disease at a time and move towards a more systematic approach to identifying the pathobiological processes in which the respective causal genes are involved. As an initial case study for such an approach, we consider a group of diseases characterized by intellectual disability (ID). It has been estimated that well over 2000 genes could be involved in ID, yet fewer than 800 have been identified to date. We propose a network-based community expansion method to predict the likelihood of a gene being causative based on molecular, phenotypic and topological information of other genes in both human and model organisms. On the one hand, the results from our computational approach facilitate further experimental validation and characterization of the identified genes. On the other hand, they may also contribute to a more fundamental understanding of disease genes and their phenotypic impact in the context of their molecular interactions.

Session A-254: Differentiate the response of Acute Myeloid Leukemia patients to treatment by using Proteomics Data and Answer Set Programming

COSI: NetBio

Lokmane Chebouba, Laboratoire des Sciences du Numérique de Nantes, France
Carito Guziolowski, Ecole Centrale de Nantes, France

Short Abstract: Prediction of the treatment response of Acute Myeloid Leukemia (AML) patients may speed clinical decisions. The 2014 DREAM Challenge aimed to predict the Complete Remission (CR) and Primary Resistant (PR) response of 191 AML patients from proteomics data (231 measurements) and from 40 bio-clinical variables. The results highlighted only 2 discriminant proteins; the most discriminant data were the bio-clinical data. In this study we check whether, by using a mathematical model built over a graph associating the measured proteins, we could increase the number of proteomics measures used to predict the CR-PR patient response. To do this we first built a graph that connected the 231 measured proteins. In this graph we distinguished 3 types of nodes: stimuli, inhibitors and readouts. Our objective was to find underlying Boolean networks (BNs) that explain the logic of the observed nodes in the graph according to the proteomics measures. Then we conceived a logic program in Answer Set Programming to select k stimuli and inhibitor proteins that maximize the number of pairs of patients for which the discretized values of their experimental measures matched in both classes (CR, PR). This subset of the initial experimental data was used in a later step as input to caspo, a method that learns BNs from multiple perturbation data. We aimed to learn different BN families by using the identical stimuli-inhibitor cases and the maximal difference of readout measures for each CR and PR class, and finally to compare the structure of these BN families.

Session A-256: A network analysis of the incidence pattern of microcephaly in the context of Zika Virus Infection

COSI: NetBio

Myriam Patricia Cifuentes, The Ohio State University, Universidad Antonio Nariño, United States
Clara Mercedes Suarez, Universidad CES, Colombia
Sam Freddy Ludwien Windels, University College London, United Kingdom
Ricardo Cifuentes, Universidad Militar Nueva Granada, Colombia
Nathan Doogan, The Ohio State University, United States
Jose Fernando Valderrama, Ministerio de Salud, República de Colombia, Colombia
Noel Malod-Dognin, University College London, United Kingdom
Darryl Hood, The Ohio State University, United States
Natasa Przulj, University College London, United Kingdom

Short Abstract: During the recent epidemic of microcephaly in South America, the dissimilar geographic distribution of the associated Zika Virus Infection (ZVI) raised questions about the virus’ role in the cause of the birth defect. These questions were further amplified as other factors such as agro-toxic usage, vaccinations, metal poisoning and social conditions have been proposed to co-explain the microcephaly epidemic in Brazil. Therefore, we quantify and provide insights into the different incidence patterns of microcephaly, with and without the presence of the Zika Virus. To that aim, we generate a network model based on the significant associations between 382 non-redundant variables related to ZVI-microcephaly surveillance and social determinants of health, as measured over the 5,665 municipalities in Brazil. Relationships between these variables are quantified by means of significant partial correlations and are represented as edges in our network. As this approach provides us with a dense network, we apply and evaluate different thresholding strategies on the edges of our network to allow for a meaningful topological analysis. Finally, we extract multiple sub-networks representing the incidence patterns around variables of interest, such as microcephaly with and without ZVI. Each sub-network consists of the node representing our variable of interest, its direct neighbours and the edges connecting them. By comparing such sub-networks we are able to distinguish which variables influence the occurrence of microcephaly under different conditions.
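
The sub-network extraction described above (a variable of interest, its direct neighbours and the edges connecting them) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `ego_subnetwork` and the variable names in the example are hypothetical, the edge list is taken to be already thresholded, and edges between two neighbours are assumed to be included.

```python
def ego_subnetwork(edges, centre):
    """Extract the sub-network around `centre` from an undirected edge list.

    Returns the node set {centre} plus its direct neighbours, and every
    edge whose two endpoints both lie in that node set.
    """
    neighbours = set()
    for u, v in edges:
        if u == centre:
            neighbours.add(v)
        elif v == centre:
            neighbours.add(u)
    nodes = neighbours | {centre}
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, sub_edges

# Hypothetical thresholded network of surveillance and social variables
edges = [
    ("microcephaly", "zvi"),
    ("microcephaly", "income"),
    ("zvi", "income"),
    ("income", "sanitation"),
]
nodes, sub_edges = ego_subnetwork(edges, "microcephaly")
```

Two such sub-networks, for example one per variable of interest, can then be compared by taking set differences of their node sets.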

Session A-258: Developing Executable BioPAX Models of Gene Regulatory Networks

COSI: NetBio

Clara Pavillet, University of Oxford, United Kingdom

Short Abstract: Gene regulatory networks (GRNs) are complex, and still largely uncharacterized, sets of regulators and interactions that govern cellular processes, such as proliferation, invasion and metabolic adaptation. Virtually simulating biological pathways offers a powerful method to better understand the role of specific genes and pathways in health and disease, and predict cellular responses to perturbations or treatment. Pathway data collected from the literature is represented computationally in various formats including SBML, SBGN and SIF, to name a few. The BioPAX (Biological Pathway Exchange) language is becoming a widely-used format to work with this data. It aims to offer flexibility and compatibility across computational methods and software to solve the issue of format disagreement between different data providers. Major pathway databases such as Reactome, WikiPathways, and Panther increasingly support file export to BioPAX. Here, we aim to develop executable GRNs exploiting BioPAX. Having identified all major pathway repositories, we evaluated whether the file formats offered could be executed with available software tools. Whereas we could easily generate faithful visual representations of the GRNs considered, none of the tools available, including BioPAX-supported tools, allowed us to efficiently simulate them. The results so far highlight the complexity of the task, requiring both format and language compatibility, and the need for further standardization.

Session A-260: A heuristic multiple maximum common subgraph detection tool for Cytoscape

COSI: NetBio

Simon J. Larsen, University of Southern Denmark, Denmark
Alexander G.B. Grønning, University of Southern Denmark, Denmark
Jan Baumbach, University of Southern Denmark, Denmark

Short Abstract: Biological network alignment is a challenging computational problem in bioinformatics that aims to identify similar nodes, edges or subnetworks among two or more networks. By computing the maximum common edge subgraph between a set of networks, one is able to detect conserved substructures and quantify their topological similarity. To aid such analyses we have developed a heuristic algorithm for the multiple maximum common edge subgraph problem on both directed and undirected networks. Our algorithm uses an iterative local search to compute conserved subgraphs by optimizing a novel edge conservation score that is able to detect not only fully conserved edges but also partially conserved edges, to provide further insight into the common structure of the compared networks. Our method is available as a stand-alone application and as a Cytoscape app at http://cytomcs.compbio.sdu.dk.

Session A-262: FocusHeuristics – expression-data-driven network optimisation and disease gene prediction

COSI: NetBio

Stephan Struckmann, Rostock University Medical Center, Germany
Yang Du, Rostock University Medical Center, Germany
Gregor Warsow, Rostock University Medical Center, Germany
Mohamed Hamed, Rostock University Medical Center, Germany
Nicole Endlich, University of Greifswald, Germany
Karlhans Endlich, University of Greifswald, Germany
Hugo Murua Escobar, Rostock University Medical Center, Germany
Lisa-Madeleine Sklarz, Rostock University Medical Center, Germany
Sina Sender, Rostock University Medical Center, Germany
Christian Junghanß, Rostock University Medical Center, Germany
Steffen Möller, Rostock University Medical Center, Germany
Georg Fuellen, Rostock University Medical Center, Germany
Mathias Ernst, Rostock University Medical Center, Germany

Short Abstract: Identifying genes that contribute to disease phenotypes remains a challenge for bioinformatics. Static knowledge on biological networks is often combined with the dynamics observed in gene expression levels over disease development, to find markers for diagnostics and therapy, and also putative disease-modulatory drug targets and drugs. The basis of current methods ranges from a focus on expression levels (Limma) to concentrating on network characteristics (PageRank, HITS/Authority Score), and both (DeMAND, Local Radiality). We present an integrative approach (the FocusHeuristics) that is thoroughly evaluated based on public expression data and molecular disease characteristics provided by DisGeNet. The FocusHeuristics combines three scores, i.e. the log fold change and another two, based on the sum and difference of log fold changes of genes/proteins linked in a network. A gene is kept when one of the scores to which it contributes is above a threshold. Our FocusHeuristics is both a predictor for gene-disease association and a bioinformatics method to reduce biological networks to their disease-relevant parts, by highlighting the dynamics observed in expression data. The FocusHeuristics is slightly but significantly better than other methods at identifying disease-associated genes, as measured by AUC, and it delivers mechanistic explanations for its choice of genes.
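
The three-score filter described above can be sketched as follows. This is a hedged illustration, not the published FocusHeuristics: the function name `focus_filter`, the use of absolute values, the three threshold values and the example data are all assumptions made here, since the abstract does not specify them.

```python
def focus_filter(lfc, edges, t_node=1.0, t_sum=1.5, t_diff=1.5):
    """Keep genes for which at least one of three scores exceeds its threshold.

    lfc   : dict mapping gene -> log fold change
    edges : iterable of (gene, gene) links from a network
    Scores (absolute values are an assumption of this sketch):
      node score  |lfc[g]|
      sum score   |lfc[u] + lfc[v]| for a linked pair (u, v)
      diff score  |lfc[u] - lfc[v]| for a linked pair (u, v)
    """
    # Genes kept on their own log fold change
    kept = {g for g, x in lfc.items() if abs(x) > t_node}
    # Genes kept because a link they belong to scores highly
    for u, v in edges:
        pair_sum = abs(lfc[u] + lfc[v])
        pair_diff = abs(lfc[u] - lfc[v])
        if pair_sum > t_sum or pair_diff > t_diff:
            kept.update((u, v))
    return kept

# Hypothetical log fold changes and network links
lfc = {"A": 1.2, "B": 0.9, "C": -0.8, "D": 0.1, "E": 0.2}
edges = [("B", "C"), ("D", "E"), ("A", "D")]
kept = focus_filter(lfc, edges)
```

In this example B and C would each be missed by a plain fold-change cut-off, but their link is retained because they change in opposite directions, which is the kind of case the difference score is meant to capture.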

Session A-264: A visual exploration route: from time-series to network models

COSI: NetBio

Athanasios Vogogias, Edinburgh Napier University, United Kingdom
Jessie Kennedy, Edinburgh Napier University, United Kingdom
Daniel Archambault, Swansea University, United Kingdom

Short Abstract: Network models aim to describe relationships between components and provide an abstract view on how a biological system works as a whole. However, learning the structure of the most representative network from sparse and often noisy data sets is a challenging optimisation problem. In this work we propose a visual analytics approach for inferring Bayesian networks from time-series gene expression data. In particular, we seek to provide visual support for exploring the search space of all candidate networks in conjunction with a representation of the original data. As part of the variable selection phase we apply hierarchical clustering with multiple-level cuts to reduce feature redundancy. The purpose is to effectively reduce the number of variables by detecting and aggregating genes that follow a similar expression pattern over time. As part of the network construction phase we use different search algorithms and parameter settings to sample the search space of all possible Bayesian networks. An appropriate scoring metric is used to assess their fitness to the original data. We extend the small multiples technique to provide visual support for exploring large collections of scored, directed networks, which constitute the solution space. Depending on its shape, either the top scoring network is selected, or a consensus network is constructed from a number of high-scoring candidate networks. Our approach aims at the design and implementation of a visualisation toolbox which would help analysts in inferring Bayesian network models which are not only reproducible, but also representative of the original time-series data.

Session A-266: Assessing of Graphical Gaussian Models to study association networks.

COSI: NetBio

Esteban Vegas, University of Barcelona, Spain
Antonio Miñarro Alonso, University of Barcelona, Spain

Short Abstract: Graphical Gaussian Models (GGMs) have recently become a popular tool to study association networks. Application of GGMs to omics data is quite challenging, as the number of variables (p) is usually much larger than the number of samples (n), and classical GGM theory is not valid in a small sample setting. Several algorithms have been developed to handle GGMs with small samples. These algorithms boil the problem down to finding suitable estimates for the covariance matrix and its inverse when n < p. In this work, we have verified through simulations the ability of the methods to recover the original structure of direct interactions. We have used different algorithms implemented in the R package GGMselect and, in the case n > p, significance tests of the partial correlation coefficient, adjusting p-values for multiple comparisons. In practice, we have generated graphs of order p, obtained random samples of size n from the generated structure, and verified whether the original state (connected or not connected) is recovered for each pair of variables. Results show, on the one hand, a dependency on the ratio n/p: the greater the ratio, the better the recovery. On the other hand, results depend only very slightly on the original network structure: similar results have been obtained with highly structured networks and with random original structures. We have also considered several methods to compare networks by comparing the overlap of nodes and edges or the degree distribution of nodes.
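
The role of the partial correlation coefficient in separating direct from indirect associations can be shown with a small three-variable sketch. It uses the standard first-order partial correlation formula and is not the GGMselect procedure; the function name partial_corr and the example values are chosen only for illustration.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y after conditioning on z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Chain x - z - y: x and y are associated only through z.
r_xz, r_yz = 0.8, 0.5
r_xy_chain = r_xz * r_yz  # correlation implied by the chain alone
indirect = partial_corr(r_xy_chain, r_xz, r_yz)

# Same correlations with z, but a stronger x-y correlation than the chain explains.
direct = partial_corr(0.7, r_xz, r_yz)

print(indirect, direct)
```

In a GGM, an edge between x and y is drawn only when the partial correlation is non-zero, so the first case yields no edge even though the marginal correlation of x and y is clearly positive.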

Session A-268: Predicting the influence of combination therapies in signaling networks

COSI: NetBio

Maren Sitte, University Medical Center Göttingen, Institute of Medical Statistics, Germany
Tim Beissbarth, University Medical Center Göttingen, Institute of Medical Statistics, Germany

Short Abstract: A Germany-wide consortium named “Molecular Mechanisms in Malignant Lymphomas - Demonstrators of Personalized Medicine”, composed of research groups of biologists, bioinformaticians and doctors, proposes to develop prognostic and diagnostic platforms that guide treatment decisions and support the process of therapeutic target identification in diffuse large B-cell lymphomas (DLBCL). The focus lies on the prognostic relevance of the DLBCL microenvironment, which is the foundation of the diagnostic platforms the consortium will establish. The communication of the cell microenvironment with the tumour cells will be the target for the novel therapeutic strategies the consortium wants to investigate. In our subproject, we aim to investigate hybrid models, which will integrate signalling data with existing gene expression data to predict how lymphomas translate signalling stimuli into expression phenotypes. For this approach we will integrate pathway knowledge and experimental data and implement previously developed network reconstruction methodology. These existing approaches, such as Deterministic Effects Propagation Networks (Bender et al., 2011) and Nested Effects Models (Fröhlich et al., 2008; Markowetz et al., 2005), are based on Bayesian networks. They form the basis of our research and will be adapted so that measurements from proteomic experiments and prior pathway knowledge can be combined. References: Bender, C., Heyde, S., Henjes, F., Wiemann, S., Korf, U., and Beissbarth, T. (2011). Inferring signalling networks from longitudinal data using sampling based approaches in the R-package 'ddepn'. BMC Bioinformatics 12, 291. Fröhlich, H., Fellmann, M., Sultmann, H., Poustka, A., and Beissbarth, T. (2008). Predicting pathway membership via domain signatures. Bioinformatics 24, 2137-2142. Markowetz, F., Bloch, J., and Spang, R. (2005). Non-transcriptional pathway features reconstructed from secondary effects of RNA interference. Bioinformatics 21, 4026-4032.

Session A-270: Finding novel drug indications using drug-disease vectors

COSI: NetBio

Taekeon Lee, Gachon University, South Korea
Giup Jang, Gachon University, South Korea
Soyoun Hwang, Gachon University, South Korea
Youngmi Yoon, Department of Computer Engineering, Gachon University, South Korea

Short Abstract: In this study, we construct a classifier to find novel drugs for specific diseases by exploring a gene regulatory network. First, we build a gene regulatory network with three types of interactions (activation, inhibition, and neutral) using data from KEGG, BioCarta, Reactome and the Pathway Interaction Database. Then we collect drug-target genes, disease genes and chemical-disease associations from DrugBank, DisGeNET and CTD, respectively. The chemical IDs are mapped to DrugBank IDs. For each drug-disease pair, we create a vector in which each component is a value describing how much a gene is affected by the drug. To obtain this value, we find shortest paths from drug-target genes to disease genes on the gene regulatory network. Then we calculate a probability value for each path using the degrees of the genes on the network, and we use this value as the weight of the path. To calculate the direction of the path, we use the interactions between the genes in the path. The product of weight and direction indicates how much a drug-target gene affects a disease gene through the path. If there is more than one path, the values are added up. Additionally, if a drug has multiple target genes, we sum the values over all target genes. To construct the classifier, we build a positive set from vectors with a known drug-disease association. The negative set is built by random sampling from vectors outside the positive set. Finally, we use randomForest, and this process is repeated 1000 times for each disease. As a result, we obtain AUC > 0.6 for 265 of 298 diseases.
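
The path-based value described in this abstract can be sketched as follows. The abstract does not give the exact probability formula, so this sketch assumes the path weight is the product of 1/out-degree over the genes that are left along the path (a uniform random-walk probability), and the direction is the product of edge signs (+1 for activation, -1 for inhibition). Neutral interactions are left out because their handling is not specified. All function names are hypothetical.

```python
from collections import deque


def all_shortest_paths(graph, source, target):
    """Enumerate every shortest directed path from source to target using BFS."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, {}):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)
    if target not in dist:
        return []
    # Walk the parent links back from the target to rebuild the paths.
    paths = [[target]]
    while paths[0][0] != source:
        paths = [[p] + path for path in paths for p in parents[path[0]]]
    return paths


def path_effect(graph, path):
    """Weight times direction for one path (both definitions are assumptions)."""
    weight, direction = 1.0, 1
    for a, b in zip(path, path[1:]):
        weight *= 1.0 / len(graph[a])  # assumed: uniform step probability
        direction *= graph[a][b]       # +1 activation, -1 inhibition
    return weight * direction


def target_to_disease_effect(graph, target_gene, disease_gene):
    """Add up the effects of all shortest paths between the two genes."""
    paths = all_shortest_paths(graph, target_gene, disease_gene)
    return sum(path_effect(graph, p) for p in paths)


# Toy network: graph[a][b] is the sign of the interaction a -> b.
toy = {"T": {"A": 1, "B": -1}, "A": {"D": 1}, "B": {"D": 1, "X": 1}, "X": {}}
print(target_to_disease_effect(toy, "T", "D"))
```

In the toy network, the activating path through A contributes 0.5 and the inhibiting path through B contributes -0.25, which illustrates how values from several shortest paths are added up.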

Session A-272: Mining multiple pathogen-host interactomes for the detection of differentiating patterns

COSI: NetBio

Pieter Moris, UAntwerp, Belgium
Pieter Meysman, UAntwerp, Belgium
Kris Laukens, UAntwerp, Belgium

Short Abstract: Infectious diseases remain one of the leading causes of mortality and illness in the modern world. However, disease susceptibility can differ wildly, both between individuals of the same species and across species boundaries (e.g. reservoirs of zoonotic pathogens). This interplay between intracellular pathogens and their hosts is mediated through intricate molecular interactions, which generally serve common goals such as the invasion of the host cells, evasion of the immune system and manipulation of host processes for replication. By studying these interactions via a systems biology approach, a better understanding of the underlying determinants of disease susceptibility can be gained. We propose a generic methodology based on frequent pattern mining to tackle this research question and we showcase it in a case study on the intra-species interactomes of Herpesviridae and their hosts. By utilising the wealth of knowledge that has been compiled into public protein-protein interaction databases and complementing this with different forms of annotation data, we attempt to uncover biologically relevant patterns in these networks. This method also conveniently lends itself to the derivation of association rules and the construction of classification models to distinguish between, for example, human and animal herpesviruses. An extension of this workflow is the inclusion of different omics levels to validate or filter the previous findings. Ultimately, our goal is to deliver fundamental insights into the evolutionary drivers of disease susceptibility, as a better understanding of the underlying molecular mechanisms is crucial for the treatment and prevention of pathogen-induced infectious diseases.

Session A-274: Functional analysis of Aryl Hydrocarbon Receptor main and unknown molecular-genetic pathways involved in human cutaneous malignant melanoma for designing new therapeutic approaches

COSI: NetBio

Serena Dotolo, Institute of Food Science-CNR, Italy
Angelo Facchiano, Institute of Food Science-CNR, Italy

Short Abstract: “Omics” approaches are widely applied to examine in depth physiological processes and pathological conditions, studying the disease pattern. Our interest is oriented to the integration of data from different experimental approaches and fields of investigation, to highlight hidden information and to mine new knowledge from available experimental data. Our study is focused on the integrative functional analysis of molecular pathways that involve AHR (aryl hydrocarbon receptor), a cytosolic ligand-activated transcription factor, in cutaneous malignant melanoma (CMM1) and in some melanoma-independent skin cancers [1]. The functional analysis is carried out by means of different open-source platforms and bioinformatics tools: GeneCards, DSYSMAP and the Oncomine platform for structural-functional and clinical characteristics; the Cytoscape platform for building and visualizing molecular networks at different levels, in order to improve the knowledge of molecular mechanisms; the BioGPS platform, BioXpress and MERAV software to analyze the gene expression profile of our biological target involved in melanoma [2,3]; and MelGene DB (melanoma database). In this study, we explain the molecular networks and discuss the potential roles of specific nodes highlighted by the analysis, comparing this information with Oncomine clinical data, also in consideration of the role of several AHR disease-related mutations in different biological conditions. A deeper analysis of AHR molecular mechanisms based on pro/antitumor functions could be useful for a better understanding of the link between AHR and the development and progression of melanoma, and for proposing novel therapies to cure or control the evolution of melanoma.

Session A-276: Topology Based Entropy as a Method to Evaluate the Construction of Gene Co-Expression Networks

COSI: NetBio

Samuel Heron, DTC in Neuroinformatics & Computational Neuroscience, Edinburgh University, United Kingdom
Owen Dando, Centre for Integrative Physiology, University of Edinburgh, United Kingdom
Giles Hardingham, Centre for Integrative Physiology, University of Edinburgh, United Kingdom
T. Ian Simpson, Institute for Adaptive and Neural Computation, Edinburgh University, United Kingdom

Short Abstract: The use of gene co-expression networks to model and analyse complex biological phenomena is becoming increasingly commonplace. Whilst several methods exist to create these networks, none offer a means of evaluating parameter selection choices made during network construction. Gene co-expression networks are normally created using default settings where parameter optimisation against objective criteria could result in networks that better represent the underlying biological systems data. Here we tackle this issue using an entropy measure to optimise parameter settings by quantifying their effect on the association of gene sub-networks with biological pathways. Our method compares the topology of gene modules that have been identified by the WGCNA co-expression network tool with the topology of gene interactions in biological pathways. We derive entropy values for each module against each pathway using the weights of common edges between the two graphs and measure the effects of varying correlation method, minimum cluster size and edge weight threshold on module-pathway entropy. To evaluate this approach, we use RNA-Seq data in which genes from specific pathways have been synthetically perturbed to increase association values and measure the sensitivity and specificity of changes in module-pathway entropy. As our method preserves topology we propose that it presents a more rigorous and sensitive way to assess the efficacy of gene co-expression network creation tools such as WGCNA and could be used for their future development.

Session A-278: Bringing Pathway Knowledge to Systems Medicine Approaches

COSI: NetBio

Florian Auer, Department of Medical Statistics, University Medical Center Göttingen, Germany
Frank Kramer, Department of Medical Statistics, University Medical Center Göttingen, Germany
Tim Beißbarth, Department of Medical Statistics, University Medical Center Göttingen, Germany

Short Abstract: In modern Systems Medicine approaches the aim is to look at increasingly complex interactions of complete signaling pathways in order to get a more holistic view for individualized treatment decisions. Individualized treatment decisions and newly developed specialized drugs warrant the need to broaden the focus in individualized medicine from singular biomarkers to pathways. On the other hand pathway databases offer vast amounts of knowledge on biological networks, freely available and encoded in semi-structured formats [BCS06, SAK+09]. The efficient re-use of pathway knowledge and its integration into bioinformatic analyses enables new insights for researchers in systems medicine. However, the vast amount of published data on molecular interactions makes it increasingly challenging for life science researchers to find and extract the most relevant information. Currently, the tools to use this information and integrate it in a clinical context are still lacking. Our idea is to compose an analysis pipeline in order to enable patient-specific systems medicine analyses in a university hospital setting. Our poster will present a workflow for visualizing pathway information and integrating omics data within an interactive online application, utilizing state of the art technology [FLH+16, R C14, KBK+13, FBBL15] and well-established standard data models [DCP+10, HFS+03, PCW+15]. References: [BCS06] Gary D. Bader, Michael P. Cary, and Chris Sander. Pathguide: a pathway resource list. Nucleic Acids Research, 34(Database issue):D504–506, January 2006. [DCP+10] Emek Demir, Michael P Cary, Suzanne Paley, et al. The BioPAX community standard for pathway data sharing. Nature Biotechnology, 28(9):935–942, September 2010. [FBBL15] Silvia Frias, Kenneth Bryan, Fiona S. L. Brinkman, and David J. Lynn. CerebralWeb: a Cytoscape.js plug-in to visualize networks stratified by subcellular localization. Database, 2015:bav041, January 2015. [FLH+16] Max Franz, Christian T. Lopes, Gerardo Huck, et al. Cytoscape.js: a graph theory library for visualisation and analysis. Bioinformatics, 32(2):309–311, 2016. [HFS+03] M. Hucka, A. Finney, H. M. Sauro, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics, 19(4):524–531, March 2003. [KBK+13] F. Kramer, M. Bayerlova, F. Klemm, A. Bleckmann, and T. Beissbarth. rBiopaxParser–an R package to parse, modify and visualize BioPAX data. Bioinformatics, 29(4):520–522, February 2013. [PCW+15] Dexter Pratt, Jing Chen, David Welker, et al. NDEx, the Network Data Exchange. Cell Systems, 1(4):302–305, October 2015. [R C14] R Core Team. R: A Language and Environment for Statistical Computing. 2014. [SAK+09] Carl F. Schaefer, Kira Anthony, Shiva Krupa, et al. PID: the Pathway Interaction Database. Nucleic Acids Research, 37(Database issue):D674–D679, January 2009.

Session A-280: Ulign: Unified Alignment of Protein-Protein Interaction Networks

COSI: NetBio

Noel Malod-Dognin, Department of Computer Science, University College London, United Kingdom
Kristina Ban, Laboratory of Data Technologies, Faculty of Information Studies, Slovenia
Natasa Przulj, Department of Computer Science, University College London, United Kingdom

Short Abstract: Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, all aligners are heuristics providing divergent solutions and no consensus exists on a gold standard, or on which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables a complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift to come from aligning all available data types collectively rather than any particular data type in isolation from others.

Session A-282: MetaGraphite: pathway analysis with metabolites

COSI: NetBio

Gabriele Sales, Università di Padova, Italy
Enrica Calura, Università di Padova, Italy
Chiara Romualdi, University of Padova, Italy

Short Abstract: RNA expression profiling is routinely employed to quantify the abundance of tens of thousands of transcripts across different conditions and tissues. Metabolomics, which systematically measures the abundance of small molecules, has emerged as an attractive addition as it directly measures the end products of biological processes and is thus key to understanding disease phenotypes. Although data gathering has become easier with the improvement of experimental protocols, the analysis of the results still poses significant challenges. Gene set analysis (GSA) has emerged as a promising technique to provide a biological interpretation of such data, and pathway topology is one of its most important sources of information. Our “graphite” R package provides networks derived from the pathways of six major databases (Biocarta, HumanCyc, KEGG, NCI/Nature Pathway Interaction Database, Panther and Reactome) covering 14 species. The software discriminates between different types of gene groups (complexes or alternative genes of the same family); allows the selection of edges by type of interaction; and uniformly converts heterogeneous node IDs using the facilities provided by Bioconductor. Moreover, it gives easy access to topological analyses such as the clipper, DEGraph, SPIA and topologyGSA methods. Here we present a novel extension to the package which explicitly tracks small molecules. This makes it possible to capture metabolic pathways in greater detail, extends the collection of databases to include dedicated resources such as SMPDB and PharmGKB, and opens the way to statistical analyses over mixed datasets measuring both RNAs and metabolites.

Session A-284: Elucidating the mechanism of action of drug based on gene module-module interaction network

COSI: NetBio

Soyoun Hwang, Gachon University, South Korea
Giup Jang, Gachon University, South Korea
Taekeon Lee, Gachon University, South Korea
Youngmi Yoon, Gachon University, South Korea

Short Abstract: Understanding the mechanism of action (MOA) of a drug is essential for treating diseases. However, analyzing the mechanism remains challenging because genes perform various functions while interacting with each other inside the body. Therefore, we try to elucidate the MOA of a drug not through relationships between individual genes but through modular groups of genes. In our study, a protein-protein interaction (PPI) network is obtained from the BioGRID, DIP, KEGG, and Reactome databases. We use link communities [1] to modularize genes that are significantly related to each other in the PPI network. Edge weights between gene modules are calculated from interactions in the PPI network, and we then construct a weighted gene module-module interaction (MMI) network. Based on the MMI network, modules that are significantly enriched with drug target genes and disease-related genes are selected as starting modules and ending modules, respectively. To build a drug-disease module pathway, we find the shortest path between the starting modules and the ending modules. Finally, we analyze the MOA of the drug with the drug-disease module pathway. If the drug-disease module pathway contains genes that are not known to be related to the mechanism, they can be hypothesized to be involved in it. Further study of the modules in the pathway can give insight into which functions play an important role in the MOA of the drug. [1] Ahn, Y. Y., Bagrow, J. P., & Lehmann, S. (2010). Link communities reveal multiscale complexity in networks. Nature, 466(7307), 761-764.

Session A-286: HitPredict and TimeXNet: Resources for network and pathway analysis

COSI: NetBio

Ashwini Patil, Institute of Medical Science, University of Tokyo, Japan
Yosvany López, Tokyo Medical and Dental University, Japan
Phit Ling Tan, Institute of Medical Science, University of Tokyo, Japan
Kenta Nakai, Institute of Medical Science, University of Tokyo, Japan

Short Abstract: The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality molecular interaction information and tools to extract relevant relationships from them. We present HitPredict (http://hintdb.hgc.jp/htp/), a consolidated resource of experimentally identified, physical protein–protein interactions with confidence scores to indicate their reliability (1). The latest version of HitPredict provides a high quality dataset of 398,696 physical associations between 70,808 proteins from 105 species (2). To extract condition-specific information from databases like HitPredict, we present TimeXNet (http://timexnet.hgc.jp/), a tool that predicts activated pathways during a cellular response using time-course gene expression profiles (3). TimeXNet identifies activated pathways in a large molecular network between three sets of genes based on their time of expression (4). A web-based version of TimeXNet (http://txnet.hgc.jp/) directly utilizes the interaction networks in HitPredict to enable the analysis of gene expression profiles from multiple species. References: 1. HitPredict: a database of quality assessed protein-protein interactions in nine species. Ashwini Patil, Kenta Nakai and Haruki Nakamura, Nucleic Acids Research 39:D744-D749, 2011. 2. HitPredict version 4 - comprehensive reliability scoring of physical protein-protein interactions from more than 100 species. Yosvany Lopez, Kenta Nakai and Ashwini Patil, DATABASE 2015:bav117, 2015. 3. TimeXNet: Identifying active gene sub-networks using time-course gene expression profiles. Ashwini Patil and Kenta Nakai, BMC Systems Biology, 8(Suppl4), S2, 2014. 4. Linking transcriptional changes over time in stimulated dendritic cells to identify gene networks activated during the innate immune response. Ashwini Patil, Yutaro Kumagai, Kuo-ching Liang, Yutaka Suzuki and Kenta Nakai, PLOS Computational Biology 9(11):e1003323, 2013.

Session A-288: The Reactome Graph Database: Efficient Access to Complex Data Structures

COSI: NetBio

Antonio Fabregat, EMBL-EBI, United Kingdom
Florian Korninger, EMBL-EBI, United Kingdom
Guilherme Viteri, EMBL-EBI, United Kingdom
Konstantinos Sidiropoulos, EMBL-EBI, United Kingdom
Peter D'Eustachio, NYU Langone Medical Centre, United States
Lincoln Stein, OICR, Canada
Peipei Ping, UCLA, United States
Henning Hermjakob, EMBL-EBI, United Kingdom

Short Abstract: Reactome (http://reactome.org) is a free, open-source, curated and peer-reviewed knowledge base of biomolecular pathways that provides infrastructure and intuitive bioinformatics tools for search, visualisation, interpretation and analysis of pathways. From the data point of view, it offers detailed representations of cellular processes as an ordered network of molecular reactions, annotating them in a consistent pathway format to create an online resource for researchers and students as a core reusable pathway dataset for systems biology based approaches. This network amounts to thousands of interconnected terms forming a graph of biological knowledge. Storing, retrieving, and analysing such networks can become inefficient when relying on a relational database management system (RDBMS). Although relational databases are widely used among pathway knowledge bases for data management, they are not always the best fit to deal with modern performance requirements and increasing complexity of data. Complexity in this case does not only refer to the quantity of data but also its interconnectedness. The benefit of storing these data in their natural form is that they do not need to be transformed into a flat table format but can instead be persisted as originally designed. Adopting Neo4j as the graph database management system helps reduce the complexity of the database and thus allows more straightforward access to the Reactome knowledge base via its query language, Cypher. Cypher allows for faster queries, reducing the average response time per query by 95%, and helps express queries in a more intuitive, human-readable and easy-to-learn syntax.

Session A-290: Nested Bootstrapping for reliable Gene Regulatory Network Inference

COSI: NetBio

Daniel Morgan, Stockholm University, Sweden
Andreas Tjarnberg, Linköping University, Sweden
Torbjörn E. M. Nordling, National Cheng Kung University, Taiwan
Erik Sonnhammer, SciLifeLab, Sweden

Short Abstract: Common Gene Regulatory Network (GRN) inference methods, such as LASSO, do not provide information about the confidence of inferred links. We address this by extending the bootstrap method, overlapping the analysis across iterated runs, and applying it to three inference methods. The shortcomings of L1-regularization methods when operating on sufficiently informative data are known. Here, all of the referenced methods perform sub-optimally in terms of the Matthews correlation coefficient (MCC) for data matrices with a low signal-to-noise ratio (SNR), even when the data are informative enough for network inference by other metrics. It is thus important not only to introduce methods that are optimized for analyzing datasets of a certain quality, but also to define criteria for determining which method to use to optimize the analysis. We seek to differentiate spurious gene-gene interactions from those that truly exist in the system. To this end we use a linear ODE model and the GeneSPIDER package to infer the regulatory network of interactions by relating the effect of single gene perturbations to the expression of the remaining unperturbed set.
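
For readers unfamiliar with the MCC used here as the performance measure, a minimal sketch of how it can be computed for an inferred network is given below. It treats every candidate directed link as one binary prediction. The function name mcc and the toy network are illustrative and not taken from GeneSPIDER.

```python
import math


def mcc(true_links, inferred_links, all_pairs):
    """Matthews correlation coefficient for link recovery.

    true_links and inferred_links are sets of (regulator, target) tuples and
    are assumed to be subsets of all_pairs, the set of candidate links.
    """
    tp = len(true_links & inferred_links)
    fp = len(inferred_links - true_links)
    fn = len(true_links - inferred_links)
    tn = len(all_pairs) - tp - fp - fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


genes = ["A", "B", "C"]
pairs = {(a, b) for a in genes for b in genes if a != b}
truth = {("A", "B"), ("B", "C")}
inferred = {("A", "B"), ("C", "A")}
print(mcc(truth, inferred, pairs))
```

A value of 1 means perfect recovery and values near 0 mean performance no better than chance, which is why MCC is informative for sparse networks where true negatives dominate.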

Session A-292: Drug-Target Interaction Networks Prediction using Short-Linear Motifs

COSI: NetBio

Wenxiao Xu, School of Computer Science, University of Windsor, Canada
Luis Rueda, School of Computer Science, University of Windsor, Canada
Alioune Ngom, School of Computer Science, University of Windsor, Canada

Short Abstract: Drug-target interaction (DTI) prediction is a fundamental step in drug discovery. Given an unknown pair, (di, tj), of drug compound di and target protein tj, the objective is to predict whether di interacts with tj given a known DTI network as training data. Machine learning (ML) is currently being used for DTI predictions. There are mainly two types of ML-based approaches: similarity-based methods and feature-based methods. We propose a new feature-based approach which uses short-linear motifs (SLiMs) as protein features combined with chemical substructure fingerprints as drug features, and applies ML methods to predict DTIs. SLiMs are short protein sequence patterns of 3-10 amino acids involved in the recognition and targeting activities of drugs. Existing methods for DTI prediction consider the absence of interaction between a drug di and a target tj in the training data as a true negative interaction. However, the lack of interaction in (di, tj) means unknown, not negative. We introduce a strategy that finds negative pairs based on protein and drug features, and devise feature selection and classification algorithms to predict DTIs. We tested our DTI prediction approach on four gold-standard data sets (Yamanishi et al., 2008). Both random forest (RF) and support vector machine (SVM) classifiers give high AUC values of 99.24% and 97.64%, respectively. Our method also outperforms existing DTI prediction methods discussed in the literature. Generally, RF performs better than SVM, with AUC results of 99.04%, 96.39%, 97.33%, and 87.64% on the four data sets, respectively.

Session A-294: Predicting multicellular function through multi-layer tissue networks

COSI: NetBio

Marinka Zitnik, Stanford University, United States
Jure Leskovec, Stanford University, United States

Short Abstract: Motivation: Understanding functions of proteins in specific human tissues is essential for insights into disease diagnostics and therapeutics, yet prediction of tissue-specific cellular function remains a critical challenge for biomedicine. Results: Here we present OhmNet, a hierarchy-aware unsupervised node feature learning approach for multi-layer networks. We build a multi-layer network, where each layer represents molecular interactions in a different human tissue. OhmNet then automatically learns a mapping of proteins, represented as nodes, to a neural embedding based low-dimensional space of features. OhmNet encourages sharing of similar features among proteins with similar network neighborhoods and among proteins activated in similar tissues. The algorithm generalizes prior work, which generally ignores relationships between tissues, by modeling tissue organization with a rich multiscale tissue hierarchy. We use OhmNet to study multicellular function in a multi-layer protein interaction network of 107 human tissues. In 48 tissues with known tissue-specific cellular functions, OhmNet provides more accurate predictions of cellular function than alternative approaches, and also generates more accurate hypotheses about tissue-specific protein actions. We show that taking into account the tissue hierarchy leads to improved predictive power. Remarkably, we also demonstrate that it is possible to leverage the tissue hierarchy in order to effectively transfer cellular functions to a functionally uncharacterized tissue. Overall, OhmNet moves from flat networks to multiscale models able to predict a range of phenotypes spanning cellular subsystems.

Session A-296: Virus-host protein-protein interactions in STRING

COSI: NetBio

Helen Cook, NNF Center for Protein Research, Denmark
Damian Szklarczyk, University of Zurich, Switzerland
Christian von Mering, University of Zurich, Switzerland
Lars Juhl Jensen, NNF Center for Protein Research, Denmark

Short Abstract: The study of viruses is aided by bioinformatics resources such as protein–protein interaction databases. Having a comprehensive picture of a virus protein's interaction partners is crucial to the understanding of the viral lifecycle and aids in the search for vaccines and antiviral drugs. Here, we extend the STRING database of protein-protein interactions to store and display cross-species virus-host and intra-virus interactions. Information is taken from different channels: experimental evidence, pathways and text mining. This enables the visualization of networks that show the virus interacting with the host proteins, which are primarily proteins of the innate immune system.

Session A-298: Computational prediction of molecular host-pathogen interactions using dual RNA-Seq

COSI: NetBio

Thomas Wolf, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany
Sylvie Schulze, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany
Philipp Kämmer, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany
Sascha Brunke, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany
Bernhard Hube, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany
Reinhard Guthke, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany
Jörg Linde, Leibniz Institute for Natural Product Research and Infection Biology – Hans Knöll Institute (HKI), Germany

Session A-300: A collaborative knowledgebase for driving nutrient research in food crops

COSI: NetBio

Jeremy J Jay, University of North Carolina at Charlotte, United States
Richard Linchangco, University of North Carolina at Charlotte, United States
Robert W Reid, University of North Carolina at Charlotte, United States
Cory Brouwer, University of North Carolina at Charlotte, United States

Session A-302: Using a dual-network model for relating clinical and geographical data of patients.

COSI: NetBio

Pietro Hiram Guzzi, Laboratory of Bioinformatics, University of Catanzaro, Italy

Session A-304: Identification of related binary biological data using the Jaccard/Tanimoto index

COSI: NetBio

Neo Chung, University of Warsaw, Poland
Blazej Miasojedow, University of Warsaw, Poland
Anna Gambin, University of Warsaw, Poland

Session A-306: Multiple network-constrained regressions expand insights into influenza vaccination responses

COSI: NetBio

Stefan Avey, Yale School of Medicine, United States
Subhasis Mohanty, Yale School of Medicine, United States
Jean Wilson, Yale School of Medicine, United States
Heidi Zapata, Yale School of Medicine, United States
Samit R. Joshi, Yale School of Medicine, United States
Barbara Siconolfi, Yale School of Medicine, United States
Sui Tsang, Yale School of Medicine, United States
Albert C. Shaw, Yale School of Medicine, United States
Steven H. Kleinstein, Yale School of Medicine, United States

Short Abstract: Systems immunology leverages recent technological advancements that enable broad profiling of the immune system to better understand the response to infection and vaccination, as well as the dysregulation that occurs in disease. An increasingly common approach to gain insights from these large-scale profiling experiments involves the application of statistical learning methods to predict disease states or the immune response to perturbations. However, the goal of many systems studies is not to maximize accuracy, but rather to gain biological insights. The predictors identified using current approaches can be uninterpretable or present only one of many equally predictive models, leading to a narrow understanding of the underlying biology. Here we show that incorporating prior biological knowledge within a logistic modeling framework by using network-level constraints on transcriptional profiling data significantly improves interpretability. Moreover, incorporating different types of biological knowledge produces models that highlight distinct aspects of the underlying biology, while maintaining predictive accuracy. We propose a new framework, Logistic Multiple Network-constrained Regression (LogMiNeR), and apply it to understand the mechanisms underlying differential responses to influenza vaccination. While standard logistic regression approaches were predictive, they were minimally interpretable. Incorporating prior knowledge using LogMiNeR led to models that were equally predictive yet highly interpretable. In this context, B cell-specific genes and mTOR signaling were associated with an effective vaccination response in young adults. Overall, our results demonstrate a new paradigm for analyzing high-dimensional immune profiling data in which multiple networks encoding prior knowledge are incorporated to improve model interpretability.
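The idea of constraining a logistic model with a prior-knowledge network can be sketched as follows. This is a minimal illustration and not the LogMiNeR implementation: it assumes a graph Laplacian penalty as the network constraint and plain gradient descent as the optimizer, neither of which is specified in the abstract. The function names (`graph_laplacian`, `fit_network_logistic`, `smoothness`) and the toy data are hypothetical.

```python
import numpy as np


def graph_laplacian(n_genes, edges):
    """Unnormalized Laplacian L = D - A of an undirected gene network."""
    A = np.zeros((n_genes, n_genes))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def fit_network_logistic(X, y, L, lam=1.0, lr=0.1, n_iter=2000):
    """Logistic regression with the penalty (lam / 2) * beta^T L beta.

    Assumed form of the network constraint (not taken from the abstract):
    the penalty is small when genes linked in the network receive
    similar coefficients.
    """
    n_samples, n_genes = X.shape
    beta = np.zeros(n_genes)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        # Gradient of the mean log-loss plus gradient of the penalty.
        grad = X.T @ (prob - y) / n_samples + lam * (L @ beta)
        beta -= lr * grad
    return beta


def smoothness(beta, L):
    """beta^T L beta: sum of squared coefficient differences over edges."""
    return float(beta @ L @ beta)


# Toy data: 4 genes, network edges 0-1 and 2-3 (all values hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
signal = X @ np.array([2.0, 0.0, -2.0, 0.0])
y = (signal + rng.normal(size=200) > 0).astype(float)
L = graph_laplacian(4, [(0, 1), (2, 3)])

beta_plain = fit_network_logistic(X, y, L, lam=0.0)  # no network constraint
beta_net = fit_network_logistic(X, y, L, lam=1.0)    # network-constrained
```

Comparing `smoothness(beta_plain, L)` with `smoothness(beta_net, L)` shows how the constraint pulls the coefficients of connected genes toward each other. Swapping in a Laplacian built from a different knowledge source changes which genes are grouped, which is one way to read the abstract's point about multiple networks highlighting distinct biology.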

Session A-308: Active Interaction Mapping reveals the hierarchical organization of autophagy

COSI: NetBio

Michael Kramer, University of California, San Diego, United States
Jean-Claude Farre, University of California, San Diego, United States
Koyel Mitra, University of California, San Diego, United States
Michael Ku Yu, University of California, San Diego, United States
Keiichiro Ono, University of California, San Diego, United States
Barry Demchak, University of California, San Diego, United States
Katherine Licon, University of California, San Diego, United States
Mitchell Flagg, University of California, San Diego, United States
Rama Balakrishnan, Stanford University, United States
J. Michael Cherry, Stanford University, United States
Suresh Subramani, University of California, San Diego, United States
Trey Ideker, University of California, San Diego, United States

Session A-310: Revealing Complex Genetic Bases of ABC Transporter Mediated Drug Resistance in Yeast using an Engineered Population Strategy

COSI: NetBio

Albi Celaj, Roth Lab, Canada
Louai Musa, University of Toronto, Canada
Marinella Gebbia, University of Toronto, Canada
Minjeong Ko, Ewha Womans University / Ewha Research Center for Systems Biology, South Korea
Shijie Zhou, University of Toronto, Canada
Tiffany Fong, University of Toronto, Canada
Nozomu Yachie, University of Toronto, Canada
Frederick Roth, University of Toronto & Mt Sinai Hospital, Canada

Session A-312: The STRING app: bringing quality-controlled protein-protein and protein-chemical networks into Cytoscape

COSI: NetBio

Nadezhda T. Doncheva, University of Copenhagen, Denmark
John H. Morris, University of California, San Francisco, United States
Jan Gorodkin, University of Copenhagen, Denmark
Lars Juhl Jensen, University of Copenhagen, Denmark

Session A-314: A network perspective for the TFEB transcription factor in angiogenesis

COSI: NetBio

Davide Corà, Dept. of Oncology, University of Torino, c/o Candiolo Cancer Institute IRCCS, Italy
Elena Astanina, Dept. of Oncology, University of Torino, c/o Candiolo Cancer Institute IRCCS, Italy
Alessio Noghero, Dept. of Oncology, University of Torino, c/o Candiolo Cancer Institute IRCCS, Italy
Francesco Neri, Leibniz Institute on Aging - Fritz Lipmann Institute (FLI), Germany
Salvatore Oliviero, Dept. of Life Science and Systems Biology, University of Torino, Italy
Federico Bussolino, Dept. of Oncology, University of Torino, c/o Candiolo Cancer Institute IRCCS, Italy
Gabriella Doronzo, Dept. of Oncology, University of Torino, c/o Candiolo Cancer Institute IRCCS, Italy

Session A-316: Quality assessment of gene coexpression network; reproducibility, functional consistency and genomic consistency

COSI: NetBio

Takeshi Obayashi, Graduate School of Information Sciences, Tohoku University, Japan
Yuichi Aoki, Graduate School of Information Sciences, Tohoku University, Japan
Kengo Kinoshita, Graduate School of Information Sciences, Tohoku University, Japan

Session A-318: Integration of multilevel OMICs data based on the identification of regulatory modules from protein-protein interaction networks

COSI: NetBio

T. Conrad, Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Germany
O. Kniemeyer, Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knöll-Institute, Germany
S. G. Henkel, BioControl Jena GmbH, Germany
T. Krüger, Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knöll-Institute, Germany
R. Guthke, Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knöll-Institute, Germany
A. A. Brakhage, Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knöll-Institute, Germany
S. Vlaic, Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knöll-Institute, Germany
J. Linde, Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology – Hans-Knöll-Institute; Research Group PiDOMICS, Leibniz Institute for Natural Product Research and, Germany

Session A-320: Analyzing biological processes of anatomical context-specific molecular networks through large-scale integration

COSI: NetBio

Mijin Kwon, KAIST, South Korea
Yousang Jo, KAIST, South Korea
Sunghwa Bae, KAIST, South Korea
Soorin Yim, KAIST, South Korea
Gwangmin Kim, KAIST, South Korea
Doheon Lee, KAIST, South Korea

Session A-322: Annotating activation/inhibition relationships to protein-protein interactions using gene ontology

COSI: NetBio

Soorin Yim, KAIST, South Korea
Hasun Yu, KAIST, South Korea
Dongjin Jang, KAIST, South Korea
Doheon Lee, KAIST, South Korea

Session A-324: Using multi-stage disease expression profiles to identify dynamics in the network topology

COSI: NetBio

Ru-Fong Peng, National Kaohsiung University of Applied Sciences, Taiwan
Cheng-Han Hsieh, National Kaohsiung University of Applied Sciences, Taiwan
Wen-Yu Chung, National Kaohsiung University of Applied Sciences, Taiwan

Session A-326: NetRep: a scalable permutation approach for assessing replication and preservation of network modules in large datasets

COSI: NetBio

Scott Ritchie, The University of Melbourne, Australia
Stephen Watts, The University of Melbourne, Australia
Liam Fearnley, The University of Melbourne, Australia
Kathryn Holt, The University of Melbourne, Australia
Gad Abraham, The University of Melbourne, Australia
Michael Inouye, The University of Melbourne, Australia

Session A-328: Structural Connectome of Drosophila Brain at the Single-Cell Scale

COSI: NetBio

Chi-Tin Shih, Department of Applied Physics, Tunghai University, Taiwan
Chung-Chuan Lo, Institute of Systems Neuroscience, National Tsing Hua University, Taiwan
Ann-Shyn Chiang, Brain Research Center, National Tsing Hua University, Taiwan

Session A-330: Multi-omics network analysis reveals a muscular dystrophy-related signature in mouse blood

COSI: NetBio

Kristina Hettne, Leiden University Medical Center, Netherlands
Roula Tsonaka, Leiden University Medical Center, Netherlands
Mohammed Charrout, Leiden University Medical Center, Netherlands
Olga Veth, Leiden University Medical Center, Netherlands
Alexandre Seyer, Profilomic, France
Eleni Mina, Leiden University Medical Center, Netherlands
Peter A. C. 't Hoen, Leiden University Medical Center, Netherlands
Annemieke Aartsma-Rus, Leiden University Medical Center, Netherlands
Pietro Spitali, Leiden University Medical Center, Netherlands

Session A-332: Metabolic pathway extraction from text

COSI: NetBio

Cécile Pereira, University of Florida, United States
Salva Casani, Centro de Investigaciones Principe Felipe (CIPF), Spain
Ana Conesa, University of Florida, United States

Session A-334: A novel metabolite-centric approach for the identification of perturbed metabolic pathways from genome-wide data

COSI: NetBio

Tunahan Cakir, Gebze Technical University, Turkey

Session A-336: Drug Response Prediction as a Link Prediction Problem

COSI: NetBio

Zachary Stanfield, Case Western Reserve University, United States
Mustafa Coskun, Case Western Reserve University, United States
Mehmet Koyuturk, Case Western Reserve University, United States

Session A-338: PathSys: Integrating pathway curation, profiling methods, and public repositories: An infrastructure for functional molecular data sharing

COSI: NetBio

Sokratis Kariotis, University of Sheffield, United Kingdom

Session A-340: Crowdsourced enhancement of causal network models – results from past network verification challenges and new application enabling liver phase I xenobiotic metabolism model refinement.

COSI: NetBio

Stephanie Boue, Philip Morris International R&D, Switzerland
Justyna Szostak, Philip Morris International R&D, Switzerland
Marja Talikka, Philip Morris International R&D, Switzerland
Florian Martin, Philip Morris International R&D, Switzerland
Julia Hoeng, Philip Morris International R&D, Switzerland

Session A-342: Mapping DNA damage-dependent protein interactions via conditional Barcode Fusion Genetics-Yeast Two-Hybrid

COSI: NetBio

Dae-Kyum Kim, Donnelly Centre, University of Toronto, Canada
Brandon Ho, Donnelly Centre, University of Toronto, Canada
Nishka Kishore, Lunenfeld-Tanenbaum Research Institute (LTRI), Mount Sinai Hospital, Canada
Nikko Torres, Donnelly Centre, University of Toronto, Canada
Dayag Sheykhkarimli, Donnelly Centre, University of Toronto, Canada
Siyang Li, Donnelly Centre, University of Toronto, Canada
Natascha van Lieshout, Lunenfeld-Tanenbaum Research Institute (LTRI), Mount Sinai Hospital, Canada
Evangelia Petsalakis, European Bioinformatics Institute (EMBL-EBI), United Kingdom
Jennifer Knapp, Donnelly Centre, University of Toronto, Canada
Julia Kitaygorodsky, Donnelly Centre, University of Toronto, Canada
Ghazal Haddad, Donnelly Centre, University of Toronto, Canada
Atina Cote, Donnelly Centre, University of Toronto, Canada
Marinella Gebbia, Donnelly Centre, University of Toronto, Canada
Jochen Weile, Donnelly Centre, University of Toronto, Canada
Daniel Durocher, Lunenfeld-Tanenbaum Research Institute (LTRI), Mount Sinai Hospital, Canada
David E. Hill, Center for Cancer Systems Biology (CCSB), Dana Farber Cancer Institute, United States
Marc Vidal, Center for Cancer Systems Biology (CCSB), Dana Farber Cancer Institute, United States
Grant W. Brown, Donnelly Centre, University of Toronto, Canada
Frederick P. Roth, Donnelly Centre, University of Toronto, Canada

Session A-344: Network Pharmacology Exploration of the Anti-Inflammatory Drug Space

COSI: NetBio

Guillermo de Anda Jáuregui, University of North Dakota School of Medicine and Health Sciences, United States
Kai Guo, University of North Dakota School of Medicine and Health Sciences, United States
Brett McGregor, University of North Dakota School of Medicine and Health Sciences, United States
Junguk Hur, University of North Dakota, United States

Session A-346: Pathways on demand: automated reconstruction of human signaling networks

COSI: NetBio

Anna Ritz, Reed College, United States
Christopher Poirel, RedOwl Analytics, United States
Allison Tegge, Virginia Tech, United States
Nicholas Sharp, Virginia Tech, United States
Kelsey Simmons, Virginia Tech, United States
Allison Powell, Virginia Tech, United States
Shiv Kale, Virginia Tech, United States
T. M. Murali, Virginia Tech, United States

Session A-348: Exploring Bacteriophage Diversity Through Gene-Level Networks

COSI: NetBio

Jason Shapiro, Loyola University Chicago, United States
Catherine Putonti, Loyola University Chicago, United States

Session A-350: Generalized benchmarking of gene prioritization methods

COSI: NetBio

Dimitri Guala, Stockholm University, Sweden
Erik Sonnhammer, Stockholm University, Sweden

Session A-352: Modeling biological network for identification of new players in the innate immune system

COSI: NetBio

Rasha Boulos, Institut de Recherche en Cancérologie de Montpellier, France
Matthias Habjan, Max-Planck Institute of Biochemistry, Germany
Assel Mussabekova, Institut de Biologie Moléculaire et Cellulaire, France
Carine Meignin, Institut de Biologie Moléculaire et Cellulaire, France
Jean Luc Imler, Institut de Biologie Moléculaire et Cellulaire, France
Andreas Pichlmair, Max-Planck Institute of Biochemistry, Germany
Jacques Colinge, Institut de Recherche en Cancérologie de Montpellier, France

Session A-354: FunCoup v4

COSI: NetBio

Christoph Ogris, Stockholm University, Sweden
Mateusz Kaduk, Stockholm University, Sweden
Dimitri Guala, Stockholm University, Sweden
Erik Sonnhammer, Stockholm University, Sweden

Session A-356: Network-based method for feature selection and prediction of cancer drug response

COSI: NetBio

Mushthofa Mushthofa, Ghent University, Belgium
Lieven Verbeke, Ghent University, Belgium
Kathleen Marchal, Ghent University, Belgium

Session A-358: Using network analysis to identify a new key set of Parkinson’s Disease associated gene

COSI: NetBio

Katharina F Heil, University of Edinburgh // KTH Stockholm, United Kingdom
Oksana Sorokina, University of Edinburgh, United Kingdom
Colin Mclean, University of Edinburgh, United Kingdom
J Douglas Armstrong, University of Edinburgh, United Kingdom

Session A-360: Residue interaction networks explain role of amino acids in protein stability

COSI: NetBio

Dibyajyoti Das, TATA Consultancy Services Ltd., India
Arijit Roy, TATA Consultancy Services Ltd., India
Navneet Bung, TATA Consultancy Services Ltd., India
Gopalakrishnan Bulusu, TATA Consultancy Services Ltd., India

Session A-362: HGPEC: a Cytoscape app for prediction of novel disease-gene and disease-disease associations and evidence collection based on a random walk on heterogeneous network

COSI: NetBio

Duc-Hau Le, VINMEC Research Institute of Stem Cell and Gene Technology, Viet Nam

Session A-364: Pan-cancer classification of gene signatures for their information value and functional redundancy

COSI: NetBio

Laura Cantini, Institut Curie, INSERM U900, PSL Research University, Mines ParisTech, 26, rue d’Ulm, F-75248 Paris, France
Laurence Calzone, Institut Curie, INSERM U900, PSL Research University, Mines ParisTech, 26, rue d’Ulm, F-75248 Paris, France
Loredana Martignetti, Institut Curie, INSERM U900, PSL Research University, Mines ParisTech, 26, rue d’Ulm, F-75248 Paris, France
Emmanuel Barillot, Institut Curie, INSERM U900, PSL Research University, Mines ParisTech, 26, rue d’Ulm, F-75248 Paris, France
Andrei Zinovyev, Institut Curie, INSERM U900, PSL Research University, Mines ParisTech, 26, rue d’Ulm, F-75248 Paris, France

Session A-366: A PRObabilistic Pathway Score (PROPS) for Classification with Applications to Inflammatory Bowel Disease

COSI: NetBio

Lichy Han, Stanford University, United States
Mateusz Maciejewski, Pfizer, Inc, United States
Christoph Brockel, Pfizer, Inc, United States
William Gordon, Pfizer, Inc, United States
Scott B. Snapper, Boston Children's Hospital, Harvard Medical School, United States
Joshua R. Korzenik, Brigham & Women's Hospital, Harvard Medical School, United States
Lovisa Afzelius, Pfizer, Inc, United States
Russ B. Altman, Stanford University, United States

Session A-368: Alignment of dynamic networks

COSI: NetBio

Tijana Milenkovic, University of Notre Dame, United States
Vipin Vijayan, University of Notre Dame, United States
Dominic Critchlow, University of Notre Dame, United States

Short Abstract: Networks can model real-world systems in a variety of domains. Network alignment (NA) aims to find a node mapping that conserves similar regions between compared networks. NA is applicable to many fields, including computational biology, where NA can guide the transfer of biological knowledge from well- to poorly-studied species across aligned network regions. Existing NA methods can only align static networks. However, most complex real-world systems evolve over time and should thus be modeled as dynamic networks. We hypothesize that aligning dynamic network representations of evolving systems will produce superior alignments compared to aligning the systems' static network representations, as is currently done. For this purpose, we introduce the first ever dynamic NA method, DynaMAGNA++. This proof-of-concept dynamic NA method is an extension of a state-of-the-art static NA method, MAGNA++. Even though both MAGNA++ and DynaMAGNA++ optimize edge as well as node conservation across the aligned networks, MAGNA++ conserves static edges and similarity between static node neighborhoods, while DynaMAGNA++ conserves dynamic edges (events) and similarity between evolving node neighborhoods. For this purpose, we introduce the first ever measure of dynamic edge conservation and rely on our recent measure of dynamic node conservation. Importantly, the two dynamic conservation measures can be optimized using any state-of-the-art NA method and not just MAGNA++. We confirm our hypothesis that dynamic NA is superior to static NA, under fair comparison conditions, on synthetic and real-world networks, in computational biology and social network domains. DynaMAGNA++ is parallelized and it includes a user-friendly graphical interface.

Session A-370: A novel methodology on distributed representations of proteins using their interacting ligands: A case-study on Sphingolipid Metabolic Pathway

COSI: NetBio

Hakime Öztürk, Boğaziçi University, Turkey
Mehmet Aziz Yirik, Boğaziçi University, Turkey
Arzucan Ozgur, Boğaziçi University, Turkey
Elif Ozkirimli, Boğaziçi University, Turkey
Kutlu O. Ulgen, Boğaziçi University, Turkey

Short Abstract: Motivation: Proteins in a metabolic pathway catalyze reactions on similar metabolites. Examination of metabolic networks provides information on key targetable proteins. In this study, we propose a novel machine-learning-based method to cluster proteins by representing them with the ligands they bind to. The proteins are represented using the word embeddings of the SMILES representations of their ligands and then clustered using the K-means algorithm. We compare this method with two other methods. One is another machine-learning-based approach that uses a word-embedding model to represent proteins by their sequences. The other is a network-based model that we introduced in our previous study, in which proteins are connected to each other based on the ligands with which they interact. We showed that the investigation of proteins based on the ligands with which they interact reveals functionally meaningful protein families on a network model. To collect the required protein-ligand interaction data, we also developed a Python package to automatically extract protein-ligand interactions from available databases. As a test case, we examined proteins that participate in the sphingolipid (SL) metabolic pathway. Results: Our results show that describing proteins by the ligands that they bind to incorporates ligand similarity information, thus leading to the construction of functionally meaningful protein clusters. Availability: https://github.com/hkmztrk/SMILES2VecBasedProteinClustering
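The pipeline described above (represent each protein by its ligands, then cluster with K-means) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: learned SMILES word embeddings are replaced here by plain character 2-mer frequency vectors as a stand-in, K-means is written out by hand with a deterministic farthest-point initialization, and the protein names and ligand sets are hypothetical.

```python
import numpy as np


def smiles_kmers(smiles, k=2):
    """Split a SMILES string into overlapping character k-mers ("words")."""
    return [smiles[i:i + k] for i in range(len(smiles) - k + 1)]


def protein_vectors(protein_ligands, k=2):
    """Average the k-mer frequency vectors of each protein's ligands.

    Stand-in for learned word embeddings; every SMILES string must have
    at least k characters.
    """
    vocab = sorted({km for ligs in protein_ligands.values()
                    for s in ligs for km in smiles_kmers(s, k)})
    index = {km: i for i, km in enumerate(vocab)}
    names = sorted(protein_ligands)
    M = np.zeros((len(names), len(vocab)))
    for row, name in enumerate(names):
        for s in protein_ligands[name]:
            v = np.zeros(len(vocab))
            for km in smiles_kmers(s, k):
                v[index[km]] += 1.0
            M[row] += v / v.sum()
        M[row] /= len(protein_ligands[name])
    return names, M


def kmeans(M, n_clusters, n_iter=50):
    """Plain K-means with farthest-point initialization (deterministic)."""
    centroids = [M[0]]
    while len(centroids) < n_clusters:
        d = np.min([np.linalg.norm(M - c, axis=1) for c in centroids], axis=0)
        centroids.append(M[int(np.argmax(d))])
    C = np.array(centroids)
    for _ in range(n_iter):
        dist = np.linalg.norm(M[:, None, :] - C[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_C = np.array([M[labels == j].mean(axis=0) if np.any(labels == j)
                          else C[j] for j in range(n_clusters)])
        if np.allclose(new_C, C):
            break
        C = new_C
    return labels


def cluster_proteins(protein_ligands, n_clusters):
    """Return {protein name: cluster label}."""
    names, M = protein_vectors(protein_ligands)
    return dict(zip(names, kmeans(M, n_clusters).tolist()))


# Hypothetical proteins: two bind fatty acids, two bind small aromatics.
toy = {
    "protA": ["CCCCCCCC(=O)O", "CCCCCCCCCC(=O)O"],
    "protB": ["CCCCCCCCCCCC(=O)O"],
    "protC": ["c1ccccc1O", "c1ccccc1N"],
    "protD": ["Cc1ccccc1"],
}
clusters = cluster_proteins(toy, n_clusters=2)
```

In this toy input, the proteins that bind chemically similar ligands end up in the same cluster even though no sequence information is used, which is the property the abstract relies on.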

Session A-372: Incorporating Interaction Networks into the Determination of Functionally Related Hit Genes in Genomic Experiments with Markov Random Fields

COSI: NetBio

Sean Robinson, Université Grenoble Alpes/University of Turku, Finland
Jaakko Nevalainen, University of Turku, Finland
Guillaume Pinna, CEA, France
Anna Campalans, CEA, France
J. Pablo Radicella, CEA, France
Laurent Guyon, CEA, France

Short Abstract: Motivation: Incorporating gene interaction data into the identification of ‘hit’ genes in genomic experiments is a well-established approach leveraging the ‘guilt by association’ assumption to obtain a network-based hit list of functionally related genes. We aim to develop a method to allow for multivariate gene scores and multiple hit labels in order to extend the analysis of genomic screening data within such an approach. Results: We propose a Markov random field-based method to achieve our aim and show that the particular advantages of our method compared to those currently used lead to new insights in previously analysed data as well as for our own motivating data. Our method additionally achieves the best performance in an independent simulation experiment. The real data applications we consider comprise a survival analysis and differential expression experiment and a cell-based RNA interference functional screen. Availability: We provide all of the data and code related to the results in the paper.

Session A-374: ComPass – A graph-based algorithm for pathway analysis in microbial communities

COSI: NetBio

Aarthi Ravikrishnan, Indian Institute of Technology Madras, India
Meghana Nasre, Indian Institute of Technology Madras, India
Karthik Raman, Indian Institute of Technology Madras, India

Short Abstract: Micro-organisms are ubiquitous and exist together in communities, where they interact with each other through several means, especially the exchange of chemical signals and metabolites. There has been an increasing interest in using microbial consortia for applications in metabolic engineering. A microbial consortium, besides enjoying division of labour, also provides a wider scope for joint exploration of diverse metabolisms. Although there are many naturally occurring microbial communities, their systematic exploration has been very rarely carried out. This is primarily due to the difficulties in understanding the complex interactions between the organisms in a community. In order to systematically design a microbial community, it is important to understand their metabolism and the metabolic exchanges happening therein. To this end, we have developed a novel graph-based algorithm, ComPass (Community Pathway analysis), to predict all possible metabolic interactions that occur between microorganisms in a consortium. We demonstrate that ComPass can easily scale to large metabolic networks and can reliably predict several sub-networks between any given source and target metabolite. We illustrate the utility of ComPass to understand existing microbial communities by analysing the predicted sub-networks from different types of microbial communities and demonstrate interesting metabolic exchanges that occur between the micro-organisms.

Session A-472: Network based approach for analysis of cell heterogeneity and immune polarization in tumor microenvironment from single-cell data

COSI: NetBio

Maria Kondratova, Institut Curie

Short Abstract: The tumor microenvironment (TME) plays important and sometimes opposing roles in tumor evolution. We have created a collection of comprehensive cell type-specific maps of molecular interactions in the TME. The collection includes maps of Macrophages (Mph), Dendritic cells (DC), Natural killers (NK), and non-immune cancer-associated fibroblasts (CAFs). Cell type-specific innate-immune maps were integrated together with specific information on Neutrophils, Mast cells and MDSC in the TME, which gave rise to a seamless comprehensive meta-map of innate immune response in cancer, depicting signalling responsible for anti- and pro-tumour activities of the innate immune system as a whole. It is a ‘geographical-like’ hierarchically organized meta-map with functional ‘zones’, namely, signalling mechanisms contributing to anti-tumor or pro-tumor immune phenotypes. The map contains 1476 objects and is based on information manually retrieved from 820 papers. It will soon become part of ACSN, http://acsn.curie.fr.

Finally, we applied these network maps to identify molecular mechanisms regulating cell reprogramming in several innate immune cell types. We applied unsupervised classification methods to decompose single-cell RNA-Seq data on fibroblasts, natural killers and macrophages from melanoma and selected several sub-groups in each population, which probably play different functional roles in the TME. Analysis and interpretation of the expression data for each subset in the context of cell type-specific maps and the innate immunity meta-map revealed the functional differences resulting from transcriptomic heterogeneity in several cell types.

Session A-474: Hyperpath Relaxations for Signaling Pathway Analysis

COSI: NetBio

Anna Ritz, Reed College, United States
Nicholas Franzese, Reed College, United States
Barney Potter, Reed College, United States
Adam Groce, Reed College, United States
James Fix, Reed College, United States

Short Abstract: Signaling pathways are series of reactions that are typically initiated by the binding of an extracellular ligand to a membrane-bound receptor and culminate in altered expression of a set of target genes. Pathways are commonly represented as graphs, which offer elegant algorithms for analyzing signaling pathways but fail to capture many-to-many relationships among molecules in signaling reactions. We recently presented a shortest path formulation posed on directed hypergraphs, a generalization of graphs that captures many aspects of signaling reactions. However, it offered a strict definition of connectivity that limited its applicability to real-world signaling pathways. Here, we extend a mixed integer linear program (mILP) to achieve hyperpath relaxations in two ways. First, we allow simple cycles in shortest hyperpaths that capture feedback loops. Second, we allow plausible "source" nodes that are not specified in advance. We apply these relaxations to hypergraphs built automatically from pathway databases.
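Why a strict notion of hypergraph connectivity can be limiting is easy to see in a small sketch. The code below assumes the common B-connectivity rule for directed hypergraphs, in which a hyperedge can be traversed only once every node in its tail has been reached; the abstract does not spell out its exact definition, so this rule is an assumption. The function name and the toy reactions are hypothetical, and the sketch does not implement the mILP itself.

```python
def b_connected_nodes(hyperedges, sources):
    """Return all nodes reachable from `sources` under B-connectivity.

    hyperedges: iterable of (tail, head) pairs, each a set of node names.
    A hyperedge fires only when its whole tail has been reached.
    """
    reached = set(sources)
    changed = True
    while changed:
        changed = False
        for tail, head in hyperedges:
            if set(tail) <= reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached


# Toy signaling reactions (hypothetical): each needs all of its reactants.
reactions = [
    ({"ligand", "receptor"}, {"ligand:receptor"}),
    ({"ligand:receptor", "kinase"}, {"kinase-P"}),
    ({"kinase-P", "TF"}, {"TF-P"}),
]

# Strict setting: only the ligand and receptor are declared as sources,
# so the second reaction never fires because "kinase" is not reached.
strict = b_connected_nodes(reactions, {"ligand", "receptor"})

# Relaxed setting: "kinase" and "TF" are admitted as additional sources.
relaxed = b_connected_nodes(reactions, {"ligand", "receptor", "kinase", "TF"})
```

With only the ligand and receptor as sources, the downstream target `TF-P` is unreachable, although a biologist would consider the pathway connected. Admitting cofactors such as `kinase` as extra sources restores reachability, which is the intuition behind allowing source nodes that are not specified in advance.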

Session A-478: TETRAMER, a temporal transcriptional regulation modeller

COSI: NetBio

Marco Antonio Mendoza Parra, IGBMC, France

Short Abstract: Describing living systems through the reconstitution of their genomic-regulatory functions is the biggest challenge of the current "big-data omics" era. Here we present TETRAMER, a Cytoscape app providing a user-friendly framework for the reconstruction of cell fate transition-specific GRNs by integrating user-provided temporal transcriptomes with generic GRNs derived from (i) the analysis of multiple publicly available human/mouse gene expression profiles (CellNet); (ii) the genome-wide mapping of promoters and enhancers in multiple cell types/tissues from CAGE data generated by the FANTOM5 consortium (regulatory circuits); (iii) the systematic analysis of most publicly available ChIP-sequencing data corresponding to TF-binding in a variety of human or mouse cell types/tissues (ngs-qc: http://ngs-qc.org).
TETRAMER evaluates the capacity of each TF retrieved from the GRN to drive cell fate transformation. To this end, the temporal transcriptional regulation cascade derived from each TF is examined to verify its influence on the reconstitution of the differential gene expression patterns associated with the cell fate transition. TETRAMER is available from its dedicated website: http://igbmc.fr/Gronemeyer/qcgenomics/TETRAMER

Session A-500: Network Guided Regression Analysis of Molecular Pharmacology Data for Prediction of Drug Response in Cancer Cell Lines

COSI: NetBio

Augustin Luna, Dana-Farber Cancer Institute, United States
Vinodh Rajapakse, National Cancer Institute, United States
Lisa Loman, National Cancer Institute, United States
Chris Sander, Dana-Farber Cancer Institute, United States
William Reinhold, National Cancer Institute, United States
Yves Pommier, National Cancer Institute, United States

Short Abstract: A major challenge in precision medicine is the development of methods to predict drug response using multi-omic data with models that are both accurate and interpretable. There are a large number of statistical methods (e.g. LASSO, elastic net, random forest regression) that generate predictive models and allow for automatic feature selection. Still, these methodologies often fail to yield biologically interpretable models because their results are purely numerically driven, and issues such as collinearity can mask the most biologically relevant features. We present an analysis of large cell line databases using a network-constrained regression that incorporates pathway information from the Pathway Commons database to bridge the gap left by existing methods that ignore biological knowledge. In this work, we examine multivariate linear models based on combinations of molecular features and evaluate their predictive power for ~200 FDA-approved or investigational compounds.

Session A-504: Visualizing metabolomics data in directed biological networks

COSI: NetBio

Martina Kutmon, Maastricht University, Netherlands
Ryan Miller, Maastricht University, Netherlands
Denise Slenter, Maastricht University, Netherlands
Chris T Evelo, Maastricht University, Netherlands
Egon L Willighagen, Maastricht University, Netherlands

Short Abstract: We developed a new solution to visualize the biological pathways involved in sparse metabolomics data. Using knowledge from two pathway resources and ontology-based approaches, we can show the directed networks between active metabolites from metabolomics data. The data from both resources are made interoperable by collapsing metabolites in the pathways into single nodes in the biological networks using ontological approaches. This explicit ontological linking allows for precise biological interpretation of the paths. By using Neo4j and Cytoscape, we provide a computational environment that scales to larger networks as well as advanced visualization functionality to investigate the identified subnetworks. The generic nature of this approach makes it possible to combine it with other omics data sources, such as proteomics and transcriptomics.
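Extracting a directed subnetwork between active metabolites can be illustrated with a small in-memory sketch. It is not the authors' Neo4j/Cytoscape workflow: it assumes that the subnetwork is the union of shortest directed paths between every ordered pair of active metabolites, uses a plain adjacency dictionary in place of a graph database, and the function names and the toy pathway are hypothetical.

```python
from collections import deque
from itertools import permutations


def shortest_path(graph, source, target):
    """Breadth-first search for a shortest directed path, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def active_subnetwork(graph, active):
    """Union of edges on shortest directed paths between active nodes."""
    edges = set()
    for s, t in permutations(sorted(active), 2):
        path = shortest_path(graph, s, t)
        if path:
            edges.update(zip(path, path[1:]))
    return edges


# Toy directed metabolite network (simplified, for illustration only).
pathway = {
    "glucose": ["G6P"],
    "G6P": ["F6P"],
    "F6P": ["pyruvate"],
    "pyruvate": ["lactate", "acetyl-CoA"],
}

# Metabolites assumed to be measured as "active" in a sparse data set.
subnet = active_subnetwork(pathway, {"glucose", "lactate"})
```

Only the edges on the route from `glucose` to `lactate` are kept; the branch to `acetyl-CoA` is left out because no active metabolite lies on it. The direction matters: there is no path from `lactate` back to `glucose` in this toy network, so nothing is added for that pair.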

View Posters By Category

Session A: (July 22 and July 23)

3Dsig
Bioinformatics Open Source Conference (BOSC)
CAMDA
Education
Network Biology
Regulatory Genomics (RegGenSig)
RNA
Computational Modeling of Biological Systems (SysMod)

Session B: (July 24 and July 25)

Bio-Ontologies
BioVis
Function
High Throughput Sequencing Algorithms and Applications (HitSeq)
Machine Learning Systems Biology (MLSB)
Translational Medicine (TransMed)
VarI
Other

ISMB/ECCB 2017
