Posters - Schedules

Monday, July 11 and Tuesday, July 12 between 12:30 PM CDT and 2:30 PM CDT
Wednesday, July 13 between 12:30 PM CDT and 2:30 PM CDT
Session A Poster Set-up and Dismantle
Session A Posters set up:
Monday, July 11 between 7:30 AM CDT and 10:00 AM CDT
Session A Posters dismantle:
Tuesday, July 12 at 6:00 PM CDT
Session B Poster Set-up and Dismantle
Session B Posters set up:
Wednesday, July 13 between 7:30 AM CDT and 10:00 AM CDT
Session B Posters dismantle:
Thursday, July 14 at 2:00 PM CDT
Virtual: Application of module detection methods to TCGA Head and Neck Cancer dataset
COSI: NetBio
  • Amulya Shastry, Boston University, United States
  • Zachary Peters Wakefield, Boston University, United States
  • Elizabeth Irene Tchantouridze, Boston University, United States
  • Lina Kroehling, Boston University, United States
  • Mohammed Muzamil Khan, Boston University, United States
  • Anthony Federico, Boston University, United States
  • Gary Benson, Boston University, United States
  • Stefano Monti, Boston University, United States


Head and Neck Squamous Cell Carcinomas (HNSCCs), classified based on their causative agents as Human papillomavirus-driven (HPV +ve) or alcohol- and smoking-driven (HPV -ve), are the sixth most common cancers worldwide. Although single-gene differential analysis aids our understanding of disease mechanisms, it lacks the context incorporated by co-expression networks. Using RNA-Seq samples from The Cancer Genome Atlas, we applied three module detection methods: i) weighted gene correlation network analysis (WGCNA), ii) independent component analysis (ICA), and iii) multiscale embedded gene co-expression networks (MEGENA), across two phenotype categories, HPV status and age. We investigated the connectivity of genes between phenotypes (differential connectivity) in these modules using the ConAn tool and calculated the Jaccard index to detect the gene overlaps between modules detected by the three methods. For age, only WGCNA identified significant modules (FDR < 0.05), indicating that module detection methods require a stronger phenotype. For HPV status, all three methods identified modules with higher connectivity in HPV -ve samples, with high overlap of genes across modules (Jaccard index). These shared modules show consistent enrichment for specific metabolism and immune pathways and could lead to the identification of novel biomarkers or therapeutic targets in HPV -ve HNSCCs.
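
As an illustration of the module-overlap step mentioned in this abstract, the Jaccard index of two gene sets is the size of their intersection divided by the size of their union. The sketch below is a minimal example under stated assumptions, not the authors' pipeline: the function names and the gene lists are hypothetical and chosen only for demonstration.

```python
def jaccard_index(module_a, module_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of gene identifiers."""
    a, b = set(module_a), set(module_b)
    union = a | b
    if not union:
        # Two empty modules share nothing; define the overlap as 0.
        return 0.0
    return len(a & b) / len(union)


def best_overlaps(modules_x, modules_y):
    """For each module in modules_x, find the most similar module in modules_y.

    Both arguments are dicts mapping a module name to a list of genes.
    Returns a dict: module name in x -> (best module name in y, Jaccard index).
    """
    result = {}
    for name_x, genes_x in modules_x.items():
        scored = [(jaccard_index(genes_x, genes_y), name_y)
                  for name_y, genes_y in modules_y.items()]
        score, name_y = max(scored)
        result[name_x] = (name_y, score)
    return result


# Hypothetical modules from two different detection methods.
method_1 = {"M1": ["TP53", "EGFR", "MYC"]}
method_2 = {"N1": ["EGFR", "MYC", "CDKN2A"], "N2": ["BRCA1"]}
matches = best_overlaps(method_1, method_2)
```

Here `matches["M1"]` pairs module `M1` with `N1`, since the two share two of four distinct genes.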

Virtual: Automorphism orbits based characterization of protein interaction networks across the three domains of life
COSI: NetBio
  • Vikram Singh, Central University of Himachal Pradesh, India


The enormous diversity of life forms thriving in drastically different environmental milieus involves a complex interplay among constituent proteins interacting with each other. However, the organizational principles characterizing the evolution of protein interaction networks (PINs) across the tree of life are largely unknown. Here we study 4,738 PINs belonging to 16 phyla to discover phyla-specific architectural features and examine whether evolutionary constraints are imposed on the networks' topologies. We utilized positional information of a network’s nodes by normalizing the frequencies of automorphism orbits appearing in graphlets of sizes 2-5. We report that orbit usage profiles (OUPs) of networks belonging to the three domains of life differ markedly, not only at the domain level but also at the scale of phyla. Integrating information on protein families, domains, subcellular location, gene ontology, and pathways, our results indicate that the wiring patterns of PINs in different phyla are not randomly generated; rather, they are shaped by evolutionary constraints imposed on them. There exist subtle but substantial variations in the wiring patterns of PINs that enable OUPs to differentiate among different superfamilies. A deep neural network trained on differentially expressed orbits achieved a prediction accuracy of 85%.

Virtual: Generating a Fungal Knowledge-Base and Ontology Terms by Integrating Multi-omics data and Natural Language Processing for LPMO-specific Lignocellulosic Deconstruction
COSI: NetBio
  • Shashank Ravichandran, SASTRA Deemed to be University, Thanjavur, India
  • Yatindrapravanan Narasimhan, SASTRA Deemed to be University, Thanjavur, India
  • Pulkit Anupam Srivastava, Qriousatom, India
  • Ragothaman Yennamalli, SASTRA Deemed to be University, India


The efficient degradation of plant biomass is one of the primary goals for many industrial applications. However, the crystallinity and composition of lignocellulosic biomass in the plant cell wall make it recalcitrant. Conventional cellulases fail to act on crystalline polysaccharides, whereas Lytic Polysaccharide Monooxygenases (LPMOs) oxidatively cleave them. LPMOs belong to a new family of enzymes acting synergistically with other cellulolytic enzymes by oxidative cleavage of internal glycosidic bonds inaccessible to the endo-acting hydrolases. We created a one-stop knowledge base that integrates various LPMO data, along with a pipeline to process and aggregate existing data. Specifically, we performed a co-expression network-based meta-analysis of differentially expressed genes (DEGs) to address issues such as improper annotation of LPMO-coding genes in fungi and the lack of knowledge regarding the different mechanisms of genes involved in lignocellulosic deconstruction. We performed the study in three parts: a) retrieving the relevant datasets from the GEO database; b) using an NGS data processing pipeline implemented in Python to identify the significant DEGs and building the co-expression network using WGCNA in R; and c) performing basic NLP, including Named Entity Recognition with BioBERT, to generate ontology terms specific to lignocellulosic degradation, improving the usability and applicability of the knowledge base.

Virtual: Inferring biologically relevant molecular tissue substructures by agglomerative clustering of digitized spatial transcriptomes with multilayer
COSI: NetBio
  • Marco Antonio Mendoza-Parra, French National Sequencing Center / Genoscope, France


Spatially resolved transcriptomics (SrT) can investigate organ or tissue architecture from the angle of gene programs that define their molecular complexity. However, computational methods to analyze SrT data underexploit their spatial signature. Inspired by contextual pixel classification strategies applied to image analysis, we developed MULTILAYER to stratify maps into functionally relevant molecular substructures. MULTILAYER applies agglomerative clustering within contiguous locally defined transcriptomes (gene expression elements or “gexels”) combined with community detection methods for graphical partitioning. MULTILAYER resolves molecular tissue substructures within a variety of SrT data with superior performance to commonly used dimensionality reduction strategies and still detects differentially expressed genes on par with existing methods. MULTILAYER can process high-resolution as well as multiple SrT data in a comparative mode, anticipating future needs in the field. MULTILAYER provides a digital image perspective for SrT analysis and opens the door to contextual gexel classification strategies for developing self-supervised molecular diagnosis solutions.

Virtual: Microbial network change in the global ocean
COSI: NetBio
  • Ina Maria Deutschmann, Institut de Biologie de l’Ecole Normale Supérieure, CNRS, INSERM, Ecole Normale Supérieure, Université PSL, France
  • Erwan Delage, Université de Nantes, CNRS UMR 6004, LS2N, F-44000, and FR2022 / Tara Oceans GOSEE, France
  • Caterina R. Giner, Institute of Marine Sciences, CSIC, Spain
  • Marta Sebastián, Institute of Marine Sciences, CSIC, Spain
  • Julie Poulain, Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, France
  • Javier Arístegui, Instituto de Oceanografía y Cambio Global, IOCAG, Universidad de Las Palmas de Gran Canaria, ULPGC, Spain
  • Carlos M. Duarte, King Abdullah University of Science and Technology (KAUST), Red Sea Research Center (RSRC), Thuwal, Saudi Arabia
  • Silvia G. Acinas, Institute of Marine Sciences, CSIC, Spain
  • Ramon Massana, Institute of Marine Sciences, CSIC, Spain
  • Josep M. Gasol, Institute of Marine Sciences, CSIC, and School of Sciences, Edith Cowan University, Spain
  • Damien Eveillard, Université de Nantes, CNRS UMR 6004, LS2N, F-44000, and FR2022 / Tara Oceans GOSEE, France
  • Samuel Chaffron, Université de Nantes, CNRS UMR 6004, LS2N, F-44000, and FR2022 / Tara Oceans GOSEE, France
  • Ramiro Logares, Institute of Marine Sciences, CSIC, Spain


Microbial interactions underpin ocean ecosystem function but remain barely known. Multiple studies analyzed microbial interactions using static association networks based on omics data, yet microbial interactions can change across spatiotemporal scales. So far, few studies have investigated the dynamics of microbial interactions, which is needed to comprehend ocean ecosystems and food webs better. We explored associations between archaea, bacteria, and picoeukaryotes along the water column, from the surface to the deep ocean, across the northern subtropical to the southern temperate ocean and the Mediterranean Sea. By delineating 397 sample-specific subnetworks, we examined the change in microbial associations across space. We found that associations tend to change with depth and geographical scale, with a few associations being global (i.e., present across regions within the same depth layer) and most of them being regional. Most associations observed in surface waters disappeared with depth, suggesting that surface ocean interactions are not transferred to the deep sea, despite microbial sinking mediated by particles. Our results suggest highly heterogeneous distributions of microbial interactions in the ocean that do not mirror taxonomic distributions. Our work contributes to a better understanding of the dynamics of microbial interactions, which is needed in a context of global change.

Virtual: Modelling CAR T cell signalling using prior knowledge networks
COSI: NetBio
  • Alice Driessen, IBM Research, Switzerland
  • Rocío Castellanos Rueda, ETH Zürich, Switzerland
  • Constance le Gac, IBM Research, Switzerland
  • Nicolas Deutschman, IBM Research, Switzerland
  • Maria Rodriguez Martinez, IBM Research, Switzerland
  • Sai Reddy, sai.reddy@bsse.ethz.ch, Switzerland


Chimeric antigen receptor (CAR) T cells are a promising new approach in cancer immunotherapy. Their safety, efficacy and phenotype depend heavily on the design of the CAR, whose intracellular tail contains up to three domains derived from a range of cellular signalling receptors. Due to this modular design and the multitude of possible domains, there is a vast combinatorial space of CAR designs. There are substantial efforts to improve CAR T cells based on CAR designs. However, testing the effect of each CAR design experimentally is very resource- and labour-intensive, and not feasible beyond a few hundred different combinations. Therefore, we aim to predict T cell phenotypes upon expression of different CAR designs, informed by single-cell RNA sequencing of a small library of 30 CAR designs using combinations of five different domains. Exploiting a prior knowledge signalling network, we design models of signalling pathways. We evaluate multiple network algorithms linking CAR domains to the phenotype of T cells, including flow maximisation, integer linear programming and pathway signal flow. As a result, we will present an interpretable model that identifies pathways activated by different CAR designs, predicts the phenotype of CAR T cells and can guide CAR T cell therapy.

Virtual: Novel Protein-Protein Interaction Approach Defines Mismatch Repair Deficient Breast Cancer as a Distinct Molecular Subtype
COSI: NetBio
  • Sean Hacking, Department of Pathology and Laboratory Medicine, Rhode Island Hospital and Lifespan Medical Center, Providence, RI, United States
  • Charissa Chou, Department of Biology, Brown University, Providence, RI, United States
  • Yigit Baykara, Department of Pathology and Laboratory Medicine, Rhode Island Hospital and Lifespan Medical Center, Providence, RI, United States
  • Yihong Wang, Department of Pathology and Laboratory Medicine, Rhode Island Hospital and Lifespan Medical Center, Providence, RI, United States
  • Alper Uzun, Department of Pediatrics, Women and Infants Hospital, Providence, RI, United States
  • Ece Uzun, Department of Pathology and Laboratory Medicine, Rhode Island Hospital and Lifespan Medical Center, Providence, RI, United States


Mismatch repair (MMR) alterations are important prognostic and predictive biomarkers in a variety of cancer subtypes, including colorectal and endometrial. However, in breast cancer (BC), the distinction and clinical significance of MMR is largely unknown. Genome-wide association studies (GWAS) have grown in popularity as they offer an approach to investigate complex diseases such as BC. However, GWAS have fallen short in identifying ‘missing heritability’; our fundamental understanding of the complex architecture of the genome due to additive genetic effects is not fully represented by GWAS. Analyzing this complex architecture could reveal subgroups of patients who share variants in genes within specific networks and a shared phenotype. To solve this problem, we recently developed Proteinarium, a multi-sample protein-protein interaction (PPI) tool with the ability to identify clusters of cancer patients with shared gene networks. In the present study, we analyzed The Cancer Genome Atlas (TCGA) data with Proteinarium and showed a distinct separation between the MMR-deficient and MMR-intact specific networks in a cohort of 996 BC patients. Proteinarium analysis successfully demonstrated MMR-deficient BC to be a distinct molecular subtype with unique PPI networks involving histone hub genes.

Virtual: Prediction of drug combinations based on efficacy and safety estimates
COSI: NetBio
  • Arindam Ghosh, University of Eastern Finland, Finland
  • Vittorio Fortino, University of Eastern Finland, Finland


Identification of drug combinations to treat complex diseases like cancer necessitates a balance between efficacy and safety. Lately, a combination of network biology and machine learning has been successful due to its ability to place the different entities related to drugs and diseases in a single framework and to discover hidden patterns in vast datasets. Previous research has shown that network distances between proteins affected by diseases and those targeted by drugs can be used effectively to identify drug combinations. However, most current methods address the aspects of efficacy and safety separately. Here we present a method that uses network-derived efficacy and safety estimates to train a supervised machine learning classifier for predicting effective and safe drug combinations. Initial assessment of the method using leukemia and lung cancer drug combinations showed promising results for random forest-based models, which achieved test accuracy greater than 0.9 and 0.7, respectively. Efforts are currently underway to validate the models using an external dataset and to improve the feature engineering used to assess safety and efficacy. The developed method can aid in pre-assessment of the efficacy and safety of drug combinations before they enter clinical trials, and thus reduce the number of failures.
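
To make the notion of a network distance between drug targets and disease proteins concrete, the sketch below computes one commonly used variant: the average, over drug targets, of the shortest-path length to the nearest disease protein. This is an illustrative assumption, not necessarily the estimate used in the poster; the function names and the toy network are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closest_distance(adjacency, drug_targets, disease_proteins):
    """Mean distance from each drug target to its nearest disease protein.

    Targets that cannot reach any disease protein are skipped; if none can,
    the distance is reported as infinity.
    """
    per_target = []
    for target in drug_targets:
        dist = shortest_path_lengths(adjacency, target)
        reachable = [dist[p] for p in disease_proteins if p in dist]
        if reachable:
            per_target.append(min(reachable))
    if not per_target:
        return float("inf")
    return sum(per_target) / len(per_target)


# Toy undirected interaction network (hypothetical): A-B-C-D with E attached to B.
network = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}
proximity = closest_distance(network, drug_targets=["A", "E"], disease_proteins=["C", "D"])
```

In this toy network both targets are two hops from the nearest disease protein `C`, so `proximity` is 2.0. A value like this could serve as one feature among many for a downstream classifier.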

Virtual: Revealing Network-Based Similarity of Cancer Drugs
COSI: NetBio
  • Seyma Unsal Beyge, METU, Turkey
  • Nurcan Tuncbag, Koc University, Turkey


Drugs not only perturb their immediate protein targets but also modulate multiple signaling pathways. In this study, we explored networks modulated by several drugs across multiple cancer cell lines by integrating their targets with transcriptomic and phosphoproteomic data. As a result, we obtained 236 reconstructed networks covering five cell lines and 70 drugs. A rigorous topological and pathway analysis showed that chemically and functionally different drugs may modulate overlapping networks. Additionally, we revealed a set of tumor-specific hidden pathways with the help of drug network models that are not detectable from the initial data. The difference in the target selectivity of the drugs leads to disjoint networks despite sharing a similar mechanism of action, e.g., HDAC inhibitors. We also used the reconstructed network models to study potential drug combinations based on the topological separation and found literature evidence for a set of drug pairs. Overall, network-level exploration of drug-modulated pathways and their deep comparison may potentially help optimize treatment strategies and suggest new drug combinations.

Virtual: Tissue-specific Pathway Activities: A Retrospective Analysis in COVID-19 Patients
COSI: NetBio
  • Nhung Pham, Maastricht University, Netherlands
  • Finterly Hu, Maastricht University, Netherlands
  • Chris Evelo, Maastricht University, Netherlands
  • Martina Kutmon, Maastricht University, Netherlands


The SARS-CoV-2 virus binds to ACE2 receptors to enter the cell where it alters and hijacks many processes in the human body. These ACE2 receptors are widely distributed in not only the lung, the primary infected organ in COVID-19, but also in other tissues such as the kidney, liver, and heart.
We developed an automatic workflow to analyze proteomics data to study pathway activities in different tissues using pathway collections from WikiPathways (www.wikipathways.org), an established community-curated pathway database, and the COVID-19 Disease Map, a community knowledge repository of molecular mechanisms of COVID-19.
Pathway activities are greatly diverse among tissues. In total, 69 out of 640 pathways were found to have altered activities in at least one tissue. Among tissues, the thyroid has the greatest number of pathways that change activities in COVID-19 patients.
With this automated and reproducible proteomics data analysis workflow, we enable quick (re-)analysis of proteomics datasets in the context of pathway activities to further expand our understanding of the molecular mechanisms involved in a COVID-19 infection.

L-001: Single-cell network biology characterizes cell type gene regulation for drug repurposing and phenotype prediction in Alzheimer’s disease
COSI: NetBio
  • Chirag Gupta, University of Wisconsin-Madison, United States
  • Jielin Xu, Cleveland Clinic, Cleveland, Ohio, United States
  • Ting Jin, University of Wisconsin-Madison, United States
  • Saniya Khullar, University of Wisconsin-Madison, United States
  • Xiaoyu Liu, University of Wisconsin-Madison, United States
  • Sayali Alatkar, University of Wisconsin-Madison, United States
  • Feixiong Cheng, Cleveland Clinic, Cleveland, Ohio, United States
  • Daifeng Wang, University of Wisconsin-Madison, United States


Dysregulation of gene expression in Alzheimer’s disease (AD) remains elusive, especially at the cell type level. Gene regulatory networks, key molecular mechanisms linking transcription factors and regulatory elements to govern target gene expression, can change across different cell types in the human brain and thus serve as a model for studying gene dysregulation in AD. However, AD-induced changes in brain cell type gene networks remain uncharted. Here, we integrated single-cell multi-omics data to predict gene regulatory networks for four major cell types in AD and healthy human brains. We consistently observed changes in network structures, including hubs, gene connectivity, and network motif concentrations across cell types. We demonstrate that known AD-risk genes functionally converge into cell-type-specific modules of coregulated and functionally related genes, which can be utilized for systematic drug repurposing. Furthermore, leveraging the network connectivity patterns of known AD-risk genes, we developed a machine learning model to classify and prioritize novel AD genes. We validated our model using independent datasets and found that top prioritized genes predict clinical phenotypes of AD with reasonable accuracy. Together, our analysis outlines regulatory changes that occur in AD and suggests network-based strategies for drug repurposing and gene ranking at single-cell resolution.

L-002: Multiscale phase separation by explosive percolation with the single chromatin loop resolution
COSI: NetBio
  • Kaustav Sengupta, Centre of New Technologies, University of Warsaw, Warsaw, Poland, Poland
  • Michał Denkiewicz, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland, Poland
  • Mateusz Chiliński, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland, Poland
  • Teresa Szczepińska, Centre for Advanced Materials and Technologies, Warsaw Technical University, Warsaw, Poland, Poland
  • Ayatullah Faruk Mollah, Department of Computer Science and Engineering, Aliah University, Kolkata, West Bengal, India, India
  • Raissa D'Souza, Department of Computer Science, University of California, Davis, USA, United States
  • Yijun Ruan, The Jackson Laboratory for Genomic Medicine, USA, United States
  • Dariusz Plewczynski, Centre of New Technologies, University of Warsaw, Warsaw, Poland, Poland


We propose models of dynamical human genome folding into hierarchical components in human lymphoblastoid and stem cell lines. Our models are based on explosive percolation theory. The chromosomes are modeled as graphs where CTCF chromatin loops are represented as edges. The folding trajectory is simulated by gradually introducing loops to the graph following various edge addition strategies that are based on topological network properties, chromatin loop frequencies, compartmentalization, or chromatin epigenomic features. Finally, we propose the genome folding model as a biophysical pseudo-time process of chromatin loop formation guided by a single scalar order parameter, whose value is calculated by Linear Discriminant Analysis from chromatin features to classify the compartments efficiently. The chromatin phase separation, where the fiber condenses in three-dimensional space into topological domains and compartments, is observed when the critical number of contacts is reached. Overall, our in silico model integrates high-throughput 3D genome interaction experimental data with the novel theoretical concept of phase separation, which allows us to model event-based time dynamics of chromatin loop formation and folding trajectories.
Availability: https://github.com/SFGLab/percolation
Contact: d.plewczynski@cent.uw.edu.pl
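
For readers unfamiliar with explosive percolation, the sketch below shows the textbook product-rule (Achlioptas) process on a generic graph: at each step two candidate edges are drawn and the one joining the smaller pair of components is added, which delays and then sharpens the emergence of a giant component. This is a generic illustration under stated assumptions, not the edge addition strategies of the poster or the code in the linked repository; all names here are hypothetical.

```python
import random


class DisjointSet:
    """Union-find structure tracking the size of each connected component."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def product_rule_percolation(n_nodes, candidate_edges, seed=0):
    """Add edges by the product rule; return the largest component size per step.

    Candidate edges are shuffled and consumed in pairs. From each pair, the
    edge whose endpoint components have the smaller size product is kept and
    the other is discarded.
    """
    rng = random.Random(seed)
    edges = list(candidate_edges)
    rng.shuffle(edges)
    components = DisjointSet(n_nodes)
    largest = 1 if n_nodes else 0
    trajectory = []

    def score(edge):
        u, v = edge
        return components.size[components.find(u)] * components.size[components.find(v)]

    for i in range(0, len(edges) - 1, 2):
        first, second = edges[i], edges[i + 1]
        u, v = first if score(first) <= score(second) else second
        components.union(u, v)
        largest = max(largest, components.size[components.find(u)])
        trajectory.append(largest)
    return trajectory


# Toy example: all possible edges among 6 nodes (standing in for loop anchors).
all_edges = [(i, j) for i in range(6) for j in range(i + 1, 6)]
trajectory = product_rule_percolation(6, all_edges, seed=1)
```

Plotting `trajectory` against the step index for a larger graph shows the characteristic abrupt jump in the largest component that gives explosive percolation its name.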

L-003: FAVA: High-quality functional association networks inferred from massive scRNA-seq and proteomics data
COSI: NetBio
  • Mikaela Koutrouli, Novo Nordisk Foundation Center of Protein Research, Denmark
  • Pau Piera Líndez, Novo Nordisk Foundation Center of Protein Research, Denmark
  • Robbin Bouwmeester, VIB-UGent Center for Medical Biotechnology | Department of Biomolecular Medicine, Ghent University, Ghent, Belgium, Belgium
  • Lennart Martens, VIB-UGent Center for Medical Biotechnology | Department of Biomolecular Medicine, Ghent University, Ghent, Belgium, Belgium
  • Lars Juhl Jensen, Novo Nordisk Foundation Center of Protein Research, Denmark


Protein networks are commonly used for understanding the interplay between proteins in the cell as well as for visualizing omics data. Unfortunately, existing networks such as STRING are heavily biased by data availability in the sense that well-studied proteins have many more interactions than understudied proteins. To create networks also for the latter, we need to use high-throughput data, such as single cell RNA-seq (scRNA-seq) and proteomics, which do not have this literature bias. However, due to the sparseness (i.e. many proteins not observed in each cell/sample) and redundancy (many similar cells/samples) of such data, simple correlation analysis does not result in high-quality networks. We present FAVA, Functional Associations using Variational Autoencoders, which deals with these issues by compressing the high-dimensional data into a meaningful, dense, low-dimensional latent space. We demonstrate that calculating correlations in this latent space results in much improved networks compared to the original representation for massive scRNA-seq and proteomics data from Human Protein Atlas and PRIDE, respectively. We show that these networks, which given the nature of the input data should be free of literature bias, indeed have much better coverage of understudied proteins than existing networks.

L-004: Testing network prediction of drug combination effects using electronic health record analysis
COSI: NetBio
  • Jennifer L. Wilson, University of California Los Angeles, United States
  • Ethan Steinberg, Stanford University, United States
  • Rebecca Racz, Food and Drug Administration, United States
  • Russ Altman, Stanford University, United States
  • Nigam Shah, Stanford University, United States
  • Kevin Grimes, Stanford University, United States


Protein interaction network methods are attractive for identifying new drug effects, yet they are limited by their tendency to overpredict drug outcomes. We recently discovered that a context-specific analysis could increase network prediction performance and precision; prediction accuracy improved when connecting drugs to genes downstream of drugs with known effects. We hypothesized that downstream genes were part of drug-induced pathways and that combination drugs that blocked these pathways would alter drug-induced outcomes. We classified drugs based on shared downstream proteins and used electronic health record analysis to measure where combination drugs had class-level effects on drug-induced outcomes. By classifying drugs using their downstream proteins, we had an 80.7% sensitivity for predicting rare drug combination effects documented in gold-standard datasets. We further measured the effect of predicted drug combinations on adverse outcome phenotypes using novel observational studies in the electronic health record. We tested predictions for 60 network-drug classes on 7 adverse outcomes and measured changes in clinical outcomes for predicted combinations. Our results suggest that downstream network proteins may be part of drug-induced effects and further, that drug combinations may be rationally selected based on network relationships between their targets.

L-005: The Role of Cell Specific Regulation in Mendelian Disease
COSI: NetBio
  • Jordan H. Whitlock, The University of Alabama at Birmingham, United States
  • T.C. Howton, The University of Alabama at Birmingham, United States
  • Vishal H. Oza, The University of Alabama at Birmingham, United States
  • Brittany N. Lasseigne, The University of Alabama at Birmingham, United States


Individually rare diseases may affect small subpopulations, yet collectively over 25 million Americans live with a rare disease. Today there are 10,000 Mendelian diseases (MDs), caused by variants in the germline that are incorporated into all cells of the body. However, the corresponding phenotypic manifestations typically occur in a limited number of tissues. Gaps remain in connecting genomic variation to the corresponding phenotypic outcomes of MDs despite recent advances in sequencing technology and interpretation of variants. As a result, the molecular basis is unknown for a third of MDs, and the mechanism(s) linking genetic variants to disease manifestation is poorly understood for ~20% of MDs even when the causal gene is identified. This leaves half of existing MDs with an undiscovered mechanism of disease. Evidence has shown that cell-specific differences in regulation drive tissue-specific disease manifestations (e.g., in cystic fibrosis and type II diabetes). Using publicly available single-cell and single-nuclei RNA-sequencing expression datasets for human cerebral cortex and kidney, we construct cell-specific regulatory networks for renal- and neurological-associated MD genes in order to uncover potential regulatory roles driving disease manifestation. We find MD subnetworks with cell-specific differences in regulation intrinsic to each cell type.

L-006: SPRAS: A workflow for streamlining network-based pathway reconstruction
COSI: NetBio
  • Adam Shedivy, University of Wisconsin-Madison, United States
  • Pramesh Singh, Reed College, United States
  • Christopher Magnano, Harvard Medical School, United States
  • Tobias Rubel, University of Maryland, United States
  • Anna Ritz, Reed College, United States
  • Anthony Gitter, University of Wisconsin-Madison, United States


Computational methods that use protein interaction data to characterize a cellular response are increasingly common tools in molecular systems biology. For example, algorithms have been developed to reconstruct signaling pathways by connecting genes and proteins of interest in the context of a protein-protein interaction network. Although dozens of graph algorithms have been applied to pathway reconstruction tasks, practical challenges have suppressed their broader adoption. Each individual method has its own input and output file formats, installation process, and user-specified parameters. We present the Signaling Pathway Reconstruction Analysis Streamliner (SPRAS), a unified conceptual framework and workflow for pathway reconstruction. SPRAS defines a single common Snakemake workflow for pathway reconstruction and wraps popular pathway reconstruction algorithms within the workflow. A universal output format for predicted pathways supports supplementing pathway reconstruction with downstream analyses such as network visualization, gene set enrichment, and comparison with pathway databases. The modular design allows third-party contributors to add new algorithms to the workflow. Through SPRAS, researchers can easily explore different types of pathway reconstruction algorithms and select those that best address their biological questions. The ability to compare multiple algorithms directly can reveal gaps in state-of-the-art methods and facilitate objective benchmarking of new algorithms.

L-007: Illuminating Dark Proteins using Reactome Pathways
COSI: NetBio
  • Guanming Wu, Oregon Health & Science University, United States
  • Lisa Matthews, NYU Grossman School of Medicine, United States
  • Robin Haw, Ontario Institute for Cancer Research, Canada
  • Timothy Brunson, Oregon Health & Science University, United States
  • Nasim Sanati, Oregon Health & Science University, United States
  • Solomon Shorser, Ontario Institute for Cancer Research, Canada
  • Patrick Conley, Oregon Health & Science University, United States
  • Lincoln Stein, Ontario Institute for Cancer Research, Canada
  • Peter D'Eustachio, NYU Grossman School of Medicine, United States


Due to biased biological studies, we have very limited knowledge about one third of protein coding genes. Elucidating functions of these “dark” proteins may offer therapeutic opportunities for many diseases. Reactome is the most comprehensive, open access pathway knowledgebase. Placing dark proteins in the context of Reactome pathways provides a framework of reference for these proteins, facilitating hypothesis generation for experimental biologists to develop targeted experiments, unravel potential functions of these proteins, and then design drugs to manipulate them. To this end, we have trained a random forest with 106 protein/gene pairwise features collected from multiple resources to predict functional interactions between dark proteins and proteins annotated in Reactome and then developed three scores to measure the interactions between dark proteins and Reactome pathways based on enrichment analysis and fuzzy logic simulations. Literature evidence via manual checking and systematic NLP-based analysis support predicted interacting pathways for dark proteins. To visualize dark proteins in the context of Reactome pathways, we have also developed a new web site, idg.reactome.org, by extending the Reactome web application with new features illustrating these proteins together with tissue specific protein and gene expression levels and drug interactions.

L-008: Joint embedding of biological networks for cross-species functional alignment
COSI: NetBio
  • Lechuan Li, Rice University, United States
  • Ruth Dannenfelser, Rice University, United States
  • Yu Zhu, Rice University, United States
  • Nathaniel Hejduk, Rice University, United States
  • Santiago Segarra, Rice University, United States
  • Vicky Yao, Rice University, United States


Presentation Overview:

Model organisms are widely used to better understand the molecular causes of human disease. While sequence similarity greatly aids this transfer, sequence similarity does not imply functional similarity, and thus, several current approaches incorporate protein-protein interactions (PPIs) to help map findings between species. Existing transfer methods either formulate the alignment problem as a matching problem, which pits network features against known orthology, or more recently, as a joint embedding problem. Here, we propose a novel state-of-the-art joint embedding solution: Embeddings to Network Alignment (ETNA). More specifically, ETNA generates individual network embeddings based on network topological structures and then uses a Natural Language Processing-inspired cross-training approach to align the two embeddings using sequence orthologs. The final embedding preserves both within and between species gene functional relationships, and we demonstrate that it captures both pairwise and group functional relevance. In addition, ETNA's embeddings can be used to transfer genetic interactions across species and identify phenotypic alignments, laying the groundwork for potential opportunities for drug repurposing and translational studies.

L-009: Predicting gene regulatory networks from multi-omics to link genetic risk variants and neuroimmunology to Alzheimer’s disease phenotypes
COSI: NetBio
  • Saniya Khullar, University of Wisconsin - Madison, United States
  • Daifeng Wang, University of Wisconsin - Madison, United States


Presentation Overview:

Background: How Alzheimer’s disease (AD) risk SNPs affect AD phenotypes and neuroimmunology remains elusive. Also, our understanding of cellular and molecular mechanisms from SNPs to phenotypes is limited. Thus, we performed integrative multi-omics analysis of genotype, transcriptomics, and epigenomics for revealing gene regulatory mechanisms from SNPs to AD phenotypes.

Method: Given cohort gene expression data, we construct and cluster its co-expression network to identify modules for AD phenotypes. Next, we predict transcription factors (TFs) regulating genes and SNPs interrupting TF binding sites on regulatory elements. Finally, we construct a gene regulatory network (GRN) linking SNPs, interrupted TFs, and regulatory elements to target genes and modules for AD phenotypes. This network provides systematic insights into gene regulatory mechanisms from SNPs to phenotypes. We used machine learning to prioritize GRNs for AD-Covid pathway genes for predicting Covid-19 severity.

Results: Our comparative analyses revealed cross-region-conserved and region-specific GRNs in 3 major AD brain regions. Further, we used Covid-19 as a proxy for immune dysregulation to identify possible regulatory mechanisms for AD neuroimmunology. Decision Curve Analysis suggests our AD-Covid genes can be potential novel biomarkers for neuroimmunology. Our open-source results provide deeper mechanistic understanding of the interplay among multi-omics, brain regions, neuroimmunology, and phenotypes.

L-010: Interpretability of network measures in host-virus protein interactions
COSI: NetBio
  • Alyssa Adams, Morgridge Institute for Research, United States
  • Anthony Gitter, Morgridge Institute for Research, United States
  • Karthik Anantharaman, University of Wisconsin-Madison, United States


Presentation Overview:

Network analysis is a powerful systems biology approach to understanding pathogen infection, but it lacks techniques to control host responses. Analyzing the structure of pathogen-host protein-protein interaction (PPI) networks may lead to identifying nodes and interactions that are susceptible to disruption, attack, or even reprogramming. Because host-virus PPI networks are readily available, we are interested in understanding how a wide variety of network measures (including centrality, clustering, and complexity) could be used to prioritize proteins to control host responses to infection. Given the number of networks that have direct biological interpretability in the lab, we assess how these measures inform the networks’ measurable implications for systems that can be characterized in various ways experimentally. By running nine network measures on PPI networks derived from the viruses.STRING database, we will be able to assess whether topological properties of host proteins are associated with their molecular functions or role in viral phenotypes such as replication.

L-011: Topological patterns in promoter capture Hi-C chromatin interaction networks
COSI: NetBio
  • Andrejs Sizovs, Institute of Mathematics and Computer Science, University of Latvia, Latvia
  • Gatis Melkus, Institute of Mathematics and Computer Science, University of Latvia, Latvia
  • Lelde Lace, Institute of Mathematics and Computer Science, University of Latvia, Latvia
  • Pēteris Ručevskis, Institute of Mathematics and Computer Science, University of Latvia, Latvia
  • Sandra Siliņa, Institute of Mathematics and Computer Science, University of Latvia, Latvia
  • Edgars Celms, Institute of Mathematics and Computer Science, University of Latvia, Latvia
  • Juris Viksna, Institute of Mathematics and Computer Science, University of Latvia, Latvia


Presentation Overview:

The spatial organization of chromatin in the nuclei of eukaryotic cells is a significant factor in gene regulation both through cis-regulatory elements such as enhancers and repressors as well as other modes of organization. A key method in studying this spatial organization is the family of methods based on chromatin conformation capture (3C), including the widely used Hi-C assay which provides a complete list of chromatin contacts between all loci in a sample. These methods can be informative about chromatin architecture, its variability between biological samples and its functional implications.

In our work we construct graphs – chromatin interaction networks – out of publicly available promoter capture Hi-C datasets and employ topology-based analyses to derive systematic information about the structure and patterns in these interactions on a broader level. We investigate whether topological graph metrics based on connected components, cliques and other graph features vary significantly between different cell types, and whether these topological metrics match up to tissue relatedness as calculated from RNA-seq data. In addition, we consider whether regulatory features such as enhancers or Polycomb-mediated repressive complexes occur in particular, mutually distinct topological configurations that may be possible to algorithmically differentiate even in unannotated data.

L-012: Quadratic GCN for graph classification
COSI: NetBio
  • Yoram Louzoun, Bar Ilan University, Israel
  • Shoval Friedman, Bar Ilan University, Israel


Presentation Overview:

Graph Convolutional Networks (GCNs) have been extensively used to classify vertices in graphs and have been shown to outperform other vertex classification methods. GCNs have been extended to graph classification tasks (GCT). In GCT, graphs with different numbers of edges and vertices belong to different classes, and one attempts to predict the graph class. GCN-based GCT methods have mostly used pooling and attention-based models. The accuracy of existing GCT methods is still limited. Here we propose a novel solution combining GCN, methods from knowledge graphs, and a new self-regularized activation function to significantly improve the accuracy of GCN-based GCT. We present quadratic GCN (QGCN), a GCN formalism with a quadratic layer. Such a layer produces an output with fixed dimensions, independent of the number of vertices in the graph. We applied this method to a wide range of graph classification problems and show that, when using a self-regularized activation function, QGCN outperforms the state-of-the-art methods for all graph classification tasks tested, with or without external input on each graph. The code for QGCN is available at: https://github.com/Unknown-Data/QGCN .

L-013: Towards heterogenous biological networks with the stringApp in Cytoscape
COSI: NetBio
  • Nadezhda T. Doncheva, NNF Center for Protein Research, University of Copenhagen, Denmark
  • John Scooter Morris, RBVI, University of California, San Francisco, United States
  • Henrietta Holze, NNF Center for Protein Research, University of Copenhagen, Denmark
  • Rebecca Kirsch, NNF Center for Protein Research, University of Copenhagen, Denmark
  • Katerina Nastou, NNF Center for Protein Research, University of Copenhagen, Denmark
  • Lars Juhl Jensen, NNF Center for Protein Research, University of Copenhagen, Denmark


Presentation Overview:

Biological networks are often used to represent complex biological systems and as such they are intrinsically heterogenous. Although such networks are supported by the state-of-the-art software tool Cytoscape and its many apps, the main focus of the stringApp has been on using intra-species protein-protein interactions from the STRING database to complement the interpretation of results from high-throughput experiments, most commonly proteomics.
Here, we highlight both new and existing functionality of the stringApp that goes beyond its most common use cases. In particular, it is possible to create heterogenous networks that consist of proteins known to the stringApp and nodes representing other biological entities from external sources. This could be useful for integrating STRING networks from transcriptomics data with predicted non-coding RNA interactions or with known gene-disease associations from other databases. Currently, we are also extending the stringApp with protein-protein interactions between eukaryotic parasites and their hosts, thereby complementing the already existing host-virus interactions and making it possible to visualize inter-species STRING networks in Cytoscape. Finally, the latest stringApp version supports the retrieval of not only functional associations, but also physical interactions, whose confidence scores indicate the likelihood of the proteins to be in the same protein complex.

L-014: Regression Method for Gene-Gene Association Network Using A Pseudo-Value Approach
COSI: NetBio
  • Seungjun Ahn, Department of Biostatistics, University of Florida, United States
  • Tyler Grimes, Department of Mathematics and Statistics, University of North Florida, United States
  • Somnath Datta, Department of Biostatistics, University of Florida, United States


Presentation Overview:

Rapid advances in RNA-sequencing (RNA-seq) data from high-throughput sequencing technologies have brought significant benefits to gene expression studies. Differential network (DN) analysis detects changes in measures of association among genes in a network. However, there is a lack of direct regression modeling approaches for DN analysis, which regress gene expression levels between the two comparison groups with the inclusion of additional clinical variables. In this paper, we present a robust regression modeling method that regresses the jackknife pseudo-values using the measure of connectivity of genes in a network to estimate a set of covariates of interest. Specifically, the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) is applied to estimate the measure of connectivity, or mutual information. The results of our simulation show that, compared with existing alternatives, the proposed method reaches a consistently high degree of precision and recall overall in identifying differentially connected (DC) genes. Our proposed method is applied to the chronic obstructive pulmonary disease (COPD) gene study data to identify DC genes.
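
As a rough illustration of the jackknife pseudo-value idea mentioned above, the following sketch computes one pseudo-value per sample for an arbitrary statistic, using the standard formula n·θ(all) − (n−1)·θ(without sample i). It is not the authors' implementation: the absolute Pearson correlation stands in for the ARACNE mutual-information connectivity, the function names and toy expression values are hypothetical, and the subsequent regression of pseudo-values on covariates is not shown.

```python
from statistics import mean


def jackknife_pseudo_values(samples, statistic):
    """Return one pseudo-value per sample: n*theta_all - (n-1)*theta_without_i."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    theta_all = statistic(samples)
    pseudo = []
    for i in range(n):
        leave_one_out = samples[:i] + samples[i + 1:]
        pseudo.append(n * theta_all - (n - 1) * statistic(leave_one_out))
    return pseudo


def abs_correlation(samples):
    """Stand-in connectivity between two genes: |Pearson r| over (x, y) pairs.

    Assumption: the poster uses ARACNE mutual information here instead.
    """
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in samples)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


# Hypothetical expression of two genes across five subjects.
expression = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 4.0)]
pseudo = jackknife_pseudo_values(expression, abs_correlation)
```

Each entry of `pseudo` is a per-subject contribution to the connectivity estimate, which is what makes a subject-level regression on clinical covariates possible in principle.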

L-015: Gaussian graphical models under experimental treatments
COSI: NetBio
  • Yunyi Shen, University of Wisconsin Madison, United States
  • Claudia Solis-Lemus, University of Wisconsin Madison, United States


Presentation Overview:

Gaussian graphical models are widely used as network models because of their interpretability, mathematical simplicity, and numerical tractability. However, such models can only be used in static conditions without experimental perturbation to the nodes. To also model the experiments applied to the network, one can use regression models, which can take at least two forms in multivariate cases, namely multivariate regression and chain graph. We discuss the interpretation of both models’ parameters and conclude that if one needs to predict network response under an experimental setting, a multivariate regression can be desirable. In contrast, if one needs to understand the network structure with experiments involved, chain graph models should be preferred. In this talk, we will propose several methods to fit a sparse chain graph model and study experimental design focused on inferring the underlying network among responses. We conclude that prior knowledge about the treatment’s direct effect is crucial for an experiment to outperform a null experiment. This result offers insight into why specificity of perturbation is so useful in biological experiments: with a specific perturbation, the necessary prior knowledge is easiest to obtain.

L-016: Jointly modeling networks from multiple species to improve network-based gene classification
COSI: NetBio
  • Christopher Mancuso, Michigan State University, United States
  • Kayla Johnson, Michigan State University, United States
  • Sneha Sundar, Michigan State University, United States
  • Arjun Krishnan, Michigan State University, United States


Presentation Overview:

Network-based machine learning is a powerful approach for leveraging the cellular context of genes to computationally predict novel/under-characterized genes that are functionally similar to a set of known genes of interest. One powerful network-based gene classification method that is gaining popularity is to use supervised learning algorithms where the features for each gene are determined by that gene’s connections in a molecular network. In this work, we explore how networks from multiple species can be jointly leveraged to improve this gene classification method. We first build multi-species networks by connecting nodes (genes/proteins) in different species if they belong to the same orthologous group. Then, we create feature representations by directly considering a gene's connection to all other genes in the entire multi-species network or considering a low-dimensional embedding for the entire network. We find that adding information across species improves performance for the tasks of predicting human and model species gene annotations across a set of non-redundant gene ontology biological processes. In addition to providing better predictions, this approach allows genes across species to be represented in the same “space” where they can be naturally incorporated into any joint model.
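
The network construction described above can be sketched as follows, under stated assumptions: genes are keyed as hypothetical `(species, gene)` tuples, orthologous groups are given as lists of such keys, and the feature vector of a gene is simply its row of the multi-species adjacency matrix. The gene names and edges are toy examples, and the low-dimensional embedding variant is not shown.

```python
from collections import defaultdict


def build_multispecies_network(species_edges, ortholog_groups):
    """Union per-species edges, then link genes sharing an orthologous group."""
    adjacency = defaultdict(set)
    for species, edges in species_edges.items():
        for a, b in edges:
            u, v = (species, a), (species, b)
            adjacency[u].add(v)
            adjacency[v].add(u)
    for members in ortholog_groups:
        for i, u in enumerate(members):
            for v in members[i + 1:]:
                if u[0] != v[0]:  # add cross-species links only
                    adjacency[u].add(v)
                    adjacency[v].add(u)
    return adjacency


def adjacency_features(adjacency):
    """Feature vector of a gene = its binary adjacency row over all nodes."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    features = {}
    for node in nodes:
        row = [0] * len(nodes)
        for neighbor in adjacency[node]:
            row[index[neighbor]] = 1
        features[node] = row
    return nodes, features


# Toy input (hypothetical edges and orthology assignments).
species_edges = {
    "human": [("TP53", "MDM2")],
    "mouse": [("Trp53", "Mdm2"), ("Mdm2", "Cdkn1a")],
}
ortholog_groups = [
    [("human", "TP53"), ("mouse", "Trp53")],
    [("human", "MDM2"), ("mouse", "Mdm2")],
]
network = build_multispecies_network(species_edges, ortholog_groups)
nodes, features = adjacency_features(network)
```

Because every gene, regardless of species, is indexed against the same node list, the rows in `features` can be passed to any supervised classifier as a shared feature space.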

L-017: Module-based prediction improves network-based prioritization of genes associated with complex traits and diseases
COSI: NetBio
  • Alexander McKim, Michigan State University, United States
  • Arjun Krishnan, Michigan State University, United States


Presentation Overview:

Genome-scale networks provide a rich source of functional features to use in supervised learning algorithms to predict novel genes associated with complex traits and diseases. However, we have previously shown that the performance of network-based gene classification for predicting trait/disease genes is systematically inferior to that for predicting pathway/process genes. To improve this scenario, we have developed a new approach for network-based disease gene prediction that leverages a well-established property of disease genes: that, in the context of a large-scale gene network, they converge onto a small number of disease modules, where each module corresponds to disease-associated molecular mechanisms. Our approach begins by partitioning the several tens or hundreds of genes associated with each disease into gene modules using network clustering, then uses genes from each module as the seed set for network-based expansion using a machine learning classifier, and finally combines the predictions from all the classifiers to obtain a final list of novel disease gene candidates. By evaluating the performance on tens of complex polygenic traits and diseases, we show that this approach improves gene prediction for several diseases even after controlling for confounding factors such as gene set size and node degree.

L-018: Pathway analysis of genome-wide linkage analysis regions on essential hypertension in African-derived Brazilian Quilombo populations
COSI: NetBio
  • Vinicius Magalhães Borges, Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, WV, 25755 USA, United States
  • Andréa Roseli Vançan Russo Horimoto, Department of Biostatistics, University of Washington, Seattle, WA, 98105 USA, United States
  • Ellen M. Wijsman, Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, 98105 USA, United States
  • Lilian Kimura, Department of Genetics and Evolutionary Biology, University of Sao Paulo, Sao Paulo, SP, 05508-090 Brazil, Brazil
  • Diogo Meyer, Department of Genetics and Evolutionary Biology, University of Sao Paulo, Sao Paulo, SP, 05508-090 Brazil, Brazil
  • Regina Célia Mingroni-Netto, Department of Genetics and Evolutionary Biology, University of Sao Paulo, Sao Paulo, SP, 05508-090 Brazil, Brazil
  • Alejandro Q. Nato Jr., Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, WV, 25755 USA, United States


Presentation Overview:

Essential Hypertension (EH) is a complex disease where blood pressure is constantly elevated. It is one of the major public health problems worldwide, causing about 9.4 million deaths per year, and mostly affects populations of African descent. Our goal is to perform pathway analysis of mapped regions of interest (ROI) from linkage analysis (adjusted for presence of admixture), to explain EH in African-derived admixed quilombo populations from the Vale do Ribeira region (Sao Paulo, Brazil). We evaluated one large pedigree with 68 genotyped individuals (38 affected, 29 non-affected and 1 unknown). The pedigree was built (GenoPro) based on kinship coefficients (KING, MORGAN and PBAP). We performed genotyping of 650,000 SNPs and the dataset was handled and pruned (KING and PBAP) to achieve unique subpanels of markers (PBAP). Haplotype phasing, local ancestry (SNPFlip, SHAPEIT2 and RFMIX) and SNP allele frequency (ADMIXFRQ) were estimated to accommodate admixture. We performed genome-wide linkage analyses using three dense subpanels of markers (MORGAN). Three ROIs co-segregated with EH. All genes within ROIs were ranked and evaluated by crossing the findings with public databases. We classified molecular and biological processes related to these genes (PANTHER) and constructed an interaction network (Cytoscape, ClueGO, and CluePedia).

L-019: Visualization, Benchmarking and Characterization of Temporal and Nested Biological Systems as Dynamic Forest Mixtures
COSI: NetBio
  • Benedict Anchang, National Institute of Environmental Health Sciences-National Institutes of Health, United States
  • Raul Mendez-Giraldez, National Institute of Environmental Health Sciences, United States
  • Xiaojiang Xu, National Institute of Environmental Health Sciences, United States
  • Archer Trevor, National Institute of Environmental Health Sciences, United States
  • Qing Chen, National Institute of Environmental Health Sciences, United States
  • Guang Hu, National Institute of Environmental Health Sciences, United States
  • Sylvia Plevritis, Stanford School of Medicine, United States
  • Alison Motsinger-Reif, National Institute of Environmental Health Sciences, United States
  • Jian-Liang Li, National Institute of Environmental Health Sciences, United States


Presentation Overview:

Complete understanding of normal and disease progression depends on whether development can be effectively characterized into continuous, discrete or mixtures of both states. This uncertainty confounds our ability to effectively map lineages and state transitions during development. Time dependent single-cell analysis is key to tackle such a challenge with pseudo-trajectory models being the standard for modeling these kinds of data. However, these models are not optimized for visualizing both disjointed and continuous transitions between states. In addition, they are difficult to benchmark due to lack of gold standards. We recently published a data-driven model called Dynamic Spanning Forest Mixtures (DSFMix) which uses decision trees to select significant time-dependent markers associated with marginal distributions of multimodality and skewness to build a forest for visualizing and characterizing complex developmental processes that unfold over time. The selected markers are then used to connect all cells with a minimum spanning tree (MST), and the tree is then broken up into a Dynamic spanning forest. The trees of the forest are derived by combining tree-based clustering (TAHC) with a dynamic branch-cutting method based on the shape of the underlying dendrogram. DSFMix input consists of single-cell data collected at different time points, representing distinct stages of development, and its output is a mixture of nested, discrete and/or continuous (directed) cell lineages. We show how DSFMix differs from other pseudotime and temporal models and bring out some of its diverse applications including visualization, comparison, and characterization of complex relationships during biological processes such as epithelial-mesenchymal transition, spermatogenesis, stem cell pluripotency, early transcriptional response from hormones and immune response to coronavirus disease. 
From a theoretical perspective, DSFMix optimal input genes exhibit a very high proportion of non-uniform, marginally distributed shapes that are mostly skewed to the right and multimodal, with the latter genes strongly associated with major steady states during development. These findings challenge current downstream statistical methods that are optimized for averages and bimodality. In summary, DSFMix is a powerful tool to integrate discrete cell lineages and continuous molecular data and is optimal for the characterization and visualization of real-time dependent cell transitions and interactions.
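
The step of connecting all cells with a minimum spanning tree and then breaking it into a forest can be sketched as below. This is only an illustration under simplifying assumptions, not the DSFMix implementation: cells are hypothetical 2-D points in marker space, the MST is built with Prim's algorithm on Euclidean distances, and the tree is cut by dropping its longest edges, which stands in for the TAHC clustering and dendrogram-based dynamic branch cutting named in the abstract. Marker selection by decision trees is not shown.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [(math.inf, -1)] * n  # (distance to current tree, tree endpoint)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        # Pick the closest cell that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges


def cut_into_forest(n, mst_edges, n_trees):
    """Drop the (n_trees - 1) longest MST edges and label the resulting components."""
    n_keep = max(0, len(mst_edges) - (n_trees - 1))
    kept = sorted(mst_edges)[:n_keep]
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


# Hypothetical cells: two well-separated groups in a 2-D marker space.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
mst = minimum_spanning_tree(cells)
labels = cut_into_forest(len(cells), mst, n_trees=2)
```

With `n_trees=2`, the single long edge bridging the two groups is removed, so `labels` assigns one tree to the first three cells and another to the last three.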

L-020: Network-based data integration and visualization provides a global understanding of regulatory mechanisms in Aspergillus fumigatus
COSI: NetBio
  • Spencer Halberg-Spencer, University of Wisconsin-Madison, Wisconsin Institute for Discovery, United States
  • Saptarshi Pyne, University of Wisconsin-Madison, Wisconsin Institute for Discovery, United States
  • Cristobal Carriel, University of Wisconsin-Madison, United States
  • Jean-Michel Ané, University of Wisconsin-Madison, United States
  • Nancy Keller, University of Wisconsin-Madison, United States
  • Sushmita Roy, University of Wisconsin-Madison, Wisconsin Institute for Discovery, United States


Presentation Overview:

Invasive Aspergillosis (IA), a fungal infection of the lungs caused by the pathogen Aspergillus fumigatus, is the most common invasive fungal infection in immunosuppressed individuals. Recent studies have recognized IA as a secondary infection that complicates COVID-19, increasing mortality. Despite the high clinical relevance of A. fumigatus, the molecular mechanisms that underlie IA and co-morbid conditions remain poorly characterized. We present a network-based analysis pipeline that combines gene regulatory network (GRN) inference and network-based interpretation of regulatory modules to characterize the A. fumigatus transcriptional response. Our GRN inference approach incorporates latent transcription factor activity (TFA) estimation to elucidate transcription factors that are post-transcriptionally regulated, for which gene expression may not be informative. We provide an interactive network visualization framework that incorporates statistical and topological tools used to investigate context-specific roles of regulators within the network. Our framework can be used to interpret input gene lists to predict associated biological pathways, prioritize regulators based on kernel diffusion, and identify novel subnetwork components using a Steiner tree approximation. Application of our framework to A. fumigatus predicted known and novel regulators of multiple secondary metabolite regulatory pathways. Our approach and resource are broadly applicable for network-based interpretation of clinically significant fungal species.

L-021: A network-based approach for modeling gene regulatory relationships
COSI: NetBio
  • Angelina Brilliantova, Rochester Institute of Technology (RIT), United States
  • Hannah Miller, Rochester Institute of Technology (RIT), United States
  • Ivona Bezáková, Rochester Institute of Technology (RIT), United States


Presentation Overview:

A gene regulatory network (GRN) represents a set of molecular interactions between genes that define the gene expression levels of mRNA and proteins. Such interactions are usually either activation (increasing a target gene’s product) or repression (decreasing a target gene’s product). GRNs have been successfully modeled as networks to generate new biological hypotheses, predict missing links, and derive modules for streamlined drug design. Prior works emphasized GRN topology; our work models gene interactions by probabilistically assigning the regulation type (edge coloring) between a pair of genes (nodes). We propose several models of gene regulation: in the Source Model, genes tend to activate or repress other genes with some probability (a model parameter); in the Target Model, genes tend to be activated or repressed; and in the Bi-gene Model, both the regulator gene and its target gene influence the interaction type. We developed efficient algorithms to find the parameters of the models via Markov chain sampling and evaluated the models’ explanatory power on public datasets. Our work can be used for predicting unknown gene regulations, creating biological hypotheses, and finding a null GRN model to generate realistic benchmark datasets.
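
One possible reading of the Source Model can be sketched as follows. Everything here is an assumption made for illustration: each regulator is given a single activation probability, edge signs are treated as independent given the regulator, and the parameter is fitted by simple counting rather than by the Markov chain sampling the authors describe. The edge list and function names are hypothetical.

```python
import math
from collections import Counter


def fit_source_model(signed_edges):
    """Per-regulator activation probability = share of its out-edges that activate."""
    activating, total = Counter(), Counter()
    for source, _target, sign in signed_edges:
        total[source] += 1
        if sign == "+":
            activating[source] += 1
    return {gene: activating[gene] / total[gene] for gene in total}


def log_likelihood(signed_edges, p_activate):
    """Log-probability of the observed edge colouring under per-source probabilities."""
    ll = 0.0
    for source, _target, sign in signed_edges:
        p = p_activate[source] if sign == "+" else 1.0 - p_activate[source]
        if p == 0.0:
            return -math.inf
        ll += math.log(p)
    return ll


# Hypothetical signed GRN: (regulator, target, "+" activation / "-" repression).
edges = [("A", "B", "+"), ("A", "C", "+"), ("A", "D", "-"), ("E", "B", "-")]
fitted = fit_source_model(edges)
ll_fitted = log_likelihood(edges, fitted)
ll_uniform = log_likelihood(edges, {"A": 0.5, "E": 0.5})
```

Comparing `ll_fitted` with `ll_uniform` gives a crude sense of how much a regulator-specific tendency explains beyond a coin-flip colouring of the same topology.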

L-022: Dimensionality reduction methods for extracting functional networks from large-scale CRISPR screens
COSI: NetBio
  • Arshia Z. Hassan, University of Minnesota, United States
  • Henry N. Ward, University of Minnesota, United States
  • Mahfuzur Rahman, Lowes Home Center, United States
  • Maximilian Billman, The University of Bonn, Germany
  • Chad L. Myers, University of Minnesota, United States


Presentation Overview:

Analyses of CRISPR-Cas9 screening data facilitate the discovery of functional relationships between genes as well as phenotype-specific dependencies. The Cancer Dependency Map (DepMap) project is the largest dataset of whole-genome CRISPR screens published thus far and aims to identify cancer-specific genetic dependencies across human cell lines. However, benchmarking analyses of the DepMap reveal that signals from mitochondria-associated protein complexes likely mask signals from smaller protein complexes. In this study, we explore three unsupervised dimensionality reduction techniques - autoencoders as well as robust and classical principal component analyses - for normalizing mitochondrial-associated signal from the DepMap and boosting signal within smaller complexes. We additionally propose a novel “onion” normalization technique to combine various “layers” of normalized data into a single network. Benchmarking analyses reveal that, while all methods reduce mitochondrial signal and boost signal within smaller complexes, robust principal component analysis combined with onion normalization performs the best and it outperforms existing methods for normalizing the DepMap. Our work demonstrates the value of removing low-dimensional signal from the DepMap before constructing functional gene networks, and we provide generalizable tools for dimensionality reduction-based normalization approaches that may be applicable in other settings beyond analysis of CRISPR screen data.

L-023: SPLINT: a tool for detecting splicing-associated network rewiring events
COSI: NetBio
  • Ruth Dannenfelser, Rice University, United States
  • Jean-Pierre Roussarie, Boston University, United States
  • Olga Troyanskaya, Princeton University, United States
  • Vicky Yao, Rice University, United States


Presentation Overview:

Protein-protein interactions (PPI) mediate the majority of biological processes, and it has become increasingly clear that in order to understand disease dynamics, we must develop a strong understanding of protein tissue/cell-specificity and dynamic changes (e.g., environmental response). While network-based approaches have found initial success in the application of PPIs towards systems-level biological explorations, they often overlook protein alternative splicing. There are multiple key challenges that have limited the study of differential isoform interactions, including limitations in known PPI and an even bigger lack of known isoform-specific interactions. We address these challenges by developing SPLINT, a computational method that considers (1) interlogs (pairs of proteins that are orthologs and known to interact) and (2) the fact that PPI are predominantly mediated by domain-domain interactions in concert with differential isoform or exon usage data generated from large-scale transcriptomic experiments. Using this approach, we can predict which PPI are disrupted or possibly even increased due to splicing events. We apply our approach to the study of tissue and environment dynamics in cancer and frontotemporal dementia. Furthermore, because of our interlog approach, we make our tool available for the study of 20 eukaryotic species, including major model organisms.

L-024: Discovery of genetic interactions associated with COVID-19 severity
COSI: NetBio
  • Wen Wang, University of Minnesota, United States
  • Chad Myers, University of Minnesota, United States


Presentation Overview:

Previous GWAS studies have linked human genetic variation in the ABO and 3p21.31 loci to COVID-19 severity. However, like many human diseases, the genetic architecture underlying COVID-19 is likely to be complex, with potential interactions between variants determining disease outcomes. In this study, we utilize the BridGE method we previously developed to search for pathway-level genetic interactions associated with COVID-19. We applied the BridGE method to the UK Biobank England cohort and validated the discovered pathway-level interactions using the Scotland and Wales cohorts. We found 135 replicable between/within-pathway interactions at FDR<0.05 associated with increased COVID-19 severity. Six of these interactions could be strongly replicated in the independent cohorts (FDR<0.05), including interactions involving the antigen processing and presentation pathway and the viral myocarditis pathway. We also found that many driver variants of several interactions are located in or near the HLA super-locus (chrom. 6p21), which is intriguing given the important roles of these genes in regulating the immune response. While several of the discovered pathways showed clear relevance to COVID-19, most of these have not been previously implicated by GWAS studies. Our study suggests there are new COVID-19 host genetic mechanisms to be discovered if we consider complex interactions between variants.

L-025: hu.MAP 2.0: integration of over 15,000 proteomic experiments builds a global compendium of human multiprotein assemblies
COSI: NetBio
  • Kevin Drew, University of Illinois at Chicago, United States
  • John B Wallingford, University of Texas at Austin, United States
  • Edward M Marcotte, University of Texas at Austin, United States


Presentation Overview: Show

A general principle of biology is the self-assembly of proteins into functional complexes. Characterizing their composition is therefore required for our understanding of cellular functions. Unfortunately, we lack a comprehensive catalog of the protein complexes in human cells. To address this gap, we developed a machine learning framework to identify protein complexes in over 15,000 mass spectrometry experiments, which resulted in the identification of nearly 7,000 physical assemblies. We show that our resource, hu.MAP 2.0, is more accurate and comprehensive than previous state-of-the-art high-throughput protein complex resources and gives rise to many new hypotheses, including for 274 completely uncharacterized proteins. Further, we identify 259 promiscuous proteins that participate in multiple complexes, pointing to possible moonlighting roles. We have made hu.MAP 2.0 easily searchable in a web interface (http://humap2.proteincomplexes.org/), which will be a valuable resource for researchers across a broad range of interests including systems biology, structural biology, and molecular explanations of disease.

L-026: Models for simulating genetic interactions from real human genotypes
COSI: NetBio
  • Mehrad Hajiaghabozorgi, University of Minnesota, United States
  • Wen Wang, University of Minnesota, United States
  • Chad L. Myers, University of Minnesota, United States


Presentation Overview: Show

Genetic interactions underlie complex diseases and can be important for predicting individual phenotypes. The systematic analysis of genetic interactions from Genome-Wide Association Studies (GWAS) with traditional methods has been difficult due to limited statistical power. Many approaches for computing genetic interactions in GWAS have been proposed, but the absence of known, gold-standard interactions makes these approaches hard to evaluate. Realistic simulation models based on real human genotypes may provide one solution to this problem. We propose a method for simulating SNP-level interactions in standard human GWAS datasets that retains actual genotypes while generating phenotype labels under disease models with user-defined parameters. It has been observed in model organisms that genetic interactions are not uniformly distributed across genetic variants. Mimicking alternative distributions and interaction topologies is beneficial for a variety of objectives in GWAS research. Our method can successfully simulate interactions based on the network designs provided, as well as other input factors such as the minor allele frequency of implicated SNPs, the interaction effect size, and noise. We demonstrate the utility of our framework for evaluating methods that discover interactions from GWAS.
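To make the idea of keeping real genotypes while simulating phenotype labels concrete, the following is a minimal sketch and not the authors' method. It assumes a simple logistic disease model in which each interacting SNP pair raises risk only when both SNPs carry at least one minor allele. The function name `simulate_phenotypes` and the parameters `baseline`, `effect`, and `noise` are hypothetical names chosen for illustration.

```python
import math
import random


def simulate_phenotypes(genotypes, pairs, baseline=0.1, effect=2.0,
                        noise=0.0, seed=0):
    """Generate 0/1 case labels for fixed genotypes (illustrative only).

    genotypes: one list of minor-allele counts (0, 1 or 2) per individual.
    pairs:     (i, j) SNP index pairs, i.e. the edges of the interaction
               network that the simulation should plant.
    baseline:  disease probability when no interacting pair is active.
    effect:    log-odds added for every active interacting pair (assumed model).
    noise:     probability of flipping a label after it has been drawn.
    """
    rng = random.Random(seed)
    base_logit = math.log(baseline / (1.0 - baseline))
    labels = []
    for g in genotypes:
        # A pair is "active" when both SNPs carry a minor allele.
        n_active = sum(1 for i, j in pairs if g[i] > 0 and g[j] > 0)
        p_case = 1.0 / (1.0 + math.exp(-(base_logit + effect * n_active)))
        label = 1 if rng.random() < p_case else 0
        if rng.random() < noise:  # label noise
            label = 1 - label
        labels.append(label)
    return labels


if __name__ == "__main__":
    genos = [[0, 0, 1], [1, 1, 0], [2, 1, 0], [0, 2, 2]]
    print(simulate_phenotypes(genos, pairs=[(0, 1)], effect=3.0, seed=42))
```

Because the genotype matrix is never altered, allele frequencies and linkage structure stay as they are in the real data; only the labels are synthetic. Other disease models (additive, recessive, and so on) would change just the line that decides whether a pair is active.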

L-027: Pathway Analysis Through Mutual Information
COSI: NetBio
  • Gustavo S Jeuken, KTH Royal Institute of Technology, Sweden
  • Lukas Käll, KTH Royal Institute of Technology, Sweden


Presentation Overview: Show

Pathway analysis comes in many forms. Most seek to establish a connection between the activity of a certain biological pathway and a difference in phenotype, often relying on an upstream differential expression analysis to establish a difference between case and control. This process usually models the relationship using many assumptions, often of a linear nature, and may also involve statistical tests for which the calculation of false discovery rates is not trivial.

Here, we propose a new method for pathway analysis that relies on information-theoretic principles and therefore requires no model of the association between pathway activity and phenotype, resulting in a very minimal set of assumptions. For this, we construct a different graph of samples for each pathway and score the association between the structure of this graph and any phenotype variable using mutual information, while adjusting each score for the effects of random chance.

Our experiments show that this method produces robust and reproducible scores that rank target pathways highly on single-cell datasets, outperforming established methods for pathway analysis under the same conditions.
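The scoring idea described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the sample graph is built by linking samples whose pathway-gene profiles lie within a distance `radius`, the graph structure is summarized by its connected components, and chance is adjusted for by subtracting the mean mutual information obtained after shuffling the phenotype. All function names and parameters here are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete label lists."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())


def graph_components(profiles, radius):
    """Link samples closer than `radius`; return a component id per sample."""
    n = len(profiles)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(profiles[i], profiles[j]) <= radius:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def adjusted_score(profiles, phenotype, radius, n_perm=200, seed=0):
    """Observed MI minus the mean MI under phenotype permutations."""
    groups = graph_components(profiles, radius)
    observed = mutual_information(groups, phenotype)
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_sum = 0.0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_sum += mutual_information(groups, shuffled)
    return observed - null_sum / n_perm
```

In this sketch `profiles` holds one vector per sample restricted to the genes of a single pathway, so calling `adjusted_score` once per pathway yields a ranking. A score near zero means the graph structure carries no more information about the phenotype than a random labelling would.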

L-028: Multi-omic integration: adding network topology to study axial spondyloarthritis
COSI: NetBio
  • Annabelle Beaudoin, Institut Pasteur, France
  • Vincent Guillemot, Institut Pasteur, France
  • Natalia Pietrosemoli, Institut Pasteur, France


Presentation Overview: Show

We propose a method called netSGCCA, based on the integration of multiblock data. Such approaches are now essential for analyzing the increasingly complex data produced in biology and health sciences, from multi-omics data to imaging data. netSGCCA derives from the framework of Generalized Canonical Correlation Analysis, which studies the relationship between groups of variables. In particular, we base it on SGCCA (Sparse Generalized Canonical Correlation Analysis), which filters the most pertinent variables when the blocks have a large number of variables, as in omics data. We implement a GraphNet penalty that integrates network topology information reflecting the interactions among the variables within a given data block. netSGCCA can thus benefit from the rich and complex information present in biological reference databases such as STRING-DB, which describe known associations between the molecular players.

We apply netSGCCA to a study aimed at understanding the pathophysiology of spondyloarthritis. It comprises three data blocks: two gene expression blocks, corresponding to two stimulations simulating innate and acquired immunity, and one block of quantitative variables corresponding to the clinical progress scores.
netSGCCA offers a promising means of adding network topology information while also integrating multi-omics datasets to identify the key variables driving the phenotypes.

L-029: Mapping regulatory programs that drive glioblastoma heterogeneity in single cell RNA sequence data using Gene Regulatory Networks
COSI: NetBio
  • Vishal H Oza, University of Alabama at Birmingham, United States
  • Brittany N Lasseigne, University of Alabama at Birmingham, United States


Presentation Overview: Show

Glioblastoma (GBM) is classified as a grade IV astrocytoma and is the most common and aggressive form of malignant brain cancer in adults. Clinical management of glioblastoma has been especially challenging because of heterogeneity both between patients and within a single tumor. The advent of single-cell RNA sequencing (scRNA-seq) has provided a great opportunity to further analyze this heterogeneity. To date, most studies have mapped this heterogeneity to cellular composition and the tumor microenvironment. These studies have been restricted to identifying gene expression differences by looking at individual genes as drivers of tumor heterogeneity. However, the relationships between the genes and how they are regulated have not been explored. In this analysis, we use the Passing Attributes between Networks for Data Assimilation (PANDA) network approach to construct gene regulatory networks for GBM single-cell samples and identify regulatory programs that drive the heterogeneity in GBM cell types.

L-030: Network propagation of multi-omics data for the identification of disease mechanisms
COSI: NetBio
  • Kristina Thedinga, Max-Planck-Institute for Molecular Genetics, Germany
  • Matthias Lienhard, Max-Planck-Institute for Molecular Genetics, Germany
  • Alia Ben-Kahla, Institute Pasteur Tunis, Tunisia
  • Lamia Guizani-Tabbane, Institute Pasteur Tunis, Tunisia
  • Ralf Herwig, Max-Planck-Institute for Molecular Genetics, Germany


Presentation Overview: Show

Modern high-throughput technologies generate a multitude of disease-related data, and a crucial step is to set these data into context and identify the underlying disease mechanisms. This can effectively be realized by integrating the molecular activities in networks of interacting molecules. We have developed the resource ConsensusPathDB, which integrates molecular interactions in a large network that can be used as a scaffold for the analysis of multiple omics data with network propagation. We present a combined approach consisting of i) a PPI scaffold, ii) a generic scoring method to assign multi-omics experimental data to network nodes, and iii) a robust propagation algorithm, NetCore, to identify modules of interacting proteins that are most relevant for the question under study.

We have applied the approach to two biomedical domains: first, we demonstrate the data-integration capabilities of the approach in the identification of toxicity modules from drug treatment data; second, we show that network propagation is able to distinguish different pathogen response mechanisms to Leishmania infection.
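For readers unfamiliar with network propagation, the sketch below shows a generic random walk with restart over a small interaction network. It is not NetCore: the abstract does not describe NetCore's normalization, so plain degree normalization is swapped in here as an assumption. The function name `propagate`, the `restart` parameter, and the toy graph are hypothetical, and every node is assumed to have at least one neighbour.

```python
def propagate(adjacency, seeds, restart=0.5, n_iter=100, tol=1e-9):
    """Random walk with restart on an undirected network (illustrative).

    adjacency: dict mapping each node to a list of its neighbours.
    seeds:     dict mapping nodes to non-negative experimental weights,
               e.g. scores derived from omics measurements.
    restart:   probability of jumping back to the seed distribution.
    Returns a dict of steady-state propagation scores that sum to 1.
    """
    nodes = sorted(adjacency)
    total = sum(seeds.get(v, 0.0) for v in nodes)
    if total <= 0:
        raise ValueError("at least one seed weight must be positive")
    p0 = {v: seeds.get(v, 0.0) / total for v in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            # Degree normalization: a node splits its score equally
            # among its neighbours (assumes len(adjacency[u]) > 0).
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
    print(propagate(ppi, {"A": 1.0}))
```

Nodes that end up with high scores without being seeds themselves are the candidates that propagation adds to a module; a larger `restart` keeps scores closer to the measured seeds, while a smaller one spreads them further across the network.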

L-031: Community Detection Analysis in Multilayer COVID-19 Patient Similarity Networks
COSI: NetBio
  • Piotr Sliwa, University of Oxford, United Kingdom
  • Heather Harrington, University of Oxford, United Kingdom
  • Gesine Reinert, University of Oxford, United Kingdom
  • Julian Knight, University of Oxford, United Kingdom


Presentation Overview: Show

Advances in biological assays have enabled the generation of datasets containing multiple modalities per patient, such as RNA-seq and proteomics data, for large cohorts. Integrating the information from different modalities is an open research challenge; a popular method is Similarity Network Fusion. Here we propose an alternative network science-based approach, in which we construct patient similarity networks in a principled fashion, one network per modality, and then combine these similarity networks into a multilayer network. On this network, communities are detected using the Leiden algorithm.

We apply this approach to a recent comprehensive multimodal dataset, the Covid-19 Multi-omics Blood Atlas (COMBAT), which contains, among others, information on COVID-19 patients, sepsis patients, and healthy volunteers across modalities spanning proteomics, bulk and single-cell transcriptomics, and mass cytometry. We explore the resulting communities for correlations with important clinical variables and with the activity of drug-relevant and immune system-related molecular pathways. The analysis reveals a signal that differentiates between healthy volunteers, COVID-19 patients, and sepsis patients. We show that multilayer networks can refine classifications made using one modality. Our pipeline can also be applied to other similar datasets and may help to better understand multimodal molecular health data.
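The construction step described above, one similarity network per modality combined into a multilayer network, can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: it assumes a k-nearest-neighbour rule with Euclidean distance for each layer and a uniform inter-layer coupling weight between copies of the same patient. The names `knn_layer`, `multilayer_edges`, `k`, and `coupling` are hypothetical, and community detection itself is left to a dedicated library.

```python
import math


def knn_layer(features, k):
    """Undirected k-nearest-neighbour edges for one modality.

    features: dict mapping patient id -> feature vector.
    """
    edges = set()
    for p, x in features.items():
        others = sorted((q for q in features if q != p),
                        key=lambda q: (math.dist(x, features[q]), q))
        for q in others[:k]:
            edges.add((min(p, q), max(p, q)))
    return edges


def multilayer_edges(layers, k=2, coupling=1.0):
    """Stack per-modality layers into one weighted edge list.

    layers: dict mapping modality name -> {patient id: feature vector}.
    Nodes are (patient, modality) tuples. Intra-layer edges get weight 1.0;
    copies of the same patient in different layers are joined with weight
    `coupling` (an assumed, uniform inter-layer weight).
    """
    edges = []
    for name in sorted(layers):
        for p, q in sorted(knn_layer(layers[name], k)):
            edges.append(((p, name), (q, name), 1.0))
    names = sorted(layers)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for p in sorted(set(layers[a]) & set(layers[b])):
                edges.append(((p, a), (p, b), coupling))
    return edges
```

The resulting edge list can be handed to any community detection implementation that accepts weighted graphs. Patients missing from a modality simply have no node in that layer, which is one practical advantage of the multilayer representation over fusing all modalities into a single matrix.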

L-033: depGCN: Debiased personalized gene co-expression networks from scRNA-seq
COSI: NetBio
  • Shan Lu, University of Wisconsin - Madison, United States
  • Sündüz Keleş, University of Wisconsin - Madison, United States


Presentation Overview: Show

Gene co-expression network analysis is instrumental for detecting latent relationships invisible to standard workflows of clustering and differential expression analysis. Emerging population-scale single-cell RNA-seq (scRNA-seq) datasets across multiple individuals are creating an unprecedented opportunity to quantify expression variation across individuals at the gene co-expression network level. However, gene-gene correlation estimates from scRNA-seq tend to be severely biased towards zero for low-expression genes. Here, we present depGCN to debias gene-gene correlation estimates and facilitate accurate quantification of network-level variation across population-scale scRNA-seq datasets. depGCN corrects correlation estimates in the general Poisson measurement model and provides a metric to quantify high-noise gene pairs. Computational experiments with population-scale scRNA-seq data established that depGCN estimates are robust to the mean expression levels of the genes and the sequencing depths of the datasets. Compared to alternatives, depGCN results in fewer false positive edges in the co-expression networks and yields markedly better estimates of network centrality and modules. Applications of depGCN to population-scale scRNA-seq of oligodendrocytes from postmortem human tissues of Alzheimer disease cases and controls resulted in co-expression networks with significantly better validation rates according to external databases. Furthermore, network-based centrality analysis yields biologically coherent gene groups that are associated with the phenotypic variation.
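Why correlations shrink towards zero for low-expression genes, and how a correction can work, is illustrated by the sketch below. It is a generic moment-based correction, not depGCN's estimator, and it assumes a Poisson measurement model in which the count for cell i is drawn as Poisson(s_i * z_i), with s_i the cell's sequencing depth factor and z_i the underlying expression. Under that assumption the variance of the depth-normalized counts equals Var(z) plus a measurement term that can be estimated by the mean of count / s_i^2, while the covariance between two genes is unaffected. The function name `debiased_correlation` is hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def debiased_correlation(x_counts, y_counts, depths):
    """Return (naive, corrected) correlation for two genes across cells.

    x_counts, y_counts: raw counts of the two genes in each cell.
    depths:             per-cell depth factors s_i (assumed known).
    `corrected` is None when the estimated biological variance of either
    gene is not positive, i.e. the gene is dominated by measurement noise.
    """
    xs = [x / s for x, s in zip(x_counts, depths)]
    ys = [y / s for y, s in zip(y_counts, depths)]
    mx, my = _mean(xs), _mean(ys)
    cov = _mean([(a - mx) * (b - my) for a, b in zip(xs, ys)])
    var_x = _mean([(a - mx) ** 2 for a in xs])
    var_y = _mean([(b - my) ** 2 for b in ys])
    naive = cov / math.sqrt(var_x * var_y)

    # Poisson measurement variance: E[X_i / s_i^2] = E[z] / s_i.
    noise_x = _mean([x / s ** 2 for x, s in zip(x_counts, depths)])
    noise_y = _mean([y / s ** 2 for y, s in zip(y_counts, depths)])
    sig_x, sig_y = var_x - noise_x, var_y - noise_y
    if sig_x <= 0 or sig_y <= 0:
        return naive, None
    corrected = cov / math.sqrt(sig_x * sig_y)
    return naive, max(-1.0, min(1.0, corrected))
```

The measurement term grows relative to the biological variance as expression or depth falls, which is why the naive estimate is attenuated most for low-expression genes. The ratio of the measurement term to the total variance could also serve as a rough flag for high-noise gene pairs, although the metric depGCN uses is not specified in the abstract.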