Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

NetBio COSI

Schedule subject to change
All times listed are in UTC
Monday, July 26th
11:00-11:40
NetBio Keynote: Machine Learning Frontiers in Network Biology
Format: Live-stream

Moderator(s): Anaïs Baudot

  • Karsten Borgwardt, ETH Zürich, Switzerland

Presentation Overview:

Exploiting our growing knowledge about biological networks in computational biology often requires new algorithm development in machine learning, from learning on graphs to combinatorial feature selection. In this talk, we will describe several fundamental problems in machine learning in network biology and algorithmic solutions that we have proposed.

11:40-12:00
Proceedings Presentation: A novel constrained genetic algorithm-based Boolean network inference method from steady-state gene expression data
Format: Pre-recorded with live Q&A

Moderator(s): Anaïs Baudot

  • Hung-Cuong Trinh, Ton Duc Thang University, Viet Nam
  • Yung-Keun Kwon, University of Ulsan, South Korea

Presentation Overview:

Motivation: Inferring both the network structure and the dynamics of a gene regulatory network from steady-state gene expression data is a challenging problem in systems biology. Some methods based on Boolean or differential equation models have been proposed, but they are inefficient at inferring large-scale networks. It is therefore necessary to develop a method that infers the network structure and dynamics accurately on large-scale networks using steady-state expression.
Results: In this study, we propose a novel constrained genetic algorithm-based Boolean network inference (CGA-BNI) method in which a Boolean canalyzing update rule scheme is employed to capture coarse-grained dynamics. Given steady-state gene expression data as input, CGA-BNI identifies a set of path consistency-based constraints by comparing the gene expression levels between the wild-type and the mutant experiments. It then searches for Boolean networks that satisfy the constraints and induce attractors most similar to the steady-state expressions. We devised a heuristic mutation operation for faster convergence and implemented a parallel evaluation routine to reduce execution time. Through extensive simulations on artificial and real gene expression datasets, CGA-BNI showed better performance than four other existing methods in terms of both structural and dynamics prediction accuracy.
Conclusion: Taken together, CGA-BNI is a promising and scalable tool for predicting both the structure and the dynamics of a gene regulatory network when the highest accuracy is needed, at the cost of longer execution time.
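
As a rough illustration of the idea of a Boolean network whose attractors are compared with steady-state expression, the following minimal Python sketch may help. It is not the CGA-BNI implementation: the three-gene network, the `canalyzing` helper, the synchronous update scheme, and the Hamming-distance `mismatch` score are all assumptions made for this example.

```python
def canalyzing(canal_input, canal_value, canal_output, fallback):
    """Build an update rule with one canalyzing input (illustrative helper).

    If state[canal_input] equals canal_value, the output is forced to
    canal_output; otherwise the fallback rule decides.
    """
    def rule(state):
        if state[canal_input] == canal_value:
            return canal_output
        return fallback(state)
    return rule


# Hypothetical toy network over genes (g0, g1, g2):
RULES = [
    canalyzing(1, 1, 0, lambda s: 1),         # g0 is off when g1 is on, else on
    canalyzing(0, 1, 0, lambda s: 1),         # g1 is off when g0 is on, else on
    canalyzing(0, 0, 0, lambda s: 1 - s[1]),  # g2 is off when g0 is off, else NOT g1
]


def step(state, rules):
    """Apply all update rules at once (synchronous update, an assumption)."""
    return tuple(rule(state) for rule in rules)


def attractor(state, rules):
    """Iterate from `state` until a state repeats; return the repeating cycle."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return trajectory[seen[state]:]


def mismatch(observed, rules):
    """Smallest Hamming distance between an observed steady state and the
    attractor reached when the network is started from that state."""
    cycle = attractor(observed, rules)
    return min(sum(a != b for a, b in zip(observed, s)) for s in cycle)


print(attractor((1, 0, 0), RULES))  # [(1, 0, 1)], a fixed point
print(attractor((0, 0, 0), RULES))  # [(0, 0, 0), (1, 1, 0)], a 2-cycle
print(mismatch((1, 0, 0), RULES))   # 1
```

A search procedure such as a genetic algorithm could then score candidate rule sets by this kind of mismatch; how CGA-BNI actually encodes networks and constraints is not specified in the abstract.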

12:00-12:20
Beyond protein-protein interaction networks: Exploring the impact of alternative splicing using DIGGER and NEASE
Format: Pre-recorded with live Q&A

Moderator(s): Anaïs Baudot

  • Jan Baumbach, Computational Systems Biology, Hamburg University, Germany
  • Markus List, Chair of Experimental Bioinformatics, Technical University of Munich, Germany
  • Zakaria Louadi, Computational Systems Biology, Hamburg University, Germany
  • Kevin Yuan, Chair of Experimental Bioinformatics, Technical University of Munich, Germany
  • Alexander Gress, Helmholtz Centre for Infection Research, Germany
  • Olga Tsoy, Computational Systems Biology, Hamburg University, Germany
  • Olga V. Kalinina, Helmholtz Centre for Infection Research, Germany
  • Tim Kacprowski, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Medical School Hannover, Germany

Presentation Overview:

Alternative splicing plays a major role in regulating the functional repertoire of the proteome. However, isoform-specific effects on protein-protein interactions (PPIs) are usually overlooked, making it impossible to judge the functional role of individual exons on a systems biology level. We overcome this barrier by integrating PPIs, domain-domain interactions (DDIs) and residue-level interactions to lift exon expression analysis to a network level. Our user-friendly tool and database DIGGER (Domain Interaction Graph Guided Explorer) is available at https://exbio.wzw.tum.de/digger and allows users to seamlessly switch between isoform-centric and exon-centric views of the interactome and to extract subnetworks of relevant isoforms.

Furthermore, alongside the database, we propose network enrichment of alternative splicing events (NEASE), a method for differential splicing analysis using the joint PPI and DDI network. NEASE considers interactions affected by alternative splicing (AS) and identifies enriched pathways based on affected edges rather than affected genes. Our analysis shows that NEASE largely outperforms classic gene set enrichment in the context of AS and generates meaningful biological insights on the impact of AS. The DIGGER database and NEASE tool together provide essential resources for studying mechanistic consequences of AS in systems medicine.

12:40-13:00
NetControl4BioMed: A web-based platform for controllability analysis of protein-protein interaction networks
Format: Pre-recorded with live Q&A

Moderator(s): Anaïs Baudot

  • Victor Popescu, Åbo Akademi University, Finland
  • Jose Angel Sanchez Martin, Technical University of Madrid, Spain
  • Daniela Schacherer, Heidelberg University, Germany
  • Sadra Safadoust, Koç University, Turkey
  • Negin Majidi, University of California, Santa Cruz, United States
  • Andrei Andronescu, Polytechnic University of Bucharest, Romania
  • Alexandru Nedea, Polytechnic University of Bucharest, Romania
  • Diana Ion, Polytechnic University of Bucharest, Romania
  • Eduard Mititelu, Polytechnic University of Bucharest, Romania
  • Eugen Czeizler, Åbo Akademi University, Finland
  • Ion Petre, University of Turku, Finland

Presentation Overview:

Target network controllability aims to discover suitable external interventions that can guide a system to a specific state. In the biomedical domain, this can translate to finding drugs that influence a cell in a desired way, which can lead to novel and personalized therapeutic suggestions based on drug combinations and drug repurposing. We introduce NetControl4BioMed, a free open-source web-based application that allows users to generate or upload personalized protein-protein interaction networks and to analyze them from a controllability point of view. It provides customized drug therapeutic suggestions through a user-friendly interface and offers close integration with external applications and databases. It also lets users share their work, making collaboration possible. The application integrates protein data from HGNC, Ensembl, UniProt, NCBI, and InnateDB, protein-protein interaction data from InnateDB, Omnipath, and SIGNOR, cell-line data from COLT and DepMap, and drug-target data from DrugBank. The application and data are available online at https://netcontrol.combio.org/. The source code is available at https://github.com/Vilksar/NetControl4BioMed under an MIT license.

13:00-13:20
On the limits of active module identification
Format: Pre-recorded with live Q&A

Moderator(s): Anaïs Baudot

  • Jan Baumbach, Technical University of Munich, Germany
  • Olga Lazareva, Technical University of Munich, Germany
  • Markus List, Technical University of Munich, Germany
  • David B. Blumenthal, Technical University of Munich, Germany

Presentation Overview:

In network and systems medicine, active module identification methods (AMIMs) are widely used for discovering candidate molecular disease mechanisms. AMIMs combine network analysis algorithms with molecular profiling data by projecting gene expression data onto generic protein-protein interaction (PPI) networks. Although active module identification has led to various novel insights into complex diseases, there is increasing awareness in the field that the combination of gene expression data and PPI network is problematic because up-to-date PPI networks have a very small diameter and are subject to both technical and literature bias. In this paper, we report the results of an extensive study where we analyzed for the first time whether widely used AMIMs really benefit from using PPI networks. Our results clearly show that the tested AMIMs mostly do not produce biologically more meaningful candidate disease modules on widely used PPI networks than on random networks with the same node degrees. AMIMs hence mainly learn from the node degrees and mostly fail to exploit the biological knowledge encoded in the edges of the PPI networks. We suggest that novel algorithms are needed which overcome the degree-bias of most existing AMIMs and/or work with customized, context-specific networks instead of generic PPI networks.
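
To make the notion of "random networks with the same node degrees" concrete, here is a minimal Python sketch of one standard way to build such a null network: repeated double edge swaps on an undirected graph. This is an illustration under stated assumptions, not the procedure used in the study; the function names, the toy edge list, and the number of swaps are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Rewire an undirected simple graph with double edge swaps.

    Each swap replaces edges (a, b) and (c, d) with (a, d) and (c, b),
    so every node keeps its degree. Swaps that would create a self-loop
    or a duplicate edge are rejected.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:           # pick an orientation for the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:        # shared endpoint: swap could create a loop
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue                     # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


# Toy network (hypothetical): a 6-node ring with two chords.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
shuffled = degree_preserving_shuffle(original, n_swaps=20, seed=1)
print(degrees(shuffled) == degrees(original))
```

Running a module identification method on both the original and the shuffled network, and comparing the results, is the kind of control experiment the abstract describes.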

13:20-13:40
Dissecting differential Cell Cell Communication with CrossTalkeR
Format: Pre-recorded with live Q&A

Moderator(s): Anaïs Baudot

  • James Shiniti Nagai, RWTH Aachen University Medical School, Germany
  • Nils B. Leimkühler, University Hospital Essen, Erasmus Medical Center, Germany
  • Rebekka K. Schneider, RWTH Aachen University, Erasmus Medical Center, Germany
  • Michael T. Schaub, RWTH Aachen University, Germany
  • Ivan G. Costa, RWTH Aachen University Medical School, Germany

Presentation Overview:

By combining single-cell RNA sequencing (scRNA-seq) data with ligand-receptor (LR) interaction repositories, existing methods allow the study of cellular crosstalk by inferring putative links between cell types in a given sample. However, identifying the most prominent cell-type pairs and LR interactions from such data is hard. Further challenges in the study of cell-cell interactions (CCI) include differential CCI analysis between phenotypes (e.g., disease vs. normal); understanding gene expression and its biological function (e.g., ligand or receptor); and identifying cell-type-specific crosstalk signatures. Motivated by this, we developed CrossTalkeR (Nagai et al. 2021), a novel network-based crosstalk analysis tool that can be used with both single and comparative phenotype data. Our method facilitates the extraction of salient patterns through network properties such as centrality and PageRank at two resolution levels: Cell-Cell Communication (CCC) and Gene-Cell Communication (GCC). We revisited the bone marrow PMF niche study (Leimkühler et al. 2021) using CrossTalkeR (Fig. 1). CrossTalkeR log-odds PageRank revealed the main disease-related cell types, Megakaryocytes and Neural. Principal Component Analysis of the GCC identified niche-related proteins, for example those associated with the cellular matrix (FN1 and COL1A1) and hematopoiesis (CXCL12 and SDC1).

13:40-14:00
Reducing false GO term calls in network-based active module identification: methodology and a new algorithm
Format: Pre-recorded with live Q&A

Moderator(s): Anaïs Baudot

  • Hagai Levi, Tel Aviv University, Israel
  • Ran Elkon, Tel Aviv University, Israel
  • Ron Shamir, Tel Aviv University, Israel

Presentation Overview:

Algorithms for active module identification (AMI) are central to the analysis of omics data. Such algorithms receive a gene network and genes' activity scores as input and report sub-networks with high activity signal ('active modules'), thus representing biological processes that presumably play key roles in the analyzed conditions. Here, we systematically evaluated six popular AMI methods on gene expression (GE) and GWAS data. We observed that GO terms enriched in modules detected on the real data were often also enriched in modules found on randomly permuted data. This indicated that AMI methods frequently report modules that are not specific to the biological context measured by the analyzed omics dataset. To tackle this bias, we designed a permutation-based method that empirically evaluates GO terms reported by AMI methods. We used the method to fashion five novel AMI performance criteria. Last, we developed DOMINO, a novel AMI algorithm, which outperformed the other six algorithms in extensive testing on GE and GWAS data. Software is available at https://github.com/Shamir-Lab.
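
The general idea of checking a reported term against randomly permuted data can be sketched as follows. This is only an illustration under stated assumptions and not the authors' method or DOMINO: the "module" is simply the top-k scoring genes (a stand-in for a real AMI algorithm), the term score is a plain overlap count, and all names and numbers are hypothetical.

```python
import random


def empirical_p_value(real_score, null_scores):
    """Fraction of permuted scores at least as large as the real one,
    with the usual +1 correction so the estimate is never exactly zero."""
    exceed = sum(1 for s in null_scores if s >= real_score)
    return (1 + exceed) / (1 + len(null_scores))


def top_k_module(scores, k):
    """Stand-in for an AMI method: the k genes with the highest activity."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def term_overlap(module, term_genes):
    """Toy enrichment score: number of module genes annotated with the term."""
    return len(module & term_genes)


def permutation_null(scores, term_genes, k, n_perm, seed=0):
    """Recompute the term score after shuffling activity scores across genes."""
    rng = random.Random(seed)
    genes = list(scores)
    values = list(scores.values())
    null = []
    for _ in range(n_perm):
        rng.shuffle(values)
        permuted = dict(zip(genes, values))
        null.append(term_overlap(top_k_module(permuted, k), term_genes))
    return null


# Hypothetical activity scores and a hypothetical GO term gene set.
scores = {"g1": 9.0, "g2": 8.0, "g3": 7.0, "g4": 1.0, "g5": 0.5,
          "g6": 0.4, "g7": 0.3, "g8": 0.2, "g9": 0.1, "g10": 0.0}
term = {"g1", "g2", "g3"}

real = term_overlap(top_k_module(scores, 3), term)
null = permutation_null(scores, term, k=3, n_perm=200, seed=0)
print(real, empirical_p_value(real, null))
```

A term whose real score is routinely matched on permuted data gets a large empirical p-value and can be flagged as not specific to the analyzed dataset.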

14:20-14:40
Modeling multi-scale -omics data via a network of networks
Format: Pre-recorded with live Q&A

Moderator(s): Marinka Zitnik

  • Shawn Gu, University of Notre Dame, United States
  • Meng Jiang, University of Notre Dame, United States
  • Pietro Hiram Guzzi, University Magna Graecia of Catanzaro, Italy
  • Tijana Milenkovic, University of Notre Dame, United States

Presentation Overview:

Predicting node labels and graph labels are prominent tasks in network science. Sometimes, entities represented by nodes in a higher-level (i.e., higher-scale) network can themselves be modeled as networks at a lower level. So, we argue that systems involving such entities should be integrated with a "network of networks" (NoN) representation. We ask whether label prediction using integrated, multi-level NoN data is more accurate than using single-level node data or graph data alone, i.e., than node label prediction on the higher-level network and graph label prediction on the lower-level networks. We design a novel framework to investigate this question. We develop the first synthetic NoN generator to study different NoN properties, and we construct a biological NoN. We extend traditional single-level node and graph label prediction approaches to their NoN counterparts and propose a novel, integrative NoN-based graph neural network model. We evaluate the accuracy of each approach on the synthetic (predicting artificial labels) and biological (predicting proteins' functions) NoNs. We find that our NoN approaches outperform or are as good as single-level node- and network-level ones, depending on the type of NoN and/or its properties. As such, NoN-based data integration is an important and exciting research direction.

14:40-15:20
NetBio Keynote: Learning Gene Regulatory Networks from Bulk and Single Cell Omic Data
Format: Live-stream

Moderator(s): Marinka Zitnik

  • Sushmita Roy, University of Wisconsin-Madison, United States

Presentation Overview:

Gene regulatory networks are molecular networks that control which genes must be expressed when and where in a living cell, translating the information encoded in an organism’s genome to context-specific responses. Identification of these networks is important to advance our understanding of many biological processes such as development, disease, response to stress, and evolution. Technological advances in genomics are enabling us to measure high-throughput molecular readouts at multiple levels including the transcriptome and epigenome for both bulk populations and single cells, which is enabling the study of normal and disease processes at an exceedingly high resolution. However, there are numerous computational challenges that arise to effectively integrate these data to gain insight into the gene regulatory networks that govern these processes. I will present some recent computational tools towards mapping genome-scale regulatory networks using multi-omic datasets measured at the population and single cell level. These tools enable us to define cell type-specific regulatory networks and uncover network components important for establishing cell-type specific programs in dynamic processes such as cell fate specification.

Tuesday, July 27th
11:00-11:40
NetBio Keynote: Network Medicine—From Protein-Protein to Human-Machine Interactions
Format: Live-stream

Moderator(s): Martina Summer-Kutmon

  • Jörg Menche

Presentation Overview:

From protein interactions to signal transduction, from metabolism to the nervous system: Virtually all processes in health and disease rely on the careful orchestration of a large number of diverse individual components ranging from molecules to cells and entire organs. Networks provide a powerful framework for describing and understanding these complex systems in a holistic fashion. They offer a unique combination of a highly intuitive, qualitative description, and a plethora of analytical, quantitative tools. In my presentation, I will first review how molecular networks can be understood as maps for elucidating the relation between molecular-level perturbations and their phenotypic manifestations. I will then sketch out a number of future challenges in the areas of network biology and network medicine, as well as recent efforts of my group to address them. These challenges include methodological aspects concerning the visualization and interpretation of large biomedical data, as well as translational aspects concerning concrete clinical applications in the area of rare diseases.

11:40-12:00
CROssBAR: Comprehensive Resource of Biomedical Relations with Knowledge Graph Representations
Format: Pre-recorded with live Q&A

Moderator(s): Martina Summer-Kutmon

  • Tunca Dogan, Hacettepe University, Turkey
  • Heval Ataş, Middle East Technical University, Turkey
  • Vishal Joshi, EMBL-EBI, United Kingdom
  • Ahmet Atakan, Middle East Technical University, Turkey
  • Ahmet Süreyya Rifaioğlu, Middle East Technical University, Turkey
  • Esra Nalbat, Middle East Technical University, Turkey
  • Andrew Nightingale, EMBL-EBI, United Kingdom
  • Rabie Saidi, UniProt, European Bioinformatics Institute, Cambridge, United Kingdom
  • Vladimir Volynkin, EMBL-EBI, United Kingdom
  • Hermann Zellner, EMBL, United Kingdom
  • Rengul Atalay, University of Chicago, United States
  • Maria Martin, EMBL-EBI, United Kingdom
  • Volkan Atalay, Middle East Technical University, Turkey

Presentation Overview:

Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive/integrative multi-omics-based analysis. We aimed to address this issue by developing a new data integration/representation methodology and its application by constructing a biological data resource. CROssBAR is a comprehensive system that integrates large-scale biological/biomedical data from various resources and stores them in a NoSQL database. CROssBAR is enriched with the deep-learning-based prediction of relationships between numerous data entries, which is followed by the rigorous analysis of the enriched data to obtain biologically meaningful modules. These complex sets of entities and relationships are displayed to users via easy-to-interpret, interactive knowledge graphs within an open-access service. CROssBAR knowledge graphs incorporate relevant genes-proteins, molecular interactions, pathways, phenotypes, diseases, as well as known/predicted drugs and bioactive compounds, and they are constructed on-the-fly based on simple non-programmatic user queries. These intensely processed heterogeneous networks are expected to aid systems-level research, especially to infer biological mechanisms in relation to genes, proteins, their ligands, and diseases (https://crossbar.kansil.org).

12:00-12:20
Proceedings Presentation: Graph Transformation for Enzymatic Mechanisms
Format: Pre-recorded with live Q&A

Moderator(s): Martina Summer-Kutmon

  • Jakob L. Andersen, IMADA, University of Southern Denmark, Denmark
  • Rolf Fagerberg, IMADA, University of Southern Denmark, Denmark
  • Christoph Flamm, Department of Theoretical Chemistry, University of Vienna, Austria
  • Walter Fontana, Department of Systems Biology, Harvard Medical School, United States
  • Juraj Kolčák, IMADA, University of Southern Denmark, Denmark
  • Christophe V.F.P. Laurent, IMADA, University of Southern Denmark, Denmark
  • Daniel Merkle, IMADA, University of Southern Denmark, Denmark
  • Nikolai Nøjgaard, IMADA, University of Southern Denmark, Denmark

Presentation Overview:

Motivation: The design of enzymes is as challenging as it is consequential for making chemical synthesis in medical and industrial applications more efficient, cost-effective and environmentally friendly. While several aspects of this complex problem are computationally assisted, the drafting of catalytic mechanisms, i.e. the specification of the chemical steps, and hence intermediate states, that the enzyme is meant to implement, is largely left to human expertise. The ability to capture specific chemistries of multi-step catalysis in a fashion that enables its computational construction and design is therefore highly desirable and would equally impact the elucidation of existing enzymatic reactions whose mechanisms are unknown.
Results: We use the mathematical framework of graph transformation to express the distinction between rules and reactions in chemistry. We derive about 1000 rules for amino acid side chain chemistry from the M-CSA database, a curated repository of enzymatic mechanisms. Using graph transformation we are able to propose hundreds of hypothetical catalytic mechanisms for a large number of unrelated reactions in the Rhea database. We analyze these mechanisms to find that they combine in chemically sound fashion individual steps from a variety of known multi-step mechanisms, showing that plausible novel mechanisms for catalysis can be constructed computationally.
Availability and Implementation: The source code of the initial prototype of our approach is available at https://github.com/Nojgaard/mechsearch
Contact: daniel@imada.sdu.dk
Supplementary information: Supplementary data are available at https://cheminf.imada.sdu.dk/preprints/ECCB-2021

12:40-13:00
Proceedings Presentation: Disease Gene Prediction with Privileged Information and Heteroscedastic Dropout
Format: Pre-recorded with live Q&A

Moderator(s): Martina Summer-Kutmon

  • Jianzhu Ma, Department of Computer Science and Department of Biochemistry, Purdue University, United States
  • Sheng Wang, Paul G. Allen School of Computer Science, University of Washington, United States
  • Juan Shu, Department of Statistics, Purdue University, United States
  • Yu Li, Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
  • Bowei Xi, Department of Statistics, Purdue University, United States

Presentation Overview:

Recently, machine learning models have achieved tremendous success in prioritizing candidate genes for genetic diseases. These models are able to accurately quantify the similarity between diseases and genes based on the intuition that similar genes are more likely to be associated with similar diseases. However, the genetic features these methods rely on are often hard to collect due to high experimental cost and various other technical limitations. Existing solutions to this problem significantly increase the risk of overfitting and decrease the generalizability of the models.

In this work, we propose a graph neural network (GNN) version of the Learning Under Privileged Information (LUPI) paradigm to predict new disease gene associations. Unlike previous gene prioritization approaches, our model does not require the genetic features to be the same at training and test stages. If a genetic feature is hard to measure and therefore missing at the test stage, our model could still efficiently incorporate its information during the training process. We develop a Heteroscedastic Gaussian Dropout algorithm, where the dropout probability of the GNN model is determined by another GNN model with a mirrored GNN architecture. We compared our method with four state-of-the-art methods on the Online Mendelian Inheritance in Man (OMIM) dataset to prioritize candidate disease genes. Extensive evaluations show that our model could improve the prediction accuracy when all the features are available compared to other methods. More importantly, our model could make very accurate predictions when >90% of the features are missing at the test stage.

13:00-13:20
Simulation, modeling, and network-guided detection of epistasis
Format: Pre-recorded with live Q&A

Moderator(s): Martina Summer-Kutmon

  • Jan Baumbach, Chair of Computational Systems Biology, University of Hamburg, Germany
  • Markus List, Chair of Experimental Bioinformatics, Technical University of Munich, Germany
  • David Blumenthal, Chair of Experimental Bioinformatics, Technical University of Munich, Germany
  • Tim Kacprowski, PLRI, TU Braunschweig, MHH, BRICS, Germany
  • Markus Hoffmann, Chair of Experimental Bioinformatics, Technical University of Munich, Germany

Presentation Overview:

Genome-wide association studies (GWAS) link genetic variants to phenotypic traits of interest (e.g., a disease), usually by looking for biallelic single nucleotide polymorphisms (SNPs) that are individually predictive of the phenotype. SNPs usually account for only a fraction of the investigated traits’ heritability. The most common hypothesis is that the missing heritability can be explained by epistasis, i.e., by interactions between SNPs that are jointly predictive of the phenotype but individually have little or no effect. Although epistasis is assumed to play an important role in the genomics of complex phenotypic traits, no undisputed cases of epistasis in humans are known. Developing epistasis detection tools is problematic for at least three reasons: Firstly, there is no suitable human data with ground truth that could be used for evaluation. Secondly, it is unclear how epistasis should be formally modeled to render it algorithmically accessible. Thirdly, it is often unclear whether predicted cases of epistasis are biologically meaningful or mere statistical artifacts. In our work, we address these problems with (1) an epistasis simulation tool, (2) a comparison of existing statistical models, and (3) a detection tool guided by biological knowledge to lift state-of-the-art epistasis detection to a systems-oriented network biology level.

13:20-13:40
GoNetic: Network-Based Driver Identification using Probabilistic Pathfinding
Format: Pre-recorded with live Q&A

Moderator(s): Martina Summer-Kutmon

  • Kathleen Marchal, Ghent University, Belgium
  • Louise de Schaetzen van Brienen, Ghent University, Belgium
  • Giles Miclotte, Ghent University, Belgium
  • Maarten Larmuseau, Ghent University, Belgium

Presentation Overview:

Network-based driver identification methods use an underlying network to drive their analysis. Unlike frequency-based driver identification methods, network-based methods can also identify rarely mutated driver genes. Most state-of-the-art network-based driver identification methods cannot handle sample-specific mutational information and cannot cope with the weights and edge directionalities of their underlying network.
Hence, we developed GoNetic, a method based on probabilistic pathfinding that is able to use a weighted network and patient-specific mutational information. GoNetic extracts subnetworks that maximally connect mutated genes in different samples with a minimal number of interactions. These subnetworks are proxies of recurrently mutated driver pathways.
When applied to a large metastatic prostate cancer cohort, GoNetic identified in addition to well-known metastatic drivers several rarely mutated driver candidates as members of frequently mutated subnetworks. Some candidates were more frequently mutated in metastatic than in primary samples, confirming their metastatic importance. Validation with other public datasets of matching primary and metastatic samples allowed differentiating drivers involved in early from those involved in later disease stages.
In conclusion, GoNetic is a flexible network-based driver identification method that can handle large tumor cohorts, exploits properties of individual mutations and samples, and can easily be used for data integration.

13:40-14:00
Mutation Edgotype Drives Fitness Effect in Human
Format: Pre-recorded with live Q&A

Moderator(s): Martina Summer-Kutmon

  • Mohamed Ghadie, McGill University, Canada
  • Yu Xia, McGill University, Canada

Presentation Overview:

Missense mutations are known to perturb protein-protein interaction (PPI) networks (“interactome networks”) in different ways. However, it remains unknown how different interactome perturbation patterns (“edgotypes”) impact organismal fitness. Here, we estimate the fitness effect of missense mutations with different interactome perturbation patterns in human, by calculating the fractions of neutral and deleterious mutations that do not disrupt PPIs (“quasi-wild-type”), or disrupt PPIs either by disrupting the binding interface (“edgetic”) or by disrupting overall protein stability (“quasi-null”). We first map pathogenic mutations and common non-pathogenic mutations onto homology-based three-dimensional structural models of proteins and PPIs in human. Next, we perform structure-based calculations to classify each mutation as either quasi-wild-type, edgetic, or quasi-null. Using our predicted as well as experimentally determined interactome perturbation patterns, we estimate that >~40% of quasi-wild-type mutations are effectively neutral and the remaining are mostly mildly deleterious, that >~75% of edgetic mutations are only mildly deleterious, and that up to ~75% of quasi-null mutations may be strongly detrimental. Our results suggest that while mutations that do not disrupt the interactome tend to be effectively neutral, the majority of human PPIs are under strong purifying selection and the stability of most human proteins is essential to life.

14:20-14:40
Inference of cell type-specific gene regulatory networks from single-cell omic datasets
Format: Pre-recorded with live Q&A

Moderator(s): Tijana Milenkovic

  • Sushmita Roy, University of Wisconsin-Madison, United States
  • Shilu Zhang, University of Wisconsin-Madison, United States
  • Stefan Pietrzak, University of Wisconsin-Madison, United States
  • Alireza Siahpirani, University of Wisconsin-Madison, United States
  • Saptarshi Pyne, University of Wisconsin-Madison, United States
  • Rupa Sridharan, University of Wisconsin-Madison, United States

Presentation Overview:

Single-cell omic technologies offer unprecedented opportunities to study transcriptional programs of heterogeneous populations, allowing a high-resolution view of cell type-specific gene regulatory networks. However, single-cell data are noisy and sparse, which makes gene regulatory network inference difficult. To address this challenge, we propose single-cell Cell type Varying Networks (scCVN), a multi-task learning framework for joint inference of cell type-specific gene regulatory networks that leverages the lineage structure together with scRNA-seq and scATAC-seq measurements to enable robust inference. We benchmarked the performance of five multi-task learning algorithms, including scCVN, and found that multi-task learning algorithms perform better at predicting gene regulatory relationships than single-task learning algorithms, especially for cell populations with few samples. Additionally, we applied scCVN to scRNA-seq data and novel scATAC-seq time-course data measured during mouse cellular reprogramming. Our approach identified known connections between key developmental regulators and target genes and captured regulatory interaction dynamics during the cellular reprogramming process. In summary, scCVN is a powerful framework to infer cell type-specific gene regulatory networks for single-cell omic datasets.

14:40-15:20
NetBio Keynote: Few-Shot Learning for Network Biology
Format: Live-stream

Moderator(s): Tijana Milenkovic

  • Marinka Zitnik, Harvard Medical School, United States

Presentation Overview:

Prevailing methods for learning on biological networks require abundant label information. However, labeled examples are scarce at the frontiers of network biology, considerably limiting the methods' use for problems that require reasoning about new phenomena, such as novel drugs in development, emerging pathogens, and patients with rare diseases. In this talk, I will describe algorithms that enable few-shot learning for network biology. At the core is the notion of local subgraphs that transfer information from one learning task to another, even when each task has only a handful of labeled examples. This principle is theoretically justified, as we show that the evidence for a prediction can be found in the local subgraph surrounding target nodes or edges. I will illustrate few-shot learning methods on two problems: modeling ultra-high-order drug combinations and studying protein interactions across 1,840 species.



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176