Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

Posters - Schedules

Posters at ISMB/ECCB 2021 will be presented virtually. Authors will pre-record a poster talk (5-7 minutes) and upload it to the virtual conference platform, along with a PDF of their poster, beginning July 19 and no later than July 23. All registered conference participants will have access to the posters and presentations throughout the conference, and the content will remain available until October 31, 2021. Q&A is available through a chat function, and poster presenters can schedule small group discussions with up to 15 delegates during the conference.

Information on preparing your poster and poster talk is available at: https://www.iscb.org/ismbeccb2021-general/presenterinfo#posters

Ideally authors should be available for interactive chat during the times noted below:

Session A: Sunday, July 25 between 15:20 - 16:20 UTC
Session B: Monday, July 26 between 15:20 - 16:20 UTC
Session C: Tuesday, July 27 between 15:20 - 16:20 UTC
Session D: Wednesday, July 28 between 15:20 - 16:20 UTC
Session E: Thursday, July 29 between 15:20 - 16:20 UTC

A methodology for predicting tissue-specific metabolic roles of receptors applied to subcutaneous adipose
COSI: RegSys
  • Gur Arieh Yehudaa, University of Haifa, Israel
  • Judith Somekh, University of Haifa, Israel

Short Abstract: The human biological system uses ‘inter-organ’ communication to achieve a state of homeostasis. This communication occurs through the response of receptors, located on target organs, to the binding of secreted ligands from source organs. Despite years of research, the roles these receptors play in tissues are only partially understood. This work presents a new methodology based on the enrichment analysis scores of co-expression networks fed into support vector machines (SVMs) and k-NN classifiers to predict the tissue-specific metabolic roles of receptors. The approach is primarily based on the detection of coordination patterns of receptor expression. These patterns and the enrichment analysis scores of their co-expression networks were used to analyse ~700 receptors and predict metabolic roles of receptors in subcutaneous adipose. To facilitate supervised learning, a list of known metabolic and non-metabolic receptors was constructed using a semi-supervised approach following literature-based verification. Our approach confirms that pathway enrichment scores are good signatures for correctly classifying the metabolic receptors in adipose. We also show that the k-NN method outperforms the SVM method in classifying metabolic receptors. Finally, we predict novel metabolic roles of receptors. These predictions can enhance biological understanding and the development of new receptor-targeting metabolic drugs.
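The k-NN classification step mentioned in this abstract can be illustrated with a minimal sketch. The feature vectors, labels, and the choice of k below are hypothetical and are not taken from the poster; the sketch only shows how a receptor's enrichment-score vector could be classified by majority vote among its nearest labelled neighbours.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labelled training vector
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical enrichment-score vectors (one row per receptor, one column per pathway)
train_X = [[0.9, 0.8], [0.8, 0.7], [0.85, 0.9],
           [0.1, 0.2], [0.2, 0.1], [0.15, 0.25]]
train_y = ["metabolic", "metabolic", "metabolic",
           "non-metabolic", "non-metabolic", "non-metabolic"]

print(knn_predict(train_X, train_y, [0.75, 0.8]))
```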

A Statistical Framework for Single-Molecule Transcription Factor Footprinting
COSI: RegSys
  • Danilo Dubocanin, Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, United States
  • Daniela Witten, Department of Statistics, U. of Washington, Seattle, WA; Department of Biostatistics, U. of Washington, Seattle, WA, United States
  • Andrew Stergachis, Division of Medical Genetics, Dept. of Medicine, U. of Washington, Seattle, WA; Brotman Baty Institute, Seattle, WA, United States

Short Abstract: Transcription factor (TF) binding events and the regulatory networks they compose form the foundations of gene regulation. Currently, our knowledge of TF binding is based on short-read data, which cannot delineate TF-DNA interactions in hard-to-map genomic regions or expose the combinatorial dynamics of TF binding along individual chromatin fibers. Here we demonstrate an approach for footprinting individual TF binding events along multi-kilobase single-molecule chromatin fibers. We use a non-specific N6-methyladenine-methyltransferase (m6A-MTase) to methylate all adenine residues in DNA that are not protein-bound, and then perform PacBio sequencing. m6A-modified bases are identified using a Gaussian Mixture Model trained on polymerase kinetics obtained during single-molecule sequencing, and single-molecule nucleosome footprints are identified using a Hidden Markov Model. To identify individual single-molecule TF binding events within nucleosome-free regions (NFRs), we developed a statistical framework that adjusts for the intrinsic methylation preferences of the m6A-MTase, as well as the local density of m6A-modified bases. We applied this for the de novo discovery of single-molecule TF footprints, as well as for the identification of single-molecule TF binding events at a priori defined TF elements. This method delineates the combinatorial dynamics of TF binding along multi-kilobase single-molecule DNA fibers at unprecedented resolutions.
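As a rough illustration of how a Hidden Markov Model can call protected footprints from per-adenine methylation calls, the sketch below runs Viterbi decoding on a two-state model. The state names, transition and emission probabilities, and the toy fiber are all assumptions made for illustration; they are not the parameters or the implementation used by the authors.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a sequence of discrete observations."""
    # Work in log space to avoid numerical underflow on long fibers
    score = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            cur[s] = prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][o])
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical model: 1 = adenine called m6A-modified, 0 = unmodified
states = ("open", "protected")
start_p = {"open": 0.5, "protected": 0.5}
trans_p = {"open": {"open": 0.9, "protected": 0.1},
           "protected": {"open": 0.1, "protected": 0.9}}
emit_p = {"open": {1: 0.8, 0: 0.2},
          "protected": {1: 0.05, 0: 0.95}}

fiber = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
print(viterbi(fiber, states, start_p, trans_p, emit_p))
```

Isolated unmethylated adenines inside an accessible stretch are tolerated because switching states carries a cost, whereas a long unmethylated run is more cheaply explained by the protected state.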

A Stochastic Dynamical System Identifies Genes With Variable Transcription Factor Binding Activities
COSI: RegSys
  • James Brunner, Mayo Clinic, United States
  • Jacob Kim, Columbia University, United States
  • Timothy Downinig, University of California Irvine, United States
  • Eric Mjolsness, University of California Irvine, United States
  • Kord Kober, University of California San Francisco, United States

Short Abstract: Understanding the role of epigenetic factors in the regulation of gene expression is a fundamental question of molecular biology. Here, we present a novel stochastic dynamical system to predict gene expression levels and transcription factor (TF) binding activities across a gene regulatory network (GRN) using gene expression and epigenetic data.

A GRN was defined as a bipartite network with TF-to-binding-site (i.e., promoter region) and binding-site-to-target edges. Edges were defined using high confidence public data. The GRN was created from genes in the KEGG Breast Cancer Pathway (hsa05224) and their TFs. The model was trained using RNA-seq and methylation data from solid tumors of 312 patients diagnosed with breast cancer.

The final GRN included 106 genes and 122 binding sites. TF binding activities were predicted across all genes and samples. Our model identified variation in the binding activity of TFs (i.e., E2F1 and E2F3) that regulate the gene AURKA. Aurora kinases are serine/threonine kinases and AURKA is amplified in breast cancer cells. E2F1 and E2F3 are transcriptional activators and are overexpressed in patients with breast cancer. Our novel method identified genes that have previously been identified as promising targets in cancer therapy.

Assessing the predictive power of Hi-C interactions across species
COSI: RegSys
  • Brittany Baur, University of Wisconsin-Madison, United States
  • Tiegh Taylor, University of Toronto, Canada
  • Da-Inn Lee, University of Wisconsin-Madison, United States
  • Gurdeep Singh, University of Toronto, Canada
  • Liangxi Wang, University of Toronto, Canada
  • Kumaragurubaran Rathnakumar, University of Toronto, Canada
  • Michael Wilson, University of Toronto, Canada
  • Jennifer Mitchell, University of Toronto, Canada
  • Sushmita Roy, University of Wisconsin-Madison, United States

Short Abstract: Three-dimensional organization of the genome has emerged as an important layer of gene regulation, wherein regulatory elements such as enhancers regulate a gene megabases away. An open question is the extent to which such interactions and their associated mechanisms are conserved across species. We previously identified conserved and diverged enhancers based on transcription factor binding and H3K27ac signal in human and mouse embryonic stem cells (ESCs). Here, we examine the role of enhancer conservation and 3D genome organization by training enhancer-centric random forest models based on one-dimensional data and high resolution Hi-C counts and testing the ability to predict interactions across species for these enhancers. Cross-species Hi-C predictions agree better with measured counts for conserved enhancers compared to enhancers that are species-specific, indicating the presence of shared regulatory mechanisms governing the interactions. Target genes interacting with conserved and diverged enhancers are involved in similar developmental processes, suggesting cell state is established by a combination of shared and species-specific mechanisms. Taken together, our results suggest that exploiting conserved signals could improve our ability to predict Hi-C counts and help gain insight into the mechanisms of long-range regulation.

Characterising the Cis-Regulatory Landscape during Human Plasma-Cell Differentiation: A Penalized Regression Approach
COSI: RegSys
  • Amber Emmett, University of Leeds, United Kingdom
  • David Westhead, University of Leeds, United Kingdom

Short Abstract: Gene regulation is complex: an ensemble of transcription factors and cis-regulatory elements is recruited to each gene, exercising tight control on its expression. In silico enhancer prediction typically focuses on pairwise enhancer-promoter interactions, overlooking the complexities of the regulatory landscape. Our method looks beyond this binary association to identify communities of cis-regulatory elements which, bound by common transcription factors, act in unison to regulate transcription. Employing a statistical approach hinged on community detection and LASSO regression, we construct gene-specific models that identify cis-regulatory element communities from ATAC-seq and RNA-seq data. Applying our method to datasets spanning the transition from B cell to plasma cell, we identify cis-regulatory elements whose dynamic activity drives human plasma cell differentiation. We validate our predictions with support from chromosomal contacts and disease-linked polymorphisms.

Chromatin Signatures and their Role in Transcriptional Elongation Control
COSI: RegSys
  • Toray Akcan, Helmholtz Research Center Munich, Germany

Short Abstract: Promoter-proximal Polymerase II (POLII) pausing is recognised as a hallmark of protein-coding genes and represents a key control mechanism of gene expression. However, we lack a quantitative description of associated factors with the potential to reveal the relative importance of implicated factors and to identify previously unknown regulators of pausing. Here we present machine learning models that predict promoter-proximal POLII pausing in human cell lines from chromatin signatures of protein binding events and DNA sequence features with high accuracy (Pearson’s rho 0.83, R2 0.69). Cross-validation with data from independent cell lines reveals cell-line-agnostic relations and generalisability to other systems. Harnessing the predictive model structures, we deconvolve the pausing signal and contrast the transcriptional pause and elongation mechanisms and their associated factors. Thereby we identify known and novel pausing-related factors and further suggest novel 7SK ncRNA pause mediator complex binding factors. Pathway-level analysis of predictive factors implicates multiple RNA processing mechanisms as intricately connected to the pause mechanism. Taken together, we have built a model of transcriptional pausing to identify known and novel pausing factors, and quantified their contributions in different RNA regulatory processes in the context of transcriptional pausing.
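The two accuracy figures quoted in this abstract are consistent with each other if R2 is the squared Pearson correlation between predicted and measured values, since 0.83 squared is approximately 0.69. That reading is an assumption; the abstract does not say how R2 was computed. The sketch below, using made-up numbers, checks that the identity holds for a least-squares line fitted to any data set.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared_linear(x, y):
    """Coefficient of determination of the least-squares line y ~ slope*x + intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical predicted vs. measured pausing indices (illustrative values only)
predicted = [0.2, 0.5, 0.9, 1.4, 1.8, 2.5]
measured = [0.1, 0.7, 0.8, 1.9, 1.5, 2.4]

r = pearson_r(predicted, measured)
print(abs(r ** 2 - r_squared_linear(predicted, measured)) < 1e-9)
print(round(0.83 ** 2, 2))
```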

Combining in vitro quantification with in vivo detection of protein-DNA interactions reveals subtle signals in gene regulatory networks
COSI: RegSys
  • Yuning Zhang, Center for Genomic and Computational Biology, Duke University, United States
  • Raluca Gordan, Duke University, United States

Short Abstract: Transcription factors (TFs) bind specific sites across the genome to regulate gene expression. To understand this regulation, genomic binding sites of TFs are oftentimes profiled using in vivo assays such as ChIP-seq (which captures binding in the cell but is prone to technical biases), as well as in vitro assays such as protein binding microarray or PBM (which yields quantitative binding measurements but lacks cellular context).

We show that using quantitative in vitro measurements to interpret ChIP-seq data can help us uncover subtle signals in genomic targeting by TFs. First, to understand the relationship between in vitro and in vivo DNA-binding data, we simulated the generation of ChIP-seq reads based on PBM-derived binding probabilities combined with chromatin accessibility. We found that in vitro binding specificities are largely retained in the cell, although noise is introduced by the ChIP-seq experimental steps.

Next, we focused on two model systems of competitive binding by paralogous TFs. With prior knowledge from in vitro binding data, we uncovered in vivo evidence that TF paralogs compete for DNA-binding in a fine-tuned fashion driven by subtle differences in specificity. This finding supports the strategy of combining in vitro quantification with in vivo profiling to understand TF-driven regulation.
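The read-simulation idea in this abstract can be sketched in a few lines. The sketch assumes that the chance of a read originating from a site is proportional to the product of its in vitro binding probability and its chromatin accessibility; that combination rule, the site names, and all numbers are hypothetical and are not taken from the poster.

```python
import random
from collections import Counter

def simulate_reads(sites, n_reads, seed=0):
    """Sample ChIP-seq-like read counts per site from binding and accessibility weights."""
    rng = random.Random(seed)
    names = list(sites)
    # Assumed sampling weight: in vitro binding probability times accessibility
    weights = [sites[s]["p_bind"] * sites[s]["accessibility"] for s in names]
    return Counter(rng.choices(names, weights=weights, k=n_reads))

# Hypothetical genomic sites
sites = {
    "strong_open": {"p_bind": 0.9, "accessibility": 1.0},
    "weak_open": {"p_bind": 0.3, "accessibility": 0.5},
    "strong_closed": {"p_bind": 0.9, "accessibility": 0.0},
}

counts = simulate_reads(sites, n_reads=1000)
for name in sites:
    print(name, counts[name])
```

Under this assumption a high-affinity site in closed chromatin yields no reads, which is one way in vivo profiles can diverge from in vitro specificity.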

Deep learning integration explains dynamic control of mouse brain development
COSI: RegSys
  • Ariane Mora, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia QLD 4072, Australia
  • Jonathan Rakar, Department of Clinical and Experimental Medicine, Linkoping University, SE-58185, Linkoping, Sweden
  • Ignacio Monedero Cobeta, Universidad Autonoma de Madrid, Madrid, Spain
  • Behzad Yaghmaeian Salmani, Department of Clinical and Experimental Medicine, Linkoping University, SE-58185, Linkoping, Sweden
  • Annika Starkenberg, Department of Clinical and Experimental Medicine, Linkoping University, SE-58185, Linkoping, Sweden
  • Stefan Thor, School of Biomedical Sciences, University of Queensland, St Lucia QLD 4072, Australia
  • Mikael Bodén, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia QLD 4072, Australia

Short Abstract: Central nervous system (CNS) growth is driven by tissue- and time-specific gene expression patterns along the brain to spinal cord (A-P) axis. These patterns are in part controlled by dynamic repressive signals applied by the epigenetic machinery Polycomb Repressive Complex 2 (PRC2). PRC2 inactivation results in dramatic truncation of brain growth, while leaving the spinal cord comparatively unaffected. However, determining the precise role of PRC2 during CNS development is particularly challenging owing to cascading regulatory effects.
Variational autoencoders (VAEs) are machine learning methods amenable to pattern extraction by way of interrogating the learnt latent distributions. To capture both simple and cascading effects of PRC2 on brain growth, we produced a transcriptomic dataset of the developing mouse A-P axis for wild-type (WT) and PRC2 mutant. To identify the trigger of dysregulation, we integrated our transcriptome data with WT epigenetic data using a VAE, extracting both direct and indirect patterns. We disentangled drivers underpinning the undergrowth phenotype, finding that 1) epigenetic machinery intersects non-trivially with the cell cycle and 2) PRC2 gating is selective and evolutionarily conserved. Our VAE analysis highlights the power of integration and shows how latent parametric distributions can be used to interpret system-wide spatio-temporal relationships.

Design and power analysis for multi-sample single cell transcriptomics experiments
COSI: RegSys
  • Katharina T. Schmid, Institute of Computational Biology, Helmholtz Zentrum München, Germany
  • Cristiana Cruceanu, Department of Translational Research, Max Planck Institute for Psychiatry, Munich, Germany
  • Anika Böttcher, Institute of Diabetes and Regeneration Research, Helmholtz Diabetes Center, Helmholtz Zentrum München, Germany
  • Heiko Lickert, Institute of Diabetes and Regeneration Research, Helmholtz Diabetes Center, Helmholtz Zentrum München, Germany
  • Elisabeth B. Binder, Department of Translational Research, Max Planck Institute for Psychiatry, Munich, Germany
  • Fabian J. Theis, Institute of Computational Biology, Helmholtz Zentrum München, Germany
  • Matthias Heinig, Institute of Computational Biology, Helmholtz Zentrum München, Germany

Short Abstract: Single cell RNA-seq has revolutionized transcriptomics by providing cell type resolution for interindividual differential gene expression and expression quantitative trait loci analyses. However, efficient power analysis methods accounting for the characteristics of single cell data and interindividual comparison are missing.
We present a statistical framework for design and power analysis of multi-sample single cell transcriptomics experiments. The model relates sample size, number of cells per individual and sequencing depth to the power of detecting differentially expressed genes and expression quantitative trait loci genes within cell types. The overall power is decomposed into the probability of detecting the expression in sparse single cell RNA-seq and the power of the statistical test.
The power estimated by our model agreed with simulation-based methods while requiring drastically less runtime and memory. It thus enables fast systematic comparison of alternative experimental designs and optimization for a limited budget. We evaluated data-driven priors for a range of applications and single cell platforms. In many settings, shallow sequencing of high numbers of cells leads to higher overall power than deep sequencing of fewer cells.
The model, including priors, is implemented as an R package, scPower, which is available on GitHub and accessible as a web tool.

DNA methylation alterations in de novo and therapy-related AML in several genomic regions
COSI: RegSys
  • Agnieszka Cecotka, Silesian University of Technology, Poland
  • Grainne O'Brien, Public Health England, United Kingdom
  • Christophe Badie, Public Health England, United Kingdom
  • Joanna Polanska, The Silesian University of Technology, Poland

Short Abstract: DNA methylation, occurring at CpG sites of the genome, is an epigenetic process that controls transcription. It has the most impact on gene expression in CpG-rich gene promoters. The role of DNA methylation in other genomic regions has not yet been established. Changes in DNA methylation play a crucial role in AML development. However, prognosis and drug response are worse for t-AML, hence AMLs can differ in methylation alterations.
The analyzed data come from 16 people: 5 healthy donors, 5 de novo AML patients, and 6 therapy-related AML (t-AML) patients. Data were obtained with the Illumina Infinium MethylationEPIC array and contain β-values (from 0 to 1) for 841,323 CpG sites across the whole genome. According to Illumina annotations, CpG sites are assigned to CpG-rich gene promoters, gene bodies, or 3'UTR regions.
Methylation levels among healthy donors, de novo AML patients, and t-AML patients were compared for each CpG site. For both AMLs, more sites are hypermethylated than hypomethylated (compared with controls). However, in t-AML, this effect is stronger, especially in promoter regions. Around 33% of CpG sites in promoter regions, 25% in gene body regions, and 23% in 3'UTR regions are differentially methylated between de novo and t-AML.
Financed by European Social Fund POWR.03.02.00-00-I029 (AC).
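For readers unfamiliar with the β-value mentioned in this abstract, it is commonly computed from the methylated and unmethylated probe intensities of a CpG site as M / (M + U + offset), with an offset of 100 being the usual convention for Illumina arrays. The sketch below applies that formula to made-up intensities and takes a simple difference of group means; the comparison is illustrative only and is not the statistical test used in the poster.

```python
def beta_value(meth, unmeth, offset=100):
    """Beta-value from methylated and unmethylated probe intensities."""
    # Negative background-corrected intensities are clipped to zero
    meth, unmeth = max(meth, 0), max(unmeth, 0)
    return meth / (meth + unmeth + offset)

def mean(values):
    return sum(values) / len(values)

# Hypothetical (methylated, unmethylated) intensities for one CpG site
de_novo = [(4000, 1200), (3800, 1500), (4200, 1100)]
t_aml = [(6000, 500), (5800, 700), (6100, 400)]

delta = (mean([beta_value(m, u) for m, u in t_aml])
         - mean([beta_value(m, u) for m, u in de_novo]))
print(f"delta beta (t-AML minus de novo): {delta:.3f}")
```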

Domain adaptive neural networks improve cross-species prediction of transcription factor binding
COSI: RegSys
  • Kelly Cochran, Stanford University, United States
  • Divyanshi Srivastava, The Pennsylvania State University, United States
  • Avanti Shrikumar, Stanford University, United States
  • Akshay Balsubramani, Stanford University, United States
  • Anshul Kundaje, Stanford University, United States
  • Shaun Mahony, The Pennsylvania State University, United States

Short Abstract: The intrinsic DNA sequence preferences and cell-type specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell-type specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species-specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results demonstrate that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.

Elucidation of Dynamic MicroRNA Regulations in Cancer Progression Using Integrative Machine Learning
COSI: RegSys
  • Juan Cui, University of Nebraska - Lincoln, United States
  • Haluk Dogan, University of Nebraska - Lincoln, United States

Short Abstract: Empowered by advanced genomics discovery tools, modern biomedical research has produced a massive amount of data on (post-)transcriptional regulations related to transcription factors, microRNAs, lncRNAs, epigenetic modifications, and genetic variations. Using these data, computational modeling has successfully generated promising testable quantitative models to represent complex interplay among different regulatory mechanisms. However, given the complex interactome in systems as chaotic as cancers and the dramatic growth of heterogeneous data in this field, such promise has encountered unprecedented challenges in model complexity. Here we introduce a new integrative machine learning method to infer multifaceted gene regulations in cancers, along with new strategies for data integration and graphical model fusion. With a particular interest in microRNA, a supervised deep learning model was integrated to identify conditional miRNA-mRNA interactions.
In a breast cancer case study, we have identified distinct gene regulatory networks associated with progressive stages. The subsequent functional analysis has revealed microRNA-mediated dysregulation in major cancer hallmarks, as well as novel pathological signaling and metabolic processes, which shed light on microRNAs’ regulatory roles in breast cancer progression. We believe this integrative model can be a robust and effective discovery tool to understand key regulatory characteristics in complex biological systems. Availability: sbbi-panda.unl.edu/pin/.

Epigenetic alterations at distal enhancers are linked to proliferation in human breast cancer
COSI: RegSys
  • Jørgen Ankill, Oslo University Hospital, Norway
  • Miriam Ragle Aure, University of Oslo, Norway
  • Sunniva Bjørklund, University of Oslo, Norway
  • Severin Langberg, Cancer registry of Norway, Norway
  • Vessela N. Kristensen, University of Oslo, Norway
  • Valeria Vitelli, University of Oslo, Norway
  • Xavier Tekpli, University of Oslo, Norway
  • Thomas Fleischer, Oslo University Hospital, Norway

Short Abstract: Aberrant DNA methylation is an early event in breast carcinogenesis and plays a critical role in regulating gene expression. We performed genome-wide expression-methylation Quantitative Trait Loci (emQTL) analysis integrating DNA methylation and gene expression to identify disease-driving pathways under epigenetic control. By grouping the emQTLs using biclustering, we identify associations representing important biological processes associated with breast cancer pathogenesis, including regulation of proliferation and tumor-infiltrating fibroblasts.

We report genome-wide loss of enhancer methylation at binding sites of proliferation-driving transcription factors including CEBP-β, FOSL1, and FOSL2 with concomitant high expression of proliferation-related genes in aggressive breast tumors as confirmed by single-cell RNA-seq. The identified emQTL-CpGs and genes were found connected through chromatin loops, indicating that proliferation in breast tumors is under epigenetic regulation by DNA methylation. Interestingly, the associations between enhancer methylation and proliferation-related gene expression were observed also within known subtypes of breast cancer, suggesting a universal role of epigenetic regulation of proliferation. Taken together, we show that proliferation in breast cancer is linked to loss of methylation at specific enhancers and transcription factor binding mediated through chromatin loops.

Evaluating the predictive power of enhancer-mediated cell-type specific gene regulatory networks
COSI: RegSys
  • Aryan Kamal, EMBL, Germany
  • Christian Arnold, EMBL, Germany
  • Annique Claringbould, EMBL, Germany
  • Sophia Müller-Dott, EMBL, Germany
  • Neha Daga, EMBL, Germany
  • Olga Sigalova, EMBL, Germany
  • Maksim Kholmatov, EMBL, Germany
  • Lixia He, University Hospital Heidelberg, Germany
  • Caroline Pabst, University Hospital Heidelberg, Germany
  • Judith Zaugg, EMBL, Germany

Short Abstract: Disease-associated genetic variants often lie in non-coding regions, and likely have a regulatory role. To understand their effects it is crucial to identify genes that are modulated by specific regulatory elements (e.g. enhancers). However, these regulatory elements are also modulated by the activity of transcription factors (TFs) that regulate them, often in a cell-type specific manner. Thus, regulatory elements integrate genetic, epigenetic, and TF-mediated cellular signals. TFs, regulatory elements, and their target genes form a gene regulatory network (GRN) that comprises cell-type specific links. Many methods exist for reconstructing GRNs from high-throughput expression or chromatin data. However, most methods either lack regulatory elements (i.e. connect TFs directly to genes), or focus on enhancer-gene links (i.e. lacking TFs), and thus are unable to integrate the impact of genetic variants and TFs simultaneously. Another important bottleneck for GRN reconstruction is the lack of a framework to globally evaluate their performance. To address these challenges, we (1) present a method for reconstructing cell-type specific GRNs that integrate enhancers and TFs, (2) propose a general framework for evaluating GRNs based on their cell-type specific predictive power, and (3) apply our approach to understand the response to infection in macrophages.

Exploiting marker genes for robust classification and characterization of single-cell chromatin accessibility
COSI: RegSys
  • Risa Kawaguchi, Cold Spring Harbor Laboratory, United States
  • Ziqi Tang, Cold Spring Harbor Laboratory, United States
  • Stephan Fischer, Cold Spring Harbor Laboratory, United States
  • Rohit Tripathy, Cold Spring Harbor Laboratory, United States
  • Peter K Koo, Cold Spring Harbor Laboratory, United States
  • Jesse Gillis, Cold Spring Harbor Laboratory, United States

Short Abstract: Single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) measures genome-wide chromatin accessibility for the discovery of cell-type specific regulatory networks. However, because of the stochastic lack of marker gene activities, cell type identification by scATAC-seq remains difficult even at a cluster level.
In this study, we exploit reference knowledge to define existing cell types and uncover cell-type specific epigenetic regulation using 7 mouse brain scATAC-seq datasets, including a reference atlas generated by the BRAIN Initiative Cell Census Network (BICCN). By comparing the area under the receiver operating characteristic curve (AUROC), we find that cell-typing performance by single markers is highly variable. However, signal aggregation over a marker gene set optimized via multiple scRNA-seq datasets achieves the highest cell-typing performance among the gene sets compared.
We demonstrate the applicability of meta-analytic marker sets in a comprehensive assessment of cell typing with supervised learning methods. Moreover, a deep neural network trained to predict chromatin accessibility in each subtype from DNA sequence identifies key motifs enriched around robust gene sets for each neuronal subtype. Our results strongly support the value of robust marker gene selection as a feature selection tool and for cross-dataset comparison between scATAC-seq datasets.
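The contrast this abstract draws between single markers and aggregated marker sets can be shown with a small AUROC calculation. The sketch computes AUROC as the fraction of (positive, negative) cell pairs ranked correctly, counting ties as half. The accessibility matrix is a toy example constructed so that each marker alone is sparse and noisy while their sum separates the cell types; it is not data from the poster.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive cell outscores a negative cell."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    # Ties between a positive and a negative cell count as half a win
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical binarised gene-activity scores for six cells (first three are the target type)
labels = [1, 1, 1, 0, 0, 0]
markers = {
    "marker_A": [1, 0, 1, 0, 1, 0],
    "marker_B": [0, 1, 1, 0, 0, 1],
    "marker_C": [1, 1, 0, 1, 0, 0],
}

for name, scores in markers.items():
    print(name, round(auroc(scores, labels), 3))

# Aggregate the marker set by summing per-cell signal across markers
aggregated = [sum(cell) for cell in zip(*markers.values())]
print("aggregated", round(auroc(aggregated, labels), 3))
```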

Exploiting RNA velocity to increase the resolution of genotype-phenotype association maps
COSI: RegSys
  • Rodrigo Gonzalo Parra, European Molecular Biology Laboratory, Heidelberg and German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Marc Jan Bonder, European Molecular Biology Laboratory, Heidelberg and German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Julia Rühle, German Cancer Research Center (DKFZ), Heidelberg, Germany

Short Abstract: Human induced pluripotent stem cells (iPSC) harbor great potential for investigating cell fate decisions in early developmental processes. However, to fully exploit their potential, a deep understanding of the mechanisms underlying cellular transitions and how they are influenced by genetic variation is required. In this study, we use single-cell RNA-sequencing data obtained from human iPSC lines of ~100 donors that were induced towards endoderm differentiation and assayed at four time points [1]. Previous work [1] investigating the genetic regulatory landscape of these cell types identified hundreds of genes with at least one expression quantitative trait locus (eQTL). Moreover, these loci were shown to dynamically impact expression in a cellular-state-dependent manner.

To further characterize the dynamics of these genotype-phenotype associations, we extend the existing association maps by looking at RNA velocity QTL (vQTL). Since RNA velocity can predict the rate and direction of gene expression changes at future times, we exploit it to add a more dynamic dimension to the study of QTLs. With this, we increase the resolution of phenotype-genotype association maps, as we not only link gene expression with genetic variation, but also its differential kinetics.


[1] Cuomo et al. (2020) Nat Commun.

FPseg: Fast Poisson segmentation for nucleotide-resolution genomic annotation
COSI: RegSys
  • Ali Tugrul Balci, University of Pittsburgh, United States
  • Maria Chikina, University of Pittsburgh, United States

Short Abstract: Automating the analysis of multi-sample/multi-assay epigenetic data remains a challenge due to the large and noisy data sets. One class of tools that greatly facilitates this process is genome segmentation and annotation algorithms (GSAs), which partition a genome into segments that exhibit similar epigenetic characteristics. Several GSAs have been proposed, such as Dynamic Bayesian Networks (DBNs), with Hidden Markov Models (HMMs) as a special case. However, despite high-profile applications on large consortia datasets, GSAs have not been widely adopted, and existing implementations have several limitations. To address these issues, we propose a method that takes a hybrid approach by combining Poisson L0 segmentation with Gaussian mixture models. The method uses Poisson loss and can operate directly on count data. Aside from the improved scalability and ease of use, we demonstrate that our method outperforms Segway, the state-of-the-art DBN, on several metrics such as the maximal fold enrichment.
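To make the Poisson L0 segmentation objective concrete, the sketch below minimises the Poisson negative log-likelihood of piecewise-constant segments plus a fixed penalty per segment, using a naive quadratic-time dynamic program. This is not the FPseg algorithm, which is described as fast and is presumably far more efficient; the penalty value and the toy counts are assumptions, and the Gaussian mixture step is not shown.

```python
import math

def poisson_cost(total, length):
    """Poisson negative log-likelihood of a segment at its MLE mean (constants dropped)."""
    if total == 0:
        return 0.0
    return total - total * math.log(total / length)

def segment(counts, penalty):
    """Optimal piecewise-constant Poisson segmentation with an L0 (per-segment) penalty."""
    n = len(counts)
    prefix = [0]
    for c in counts:
        prefix.append(prefix[-1] + c)
    best = [0.0] + [math.inf] * n   # best[t]: optimal cost of counts[:t]
    prev = [0] * (n + 1)            # prev[t]: start index of the last segment
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + poisson_cost(prefix[t] - prefix[s], t - s) + penalty
            if cost < best[t]:
                best[t], prev[t] = cost, s
    # Backtrack to recover half-open segments [start, end)
    bounds, t = [], n
    while t > 0:
        bounds.append((prev[t], t))
        t = prev[t]
    return bounds[::-1]

# Hypothetical read counts: low background, an enriched block, low background
counts = [1, 0, 2, 1, 1, 20, 22, 19, 21, 0, 1, 1, 0]
print(segment(counts, penalty=3.0))
```

Raising the penalty merges segments, and lowering it splits them, which is the trade-off that the L0 term controls.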

Functional Transcription Factor Target Networks Illuminate Control of Epithelial Remodelling
COSI: RegSys
  • Ian Overton, Queen's University Belfast, United Kingdom
  • Andrew Sims, University of Edinburgh, United Kingdom
  • Jeremy Owen, Harvard University, United States
  • Bret Heale, University of Edinburgh, United Kingdom
  • Matthew Ford, University of Edinburgh, United Kingdom
  • Alex Lubbock, University of Edinburgh, United Kingdom
  • Erola Pairo-Castineira, University of Edinburgh, United Kingdom
  • Abdelkader Essafi, University of Edinburgh, United Kingdom

Short Abstract: Cell identity is governed by gene expression, regulated by Transcription Factor (TF) binding at cis-regulatory modules. Decoding the relationship between TF binding patterns and gene regulation is nontrivial, remaining a fundamental limitation in understanding cell decision-making. We developed the NetNC software to predict functionally active regulation of TF targets, and demonstrated it on nine datasets for the TFs Snail, Twist and modENCODE Highly Occupied Target (HOT) regions. Snail and Twist are canonical drivers of Epithelial to Mesenchymal Transition (EMT), a cell programme important in development, tumour progression and fibrosis. Predicted ‘neutral’ (non-functional) TF binding always accounted for the majority (50% to 95%) of candidate target genes from statistically significant peaks, and HOT regions had higher functional binding than most of the Snail and Twist datasets examined. Our results illuminated conserved gene networks that control epithelial plasticity in development and disease. We identified new gene functions and network modules including crosstalk with Notch signalling and regulation of chromatin organisation, evidencing networks that reshape Waddington’s epigenetic landscape during epithelial remodelling. Expression of orthologous functional TF targets discriminated breast cancer molecular subtypes and predicted novel tumour biology, with implications for precision medicine. Predicted invasion roles were validated using a tractable cell model, supporting our approach.

Reference: Overton et al. (2020) Cancers 12, 2823; doi.org/10.3390/cancers12102823

Gene expression variation in the context of chromatin organization
COSI: RegSys
  • Patrycja Rosa, Faculty of Mathematics, Informatics and Mechanics University of Warsaw, Poland
  • Aleksander Jankowski, Faculty of Mathematics, Informatics and Mechanics University of Warsaw, Poland

Short Abstract: Gene expression varies between different cell types of an organism. Genes with high or low variation in their expression can be identified by aggregating expression data from multiple cell types. We hypothesize that the variation of gene expression could be associated with the gene position within the 3D organization of the genome. While this organization spans multiple scales, the central role is given to chromatin domains, in particular Topologically Associating Domains (TADs). It has been observed that TADs can regulate gene expression by limiting the scope of enhancer-promoter interactions to each TAD.
We analysed single-cell expression data from published sources using the Seurat library. In particular, we used single-cell RNA-seq data from the ventral nerve cord of the fruit fly Drosophila melanogaster, comprising 26 thousand cells and 120 cell clusters. We used these clusters as individual cell types, noting that they could be differentiated by expression of specific marker genes. The variation of gene expression was quantified as the standard deviation of the averaged gene expression levels across cell types. We further integrated the expression data with chromatin organization information on TAD boundaries obtained from published Hi-C experiments. We show how the variation of gene expression changes with gene position within a TAD.
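
The variation measure described above can be sketched in a few lines. This is an illustrative example only, not the authors' Seurat workflow: the input layout (a dict of per-cell values per gene plus one cluster label per cell), the function name and the use of the population standard deviation are all assumptions made here.

```python
import statistics
from collections import defaultdict


def expression_variation(expression, clusters):
    """Standard deviation of per-cluster mean expression, for each gene.

    expression: dict mapping gene name -> list of per-cell expression values
    clusters:   list with one cluster label per cell, in the same cell order
    """
    variation = {}
    for gene, values in expression.items():
        by_cluster = defaultdict(list)
        for value, label in zip(values, clusters):
            by_cluster[label].append(value)
        # Average within each cluster first, then measure spread across clusters.
        cluster_means = [statistics.fmean(v) for v in by_cluster.values()]
        # Population SD is an assumption; the sample SD would work equally well.
        variation[gene] = statistics.pstdev(cluster_means)
    return variation


# Hypothetical toy data: six cells in three clusters, two genes.
cell_clusters = ["c1", "c1", "c2", "c2", "c3", "c3"]
toy_expression = {
    "stable_gene": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "variable_gene": [0.0, 0.0, 3.0, 3.0, 6.0, 6.0],
}
print(expression_variation(toy_expression, cell_clusters))
```

Averaging within clusters before taking the standard deviation means that cell-to-cell noise inside a cluster does not inflate the score; only differences between cell types contribute.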

Identification of tissue-specific and common methylation quantitative trait loci in healthy individuals using MAGAR
COSI: RegSys
  • Michael Scherer, Centre for Genomic Regulation, Spain
  • Gilles Gasparoni, Saarland University, Germany
  • Souad Rahmouni, University of Liège, Belgium
  • Tatiana Shashkova, Kurchatov genomics center of the Institute of Cytology and Genetics, Russia
  • Marion Arnoux, Saarland University, Germany
  • Edouard Louis, Liège University Hospital, Belgium
  • Arina Nostaeva, Novosibirsk State University, Russia
  • Diana Avalos, University of Geneva, Switzerland
  • Emmanouil T. Dermitzakis, University of Geneva, Switzerland
  • Yurii S. Aulchenko, Kurchatov genomics center of the Institute of Cytology and Genetics, Russia
  • Thomas Lengauer, Max Planck Institute for Informatics, Germany
  • Paul A. Lyons, University of Cambridge, United Kingdom
  • Michel Georges, University of Liège, Belgium
  • Jörn Walter, Saarland University, Germany

Short Abstract: Understanding the influence of genetic variants on DNA methylation is fundamental for the interpretation of epigenomic data in the context of disease. There is a need for computational approaches for identifying methylation quantitative trait loci (methQTL), and for discriminating general from cell-type-specific effects.

Here, we present a two-step computational framework (MAGAR, bioconductor.org/packages/devel/bioc/html/MAGAR.html), which supports the identification of methQTLs from matched genotyping and DNA methylation data, and which allows for the identification of cell-type-specific methQTLs through colocalization analysis. MAGAR performs data import in its first, and methQTL calling in its second stage. The second stage identifies CpG correlation blocks as regions of jointly regulated CpGs. Using linear models, MAGAR determines SNPs that are significantly associated with the DNA methylation state of the CpG correlation blocks.

We applied MAGAR to data from four tissues of healthy individuals and demonstrate the separation of common and cell-type-specific methQTLs. We computationally validate both types of methQTLs in an independent dataset comprising additional cell types and tissues. More shared than tissue-specific methQTLs were found, and cell-type-specific methQTLs were preferentially located in enhancer elements.

Our analysis demonstrates that a systematic analysis of methQTLs provides important new insights into the influence of genetic variants on cell-type-specific epigenomic variation.

Identification of transcription factor co-binding partners using non-negative matrix factorization
COSI: RegSys
  • Ieva Rauluseviciute, Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway, Norway
  • Timothée Launay, Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway, Norway
  • Jaime A Castro-Mondragon, Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway, Norway
  • Anthony Mathelier, Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway, Norway

Short Abstract: Transcription factor (TF) binding to DNA is key to transcription regulation. While the binding properties of many individual TFs are well known, there is limited understanding of how TFs interact with DNA cooperatively, either forming dimers or co-binding to the same region. Such combinatorial binding of TFs is important to cell differentiation, development, and responses to external stimuli. We present COBIND, a novel method based on non-negative matrix factorization (NMF) to automatically reveal co-appearing binding motifs and infer co-binding TF partners. Specifically, NMF is applied to one-hot encoded regions flanking direct TF-DNA interactions from UniBind, which are used as anchors to identify non-redundant co-appearing motifs at fixed distances. Using motif similarity and protein-protein interaction knowledge, COBIND culminates with the identification of co-binding TF partners and their underlying binding grammar. We applied COBIND to 6,237 TFBS datasets for 404 TFs in 7 species. The method uncovers well-known co-binding events (e.g. SOX2-POU5F1 and CTCF-ZBTB3) together with new co-binding configurations not yet reported in the literature. We show that co-binding configurations are usually recurrent within TF families and tend to be more evolutionarily conserved than individual binding sites for several TFs.
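
To make the idea of applying NMF to one-hot encoded flanking regions concrete, here is a minimal sketch under stated assumptions. It is not COBIND itself: the toy sequences are hypothetical, the factorization uses the standard Lee-Seung multiplicative updates in plain Python, and the helper names (`one_hot`, `nmf`, `frobenius_error`) are introduced only for this example.

```python
import random

ALPHABET = "ACGT"


def one_hot(seq):
    """Flatten a DNA sequence into a vector of 4 * len(seq) zeros and ones."""
    return [1.0 if base == letter else 0.0 for base in seq for letter in ALPHABET]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=500, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra)) ** 0.5


# Hypothetical flanking sequences containing two recurring patterns.
flanks = ["ACGT", "ACGT", "TTGA", "TTGA"]
V = [one_hot(s) for s in flanks]
W, H = nmf(V, rank=2)
print(round(frobenius_error(V, W, H), 3))
```

Each row of `H` can be reshaped into a position-by-nucleotide matrix and read as a motif-like pattern, while `W` indicates which regions carry that pattern.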

Identification of transcription factor cooperativity regulating expression from the mouse inactive X chromosome
COSI: RegSys
  • Yuvia Alhelí Pérez Rico, EMBL, Germany
  • Léna Clerquin, EMBL, Germany
  • Edith Heard, EMBL, Germany

Short Abstract: Transcription factors (TFs) cooperate to establish and maintain transcriptional programs during development. One of the hallmarks of mammalian development is the inactivation of one X chromosome in females, which ensures dosage compensation of X-linked genes between sexes. This process leads to the formation of an inactive X (Xi) chromosome mainly composed of heterochromatin; nevertheless, approximately 7% of genes are expressed from this chromosome in mouse. Here, we use functional genomics techniques to determine TFs binding to cis-regulatory elements in the Xi chromosome and investigate their interplay. Specifically, we generated TF footprints by integrating allele-specific ATAC-seq, H3K27ac CUT&RUN and RNA-seq data of female mouse neuronal progenitors. To determine possible differences in the regulation of genes in the Xi and active X chromosomes, we will select TFs with enriched footprints in the X chromosomes and generate their binding profiles. These data will enable the reconstruction of X-linked gene regulatory networks. In particular, two networks will be used to understand TF cooperativity by abolishing binding of different combinations of TFs by genome editing and degron systems. Thus, this project will help identify regulatory mechanisms of genes whose dysregulation is often associated with X-linked intellectual disability and sex-linked disorders.

Inferring TF activities and activity regulators from gene expression data with constraints from TF perturbation data
COSI: RegSys
  • Cynthia Ma, Washington University in St. Louis, United States
  • Michael R. Brent, wustl, United States

Short Abstract: The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Many methods of inferring TF activity from gene expression data have been described, but due to the lack of appropriate large-scale datasets, systematic and objective validation has not been possible until now.

Using a new dataset, we systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. This approach requires a TF network map to specify the target genes of each TF. We evaluate different methods of building the network map and deriving constraints on the matrices. We find that expression data in which the activities of individual TFs have been perturbed are both necessary and sufficient for obtaining good performance. Control strengths inferred using expression data from one growth condition are shown to carry over considerably to other conditions. Finally, we apply these methods to gain insight into the upstream factors that regulate the activities of four yeast TFs: Gcr2, Gln3, Gcn4, and Msn2.

Evaluation code and data available at github.com/BrentLab/TFA-evaluation

Integrated Transcriptomic and Epigenomic Analysis of Disparate Breast Cancer Cohorts
COSI: RegSys
  • George Acquaah-Mensah, Massachusetts College of Pharmacy & Health Sciences, United States
  • Boris Aguilar, Institute for Systems Biology, United States
  • Kawther Abdilleh, General Dynamics Information Technology, ISB-CGC, United States

Short Abstract: Among women, breast invasive carcinoma (BrCA) remains a leading cause of mortality. There are, however, disparities in biomolecular and clinical presentations, racial distribution, and incidences of aggressive types of breast cancer. To investigate these disparities, we examined data of BrCA samples from stage II patients aged 50 or younger who are black (B/AA50) or white (W50) as deposited in The Cancer Genome Atlas (TCGA). We combined transcriptomic and epigenomic data across multiple ISB-CGC Google BigQuery tables. We identified genes that have methylation states and gene expression that are significantly associated with the two cohorts. The AMARETTO package was used to identify activator and repressor driver genes in regulatory modules based on copy number variation, gene expression and DNA methylation data. We identified five modules across both cohorts. We also identified driver and target genes unique to the individual cohorts. Genes with suppressed expression in B/AA50 (relative to W50) include ZNF776,… and PPP1R12A. These have inverse relationships with DNA methylation. Likewise, genes with suppressed expression in W50 (relative to B/AA50) include OTUB1,… and HIST1H2AE; these also have inverse relationships with DNA methylation. Given their known functions, some of these genes play a role in the observed differences between the cohorts.

Integrative gene expression analysis reveals distinct sex-specific regulatory pathways in circulating monocytes associated with cardiovascular disease
COSI: RegSys
  • Chang Lu, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Netherlands
  • Marjo Donners, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Netherlands
  • Javier Perales-Patón, Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University Hospital, BioQuant, Germany
  • Rachel Cavill, Department of Data Science and Knowledge Engineering, Faculty of Science and Engineering, Maastricht University, Netherlands
  • Pieter Goossens, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Netherlands
  • Han Jin, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Netherlands
  • Joel Karel, Department of Data Science and Knowledge Engineering, Faculty of Science and Engineering, Maastricht University, Netherlands
  • Adriaan Kraaijeveld, Division of Cardiology, UMC Utrecht; Center for Translational Molecular Medicine(CTMM), Utrecht, Netherlands
  • Erik Biessen, Cardiovascular Research Institute Maastricht, Maastricht University; Center for Translational Molecular Medicine (CTMM), Netherlands

Short Abstract: It is as yet unclear to what extent the overt sex-related differences in cardiovascular disease (CVD) mirror differences in the transcriptional makeup of monocytic cells, which are instrumental in CVD progression. We therefore comprehensively dissected sex differences in gene expression profiles and regulatory context of monocytes obtained from a cohort of CVD versus healthy subjects.

We integrated the microarray data from the CTMM Circulating Cells cohort and a similarly designed but smaller monocyte cohort (GSE9820), and then generated two separate gene expression signatures (GES) for females (n=154) and males (n=342), comparing CVD versus healthy subjects. We inferred 14 dominant pathway activities for both female and male GES using PROGENy, a footprint-based pathway analysis method. We then identified the main transcription factors (TFs) driving these sex-specific GESs by analytic rank-based enrichment analysis (VIPER) using DoRothEA, a resource containing directional TF-target interactions. Finally, the regulatory network architectures for both GESs were extracted from a prior-knowledge-based PPI network (OmniPath), guided by the most dysregulated pathways and TFs using CARNIVAL.

Our study revealed PI3K (suppressed in male GES) and NFkB/TNFa (suppressed in female GES) as the most prominent pathways showing opposite activities in males versus females, which was corroborated via over-representation enrichment analysis.

Investigation of transcriptional regulation using single-cell multiomics
COSI: RegSys
  • Maksim Kholmatov, European Molecular Biology Laboratory (EMBL), Germany
  • Judith Zaugg, European Molecular Biology Laboratory (EMBL), Germany

Short Abstract: Regulatory networks governing gene expression can be partially inferred from different sources of information. Integration of these sources allows for a better network reconstruction. Transcription factor (TF) activity can be inferred either from its effect on chromatin accessibility around its binding sites or from the effect it has on expression of its targets (regulon). Here we analysed the Multiome dataset of Peripheral Blood Mononuclear Cells published by 10x Genomics, consisting of cells individually profiled with ATAC-seq and RNA-seq simultaneously. We calculated both types of activity measures, using chromVAR for chromatin accessibility and SCENIC for regulon expression, and used them to classify TFs as having either an activating or a repressive function. We show an example of a number of transcription factors involved in differentiation of Plasmacytoid Dendritic Cells (pDCs) that also exhibit high accessibility-based, but not regulon-based, activity in other cell populations that share common progenitors with pDCs. We demonstrate how discrepancies between different measures of TF activity can shed light on TF poising and how integration of different data modalities can help explain context-specific effects of gene regulation.

This work was supported by the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 860002.

Learning a genome-wide score of human-mouse conservation at the functional genomics level
COSI: RegSys
  • Soo Bin Kwon, University of California, Los Angeles, United States
  • Jason Ernst, University of California, Los Angeles, United States

Short Abstract: Identifying genomic regions with functional genomic properties that are conserved between human and mouse is an important challenge in the context of mouse model studies. To address this, we develop a method to learn a score of evidence of conservation at the functional genomics level by integrating information from a compendium of epigenomic, transcription factor binding, and transcriptomic data from human and mouse. The method, Learning Evidence of Conservation from Integrated Functional genomic annotations (LECIF), trains neural networks to generate this score for the human and mouse genomes. The resulting LECIF score highlights human and mouse regions with shared functional genomic properties and captures correspondence of biologically similar human and mouse annotations. Analysis with independent datasets shows the score also highlights loci associated with similar phenotypes in both species. LECIF will be a resource for mouse model studies by identifying loci whose functional genomic properties are likely conserved.

Mapping the DNA accessibility landscape of B-ALL patients revealed principles of cancer evolution.
COSI: RegSys
  • Giacomo Corleone, IRCCS Regina Elena National Cancer Institute, Italy
  • Stefano Di Giovenale, IRCCS Regina Elena National Cancer Institute, Italy
  • Cristina Sorino, IRCCS Regina Elena National Cancer Institute, Italy
  • Maurizio Fanciulli, IRCCS Regina Elena National Cancer Institute, Italy

Short Abstract: Pediatric B-cell Acute Lymphoblastic Leukemia (B-ALL) is the primary cause of death from hematological disease in children. Despite the enormous improvement of treatments based on innovative immunotherapy, applications of new approaches particularly effective in relapsed patients constitute a significant emergency in clinical practice.

In this work, we have built the most extensive map of the accessibility landscape of B-ALL to date from 35 B-ALL patients. We integrated this map with a plethora of transcriptomic and epigenomic pan-cancer profiles to define the key determinants of B-ALL post-therapy relapse. We observe that relapsed patients are dominated by regulatory elements (N=~6000) originally represented at diagnosis that shrink under treatment and subsequently re-expand, driving the relapse. Motif analysis coupled with multi-omics integration suggests that these elements are likely regulated by the binding of the transcription factors ERG, EBF, and RUNX.

We identified an enhancer regulating the dCMP deaminase gene (DCTD) and gained insights into its mechanistic regulation. While the DCTD gene is broadly expressed in human tissues, enhancer activity is detected only in leukemia cell lines (GTEx data). To directly test the regulatory potential of the DCTD enhancer, we generated conditional knock-outs of the DCTD enhancer and gene in primary cell lines. Our data revealed that the DCTD enhancer is a crucial determinant of DCTD mRNA expression and protein abundance. Strikingly, DCTD enhancer depletion abrogates proliferation, thus suggesting that DCTD enhancer activity is a driver of clonal proliferation in B-ALL relapse.

Taken together, our data revealed that regulatory activity changes dynamically during cancer progression and represents a principal phenomenon underlying the functional mechanisms of B-ALL.

Mechanisms underlying divergent signal responses of genetically distinct macrophages
COSI: RegSys
  • Marten Hoeksema, University of California San Diego, United States
  • Zeyang Shen, University of California San Diego, United States
  • Inge Holtman, University of Groningen, Netherlands
  • An Zheng, University of California San Diego, United States
  • Nathan Spann, University of California San Diego, United States
  • Isidoro Cobo, University of California San Diego, United States
  • Melissa Gymrek, University of California San Diego, United States
  • Christopher Glass, University of California San Diego, United States

Short Abstract: Mechanisms by which noncoding genetic variation influences gene expression remain only partially understood but are considered to be major determinants of phenotypic diversity and disease risk. Here, we evaluated the effects of >50 million single-nucleotide polymorphisms and short insertions/deletions provided by five inbred strains of mice on the responses of macrophages to interleukin-4 (IL-4), a cytokine that plays pleiotropic roles in immunity and tissue homeostasis. Of >600 genes induced >2-fold by IL-4 across the five strains, only 26 genes reached this threshold in all strains. By applying deep learning and motif mutation analyses to epigenetic data for macrophages from each strain, we identified the dominant combinations of lineage-determining and signal-dependent transcription factors driving IL-4 enhancer activation. These studies further revealed mechanisms by which noncoding genetic variation influences absolute levels of enhancer activity and their dynamic responses to IL-4, thereby contributing to strain-differential patterns of gene expression and phenotypic diversity.

Molecular Mechanism Conferring Spatial Identity during Cochlear Duct Extension
COSI: RegSys
  • Shuze Wang, University of Michigan, United States
  • Yujuan Fu, University of Michigan, United States
  • Mary Lee, University of Michigan, United States
  • Scott Jones, University of Michigan, United States
  • Jie Liu, University of Michigan, United States
  • Joerg Waldhaus, University of Michigan, United States

Short Abstract: Hearing is mediated by the organ of Corti (OC), which lies within the cochlea of the inner ear. The organ of Corti is stretched along the apex-to-base axis of the cochlea, enabling the detection of a wide range of sound frequencies. Generally, fine-tuning of sensory hair cells to different frequencies requires a unique set of proteins to be expressed, which is why we speculate about the existence of a mechanism conferring spatial identity during the development of the cochlea. We hypothesize that two different mechanisms could potentially confer spatial information during the time course of cochlear extension. 1) Due to the strict control of timing in OC development, we hypothesize that a mechanism, named the time-space translation model, could translate developmental time into spatial information. 2) Alternatively, we hypothesize that morphogen-based signaling originating from the opposing ends of the cochlear duct could potentially confer spatial information, named the dynamic morphogen model. To test our hypotheses, we generated scRNA-sequencing profiles from the developing cochlear duct at E12.5 and E14.5. By aligning E12.5 cells to the E14.5 dataset, we found that the results support the dynamic morphogen model. The results of this study will further our understanding of how spatial identity is conferred during inner ear development.

Motif syntax determinants of single-cell chromatin dynamics in human somatic cell reprogramming
COSI: RegSys
  • Surag Nair, Stanford University, United States
  • Mohamed Ameen, Stanford University, United States
  • Laksshman Sundaram, Stanford University, United States
  • Akshay Balsubramani, Stanford University, United States
  • Glenn Markov, Stanford University, United States
  • David Burns, Stanford University, United States
  • Helen Blau, Stanford University, United States
  • Kevin Wang, Stanford University, United States
  • Anshul Kundaje, Stanford University, United States

Short Abstract: Ectopic induction of the Yamanaka factors OCT4, SOX2, KLF4, and MYC (OSKM) in somatic cells initiates a multi-phasic process that culminates in the conversion of a fraction of starting cells into an embryonic stem cell (ESC)-like state. To overcome challenges of low efficiency and high heterogeneity, we profile the chromatin and expression dynamics at single-cell resolution over a time course of human fibroblasts induced with OSKM. We train neural networks to learn predictive regulatory sequence models of base-resolution chromatin accessibility profiles from each of the cell states across the reprogramming pseudotime trajectories. Locus-level interpretation of cell state-specific deep learning models yields dynamic motif syntax maps of each cis-regulatory element (CRE) that often showcase different repertoires of cooperative transcription factors regulating CREs at different stages of reprogramming. Synthesizing motif syntax maps across modules of CREs highlights previously underappreciated sequence determinants: a new class of anti-accessibility motifs whose presence is predictive of lower accessibility, and low-affinity OSK motifs in transient CREs. Collectively, our data and analysis refine our understanding of human somatic cell reprogramming by providing a detailed resource that links stage-specific combinatorial transcription factor activity across dynamic CREs with gene regulation that results in distinct non-reprogramming and reprogramming trajectories.

Multi-omics analysis of the infection cycle of Legionella pneumophila in human macrophages
COSI: RegSys
  • Lambert Moyon, Institute of Computational Biology, Helmholtz Zentrum Muenchen, Neuherberg, Germany, Germany
  • Wilhelm Bertrams, Philipps-Universität Marburg, UGMLC iLung-Institute for Lung Research, Marburg, Germany, Germany
  • Stefanie Herbel, Philipps-Universität Marburg, UGMLC iLung-Institute for Lung Research, Marburg, Germany, Germany
  • Sascha Blankenburg, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany, Germany
  • Anna Lena Jung, Philipps-Universität Marburg, UGMLC iLung-Institute for Lung Research, Marburg, Germany, Germany
  • Leon Schulte, Philipps-Universität Marburg, UGMLC iLung-Institute for Lung Research, Marburg, Germany, Germany
  • Uwe Völker, Department of Functional Genomics, University Medicine Greifswald, Greifswald, Germany, Germany
  • Bernd Schmeck, Philipps-Universität Marburg, UGMLC iLung-Institute for Lung Research, Marburg, Germany, Germany
  • Annalisa Marsico, Institute of Computational Biology, Helmholtz Zentrum Muenchen, Neuherberg, Germany, Germany

Short Abstract: Pathogen infections induce important changes in the gene regulatory network of both the host and the pathogen. However, it is challenging to jointly study how infected cells deploy defense mechanisms and how the pathogen responds. Here we investigate such changes during the infection cycle of Legionella pneumophila in human lung macrophages.

We propose a multi-omics analysis pipeline for both organisms that integrates datasets from dual-RNA sequencing and dual-proteomic quantification. By comparing infected cells with controls, we identified activation of specific immune-response pathways in human macrophages. Notably, feature-selection methods and network analyses allowed us to pinpoint key host factors whose deregulation was the most important during the infection. In addition, to point out genes of interest in Legionella pneumophila, we combined results from the multi-omics analyses with systematic prediction of gene function. We pinpointed candidate genes likely to be of major importance for the infection process. Some of these candidates will be experimentally evaluated to confirm their function and their role in infection efficiency in cultured cells.

Overall this work highlights the benefits of multi-omics integration, and allows us to better understand how pathogens such as Legionella pneumophila hijack the host cell machinery to their benefit.

Predicting cis-regulatory element-gene pairing from epigenetic data in mouse hematopoiesis
COSI: RegSys
  • Kathryn J. Weaver, Johns Hopkins University, United States
  • Michael E.G. Sauria, Johns Hopkins University, United States
  • Guanjue Xiang, The Pennsylvania State University, Dana-Farber Cancer Institute, United States
  • Yu Zhang, The Pennsylvania State University, Two Sigma Investments LP, United States
  • Ross C. Hardison, The Pennsylvania State University, United States
  • Rajiv C. McCoy, Johns Hopkins University, United States

Short Abstract: Transcription of DNA into RNA is regulated by several factors including a gene’s primary sequence, epigenetic marks, as well as trans- and cis-regulatory elements (CREs). One outstanding question in molecular biology is how CREs interact with one another and their cellular environment to control gene expression. Further, it has been shown in cell lines that considering both an enhancer’s strength and how often that enhancer comes into contact with a gene improves prediction of gene-CRE pairs. Previously, regression modeling of hematopoietic gene expression used only the epigenetic composition of CREs and promoters to select the most likely set of CREs needed to explain each gene’s expression. In this work, we adjust this modeling task to incorporate a contact metric together with the epigenetic profiles to map CREs to candidate genes in a cell-type-specific manner. To this end, we developed a unique genome partitioning approach for the designation of training and testing sets to measure out-of-sample prediction error. Specifically, by dividing the genome at candidate heterochromatin boundaries, we reduce the risk of separating genes and their CREs into different sets. Using this approach, we estimated the uncertainty of each gene-CRE pairing prediction, improving the accuracy and interpretability of inferred cis-regulatory networks.

Prediction of cancer mutation states across multiple data modalities reveals the utility and redundancy of gene expression and DNA methylation
COSI: RegSys
  • Jake Crawford, Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, United States
  • Brock Christensen, Department of Epidemiology, Geisel School of Medicine, Dartmouth College, United States
  • Maria Chikina, Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, United States
  • Casey Greene, Center for Health AI, University of Colorado School of Medicine, United States

Short Abstract: Cancer researchers are increasingly able to choose from many -omics assays as functional readouts, and it is not always immediately clear which readout is most suitable for a particular study. As a representative problem, we consider prediction of cancer mutation status (presence or absence) from functional -omics data. We compare the predictive ability of six different data types from the TCGA Pan-Cancer Atlas -- RNA sequencing, DNA methylation arrays, reverse phase protein arrays (RPPA), microRNA, and somatic mutational signatures -- for mutations in ~100 cancer-associated genes.

We found that for most genes, RNA-seq and DNA methylation provided the best performance. Furthermore, both were approximately equally effective; performance was driven primarily by the target gene. We also observed that combining data types into a single multi-omics model provided little or no improvement in predictive ability over the best individual data type. For studies considering the functional outcomes of cancer mutations, we recommend focusing on gene expression or DNA methylation.

Single Cell Multi-omics Analysis Of Chromothriptic Medulloblastoma Highlights Genomic And Transcriptomic Consequences Of Genome Instability
COSI: RegSys
  • Nicola Casiraghi, European Molecular Biology Laboratory (EMBL), Germany
  • Aurelie Ernst, Group “Genome Instability in Tumors”, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Oliver Stegle, European Molecular Biology Laboratory (EMBL), Germany
  • Stefan Pfister, Hopp Children's Cancer Center (KiTZ), Heidelberg, Germany
  • Marc Zapatka, Division of Molecular Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Thomas Höfer, Division of Theoretical Systems Biology, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Andrey Korshunov, Clinical Cooperation Unit Neuropathology, DKFZ, Department of Neuropathology, Heidelberg University Hospital, Germany
  • Anna Jauch, Institute of Human Genetics, University of Heidelberg, Heidelberg, Germany
  • Kristian W Pajtler, Hopp Children's Cancer Center (KiTZ), Heidelberg, Germany
  • Kendra Corina Maaß, Hopp Children's Cancer Center (KiTZ), Heidelberg, Germany
  • David R Norali Ghasemi, Hopp Children's Cancer Center (KiTZ), Heidelberg, Germany
  • R. Gonzalo Parra, European Molecular Biology Laboratory (EMBL), Germany
  • Rithu Kumar, Group “Genome Instability in Tumors”, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Thorsten Kolb, Group “Genome Instability in Tumors”, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Martin Sill, Hopp Children's Cancer Center (KiTZ), Heidelberg, Germany
  • Phillip Mallm, Single-cell Open Lab, German Cancer Research Center (DKFZ) and Bioquant, Heidelberg, Germany
  • Verena Körber, Division of Theoretical Systems Biology, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • John Kl Wong, Division of Molecular Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Manasi Ratnaparkhe, Group “Genome Instability in Tumors”, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Hana Susak, European Molecular Biology Laboratory (EMBL), Germany
  • Milena Simovic, Group “Genome Instability in Tumors”, German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Moritz J Przybilla, European Molecular Biology Laboratory (EMBL), Germany

Short Abstract: Introduction:
Chromothripsis is a form of genome instability in which a presumably single catastrophic event generates extensive genomic rearrangements of one or a few chromosomes. However, little is known about the heterogeneity across different clones from the same tumor, or about changes in response to treatment. We present a multi-omics study of the genomic and transcriptomic alterations linked with chromothripsis in human Li-Fraumeni syndrome (LFS) medulloblastoma.

Results:
We generated single-cell DNA and RNA sequencing data from 757 and 30,106 cells, respectively, from 7 LFS medulloblastoma samples. At the DNA level, we developed computational methods to 1) infer clones and place them into tree-like structures reflecting their most likely clonal evolutionary history, highlighting the main differences across clones over time, and 2) quantify chromothripsis at the clonal level. At the RNA level, we detected a variety of malignant and non-malignant cell types across samples and found that different transcriptional programs were altered in them. We inferred CNV states from the transcriptomic data and statistically matched CNV clones from DNA to RNA.

Concluding Remarks:
Our work contributes to the understanding of genomic heterogeneity and its transcriptomic consequences, which is essential for identifying new therapeutic strategies for this subgroup of patients.

Single-cEll Marker IdentificaTiON by Enrichment Scoring
COSI: RegSys
  • Anna Hendrika Cornelia Vlot, Berlin Institute for Medical Systems Biology, Germany
  • Setareh Maghsudi, University of Tübingen, Germany
  • Uwe Ohler, Berlin Institute for Medical Systems Biology, Germany

Short Abstract: Cell identity marker identification from single-cell omics data commonly consists of differential testing between cell clusters. The assignment of cells to clusters is nontrivial and often requires prior knowledge. Yet, cluster assignment uncertainties are not generally taken into account. In response, we present SEMITONES, a method for cluster-independent marker identification. This method identifies marker genes of potentially overlapping neighbourhoods using a linear regression framework that quantifies feature selectivity to a certain cell neighbourhood. SEMITONES is implemented in Python 3 and freely available on GitHub (www.github.com/ohlerlab/SEMITONES). In single-cell RNA-sequencing data from healthy human haematopoiesis, SEMITONES accurately identifies known and potential novel marker genes, including known markers not identified by Seurat v3 or the cluster-independent differential expression testing method singleCellHaystack. SEMITONES also outperforms competitors on simulated scRNA-seq data (SEMITONES AUROC 0.78, others 0.54-0.73). Further applications of the method include the construction of co-enrichment graphs, identification of cis-regulatory regions from single-cell ATAC-seq data, and identification of spatially restricted markers from spatial transcriptomics data. Finally, SEMITONES can be used for the inverse problem, i.e. the annotation of cells based on significant enrichment of known marker genes. Overall, SEMITONES provides a flexible and accurate framework for cluster-independent marker identification from diverse single-cell omics data.
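
The idea of scoring a feature's selectivity for a cell neighbourhood by linear regression can be sketched as follows. This is not the SEMITONES API or its exact scoring: it assumes, for illustration, that each cell has a similarity value to a chosen reference cell and that the regression slope of expression on similarity serves as the enrichment score. The gene names and values are hypothetical toy inputs.

```python
def ols_slope(x, y):
    """Slope of the ordinary least-squares line y ~ a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical similarity of six cells to one reference cell (1.0 = identical).
similarity = [1.0, 0.9, 0.8, 0.2, 0.1, 0.0]

# Hypothetical expression values of two genes in the same six cells.
expression = {
    "marker_like": [5.0, 4.0, 4.5, 0.5, 0.0, 0.2],  # high near the reference
    "uniform": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],      # flat across cells
}

# A larger slope means expression rises towards the reference neighbourhood.
scores = {gene: ols_slope(similarity, values) for gene, values in expression.items()}
```

Because the score is defined relative to a reference cell rather than a cluster label, no cluster assignment is needed in this sketch; the gene concentrated near the reference cell receives a much larger slope than the uniformly expressed one.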

SPaRTAN, a computational framework for linking cell-surface receptors to transcriptional regulators
COSI: RegSys
  • Xiaojun Ma, University of Pittsburgh, United States
  • Ashwin Somasundaram, University of Pittsburgh, United States
  • Zengbiao Qi, University of Pittsburgh, United States
  • Douglas Hartman, University of Pittsburgh, United States
  • Harinder Singh, University of Pittsburgh, United States
  • Hatice Osmanbeyoglu, University of Pittsburgh, United States

Short Abstract: The identity and functions of specialized cell types are dependent on the complex interplay between signaling and transcriptional networks. Recently, single-cell technologies such as CITE-seq have been developed that enable simultaneous quantitative analysis of cell-surface receptor expression and transcriptional states. To date, these datasets have not been used to systematically develop cell-context-specific maps of the interface between signaling and transcriptional regulators orchestrating cellular identity and function. We present SPaRTAN (Single-cell Proteomic and RNA based Transcription factor Activity Network), a computational method to link cell-surface receptors to transcription factors (TFs) by exploiting cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) datasets with cis-regulatory information. SPaRTAN is applied to immune cell types in the blood to predict the coupling of signaling receptors with cell-context-specific TFs. The predictions are validated by prior knowledge and flow cytometry analyses. SPaRTAN is then used to predict the signaling-coupled TF states of tumor-infiltrating CD8+ T cells in malignant peritoneal and pleural mesotheliomas. SPaRTAN greatly enhances the utility of CITE-seq datasets to uncover TF and cell-surface receptor relationships in diverse cellular states.

Systematic discovery of directional chromatin-associated regulatory motifs affecting human gene transcription
COSI: RegSys
  • Naoki Osato, Waseda University, Japan

Short Abstract: Chromatin interactions are essential in enhancer-promoter interactions (EPIs) and transcriptional regulation. CTCF and cohesin proteins located at chromatin interaction anchors and other DNA-binding proteins such as YY1, ZNF143, and SMARCA4 are involved in chromatin interactions. However, there is still no good overall understanding of proteins associated with chromatin interactions and insulator functions. Here, I describe a systematic and comprehensive approach for discovering DNA-binding motifs of transcription factors (TFs) that affect EPIs and gene expression. This analysis has identified 96 biased orientations [64 forward-reverse (FR) and 52 reverse-forward (RF)] of motifs that significantly affected the expression level of putative transcriptional target genes in monocytes, T cells, HMEC, and NPC and included CTCF, cohesin (RAD21 and SMC3), YY1, and ZNF143; as some TFs have more than one motif in databases, the total number (96) is smaller than the sum of FRs and RFs. KLF4, ERG, RFX, RFX2, HIF1, SP1, STAT3, and AP1 were associated with chromatin interactions. Many other TFs were also known to have chromatin-associated functions. The predicted biased orientations of motifs were compared with chromatin interaction data. Correlations in expression level of nearby genes separated by the motif sites were then examined among 53 tissues.
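
The relation between the reported counts (96 in total, 64 FR, 52 RF) can be checked with inclusion-exclusion. This is a worked arithmetic sketch only; it assumes that the gap between 96 and 64 + 52 is explained entirely by entries counted under both orientations, which the abstract implies but does not quantify.

```python
# Counts as reported in the abstract.
fr_count = 64        # forward-reverse (FR) orientations
rf_count = 52        # reverse-forward (RF) orientations
distinct_total = 96  # total reported

# Inclusion-exclusion: |FR or RF| = |FR| + |RF| - |FR and RF|.
# Assumption: overlap between FR and RF is the only source of the difference.
in_both = fr_count + rf_count - distinct_total
fr_only = fr_count - in_both
rf_only = rf_count - in_both
```

Under that assumption, the three disjoint groups (`fr_only`, `rf_only`, `in_both`) add back up to the reported total.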

The Network Zoo: An integrated infrastructure for the development of gene regulatory network tools and models
COSI: RegSys
  • Marouen Ben Guebila, Harvard School of Public Health, United States
  • Tian Wang, Harvard School of Public Health, United States
  • John Quackenbush, Harvard School of Public Health, United States

Short Abstract: Gene regulation plays a fundamental role in shaping tissue identity, function, and responses to perturbation. These regulatory processes are controlled by complex networks of interacting elements, including transcription factors and their target genes. The structure of these gene regulatory networks helps to determine phenotypes and can ultimately influence the development of disease or response to therapy. We developed the Network Zoo (netZoo; netzoo.github.io) as an open-source set of computational tools that allow users to build, analyze, and visualize gene regulatory networks. We used version control to allow community developers to add new features while maintaining the stability of our core codebase. We also developed netBooks (netbooks.networkmedicine.org), a collection of Jupyter notebooks that introduce netZoo applications, including studies of cancer and other diseases. Finally, we created the Gene Regulatory Network Database (GRAND; grand.networkmedicine.org) as a catalog of gene regulatory networks inferred using netZoo tools, including networks for human cell lines, a wide range of cancer types, and normal human tissues. Collectively, the elements in the netZoo ecosystem provide a unique set of resources for modeling and analyzing gene regulatory networks and understanding how changes in regulation drive processes associated with health and disease.

The regulatory landscape of cells in the developing mouse cerebellum
COSI: RegSys
  • Ioannis Sarropoulos, Center for Molecular Biology (ZMBH), University of Heidelberg, Germany
  • Mari Sepp, Center for Molecular Biology (ZMBH), University of Heidelberg, Germany
  • Robert Frömel, Centre for Genomic Regulation (CRG), Spain
  • Kevin Leiss, Center for Molecular Biology (ZMBH), University of Heidelberg, Germany
  • Nils Trost, Center for Molecular Biology (ZMBH), University of Heidelberg, Germany
  • Evgeny Leushkin, Center for Molecular Biology (ZMBH), University of Heidelberg, Germany
  • Konstantin Okonechnikov, German Cancer Research Center (DKFZ), Germany
  • Piyush Joshi, German Cancer Research Center (DKFZ), Germany
  • Lena Kutscher, German Cancer Research Center (DKFZ), Germany
  • Margarida Cardoso-Moreira, Francis Crick Institute, United Kingdom
  • Stefan Pfister, German Cancer Research Center (DKFZ), Germany
  • Henrik Kaessmann, Center for Molecular Biology (ZMBH), University of Heidelberg, Germany

Short Abstract: Organ development is orchestrated by cell- and time-specific gene regulatory networks. Here we investigated the regulatory programs underlying mouse cerebellum development from early neurogenesis to adulthood. By acquiring snATAC-seq profiles for ∼90,000 cells spanning eleven developmental stages, we mapped all major cerebellar cell types and identified candidate cis-regulatory elements (CREs), most of which are cell type- and time-specific.

We detected extensive spatiotemporal heterogeneity among progenitor cells, with early development characterized by stronger temporal signals that are often shared between germinal zones. Modeling the differentiation trajectories of major cerebellar neuron types revealed cell type-specific and shared transcriptional regulators, with most shared (pleiotropic) CREs being active in early differentiation. This decrease in pleiotropy is associated with a universal decline in distal CRE sequence conservation during development and differentiation. However, the degree of evolutionary constraint differs between cell types. Notably, microglia have the fastest evolving CREs in the cerebellum, whereas astrocytes are enriched for CREs shared across mammals and are overall the most constrained cell type in the adult mouse.

Collectively, our work delineates the developmental and evolutionary dynamics of gene regulation in cerebellar cells and provides general insights into the regulatory programs that define cell type identities during organ development.

The role of cell-specific gene expression and regulatory networks in tissue-specific Mendelian disease manifestation
COSI: RegSys
  • Jordan H. Whitlock, University of Alabama at Birmingham, United States
  • Vishal H. Oza, University of Alabama at Birmingham, United States
  • Brittany N. Lasseigne, University of Alabama at Birmingham, United States
  • T C Howton, University of Alabama at Birmingham, United States

Short Abstract: There are approximately 10,000 Mendelian diseases affecting 25 million Americans. Mendelian diseases are caused by germline aberrations that are present in all cells of the body but typically manifest in a limited number of tissues. Our long-term research goal is to understand the role of cell-specific gene expression and regulatory networks in tissue-specific Mendelian disease manifestation. There are multiple hypothesized mechanisms of tissue specificity related to gene expression, regulatory networks, and non-cell-autonomous mechanisms. Despite recent advances in genomic sequencing technology, gaps remain in translating individual genomic variation to observed phenotypic outcomes. For nearly a third of Mendelian diseases, the molecular basis is unknown. Moreover, for the remaining two-thirds, how molecular variations lead to disease is poorly understood in one-third of cases. As a test set, we are focusing on human diseases from Online Mendelian Inheritance in Man (OMIM) with neurological and renal manifestations. Using computational approaches (e.g., TissueEnrich), we assessed the tissue-specific expression patterns of our OMIM test set. This work is critical for generating hypotheses about key drivers of brain- and kidney-associated Mendelian disease and for understanding how cellular specificity contributes to tissue-specific disease manifestation.

Using High-throughput Screening to Pinpoint Regulatory Mechanisms that Lead to Susceptibility Differences in a Genetically Diverse Zebrafish Model
COSI: RegSys
  • David Reif, North Carolina State University, United States
  • Dylan Wallis, North Carolina State University, United States
  • Jane La Du, Oregon State University, United States
  • Preethi Thunga, North Carolina State University, United States
  • Lisa Truong, Oregon State University, United States
  • Robyn Tanguay, Oregon State University, United States

Short Abstract: Understanding mechanisms behind individual susceptibility differences is key to protecting vulnerable populations. Elucidating the gene-environment interactions (GxE) causing these differences presents daunting challenges. We leveraged high-throughput screening data from genetically diverse zebrafish to find evidence of GxE eliciting differential susceptibility to developmental malformation after exposure to abamectin. We used a combined bioinformatic and experimental approach to probe the genetic mechanisms underlying susceptibility differences across the population. Whole-genome sequencing data were used to compare exposed, morphologically normal fish with phenotypically “affected” fish. This analysis highlighted a region upstream of sox7 associated with differential response. Further analysis was conducted to predict differences in transcription factor binding sites (TFBS) using the TFBS prediction software Matbase. This revealed significant differences in predicted transcription factor binding between different fish. Sequences that were associated with different TFBS were cloned upstream of sox7 in an expression vector and transfected into MCF-7 cells. The data show significant expression differences in the presence or absence of the indel upstream of sox7. Collectively, we demonstrated a method combining bioinformatic and wet-lab approaches to identify chemicals that elicit high variation in individual susceptibility due to gene-environment interactions and to elucidate the pathways that lead to these effects.



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176