Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

Posters - Schedules

Posters at ISMB/ECCB 2021 will be presented virtually. Authors will pre-record their poster talk (5-7 minutes) and upload it to the virtual conference platform, along with a PDF of their poster, beginning July 19 and no later than July 23. All registered conference participants will have access to the posters and presentations throughout the conference, and to the content until October 31, 2021. Q&A is available through a chat function, and poster presenters can schedule small group discussions with up to 15 delegates during the conference.

Information on preparing your poster and poster talk is available at: https://www.iscb.org/ismbeccb2021-general/presenterinfo#posters

Ideally authors should be available for interactive chat during the times noted below:

Session A: Sunday, July 25 between 15:20 - 16:20 UTC
Session B: Monday, July 26 between 15:20 - 16:20 UTC
Session C: Tuesday, July 27 between 15:20 - 16:20 UTC
Session D: Wednesday, July 28 between 15:20 - 16:20 UTC
Session E: Thursday, July 29 between 15:20 - 16:20 UTC
A data-driven, molecular-based framework for drug repurposing in bladder cancer
COSI: TransMed
  • Shaman Narayanasamy, University of Luxembourg, Luxembourg
  • Irina Balaur, University of Luxembourg, Luxembourg
  • Agnieszka Latosinska, Mosaiques Diagnostics GmbH, Germany
  • Maria Frantzi, Mosaiques Diagnostics GmbH, Germany
  • Soumyabrata Ghosh, University of Luxembourg, Luxembourg
  • Wei Gu, University of Luxembourg, Luxembourg
  • Marika Mokou, Mosaiques Diagnostics GmbH, Germany
  • Zoran Culig, Medical University of Innsbruck, Austria
  • Harald Mischak, Mosaiques Diagnostics GmbH, Germany
  • Reinhard Schneider, University of Luxembourg, Luxembourg
  • Venkata Satagopam, University of Luxembourg, Luxembourg

Short Abstract: Bladder cancer (BC) is one of the most common cancer types worldwide, with both high incidence (i.e. 550,000 new cases in 2018) and mortality (i.e. 200,000 deaths in 2018) rates. The molecular heterogeneity of the disease, despite resulting in similar clinical manifestation, frequently limits the response to therapies, thus complicating clinical management. Drug repurposing has been established as a promising approach that simultaneously accelerates drug development and reduces its cost. Here, we present the ongoing development of a data-driven, molecular-based framework for drug repurposing in BC. The approach builds upon defining molecular signatures of BC through integration of multi-omic/parametric data (proteomics, transcriptomics, clinical data) and BC-associated features extracted from literature and specialized resources (e.g. DisGeNet, Reactome). Integrated BC molecular signatures will be used to define potential drug candidates for repositioning in BC that are able to reverse (specific parts of) the disease signature. The most promising candidates will be shortlisted for further in vitro and in vivo experimentation through assessment of novelty and retrieval of existing information about drugs (e.g. safety, toxicity, FDA approval) from publicly available databases. Finally, we believe our framework could serve as an example for future data-driven drug repurposing explorations of other human diseases or conditions.

A Graphical User Interface for the Interactive Analysis of Mutational Signatures
COSI: TransMed
  • Zainab Khurshid, Boston University, United States
  • Nathan Sahelijo, Boston University, United States
  • Tong Tong, Boston University, United States
  • Aaron Chevalier, Boston University, United States
  • Joshua Campbell, Boston University, United States

Short Abstract: Cancer development is driven by the accumulation of somatic mutations. Modern sequencing technologies have the ability to generate large-scale mutational datasets from tumor samples. From these datasets we can identify patterns of co-occurring mutations, also known as mutational signatures, which are produced by exposure to carcinogens or aberrant endogenous processes. We developed an R/shiny Graphical User Interface (GUI) for the musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) package to provide a framework for discovery and analysis of mutational signatures. This user-friendly interface streamlines the musicatk pipeline by providing a step-by-step workflow that includes preprocessing, deconvolution, and downstream exploratory analysis. The application can import multiple file formats including mutation annotation format (MAF) and variant call format (VCF). A wrapper for the TCGAbiolinks package is also available to allow for the import of TCGA data. Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) algorithms are provided for predicting signatures. Modules for downstream analysis include clustering of tumors into subgroups, comparing discovered signatures to those in the COSMIC dataset, identifying differentially exposed signatures, and plotting sample-exposure heatmaps. The musicatk platform can facilitate carcinogenesis research and establish probabilistic models to predict tumor etiology in patients.
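To illustrate the kind of deconvolution the abstract refers to, the following is a minimal sketch of non-negative matrix factorization using the standard multiplicative update rules. It is not the musicatk implementation (which is an R package); the toy count matrix, the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`), and the choice of two signatures are all assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for row_v, row_wh in zip(V, WH)
               for v, wh in zip(row_v, row_wh)) ** 0.5

def nmf(V, k, iters, seed=0, eps=1e-9):
    """Factor V (mutation types x samples) into W (types x k signatures)
    and H (k signatures x samples) with multiplicative updates."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(m)]
    return W, H

# Hypothetical toy data: 4 mutation types, 5 samples, 2 underlying signatures.
true_signatures = [[0.7, 0.0], [0.2, 0.1], [0.1, 0.3], [0.0, 0.6]]
true_exposures = [[100, 80, 10, 0, 50], [0, 20, 90, 100, 50]]
V = matmul(true_signatures, true_exposures)

W_early, H_early = nmf(V, k=2, iters=1)
W_late, H_late = nmf(V, k=2, iters=200)
err_early = frobenius_error(V, W_early, H_early)
err_late = frobenius_error(V, W_late, H_late)
print(f"residual after 1 iteration: {err_early:.3f}, after 200: {err_late:.3f}")
```

Because both runs start from the same seed, the second run is a continuation of the first, so the two residuals show how the factorization improves with iterations. In practice the columns of `W` would be normalized and compared against reference signatures such as those in COSMIC.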

A novel feature selection pipeline for identifying predictive targets associated with drug toxicity
COSI: TransMed
  • Yun Hao, University of Pennsylvania, United States
  • Jason Moore, University of Pennsylvania, United States

Short Abstract: In silico assessment of drug toxicity is becoming a critical step in drug development. Existing models are limited by low accuracy and lack of interpretability. Further, they often fail to explain cellular mechanisms underlying structure-toxicity associations. We addressed these limitations by incorporating target profile as an intermediate connecting structure to toxicity. To accommodate the high-dimensional feature space, we developed a pipeline that can identify a subset of predictive features. We applied our pipeline to study 569 targets and 815 adverse events. The features identified by our pipeline comprise less than ten percent of the original feature space; nevertheless, they accurately predicted binding outcomes for 377 targets and toxicity outcomes for 36 adverse events. We demonstrated that predictive targets tend to be differentially expressed in the tissue of toxicity. We rediscovered key cellular functions associated with cardiotoxicity from the predictive targets, as well as markers of skin and liver diseases. We found evidence supporting diagnostic/therapeutic applications of some predictive targets in hepatotoxicity and nephrotoxicity. Our findings highlighted the critical role of predictive targets in cellular mechanisms leading to toxicity. In general, our study improved the interpretability of toxicity prediction without sacrificing accuracy. Our novel pipeline may benefit future studies of high-dimensional datasets.

A review of transcriptome-based cancer cell line radiosensitivity signatures
COSI: TransMed
  • Ian Overton, Patrick G. Johnston Centre for Cancer Research, Queen’s University Belfast, BT9 7AE, United Kingdom
  • John O'Connor, Patrick G. Johnston Centre for Cancer Research, Queen’s University Belfast, BT9 7AE, United Kingdom
  • Stephen McMahon, Patrick G. Johnston Centre for Cancer Research, Queen’s University Belfast, BT9 7AE, United Kingdom

Short Abstract: Genomic predictors of sensitivity to radiation therapy have yet to translate to clinical practice. Several models based on gene expression have been trained on small sample, in vitro dose-response data, and some have shown predictive capability in clinical cohorts. This work aimed to investigate a selection of published transcriptome-based radiosensitivity predictors on a larger in vitro dataset.

Two cancer cell line datasets (NCI60 [n=60] and Cancer Cell Line Encyclopaedia (CCLE) [n=522]) with available in vitro radiosensitivity measurements were used to test 7 published radiosensitivity signatures. Models were tested using reported parameters or principal component regression. To benchmark results, random signatures of varying sizes were produced by sampling from all genes available in the dataset along with an intercept only model.

No published model tested outperformed the 2 standard deviation limit for mean error of randomly sampled gene signatures of a similar size, and some had higher errors than an intercept only model. Poor performance of signatures suggests a need for model improvement which may be aided by greater sample size, improved modelling methods, incorporation of multiomics and external validations. Further assessment of radiosensitivity signatures in clinical cohorts using suitable null hypotheses and adjustment for confounding is needed.
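The benchmarking idea described above (comparing a signature's error against randomly sampled gene sets of the same size and against an intercept-only model) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the data are synthetic, the signature score is a simple mean of gene expression fitted by ordinary least squares rather than the reported parameters or principal component regression, and the error metric is in-sample root mean squared error.

```python
import random
import statistics

def rmse_of_linear_fit(x, y):
    """In-sample RMSE of an ordinary least squares fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return (sum(r * r for r in residuals) / len(y)) ** 0.5

def signature_error(expr, y, genes):
    """Score each cell line by mean expression of the signature genes,
    then measure how well that score predicts the response y."""
    scores = [statistics.fmean([row[g] for g in genes]) for row in expr]
    return rmse_of_linear_fit(scores, y)

def random_signature_null(expr, y, size, n_random, rng):
    """Mean and standard deviation of errors for random gene sets."""
    n_genes = len(expr[0])
    errors = [signature_error(expr, y, rng.sample(range(n_genes), size))
              for _ in range(n_random)]
    return statistics.fmean(errors), statistics.stdev(errors)

# Synthetic data (assumption): 60 cell lines, 200 genes, and a response
# driven by genes 0-4 plus noise.
rng = random.Random(1)
expr = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(60)]
y = [statistics.fmean(row[:5]) + rng.gauss(0, 0.1) for row in expr]

candidate = [0, 1, 2, 3, 4]          # hypothetical "published" signature
candidate_error = signature_error(expr, y, candidate)
intercept_error = statistics.pstdev(y)   # RMSE of predicting the mean of y
null_mean, null_sd = random_signature_null(expr, y, len(candidate), 200, rng)

# The signature is considered better than chance only if its error falls
# below the random-signature mean by more than 2 standard deviations.
beats_random = candidate_error < null_mean - 2 * null_sd
print(candidate_error, intercept_error, null_mean, null_sd, beats_random)
```

In this toy setup the candidate signature is informative by construction, so it clears the threshold. A real evaluation would use held-out data or cross-validation rather than in-sample error.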

Agglomerative Clustering on Raman Spectra for Curating Glioma Tumor-Samples
COSI: TransMed
  • Ion Petre, Department of Mathematics and Statistics, University of Turku, Finland
  • Joel Sjöberg, Computer Science, Åbo Akademi University, Finland
  • Adrian Lita, Neuro-Oncology Branch, National Cancer Institute, National Institutes of Health, United States
  • Stefan Filipescu, National Institute for Research and Development in Biological Sciences, Romania
  • Orieta Celiku, Neuro-Oncology Branch, National Cancer Institute, National Institutes of Health, United States
  • Luigia Petre, Åbo Akademi University, Finland
  • Mark Gilbert, Neuro-Oncology Branch, National Cancer Institute, National Institutes of Health, United States
  • Houtan Noushmehr, Department of Neuro Oncology, Henry Ford Health System, United States
  • Mioara Larion, Neuro-Oncology Branch, National Cancer Institute, National Institutes of Health, United States

Short Abstract: Glioma is a type of brain cancer which manifests within the glial cells and has dismal survivability and grave impact on the patients' quality of life. The life expectancy of glioblastoma (the most aggressive subtype of glioma) patients remains a few months, despite multimodal treatments that include surgery, radiation, and chemotherapy. Raman spectroscopy is a non-destructive chemical analysis technique that can be used to identify detailed molecular fingerprints of the sample. It has recently been used successfully in optimizing brain tumor surgeries through detection of tumor barriers and in deep learning classification of tumors, demonstrating its promise to characterize key aspects of tumor tissues. Our hypothesis is that Raman spectra can be used to separate tumor regions from non-tumor regions (for example, blood or necrotic cells). We use Raman spectroscopy to analyze glioma tumor samples extracted from 45 patient tumors. We analyzed the spectra using agglomerative clustering, a form of unsupervised machine learning. We found that the majority cluster closely matches the tumor spots characterized by the frequency criterion for three representative results. The average accuracy over all samples was 90.3%, the average precision was 99.6% and the average recall was 90.2%.
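As a rough illustration of the workflow in this abstract, the sketch below clusters synthetic "spectra" with a naive agglomerative procedure, treats the majority cluster as tumor, and computes accuracy, precision, and recall against reference labels. The average-linkage criterion, the four-channel toy spectra, and the reference labels are all assumptions; the abstract does not specify the linkage or preprocessing used.

```python
import random

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerative(points, n_clusters):
    """Naive average-linkage agglomerative clustering.
    Returns a list of clusters, each a list of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(euclidean(points[i], points[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return clusters

# Synthetic spectra (assumption): 24 tumor spots and 6 non-tumor spots,
# each a noisy copy of a hypothetical 4-channel profile.
rng = random.Random(0)
tumor_profile = [1.0, 0.2, 0.8, 0.1]
other_profile = [0.1, 0.9, 0.2, 1.0]

def noisy(profile):
    return [v + rng.gauss(0, 0.05) for v in profile]

spectra = [noisy(tumor_profile) for _ in range(24)] + \
          [noisy(other_profile) for _ in range(6)]
is_tumor = [True] * 24 + [False] * 6   # hypothetical reference labels

clusters = agglomerative(spectra, n_clusters=2)
majority = set(max(clusters, key=len))
predicted = [i in majority for i in range(len(spectra))]

tp = sum(p and t for p, t in zip(predicted, is_tumor))
fp = sum(p and not t for p, t in zip(predicted, is_tumor))
fn = sum(not p and t for p, t in zip(predicted, is_tumor))
tn = sum(not p and not t for p, t in zip(predicted, is_tumor))

accuracy = (tp + tn) / len(spectra)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(accuracy, precision, recall)
```

The pairwise search makes this implementation cubic or worse in the number of spectra, so it is only suitable for small examples; a library implementation would be the practical choice for real Raman maps.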

Analyzing spatial heterogeneity of tumor mutation burden and immune infiltrates on whole slide images signals correlation with bladder cancer survival
COSI: TransMed
  • Tae Hyun Hwang, Cleveland Clinic, United States
  • Hongming Xu, Dalian University of Technology, Cleveland Clinic, China
  • Sunho Park, Cleveland Clinic, United States
  • Jean Clemenceau, Cleveland Clinic, United States
  • Jinhwan Choi, Cleveland Clinic, United States
  • Sung Hak Lee, Seoul St. Mary’s Hospital, Korea, Republic of

Short Abstract: Recent work has shown that high tumor mutation burden (TMB-H) could result in an increased number of neoepitopes from somatic mutations expressed by a patient’s tumor cells, which can be recognized and targeted by neighboring tumor-infiltrating lymphocytes (TILs). A deeper understanding of the spatial heterogeneity and organization of tumor cells and their neighboring immune infiltrates could provide new insights into the biology of tumor progression and treatment response, including immunotherapy. We developed and applied computational approaches using digital whole slide images (WSIs) to investigate the spatial heterogeneity and organization of regions harboring TMB-H tumor cells and TILs within tumors, and their impact in prognostic and predictive utility. In experiments using WSIs from The Cancer Genome Atlas bladder cancer (BLCA), our findings show that WSI-based approaches can reliably predict patient-level TMB status, delineate spatial TMB heterogeneity and identify co-organization with TILs. TMB-H patients with low spatial heterogeneity enriched with high TILs showed improved overall survival. Furthermore, we evaluated our models using BLCA patients treated with immunotherapy from the real-world clinical setting. Our results indicate both prognostic and predictive roles for image-based TMB and TILs.

Assessing the Heterogeneity in Alzheimer’s Disease Progression Across Multiple Cohort Studies
COSI: TransMed
  • Holger Fröhlich, Fraunhofer SCAI, Germany
  • Colin Birkenbihl, Fraunhofer SCAI, Germany
  • Yasamin Salimi, Fraunhofer SCAI, Germany

Short Abstract: Clinical cohort study data often build the foundation for data-driven Alzheimer’s disease (AD) progression modeling. The employment of specific inclusion and exclusion criteria forms the distribution from which study participants are recruited and subsequently introduces a bias into the collected data. Therefore, it remains unclear whether patterns found in one dataset generalize beyond the discovery cohort and are reproducible in independent cohorts.
We used multi-state models (MSM) to mine AD progression patterns from six distinct cohort datasets. We trained a conceptually identical MSM on each dataset and compared the resulting progression signals. Furthermore, we propose a novel technique to cluster cohort datasets based on the similarity of their progression patterns.
Our study revealed significant differences in progression signals across cohorts. Investigation of the fitted models elucidated that they learned significantly different, cohort-specific parameters which bias their predictions and can impede model generalization.
Our results emphasize the need for external validation of data-driven results. Given the heterogeneity of AD cohort data, building a single progression model that serves all predictive purposes and is applicable to the entire population seems inconceivable. Instead, to eventually support clinical decision making, subpopulation-specific models that embrace the individual characteristics of a stratified target group seem more promising.

Assessing the role of Digital Device Technology in Alzheimer’s Disease using Artificial Intelligence
COSI: TransMed
  • Holger Fröhlich, Fraunhofer SCAI and University of Bonn, Germany
  • Meemansa Sood, Fraunhofer SCAI and University of Bonn, Germany
  • Mohamed Aborageh, Fraunhofer SCAI and University of Bonn, Germany
  • Robbert Harms, Altoida Inc.* 2100 West Loop South, Suite 1450, Houston, TX, 77027 - USA, United States
  • Maximilian Bügler, Altoida Inc.* 2100 West Loop South, Suite 1450, Houston, TX, 77027 - USA, United States
  • Ioannis Tarnanas, Altoida Inc.* 2100 West Loop South, Suite 1450, Houston, TX, 77027 - USA, United States

Short Abstract: In Alzheimer’s Disease (AD) the use of digital technologies has gained a lot of attention, because it may help to diagnose the disease in a pre-symptomatic stage. However, before any use in clinical routine, digital measures (DMs) need to be evaluated carefully by assessing their relationship to established clinical scores and understanding their diagnostic benefit. Along these lines, the IMI project RADAR-AD (www.radar-ad.org) evaluates a smartphone-based virtual reality game panel that can help to assess cognitive impairment. In our work we applied Variational Autoencoder Modular Bayesian Network (VAMBN) [1] to the virtual reality game data and analysed connections between DMs and cognitive assessments (e.g. Mini Mental State Examination). Based on our model we then predicted DMs within the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort. This resulted in a network that allowed us to disentangle and quantify the relationship between DMs, established clinical scores, brain volumes as well as molecular mechanisms. Therefore, DMs may have the potential to act as a vital measure in the diagnosis of AD in a pre-symptomatic stage.

[1] Gootjes-Dreesbach L, Sood M…..Fröhlich H (2020) Variational Autoencoder Modular Bayesian Networks for Simulation of Heterogeneous Clinical Study Data. Front. Big Data 3:16. doi: 10.3389/fdata.2020.00016

Bringing the Algorithms to the Data - Secure Distributed Medical Analytics using the Personal Health Train
COSI: TransMed
  • Oliver Kohlbacher, University of Tübingen, Germany
  • Marius Herr, University Hospital Tübingen & University of Tübingen, Germany
  • Lukas Zimmermann, University Hospital Tübingen & University of Tübingen, Germany
  • Michael Graf, University Hospital Tübingen, Germany
  • Peter Placzek, University Hospital Tübingen & University of Tübingen, Germany
  • Florian König, University of Tübingen, Germany
  • Mete Akgun, University Hospital Tübingen, Germany
  • Felix Boette, University Hospital Tübingen & University of Tübingen, Germany
  • Tyra Stickel, University Hospital Tübingen & University of Tübingen, Germany
  • Michael Slupina, University Hospital Tübingen, Germany
  • Stephanie Biergans, University Hospital Tübingen, Germany
  • Nico Pfeifer, University of Tübingen & Max Planck Institute for Informatics, Saarbrücken, Germany

Short Abstract: Data protection laws force hospitals to create data silos, making it difficult to apply machine learning and artificial intelligence methods across distributed datasets. The Personal Health Train is a paradigm proposed within the GO-FAIR initiative to utilize these methods and improve personalized medicine by enabling the learning of more robust models in a distributed setting. Our deployment-ready and open-source Personal Health Train architecture (Figure 1) enables the execution of arbitrarily complex analysis pipelines with a strong focus on security. Without transferring the data to a central analysis site, container technologies allow the user to run a wide range of algorithms iteratively among participating hospitals. Our framework is dynamically extensible to accommodate the particular needs of researchers and hospitals. After deployment of a station at a hospital, no further installation steps are required. A hospital never gives up control over its data and can independently decide to join in an analysis. We demonstrate our framework's capabilities for raw genomic analysis, including homomorphically encrypted count queries and deep neural networks applied to image data (Figures 2, 3).

Building a Platform for Precision Animal Model Selection and Drug Repurposing Prioritization
COSI: TransMed
  • Matthew Might, UAB SOM, United States
  • Vishal Oza, UAB SOM, United States
  • Brittany Lasseigne, UAB SOM, United States
  • Elizabeth Ramsey, UAB SOM, United States
  • Brandon Wilk, UAB SOM, United States
  • Angelina Uno-Antonison, UAB SOM, United States
  • Rabab Fatima, UAB SOM, United States
  • Donna Brown, UAB SOM, United States
  • Members Of The Uab Center For Precision Animal Modeling, UAB SOM, United States
  • Bradley Yoder, UAB SOM, United States
  • Elizabeth Worthey, UAB SOM, United States

Short Abstract: Genome-driven precision medicine seeks to improve diagnosis and pair the right patient with the right drug at the right time. However, understanding and interpreting the cellular impact of molecular variation for both purposes remains a major challenge. The UAB Center for Precision Animal Modeling (C-PAM) aims to efficiently analyze and model such disease-associated variants through the development and application of computational approaches and by developing preclinical models. As a component of this Center, the C-PAM Bioinformatics Section (BIS) has integrated computational biology and data science methods to generate an analytical suite that supports the review, prioritization, interpretation, and selection of variants for model organism studies. We have shown that our BIS deep learning-based methods automate the annotation, classification, and prioritization of variants and compare favorably to expert geneticist interpretation. Additionally, we are developing and applying ensemble machine learning methods with rank-based prioritization of novel targets and repurposed drugs to enhance downstream therapeutic testing. Here we describe our platform and methodology, present findings from proof of principle studies, and discuss diagnoses and drug prioritization for C-PAM cases through rare, misdiagnosed, or undiagnosed disease program cases at UAB and collaborating institutions.

Comparative Analysis of HPV-Human Protein Interaction Networks in Oro-pharyngeal and Oral Squamous Cell Carcinomas
COSI: TransMed
  • Faisal F. Khan, Precision Medicine Lab, Peshawar, Pakistan
  • Habiba Faiz, Precision Medicine Lab, Peshawar, Pakistan

Short Abstract: Oropharyngeal squamous cell carcinoma (OPSCC) and oral squamous cell carcinoma (OSCC) cases are rising worldwide, specifically among young individuals in the South Asian region, with risk factors including use of tobacco, gutka, alcohol and human papilloma virus (HPV) infections. HPV, specifically the subtype HPV16, is an established risk factor in the initiation and progression of OPSCC, compared to OSCC. The study aimed to carry out a comparative analysis of protein-protein interactions (PPI) between HPV and human proteins, particularly those encoded by key cancer genes in OPSCC and OSCC. We constructed an HPV-Human PPI network using interaction data from BioGRID. Next, we identified cancer genes for OPSCC and OSCC using data retrieved from key databases (NIH GDC, COSMIC and cBioPortal) and conducted network analysis along with enrichment analysis of gene ontology annotations and KEGG Pathway labels. Six human proteins (PML, JAK1, CDK4, RB1, ZBTB16 and MYC) were identified as unique to HPV interactions in OPSCC. Two proteins (SMAD4, TRAF3) interacted with HPV proteins only in OSCC. Furthermore, three proteins (CREBBP, EP300, TP53) had HPV-Human protein interactions common to the two cancers. Our next step is to experimentally validate the putative role of these proteins in the initiation and/or progression of OPSCC.

Computational classification of pathological stages of Head and Neck Squamous Cell Carcinoma using DNA methylation
COSI: TransMed
  • Faisal Khan, CECOS University, Pakistan
  • Maryam Shah, Precision Medicine Lab, RMI, Pakistan
  • Arsalan Riaz, CECOS University, Pakistan

Short Abstract: Head and neck cancer is the sixth most common cancer across the globe and is prevalent in South Asian countries. Prediction of pathological stages of cancer can play a pivotal role in early diagnosis and personalized medicine. This project ventures into the prediction of different stages of head and neck squamous cell carcinoma (HNSCC) using prioritized DNA methylation patterns. DNA methylation profiles for each HNSCC stage (I-IV) were used to analyze 485,577 methylation CpG sites and prioritize them on the basis of their predictive power using a wrapper-based feature selection method, along with different classification models. We identified 68 methylation sites which predicted the pathological stage of HNSCC samples with 90.62% accuracy using a Random Forest classifier. We set out to construct a PPI network for the proteins encoded by the 67 genes associated with these sites to study its network topology and also undertook enrichment analysis of nodes in their immediate neighborhood for GO and KEGG Pathway annotations, which revealed their role in cancer-related pathways, cell differentiation, signal transduction, metabolic and biosynthetic processes. With information on the predictive power of each of the 67 genes in each HNSCC stage, we unveil a dynamic stage-course network for HNSCC.

Discovering Biomarkers for Pediatric Acute Myeloid Leukemia Using single-cell RNA sequencing
COSI: TransMed
  • Robert Schauner, Case Western Reserve University, United States
  • Zachary Jackson, Case Western Reserve University, United States
  • Nethrie Idippily, Case Western Reserve University, United States
  • Grace Lee, Case Western Reserve University, United States
  • Sheela Karunanithi, Case Western Reserve University, United States
  • Shivaprassad Manjappa, University Hospitals, United States
  • Tae Hyun Hwang, Cleveland Clinic, United States
  • David Wald, Case Western Reserve University, United States

Short Abstract: Pediatric Acute Myeloid Leukemia (AML) with a FLT3 internal tandem duplication mutation (FLT3-ITD) is a challenging disease due to poor outcomes in many patients. The 4-year progression free survival is still only 31%. Current biomarkers are insufficient to predict why certain patients with FLT3-ITD AML relapse and others do not. The development of prognostic biomarkers in FLT3-ITD pediatric AML may help improve the outcomes and management of these patients. In order to develop new biomarkers and identify therapeutic targets, we performed single cell RNA sequencing on a panel of 37 diagnostic samples and 18 paired diagnostic/relapse samples from FLT3-ITD patients who did not relapse and who did relapse, respectively. Using this RNA sequencing dataset comprising over 250k single cells, we first investigated whether the frequency of specific clusters of AML cells may help predict patient outcome. We found several clusters that were significantly different between patients that did and did not relapse. Using these clusters, we deconvoluted publicly available RNAseq data to build prediction models using scRNAseq clusters. We found these prediction models to be highly specific as a prognostic biomarker for pediatric AML (p << 0.01) and better than models trained using genes from the publicly available RNAseq data.

Drug Response Prediction in Cancer using Deep Learning
COSI: TransMed
  • Durdam Das, Fraunhofer ITEM Regensburg, Mathematical Disease Modelling, Division of Personalized Tumor Therapy, Germany
  • Christoph A. Klein, Experimental Medicine and Therapy Research, University of Regensburg | Fraunhofer ITEM, Germany
  • Martin Hoffmann, Fraunhofer ITEM Regensburg, Mathematical Disease Modelling, Division of Personalized Tumor Therapy, Germany

Short Abstract: Predicting the sensitivity of tumor cells to specific anti-cancer therapy is a task of paramount importance for precision medicine. Several research groups have approached this task in the past decade by integrating multi-omics data with machine learning. Deep learning has achieved high-level performance compared to other methods. The relative half-maximal inhibitory concentration has been most commonly used to predict drug response in the literature. Here, we target other drug response metrics like Maximal Effect and Area Under the Curve, which are more informative in distinguishing between effective and ineffective drugs. Integrating large-scale multi-omics data from different sources is especially challenging due to varying experimental conditions resulting in significant inconsistencies. We addressed this problem by homogenizing data generated from various experiments on cancer cell lines and integrated them to be modeled by deep neural networks. We evaluated single cancer – single drug feed-forward neural networks using the gene expression data and achieved a correlation coefficient of 91%. We will also include mutation and copy number variation data and assess model performance. The final aim is to apply neural networks to rare disseminated tumor cells via transfer learning.

EORTC-SPECTA/Arcagen: Molecular characterisation and treatment in patients with rare cancers from retrospective feasibility cohort
COSI: TransMed
  • Aleksandra Stevovic, EORTC, Belgium
  • Marie Morfouace, EORTC, Belgium
  • Vassilis Golfinopoulos, EORTC, Belgium
  • Dexter Jin, Foundation Medicine Inc, United States
  • Oliver Holmes, Foundation Medicine Inc, United States
  • Rachel Erlich, Foundation Medicine Inc, United States
  • Jerome Fayette, Centre Leon Berard, France
  • Sabrina Croce, Department of Biopathology, Institut Bergonie, France
  • Isabelle Ray-Coquard, Centre Leon Berard, France
  • Nicolas Girard, Institut Curie, France
  • Jean-Yves Blay, Department of Medical Oncology, Center Léon Bérard, France

Short Abstract: Background
Rare cancers are diagnosed in fewer than 6 out of 100,000 cancer patients per year. Their molecular characteristics and treatment options are still not well defined. ARCAGEN aims to explore clinical and molecular information of patients with rare cancers across Europe.
Methods
Tumor samples and clinical data were collected and successfully screened for 77 patients with rare cancers: 41 sarcomas, 9 yolk sac tumors, 14 rare head and neck cancers, and 13 thymomas. Molecular analysis was performed using FoundationOne Heme for sarcomas and the FoundationOne CDx assay for other histologies. Findings were compared to the Foundation Medicine dataset for common cancers.
Results
Most patients had some genomic alterations (89%), mostly in genes that regulate the cell cycle (TP53, RB1, CDKN2A/B, MDM2), as well as in the RAS/RAF family. Directly actionable mutations for which there is a treatment approved in Europe within the patient’s tumor type were detected in 4 cases (4.7%), whereas such actionable mutations were reported in 58% of samples from common cancers. Moreover, fewer cases with no treatment recommendation were present in common cancers than in the Arcagen retrospective cohort (9% vs 51.8%). This highlights the need for new studies that focus on molecular analysis, biomarker discovery and treatment in rare cancers.

Formulating a Gene Signature for Diagnosis of Autoimmune and Infectious Diseases
COSI: TransMed
  • Riya Gupta, Center for Biomedical Informatics Research, Department of Medicine, Stanford University, United States
  • Aditya Rao, Immunology Graduate Program, Department of Medicine, Stanford University, United States
  • Lara Murphy Jones, Division of Critical Care Medicine, Department of Pediatrics, School of Medicine, Stanford University, United States
  • Purvesh Khatri, Center for Biomedical Informatics Research, Department of Medicine, Stanford University, United States

Short Abstract: When patients with an underlying autoimmune condition such as juvenile idiopathic arthritis or lupus report life-threatening symptoms, physicians need to quickly determine whether these symptoms are caused by an acute infection or a complication of their autoimmune condition. As immunosuppressive drugs are harmful to someone undergoing an infection, accurate and timely diagnosis is critical. In recent years, host-response-based diagnostics have shown promise in accurately and non-invasively diagnosing a number of infectious and autoimmune diseases.
Here, we collected and curated blood transcriptome profiles of 14,587 patients from 42 countries across 122 independent datasets and grouped them into infectious, autoimmune, and healthy control categories. Using a novel statistical framework, we created two gene signatures from this data: one to differentiate patients with autoimmune or infectious diseases from healthy individuals and another to differentiate between patients with autoimmune or infectious diseases. Both signatures achieve an area under the receiver operating characteristic curve (AUROC) of >0.87 on completely independent datasets. Because our training and testing data included heterogeneity across many factors, these gene signatures can be utilized in diverse clinical populations. Furthermore, these signatures can aid physicians across a broad range of clinical scenarios, where existing diagnostics are invasive, expensive, or non-specific.

Gene expression profiling reveals distinct molecular subgroups of T-cell prolymphocytic leukemia
COSI: TransMed
  • Nathan Mikhaylenko, Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, TU Dresden, Dresden, Germany
  • Linus Wahnschaffe, Dep. I of Internal Medicine, Center for Integrated Oncology Aachen-Bonn-Cologne-Duesseldorf, University of Cologne, Germany
  • Marco Herling, Klinik und Poliklinik für Hämatologie, Zelltherapie und Hämostaseologie, Universitätsklinikum Leipzig, Leipzig, Germany
  • Ingo Roeder, Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, TU Dresden, Dresden, Germany
  • Michael Seifert, Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, TU Dresden, Dresden, Germany

Short Abstract: T-cell prolymphocytic leukemia (T-PLL) is a rare cancer with poor overall survival. T-PLL diagnosis criteria consider the presence of clonal prolymphocytic T-cells in combination with increased white blood cell counts and complex chromosomal aberrations. The proto-oncogene TCL1A and tumor suppressor ATM are putative drivers of T-PLL development. Despite an improved understanding of complex molecular alterations, little is known about the existence of T-PLL subtypes. We performed an analysis of gene expression profiles of 70 T-PLL patients by hierarchical clustering, revealing three robust T-PLL subgroups. These subgroups did not show strongly significant differences in survival, but patients of the subgroup that co-clustered with normal references had the worst chances. Further analyses revealed several similarities of the subgroups at the level of individual genes, signaling and metabolic pathways and alterations of gene regulatory networks, but each subgroup also had its specific molecular characteristics. These differences were mainly reflected at the gene expression level, whereas gene copy number profiles of the subgroups were much more similar to each other except for a few characteristic differences. In particular, major regulators identified by our network approach could potentially contribute to future developments of improved T-PLL stratification systems and design of targeted treatment strategies.

GVViZ: A Bioinformatics Platform for Variable Gene-Disease Annotation, Visualization, and Expression Analysis
COSI: TransMed
  • Zeeshan Ahmed, Rutgers Institute for Health, Health Care Policy and Aging Research, and Rutgers Robert Wood Johnson Medical School, United States
  • Eduard Renart, Rutgers Institute for Health, Health Care Policy and Aging Research, United States
  • Saman Zeeshan, Rutgers Cancer Institute of New Jersey, United States
  • Xinqi Dong, Rutgers Institute for Health, Health Care Policy and Aging Research, United States

Short Abstract: Investigating disease-causing and highly expressed genes can support finding the root causes of uncertainties in patient care. However, independent and timely analysis of high-throughput next-generation sequencing data is still a challenge for non-computational biologists and geneticists. Here, we present GVViZ, a robust, user-friendly, cross-platform bioinformatics and database application for RNA-seq-driven variable and complex gene-disease data annotation and expression analysis with dynamic heat map visualization. GVViZ has the potential to find patterns across millions of features and extract actionable information, which can support early detection of complex disorders and the development of new therapies for personalized patient care. The execution of GVViZ is based on a set of simple instructions that users without a computational background can follow to design and perform customized data analysis. It can assimilate a patient’s transcriptomics data with public, proprietary, and our in-house developed gene-disease databases to query, easily explore, and access information on gene annotation and disease phenotypes with greater visibility and customization. Experts can use GVViZ to visualize and interpret transcriptomics data, making it a powerful tool to study the dynamics of gene expression and regulation.

HISTOPATHOLOGICAL IMAGE ANALYSIS FOR ORAL SQUAMOUS CELL CARCINOMA CLASSIFICATION USING CONCATENATED DEEP LEARNING MODELS
COSI: TransMed
  • Faisal Khan, Institute of Integrative Biosciences, CECOS University, Phase VI, Hayatabad, Peshawar, Pakistan, Pakistan
  • Ibrar Amin, Precision Medicine Lab, Rehman Medical Institute, Phase V, Hayatabad, Peshawar, Pakistan, Pakistan
  • Hina Zamir, Precision Medicine Lab, Rehman Medical Institute, Phase V, Hayatabad, Peshawar, Pakistan, Pakistan

Short Abstract: Oral squamous cell carcinoma (OSCC) is a subset of head and neck cancer (HNSCC), the seventh most common cancer worldwide, and accounts for more than ninety percent of oral malignancies. Early detection of OSCC is essential for effective treatment and reducing the mortality rate. However, the gold standard method of microscopy-based histopathological investigation is often challenging, time-consuming and relies on human expertise. Automated analysis of oral biopsy images can aid histopathologists in performing a rapid and arguably more accurate diagnosis of OSCC. In this study, we present deep learning (DL) based automated classification of 290 normal and 934 cancerous oral histopathological images published by Tabassum et al. (Data in Brief, 2020). We utilized a transfer learning approach by adapting three pre-trained DL models to OSCC detection. VGG16, InceptionV3, and ResNet50 were fine-tuned individually and then used in concatenation as feature extractors. The concatenated model outperformed the individual models and achieved 96.66% accuracy (95.16% precision, 98.33% recall, and 95.00% specificity) compared to 89.16% (VGG16), 94.16% (InceptionV3) and 90.83% (ResNet50). These results demonstrate that the concatenated model can effectively replace the use of a single DL architecture.
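To make the relationship between the four reported metrics concrete, the sketch below derives them from a confusion matrix. The abstract does not state the test-set size or the confusion matrix, so the counts used here (59 true positives, 1 false negative, 57 true negatives, 3 false positives) are purely hypothetical; they were chosen only because they reproduce the reported percentages to within rounding.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts (not from the abstract): a balanced 60 + 60 image test set.
metrics = classification_metrics(tp=59, fn=1, tn=57, fp=3)
percentages = {name: round(100 * value, 2) for name, value in metrics.items()}
```

With these assumed counts, accuracy rounds to 96.67%, which matches the reported 96.66% only if that figure was truncated rather than rounded.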

Identification of antineoplastic agents for oral squamous cell carcinoma
COSI: TransMed
  • Faisal Khan, CECOS University, Pakistan
  • Abdus Salam, Rehman Medical Institute, Pakistan
  • Abdus Salam, Precision Medicine Lab, Pakistan

Short Abstract: Oral squamous cell carcinoma (OSCC) is the most common malignant epithelial neoplasm of the head and neck region in South Asian countries, with a 5-year survival rate between 20% and 50%. In this study, we performed a meta-analysis of five gene expression datasets (GSE23558, GSE25099, GSE30784, GSE37991 and TCGA-OSCC) that produced 1851 statistically significant differentially expressed genes (DEGs) in OSCC. The DEGs were involved in key biological pathways that putatively drive the progression of OSCC. A comprehensive protein-protein interaction (PPI) network was constructed for proteins encoded by the DEGs to study the topology, including hubs and top modules, using Cytoscape. Next, 125 DEGs from top modules were mapped to antineoplastic agents using the L1000CDS2 server. We found 37 perturbing agents, of which 12 FDA-approved antineoplastic agents (Teniposide, Palbociclib, Etoposide, Fedratinib, Tivozanib, Afatinib, Vemurafenib, Mitoxantrone, Idamycin, Canertinib, Dovitinib and Selumetinib) that showed interactions with over-expressed DEGs were selected for further study. The candidate antineoplastic agents are now being taken into an in vitro drug screen against a library of primary cell lines obtained from tumours of OSCC patients of local, South Asian origin.

Identification of transcriptional network disruptions in drug-resistant prostate cancer with TraRe
COSI: TransMed
  • Charles Blatti, University of Illinois at Urbana-Champaign, Urbana, IL, United States
  • Jesus De la Fuente Cedeño, TECNUN, University of Navarra, Spain., Spain
  • Huanyao Gao, College of Medicine, Mayo Clinic, USA, United States
  • Irene Marin, Computational Biology Program, CIMA University of Navarra, Spain., Spain
  • Zikun Chen, University of Illinois at Urbana-Champaign, Urbana, IL, United States
  • Sihai Dave Zhao, University of Illinois at Urbana-Champaign, Urbana, IL, United States
  • Weinshilboum Richard, College of Medicine, Mayo Clinic, USA., United States
  • Krishna R Kalari, College of Medicine, Mayo Clinic, USA, United States
  • Liewei Wang, College of Medicine, Mayo Clinic, USA., United States
  • Mikel Hernaez, Computational Biology Program, CIMA University of Navarra, Spain., Spain

Short Abstract: The identification of significant changes in Gene Regulatory Networks (GRNs) under different response groups can help discover novel molecular diagnostics and prognostic signatures.

In this work, we present a computational method, TraRe, which combines unsupervised learning and non-parametric testing to mechanistically understand how transcription networks are differentially regulated.

We applied TraRe to RNA-seq data of metastatic Castration-Resistant Prostate Cancer (CRPC) patients from the PROMOTE clinical study (NCT 01953640) treated with abiraterone (ABI). Rewired GRNs between ABI responders and non-responders were found to be enriched in genes down-regulated in prostate cancer samples, as well as in transcription factors (TFs) that are involved in the androgen receptor signaling pathway or associated with other cancers. Furthermore, MDX1, a TF that acts as a transcriptional repressor and is a candidate tumor suppressor, is among the top rewiring-specific TFs.

Key rewired TF-target relationships were validated in vitro via qRT-PCR. After knock-down of the top TFs, expression levels of four key genes were significantly changed between parent cell lines and ABI-resistant cell lines.

TraRe efficiently uncovers GRNs from high-throughput sequencing data, performing differential network analysis that unravels phenotype-specific regulatory disruptions.

Identifying Dysfunctional Mechanisms of Pancreas-residing T-cells in Islet Autoimmunity Through Single-cell Immune Profiling
COSI: TransMed
  • Mohammad Lotfollahi, Institute of Computational Biology, Helmholtz Zentrum München, Germany
  • Juan Henao, Institute of Computational Biology, Helmholtz Zentrum München, Germany
  • Marius Lange, Institute of Computational Biology, Helmholtz Zentrum München, Germany
  • Isabelle Serr, Group Immune Tolerance in Type 1 Diabetes, Helmholtz Zentrum München, Germany
  • Michael Sterr, Institute of Diabetes and Regeneration Research, Helmholtz Zentrum München, Germany
  • Julius Wiener, Helmholtz Pioneer Campus, Helmholtz Zentrum München, Germany
  • Thomas Walzthoeni, Institute of Computational Biology, Helmholtz Zentrum München, Germany
  • Matthias Meier, Helmholtz Pioneer Campus, Helmholtz Zentrum München, Germany
  • Carolin Daniel, Division of Clinical Pharmacology, Ludwig-Maximilians-Universität München, Germany
  • Benjamin Schubert, Department of Mathematics, Technische Universität München, Germany

Short Abstract: Type 1 diabetes (T1D) is an autoimmune disease characterized by the progressive loss of pancreatic beta-cells due to dysregulation of pancreatic T-cells. However, the exact mechanisms and interplay of T-cell subpopulations remain elusive.

We present a multi-omics immune profiling study to characterize and compare pancreatic T-cell populations of four healthy non-autoimmune prone (BALB/c) and non-obese diabetic (NOD) mice with islet autoimmunity at a single-cell level to discover T1D-driving mechanisms. We identified nine T-cell subtypes including a CD4+Foxp3+CD25lowHELIOS+ cluster involved in autoimmunity regulation, with downregulated Hspa8 and Lag3 in NOD, potentially explaining their dysfunctionality in T1D. We also identified an effector CD8+ cluster with a high level of clonal expansion and gene expression related to T1D according to overenrichment analysis. Overall, we observed a cell-type compositional shift in all CD4+ and CD8+ related clusters towards a significant increase in the number of cells in samples with islet autoimmunity. Specifically, we detected a dramatic increase in the CD8+ effector cluster in comparison to regulatory T-cells, pointing to a possible compositional imbalance as a relevant factor behind T1D development.

Our preliminary results provide first insights into the T-cell imbalance associated with T1D development and dysfunctional regulation of different lymphocytic cell populations in the pancreas.

Independent component analysis of gene expression recapitulates histopathological properties of pancreatic tumors
COSI: TransMed
  • Sang-Yoon Kim, Luxembourg Institute of Health, Luxembourg
  • Michel Mittelbronn, Laboratoire national de santé, Luxembourg
  • Petr Nazarov, Luxembourg Institute of Health, Luxembourg
  • Aliaksandra Kakoichankava, Vitebsk State Medical University, Belarus

Short Abstract: Pancreatic cancers (PCs) are among the most deadly solid-tumor cancers and often do not show specific symptoms. Because of late diagnosis, only 10% of patients survive beyond 5 years. We applied a data deconvolution method based on consensus independent component analysis (ICA) to transcriptomes of 183 pancreatic tumors from TCGA. By mapping the tumors into the space defined by independent components, it is possible to disentangle the activity of various biological processes, show potential technical effects or make predictions about the abundance of specific cells. Previously we reported components specific to normal pancreas activity (secretion), stroma (neoangiogenesis), infiltrated immune and tumor cells (cell cycle, hypoxia, keratinization, etc.). Interestingly, we also observed a strong linkage between component weights and visual features of the corresponding hematoxylin/eosin-stained slides. For example, tumor tissues from samples with a strong cell cycle component also showed a high degree of pleomorphism, higher cell density and more mitoses. In contrast, samples with weak cell cycle and strong secretion-related components were almost normal histologically, with clear ductal structures and few or no mitotic cells. Therefore, ICA of transcriptomic data from PC patients recapitulates the histopathological properties. We are currently developing a method that would link deconvolved molecular profiles with histological features.

Integrated multi-omics and longitudinal analysis of early breast milk constituents
COSI: TransMed
  • Etienne Thévenot, CEA, France
  • Camilo Broc, CEA, France
  • Eric Venot, INRAE, France
  • Blanche Guillon, CEA, France
  • Florence Castelli, CEA, France
  • Benoit Colsch, CEA, France
  • Mikaïl Berdi, INRAE, France
  • François Fenaille, CEA, France
  • Karine Adel-Patient, INRAE, France
  • Blandine De Lauzon-Guillain, INRAE, France

Short Abstract: Although the impact of breastfeeding is well known, the post-partum evolution of human breast milk constituents remains poorly understood. To gain new insights using cutting-edge acquisition techniques, we aim to combine data obtained from four different molecular families assayed in early breast milk (EBM) and analysed with suitable multi-omics statistical tools.
Milk samples (n=257) were collected from days 2 to 6 within the EDEN mother-child cohort (Berdi et al., 2019). Untargeted analyses of human milk oligosaccharides (HMOs), lipids and metabolites were performed, while targeted analysis of numerous cytokines, growth factors and antibodies was carried out. After correcting for a strong effect of the collection center, single-omic data analysis was performed to assess the evolution of each molecular family independently (univariate tests, PLS-DA). Multi-block approaches (multi-block PLS-DA and WGCNA) were used to highlight potential associations between different types of variables.
We found that HMOs, lipids and metabolites have stronger temporal variations than cytokines. Interestingly, multi-block methods infer associations between families of molecules, notably between cytokines and specific metabolites.
Combining various omics approaches provided an unprecedentedly exhaustive view of the biochemical composition of breast milk. The further association of global milk composition with mother exposure or with infant health outcomes could lead to establishing relevant biomarkers.

Multimodal analysis of cell-free DNA whole genome sequencing for pediatric cancers with low mutational burden
COSI: TransMed
  • Peter Peneder, St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria, Austria
  • Adrian Stütz, St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria, Austria
  • Christoph Bock, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria, Austria
  • Eleni M. Tomazou, St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria, Austria

Short Abstract: Sequencing of cell-free DNA in the blood of cancer patients (liquid biopsy) provides attractive opportunities for early diagnosis, assessment of treatment response, and minimally invasive disease monitoring. To unlock liquid biopsy analysis for pediatric tumors with few genetic aberrations, we introduce an integrated genetic/epigenetic analysis method and demonstrate its utility on 241 deep whole genome sequencing profiles of 95 patients with Ewing sarcoma and 31 patients with other pediatric sarcomas. Our method achieves sensitive detection and classification of circulating tumor DNA in peripheral blood independent of any genetic alterations. Moreover, we benchmark different metrics for cell-free DNA fragmentation analysis, and we introduce the LIQUORICE algorithm for detecting circulating tumor DNA based on cancer-specific chromatin signatures. Finally, we combine several fragmentation-based metrics into an integrated machine learning classifier for liquid biopsy analysis that is tailored to cancers with low mutation rates while exploiting widespread epigenetic deregulation. Clinical associations highlight the potential value of cfDNA fragmentation patterns as prognostic biomarkers in Ewing sarcoma. In summary, our study provides a comprehensive analysis of circulating tumor DNA beyond recurrent somatic mutations, and it renders the benefits of liquid biopsy more readily accessible for childhood cancers.

NeDRex - an integrative and interactive network medicine platform for drug repurposing
COSI: TransMed
  • Sepideh Sadegh, Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Germany
  • James Skelton, School of Computing, Newcastle University, United Kingdom
  • David B. Blumenthal, Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Germany
  • Elisa Anastasi, School of Computing, Newcastle University, United Kingdom
  • Gihanna Galindez, Division Data Science in Biomedicine, PLRI, TU Braunschweig and Hannover Medical School, Germany
  • Anil Wipat, School of Computing, Newcastle University, United Kingdom
  • Tim Kacprowski, Division Data Science in Biomedicine, PLRI, TU Braunschweig and Hannover Medical School, Germany
  • Jan Baumbach, Chair of Computational Systems Biology, University of Hamburg, Germany

Short Abstract: Traditional drug discovery faces a severe efficacy crisis. Repurposing of registered drugs provides an alternative with lower costs, reduced risk, and faster clinical application. The underlying mechanisms of complex diseases are best described by disease modules. These modules represent disease-relevant pathways and contain potential drug targets which can be identified in silico with network-based methods. The data necessary for the identification of disease modules and network-based drug repurposing are scattered across independent databases; moreover, existing studies have been limited to predictions for specific diseases or non-translational algorithmic approaches. Hence, there is an unmet need for adaptable tools allowing biomedical researchers to employ network-based drug repurposing approaches for their specific use cases. We close this gap with NeDRex, an integrative and interactive platform for network-based drug repurposing. NeDRex integrates different data sources covering genes, proteins, drugs, drug targets, disease annotations, and their relationships, resulting in a network with 350,142 nodes and 14,127,004 edges. NeDRex allows for constructing heterogeneous biological networks, mining them for disease modules, and prioritizing drugs targeting disease mechanisms. NeDRex generalizes the approach implemented in our previous work for COVID-19 drug repurposing, CoVex (doi.org/10.1038/s41467-020-17189-2), to be applicable to other diseases.

Personalized medicine as a strategy to address combinatorial diseases
COSI: TransMed
  • Maryam Nazarieh, NA, Germany

Short Abstract: Cell proliferation, differentiation, and apoptosis are three main biological processes that can collaboratively cause tissue abnormality leading to cancer (Evan & Vousden, 2001). This fact can be very useful in the case of metastasis, in which tumor growth affects other organs. It indicates that targeting the common genes involved in these processes can counter metastasis (Nazarieh & Helms, 2019). However, carrying multiple diseases is not restricted to metastasis; for example, a patient can suffer from colon cancer and diabetes simultaneously.
The need for personalized medicine arises because different types of biological factors act together to produce a specific condition in which disease-specific drugs may not work for a patient suffering from several diseases. Therefore, building a model that integrates multiple factors and weighs them proportionally becomes imperative. Here, I propose a pipeline comprising six stages to address the patient-specific condition, leading to a more accurate diagnosis and treatment: image processing for disease diagnosis and model evaluation; single-cell data analysis to identify, e.g., cancer stem cells; multiomics data integration to enhance the model, e.g. in the case of edge prediction; personalized network modelling; identification of personalized biomarkers; and identification of disease-causing mutations.

Ranking Cancer Drivers via Betweenness-based Outlier Detection and Random Walks
COSI: TransMed
  • Cesim Erten, Antalya Bilim University, Turkey
  • Aissa Houdjedj, Antalya Bilim University, Turkey
  • Hilal Kazan, Antalya Bilim University, Turkey

Short Abstract: Background: Recent cancer genomic studies have generated detailed molecular data on a large number of cancer patients. A key remaining problem in cancer genomics is the identification of driver genes.
Results: We propose BetweenNet, a computational approach that integrates genomic data with a protein-protein interaction network to identify cancer driver genes. BetweenNet utilizes a measure based on betweenness centrality on patient-specific networks to identify the so-called outlier genes that correspond to dysregulated genes for each patient. Setting up the relationship between the mutated genes and the outliers through a bipartite graph, it employs a random-walk process on the graph, which provides the final prioritization of the mutated genes. We compare BetweenNet against state-of-the-art cancer gene prioritization methods on lung, breast, and pan-cancer datasets.
Conclusions: Our evaluations show that BetweenNet is better at recovering known cancer genes based on multiple reference databases. Additionally, we show that the GO terms and the reference pathways enriched in BetweenNet-ranked genes and those that are enriched in known cancer genes overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods.
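The ranking step described above can be illustrated with a random walk with restart on a small bipartite graph. This is a minimal sketch and not the BetweenNet implementation: the graph, the node names (M1 to M3 for mutated genes, O1 to O3 for outlier genes), the restart probability `alpha`, and the uniform restart distribution are all assumptions, and the betweenness-based outlier detection is taken as already done.

```python
def random_walk_with_restart(adjacency, restart_nodes, alpha=0.5, tol=1e-12, max_iter=1000):
    """Stationary visiting probabilities of a walk that restarts with probability alpha."""
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(restart_nodes) if n in restart_nodes else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: alpha * p0[n] for n in nodes}
        for u in nodes:
            # Spread the non-restarting mass of u evenly over its neighbours.
            share = (1.0 - alpha) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical bipartite edges between mutated genes (M*) and outlier genes (O*).
edges = [("M1", "O1"), ("M1", "O2"), ("M1", "O3"), ("M2", "O2"), ("M2", "O3"), ("M3", "O3")]
adjacency = {}
for m, o in edges:
    adjacency.setdefault(m, []).append(o)
    adjacency.setdefault(o, []).append(m)

mutated = {"M1", "M2", "M3"}
probabilities = random_walk_with_restart(adjacency, mutated)
ranking = sorted(mutated, key=lambda g: probabilities[g], reverse=True)
```

In this toy graph every node has at least one neighbour, so no special handling of isolated nodes is needed; a mutated gene linked to more outlier genes receives more returning probability mass and ranks higher.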

Reconciling Multiple Connectivity Scores for Drug Repurposing
COSI: TransMed
  • Kewalin Samart, Michigan State University, United States
  • Phoebe Tuyishime, Michigan State University, United States
  • Stephanie Hickey, Michigan State University, United States
  • Arjun Krishnan, Michigan State University, United States
  • Janani Ravi, Michigan State University, United States

Short Abstract: The key principle of recent drug repurposing methods is that an efficacious drug will reverse the disease molecular ‘signature’ with minimal side-effects. This principle was defined and popularized by the influential ‘connectivity map’ study in 2006 regarding reversal relationships between disease- and drug-induced gene expression profiles, quantified by a disease-drug ‘connectivity score.’ Over the past 15 years, several studies have proposed variations in calculating connectivity scores towards improving accuracy and robustness in light of massive growth in reference drug profiles. However, these variations have been formulated inconsistently using various notations and terminologies even though various scores are based on a common set of conceptual and statistical ideas. Here, we present a systematic reconciliation of multiple disease-drug similarity metrics and connectivity scores by defining them using consistent notation and terminology. In addition to providing clarity and deeper insights, this coherent definition of connectivity scores and their relationships provides a unified scheme that newer methods can adopt, enabling the computational drug-development community to compare and investigate different approaches easily. This resource will be available as a live document (jravilab.github.io/connectivity_scores) coupled with a GitHub repository (github.com/JRaviLab/connectivity_scores) to facilitate the continuous and transparent integration of newer methods.
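One simple way to quantify the reversal relationship described above is a rank correlation between the disease and drug expression signatures over their shared genes, where a strongly negative value suggests reversal. The sketch below is only one illustrative variant and is not claimed to match any of the scores reconciled by the authors; the gene names and expression changes are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def connectivity_score(disease, drug):
    """Spearman correlation of two signatures over shared genes (negative = reversal)."""
    genes = sorted(set(disease) & set(drug))
    if len(genes) < 3:
        raise ValueError("need at least three shared genes")
    rx = average_ranks([disease[g] for g in genes])
    ry = average_ranks([drug[g] for g in genes])
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("a signature with constant values has no defined correlation")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical log fold changes per gene.
disease = {"G1": 2.0, "G2": 1.0, "G3": -0.5, "G4": -1.5}
drug_a = {"G1": -1.8, "G2": -0.9, "G3": 0.4, "G4": 1.2}             # opposes the disease signature
drug_b = {"G1": 1.5, "G2": 0.7, "G3": -0.2, "G4": -1.0, "G5": 0.3}  # mimics it; G5 is ignored
score_a = connectivity_score(disease, drug_a)
score_b = connectivity_score(disease, drug_b)
```

Under this convention drug_a would be the repurposing candidate, since its signature is anti-correlated with the disease signature.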

Segmentation, grading, and intratumor heterogeneity analysis of lung adenocarcinomas by a convolutional neural network
COSI: TransMed
  • John Lockhart, Moffitt Cancer Center, United States
  • Hayley Ackerman, Moffitt Cancer Center, United States
  • Kyubum Lee, Moffitt Cancer Center, United States
  • Mahmoud Abdalah, Moffitt Cancer Center, United States
  • Andrew Davis, Moffitt Cancer Center, United States
  • Nicole Hackel, Moffitt Cancer Center, United States
  • Theresa Boyle, Moffitt Cancer Center, United States
  • James Saller, Moffitt Cancer Center, United States
  • Aysenur Keske, Moffitt Cancer Center, United States
  • Kay Hanggi, Moffitt Cancer Center, United States
  • Brian Ruffell, Moffitt Cancer Center, United States
  • Olya Stringfield, Moffitt Cancer Center, United States
  • Aik Choon Tan, Moffitt Cancer Center, United States
  • Elsa Flores, Moffitt Cancer Center, United States

Short Abstract: Analysis by a clinical pathologist is the gold standard for preclinical histological analysis but may be difficult to obtain due to the cost and availability of their services. As an alternative we have developed a digital pathology pipeline to segment, grade, and analyze lung adenocarcinoma tumors. This convolutional neural network (CNN) was trained to classify normal lung tissue, normal airways, and the different grades (1 – 4) of lung adenocarcinoma from 36,000 224x224 pixel image patches (~6,000 patches per class) extracted from hematoxylin and eosin-stained sections collected from 4 different mouse models.
As a test of our CNN, we analyzed two mouse models to better understand the role of TAp73 in lung adenocarcinoma: KrasG12D/+ (“K”) and KrasG12D/+;TAp73fltd/fltd (“TK”). Both human raters and our CNN reported a significant increase in the tumor burden of the compound mutant “TK” mice compared to the single mutant “K” mice. The higher grading resolution provided by our CNN showed the increased tumor burden observed in the “TK” mice was due to expansion of Grade 2 regions within higher grade tumors. Future work will expand this tool into a multidimensional digital pathology pipeline that can accelerate current investigations and reveal new therapeutic targets and prognostic markers.

Strategies and Techniques for Quality Control and Semantic Enrichment with Multiple Orthogonal Data Types: A Case Study in Colorectal Cancer
COSI: TransMed
  • Tom Toner, Queen's University Belfast, United Kingdom
  • Paul Miller, Queen's University Belfast, United Kingdom
  • Thorsten Forster, LifeArc, United Kingdom
  • Helen Coleman, Queen's University Belfast, United Kingdom
  • Ian Overton, Queen's University Belfast, United Kingdom

Short Abstract: Increasingly, medical datasets link multiple domains thereby furthering their potential to uncover new knowledge and develop our understanding of diseases. However, integration and analysis of large orthogonal datasets is challenging.
We have successfully applied several quality control techniques to a multi-modal colon cancer dataset, containing unstructured data and various encodings, including: information extraction for eleven variables from free-text, identification of three variable pairs with internal inconsistencies, and utilisation of an information theoretic approach to support ten variable merges. We also developed methods for numeric encoding of non-numeric health data, hierarchical clustering of data completeness, and record-keeping of modifications for review. Semantic relationships in medical ontologies can be used to enrich medical datasets prior to analysis. We have developed an ontology-agnostic method to identify semantic commonalities between dataset variables when mapped to ontological entities, demonstrated with SNOMED CT and the Gene Ontology. Variables are then aggregated by their commonalities and aggregations are appended to the dataset.
We anticipate that the improved quality, structuring, and encoding of the data, as well as the added semantic information, will facilitate improved performance and interpretability of subsequent analyses in health datasets. We are currently developing an R package to share these approaches with the community.

SunShine: a semi-automated interactive graphical workflow to determine absolute genomic copy numbers in cancer
COSI: TransMed
  • Durdam Das, Fraunhofer ITEM Regensburg, Mathematical Disease Modelling, Division of Personalized Tumor Therapy, Germany
  • Christoph A. Klein, Experimental Medicine and Therapy Research, University of Regensburg | Fraunhofer ITEM, Germany
  • Martin Hoffmann, Fraunhofer ITEM Regensburg, Mathematical Disease Modelling, Division of Personalized Tumor Therapy, Germany
  • Felix Elsner, Experimental Medicine and Therapy Research, University Regensburg | Institute of Pathology, Univ. Hospital Erlangen, Germany
  • Mariam Gevorgyan, Fraunhofer ITEM Regensburg, Mathematical Disease Modelling, Division of Personalized Tumor Therapy, Germany
  • Steffi Treitschke, Fraunhofer ITEM Regensburg, High-Throughput Drug and Target Discovery, Division of Personalized Tumor Therapy, Germany
  • Christian Werno, Fraunhofer ITEM Regensburg, Preclinical Therapy Models, Division of Personalized Tumor Therapy, Germany
  • Bernhard M. Polzer, Fraunhofer ITEM Regensburg, Cellular and Molecular Diagnostics, Division of Personalized Tumor Therapy, Germany

Short Abstract: Since the early 1990s, fluorescence microscopy-based assessment of chromosomal copy number variation has been superseded by sequence-based methods such as comparative genomic hybridization (CGH) and next generation sequencing. Sequence-based methods are more efficient and have an improved resolution. However, they are presently not quantitative. Instead, mean-normalized relative copy numbers (relating to an unknown ploidy level) are commonly processed to reconstruct absolute copy numbers. In our hands, published algorithms did not meet our needs. Specifically, they did not allow us to handle differences in individual sample quality and technical artifacts that are difficult to foresee. Because generally few disseminated or circulating tumor cells can be analyzed per patient, every sample is valuable. Accordingly, we developed a semi-automated graphical workflow using R Shiny that performs automated processing and subsequently assists users in manually adjusting genomic profiles for apparent artifacts. The automated pipeline includes new copy number estimation methods and an option to mount an external classifier (in our case, a classifier resting on the Mitelman database). SunShine is presently in use for in-house arrayCGH and low-pass sequencing data. Evaluation on public cancer cell line data is ongoing.
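The reconstruction problem mentioned above can be sketched as follows: mean-normalized relative copy numbers equal the absolute copy numbers divided by the unknown mean copy number, so one generic approach is to search for the scaling factor that brings all segments closest to integers. This is a minimal sketch of that generic idea and not the SunShine method; the segment values, the segment lengths, and the search range of 1.5 to 4.0 are hypothetical.

```python
def absolute_copy_numbers(relative, lengths, mu_grid):
    """Pick the mean copy number mu whose rescaled segments lie closest to integers."""
    total = sum(lengths)
    best_mu, best_err = None, float("inf")
    for mu in mu_grid:
        scaled = [r * mu for r in relative]
        # Length-weighted mean distance of each rescaled segment to its nearest integer.
        err = sum(w * abs(s - round(s)) for s, w in zip(scaled, lengths)) / total
        if err < best_err:
            best_mu, best_err = mu, err
    return best_mu, [round(r * best_mu) for r in relative]


# Hypothetical genome with four segments (lengths in arbitrary units).
true_copies = [2, 3, 1, 2]
lengths = [50, 20, 10, 20]
mean_copy = sum(c * w for c, w in zip(true_copies, lengths)) / sum(lengths)  # 2.1
relative = [c / mean_copy for c in true_copies]  # what a mean-normalized assay would report

mu_grid = [i / 100 for i in range(150, 401)]  # assumed search range: 1.5 to 4.0
best_mu, absolute = absolute_copy_numbers(relative, lengths, mu_grid)
```

The fit is not unique: doubling the scaling factor (here 4.2) fits the same data equally well, which is why the search range is itself an assumption and why manual review of profiles, as described in the abstract, remains useful.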

SynLeGG: analysis and visualization of multiomics data for discovery of cancer ‘Achilles Heels’ and gene function relationships
COSI: TransMed
  • Ian Overton, Queens University, United Kingdom
  • Alex Lubbock, Vanderbilt University, United States
  • Mark Wappett, Queens University, United Kingdom
  • Adam Harris, Queens University, United Kingdom
  • Ian Lobb, Almac Discovery, United Kingdom
  • Simon McDade, Queens University, United Kingdom

Short Abstract: Achilles’ heel relationships arise when the status of one gene exposes a cell’s vulnerability to perturbation of a second gene, providing therapeutic opportunities for precision oncology. Here we present the web server SynLeGG (www.overton-lab.uk/synlegg), developed using R and Shiny, that identifies and visualizes mutually exclusive loss signatures in ‘omics data to enable discovery of genetic dependency relationships (GDRs) across 783 cancer cell lines and 30 tissues. SynLeGG depends upon the MultiSEp algorithm for unsupervised assignment of cell lines into gene expression clusters, which provide the basis for analysis of CRISPR scores and mutational status in order to propose candidate GDRs.
Results, generated at both the pan-cancer and tissue-specific levels, are searchable, allowing the user to recover established relationships, such as synthetic lethality for SMARCA2 with SMARCA4. Proteomics, Gene Ontology, protein-protein interactions and paralogue information are provided to assist interpretation and candidate drug target prioritization. Benchmarking using SynLethDB demonstrates favourable performance for MultiSEp against competing approaches, finding significantly higher area under the Receiver Operator Characteristic curve and between 2.8-fold and 8.5-fold greater coverage. We hope SynLeGG will expedite the clinical positioning of existing therapies and the discovery of more focused and effective cancer treatments.
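
For readers unfamiliar with the benchmarking metric, the area under the ROC curve can be computed as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The sketch below is a generic illustration and not the authors' benchmarking code; the scores and reference labels are invented for the example.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for six candidate gene pairs; label 1 marks
# pairs that appear in a reference set of known relationships (assumed data).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of reference pairs from the rest.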

TCGA breast cancer data: Using machine learning to fill in incomplete attribute characterizations
COSI: TransMed
  • Shail Rakesh Modi, Massachusetts College of Pharmacy and Health Sciences, United States
  • George Acquaah-Mensaah, Massachusetts College of Pharmacy and Health Sciences, United States

Short Abstract: There are several instances in the Cancer Genome Atlas (TCGA) where samples from patients are not fully characterized. Triple-negative breast cancer (TNBC) is a type of breast cancer lacking the expression of estrogen receptors, progesterone receptors, and human epidermal growth factor receptor 2. This study aimed to use machine learning (ML) to predict the receptor expression status of uncharacterized samples. We used the pan-cancer TCGA 2016 dataset and grouped instances into training and test sets. After evaluating the performance of six different ML classifiers, we chose J48 for the prediction of test set classes due to its consistent precision and sensitivity across all the receptor subtypes. TNBC was contrasted with non-TNBC (nTNBC). For validation, we identified differentially expressed proteins (DEPs) between the two groups and found the pathways overrepresented between TNBC and nTNBC, using the training set and then the test set (with predicted classes). We also used these DEPs to analyze protein-protein interactions using the STRING network in Cytoscape. We identified 8 common DEPs. Activation of the RAS/RAF/MAPK pathway was common to both sets. There were protein-protein interactions common to both sets. Thus, we were able to characterize and validate the uncharacterized receptor status of samples using ML.
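
The two metrics used to choose a classifier, precision and sensitivity, can be computed per class as in the sketch below. It is illustrative only: the labels are invented, and the code does not reproduce the J48 classifier or the authors' evaluation.

```python
def precision_and_sensitivity(true_labels, predicted_labels, positive):
    """Return (precision, sensitivity) for one class treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity


# Hypothetical known and predicted classes for six samples.
true_labels = ["TNBC", "TNBC", "nTNBC", "nTNBC", "nTNBC", "nTNBC"]
predicted = ["TNBC", "nTNBC", "nTNBC", "TNBC", "nTNBC", "nTNBC"]

for cls in ("TNBC", "nTNBC"):
    print(cls, precision_and_sensitivity(true_labels, predicted, cls))
```

Reporting both metrics for every class guards against a classifier that looks accurate overall only because one class dominates the dataset.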

The IDH mutation induces a glutamate and GABA neurotransmission metabolism shift in glioma.
COSI: TransMed
  • Michelle Scott, Université de Sherbrooke, Canada
  • Hoang Dong Nguyen, Université de Sherbrooke, Canada
  • Maxime Richer, Université de Sherbrooke, Canada

Short Abstract: Glial tumors are traditionally classified based on their histological resemblance to normal brain astrocytes and oligodendrocytes. Morphological features are now combined with genetic alterations into integrated diagnoses providing increased accuracy. Isocitrate dehydrogenase (IDH1/2) gene mutations define glial tumor categories associated with longer survival. However, the role of these mutations in malignancy development remains unclear.
IDH1/2 code for enzymes that metabolize isocitrate to alpha-ketoglutarate, a fundamental substrate involved in multiple metabolic pathways. Alpha-ketoglutarate is also the precursor of the two main neurotransmitters, glutamate and GABA. This observation suggests a close relationship between the two neurotransmitter systems and glioma aggressiveness.
In this study, we hypothesized that the tumor expression patterns of GABA and glutamate signaling elements may define clinically relevant glial tumor categories. Analyses of GABA and glutamate expression profiles from 661 gliomas from TCGA recapitulated established glial tumor categories but also defined novel groups with different clinical and cellular characteristics, such as an altered tumor immune micro-environment. These findings suggest an important role of glutamate and GABA neurotransmission in gliomagenesis associated with immune micro-environment regulation. This research will refine the current classification and lead to a better understanding of novel druggable neurotransmitter-related signaling pathways in gliomas.

The molecular underpinnings of wild-type Von Hippel–Lindau clear cell renal cell carcinomas
COSI: TransMed
  • Aashil A Batavia, ETH Zurich and University Hospital Zurich, Switzerland
  • Dorothea Rutishauser, University Hospital Zurich, Switzerland
  • Jack Kuipers, ETH Zurich, Switzerland
  • Peter Schraml, University Hospital Zurich, Switzerland
  • Niko Beerenwinkel, ETH Zurich, Switzerland
  • Holger Moch, University Hospital Zurich, Switzerland

Short Abstract: wtVHL ccRCC account for 5-12% of all ccRCC but have been shown to be more aggressive, conferring worse survival. A portion of wtVHL ccRCC are thought to be TCEB1 ccRCC, a novel and contentious subtype. We combine publicly available resources with 369 ccRCC samples from the University Hospital Zürich Renal Cancer Biobank to identify phenotypic characteristics and molecular changes promoting the aggressive nature of wtVHL ccRCC using histological, genetic, epigenetic, transcriptomic, and proteomic datasets. Using a bidirectional network diffusion method (NetICS) to integrate mutation, CNV, and gene expression data, we identify genes central to orchestrating downstream differential gene expression given the upstream aberrations in a protein-protein interaction network. We also apply unsupervised clustering to determine where wtVHL samples lie on the broader spectrum of renal carcinoma given the various omics datasets and with the addition of papillary and chromophobe RCCs. We find ccRCC samples with both TCEB1 mutations and VHL aberrations, dispelling the notion that these are mutually exclusive. HMGA1 is identified as a key mediator within wtVHL ccRCC. We believe the identified factors promoting and permitting EMT, extracellular matrix degradation, cell mobility, and cell migration result in the more invasive and metastatic phenotype attributed to wtVHL ccRCC tumours.
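
The general idea of network diffusion can be sketched with a random walk with restart on a small graph. This is a simplified, single-direction illustration and not the NetICS algorithm, which diffuses in both directions and combines several data types. The toy network, node names, and restart probability are all assumptions made for the example, and the code requires every node to have at least one neighbour.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.4, tol=1e-10, max_iter=1000):
    """Diffuse a unit of score from the seed nodes over an undirected graph."""
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    current = dict(start)
    for _ in range(max_iter):
        # Each node keeps a restart share and passes the rest to its neighbours.
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            share = (1.0 - restart) * current[n] / len(adjacency[n])
            for neighbour in adjacency[n]:
                updated[neighbour] += share
        change = sum(abs(updated[n] - current[n]) for n in nodes)
        current = updated
        if change < tol:
            break
    return current


# Hypothetical interaction network; "A" stands in for an aberrant gene.
network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

diffusion_scores = random_walk_with_restart(network, seeds={"A"})
for node, score in sorted(diffusion_scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 3))
```

Nodes close to the seed and well connected to it receive higher scores, which is the intuition behind ranking genes by their proximity to upstream aberrations.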

The wearables for wellness pilot: data-enabled primary care in an LMIC context
COSI: TransMed
  • Faisal Khan, Institute of Integrative Biosciences, CECOS University, Pakistan
  • Hammad Iqbal, Precision Medicine Lab, Pakistan
  • Shagufta Rehmat, Precision Medicine Lab, Pakistan

Short Abstract: The Wearables for Wellness programme is a pilot study that aims to leverage recent technology, especially increasingly low-cost wearables, and data science to help the primary health care provider in a low- and middle-income country (LMIC) context. With the risk of non-communicable diseases (NCDs), especially cardiovascular diseases (CVDs), increasing in developing countries, and a trend of non-adherence to medication that increases in older adults, we set out to monitor the wellness of our patients and to enable the early detection of disease onset. We aim to generate a preliminary dataset by tracking 6 individuals (3 males and 3 females; 3 physically active and 3 non-active) for a period of 60 days (May-June) using the popular Mi Band 5 smartwatch (by Xiaomi; $30) along with an array of four digital health monitoring gadgets: a scale, thermometer, oximeter and blood pressure unit. We measure the following to extract features and undertake machine learning experiments: heart rate, sleep, steps, calories burnt, exercise, blood pressure, SpO2, body temperature and weight, as well as body measurements (chest, abdomen and thighs), daily food intake, and daily mental and physical fatigue levels using questionnaires. We hope to begin to address the dearth of data, especially on NCDs in LMICs.



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
