TransMed COSI

Schedule subject to change
All times listed are in CDT
Tuesday, July 12th
10:30-11:10
Keynote Presentation: “Data Science”: A trendy name for conventional research using data, or a truly revolutionary new...
Room: Madison A
Format: Live-stream

Moderator(s): Reinhard Schneider

  • Anthony Brookes, University of Leicester, UK


Presentation Overview:

Over the last 20 years the kind of "data wrangling" research I do has been called information technology, then bioinformatics, and more recently health data science - with all three names being open to subjective interpretation. Considerable funding is currently directed towards health data science, which many take to mean "the use of data to derive new knowledge or utility". But in my view such activities are not really health data science, but simply health research. In contrast, I contend that health data science entails a virtuous circle that connects novel methods research to innovative data engineering in order to access and leverage especially large or challenging datasets in new ways (thereby identifying the need for further new methods). Some examples of this virtuous circle will be presented, in the context of making valuable and sensitive assets (data, patients, samples, etc) responsibly discoverable and thereby more widely used.

11:10-11:30
Proceedings Presentation: MLGL-MP: A Multi-Label Graph Learning Framework Enhanced by Pathway Interdependence for Metabolic Pathway Prediction
Room: Madison A
Format: Live-stream

Moderator(s): Reinhard Schneider

  • Bing-Xue Du, School of Life Sciences, Northwestern Polytechnical University, China
  • Peng-Cheng Zhao, School of Life Sciences, Northwestern Polytechnical University, China
  • Bei Zhu, School of Life Sciences, Northwestern Polytechnical University, China
  • Siu-Ming Yiu, Department of Computer Science, The University of Hong Kong, Hong Kong, China
  • Arnold K Nyamabo, School of Computer Science, Northwestern Polytechnical University, China
  • Hui Yu, School of Computer Science, Northwestern Polytechnical University, China
  • Jian-Yu Shi, School of Life Sciences, Northwestern Polytechnical University, China


Presentation Overview:

Motivation: During lead compound optimization, it is crucial to identify the pathways in which a drug-like compound is metabolized. Recently, machine learning-based methods have made encouraging progress in predicting potential metabolic pathways for drug-like compounds. However, they neglect the knowledge that metabolic pathways depend on each other. Moreover, they cannot adequately explain why compounds participate in specific pathways.
Results: To address these issues, we propose a novel multi-label graph learning framework of metabolic pathway prediction boosted by pathway inter-dependence, called MLGL-MP, which contains a compound encoder, a pathway encoder, and a multi-label predictor. The compound encoder learns compound embedding representations by graph neural networks (GNNs). After constructing a pathway dependence graph by re-trained word embeddings and pathway co-occurrences, the pathway encoder learns pathway embeddings by graph convolutional networks (GCNs). Moreover, after adapting the compound embedding space into the pathway embedding space, the multi-label predictor measures the proximity of two spaces to discriminate which pathways a compound participates in. The comparison with state-of-the-art methods on KEGG pathways demonstrates the superiority of our MLGL-MP. Also, the ablation studies reveal how its three components contribute to the model, including the pathway dependence, the adapter between compound embeddings and pathway embeddings, as well as the pre-training strategy. Furthermore, a case study illustrates the interpretability of MLGL-MP by indicating crucial substructures in a compound, which are significantly associated with the attending metabolic pathways. It’s anticipated that this work can boost metabolic pathway predictions in drug discovery.

11:30-11:40
Decoding tumour microenvironment heterogeneity using graph convolutional networks and multiplexed imaging
Room: Madison A
Format: Live-stream

Moderator(s): Reinhard Schneider

  • Muhammed Khawatmi, University of Oxford, United Kingdom
  • Enric Dominigo, University of Oxford, United Kingdom
  • Fiona Ginty, University of Oxford, United Kingdom
  • Heba Sailem, University of Oxford, United Kingdom


Presentation Overview:

Determining the contribution of the tumour microenvironment (TME) to tumour progression and resistance has proven a complex challenge due to its heterogeneity. Multiplexed imaging provides an unprecedented opportunity for studying the interaction between cancer cells and the TME. We utilised a multiplexed tissue imaging dataset of 746 colorectal tumours from different stages. Each tumour section was stained with 60 markers simultaneously to visualise immune and stromal cells as well as key cancer signalling pathways. By implementing image analysis and segmentation approaches, around 3000 cells were quantified per tumour, resulting in data of ~3 million single cells. We performed compartmentalised image analysis to determine signalling activities in cancer, stromal, and immune cells. We developed a graph convolutional network (GCN) and visualisation approach to determine cellular subpopulations associated with patient survival. Using our approach, we found that signalling of the mTOR pathway can have heterogeneous activation patterns in different TME compartments, which correlate with different patient outcomes. We further validated our observations using transcriptional data from TCGA. Our findings can have a significant impact on the design of mTOR-based therapies and future clinical trials. This demonstrates the utility of GCNs in determining clinically relevant signatures and biomarkers from heterogeneous single cell imaging data.

11:40-11:50
Perturbed Transcriptomic Analyses Identify Chemo-immunotherapy Synergisms to Shift Anti-PD1 Resistance in Cancer
Room: Madison A
Format: Live from venue

Moderator(s): Reinhard Schneider

  • Yue Wang, University of Pittsburgh, United States
  • Dhamotharan Pattarayan, University of Pittsburgh, United States
  • Min Zhang, University of Pittsburgh, United States
  • Da Yang, University of Pittsburgh, United States


Presentation Overview:

Immune checkpoint blockade (ICB) has revolutionized cancer treatment, but its low response rate and high resistance remain a problem. Here, we report a novel algorithm to reliably predict chemo-ICB synergism for overcoming ICB resistance, termed Perturbed Transcriptome-based Synergism Prediction for ICB-Chemotherapy Combinations (PerTSynIC). Through a clinical response-guided feature selection procedure, we established that treatment-induced gene expression changes (TECs) are among the major determinative phenotypes for anti-PD1 response in melanoma. By integrating one million perturbed transcriptomes of cancer cell lines treated with ten thousand genetic and pharmacological inhibitors from high-throughput screening studies, PerTSynIC identified chemo-/targeted agents that can induce TEC shifting between anti-PD1 non-responders and responders. These agents include MEKi, HDACi and CDKi, whose synergism with ICBs has been reported in clinical practice. PerTSynIC characterized 23 top synergy target genes whose genetic and pharmacological inhibition share a consistent TEC shift ability in melanoma. Among these genes, PAK4 and its pharmacological inhibitors were identified. An in vitro assay validated that treating the melanoma cell line MEL526 with PAK inhibitors can induce significant dose-dependent activation of antigen processing/presentation and type II interferon signaling. Our study provides a reliable prediction method for chemo-ICB synergism, which will help cancer patients better cope with immunotherapy resistance.

11:50-12:00
ICU Survival Prediction Incorporating Test-Time Augmentation to Improve the Accuracy of Ensemble-Based Models
Room: Madison A
Format: Live from venue

Moderator(s): Reinhard Schneider

  • Seffi Cohen, Ben-Gurion University of the Negev, Israel
  • Nurit Cohen-Inger, BeyondMinds, Israel
  • Noa Dagan, Clalit Research Institute, Israel
  • Dan Ofer, Hebrew University of Jerusalem, Israel
  • Lior Rokach, Ben-Gurion University of the Negev, Israel


Presentation Overview:

Severity evaluation is crucial in clinical settings for evaluating patients' prognosis. Severity calculators are used to evaluate survival chances and to optimize patient treatments and resources, notably in Intensive Care Units (ICU). In this work, we present a novel method for applying Test Time Augmentation (TTA) to tabular data. We used TTA along with an ensemble of 42 models to achieve superior performance on the MIT Global Open Source Severity of Illness Score (GOSSIS) initiative dataset of 131,051 ICU visits and outcomes. This method achieved an AUC of 0.915 on the private test set (19,669 admissions) and won first place at Stanford's WiDS Datathon 2020 challenge on Kaggle, while the widely used Acute Physiology and Chronic Health Evaluation (APACHE) IV model achieved an AUC of 0.868. In addition to improving predictions of patient risk, our method also reduces “unfair” bias.
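
As a rough illustration of what test-time augmentation on tabular data can look like, the sketch below averages a model's output over the original row and several jittered copies of it. It is not the presenters' method: the toy model, its weights, the feature names and the Gaussian-noise augmentation are all assumptions made for this example.

```python
import math
import random


def toy_model(row):
    """Hypothetical stand-in for a trained classifier: returns P(death)."""
    # Weights are invented for illustration only.
    z = 0.04 * row["age"] + 0.03 * row["heart_rate"] - 6.0
    return 1.0 / (1.0 + math.exp(-z))


def predict_with_tta(model, row, n_aug=50, noise_frac=0.02, seed=0):
    """Average the model output over the row and jittered copies of it."""
    rng = random.Random(seed)
    preds = [model(row)]  # always include the unperturbed row
    for _ in range(n_aug):
        # Assumed augmentation: Gaussian noise proportional to each value.
        noisy = {k: v + rng.gauss(0.0, noise_frac * abs(v)) for k, v in row.items()}
        preds.append(model(noisy))
    return sum(preds) / len(preds)


patient = {"age": 70.0, "heart_rate": 110.0}
p_plain = toy_model(patient)
p_tta = predict_with_tta(toy_model, patient)
print(f"plain={p_plain:.3f} tta={p_tta:.3f}")
```

In an ensemble setting, the same averaging would be applied to each member's predictions before the members are combined.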

12:00-12:10
Use of machine learning to classify high-risk variants of uncertain significance in lamin A/C cardiac disease
Room: Madison A
Format: Live from venue

Moderator(s): Reinhard Schneider

  • David Gordon, Institute for Genomic Medicine at Nationwide Children's Hospital, United States
  • Jeffrey Bennett, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States
  • Uddalak Majumdar, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States
  • Patrick Lawrence, Institute for Genomic Medicine at Nationwide Children's Hospital, United States
  • Adrianna Matos-Nieves, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States
  • Katherine Myers, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States
  • Anna Kamp, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States
  • Julie Leonard, Center for Injury Research and Policy, Abigail Wexner Research Institute, Nationwide Children’s Hospital, United States
  • Kim McBride, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States
  • Peter White, Institute for Genomic Medicine at Nationwide Children's Hospital, United States
  • Vidu Garg, Center for Cardiovascular Research and Heart Center, Nationwide Children’s Hospital, United States


Presentation Overview:

Variation in lamin A/C (LMNA) results in a spectrum of clinical disease, including arrhythmias and cardiomyopathy. Known benign variation is rare, and current in silico predictions have limited utility in driving ACMG classification of LMNA missense variants. Our study of a family with inherited conduction system disease revealed a novel segregating missense variant, p.Asp136Glu, initially reported as a VUS by a commercial testing company. Additional familial analysis and in vitro testing enabled classification of the variant as likely pathogenic per ACMG guidelines. However, extended familial analysis is not always feasible, leaving clinicians with little genetic guidance beyond the presence of a missense variant. This prompted the development of an ML algorithm to aid clinical interpretation of LMNA missense variants. While insufficient known benign variation exists to create an ML classifier, unsupervised clustering of previously observed variants in gnomAD and ClinVar using UMAP and K-means identified three clusters with significantly different proportions of reported pathogenic/likely pathogenic variants (38.8%, 15.0%, and 6.1%). We anticipate that these findings can be translated to clinical use by guiding the treatment of patients with a VUS present in a cluster enriched for pathogenicity and may prove useful in other genes where classification is difficult.

12:10-12:20
PRState: Incorporating Genetic Ancestry in Prostate Cancer Risk Scores for African American Men
Room: Madison A
Format: Live from venue

Moderator(s): Reinhard Schneider

  • Meghana Pagadala, UCSD, United States
  • Joshua Linscott, Maine Medical Center, United States
  • James Talwar, UCSD, United States
  • Tyler Seibert, UCSD, United States
  • Brent Rose, UCSD, United States
  • Julie Lynch, VA Salt Lake City Healthcare System, United States
  • Matthew Panizzon, UCSD, United States
  • Richard Hauger, UCSD, United States
  • Moritz Hansen, Maine Medical Center, United States
  • Jesse Sammon, Maine Medical Center, United States
  • Matthew Hayn, Maine Medical Center, United States
  • Karim Kader, UCSD, United States
  • Hannah Carter, UCSD, United States
  • Stephen Ryan, Maine Medical Center, United States


Presentation Overview:

Prostate cancer (PrCa) is one of the most genetically driven solid cancers, with heritability estimates as high as 57%. African American men are at an increased risk of PrCa; however, current risk prediction models are based on European ancestry groups and may not be broadly applicable. In this study, we define an African ancestry group of 4,533 individuals to develop an African ancestry-specific PrCa polygenic risk score (PRState). We identified risk loci on chromosomes 3, 8, and 11 in the African ancestry group GWAS and constructed a polygenic risk score (PRS) from 10 African ancestry-specific PrCa risk SNPs, achieving an AUC of 0.61 [0.60-0.63] on its own and 0.65 [0.64-0.67] when combined with age and family history. Performance dropped significantly when using ancestry-mismatched PRS models but remained comparable when using trans-ancestry models. Importantly, we validated the PRState score in the Million Veteran Program, demonstrating improved prediction of PrCa and metastatic PrCa in African American individuals. This study underscores the need for inclusion of individuals of African ancestry in gene variant discovery to optimize PRS.
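
For readers unfamiliar with polygenic risk scores, a PRS is commonly computed as a weighted sum of risk-allele dosages. The sketch below shows that general form only; the SNP identifiers and weights are hypothetical and are not the PRState loci or effect sizes.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP)."""
    missing = set(weights) - set(dosages)
    if missing:
        raise ValueError(f"missing genotypes for: {sorted(missing)}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical per-SNP effect sizes (for example, GWAS log odds ratios).
weights = {"snp_a": 0.25, "snp_b": 0.5, "snp_c": 0.125}
# Hypothetical genotype: number of risk alleles carried at each SNP.
person = {"snp_a": 2, "snp_b": 0, "snp_c": 1}

score = polygenic_risk_score(person, weights)
print(score)  # 0.625
```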

12:20-12:30
Genome-Derived Diagnosis: Deep Learning Model for Tumor Type Prediction using MSK-IMPACT data
Room: Madison A
Format: Live from venue

Moderator(s): Reinhard Schneider

  • Madison Darmofal, Memorial Sloan Kettering, Weill Cornell Graduate School, United States
  • Quaid Morris, Sloan Kettering Institute, United States
  • Michael Berger, Memorial Sloan Kettering, United States


Presentation Overview:

Knowledge of a patient’s tumor type is essential for guiding clinical treatment decisions in cancer, but histologic-based diagnosis remains challenging. Genomic alterations are highly indicative of tumor type and can be used to build classifiers that predict diagnoses, but most genomic-based classification methods use WGS data, which is not feasible for widespread clinical implementation at present. MSK-IMPACT is an FDA-approved clinical sequencing fixed-panel assay which reports genomic alterations including mutations, indels and copy number alterations across 468 cancer-associated genes, and has sequenced over 65,000 Memorial Sloan Kettering patients to date. We use genomic features from this large dataset to develop Deep Genome-Derived-Diagnoses (GDD-NN): a deep-ensemble tumor type classifier. GDD-NN achieves 78.6% accuracy across 40 common cancer types, outperforming similar models. For MSK-IMPACT patients with rarer cancers, we implement out-of-distribution (OOD) detection using ensemble-based features, which classifies OOD samples (AUC = 0.94) without explicitly training on them. For patients where non-genomic information might inform predictions, we implement a prediction-specific adaptive prior and report improved accuracy after adjusting predictions given sample biopsy site. Overall, integrating GDD-NN into the well-established MSK-IMPACT pipeline will enable clinically relevant tumor type predictions that can guide treatment decisions in real time at an institutional level.
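
One simple way to adjust classifier output with non-genomic context is a Bayes-style re-weighting of the predicted probabilities by a context-specific prior. The sketch below illustrates that general idea and should not be read as the GDD-NN adaptive prior; the class names, probabilities and priors are invented for the example.

```python
def adjust_for_site(probs, base_prior, site_prior):
    """Re-weight class probabilities by the ratio site prior / base prior."""
    weighted = {c: p * site_prior[c] / base_prior[c] for c, p in probs.items()}
    total = sum(weighted.values())
    # Renormalise so the adjusted probabilities sum to one.
    return {c: w / total for c, w in weighted.items()}


# Hypothetical classifier output for one sample.
probs = {"lung": 0.5, "breast": 0.3, "colorectal": 0.2}
# Hypothetical class frequencies in the training data.
base_prior = {"lung": 0.4, "breast": 0.4, "colorectal": 0.2}
# Hypothetical class frequencies among samples from one biopsy site.
site_prior = {"lung": 0.2, "breast": 0.2, "colorectal": 0.6}

adjusted = adjust_for_site(probs, base_prior, site_prior)
print(max(adjusted, key=adjusted.get))
```

With these made-up numbers the top prediction moves from lung to colorectal once the biopsy-site prior is taken into account.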

14:30-14:50
Proceedings Presentation: From drug repositioning to target repositioning: prediction of therapeutic targets using genetically perturbed transcriptomic signatures
Room: Madison A
Format: Live from venue

Moderator(s): Irene Ong

  • Satoko Namba, Kyushu Institute of Technology, Japan
  • Michio Iwata, Kyushu Institute of Technology, Japan
  • Yoshihiro Yamanishi, Kyushu Institute of Technology, Japan


Presentation Overview:

Motivation: A critical element of drug development is the identification of therapeutic targets for diseases. However, the depletion of therapeutic targets is a serious problem.
Results: In this study, we propose the novel concept of target repositioning, an extension of the concept of drug repositioning, to predict new therapeutic targets of various diseases. Predictions were performed by a trans-disease analysis which integrated genetically perturbed transcriptomic signatures (knock-down of 4,345 genes and over-expression of 3,114 genes) and disease-specific gene transcriptomic signatures of 79 diseases. The trans-disease method, which takes into account similarities among diseases, enabled us to distinguish the inhibitory from activatory targets, and to predict the therapeutic targetability of not only proteins with known target–disease associations, but also orphan proteins without known associations. Our proposed method is expected to be useful for understanding the commonality of mechanisms among diseases and for therapeutic target identification in drug discovery.
Availability: Supplemental information and software are available at the following website [http://labo.bio.kyutech.ac.jp/~yamani/target_repositioning/].
Contact: yamani@bio.kyutech.ac.jp
Supplementary information: Supplementary data are available at Bioinformatics online.

14:50-15:10
Proceedings Presentation: Synthetic-to-Real: Instance Segmentation of Clinical Cluster Cells with Unlabelled Synthetic Training
Room: Madison A
Format: Live-stream

Moderator(s): Irene Ong

  • Meng Zhao, Tianjin University of Technology, China
  • Siyu Wang, Tianjin University of Technology, China
  • Fan Shi, Tianjin University of Technology, China
  • Chen Jia, Tianjin University of Technology, China
  • Xuguo Sun, Tianjin Medical University, China
  • Shengyong Chen, Tianjin University of Technology, China


Presentation Overview:

The presence of tumor cell clusters in pleural effusion may be a signal of cancer metastasis. The instance segmentation of single cells from cell clusters plays a pivotal role in cluster cell analysis. However, current cell segmentation methods perform poorly for cluster cells due to the overlapping/touching character of clusters, the multiple instance properties of cells, and the poor generalization ability of the models. In this paper, we propose a contour constraint instance segmentation framework (CC framework) for cluster cells based on a cluster cell combination enhancement module. The framework can accurately locate each instance from cluster cells and achieve high-precision contour segmentation with few samples. Specifically, we propose the contour attention constraint (CAC) module to alleviate over-segmentation and under-segmentation among individual cell-instance boundaries. In addition, to evaluate the framework, we construct a pleural effusion cluster cell dataset including 197 high-quality samples. The quantitative results show that the AP mask is greater than 90%, a more than 10% increase compared with state-of-the-art semantic segmentation algorithms. From the qualitative results, we can observe that our method rarely has segmentation errors.

15:10-15:20
1H-NMR metabolomics-based models to impute common clinical variables and endpoints in epidemiological studies
Room: Madison A
Format: Live from venue

Moderator(s): Irene Ong

  • Daniele Bizzarri, Leiden University Medical Center, Netherlands
  • Marcel Reinders, TU Delft, Netherlands
  • Marian Beekman, Leiden University Medical Center, Netherlands
  • Anna Niehues, Radboud University Medical Centre, Netherlands
  • Peter-Bram Hoen, Radboud University Medical Centre, Netherlands
  • Eline Slagboom, Leiden University Medical Center, Netherlands
  • Erik van den Akker, Leiden University Medical Center, Netherlands


Presentation Overview:

The 1H-NMR metabolomics platform is rapidly gaining popularity in epidemiological research, as it provides a reproducible and cost-effective assessment of the blood metabolome. We will illustrate how we used 1H-NMR metabolomics data from a commercial platform to successfully predict 19 out of 20 routinely assessed clinical variables using a logistic ElasticNET. We will detail how these models were trained and evaluated within the 26 biobanks participating in BBMRI-nl (~26,000 samples). We will continue by showing that these surrogates can be used to impute missing phenotypic information in external cohorts. Moreover, we will demonstrate that these metabolic surrogates can be used as substitutes for partially or completely unobserved confounders in association studies (metabolome- or transcriptome-wide association studies) and show that the metabolic surrogates themselves can be used as novel biomarkers, by presenting significant associations with incident all-cause mortality in the elderly population. Finally, we will present our new R-shiny tool (MiMIR), which can compute new and previously published multivariate metabolomics models in other cohorts with 1H-NMR metabolomics, calibrate their predicted values using Platt’s method, and compare the uploaded Nightingale metabolomics quantifications to the metabolites’ distributions observed in BBMRI-nl.
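
Platt's method, mentioned above, maps raw model scores to probabilities by fitting a sigmoid to held-out labels. The sketch below is a simplified version that fits the two sigmoid parameters by plain gradient descent and omits the smoothed targets of the original formulation; it is written in Python for illustration and does not reflect how MiMIR implements calibration. The scores and labels are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit P(y=1 | s) = sigmoid(a * s + b) by gradient descent on log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Turn a raw score into a calibrated probability."""
    return sigmoid(a * score + b)


# Made-up raw surrogate scores and observed binary outcomes.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
print([round(calibrate(s, a, b), 3) for s in scores])
```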

15:20-15:30
A network-based approach to identify expression modules underlying rejection in pediatric liver transplantation
Room: Madison A
Format: Live from venue

Moderator(s): Irene Ong

  • Mylarappa Ningappa, University of Pittsburgh, United States
  • Syed A Rahman, University of Pittsburgh, United States
  • Brandon Higgs, University of Pittsburgh, United States
  • Chethan S Ashokkumar, University of Pittsburgh, United States
  • Nidhi Sahni, MD Anderson Cancer Center, United States
  • Rakesh Sindhi, University of Pittsburgh, United States
  • Jishnu Das, University of Pittsburgh, United States


Presentation Overview:

Selecting the right immunosuppressant to ensure rejection-free outcomes poses unique challenges in pediatric liver transplant (LT) recipients. A molecular predictor can comprehensively address these challenges. Currently, there are no well-validated blood-based biomarkers for pediatric LT recipients either pre- or post-LT. Here, we discover and validate separate pre- and post-LT transcriptomic signatures of rejection. Using an integrative machine learning approach, we combine transcriptomic data with the reference high-quality human protein interactome to identify network module signatures, which underlie rejection. Unlike gene signatures, our approach is inherently multivariate, more robust to replication and captures the structure of the underlying network, encapsulating additive effects. We also identify, in a patient-specific manner, signatures that can be targeted by current anti-rejection drugs and other drugs that can be repurposed. Overall, our approach can enable personalized adjustment of drug regimens for the dominant targetable pathways in pre- and post-LT in children.

16:00-16:40
Keynote Presentation: Learning cellular interactions from spatial transcriptomics with SpaceMarkers
Room: Madison A
Format: Live from venue

Moderator(s): Anoop Mayampurath

  • Atul Desphande, Johns Hopkins University, USA
  • Melanie Loth, Johns Hopkins University, USA
  • Dimitri Sidiropoulos, Johns Hopkins University, USA
  • Shuming Zhang, Johns Hopkins University, USA
  • Long Yuan, Johns Hopkins University, USA
  • Alexander Bell, Johns Hopkins University, USA
  • Qingfeng Zhu, Johns Hopkins University, USA
  • Won Jin Ho, Johns Hopkins University, USA
  • Cesar Santa-Maria, Johns Hopkins University, USA
  • Danielle Gilkes, Johns Hopkins University, USA
  • Stephen Williams, 10X Genomics, USA
  • Cedric Uytingco, 10X Genomics, USA
  • Jennifer Chew, 10X Genomics, USA
  • Andrej Hartnett, 10X Genomics, USA
  • Zachary Bent, 10X Genomics, USA
  • Alexander Favorov, Johns Hopkins University, USA
  • Mark Yarchoan, Johns Hopkins University, USA
  • Lei Zheng, Johns Hopkins University, USA
  • Elizabeth Jaffee, Johns Hopkins University, USA
  • Robert Anders, Johns Hopkins University, USA
  • Ludmila Danilova, Johns Hopkins University, USA
  • Genevieve Stein-O'Brien, Johns Hopkins University, USA
  • Luciane Kagohara, Johns Hopkins University, USA
  • Elana Fertig, Johns Hopkins University, USA


Presentation Overview:

Spatial molecular data provide unprecedented characterization of the cellular and molecular architecture of human tissue and disease. These technologies are particularly important for cancer immunotherapy, in which the interactions between diverse cell types mediate therapeutic response and resistance. In this talk, we describe how spatial molecular data enable us to uncover mechanisms of therapeutic response and resistance in a liver cancer immunotherapy clinical trial. Moreover, we present our new analysis approach, SpaceMarkers, which infers molecular changes arising from cell-cell interactions through latent space analysis of spatial transcriptomics (ST) data from this trial. Transfer learning in matched scRNA-seq data enabled further quantification of the specific cell types in which SpaceMarkers are enriched. Altogether, SpaceMarkers can identify the location and context-specific molecular interactions within the TME from ST data.

16:40-17:00
Proceedings Presentation: Prediction of Recovery from Multiple Organ Dysfunction Syndrome in Pediatric Sepsis Patients
Room: Madison A
Format: Live-stream

Moderator(s): Anoop Mayampurath

  • Karsten Borgwardt, ETH Zurich, Switzerland
  • Bowen Fan, ETH Zurich, Switzerland
  • Juliane Klatt, ETH Zurich, Switzerland
  • Michael Moor, ETH Zurich, Switzerland
  • Latasha Daniels, Ann & Robert H. Lurie Children's Hospital of Chicago, United States
  • Lazaro Sanchez-Pinto, Ann & Robert H. Lurie Children's Hospital of Chicago, United States
  • Philipp Agyeman, University Hospital of Bern, Switzerland
  • Luregn Schlapbach, University Children’s Hospital Zurich, Switzerland
  • Swiss Pediatric Sepsis Study, Switzerland


Presentation Overview:

Sepsis is a leading cause of death and disability in children globally, accounting for approximately three million childhood deaths per year. In pediatric sepsis patients, multiple organ dysfunction syndrome (MODS) is considered a significant risk factor for adverse clinical outcomes characterized by high mortality and morbidity in the pediatric intensive care unit (PICU). The rapidly growing availability of electronic health records (EHRs) has allowed researchers to develop data-driven approaches such as machine learning in healthcare, with great success. However, effective machine learning models that can accurately predict, at an early stage, the recovery of pediatric sepsis patients from MODS to a mild state, and thus assist clinicians in the decision-making process, are still lacking.

This study develops a machine learning-based approach to predict the recovery from MODS to zero or single organ dysfunction (Z/SOD) one week in advance in the Swiss Pediatric Sepsis Study (SPSS) cohort of children with blood-culture confirmed bacteremia. Our model achieves internal validation performance on the SPSS cohort with an AUROC of 79.1 and AUPRC of 73.6, and it was also externally validated on another pediatric sepsis patient cohort collected in the U.S., yielding an AUROC of 76.4 and AUPRC of 72.4. These results indicate that our model has the potential to be integrated into EHR systems and contribute to patient assessment and triage in pediatric sepsis patient care.
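
The AUROC figures quoted above appear to be on a 0-100 scale. As a reminder of what the metric measures, the sketch below computes AUROC as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half. The labels and scores are made up and unrelated to the SPSS cohort.

```python
def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up outcomes (1 = recovered) and predicted recovery probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

value = auroc(labels, scores)
print(f"AUROC = {100 * value:.1f}")
```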

17:00-17:20
Proceedings Presentation: Self-supervised learning of cell type specificity from immunohistochemical images
Room: Madison A
Format: Live-stream

Moderator(s): Anoop Mayampurath

  • Michael Murphy, Massachusetts Institute of Technology, United States
  • Stefanie Jegelka, Massachusetts Institute of Technology, United States
  • Ernest Fraenkel, Massachusetts Institute of Technology, United States


Presentation Overview:

Motivation: Advances in bioimaging now permit in-situ proteomic characterization of cell-cell interactions in complex tissues, with important applications across a spectrum of biological problems from development to disease. These methods depend on selection of antibodies targeting proteins that are expressed specifically in particular cell types. Candidate marker proteins are often identified from single-cell transcriptomic data, with variable rates of success, in part due to divergence between expression levels of proteins and the genes that encode them. In principle, marker identification could be improved by using existing databases of immunohistochemistry for thousands of antibodies in human tissue, such as the Human Protein Atlas. However, these data lack detailed annotations of the types of cells in each image.
Results: We develop a method to predict cell type specificity of protein markers from unlabeled images. We train a convolutional neural network with a self-supervised objective to generate embeddings of the images. Using nonlinear dimensionality reduction, we observe that the model clusters images according to cell types and anatomical regions for which the stained proteins are specific. We then use estimates of cell type specificity derived from an independent single-cell transcriptomics dataset to train an image classifier, without requiring any human labelling of images. Our scheme demonstrates superior classification of known proteomic markers in kidney compared to differential expression in single-cell transcriptomics.

17:20-17:30
Predictive Model for Endometriosis with Clinical, Lifestyle and Genetic Information
Room: Madison A
Format: Live from venue

Moderator(s): Anoop Mayampurath

  • Michal Linial, The Hebrew University of Jerusalem, Israel
  • Ido Blass, The Hebrew University of Jerusalem, Israel
  • Nadav Rappoprt, Ben-Gurion University of the Negev, Israel
  • Tali Sahar, McGill University Health Centre, Montreal, Canada
  • Adi Shribman, The Academic College of Tel Aviv-Yaffo, Israel


Presentation Overview:

Endometriosis is a disorder in which endometrial tissues are implanted outside of the uterus. Endometriosis affects 5–10% of all women of reproductive age yet is under-diagnosed. This research aims to develop an endometriosis prediction model using multiple inputs from the UK Biobank (UKBB). The data were split into those with a diagnosis of endometriosis (5,924; ICD-10: N80) and the rest (142,576). Over 1000 variables were used, including personal information regarding female health, lifestyle, self-reported data, genetic variants, and medical history prior to the endometriosis diagnosis. An endometriosis prediction model was developed using machine learning (ML) algorithms. CatBoost's gradient boosting methods produced the best prediction for the data-combined model, with an area under the ROC curve (ROC-AUC) of 0.78. We found that prior to being diagnosed with endometriosis, women had significantly more ICD-10 diagnoses than the average unaffected woman. Irritable bowel syndrome (IBS) and the length of the menstrual cycle were among the most informative variables ranked by SHAP values. Despite the restrictions of missing data and noisy medical input, we conclude that the UKBB's large population-based retrospective data is useful for the development of predictive models. The informative features extracted from the model may improve the clinical utility of endometriosis diagnosis.

17:30-17:40
Immune response-related gene regulatory pathways perturbed by targeted therapies in colorectal cancer: CALGB/SWOG 80405
Room: Madison A
Format: Live from venue

Moderator(s): Anoop Mayampurath

  • Akram Yazdani, Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  • Heinz-Josef Lenz, USC Norris Comprehensive Cancer Center, Los Angeles, CA, United States
  • Gianluigi Pillonetto, Department of Information Engineering, University of Padova, Padova, Italy
  • Raul Mendez-Giraldez, Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, NC, United States
  • Monica Marie Bertagnolli, Dana-Farber/Partners Cancer Care, Harvard Medical School, Boston, MA, United States
  • Alan P Venook, University of California at San Francisco, San Francisco, CA, United States
  • Mark J Ratain, Division of the Biological Sciences, University of Chicago, Chicago, IL, United States
  • Naim Rashid, Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  • Benjamin G Vincent, Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  • Xueping Qu, Genentech, South San Francisco, CA, United States
  • Azam Yazdani, Center of Perioperative Genetics and Genomics, Department of Anesthesiology, Perioperative and Pain Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, MA, United States
  • Yujia Wen, Alliance for Clinical Trials in Oncology, Chicago, IL, United States
  • William F Symmans, Department of Pathology, University of Texas MD Anderson Cancer Center, Houston, TX, United States
  • Andrew B Nixon, Duke Center for Cancer Immunotherapy, Duke University, Durham, NC, United States
  • Michael Kosorok, Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  • Charles M Perou, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  • Federico Innocenti, Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States


Presentation Overview:

Cetuximab or bevacizumab combined with chemotherapy are approved first-line regimens for metastatic colorectal cancer (mCRC). However, the biological pathways perturbed by these therapies are unknown and may be responsible for the large variation observed in therapeutic response. To elucidate these mechanisms, we used tumor RNA-seq and germline genotype data from 1,284 mCRC patients treated with cetuximab or bevacizumab in a randomized phase III trial (CALGB/SWOG-80405). We applied a novel integrative approach and identified treatment-specific putative causal biomarkers and gene regulatory pathways impacting overall survival (OS). This analysis accounted for confounders using Mendelian randomization, and the findings were reproduced in replication sets. To gain insight into their biological functions, we evaluated the relationships of the causal gene regulatory pathways with immune features estimated from RNA-seq data. Our study suggested a potentially important role for the interaction between RELT and MYO1G in a tumor cell escape mechanism under cetuximab therapy. We identified a pathway with a common function in DNA damage and repair and a pathway highly correlated with cytotoxicity signatures, impacting the response to cetuximab and bevacizumab, respectively. Moreover, SCD5, with a causal effect on OS of patients treated with bevacizumab, highlighted the possible risk of dyslipidemia in the use of VEGF inhibitors.

Support: U10CA180821, U10CA180882, U24CA196171, https://acknowledgments.alliancefound.org; U10CA180888 (SWOG); Lilly, Genentech, and Pfizer; ClinicalTrials.gov Identifier: NCT00265850

17:40-17:50
Direction-aware data fusion techniques for multi-omics pathway enrichment analysis and biomarker discovery
Room: Madison A
Format: Live from venue

Moderator(s): Anoop Mayampurath

  • Mykhaylo Slobodyanyuk, University of Toronto, Canada
  • Jüri Reimand, University of Toronto, Canada


Presentation Overview:

Different omics techniques allow us to characterise the genetic, transcriptomic, epigenomic and proteomic landscapes of cells and tissues, and to better understand their perturbations in disease. However, joint analyses of different omics datasets for a holistic understanding of cell function present a computational challenge. We recently developed ActivePathways, an integrative pathway enrichment analysis method that uses data fusion to merge signals from multiple omics datasets, prioritizes genes and pathways through p-value merging, and evaluates how much each input dataset contributes to them. Here we extend this computational framework to account for directional activities of genes and proteins across the input omics datasets. For example, fold-change in protein expression would be expected to associate positively with mRNA change of the corresponding gene, while DNA methylation change of the gene promoter would be expected to associate negatively. We extend our method to encode such directional interactions and to penalize genes and proteins where such assumptions are violated. We demonstrate the approach by integrating cancer RNA-seq, DNA methylation, and proteomics datasets from the CPTAC and TCGA projects, in which we uncover novel candidate biomarkers and pathways that were overlooked in analyses of individual datasets.
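The idea of penalizing directional conflicts during p-value merging can be illustrated with a small sketch. This is not the ActivePathways implementation: it assumes independent p-values, uses Fisher's method as the baseline, and introduces a hypothetical `directional_merge` function in which each log p-value is signed by whether the observed direction matches the expected one, so that conflicting evidence cancels. The chi-square reference distribution is borrowed from Fisher's method purely for illustration, and all input values are made up.

```python
import math


def chi2_sf_even_df(x, k):
    """Survival function of a chi-square variable with 2*k degrees of
    freedom (closed form, valid because the df is even)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_merge(pvalues):
    """Fisher's combined p-value for independent p-values in (0, 1]."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(stat, len(pvalues))


def directional_merge(pvalues, observed, expected):
    """Hypothetical direction-aware variant (illustrative assumption).

    observed[i]: +1 or -1, direction of change seen in dataset i.
    expected[i]: +1 or -1, direction dataset i should show relative to
                 a reference dataset (e.g. -1 for promoter methylation).
    Evidence whose direction violates the expectation is subtracted
    instead of added, which weakens the merged signal.
    """
    signed = sum(-math.log(p) * o * e
                 for p, o, e in zip(pvalues, observed, expected))
    stat = 2.0 * abs(signed)
    return chi2_sf_even_df(stat, len(pvalues))


# Made-up p-values for one gene: mRNA, protein, promoter methylation.
pvals = [0.001, 0.01, 0.005]
expected = [+1, +1, -1]      # methylation expected to move opposite to mRNA
consistent = [+1, +1, -1]    # all datasets agree with the expectation
conflicting = [+1, -1, -1]   # protein moves against the mRNA change

p_fisher = fisher_merge(pvals)
p_consistent = directional_merge(pvals, consistent, expected)
p_conflicting = directional_merge(pvals, conflicting, expected)
print(p_fisher, p_consistent, p_conflicting)
```

When every dataset agrees with its expected direction, the directional score reduces to the ordinary Fisher statistic; when the protein change contradicts the mRNA change, the merged p-value becomes larger, so the gene is ranked lower.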

17:50-18:00
Solving the Puzzle of Genetic Disease with Bits and Bytes
Room: Madison A
Format: Live from venue

Moderator(s): Anoop Mayampurath

  • Peter White, Nationwide Children's Hospital, United States


Presentation Overview:

Genomic medicine positively impacts pediatric care, from rapid diagnosis of genetic disorders to optimization of childhood cancer treatments. The Computational Genomics Group at Nationwide Children’s Hospital supports multiple translational research protocols, combining genomics and bioinformatics to improve patient outcomes. Our “Genomics of Rare Disease” protocol has used genome sequencing and novel bioinformatics approaches to find answers for patients with undiagnosed disease, revealing novel disease mechanisms. Through the application of cloud computing technologies, optimized bioinformatics pipelines, and an Apache Spark-backed variant warehouse, our “Rapid Genome Sequencing” protocol returns results for infants in the ICU within 48 hours. The “Cancer Protocol” provides extensive genomic profiling, positively impacting patient diagnosis, prognosis, and therapy. Finally, the Molecular Characterization Initiative is a new screening approach to identify therapeutic vulnerabilities in pediatric cancers by performing genomic, transcriptomic and epigenomic characterization of thousands of pediatric cancer samples nationwide. All genomic data is generated, analyzed, and interpreted within 14 days. Deidentified genomic and clinical data is submitted to dbGaP within minutes of a case being completed. By also sharing data from our translational protocols, we are creating a community resource that will enable wide-scale engagement of the translational bioinformatics community to help solve the puzzle of genetic disease.

18:00-18:05
Closing
Room: Madison A
Format: Live from venue

Moderator(s): Anoop Mayampurath