Schedule subject to change
All times listed are in EDT
Monday, July 15th
16:40-17:00
Introduction
Room: 522
Format: In Person

Moderator(s): Reinhard Schneider



Quality Assurance, Semantic Enrichment and Integration of Multimodal Health Data for Phenotype and Cohort Discovery with Deep Learning
Confirmed Presenter: Ian Overton, Queen's University Belfast, United Kingdom

Room: 522
Format: In Person

Moderator(s): Reinhard Schneider


Authors List:

  • Tom Toner, Queen's University Belfast, United Kingdom
  • Rashi Pancholi, Queen's University Belfast, United Kingdom
  • Tanya Sabwa, Queen's University Belfast, United Kingdom
  • Paul M, Queen's University Belfast, United Kingdom
  • Thorsten Forster, LifeArc, United Kingdom
  • Helen Coleman, Queen's University Belfast, United Kingdom
  • Ian Overton, Queen's University Belfast, United Kingdom

Presentation Overview:

Integration of data from multiple domains can greatly enhance the quality and applicability of knowledge generated in analysis workflows. However, working with health data is challenging, requiring careful preparation in order to support meaningful interpretation. We developed an R package for electronic health data preparation, “eHDPrep” (GigaScience 2023;12:giad030, https://cran.r-project.org/package=eHDPrep), demonstrated on a multimodal colorectal cancer dataset (661 patients, 155 variables; Colo-661); a further demonstrator is taken from The Cancer Genome Atlas (459 patients, 94 variables; TCGA-COAD). eHDPrep offers user-friendly methods for quality control, including internal consistency checking and redundancy removal with information-theoretic variable merging (Figures 1, 2). eHDPrep also facilitates numerical encoding, variable extraction from free text, and completeness analysis. Semantic enrichment functionality can generate new informative “meta-variables” according to ontological common ancestry, demonstrated with SNOMED CT and the Gene Ontology (Figure 3).
We deployed variational autoencoders with a complex loss function evaluating reconstruction and clustering on the above data and whole-slide tumour images to discover phenotypes and candidate cohorts for more effective molecular stratification (Figures 4-6). Phenotypes represent novel combinations of features across tumour pathology, standard clinical parameters, lifestyle and demographic variables. Molecular stratification within these novel phenotypes seeks to develop new clinical tools for precision oncology.

17:00-17:20
Proceedings Presentation: TA-RNN: an Attention-based Time-aware Recurrent Neural Network Architecture for Electronic Health Records
Confirmed Presenter: Serdar Bozdag, University of North Texas, United States

Room: 522
Format: In Person

Moderator(s): Irina Balaur


Authors List:

  • Mohammad Al Olaimat, University of North Texas, United States
  • Serdar Bozdag, University of North Texas, United States

Presentation Overview:

Motivation: Electronic Health Records (EHR) represent a comprehensive resource of a patient's medical history. EHR are essential for utilizing advanced technologies such as deep learning (DL), enabling healthcare providers to analyze extensive data, extract valuable insights, and make precise and data-driven clinical decisions. DL methods such as Recurrent Neural Networks (RNN) have been utilized to analyze EHR to model disease progression and predict diagnosis. However, these methods do not address some inherent irregularities in EHR data such as irregular time intervals between clinical visits. Furthermore, most DL models are not interpretable. In this study, we propose two interpretable DL architectures based on RNN, namely Time-Aware RNN (TA-RNN) and TA-RNN-Autoencoder (TA-RNN-AE), to predict a patient’s clinical outcome in EHR at the next visit and multiple visits ahead, respectively. To mitigate the impact of irregular time intervals, we propose incorporating time embedding of the elapsed times between visits. For interpretability, we propose employing a dual-level attention mechanism that operates between visits and features within each visit.
Results: The results of the experiments conducted on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and National Alzheimer’s Coordinating Center (NACC) datasets indicated superior performance of the proposed models for predicting Alzheimer’s Disease (AD) compared to state-of-the-art and baseline approaches based on F2 and sensitivity. Additionally, TA-RNN showed superior performance on the Medical Information Mart for Intensive Care (MIMIC-III) dataset for mortality prediction. In our ablation study, we observed enhanced predictive performance by incorporating time embedding and attention mechanisms. Finally, investigating attention weights helped identify influential visits and features in predictions.
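
Illustrative sketch (not the authors' TA-RNN implementation): the idea of embedding the elapsed time between visits and attending over visits can be pictured roughly as below. The sinusoidal embedding, the single dot-product attention, and every name here (`time_embedding`, `attend_over_visits`, the `query` vector, the embedding size of 4) are assumptions made for illustration. The feature-level attention mentioned in the abstract is left out.

```python
import math

def time_embedding(delta_days, dim=4):
    """Embed the elapsed time since the previous visit (sinusoidal form is an assumption)."""
    emb = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        emb.append(math.sin(delta_days * freq))
        emb.append(math.cos(delta_days * freq))
    return emb

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_over_visits(visits, deltas, query):
    """Append a time embedding to each visit, then weight visits by dot-product attention."""
    augmented = [v + time_embedding(d) for v, d in zip(visits, deltas)]
    scores = [sum(q * x for q, x in zip(query, row)) for row in augmented]
    weights = softmax(scores)
    width = len(augmented[0])
    context = [sum(w * row[j] for w, row in zip(weights, augmented)) for j in range(width)]
    return weights, context

# Hypothetical patient: three visits with two features each, at irregular intervals.
visits = [[0.2, 1.0], [0.4, 0.9], [0.9, 0.3]]
deltas = [0, 180, 30]        # days since the previous visit
query = [1.0] * 6            # hypothetical query vector (2 features + 4 time dimensions)
weights, context = attend_over_visits(visits, deltas, query)
```

The returned `weights` sum to one across visits, which is what makes attention weights of this kind readable as a per-visit importance score.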

17:20-17:40
Shared genetics between breast cancer and its predisposing diseases identifies candidate drugs for repurposing for breast cancer
Confirmed Presenter: Panagiotis Nikolaos Lalagkas, University of Massachusetts Lowell, United States

Room: 522
Format: In Person

Moderator(s): Irina Balaur


Authors List:

  • Panagiotis Nikolaos Lalagkas, University of Massachusetts Lowell, United States
  • Rachel Melamed, University of Massachusetts Lowell, United States

Presentation Overview:

The success of drugs targeting disease genes is widely acknowledged. However, identifying causal genes for common complex diseases remains a non-trivial task. This necessitates innovative approaches to accelerate complex disease drug discovery. We have previously shown that clinical associations between Mendelian and complex diseases can inform complex disease drug discovery due to pleiotropic effects of Mendelian genes. Here, we extend our approach to exploit clinical associations between pairs of complex diseases for drug discovery. We hypothesize that pleiotropic genes shared between a complex disease and its predisposing diseases can help us discover new uses for drugs currently approved only for the predisposing diseases. To test our hypothesis, we start with breast cancer, a well-studied and highly prevalent disease. We compile a list of six traits known to increase breast cancer risk (predisposing diseases), such as depression, high LDL, and type 2 diabetes. Using GWAS summary statistics and local genetic correlation analysis, we find a total of 84 genomic loci harboring mutations with positively correlated effects between breast cancer and each predisposing disease. These loci contain 202 protein-coding genes (shared genes). Using a network biology approach, for each disease pair, we connect drugs already indicated for the predisposing disease to its shared genes with breast cancer and identify drug repurposing candidates for breast cancer. Finally, we show that our list of candidate drugs is enriched for currently investigated and indicated drugs for breast cancer. Our findings suggest a novel way to accelerate drug discovery for complex diseases by leveraging shared genetics.

Prevalence and biological impact of clinically relevant gene fusions in head and neck cancer
Confirmed Presenter: Emily Hoskins, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States

Room: 522
Format: In Person

Moderator(s): Irina Balaur


Authors List:

  • Emily Hoskins, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Raven Vella, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Julie Reeser, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Michele Wing, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Eric Samorodnitsky, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Altan Turkoglu, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Leah Stein, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Elizabeth Breuning, The Bioinformatics Program, Loyola University Chicago, Chicago, IL, United States
  • Michelle Churchman, Clinical & Life Sciences, AsterInsights, Tampa, Florida, United States
  • Nancy Single, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States
  • Sameek Roychowdhury, Comprehensive Cancer Center and James Cancer Hospital, The Ohio State University, Columbus, OH, United States

Presentation Overview:

Objective: Head and neck cancer (HNC) is the seventh most common cancer worldwide, with a 5-year survival rate of ~50%. The only existing genomic biomarker that guides targeted therapies in HNC is oncogenic HRAS mutations. Gene fusions are clinically targetable genomic events that involve chromosomal rearrangement, resulting in aberrant function. Here we describe the biological and clinical impact of oncogenic fusions in a combined dataset of HNC.
Methods: We evaluated RNA sequencing data from HNCs from the Oncology Research Information Exchange Network (ORIEN, n=1,540), The Cancer Genome Atlas (TCGA, n=528), and other published studies (n=588). We utilized STAR-Fusion and Arriba to detect gene fusions from RNAseq data.
Results: Leveraging our combined cohort of 2,666 tumors with RNAseq, we identified 74 cases (2.8%) harboring a clinically relevant gene fusion. The most common fusions involved FGFR3 (n=19), EGFR (n=10), and FGFR2 (n=5). We observed significant gene overexpression in fusion-positive samples with respect to their gene fusion partner (p<0.001). Intrigued by the EGFR fusions that we uncovered, which have not previously been described in head and neck cancers, we further assessed the structure and breakpoints in these fusions. In ORIEN, 4/5 gene fusions harbored the same breakpoint in EGFR, with a gene fusion structure found to be successfully clinically targetable in lung cancer.
Conclusions: Our results demonstrate that oncogenic gene fusions are prevalent in HNC, often lead to overexpression of the oncogene fusion partner, and are clinically relevant. Our results provide expanded therapeutic opportunities for patients with HNC.

17:40-18:00
Proceedings Presentation: PhiHER2: Phenotype-informed weakly supervised model for HER2 status prediction from pathological images
Confirmed Presenter: Jian Liu, College of Computer Science, Centre for Bioinformatics and Intelligent Medicine, Nankai University, China

Room: 522
Format: Live Stream

Moderator(s): Irina Balaur


Authors List:

  • Chaoyang Yan, College of Computer Science, Centre for Bioinformatics and Intelligent Medicine, Nankai University, China
  • Jialiang Sun, College of Computer Science, Centre for Bioinformatics and Intelligent Medicine, Nankai University, China
  • Yiming Guan, College of Computer Science, Centre for Bioinformatics and Intelligent Medicine, Nankai University, China
  • Jiuxin Feng, College of Computer Science, Centre for Bioinformatics and Intelligent Medicine, Nankai University, China
  • Hong Liu, The Second Surgical Department of Breast Cancer, Tianjin Medical University Cancer Institute & Hospital, China
  • Jian Liu, College of Computer Science, Centre for Bioinformatics and Intelligent Medicine, Nankai University, China

Presentation Overview:

Motivation: HER2 status identification enables physicians to assess the prognosis risk and determine the treatment schedule for patients. In clinical practice, pathological slides serve as the gold standard, offering morphological information on cellular structure and tumoral regions. Computational analysis of pathological images has the potential to discover morphological patterns associated with HER2 molecular targets and achieve precise status prediction. However, pathological images are typically very high-resolution, and HER2 expression in breast cancer images often manifests intratumoral heterogeneity.
Results: We present a phenotype-informed weakly-supervised multiple instance learning architecture (PhiHER2) for the prediction of the HER2 status from pathological images of breast cancer. Specifically, a hierarchical prototype clustering module is designed to identify representative phenotypes across whole slide images. These phenotype embeddings are then integrated into a cross-attention module, enhancing feature interaction and aggregation on instances. This yields a prototype-based feature space that leverages the intratumoral morphological heterogeneity for HER2 status prediction. Extensive results demonstrate that PhiHER2 captures a better WSI-level representation by the typical phenotype guidance and significantly outperforms existing methods on real-world datasets. Additionally, interpretability analyses of both phenotypes and WSIs provide explicit insights into the heterogeneity of morphological patterns associated with molecular HER2 status.

Tuesday, July 16th
8:40-9:20
Invited Presentation: Advancing Genomic Medicine through Clinical and Research Strategies
Confirmed Presenter: Heidi Rehm

Room: 519
Format: In Person

Moderator(s): Maria Secrier


Authors List:

  • Heidi Rehm

Presentation Overview:

Supporting genomics in research and medicine requires infrastructure, including standards, knowledgebases and global data sharing, as well as a rich interface between research and clinical care as new discoveries are made. This talk will present strategies to identify novel causes of rare disease including the application of new technologies and analysis methods as well as building innovative approaches to global data sharing in collaboration with AnVIL and the Global Alliance for Genomics and Health. It will end on novel approaches to support genetics and genomics in medical practice.

9:20-9:40
Transcriptional modulation unique to vulnerable motor neurons predict ALS across species and SOD1 gene mutations
Confirmed Presenter: Irene Mei, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden

Room: 519
Format: In Person

Moderator(s): Maria Secrier


Authors List:

  • Irene Mei, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
  • Susanne Nichterwitz, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
  • Melanie Leboeuf, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
  • Jik Nijssen, Department of Cellular and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
  • Isadora Lenoel, Institut du Cerveau (ICM), Hôpital Pitié Salpêtrière, Paris, France
  • Dirk Repsilber, School of Medical Sciences, Örebro University, Örebro, Sweden
  • Christian S. Lobsiger, Institut du Cerveau (ICM), Hôpital Pitié Salpêtrière, Paris, France
  • Eva Hedlund, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden

Presentation Overview:

Amyotrophic lateral sclerosis (ALS) is characterized by the progressive loss of somatic motor neurons (MNs), which innervate skeletal muscles. However, certain MN groups, including ocular MNs that regulate eye movement, are relatively resilient to ALS. To reveal mechanisms of differential MN vulnerability, we investigate the transcriptional dynamics of two vulnerable and two resilient MN populations in SOD1G93A ALS mice. Differential gene expression analysis shows that each neuron type displays a largely unique spatial and temporal response to ALS. Resilient MNs regulate few genes in response to disease, but show clear divergence in baseline gene expression compared to vulnerable MNs, which in combination may hold the key to their resilience. EASE, fGSEA and ANUBIX enrichment analyses demonstrate that vulnerable MN groups share pathway activation, including regulation of neuronal death, ERK and MAPK cascades, inflammatory response and synaptic signaling. These pathways are largely driven by 11 upregulated genes, including Atf3, Cd44, Gadd45a, Ngfr, Ccl2, Ccl7, Gal, Timp1 and Nupr1, and indicate that cell death occurs through similar mechanisms across vulnerable MNs, albeit with distinct timing. A Random Forest machine learning-based approach using DEGs upregulated in our SOD1G93A spinal MNs predicts disease in human stem cell-derived MNs harboring the SOD1E100G mutation, and shows that dysregulation of VGF, PENK, INA and NTS is a strong disease predictor across SOD1 mutations and species. A shared transcriptional vulnerability was also assessed through a meta-analysis across mouse SOD1 transcriptome datasets. In conclusion, our study reveals vulnerability-specific gene regulation that may act to preserve neurons and can be used to predict disease.

Multi-dimensional Integration of PPI Network with Genetic and Molecular Data to Decipher the Genetic Underpinnings of RA Endotypes
Confirmed Presenter: Javad Rahimikollu, University of Pittsburgh, United States

Room: 519
Format: In Person

Moderator(s): Maria Secrier


Authors List:

  • Javad Rahimikollu, University of Pittsburgh, United States
  • Priyamvada Guha Roy, University of Pittsburgh, United States
  • Larry Moreland, University of Colorado, United States
  • Jishnu Das, University of Pittsburgh, United States

Presentation Overview:

Rheumatoid arthritis (RA) is a complex autoimmune disease with a polyetiological genetic basis. Serum rheumatoid factor (RF) and anti-citrullinated peptide (CCP) antibodies are used to diagnose RA. However, it is unknown whether corresponding serological profiles map to distinct endotypes of RA. To address this, we first dissected differences across ~900 RA patients, half of whom were serologically CCP+RF+ (i.e., double positive, DP) and half of whom were RF+ alone (RF). Surprisingly, there was a significant difference in heritability across these groups (~30%), suggesting fundamental differences in genetic risk of these two kinds of RA. Next, we carried out a genome-wide association study (GWAS) and identified the HLA locus as explaining part of, but not the entire, difference in heritability between DP and RF RA. To delve into the missing heritability, we implemented a network-based GWAS approach. We adapt Linkage Disequilibrium Adjusted Kinships (LDAK) to aggregate the impact of multiple regulatory SNPs associated with a gene into a single score, taking into account the underlying LD structure. Using network propagation, we then identify modules that explain significant differences in heritability across DP and RF. These modules include HLA genes, but also capture other cytokines, chemokines and immune regulators, and almost completely capture the entire difference in heritability. We were also able to further validate these modules by recapitulating some of the corresponding differences at the transcriptomic and proteomic level. Together, our results suggest that DP and RF RA are different disease endotypes with distinct genetic bases and pathophysiology.
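
Illustrative sketch (an assumption, not the presenters' method): the abstract does not say which network propagation scheme is used. One common choice is a random walk with restart, in which gene-level scores are repeatedly spread to network neighbours while a fraction of the score is reset to the starting genes. The toy network, the node names and the `restart` value below are hypothetical.

```python
def propagate(adjacency, seed_scores, restart=0.5, iterations=50):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours (every node needs at least one).
    seed_scores: dict mapping some nodes to non-negative starting scores.
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    start = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    current = dict(start)
    for _ in range(iterations):
        spread = {n: 0.0 for n in nodes}
        for n in nodes:
            share = current[n] / len(adjacency[n])   # split the score evenly among neighbours
            for neighbour in adjacency[n]:
                spread[neighbour] += share
        current = {n: (1 - restart) * spread[n] + restart * start[n] for n in nodes}
    return current

# Hypothetical four-gene chain A - B - C - D with all starting score on gene A.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = propagate(network, {"A": 1.0})
```

In this toy chain the propagated score decays with distance from the seed gene, so genes close to high-scoring genes in the network rank above distant ones. Module detection would then operate on such smoothed scores.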

9:40-10:00
AI Epilepsy: Software solution to aid in the diagnosis of epilepsy using machine learning algorithms
Confirmed Presenter: Juan Carvajal, Universidad de los Andes, Colombia

Room: 519
Format: In Person

Moderator(s): Maria Secrier


Authors List:

  • Juan Carvajal, Universidad de los Andes, Colombia
  • Laura Guio, HOMI, Fundación Hospital Pediátrico La Misericordia, Colombia
  • Danilo García-Orjuela, Biotecnología y Genética SAS, Colombia
  • David Diaz, Universidad de los Andes, Colombia
  • Diego Granada, Universidad de los Andes, Colombia
  • Andres Delgado Ruiz, Universidad de los Andes, Colombia
  • Nestor Gonzalez, Universidad de los Andes, Colombia
  • Jennifer Guzmán-Porras, HOMI, Fundación Hospital Pediátrico La Misericordia, Colombia
  • Paula Siaucho, Biotecnología y Genética SAS, Colombia
  • Jorge Díaz-Riaño, Biotecnología y Genética SAS, Colombia
  • Andres Naranjo, HOMI, Fundación Hospital Pediátrico La Misericordia, Colombia
  • Silvia Maradei-Anaya, Biotecnología y Genética SAS, Colombia
  • Jorge Duitama, Universidad de los Andes, Colombia
  • Kelly Garces, Universidad de los Andes, Colombia

Presentation Overview:

Epilepsy is a chronic neurological disorder characterized by recurrent seizures, affecting approximately 50 million people worldwide. Different methods have been developed for efficient diagnosis, including prediction of cases requiring surgical intervention due to lack of effectiveness of drug-based treatments (known as refractory epilepsy). These methods include signal processing using electroencephalography (EEG), analysis of structural MRI, and expression of miRNA biomarkers in peripheral blood. Given the heterogeneity of these data, we developed a software solution to perform an integrated analysis of these data types, to aid diagnosis of epilepsy. Users can load the results of the different exams to generate a common report including the results of the different analyses. The analysis includes a machine learning approach for detection of seizures from EEG data. It also includes a classification model for brain structural anomalies from MRI data. Finally, it includes a classification module based on the expression patterns of blood miRNA data. The software follows a distributed architecture with five main components orchestrated through Docker Compose. It facilitates the execution of asynchronous processes to run complex predictions by implementing RabbitMQ message queues. A visualizer of MRI scans was integrated for visualization and interaction with the data obtained from these images. Validation experiments show that the application is efficient and easy to use, taking into account the size and complexity of the data that need to be analyzed together for epilepsy patients. We expect that this software will make a significant contribution towards the development of new tools and methods for epilepsy research.
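
Illustrative sketch (an assumption, not the presenters' code): the asynchronous pattern described above, in which submitted exams are queued and processed by background workers, can be pictured with Python's standard `queue` and `threading` modules standing in for a message broker. The job identifiers, the `count_spikes` handler and its threshold are hypothetical.

```python
import queue
import threading

def run_jobs(jobs, handler, workers=2):
    """Process (job_id, payload) pairs on background worker threads and collect the results."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            try:
                if item is None:          # sentinel: no more work for this worker
                    return
                job_id, payload = item
                outcome = handler(payload)
                with lock:
                    results[job_id] = outcome
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)
    tasks.join()
    for thread in threads:
        thread.join()
    return results

def count_spikes(samples, threshold=3.0):
    """Hypothetical stand-in for a prediction step: count samples above a threshold."""
    return sum(1 for value in samples if value > threshold)

reports = run_jobs([("eeg-1", [0.1, 4.2, 3.5]), ("eeg-2", [0.3, 0.2])], count_spikes)
```

The caller submits work and collects results by job identifier, so a slow prediction does not block the submission of further exams. A real broker adds persistence and distribution across containers, which this in-process sketch does not attempt.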

HTJ2K as a Default Storage Format for Medical Images
Confirmed Presenter: Utkarsh Rai, University of Arkansas for Medical Sciences, United States

Room: 519
Format: In Person

Moderator(s): Maria Secrier


Authors List:

  • Utkarsh Rai, University of Arkansas for Medical Sciences, United States
  • Lawrence Tarbox, University of Arkansas for Medical Sciences, United States
  • Chris Hafey, AWS HealthImaging, United States

Presentation Overview:

Healthcare systems around the world store large volumes of medical images, like X-rays or scans. The largest public archive currently has 30.9 million radiology images. These images are high quality and use a lot of space, making them difficult to store and share.
Image compression comes with two main challenges: loss in the quality of the image and the additional resources needed to compress existing images. My project proposes using a recently introduced image format, high-throughput JPEG 2000 (HTJ2K), for lossless compression of these images, bringing their size down by a rough factor of 3.
This new format allows you to see a blurry version first, which gradually gets clearer. This is very handy when you are dealing with slow internet or huge files. A single image file holds several copies of gradually improving resolution, and medical researchers can pick any of these without having to duplicate their datasets.
My project provides open-source tools to convert medical images to HTJ2K; methods for users of the images to decode, view and use them as necessary; and pipelines that system architects can use to model medical image storage in the HTJ2K format so that it is easy to maintain.
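
Illustrative sketch (an assumption, not part of the presenter's tools): in the JPEG 2000 family, each wavelet decomposition level halves the image dimensions, so a file encoded with N decompositions can be decoded at N + 1 resolutions. The function below lists those sizes for an image whose origin is at (0, 0). The function name and the default of 5 decompositions are assumptions.

```python
import math

def resolution_levels(width, height, decompositions=5):
    """List the (width, height) available at each resolution level, smallest first."""
    levels = []
    for level in range(decompositions + 1):
        scale = 2 ** (decompositions - level)
        levels.append((math.ceil(width / scale), math.ceil(height / scale)))
    return levels

# A 512 x 512 slice with 3 decompositions can be read at four sizes.
sizes = resolution_levels(512, 512, decompositions=3)
# sizes == [(64, 64), (128, 128), (256, 256), (512, 512)]
```

A viewer on a slow connection could request the smallest level first and refine it, which matches the blurry-then-sharp behaviour described above.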

10:40-11:20
Invited Presentation: The challenges of clinical deployment of automated cancer type classification for routine use
Confirmed Presenter: Quaid Morris, Memorial Sloan Kettering Cancer Center, United States

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List:

  • Quaid Morris, Memorial Sloan Kettering Cancer Center, United States
  • Madison Darmofal, Memorial Sloan Kettering Cancer Center, United States
  • Michael Berger, Memorial Sloan Kettering Cancer Center, United States

Presentation Overview:

Accurate cancer type classifiers would have a profound impact on the success of cancer treatment. Each year, in the US, more than 30,000 people present with new cancers of unknown primary (CUP), for which treatment options are very limited. Up to half of these patients could be matched with FDA-approved therapies if their cancer type were known. Cancer type classifiers can also distinguish new cancers from recurrences and resolve difficult diagnostic challenges. We recently deployed a highly accurate cancer type classifier, GDD-ENS, at Memorial Sloan Kettering Cancer Center (MSKCC) based on inputs derived from an FDA-approved, and routinely applied, targeted DNA sequencing panel called MSK-IMPACT. GDD-ENS is based on ENSembles of multilayer perceptrons and replaced a pre-existing MSKCC system, GDD-RF. To make GDD-ENS well-suited to the clinical setting, based on lessons learned from GDD-RF, we made specific design choices in the classifier, in its training and evaluation, and in how its outputs are integrated with other routinely available clinical data. I will present GDD-ENS, these choices and their impacts, as well as GDD-ENS’s successes and some areas of improvement. I will also discuss our efforts to generalize GDD-ENS to other targeted cancer gene panels.
Joint work with Dr Michael Berger and our labs.
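
Illustrative sketch (an assumption, not the GDD-ENS implementation): one common way to combine an ensemble of classifiers is to average the class probabilities produced by each member and report the top class together with its averaged probability as a confidence value. The labels and probabilities below are hypothetical.

```python
def ensemble_predict(member_probs, labels):
    """Average per-class probabilities across ensemble members and return (label, confidence)."""
    count = len(member_probs)
    mean = [sum(probs[k] for probs in member_probs) / count for k in range(len(labels))]
    best = max(range(len(labels)), key=lambda k: mean[k])
    return labels[best], mean[best]

# Three hypothetical ensemble members scoring one tumour across three cancer types.
labels = ["breast", "lung", "colorectal"]
member_probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
]
predicted, confidence = ensemble_predict(member_probs, labels)
```

Here two of the three members favour the same type, so the averaged confidence for that type is moderate rather than high. A deployment could use a confidence value of this kind to decide which predictions to report.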

11:20-11:40
CIViC - an open-access knowledgebase for community driven curation of clinical variants in cancer
Confirmed Presenter: Mariam Khanfar, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List:

  • Mariam Khanfar, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Susanna Kiwala, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Kilannin Krysiak, Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, United States
  • Adam C. Coffman, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Joshua F. McMichael, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, United States
  • Arpad M. Danos, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Jason Saliba, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Nilan Patel, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Steven Jones, Canada’s Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
  • Cameron J. Grisdale, Canada’s Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
  • Caralyn Reisle, Canada’s Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
  • Jake Lever, School of Computer Science, University of Glasgow, Glasgow, United Kingdom
  • Alex H. Wagner, Departments of Pediatrics and Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States
  • Malachi Griffith, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  • Obi L. Griffith, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States

Presentation Overview:

In the era of personalized oncology, identifying clinically relevant variants is critical due to the rapidly increasing variant data and the need for consensus variant interpretation. The Clinical Interpretation of Variants in Cancer (CIViC; www.civicdb.org) knowledgebase is a free, open-access, open-source, and open-license public resource with an intuitive user interface and flexible public API for programmatic access to all content.

CIViC supports variant interpretations with six evidence types: Predictive (Therapeutic response), Diagnostic, Prognostic, Predisposing, Oncogenic, and Functional. The model also supports curating Molecular Profiles, which allows users to logically associate one or more variants with evidence. This expansion into "Complex" multi-variant profiles enables the evaluation of clinical significance in contexts such as variant co-occurrence or mutual exclusivity, further enhancing the utility of CIViC in the field of oncology.

All content in CIViC adheres to a structured data model which follows a published standard operating procedure for curation. This data model incorporates ontologies, standards and guidelines from across the field to promote interoperability and compatibility with other efforts. The CIViC community currently has >350 contributors that have generated >10,000 evidence items from >3,600 sources spanning >390 diseases and >530 therapies.

CIViC's key role in cancer variant interpretation was recognized with its inclusion in the list of 37 Global Core Biodata Resources, underscoring its value to the biological and life sciences community. As CIViC continues to adhere to rigorous standards in maintaining data quality, it remains an invaluable, freely accessible resource, advancing the field of personalized oncology.

Timing the development of chemoresistance in relapsed pediatric cancer
Confirmed Presenter: Sasha Blay, The Hospital for Sick Children, Canada

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List:

  • Sasha Blay, The Hospital for Sick Children, Canada
  • Mehdi Layeghifard, The Hospital for Sick Children, Canada
  • Scott Davidson, The Hospital for Sick Children, Canada
  • David Chen, The Hospital for Sick Children, Canada
  • Astra Schwertschkow, The Hospital for Sick Children, Canada
  • Vijay Ramaswamy, The Hospital for Sick Children, Canada
  • Michael Taylor, The Hospital for Sick Children, Canada
  • Elli Papaemmanuil, Memorial Sloan Kettering Cancer Center, United States
  • Anita Villani, The Hospital for Sick Children, Canada
  • David Malkin, The Hospital for Sick Children, Canada
  • Ludmil Alexandrov, University of California San Diego, United States
  • Mark Cowley, Children's Cancer Institute, New South Wales, Australia
  • Adam Shlien, The Hospital for Sick Children, Canada

Presentation Overview:

Survivors of pediatric cancer face lifelong battles with severe morbidities, including a significant risk of recurrence. Mutational signatures are patterns of somatic mutations in the cancer genome with specific etiologies. Recent cell line work links mutational signatures to chemotherapy response, signifying chemoresistance. The Shlien lab has identified therapy-associated mutational signatures in the genomes of relapsed pediatric patients, creating an opportunity to characterise when and where the effects of chemotherapy are felt in the pediatric cancer genome. I thus developed a pipeline that combines clonal evolution reconstruction with mutational signature extraction to elucidate changes in mutational processes. I used this pipeline to analyze 1,743 pediatric tumor genomes from 10 pediatric cancer datasets. I detected mutational signatures linked to 4 chemotherapy drugs: temozolomide, platinum-based agents, fluorouracil, and thiopurine. Of 235 samples with confirmed exposure, 37.9% displayed one or more therapy-associated signatures. Mutational signatures associated with alkylating agents like cisplatin were more prevalent and mutationally heavy than those linked to antimetabolites, suggesting the drug mechanism dictates its presentation in the genome. I identified specific subclones with chemotherapy signatures, demarking subclone-level resistance. In cases with multiple tumour samples, resistant subclones in recurrences were traced back to ancestors in the primary diagnostic tumors, suggesting certain lineages possessed the ability to withstand chemotherapy-induced pressures from an early stage and then expanded following treatment. Thus, investigating mutational signatures at the subclonal level unveils new insights into the clonal dynamics of pediatric cancers and the development of chemoresistance.
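
Illustrative sketch (an assumption, not the presenter's pipeline): one simple way to ask whether a subclone carries a therapy-associated signature is to compare its mutation spectrum with a reference signature by cosine similarity. Real analyses typically use 96 trinucleotide channels and fit several signatures at once. The six-channel spectra, the reference signature and the 0.8 threshold below are hypothetical toy values.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def flag_subclones(subclone_spectra, signature, threshold=0.8):
    """Return the names of subclones whose spectrum resembles the reference signature."""
    return sorted(
        name
        for name, spectrum in subclone_spectra.items()
        if cosine_similarity(spectrum, signature) >= threshold
    )

# Channels: C>A, C>G, C>T, T>A, T>C, T>G (toy counts per subclone).
signature = [0.05, 0.05, 0.70, 0.05, 0.10, 0.05]   # hypothetical C>T-dominated signature
subclones = {
    "trunk": [10, 10, 12, 9, 11, 10],
    "relapse_subclone": [3, 2, 60, 4, 8, 3],
}
flagged = flag_subclones(subclones, signature)
```

In this toy data only the relapse subclone resembles the reference signature. Placing flagged subclones on a reconstructed clonal tree is what would allow the timing of therapy-associated mutations to be read off.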

11:40-12:00
PHENO-DEX: Phenotypic Mapping of Dexamethasone Response in Breast Cancer Cells using Single-cell Transcriptomics
Confirmed Presenter: Jiaqi Li, NIH, United States

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List: Show

  • Jiaqi Li, NIH, United States
  • Benedict Anchang, NIH, United States

Presentation Overview: Show

Identifying tumor heterogeneity in response to treatment prior to clinical intervention is critical for long-term survival. We’ve developed an AI-based reference mapping strategy to profile tumor subpopulations in response to perturbations using single-cell transcriptomics. This strategy, known as PHENO-DEX, integrates two major algorithms: DSFMix and PHENOSTAMP. We use DSFMix, which is based on tree models, to identify response/non-response cell trajectories from a dexamethasone (Dex)-treated breast cancer cell dataset. Then, using PHENOSTAMP, a feed forward loop neural network algorithm, we create a Dex-responding reference map, identifying 9 cell states (4 responsive and 5 non-responsive). Each cell state exhibits unique characteristics that correlate with the cell plasticity response to Dex. We projected thirty breast cancer cell lines and three clinical breast cancer tumors onto the reference map, effectively revealing their cell state heterogeneity in response to Dex. In summary, we’ve provided a framework to comprehensively characterize both cell lines and clinical samples, which better dissects the Dex-responsive states of tumors prior to any treatment, thereby providing clinical guidance for treatment decisions.

Spatial landscape of malignant pleural and peritoneal mesothelioma tumor immune microenvironment
Confirmed Presenter: Hatice Osmanbeyoglu, University of Pittsburgh, United States

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List: Show

  • Xiaojun Ma, University of Pittsburgh, United States
  • David Lembersky, University of Pittsburgh, United States
  • Elena Kim, University of Pittsburgh, United States
  • Joseph Testa, Fox Chase Cancer Center, United States
  • Tullia Bruno, University of Pittsburgh/Hillman Cancer Center, United States
  • Hatice Osmanbeyoglu, University of Pittsburgh, United States

Presentation Overview: Show

Immunotherapies have shown modest clinical benefit thus far for malignant mesothelioma (MM). A deeper understanding of immune cell spatial distribution within the tumor immune microenvironment (TIME) is needed to identify interactions between tumor and different immune cell types that might impact the effectiveness of potential immunotherapies. We performed multiplex immunofluorescence (mIF) using tissue microarrays (TMAs, n=3) of samples from patients with malignant peritoneal (n=25) and pleural (n=88) mesothelioma (MPeM and MPM, respectively) to elucidate the spatial distributions of major immune cell populations and their association with LAG3, BAP1, NF2, and MTAP expression, the latter as a proxy for CDKN2A/B. We also analyzed the relationship between the spatial distribution of major immune cell types and MM patient prognosis and clinical features. The distribution of immune cells within the TIME is similar between MPM and MPeM. However, there is a higher level of interaction between immune cells and tumor cells in MPM than in MPeM. Within MPM tumors, there is an increased amount of interaction between tumor cells and CD8+ T cells in BAP1-low than in BAP1-high expressing tumors. The cell-cell interactions identified in this investigation have potential implications for the immune response against MM tumors and could be a factor in the different behaviors of MPM and MPeM. Our findings provide a valuable resource for the MM cancer research community and exemplify the utility of spatial resolution within single-cell analyses. Our mesothelioma spatial atlas mIF dataset is available at
https://mesotheliomaspatialatlas.streamlit.app/.

12:00-12:20
Poster Flash Talks
Room: 519
Format: In person

Moderator(s): Irene Ong


14:20-14:40
Integrative transcriptomic analysis and predictive modeling for immunotherapy response in melanoma
Confirmed Presenter: Yamil Damian Mahmoud, Laboratorio de Glicomedicina, Instituto de Biología y Medicina Experimental, CONICET, Argentina

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List: Show

  • Yamil Damian Mahmoud, Laboratorio de Glicomedicina, Instituto de Biología y Medicina Experimental, CONICET, Argentina
  • Florencia Veigas, Laboratorio de Glicomedicina, Instituto de Biología y Medicina Experimental, CONICET, Argentina
  • Marcelo Hill, Laboratory of Immunoregulation and Inflammation, Institut Pasteur de Montevideo, Uruguay
  • Maria Romina Girotti, Laboratorio de Glicomedicina, Instituto de Biología y Medicina Experimental, CONICET, Argentina
  • Juan Manuel Perez-Saez, Laboratorio de Glicomedicina, Instituto de Biología y Medicina Experimental, CONICET, Argentina
  • Gabriel A Rabinovich, Laboratorio de Glicomedicina, Instituto de Biología y Medicina Experimental, CONICET, Argentina

Presentation Overview: Show

Despite significant advances in immunotherapies, a substantial subset of melanoma patients remains unresponsive, emphasizing the critical need for predictive biomarkers. Our study integrates transcriptomic analysis and predictive modeling to address this challenge.
We analyzed public single-cell RNASeq (scRNA-Seq) data from 48 melanoma biopsies (16,291 cells) and bulk RNASeq data from 514 patients treated with anti-PD1/anti-CTLA4. Non-responders exhibited upregulated glycosylation-related genes in macrophages and CD8 T-cells, indicative of compromised immune function. Additionally, macrophages from non-responder biopsies displayed an immunosuppressive profile, coinciding with a treatment-resistant cell sub-group.
Furthermore, we integrated scRNA-Seq data from various cancers (totaling 382,019 cells) and developed a signature for immune cell deconvolution in bulk datasets. Responders during treatment showed higher levels of CD8 T-cells, CD4 activated memory T-cells, and total immune infiltrate. Interestingly, responders also displayed increased levels of progenitor and terminally exhausted CD8 T cells compared to non-responders pre- and during treatment, respectively.
To create a robust predictive model of response to immunotherapies in melanoma, we combined the estimated immune cell composition with glycosylation-related genes, our previously published inflammasome pathway signature, and other known indicators, including tertiary lymphoid structures, cytolytic score, and PDL1 expression. Our XGBoost-based machine learning model was trained on the bulk RNA cohort data and achieved an accuracy of 0.79 and AUC of 0.87 with cross-validation.
In conclusion, our findings underscore the potential of integrating transcriptomic analysis and predictive modeling in translational medicine for predicting immunotherapy response in melanoma patients. This emphasizes the critical role of multi-omic approaches in precision medicine for cancer immunotherapy.

A computational approach for the high-throughput identification of cancer-specific antigens for immunotherapeutic development
Confirmed Presenter: Rawan Shraim, Children's Hospital of Philadelphia/Drexel University, United States

Room: 519
Format: In Person

Moderator(s): Irene Ong


Authors List: Show

  • Rawan Shraim, Children's Hospital of Philadelphia/Drexel University, United States
  • Brian Mooney, BC Cancer Research Institute, Canada
  • Karina L. Conkrite, Children's Hospital of Philadelphia, United States
  • Amber K. Weiner, Children's Hospital of Philadelphia, United States
  • Gregg B. Morin, BC Cancer Research Institute, Canada
  • Poul H. Sorensen, BC Cancer Research Institute, Canada
  • John M. Maris, Children's Hospital of Philadelphia, United States
  • Sharon J. Diskin, Children's Hospital of Philadelphia, United States
  • Ahmet Sacan, Drexel University, United States

Presentation Overview: Show

Cancer remains a major global health challenge, with current treatments such as chemotherapy and radiotherapy often limited by toxicity and late effects. This has prompted the development of targeted immunotherapies. An obstacle to the development of these therapies is the identification of cancer-specific antigens as therapeutic targets.

To address this challenge computationally, we developed a tool that prioritizes potential immunotherapeutic targets by integrating multi-source data, including user-supplied cancer expression data (e.g., proteomics or RNA-sequencing) and quantitative features from various databases selected to address predefined criteria for ideal immunotherapeutic targets. Our tool adjusts for normalization and missing values and applies feature weighting, producing a gene-specific score that reflects each gene’s suitability as a therapeutic target. We evaluated our tool’s performance using the mean average precision (MAP) score, which assesses the prioritization rankings of known therapeutic targets within the cancer phenotype. Using twelve pediatric cancer cell line proteomics datasets to validate our methodology, we generated optimized parameters leading to a 27-fold increase (p < 0.001) in the MAP score, highlighting our tool’s target prioritization capabilities. With these optimized parameters, our tool scored known chimeric antigen receptor T-cell targets such as CD19, CD22, and CD79b among the top 10 targets in B-cell non-Hodgkin’s lymphoma, validating our methodology. Additionally, HLA-G was identified as a novel potential target across the pediatric cancer phenotypes surveyed in the analysis.

We have developed a tool to efficiently identify immunotherapeutic targets that can be used to accelerate the development of safer and more effective cancer immunotherapies.
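
For readers unfamiliar with the evaluation metric named in this abstract, the following is a minimal sketch of how a mean average precision (MAP) score can be computed for ranked gene lists. It uses the standard textbook definition; the function names, the placeholder genes (GENE_A, GENE_B) and the toy ranking are assumptions for illustration and are not taken from the authors' tool or data.

```python
def average_precision(ranked_genes, known_targets):
    """Average precision of one ranked gene list against a set of known targets.

    Known targets that never appear in the ranking count as misses,
    because the sum is divided by the total number of known targets.
    """
    known = set(known_targets)
    hits = 0
    precision_sum = 0.0
    for rank, gene in enumerate(ranked_genes, start=1):
        if gene in known:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(known) if known else 0.0


def mean_average_precision(rankings, targets_by_dataset):
    """Mean of the per-dataset average precision values."""
    scores = [
        average_precision(rankings[name], targets_by_dataset[name])
        for name in rankings
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example (hypothetical ranking, not real results).
toy_ranking = ["CD19", "GENE_A", "CD22", "GENE_B"]
toy_targets = {"CD19", "CD22"}
toy_ap = average_precision(toy_ranking, toy_targets)
```

A ranking that places known targets nearer the top yields a higher score, which is why a fold increase in MAP can be read as an improvement in prioritization.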

14:40-15:00
Leveraging a Single-Cell Language Model for Precise EMT Status Prediction and Gene Signature Identification in Cancer
Confirmed Presenter: Shi Pan, University College London, United Kingdom

Room: 519
Format: In Person

Moderator(s): Sikander Hayat


Authors List: Show

  • Shi Pan, University College London, United Kingdom
  • Maria Secrier, University College London, United Kingdom

Presentation Overview: Show

The epithelial-to-mesenchymal transition (EMT) is pivotal in tumour progression and resistance to treatment, yet its heterogeneity complicates the precise assessment of EMT status of individual tumour cells. While key epithelial and mesenchymal genes driving the transformation are well characterised, other regulators, especially at intermediate stages of the process, are less well understood.
By leveraging a pre-trained single-cell language model, we develop a generalisable classifier named EMT-language model (EMT-LM) to predict multiple states within the EMT continuum at single-cell resolution. Our training data come from an RNA-seq dataset from Cook et al [1], which profiles single cells from 0 hours to 7 days during EMT. EMT-LM predicts EMT state with an average AUROC of 90% across various cancers. Our Attention-Driven Expression Significance Index (ADESI) combines attention scores from EMT-LM with gene expression to uncover genes that are critical in regulating the entire timeline of EMT. Our top regulators include genes involved in mitochondrial function (e.g., NDUFB10, MRPL51) and oxidative stress response (e.g., PRDX1), suggesting a metabolic reprogramming during EMT. Patients exhibiting the 8h and 3d EMT signatures, as identified by genes with high attention scores in these categories, showed a notable decrease in survival rates in the METABRIC dataset.
In conclusion, EMT-LM exemplifies the effective application of language models in cancer biology research, offering a novel approach to EMT status prediction and identifying clinically relevant gene signatures reflecting the plasticity of the EMT programme.

Streamlining Clinical Trial Matching Using a Two-Stage Zero-Shot LLM with Advanced Prompting
Confirmed Presenter: Mozhgan Saeidi, Stanford University and Gladstone Institute, United States

Room: 519
Format: In Person

Moderator(s): Sikander Hayat


Authors List: Show

  • Mozhgan Saeidi, Stanford University and Gladstone Institute, United States
  • Barbara Engelhardt, Gladstone Institute and Stanford University, United States

Presentation Overview: Show

Identifying patients eligible for clinical trials is a critical bottleneck hindering medical research progress, because many clinical trials admit only small, specific patient cohorts yet require a certain number of participating patients to yield definitive results. Manually screening patients through unstructured medical records is time-consuming and expensive. This paper explores the potential of large language models (LLMs) enhanced with medical context to automate patient eligibility assessment for clinical trials. We first designed a two-stage zero-shot LLM approach to analyze a patient’s medical history (presented as unstructured text) to determine their eligibility for a given trial. We use advanced prompting strategies to guide the LLM toward faster and more targeted assessments. Additionally, a two-stage retrieval pipeline pre-filters potential trials using efficient retrieval techniques, reducing the number of trials considered by the LLM. This substantially improves processing speed and efficiency. Our method holds promise for streamlining clinical trial patient matching.
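
The abstract does not specify the retrieval technique or the prompts, so the following is only a rough sketch of the general idea of pre-filtering trials before a costly eligibility assessment. It assumes a simple Jaccard token-overlap score for the pre-filter and a caller-supplied `assess` function standing in for the LLM call; all names and the example trials are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def prefilter_trials(patient_note, trials, top_k=2):
    """Rank trials by Jaccard token overlap with the note and keep the top_k."""
    note_tokens = tokenize(patient_note)
    scored = []
    for trial_id, criteria in trials.items():
        criteria_tokens = tokenize(criteria)
        union = note_tokens | criteria_tokens
        score = len(note_tokens & criteria_tokens) / len(union) if union else 0.0
        scored.append((score, trial_id))
    # Highest score first; ties broken by trial id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [trial_id for _, trial_id in scored[:top_k]]


def match_patient(patient_note, trials, assess, top_k=2):
    """Run the expensive assessment only on the shortlisted trials.

    `assess(note, criteria)` is a placeholder for an LLM eligibility call.
    """
    shortlist = prefilter_trials(patient_note, trials, top_k)
    return {tid: assess(patient_note, trials[tid]) for tid in shortlist}


# Hypothetical trials and patient note, for illustration only.
example_trials = {
    "T1": "adults with melanoma stage III",
    "T2": "children with asthma",
    "T3": "melanoma patients previously treated with anti-PD1",
}
example_note = "58 year old with stage III melanoma, previously treated with anti-PD1"
example_shortlist = prefilter_trials(example_note, example_trials, top_k=2)
```

The design point is that the cheap first pass bounds the number of expensive assessments at `top_k`, regardless of how many trials are in the pool.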

15:00-15:20
Multi-Omics Integration with High-Resolution AI-Derived Retinal Thickness: Unraveling Spatial Patterns of Retinal Susceptibility to Systemic Influences
Confirmed Presenter: Roberto Bonelli, The Lowy Medical Research Institute, United States

Room: 519
Format: In Person

Moderator(s): Sikander Hayat


Authors List: Show

  • Roberto Bonelli, The Lowy Medical Research Institute, United States
  • Victoria Jackson, Population Health and Immunity Division, WEHI, Australia
  • Yue Wu, Department of Ophthalmology, University of Washington, United States
  • Julia Owen, Department of Ophthalmology, University of Washington, United States
  • Samaneh Farashi, Population Health and Immunity Division, WEHI, Australia
  • Yuka Kihara, Department of Ophthalmology, University of Washington, United States
  • Marin Gantner, The Lowy Medical Research Institute, United States
  • Catherine Egan, Moorfields Eye Hospital NHS Foundation Trust, United Kingdom
  • Katie Williams, Moorfields Eye Hospital NHS Foundation Trust, United Kingdom
  • Brendan Ansell, Population Health and Immunity Division, WEHI, Australia
  • Adnan Tufail, Moorfields Eye Hospital NHS Foundation Trust, United Kingdom
  • Aaron Lee, Department of Ophthalmology, University of Washington, United States
  • Melanie Bahlo, Population Health and Immunity Division, WEHI, Australia

Presentation Overview: Show

Retinal thickness is a marker of retinal health and more broadly, a promising biomarker for many systemic diseases. We processed the UK Biobank retinal OCT images using a convolutional neural network on more than 40,000 individuals to produce fine-scale retinal thickness measurements on >29,000 points in the macula, the part of the retina responsible for human central vision. We then performed a multi-omics analysis and tested the association of common genomic variants, metabolomic, blood and immune biomarkers, ICD10 codes and polygenic risk scores with each of the fine-scale macular thickness points. Our analysis reveals high-resolution spatial retinal thickness association with hundreds of genetic loci, metabolites with spatially clustered effects, systemic disorders such as multiple sclerosis affecting specific areas as well as blood biomarkers such as reticulocyte count correlating with strong retinal thinning. Using enrichment analysis, we highlight that the parafoveal region of the macula is particularly susceptible to systemic insults and that metabolic correlations with its thickness magnify with age. Together, these results demonstrate not only the exquisite susceptibility of the retina to molecular and phenotypic changes but also the gains in spatial discovery power and resolution achievable by integrating multi-omics datasets with AI-generated data. All our results are accessible through a bespoke web interface.

15:20-15:40
New methods to discover drug combinations impacting cancer incidence
Confirmed Presenter: Rachel Melamed, UMass Lowell, United States

Room: 519
Format: In Person

Moderator(s): Sikander Hayat


Authors List: Show

  • Panagiotis Nikolaos Lalagkas, University of Massachusetts Lowell, United States
  • Rachel Melamed, UMass Lowell, United States

Presentation Overview: Show

In this work we seek to mine health claims data to find combinations of drugs that may alter the onset of cancer. The ultimate goals of this work are to prevent cancer related to medical treatment and to suggest new treatments for the disease. Because drug combinations impacting cancer are unlikely to be discovered using randomized trials, we develop new methods using observational data to discover these effects. Our novel method is based on the marginal structural model but also includes a number of evaluations to identify robust signals.
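
As background on the marginal structural model mentioned above, the following is a minimal sketch of the textbook inverse-probability-weighting step that such models commonly rely on. It assumes a single categorical confounder and a binary exposure, and it is not the authors' method; the field names (`treated`, `confounder`, `outcome`) and the toy records are hypothetical.

```python
from collections import defaultdict


def stabilized_weights(records):
    """Stabilized inverse probability weights: P(A = a) / P(A = a | confounder).

    Each record is a dict with 'treated' (0 or 1) and 'confounder' (hashable).
    Probabilities are estimated as simple frequencies within each stratum.
    """
    n = len(records)
    p_treated = sum(r["treated"] for r in records) / n
    counts = defaultdict(lambda: [0, 0])  # confounder -> [n_untreated, n_treated]
    for r in records:
        counts[r["confounder"]][r["treated"]] += 1
    weights = []
    for r in records:
        n_untreated, n_treated = counts[r["confounder"]]
        same_arm = n_treated if r["treated"] else n_untreated
        p_conditional = same_arm / (n_untreated + n_treated)
        p_marginal = p_treated if r["treated"] else 1 - p_treated
        weights.append(p_marginal / p_conditional)
    return weights


def weighted_risk(records, weights, treated):
    """Weighted outcome risk in one exposure arm (the arm must be non-empty)."""
    pairs = [(r, w) for r, w in zip(records, weights) if r["treated"] == treated]
    return sum(w * r["outcome"] for r, w in pairs) / sum(w for _, w in pairs)


# Toy data (hypothetical): 'treated' stands for exposure to a drug combination.
toy_records = [
    {"treated": 1, "confounder": "old", "outcome": 1},
    {"treated": 1, "confounder": "old", "outcome": 0},
    {"treated": 0, "confounder": "old", "outcome": 1},
    {"treated": 1, "confounder": "young", "outcome": 0},
    {"treated": 0, "confounder": "young", "outcome": 0},
    {"treated": 0, "confounder": "young", "outcome": 0},
]
toy_weights = stabilized_weights(toy_records)
risk_difference = (
    weighted_risk(toy_records, toy_weights, 1)
    - weighted_risk(toy_records, toy_weights, 0)
)
```

Weighting creates a pseudo-population in which exposure is independent of the measured confounder, so the difference in weighted risks can be read as an adjusted effect estimate under the usual assumptions of no unmeasured confounding.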

Closing Remarks
Room: 519
Format: In person

Moderator(s): Sikander Hayat


Authors List: Show

  • TransMed Organizers