The SciFinder tool lets you search Titles, Authors, and Abstracts of talks and panels. Enter your search term below and your results will be shown at the bottom of the page. You can also click on a track to see all the talks given in that track on that day.

July 12, 2024
July 13, 2024
July 14, 2024
July 15, 2024
July 16, 2024

Results

July 16, 2024
8:40-9:00
Domain adaptation for cell-free DNA fragmentomics
Confirmed Presenter: Natalie Davidson, University of Colorado, Anschutz Medical Campus
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Natalie Davidson, University of Colorado
  • Casey Greene, University of Colorado

Presentation Overview:

Cell-free DNA (cfDNA) is an emerging minimally-invasive biomarker that could detect cancer, indicate transplant rejection, and predict autoimmune disease severity. A critical application of cfDNA is identifying the cfDNA’s tissue-of-origin, a presumed disease source. The most established cfDNA strategies rely on identifying disease-specific mutations but can only be applied to diseases with a known variant. However, the recent discovery that cfDNA fragmentation patterns reflect nucleosome positioning and active transcription factor binding sites (TFBSs) indicates that the fragmentation patterns alone can predict the tissue of origin and open the door for applications to a broader range of diseases.

Currently, predicting tissue-of-origin requires gathering large cohorts and sampling their cfDNA, which is commonly infeasible. Instead, we propose using domain adaptation to train a model on a complementary data type, ATAC-Seq, such that it can also be used on cfDNA.

To do this, we must address two key problems: 1) generating a tissue prediction model that can translate across the domains of cfDNA and ATAC-Seq; 2) accounting for the fact that the majority of cfDNA reads will come from blood.

We address both problems using data augmentation strategies and our previously preprinted domain-invariant method, BuDDI. We apply this approach first to ATAC-Seq alone, to ensure our model can detect the tissue of origin even when 99% of total reads come from blood and not the tissue of interest. Finally, we apply our approach to real cfDNA.
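As a rough illustration of the blood-dominated setting described above, the sketch below mixes a blood signal profile with a tissue signal profile at a chosen tissue fraction. It is only a minimal example under stated assumptions: the function name `mix_profiles`, the per-region signal vectors, and the simple linear mixing rule are all hypothetical and are not taken from the talk or from BuDDI.

```python
def mix_profiles(blood, tissue, tissue_fraction):
    """Linearly mix two per-region signal profiles.

    Hypothetical helper: `blood` and `tissue` are equal-length lists of
    per-region signal values, and `tissue_fraction` is the share of the
    mixture contributed by the tissue of interest.
    """
    if not 0.0 <= tissue_fraction <= 1.0:
        raise ValueError("tissue_fraction must lie in [0, 1]")
    if len(blood) != len(tissue):
        raise ValueError("profiles must have the same length")
    blood_fraction = 1.0 - tissue_fraction
    return [blood_fraction * b + tissue_fraction * t
            for b, t in zip(blood, tissue)]


# Made-up profiles over three regions; 1% of the signal comes from tissue,
# mirroring the "99% of reads come from blood" scenario.
blood_profile = [100.0, 0.0, 50.0]
tissue_profile = [0.0, 100.0, 50.0]
mixture = mix_profiles(blood_profile, tissue_profile, tissue_fraction=0.01)
```

Sweeping `tissue_fraction` over a range of small values is one simple way to generate augmented training mixtures of varying difficulty.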

July 16, 2024
8:40-9:00
DeepROCK: Error-controlled interaction detection in deep neural networks
Confirmed Presenter: Yang Lu, University of Waterloo, Canada
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Winston Chen, University of Michigan
  • William Noble, University of Washington
  • Yang Lu, University of Waterloo

Presentation Overview:

The complexity of deep neural networks (DNNs) makes them powerful but also makes them challenging to interpret, hindering their applicability in error-intolerant domains. Existing methods attempt to reason about the internal mechanism of DNNs by identifying feature interactions that influence prediction outcomes. However, such methods typically lack a systematic strategy to prioritize interactions while controlling confidence levels, making them difficult to apply in practice for scientific discovery and hypothesis validation. In this paper, we introduce a method, called DeepROCK, to address this limitation by using knockoffs, which are dummy variables that are designed to mimic the dependence structure of a given set of features while being conditionally independent of the response. Together with a novel DNN architecture involving a pairwise-coupling layer, DeepROCK jointly controls the false discovery rate (FDR) and maximizes statistical power. In addition, we identify a challenge in correctly controlling FDR using off-the-shelf feature interaction importance measures. DeepROCK overcomes this challenge by applying a calibration procedure to existing interaction importance measures so that the FDR is controlled at a target level. Finally, we validate the effectiveness of DeepROCK through extensive experiments on simulated and real datasets.
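To make the idea of knockoff-based FDR control more concrete, the sketch below implements the generic knockoff+ selection rule: given one statistic per candidate (positive when the original looks more important than its knockoff, negative otherwise), it picks the smallest threshold whose estimated false discovery proportion is at most the target level. This is a sketch of the standard knockoff filter only; it does not reproduce DeepROCK's pairwise-coupling layer or its calibration procedure, and the function names and example statistics are made up for illustration.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q.

    `w` holds one knockoff statistic per candidate; `q` is the target FDR.
    Returns infinity when no threshold satisfies the bound.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select_candidates(w, q):
    """Indices of candidates whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Made-up statistics for ten candidates, target FDR of 0.2.
stats = [5.0, 4.0, 3.5, 3.0, -0.5, 2.5, 0.2, -0.1, 2.0, 1.5]
print(knockoff_plus_threshold(stats, 0.2))  # prints 1.5
print(select_candidates(stats, 0.2))        # prints [0, 1, 2, 3, 5, 8, 9]
```

The negative statistics act as an estimate of how many false positives sit above the threshold, which is why a well-behaved importance measure matters for the guarantee.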

July 16, 2024
9:00-9:20
Proceedings Presentation: CODEX: COunterfactual Deep learning for the in-silico EXploration of cancer cell line perturbations
Confirmed Presenter: Stefan Schrod, University Medical Center Göttingen, Germany
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Stefan Schrod, University Medical Center Göttingen
  • Helena Zacharias, Hannover Medical School
  • Tim Beissbarth, University Medical Center Göttingen
  • Anne-Christin Hauschild, University Medical Center Göttingen
  • Michael Altenbuchinger, University Medical Center Göttingen

Presentation Overview:

Motivation: High-throughput screens (HTS) provide a powerful tool to decipher the causal effects of chemical and genetic perturbations on cancer cell lines. Their ability to evaluate a wide spectrum of interventions, from single drugs to intricate drug combinations and CRISPR-interference, has established them as an invaluable resource for the development of novel therapeutic approaches. Nevertheless, the combinatorial complexity of potential interventions makes a comprehensive exploration intractable. Hence, prioritizing interventions for further experimental investigation becomes of utmost importance.
Results: We propose CODEX as a general framework for the causal modeling of HTS data, linking perturbations to their downstream consequences. CODEX relies on a stringent causal modeling strategy based on counterfactual reasoning. As such, CODEX predicts drug-specific cellular responses, comprising cell survival and molecular alterations, and facilitates the in-silico exploration of drug combinations. This is achieved for both bulk and single-cell HTS. We further show that CODEX provides a rationale to explore complex genetic modifications from CRISPR-interference in silico in single cells.
Availability and Implementation: Our implementation of CODEX is publicly available at https://github.com/sschrod/CODEX. All data used in this article are publicly available.
Supplementary information: Supplementary materials are available at Bioinformatics online.

July 16, 2024
9:20-9:40
A statistical method for migration history inference reveals alternative patterns of metastatic dissemination, clonality and phyleticity
Confirmed Presenter: Divya Koyyalagunta, Weill Cornell + MSKCC, United States
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Divya Koyyalagunta, Weill Cornell + MSKCC
  • Quaid Morris, MSKCC

Presentation Overview:

Although metastasis is the cause of 90% of cancer deaths, little is known about its clonal evolution, genetic drivers, and seeding patterns. Identifying these patterns from DNA sequencing data requires solving a challenging mixed-variable combinatorial optimization problem to reconstruct the history of metastatic migrations. Current methods, based on integer linear programs, are slow, restricted to unrealistic assumptions, and cannot report uncertainty in their reconstructions. Furthermore, a fundamental problem with these methods is their inability to choose between multiple equally or similarly likely metastatic migration histories. To address these challenges, we propose a novel statistical framework for migration history inference, Metient, which uses recent machine learning advancements in discrete variable gradient estimation and metastasis specific priors. Rather than requiring a metastatic seeding dissemination model to be known a priori, Metient aims to answer this question by evaluating all possible migration history hypotheses and choosing the best model as informed by biologically motivated data. On simulated data, Metient outperforms the state-of-the-art, and can sample up to 64 possible solutions in 1% of the time. The migration histories inferred by Metient on 167 patients with four cancer types recover expert-assigned parsimony models in 84% of cases, but find notable differences where more plausible histories are proposed. We find that parallel gains of metastatic potential are much less common than previously proposed, and that polyclonal seeding occurs more in lymph nodes than in distant metastases. Along with significantly improving existing methodology, Metient provides a means to better model metastasis across different cancer types.

July 16, 2024
9:20-9:40
A deep learning model of tumor cell architecture elucidates response and resistance to CDK4/6 inhibitors
Confirmed Presenter: Sungjoon Park, University of California, San Diego
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Sungjoon Park, University of California
  • Erica Silva, University of California
  • Akshat Singhal, University of California
  • Marcus Kelly, University of California
  • Kate Licon, University of California
  • Isabella Panagiotou, University of California
  • Catalina Fogg, University of California
  • Samson Fong, University of California
  • John Lee, University of California
  • Xiaoyu Zhao, University of California
  • Robin Bachelder, University of California
  • Barbara Parker, University of California
  • Kay Yeung, University of California
  • Trey Ideker

Presentation Overview:

Cyclin-dependent kinase 4 and 6 inhibitors (CDK4/6is) have revolutionized breast cancer therapy. However, <50% of patients have an objective response, and nearly all patients develop resistance during therapy. To elucidate the underlying mechanisms, we constructed an interpretable deep learning model of the response to palbociclib, a CDK4/6i, based on a reference map of multiprotein assemblies in cancer. The model identifies eight core assemblies that integrate rare and common alterations across 90 genes to stratify palbociclib-sensitive versus palbociclib-resistant cell lines. Predictions translate to patients and patient-derived xenografts, whereas single-gene biomarkers do not. Most predictive assemblies can be shown by CRISPR–Cas9 genetic disruption to regulate the CDK4/6i response. Validated assemblies relate to cell-cycle control, growth factor signaling and a histone regulatory complex that we show promotes S-phase entry through the activation of the histone modifiers KAT6A and TBL1XR1 and the transcription factor RUNX1. This study enables an integrated assessment of how a tumor’s genetic profile modulates CDK4/6i resistance.

July 16, 2024
9:40-10:00
Proceedings Presentation: oncotree2vec – A method for embedding and clustering of tumor mutation trees
Confirmed Presenter: Monica-Andreea Baciu-Dragan, ETHZ, Switzerland
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Monica-Andreea Baciu-Dragan, ETHZ
  • Niko Beerenwinkel, ETHZ

Presentation Overview:

Understanding the genomic heterogeneity of tumors is an important task in computational oncology, especially in the context of finding personalized treatments based on the genetic profile of each patient’s tumor. Tumor clustering that takes into account the temporal order of genetic events, as represented by tumor mutation trees, is a powerful approach for grouping together patients with genetically and evolutionarily similar tumors and can provide insights into discovering tumor subtypes, for more accurate clinical diagnosis and prognosis. Here, we propose oncotree2vec, a method for clustering tumor mutation trees by learning vector representations of mutation trees that capture the different relationships between subclones in an unsupervised manner. Learning low-dimensional tree embeddings facilitates the visualization of relations between trees in large cohorts and can be used for downstream analyses, such as deep learning approaches for single-cell multi-omics data integration. We assessed the performance and the usefulness of our method in three simulation studies, and on two real datasets: a cohort of 43 trees from six cancer types with different branching patterns corresponding to different modes of spatial tumor evolution and a cohort of 123 AML mutation trees.

July 16, 2024
10:40-11:30
Invited Presentation: Deep learning of personal genomes
Confirmed Presenter: Sara Mostafavi, University of Washington, USA
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Sara Mostafavi, University of Washington

July 16, 2024
11:30-11:40
ConfuseNN: Interpreting convolutional neural network inferences in population genomics with data shuffling
Confirmed Presenter: Linh Tran, University of Arizona, United States
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Linh Tran, University of Arizona
  • David Castellano, University of Arizona
  • Ryan Gutenkunst, University of Arizona

Presentation Overview:

Convolutional neural networks (CNNs) are an increasingly popular supervised machine learning approach that has been applied to many inference tasks in population genomics. Under this framework, population genomic variation data are typically represented as 2D images with sampled haplotypes as rows and segregating sites as columns. While many published studies have reported promising performance of CNNs on various inference tasks, understanding which features in the data meaningfully contribute to the CNN's reported performance remains challenging. Here we propose a novel approach to interpreting CNN performance, motivated by population genetic theory on genomic data. Specifically, we designed a suite of scramble tests in which each test deliberately disrupts a feature in the genomic image data (e.g. allele frequency or linkage disequilibrium) to assess how that feature affects CNN performance. We applied these tests to three networks designed to infer demographic history and natural selection from genetic variation data, identifying the fundamental population genomic features that drive inference for each network.
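The scramble-test idea can be sketched on a small haplotype matrix (rows are haplotypes, columns are segregating sites, entries are 0/1 alleles). The two shuffles below are illustrative assumptions rather than the authors' actual test suite: permuting alleles within each column keeps every site's allele count but breaks associations between sites, while permuting the column order keeps each site intact but destroys positional information along the sequence. All function names are hypothetical.

```python
import random


def shuffle_within_columns(matrix, rng):
    """Independently permute alleles across haplotypes at each site.

    Per-site allele counts are preserved; associations between sites
    (linkage disequilibrium) are disrupted.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [row[:] for row in matrix]
    for c in range(n_cols):
        column = [matrix[r][c] for r in range(n_rows)]
        rng.shuffle(column)
        for r in range(n_rows):
            out[r][c] = column[r]
    return out


def shuffle_column_order(matrix, rng):
    """Permute the order of sites, keeping each column's contents intact."""
    order = list(range(len(matrix[0])))
    rng.shuffle(order)
    return [[row[c] for c in order] for row in matrix]


def column_sums(matrix):
    """Derived-allele count at each site."""
    return [sum(col) for col in zip(*matrix)]


# Toy matrix: 4 haplotypes by 5 segregating sites.
haplotypes = [
    [1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
rng = random.Random(0)
ld_scrambled = shuffle_within_columns(haplotypes, rng)
order_scrambled = shuffle_column_order(haplotypes, rng)
```

Comparing a trained network's accuracy on the original and scrambled inputs then indicates how much the network relies on the disrupted feature.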

July 16, 2024
11:40-12:30
Panel: Trustworthy AI in the life sciences
Track: MLCSB

Room: 517d
Format: In Person
Moderator(s): Peter Koo


July 16, 2024
14:20-15:10
Invited Presentation: Towards spatiotemporal design principles in multicellular systems
Confirmed Presenter: Mor Nitzan, The Hebrew University, Israel
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Mor Nitzan, The Hebrew University

Presentation Overview:

Gene expression profiles of a cellular population, generated by single-cell RNA sequencing, contain rich, 'hidden' information about biological state and collective multicellular behavior that is lost during the experiment or not directly accessible, including cell type, cell cycle phase, gene regulatory patterns, cell-cell communication, and location within the tissue-of-origin. In this talk I will discuss several methods, based on a combination of spectral, machine learning, and dynamical systems approaches, to disentangle and enhance particular spatiotemporal signals that cellular populations encode and interpret their manifestation across space and time in tissues. We will further discuss how we can computationally transfer knowledge across biological datasets and systematically identify gaps in our knowledge.

July 16, 2024
15:10-15:30
Proceedings Presentation: Probabilistic Pathway-based Multimodal Factor Analysis
Confirmed Presenter: Alexander Immer, Biomedical Informatics Group, Department of Computer Science
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Alexander Immer, Biomedical Informatics Group
  • Stefan G. Stark, Biomedical Informatics Group
  • Francis Jacob, Ovarian Cancer Research
  • Ximena Bonilla, Biomedical Informatics Group
  • Tinu Thomas, Biomedical Informatics Group
  • Andre Kahles, Biomedical Informatics Group
  • Sandra Goetze, Institute of Translational Medicine
  • Emanuela S. Milani, Institute of Translational Medicine

Presentation Overview:

Multimodal profiling strategies promise to produce more informative insights into biomedical cohorts via the integration of the information each modality contributes. In order to perform this integration, however, the development of novel analytical strategies is needed. Multimodal profiling strategies often come at the expense of lower sample numbers, which can challenge methods to uncover shared signals across a cohort. Thus, factor analysis approaches are commonly used for the analysis of high-dimensional data in molecular biology. However, they typically do not yield representations that are directly interpretable, whereas many research questions center around the analysis of pathways associated with specific observations.

We develop PathFA, a novel approach for multimodal factor analysis over the space of pathways. PathFA produces integrative and interpretable views across multimodal profiling technologies, which allow for the derivation of concrete hypotheses. PathFA combines a pathway-learning approach with integrative multimodal capability under a Bayesian procedure that is efficient, hyper-parameter free, and able to automatically infer observation noise from the data. We demonstrate strong performance on small sample sizes within our simulation framework and on matched proteomics and transcriptomics profiles from real tumor samples taken from the Swiss Tumor Profiler consortium. On a subcohort of melanoma patients, PathFA recovers pathway activity that has been independently associated with poor outcome. We further demonstrate the ability of this approach to identify pathways associated with the presence of specific cell-types as well as tumor heterogeneity. Our results show that PathFA captures known biology, making it well suited for analyzing multimodal sample cohorts.

July 16, 2024
15:30-15:40
SLIDE: Significant Latent Factor Interaction Discovery and Exploration across biological domains
Confirmed Presenter: Jishnu Das, University of Pittsburgh, United States
Track: MLCSB

Room: 517d
Format: In Person

Authors List:

  • Javad Rahimikollu, University of Pittsburgh
  • Hanxi Xiao, University of Pittsburgh
  • Annaelaine Rosengart, University of Pittsburgh
  • Aaron Rosen, University of Pittsburgh
  • Tracy Tabib, University of Pittsburgh
  • Paul Zdinak, University of Pittsburgh
  • Kun He, University of Pittsburgh
  • Xin Bing, University of Toronto
  • Florentina Bunea, Cornell University
  • Marten Wegkamp, Cornell University
  • Amanda Poholek, University of Pittsburgh
  • Alok Joglekar, University of Pittsburgh
  • Robert Lafyatis, University of Pittsburgh
  • Jishnu Das, University of Pittsburgh

Presentation Overview:

Modern multi-omic technologies can generate deep multi-scale profiles. However, differences in data modalities, multicollinearity, and large numbers of irrelevant features make the analyses and integration of high-dimensional omic datasets challenging. Here, we present Significant Latent factor Interaction Discovery and Exploration (SLIDE), a first-in-class interpretable machine learning technique for identifying significant interacting latent factors underlying outcomes of interest from high-dimensional omic datasets. SLIDE makes no assumptions regarding data-generating mechanisms, comes with theoretical guarantees regarding identifiability of the latent factors/corresponding inference, and has rigorous FDR control. SLIDE outperforms a wide range of state-of-the-art approaches, including other latent factor approaches, in terms of prediction. More importantly, it provides biological inference beyond prediction that other methods do not afford. Using SLIDE on scRNA-seq data from systemic sclerosis (SSc) patients, we first uncovered significant interacting latent factors underlying SSc pathogenesis. In addition to outperforming existing benchmarks for prediction, SLIDE uncovered significant factors that included well-elucidated altered transcriptomic states in myeloid cells and fibroblasts and a novel keratinocyte-centric signature validated by protein staining. SLIDE also worked well on a wide range of spatial modalities spanning transcriptomic and proteomic data and was able to accurately identify significant interacting latent factors underlying immune cell partitioning by 3D location in different contexts. Finally, SLIDE leveraged paired scRNA-seq and TCR-seq data to elucidate novel latent factors underlying extents of clonal expansion of CD4 T cells in a nonobese diabetic model of T1D. Overall, SLIDE is a versatile engine for biological discovery from modern multi-omic datasets.