Results

July 23, 2025
11:20-12:00
Invited Presentation: Exploring cellular plasticity: 4D epigenomes in the context of the tumour microenvironment
Confirmed Presenter: Vera Pancaldi
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Ferhat Ay


Authors List:

  • Vera Pancaldi

Presentation Overview:

Oncogenesis is characterized by alterations in chromatin organization and the reactivation of unicellular phenotypes at both metabolic and transcriptional levels. The underlying mechanisms remain largely unexplored, despite their critical relevance in cancer biology. We studied the spatial organization of genes in relation to their evolutionary origins, as well as changes occurring during cell differentiation and oncogenesis. We reveal significant topological changes in chromatin organization during cell differentiation, with patterns in specific regulatory marks, involving Polycomb repression and RNA Polymerase II pausing, being reversed during oncogenesis.

Reflecting on recent findings regarding epigenomic routes to oncogenesis made us consider the importance of the tumour microenvironment in determining plasticity of cancer cells in different environments, which we are studying through data-driven inference of regulatory networks in simplified in-vitro culture systems. We will discuss our recent results and frame them in the context of changing oncogenesis paradigms.

July 23, 2025
12:00-12:20
Proceedings Presentation: Leveraging Transcription Factor Physical Proximity for Enhancing Gene Regulation Inference
Confirmed Presenter: Yijie Wang, School of Informatics and Computing, Indiana University
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Ferhat Ay


Authors List:

  • Xiaoqing Huang, Department of Biostatistics and Health Data Science School of Medicine
  • Aamir Raza Muneer Ahemad Hullur, School of Informatics and Computing
  • Elham Jafari, Indiana University
  • Kaushik Shridhar, School of Informatics and Computing
  • Mu Zhou, Rutgers University
  • Kenneth Mackie, Indiana University Bloomington
  • Kun Huang, Indiana University School of Medicine
  • Yijie Wang, School of Informatics and Computing

Presentation Overview:

Motivation: Gene regulation inference, a key challenge in systems biology, is crucial for understanding cell function, as it governs processes such as differentiation, cell state maintenance, signal transduction, and stress response. Leading methods utilize gene expression, chromatin accessibility, Transcription Factor (TF) DNA binding motifs, and prior knowledge. However, they overlook the fact that TFs must be in physical proximity to facilitate transcriptional gene regulation.
Results: To fill this gap, we develop GRIP (Gene Regulation Inference by considering TF Proximity), a gene regulation inference method that directly considers the physical proximity between regulating TFs. Specifically, we use the distance in a protein-protein interaction (PPI) network to estimate the physical proximity between TFs. We design a novel Boolean convex program, which can identify TFs that not only explain the gene expression of target genes (TGs) but also stay close in the PPI network. We propose an efficient algorithm to solve the Boolean relaxation of the proposed model with a theoretical tightness guarantee. We compare GRIP with state-of-the-art methods (SCENIC+, DirectNet, Pando, and CellOracle) on inferring cell-type-specific (CD4, CD8, and CD14) gene regulation using the PBMC 3k scMultiome-seq data and demonstrate that it outperforms them in terms of the predictive power of the inferred TFs, the physical distance between the inferred TFs, and the agreement between the inferred gene regulation and PCHiC ground-truth data.
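The abstract uses distance in a PPI network as a proxy for physical proximity between TFs. The sketch below illustrates only that one ingredient, assuming an unweighted network and hop-count shortest paths found by breadth-first search. The function name and the toy network are hypothetical, and this is not the GRIP implementation or its Boolean convex program.

```python
from collections import deque


def ppi_distance(network, source, target):
    """Return the shortest-path hop count between two proteins, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in network.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical toy PPI network (undirected, stored as adjacency sets).
toy_ppi = {
    "TF_A": {"P1"},
    "P1": {"TF_A", "TF_B"},
    "TF_B": {"P1", "P2"},
    "P2": {"TF_B", "TF_C"},
    "TF_C": {"P2"},
    "TF_D": set(),
}
```

Under this assumption, TF pairs with a small hop count would be treated as physically close, and unreachable pairs (returned as None) as not close.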

July 23, 2025
12:20-12:40
Proceedings Presentation: miRBench: novel benchmark datasets for microRNA binding site prediction that mitigate against prevalent microRNA Frequency Class Bias
Confirmed Presenter: Panagiotis Alexiou, University of Malta, Malta
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Ferhat Ay


Authors List:

  • Stephanie Sammut, University of Malta
  • Katarina Gresova, Masaryk University
  • Dimosthenis Tzimotoudis, University of Malta
  • Eva Marsalkova, Masaryk University
  • David Cechak, Masaryk University
  • Panagiotis Alexiou, University of Malta

Presentation Overview:

Motivation: MicroRNAs (miRNAs) are crucial regulators of gene expression, but the precise mechanisms governing their binding to target sites remain unclear. A major contributing factor to this is the lack of unbiased experimental datasets for training accurate prediction models. While recent experimental advances have provided numerous miRNA-target interactions, these are solely positive interactions. Generating negative examples in silico is challenging and prone to introducing biases, such as the miRNA frequency class bias identified in this work. Biases within datasets can compromise model generalization, leading models to learn dataset-specific artifacts rather than true biological patterns.

Results: We introduce a novel methodology for negative sample generation that effectively mitigates the miRNA frequency class bias. Using this methodology, we curate several new, extensive datasets and benchmark several state-of-the-art methods on them. We find that a simple convolutional neural network model, retrained on some of these datasets, is able to outperform state-of-the-art methods. This highlights the potential for leveraging unbiased datasets to achieve improved performance in miRNA binding site prediction. To facilitate further research and lower the barrier to entry for machine learning researchers, we provide an easily accessible Python package, miRBench, for dataset retrieval, sequence encoding, and the execution of state-of-the-art models.

Availability: The miRBench Python Package is accessible at https://github.com/katarinagresova/miRBench/releases/tag/v1.0.0

Contact: panagiotis.alexiou@um.edu.mt

July 23, 2025
12:40-13:00
Flash Talk Session 1
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Ferhat Ay


Authors List:

  • Aryan Kamal
  • Damla Baydar
  • Laura Hinojosa
  • Charles-Henri Lecellier

Presentation Overview:

Session with 4 short talks:
Aryan Kamal - Transcriptional regulation of cell fate plasticity in hematopoiesis
Damla Övek Baydar - Enhancing JASPAR and UniBind databases with deep learning models for transcription factor-DNA interactions
Laura Hinojosa - Master Transcription Factors Regulate Replication Timing
Charles-Henri Lecellier - DNA replication timing and Copy Number Variations are confounders of RNA-DNA interaction data

July 23, 2025
14:00-14:20
Proceedings Presentation: Unicorn: Enhancing Single-Cell Hi-C Data with Blind Super-Resolution for 3D Genome Structure Reconstruction
Confirmed Presenter: Oluwatosin Oluwadare, University of Colorado, Colorado Springs
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Anaïs Bardet


Authors List:

  • Mohan Kumar Chandrashekar, University of Colorado, Colorado Springs
  • Rohit Menon, University of Colorado
  • Samuel Olowofila, University of Colorado
  • Oluwatosin Oluwadare, University of Colorado

Presentation Overview:

Motivation: Single-cell Hi-C (scHi-C) data provide critical insights into chromatin interactions at individual cell levels, uncovering unique genomic 3D structures. However, scHi-C datasets are characterized by sparsity and noise, complicating efforts to accurately reconstruct high-resolution chromosomal structures. In this study, we present ScUnicorn, a novel blind Super-Resolution framework for scHi-C data enhancement. ScUnicorn employs an iterative degradation kernel optimization process, unlike traditional Super-resolution approaches, which rely on downsampling, predefined degradation ratios, or constant assumptions about the input data to reconstruct high-resolution interaction matrices. Hence, our approach more reliably preserves critical biological patterns and minimizes noise. Additionally, we propose 3DUnicorn, a maximum likelihood algorithm that leverages the enhanced scHi-C data to infer precise 3D chromosomal structures.

Result: Our evaluation demonstrates that ScUnicorn achieves superior performance over the state-of-the-art methods in terms of Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and GenomeDisco scores. Moreover, 3DUnicorn’s reconstructed structures align closely with experimental 3D-FISH data, underscoring its biological relevance. Together, ScUnicorn and 3DUnicorn provide a robust framework for advancing genomic research by enhancing scHi-C data fidelity and enabling accurate 3D genome structure reconstruction.
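For readers unfamiliar with the first metric named above, here is a minimal sketch of PSNR between a reference and an enhanced contact matrix, using the standard definition 10·log10(peak² / MSE). Taking the peak as the maximum of the reference matrix is an assumption (conventions vary), and the function is illustrative rather than taken from the Unicorn code base.

```python
import math


def psnr(reference, estimate, max_value=None):
    """Peak Signal-to-Noise Ratio (dB) between two equally shaped 2D matrices."""
    flat_ref = [v for row in reference for v in row]
    flat_est = [v for row in estimate for v in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("matrices must contain the same number of entries")
    # Mean squared error over all matrix entries.
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical matrices
    # Assumption: the peak defaults to the largest value in the reference matrix.
    peak = max(flat_ref) if max_value is None else max_value
    return 10 * math.log10(peak ** 2 / mse)
```

Higher values indicate that the enhanced matrix is closer to the reference.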

Code Availability: Unicorn implementation is publicly accessible at https://github.com/OluwadareLab/Unicorn

July 23, 2025
14:20-14:40
Predicting gene-specific regulation with transcriptomic and epigenetic single-cell data
Confirmed Presenter: Laura Rumpf, Goethe University Frankfurt Main, Germany
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Anaïs Bardet


Authors List:

  • Laura Rumpf, Goethe University Frankfurt Main
  • Fatemeh Behjati, Goethe University Frankfurt Main
  • Dennis Hecker, Goethe University Frankfurt
  • Marcel Schulz, Goethe University

Presentation Overview:

To gain insights into phenotype-specific gene regulation, we present MetaFR, an integrative analysis approach that harnesses single-cell epigenetic and transcriptomic data.
MetaFR generates random forest regression models in a gene-specific manner utilizing both scATAC-seq and scRNA-seq data to predict gene expression in a large window around a target gene. The gene window is partitioned into bins of equal size which correspond to the model features holding the epigenetic signal counts. The importance of model features can be leveraged to prioritize enhancer-gene interactions.
The inherent sparsity problem of single-cell data is addressed by aggregating the scRNA-seq and scATAC-seq signal into metacells based on gene activity similarities.
MetaFR enables large-scale analysis of scATAC-seq and scRNA-seq data in an automated fashion. The automated pipeline has been successfully applied to a human PBMC dataset to identify immune cell-specific enhancer-gene interactions. We validated our findings with experimentally measured interactions (CRISPRi regions) and fine-mapped eQTLs. We benchmarked our performance against the state-of-the-art method SCARlink.
We were able to outperform SCARlink in both accuracy and runtime.
Our pipeline allows time-efficient analysis and obtains reliable models of gene expression, which can be used to study gene regulatory elements in any organism for which scRNA-seq and scATAC-seq data become available.
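The feature construction described in this abstract (a window around the target gene partitioned into equal-size bins holding epigenetic signal counts) can be sketched as follows. This is a minimal illustration under stated assumptions: fragment positions are single integer coordinates, the window is half-open, and the function name is hypothetical. It is not the MetaFR pipeline, and the random forest step is omitted.

```python
def bin_counts(positions, window_start, window_end, bin_size):
    """Count signal positions per equal-size bin in the window [window_start, window_end)."""
    length = window_end - window_start
    if bin_size <= 0 or length <= 0 or length % bin_size != 0:
        raise ValueError("window length must be a positive multiple of bin_size")
    counts = [0] * (length // bin_size)
    for pos in positions:
        # Positions outside the window do not contribute to any feature.
        if window_start <= pos < window_end:
            counts[(pos - window_start) // bin_size] += 1
    return counts
```

Each returned count would correspond to one model feature for the gene in a given metacell.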

July 23, 2025
14:40-15:00
Biophysical deep learning resolves how TF and DNA sequence specify the genome state of every cell population in human embryogenesis
Confirmed Presenter: Vitalii Kleshchevnikov, Wellcome Sanger Institute, United Kingdom
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Anaïs Bardet


Authors List:

  • Vitalii Kleshchevnikov, Wellcome Sanger Institute
  • Alexander Aivazidis, European Molecular Biology Laboratory (EMBL) Heidelberg
  • Donald Hansen, Computational Genomics and Systems Genetics at Deutsches Krebsforschungszentrum (DKFZ)
  • Ioannis Sarropolous, Cambridge Stem Cell Institute
  • Jacob Hepkema, Wellcome Sanger Institute
  • Nana-Jane Chipampe, Wellcome Sanger Institute
  • Artem Lomakin, Stanford School of Medicine
  • Fani Memi, Wellcome Sanger Institute
  • Jimmy Lee, Wellcome Sanger Institute
  • Simone Webb, Newcastle University
  • Emily Stephenson, Newcastle University
  • Antony Rose, Newcastle University

Presentation Overview:

Understanding how interactions between transcription factors (TFs) and DNA sequence are orchestrated and give rise to the vast complexity of cell types is a major challenge of regulatory developmental biology. Large-scale multimodal single-cell RNA-seq and ATAC-seq atlases enable reconstructing the regulatory mechanisms across cell types from data, laying the foundation for cell programming and design of synthetic regulatory elements.


Despite significant progress, current DNA sequence models fail to account for cellular context, TF-DNA sequence relationships and TF combinatorics in a principled manner, limiting their causal expressiveness and generalization capacity across cell types. To overcome this, we developed cell2state, an end-to-end deep learning model with biophysical constraints on how TFs specify the genome accessibility state in every cell population. Cell2state leverages known TF-motif interactions while accounting for biophysical constraints, employs an interpretable neural network based on the HyenaDNA architecture, and captures TF-TF synergy and antagonism, enabling the model to integrate DNA sequence and TF abundance. We demonstrated cell2state's generalisation capabilities by predicting ATAC-seq signals for new chromosomes and cell types.


To link regulatory TF interactions to developmental processes at whole embryo scale, we applied cell2state to an unpublished multimodal single-cell and spatial transcriptomics atlas covering over 1,000 human developmental cell states (n=4,000 pseudobulk replicates, n=5 embryos). At critical developmental junctions, such as the dorsal-ventral patterning of the spinal cord/hindbrain and anterior-posterior patterning of the forebrain, cell2state revealed how enhancer DNA sequences integrate activities of cell-type-defining TFs (LHX2, PAX6) with cell communication pathway TFs (GLI, TCF).

July 23, 2025
15:00-15:20
Nona: A unifying multimodal masked modeling framework for functional genomics
Confirmed Presenter: Surag Nair, Genentech Inc, United States
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Anaïs Bardet


Authors List:

  • Surag Nair, Genentech Inc
  • Alex Tseng, Genentech Inc
  • Ehsan Hajiramezanali, Genentech Inc
  • Nathaniel Diamant, Genentech Inc
  • Avantika Lal, Genentech Inc
  • Tommaso Biancalani, Genentech Inc
  • Gabriele Scalia, Genentech Inc
  • Gokcen Eraslan, Genentech Inc

Presentation Overview:

We present Nona, a unifying multimodal masked modeling paradigm for functional genomics. Nona is a neural network model that operates on both DNA sequence and epigenetic tracks such as DNase-seq, ChIP-seq, and RNA-seq at base-pair resolution. By leveraging a flexible masking strategy, Nona can predict any subset of masked DNA and/or tracks from the unmasked subset. As a result, Nona encompasses versatile existing and novel use cases that were hitherto addressed using separate models. In addition to vanilla sequence-to-function prediction and DNA language modeling, Nona enables multiple novel application modes, of which we highlight 3: 1) context-aware prediction, where the model predicts epigenetic tracks in a local genomic window by taking into account the observed epigenetic tracks in adjacent windows, in addition to the DNA sequence, 2) sequence generation, where a conditional language model is used to iteratively generate a DNA sequence with desired epigenetic profiles across cellular states, 3) functional genotyping, where a conditional language model trained on base resolution ATAC-seq is used to infer the genotype of the sample donors. Beyond these applications, Nona can enable use cases such as functional perturbations and denoising functional measurements. Altogether, Nona is a versatile paradigm that extends sequence-to-function and masked language modeling to novel applications in regulatory genomics.

July 23, 2025
15:20-15:40
SCRIMPy: Single Cell Replication Inference from Multiome data using Python
Confirmed Presenter: Tatevik Jalatyan, Armenian Bioinformatics Institute; Chromatin and Disease Group, Centre for Human Genetics
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Anaïs Bardet


Authors List:

  • Tatevik Jalatyan, Armenian Bioinformatics Institute; Chromatin and Disease Group
  • Jennifer Herrmann, Chromatin and Disease Group
  • Antonio Rodriguez-Romera, MRC Weatherall Institute of Molecular Medicine
  • Beth Psaila, MRC Weatherall Institute of Molecular Medicine
  • Jim Hughes, MRC Weatherall Institute of Molecular Medicine
  • Simone Riva, MRC Weatherall Institute of Molecular Medicine
  • Robert Beagrie, Chromatin and Disease Group

Presentation Overview:

The cell cycle is a fundamental biological process crucial for an organism’s growth and development. Dysregulation of the cell cycle can lead to diseases such as cancer, neurodegenerative, cardiovascular, or autoimmune disorders. Thus, accurate characterization of cell cycle dynamics in healthy and disease states is important for understanding disease mechanisms. Existing methods for cell cycle state prediction from single-cell data use the expression of marker genes in individual cells. However, these approaches perform poorly on single-cell multiome (ATAC+GEX) data, likely due to the increased data sparsity and nuclear RNA bias.
To address these limitations, we propose a novel method for cell cycle state inference that uses replication-driven DNA copy number signals from scATAC-seq data. Our approach is based on two complementary metrics that reflect the replication state of individual cells. First, we capture the imbalance of ATAC fragment depth between early- and late-replicating regions of the genome to identify S-phase cells with higher DNA copy number in early-replicating domains. Second, we introduce a novel metric for DNA copy number in ATAC-seq data to differentiate G1-phase cells from G2/M-phase cells, since the latter have duplicated DNA content. We apply this method to multiome data from mouse embryonic stem cells sorted by cell cycle state (G1, S, G2/M) and show that SCRIMPy outperforms the commonly used expression-based classifier Seurat.
With the increasing availability of multiome datasets, this approach holds promise for deriving novel insights into cell cycle mechanisms in diseases and identifying potential therapeutic targets.
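The first metric described above compares ATAC fragment depth between early- and late-replicating regions. The abstract does not give its exact definition, so the sketch below is only one plausible form: a log2 ratio of length-normalised fragment densities with a pseudocount. The function name, the pseudocount, and the normalisation are all assumptions, not SCRIMPy's actual formula.

```python
import math


def early_late_log_ratio(early_fragments, late_fragments, early_bp, late_bp, pseudocount=1.0):
    """log2 ratio of per-base fragment density in early- vs late-replicating regions."""
    if early_bp <= 0 or late_bp <= 0:
        raise ValueError("region lengths must be positive")
    # The pseudocount (an assumption) keeps the ratio defined for sparse cells.
    early_density = (early_fragments + pseudocount) / early_bp
    late_density = (late_fragments + pseudocount) / late_bp
    return math.log2(early_density / late_density)
```

In this sketch, a cell in S phase would be expected to score above zero, because early-replicating domains have already been copied, while G1 and G2/M cells would sit near zero.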

July 23, 2025
15:40-16:00
Uncovering Novel Cellular Programs and Regulatory Circuits Underlying Bifurcating Human B Cell States
Confirmed Presenter: Jishnu Das, University of Pittsburgh, United States
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Anaïs Bardet


Authors List:

  • Zarifeh Rarani, University of Pittsburgh
  • Swapnil Keshari, University of Pittsburgh
  • Akanksha Sachan, University of Pittsburgh
  • Nicholas Pease, University of Pittsburgh
  • Jingyu Fan, University of Pittsburgh
  • Peter Gerges, University of Pittsburgh
  • Harinder Singh, University of Pittsburgh
  • Jishnu Das, University of Pittsburgh

Presentation Overview:

Upon antigen encounter, B cells undergo activation followed by a bifurcation either into extrafollicular plasmablasts (PB) or into germinal center (GC) B cells. We have assembled gene regulatory networks (GRNs) underlying this bifurcation using temporally resolved single cell multiomics. To complement this, we analyzed transcriptomic states of GC and PB cells using SLIDE, a novel interpretable machine learning approach to infer a small set of cellular programs (latent factors/LFs) necessary and sufficient to distinguish GC and PB cells. These LFs provide stronger discrimination between the two emergent cell states than DEG analyses. Interestingly, when the LF genes were cross-referenced with state-specific GRNs, the LFs recapitulated aspects of GRN architecture orchestrating the bifurcation. Intriguingly, the LFs also captured gene programs reflective of cell-fate propensity prior to the bifurcation in activated B cells. These programs were validated using perturbation of key TFs.

To move beyond high-resolution static state-specific GRNs, we used a stochastic ODE-based framework to construct a dynamic GRN across the 5 states. In addition to recapitulating previously known lineage-defining TFs and their regulons, we identify novel regulons as driving divergent gene activity across the bifurcation trajectory. We also combined the dynamic GRN with the inferred cellular programs to predict TF pairs that combinatorially control B cell fate dynamics. Intriguingly, several of these inferred TF pairs are not detected by conventional network topological metrics. Overall, our framework is generalizable and applicable across contexts to identify cellular programs and regulatory circuits underlying diverse cell fate bifurcations.

July 23, 2025
16:40-17:00
Ledidi: Programmatic design and editing of cis-regulatory elements
Confirmed Presenter: Jacob Schreiber, Research Institute of Molecular Pathology (IMP), Austria
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Alejandra Medina Rivera


Authors List:

  • Jacob Schreiber, Research Institute of Molecular Pathology (IMP)
  • Franziska Lorbeer, Research Institute of Molecular Pathology (IMP)
  • Monika Heinzl, Research Institute of Molecular Pathology (IMP)
  • Yang Lu, University of Waterloo
  • Alexander Stark, Research Institute of Molecular Pathology (IMP)
  • William Noble, University of Washington

Presentation Overview:

The development of modern genome editing tools has enabled researchers to make genomic edits with high precision, but has left unsolved the problem of designing these edits. As a solution, we propose Ledidi, a computational approach that rephrases the design of genomic edits as a continuous optimization problem where the goal is to produce the desired outcome as measured by one or more predictive models using as few edits from an initial sequence as possible. Ledidi can be paired with any pre-trained machine learning model, and when applied across dozens of such models, we find that Ledidi can quickly design edits to precisely control transcription factor binding, chromatin accessibility, transcription, and enhancer activity across several species. Ledidi can achieve its target objective using surprisingly few edits by converting weak affinity TF binding sites into stronger affinity ones, and can do so almost an order of magnitude faster than other approaches. Unlike other approaches, Ledidi can use several models simultaneously to programmatically design edits that exhibit multiple desired characteristics. We demonstrate this capability by designing uniformly accessible regions with controllable patterns of TF binding, by designing cell type-specific enhancers, and by showing how one can use multiple models that predict the same thing to more robustly design edits. Finally, we introduce the concept of an affinity catalog, in which multiple sets of edits are designed that induce a spectrum of outcomes, and demonstrate the practical benefits of this approach for design tasks and scientific understanding.

July 23, 2025
17:00-17:20
Lilliput: Compact native regulatory element design with machine learning-guided miniaturization
Confirmed Presenter: Laura Gunsalus, Genentech, United States
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Alejandra Medina Rivera


Authors List:

  • Laura Gunsalus, Genentech
  • Avantika Lal, Genentech
  • Tommaso Biancalani, Genentech
  • Gokcen Eraslan, Genentech

Presentation Overview:

Size-limited gene therapy vectors require compact cell type-specific regulatory elements. Existing miniaturized sequences have been hand-selected and curated, relying on costly experimental iteration. We present Lilliput, a method for designing compact and specific regulatory elements by nominating and iteratively editing endogenous elements with state-of-the-art DNA sequence-to-function models. Our approach involves scoring elements in silico, removing subsequences with limited predicted impact, and introducing minimal mutations to increase specificity. We demonstrate the effectiveness of our approach by reducing a 10kb heart-specific locus to under 300bp. Our method offers a generalizable framework for engineering mini-elements across diverse target cell types. More broadly, we identify core sequence features sufficient to determine cell-type specific expression patterns, advancing our understanding of the mechanisms underlying precise control of gene expression.
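The step of removing subsequences with limited predicted impact can be illustrated with a simple greedy loop. This is a toy sketch under stated assumptions: `toy_score` is a hypothetical stand-in for a DNA sequence-to-function model (it just counts a GATA motif), the fixed chunk size and tolerance are illustrative, and the mutation step for increasing specificity is not covered. It is not the Lilliput method itself.

```python
def miniaturize(sequence, score, chunk=4, tolerance=0.0):
    """Greedily delete chunks whose removal lowers the predicted score by at most `tolerance`."""
    baseline = score(sequence)
    current = sequence
    start = 0
    while start < len(current):
        candidate = current[:start] + current[start + chunk:]
        if candidate and baseline - score(candidate) <= tolerance:
            # Deletion is tolerated: keep it and retry at the same offset.
            current = candidate
        else:
            # Deletion hurts the prediction: keep this chunk and move on.
            start += chunk
    return current


def toy_score(seq):
    """Hypothetical scoring function standing in for a trained model."""
    return seq.count("GATA")
```

With the toy scorer, flanking and spacer bases are removed while the motif instances that carry the score are retained.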

July 23, 2025
17:20-18:00
Invited Presentation: What can the diversity of life on Earth teach us about disease?
Confirmed Presenter: Mafalda Dias
Track: RegSys: Regulatory and Systems Genomics

Room: 11BC
Format: In person
Moderator(s): Marcel Schulz


Authors List:

  • Mafalda Dias

Presentation Overview:

Biological sequences across the tree of life reflect the cumulative effects of millions of years of evolution. Modelling variation in these sequences offers a powerful window into the sequence constraints that shape protein function and genome regulation — and holds great promise for uncovering the genetic basis of human disease. In this talk, I will explore how recent advances in deep learning are enabling us to decode these evolutionary signatures at scale. I will highlight how such models are already improving diagnostic yield of patient sequencing, by providing evidence for hundreds of new disorders, and offer new avenues to assess disease risk before symptoms arise.