Computational Approaches for Spatial Omics

Mapping the spatial organization of cells and their communication within tissues is crucial for gaining insights into the processes of development and disease formation. The recent advent of spatial omics technologies has empowered researchers to detect biological molecules in their native location within a tissue. Spatial transcriptomics (ST), for example, measures genome-wide mRNA expression across thousands of spots (each containing up to 10 cells) on a tissue slice while preserving information about the location of the spots and allowing characterization of the microenvironment. Emerging multi-omic spatial technologies further integrate transcriptome-wide gene expression and multiplexed surface protein expression with histology images from the same formalin-fixed, paraffin-embedded tissue section. Given the nascency of such high-throughput spatial omics technologies, new computational methods for analyzing the resulting data are still actively being developed.

Schedule subject to change
All times listed are in EDT
Wednesday, May 15th
10:30-11:00
Invited Presentation: Delineating spatial gene expression signatures of pathology conditions from single-cell spatial transcriptomics data using graph attention networks
Room: Cathedral of Learning, G24
Format: Live from venue

  • Yufei Huang


Presentation Overview:

Single-cell spatially resolved transcriptomics (scST) measures gene expression together with spatial locations within tissues, holding the potential to delineate the spatial molecular signatures underlying tissue structures and pathology phenotypes. While differential expression analysis (DEA) is still the main method for determining pathology-associated gene expression patterns, it fails to determine the spatial gene expression (SGE) signatures that reveal intra-pathology heterogeneity. To address this, we propose SPathMap, a novel graph-based learning paradigm utilizing a Graph Attention Network (GAT) to identify spatial expression patterns associated with different pathological conditions. SPathMap creates a GAT representation of the scST dataset and is trained to learn unique spatial gene expression patterns for accurate condition classification. Interpreting the trained GAT generates condition-specific spatial significance scores for each gene in every cell. Clustering these gene-level significance scores allows the delineation of condition-specific spatial expression patterns. We tested the performance and demonstrated the efficacy of SPathMap in classifying subtypes of lung adenocarcinoma and COVID-19 pathology. We showed that SPathMap revealed unique intra-tumoral microenvironments of lung adenocarcinoma, characterized by distinct spatial features and gene expression signatures of fibroblasts, neutrophils, and T cells. Spatially significant genes (SSGs) showed stronger associations with survival than differentially expressed genes (DEGs), with Gene Set Enrichment Analysis (GSEA) revealing richer functions of specific clusters. In COVID-19 patients, SSGs indicated infection-related changes such as organizing pneumonia and lymphocytic infiltration.
For example, organizing pneumonia revealed two clusters in a checkered spatial pattern, with cells from one cluster encircling cells from the other; their cell-type niches are indicative of potential damage-repair mechanisms of fibroblasts.
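
The attention step at the heart of a graph attention network can be sketched as follows. This is a minimal, hypothetical illustration and not the authors' implementation: it builds a k-nearest-neighbor graph from cell coordinates and computes single-head attention weights, with the learned feature projection replaced by the identity for brevity. The function names, the parameter vectors a_self and a_neigh, and the toy data are all assumptions.

```python
import math


def knn_graph(coords, k):
    """Return, for each cell, the indices of its k nearest spatial neighbors."""
    neighbors = []
    for i, ci in enumerate(coords):
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(ci, coords[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_weights(features, neighbors, a_self, a_neigh):
    """Softmax-normalized attention of each cell over its neighbors.

    score(i, j) = LeakyReLU(a_self . h_i + a_neigh . h_j), i.e. a single
    attention head with the projection matrix assumed to be the identity.
    """
    weights = []
    for i, nbrs in enumerate(neighbors):
        scores = [
            leaky_relu(dot(a_self, features[i]) + dot(a_neigh, features[j]))
            for j in nbrs
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy data (made up): three cells on a line, two expression features each.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = knn_graph(coords, k=2)
weights = attention_weights(
    features, neighbors, a_self=[0.5, -0.5], a_neigh=[1.0, 0.0]
)
```

Each row of weights sums to 1, so it can be read as how strongly a cell attends to each of its spatial neighbors. In a trained network, a_self and a_neigh would be learned rather than fixed by hand.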

11:00-11:30
Invited Presentation: Spatial Transcriptomics technologies and biological applications
Room: Cathedral of Learning, G24
Format: Live from venue

  • Amanda Poholek

11:30-12:00
Invited Presentation: Accurate and Efficient Integrative Reference-Informed Spatial Domain Detection for Spatial Transcriptomics
Room: Cathedral of Learning, G24
Format: Live-stream

  • Ying Ma


Presentation Overview:

Spatially resolved transcriptomics (SRT) studies are becoming increasingly common and increasingly large, offering unprecedented opportunities to characterize the spatial and functional organization of complex tissues. Here, we introduce a computational method, IRIS, that characterizes the spatial organization of complex tissues through accurate and efficient detection of spatial domains. IRIS uniquely leverages the widespread availability of single-cell RNA-seq data for reference-informed spatial domain detection, integrates multiple SRT tissue slices jointly while explicitly considering correlation both within and across slices, produces biologically interpretable spatial domains, and benefits from multiple algorithmic innovations for highly scalable computation. We demonstrate the advantages of IRIS through in-depth analysis of six SRT datasets from different technologies across various tissues, species, and spatial resolutions. In these applications, IRIS attains a 58% to 1,083% accuracy gain over existing methods in the gold standard dataset with known ground truth. Furthermore, IRIS is 8.5 to 134.7 times faster than existing methods in moderate-sized datasets and is the only method applicable to large-scale SRT datasets, including the very recent Stereo-seq and 10x Xenium. As a result, IRIS uncovers the fine-scale structures of brain regions, reveals the spatial heterogeneity of distinct tumor microenvironments, and characterizes the structural changes of the seminiferous tubules in the testis associated with diabetes, all at a speed and accuracy unachievable by existing approaches.

14:30-14:45
Predicting spatial location from gene expression: a new analytical approach to spatial transcriptomics
Room: Cathedral of Learning, G24
Format: Live from venue

  • Yeojin Kim, Georgia Institute of Technology, United States
  • Zijun Wu, Georgia Institute of Technology, United States
  • Hee-Sun Han, University of Illinois, United States
  • Dave Zhao, University of Illinois, United States
  • Saurabh Sinha, Georgia Institute of Technology, United States


Presentation Overview:

Spatial transcriptomics data have opened up new avenues of investigation into the spatial organization of tissues and the relationship between cellular location, regulation and function. For example, several computational methods can combine transcriptomic and spatial descriptions of a cell into a single representation called an “embedding”, facilitating tasks such as cell clustering and tissue segmentation. Studies of the relationship between gene expression and spatial location are not new, and several groups have sought to quantify the extent to which one determines the other, giving us the concept of “positional information” of a cell encoded in its expression profile. Here, we combine these two complementary ideas of cell embedding and positional information, to develop a new computational tool for spatial transcriptomics data analysis.

At the core of our tool is a neural network model that maps each cell’s transcriptome to a vector representation (embedding) such that proximally located cells have similar embeddings. We demonstrate through extensive applications to several real and synthetic data sets that this unique approach to cell representations offers several practical advantages, while also allowing us to extend the quantification of “positional information” to high dimensional ST data.
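
One way to make the goal "proximally located cells have similar embeddings" concrete is a loss that compares pairwise embedding distances with pairwise spatial distances. The sketch below is an assumption for illustration only; the abstract does not state the training objective, and the function name and toy data are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distance_loss(embeddings, locations):
    """Mean squared gap between embedding distances and spatial distances.

    A value of 0 means the embedding geometry reproduces the spatial
    geometry exactly for every pair of cells.
    """
    pairs = list(combinations(range(len(embeddings)), 2))
    if not pairs:
        raise ValueError("need at least two cells")
    total = 0.0
    for i, j in pairs:
        d_embed = math.dist(embeddings[i], embeddings[j])
        d_space = math.dist(locations[i], locations[j])
        total += (d_embed - d_space) ** 2
    return total / len(pairs)


# Toy data (made up): three cells on a line.
locations = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
good_embeddings = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]  # mirrors the layout
bad_embeddings = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0)]   # swaps two cells

good_loss = pairwise_distance_loss(good_embeddings, locations)
bad_loss = pairwise_distance_loss(bad_embeddings, locations)
```

In a setup like the one described, the embeddings would be produced by a neural network applied to each cell's transcriptome, and a loss of this kind would be minimized over the network's parameters.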

First, we show that learnt embeddings can capture spatial relationships among cells as accurately as state-of-the-art embedding methods such as GraphST, SEDR and STAGATE, while additionally ensuring that the embeddings are fully determined by cellular transcriptomes. This latter feature imparts generalizability to the method, allowing the same embedding function to be meaningful for additional biological samples. We demonstrate such generalizability by training the neural network on one tissue and using the trained model to embed cells in a “test” tissue, still achieving high accuracy in predicting cellular location from expression. This also amounts to solving the “spatial reconstruction” task, addressed by previous methods such as TANGRAM, but now in a completely map-free manner, i.e., without the need for a reference tissue to which the reconstructed spatial information must be mapped.

Second, we show that the model can seamlessly learn embeddings for cells in multiple samples of a tissue, thus providing a universal coordinate system to describe cellular locations. We then use this coordinate system to design a statistical test for differences in spatial expression of a gene between two tissue samples. We use simulations to assess the statistical power of this test and use it to detect differential spatial expression of individual genes between brains representing different biological conditions.

14:45-15:00
Systematic benchmarking of imaging spatial transcriptomics platforms in FFPE tissues
Room: Cathedral of Learning, G24
Format: Live from venue

  • Brittany Goods, Dartmouth College, United States
  • Samouil Fahi, Broad Institute, United States
  • Huan Wang, Broad Institute, United States
  • Ruixu Huang, Dartmouth College, United States


Presentation Overview:

Imaging spatial transcriptomics (iST) technology quantifies the expression of targeted genes across a tissue section, facilitating the understanding of cell-type organization in tissues, cell-cell interactions, and molecular interactions within tissues. While several iST platforms, including MERSCOPE® (Vizgen), Xenium® (10x Genomics), and CosMx® (NanoString Technologies), have been developed and launched to profile formalin-fixed, paraffin-embedded (FFPE) human samples commonly generated in clinics, their head-to-head performance in real-world circumstances has not been evaluated. This is critical because FFPE preservation is a prevalent method for storing precious clinical tissues for the creation of biobanks. This study presents a systematic benchmarking of the aforementioned iST platforms on two tissue microarrays (TMAs) representing sixteen tissue types and seven tumor types from FFPE samples (total of 23 tissues and >3.3M cells). We evaluated these platforms using a range of technical parameters to assess overall performance. We measured the technical specificity by calculating the percentage of all transcripts corresponding to genes detected relative to the total number of calls (including negative control probes and unused barcodes). We determined each platform’s ability to specifically identify known lineage markers using selected genes with canonical expression patterns. We quantified the sensitivity by measuring the number of genes detected above noise. We examined panel-to-panel and date-to-date reproducibility by correlating total transcript counts per core between different panels of each technology. We also investigated the data concordance with imaging spatial proteomics and orthogonal databases from The Cancer Genome Atlas (TCGA) and The Genotype-Tissue Expression (GTEx) programs. Additionally, we evaluated the quality of cell segmentation of each platform based on the co-expression patterns of known cell lineage markers.
Taken together, this comparative analysis highlighted the impact of the chosen iST platforms on the study results, indicating that specific imaging systems are more suited for particular tissues or tumor types. These insights are crucial for selecting appropriate iST technologies, aligning with specific research needs and sample types, thus serving as a valuable resource for the scientific community.
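
As a rough illustration of two of the metrics described above, the sketch below computes technical specificity as the percentage of calls assigned to targeted genes out of all calls (including negative control probes and unused barcodes), and reproducibility as the Pearson correlation of total transcript counts per core between two panels. The function names, input layout, and all counts are hypothetical and are not taken from the study.

```python
import math


def technical_specificity(gene_calls, negative_control_calls, unused_barcode_calls):
    """Percentage of all calls that correspond to targeted genes."""
    total = gene_calls + negative_control_calls + unused_barcode_calls
    if total == 0:
        raise ValueError("no calls recorded")
    return 100.0 * gene_calls / total


def pearson_correlation(xs, ys):
    """Pearson correlation between two equally long lists of counts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up counts for a single tissue core.
specificity = technical_specificity(
    gene_calls=9_800, negative_control_calls=150, unused_barcode_calls=50
)

# Made-up total transcript counts per core for two panels on the same TMA.
panel_a = [1200, 3400, 560, 7800]
panel_b = [1100, 3600, 600, 7500]
reproducibility = pearson_correlation(panel_a, panel_b)
```

With these made-up numbers, 9,800 of 10,000 calls map to genes, giving a specificity of 98%.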

15:00-15:30
Invited Presentation: Utilize the Cellular Heterogeneity at the Single Cell Level for Cancer Treatment and Survival Prediction
Room: Cathedral of Learning, G24
Format: Live-stream

  • Lana Garmire


Presentation Overview:

Heterogeneity is a fundamental property of multicellular organisms and tissues, including cancers. In this talk, I will describe a new class of drug recommendation methods, including ASGARD and STADS, which computationally repurpose drugs over heterogeneous cell types in single-cell RNA-seq data and spatial transcriptomics data, respectively. Next, I will go over surprising new findings on breast cancer subtypes based on a large population cohort of single-cell imaging mass cytometry data. Through these examples, we showcase the power of computational tools in cancer treatment and prognosis prediction by taking advantage of inter-cellular heterogeneity.

16:00-16:30
Invited Presentation: Deep learning reveals cell fate transition
Room: Cathedral of Learning, G24
Format: Live from venue

  • Guangyu Wang

16:30-17:00
Invited Presentation: Leveraging deep learning and large language models for precision oncology
Room: Cathedral of Learning, G24
Format: Live from venue

  • Yu-Chiao Chiu, University of Pittsburgh, United States


Presentation Overview:

The advances in genome sequencing and high-throughput screening have led to large-scale data resources for cancer discovery, such as the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Dependency Map (DepMap). However, due to data heterogeneity and dimensionality, it remains challenging to comprehensively integrate these datasets to study the central dogma of pharmacogenomics: how multi-omics determine cellular response to perturbations. Our research focuses on developing cutting-edge deep learning models to capture and predict intricate pharmacogenomic patterns among high-dimensional genomics and high-throughput drug screens.

The talk will introduce our deep learning model that predicts cancer cells’ responses to a broad panel of approved and investigational anti-cancer drugs. The model features a specialized “transfer learning” design that enables the translation of in vitro screens to impracticable-to-screen tumors. In addition, we have developed a suite of user-friendly web tools to facilitate access to large-scale pharmacogenomic data and deep learning models, thereby promoting their utilization by biomedical and clinical researchers. The talk will also introduce our recent pilot study, which leverages emerging large language models to infer gene-drug relationships in cancer based on scientific literature. The studies demonstrate the exciting promise of deep learning and large language models for precision oncology by implementing intelligent inference and prioritization of therapeutic agents to enhance the efficiency and precision of drug development.

17:00-18:00
Panel:
Room: Cathedral of Learning, G24
Format: Live from venue