Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide


TransMed COSI

Track Chairs

Venkata Satagopam
Wei GU
Maria Secrier


Schedule subject to change
Wednesday, July 15th
10:40 AM-11:20 AM
TransMed Keynote: 20 Challenges of AI in Medicine
Format: Live-stream

Presentation Overview:

The opportunities for using artificial intelligence (AI) and machine learning to improve healthcare are endless. Recent successes include the use of deep learning to identify clinically important patterns in radiographic images. However, there are numerous important challenges which must be addressed before AI can become widely adopted and incorporated into clinical workflows. Some of these challenges come from healthcare data. For example, clinical data from electronic health records (EHR) are notoriously noisy and incomplete. Some of these challenges come from the limitations of AI algorithms. For example, each algorithm looks at data in a different manner. How do you know which are the right methods to employ for a given data set? We will review 10 important clinical data challenges and 10 AI challenges which can impede progress in this area. A number of specific examples and some possible solutions will be provided.

11:20 AM-11:40 AM
Proceedings Presentation: Privacy-preserving Construction of Generalized Linear Mixed Model for Biomedical Computation
Format: Pre-recorded with live Q&A

  • Rui Zhu, Indiana University, United States
  • Chao Jiang, Auburn University, United States
  • Xiaofeng Wang, Indiana University, United States
  • Shuang Wang, Indiana University, United States
  • Hao Zheng, Hangzhou Nuowei Information Technology, China
  • Haixu Tang, Indiana University, United States

Presentation Overview:

The Generalized Linear Mixed Model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes into account random effects. Given its power to precisely model the mixed effects from multiple sources of random variation, the method has been widely used in biomedical computation, for instance in genome-wide association studies (GWAS) that aim to detect genetic variants significantly associated with phenotypes such as human diseases. Collaborative GWAS on large cohorts of patients across multiple institutions is often impeded by the privacy concerns of sharing personal genomic and other health data. To address such concerns, we present in this paper a privacy-preserving Expectation-Maximization (EM) algorithm to build GLMM collaboratively when input data are distributed among multiple participating parties and cannot be transferred to a central server. We assume that the data are horizontally partitioned among participating parties: i.e., each party holds a subset of records (including observational values of fixed effect variables and their corresponding outcome), and for all records, the outcome is regulated by the same set of known fixed effects and random effects. Our collaborative EM algorithm is mathematically equivalent to the original EM algorithm commonly used in GLMM construction. The algorithm also runs efficiently when tested on simulated and real human genomic data, and thus can be practically used for privacy-preserving GLMM construction.
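
As a rough illustration of why horizontally partitioned data can support an exact collaborative fit, the sketch below uses ordinary least squares with one predictor rather than the GLMM EM algorithm described in the abstract. That substitution, the toy records and every function name are assumptions made for illustration only. Each party shares summed statistics instead of raw records, and the pooled fit matches the fit on the combined data. The sketch includes no cryptographic protection of the shared sums.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, float]  # (x, y): one fixed-effect value and its outcome


def local_stats(records: List[Record]) -> Tuple[float, float, float, float, float]:
    """Summaries a party can share instead of raw records: n, sum x, sum x^2, sum y, sum xy."""
    n = float(len(records))
    sx = sum(x for x, _ in records)
    sxx = sum(x * x for x, _ in records)
    sy = sum(y for _, y in records)
    sxy = sum(x * y for x, y in records)
    return n, sx, sxx, sy, sxy


def solve(stats: Sequence[float]) -> Tuple[float, float]:
    """Solve the 2x2 normal equations for (intercept, slope)."""
    n, sx, sxx, sy, sxy = stats
    det = n * sxx - sx * sx
    slope = (n * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / n
    return intercept, slope


def pooled_fit(parties: List[List[Record]]) -> Tuple[float, float]:
    """Add the per-party summaries element-wise, then fit once on the totals."""
    totals = [sum(vals) for vals in zip(*(local_stats(p) for p in parties))]
    return solve(totals)


# Hypothetical toy records held by two parties (both follow y = 2x + 1).
party_a = [(0.0, 1.0), (1.0, 3.0)]
party_b = [(2.0, 5.0), (3.0, 7.0)]

collaborative = pooled_fit([party_a, party_b])
centralized = solve(local_stats(party_a + party_b))
print(collaborative, centralized)
```

Because the summaries are plain sums, adding them across parties gives the same totals as computing them on the merged records, which is the sense in which such a scheme can be equivalent to a centralized fit.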

12:00 PM-12:20 PM
Longitudinal multi-omics profiling reveals two biological seasonal patterns in California
Format: Pre-recorded with live Q&A

  • Ahmed Metwally, Stanford University, United States
  • M. Reza Sailani, Stanford University, United States
  • Michael Snyder, Stanford University, United States

Presentation Overview:

The influence of seasons on biological processes, particularly at a molecular level, is poorly understood. Moreover, seasons are arbitrarily defined based on four equal segments in the calendar year. In order to identify biological seasonal patterns in humans based on diverse molecular data, rather than calendar dates, we leveraged the power of longitudinal multi-omics data from a deep profiling cohort of 105 individuals. These individuals underwent intensive clinical measures and emerging omics profiling technologies including transcriptome, proteome, metabolome, cytokinome as well as gut and nasal microbiome monitoring for up to four years. We identified more than 1000 seasonal variations in omics analytes and clinical measures, including molecular and microbial markers with known seasonality changes, as well as new molecular and microbial markers with seasonality fluctuations. The different molecules grouped into two major seasonal patterns which correlate with peaks in late spring and late fall/early winter in the San Francisco Bay Area. Lastly, we used our recently developed omics longitudinal differential analysis method, OmicsLonDA, to identify molecules and microbes that demonstrated different seasonal patterns in insulin-sensitive and insulin-resistant individuals. These insights have important implications for human health, and our methodology framework can be applied to any geographical location.

12:20 PM-12:30 PM
A versatile non-linear transfer learning framework for correcting pre-clinical-based predictors of drug response
Format: Pre-recorded with live Q&A

  • Marcel Reinders, Delft University of Technology, Netherlands
  • Soufiane Mourragui, Delft University of Technology and the Netherlands Cancer Institute, Netherlands
  • Marco Loog, Delft University of Technology, Netherlands
  • Mark van de Wiel, VU University medical center, Netherlands
  • Lodewyk Wessels, The Netherlands Cancer Institute, Netherlands

Presentation Overview:

Pre-clinical models have been used extensively to understand the molecular underpinnings of cancer. Cell lines and Patient Derived Xenografts (PDX) are amenable to screening for a wide range of anti-cancer therapeutics. These screens offer a direct measure of drug response for many drugs, data that cannot be collected for human tumors. Pre-clinical models do, however, show behavioral discrepancies with respect to human tumors that impede the transfer of biomarkers of drug response from pre-clinical models to patients. We present a novel framework for integrating omics data derived from pre-clinical models and tumors. Our approach employs non-linear dimensionality reduction to capture complex genetic interaction patterns that are common to pre-clinical models and humans. These patterns are then used to train a drug response predictor on human tumor data. This work extends PRECISE to allow incorporation of non-linear similarity measures between samples while retaining equivalence to the linear setting.

12:30 PM-12:40 PM
Format: Live-stream

2:00 PM-2:20 PM
Proceedings Presentation: Robust and accurate deconvolution of tumor populations uncovers evolutionary mechanisms of breast cancer metastasis
Format: Pre-recorded with live Q&A

  • Jian Ma, Carnegie Mellon University, United States
  • Russell Schwartz, Carnegie Mellon University, United States
  • Haoyun Lei, Carnegie Mellon University, United States
  • Yifeng Tao, Carnegie Mellon University, United States
  • Xuecong Fu, Carnegie Mellon University, United States
  • Adrian Lee, University of Pittsburgh, United States

Presentation Overview:

Motivation: Cancer develops and progresses through a clonal evolutionary process. Understanding progression to metastasis is of particular clinical importance, but is not easily analyzed by recent methods because it generally requires studying samples gathered years apart, for which modern single-cell genomics is rarely an option. Understanding clonal evolution in the metastatic transition thus still depends on unmixing tumor subpopulations from bulk genomic data.
Methods: We develop a method for progression inference from bulk transcriptomic data of paired primary and metastatic samples. We develop a novel toolkit, the Robust and Accurate Deconvolution (RAD) method, to deconvolve biologically meaningful tumor populations from multiple transcriptomic samples spanning distinct progression states. RAD employs a hybrid optimizer to achieve an accurate solution, and a gene module representation to mitigate considerable noise in RNA data. Finally, we apply phylogenetic methods to infer how associated cell populations adapt across the metastatic transition via changes in expression programs and cell-type composition.
Results: We validated the superior robustness and accuracy of RAD over other algorithms on a real dataset, and validated the effectiveness of gene module compression on both simulated and real bulk RNA data. We further applied the methods to a breast cancer metastasis dataset, and discovered common early events that promote tumor progression and migration to different metastatic sites, such as dysregulation of ECM-receptor, focal adhesion, and PI3K-Akt pathways.

2:20 PM-2:30 PM
Deep Hidden Physics Modeling of Cell Signaling Networks
Format: Pre-recorded with live Q&A

  • Rune Linding, Humboldt-Universität zu Berlin, Germany
  • Rune Linding, UCPH, BRIC, Denmark

Presentation Overview:

Signaling systems in multicellular organisms are vital for cell-cell communication, tissue organization and disease. Cancer genomics has unraveled a surprisingly large set of novel gene lesions from tumors. Our previous studies have globally explored the rewiring of cell signaling networks underlying malignant transformation caused by kinases and other signaling proteins. By generating quantitative time-/state-series data and subsequently using these as input for deep learning based computational modeling, our lab works to identify the principal changes in the genome, cell signaling and phenotypes of cells harboring genetic mutations; we validate these models by forward prediction of experimentally observed phenotypic responses to drug and genetic perturbations. We are currently deploying such forecasting models on data collected from PDX tumors to describe how the cell signaling networks are mechanistically, dynamically and differentially utilized in cancers. Finally, we are working to combine deep learning with causal/mechanistic models to predict novel treatment and diagnostic strategies for tumors harboring different genetic lesions. In conclusion, our studies aim to unravel the fundamental rewiring of cell signaling networks in cancer and will serve as a major breakthrough in our basic understanding of their impact on the disease, paving the way for future clinical applications and tumor specific cancer therapy.

2:30 PM-2:40 PM
A deep transfer learning model for extending in vitro CRISPR-Cas9 viability screens to tumors
Format: Pre-recorded with live Q&A

  • Yu-Chiao Chiu, University of Texas Health Science Center at San Antonio, United States
  • Yufei Huang, The University of Texas at San Antonio, United States
  • Yidong Chen, University of Texas Health Science Center at San Antonio, United States

Presentation Overview:

The Cancer Dependency Map (DepMap) projects recently employed genome-scale CRISPR-Cas9 loss-of-function screens to identify genes essential for cancer cell proliferation and survival across cancer cell lines. However, it remains very challenging to translate these in vitro results to tumors, which are impracticable to screen. To address the challenge, we devised a deep learning model with a unique transfer-learning framework to predict gene dependencies of tumors. The model has a 3-stage design that enables representation learning from unlabeled tumor genomic data, the prediction of gene dependencies in labeled cell-line screening data, and the application to predict tumor dependencies. The prediction performance was verified using cell-line data. Applying our model to ~8,000 tumors of The Cancer Genome Atlas (TCGA), we constructed a pan-cancer dependency map of tumors. The results were confirmed by several biomarkers and by the responses to targeted therapies in the TCGA clinical records. Further investigations revealed gene dependencies associated with specific genomic patterns, such as higher tumor mutation burdens and unique expression/methylation signatures. We also identified highly selective gene dependencies for which inhibitor drugs have been approved to treat cancers. We expect the model to evolve with rapidly developing in vitro CRISPR-Cas9 viability screens and facilitate the translation to identifying therapeutic targets of tumors.

2:40 PM-2:50 PM
The evolution of homologous repair deficiency in high grade serous ovarian carcinoma
Format: Pre-recorded with live Q&A

  • Ailith Ewing, University of Edinburgh, United Kingdom
  • Charlie Gourley, University of Edinburgh, United Kingdom
  • Colin Semple, The University of Edinburgh, United Kingdom

Presentation Overview:

Exploiting a large collection of whole genome sequencing (WGS) data from high grade serous ovarian carcinoma (HGSOC) samples (N=207), we have comprehensively characterised mutation and expression at the BRCA1/2 loci. In addition to the known spectrum of short somatic variants (SSVs), we discover that multi-megabase structural variants (SVs) are a frequent but unappreciated source of BRCA1/2 disruption in these tumours. These SVs independently affect a substantial proportion of patients (16%) in addition to those affected by SSVs (25%) to cause homologous recombination repair deficiency (HRD). We also detail compound deficiencies involving SSVs and SVs at both loci, demonstrating that the strongest risk of HRD emerges from combined SVs at both BRCA1 and BRCA2 in the absence of SSVs. Overall, we show that HRD is a complex phenotype in HGSOC, affected by the patterns of short somatic and germline variants, SVs, as well as methylation and expression at multiple loci, and we construct a successful (ROC AUC = 0.62) predictive model of HRD using such variables. These results extend our understanding of the mutational landscape at the BRCA1/2 loci in highly rearranged tumours, and also increase the number of patients predicted to benefit from therapies exploiting HRD in tumours.
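
For readers less familiar with the metric quoted above, ROC AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so 0.5 corresponds to chance and 1.0 to perfect ranking. The following is a minimal sketch of that pairwise definition. The scores are made up for illustration and are not data from the study, and the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for HRD-positive and HRD-negative samples.
positive_scores = [0.9, 0.6, 0.4]
negative_scores = [0.7, 0.5, 0.3, 0.2]

auc = roc_auc(positive_scores, negative_scores)
print(f"ROC AUC = {auc:.2f}")
```

The double loop is quadratic in the number of samples, which is fine for a toy example; library implementations typically use a rank-based computation instead.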

2:50 PM-3:00 PM
Format: Live-stream

3:20 PM-3:40 PM
Proceedings Presentation: Identifying diagnosis-specific genotype-phenotype associations via joint multi-task sparse canonical correlation analysis and classification
Format: Pre-recorded with live Q&A

  • Lei Du, Northwestern Polytechnical University, China
  • Fang Liu, Northwestern Polytechnical University, China
  • Kefei Liu, University of Pennsylvania, United States
  • Xiaohui Yao, University of Pennsylvania, United States
  • Shannon Leigh Risacher, Indiana University School of Medicine, United States
  • Junwei Han, Northwestern Polytechnical University, China
  • Lei Guo, Northwestern Polytechnical University, China
  • Andrew Saykin, Indiana University School of Medicine, United States
  • Li Shen, University of Pennsylvania, United States

Presentation Overview:

Brain imaging genetics offers a new opportunity to understand the pathophysiology of brain disorders. It studies the complex associations between genotypic data, such as single nucleotide polymorphisms (SNPs), and imaging quantitative traits (QTs). Neurodegenerative disorders are usually diverse and heterogeneous, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype-phenotype associations. However, most existing SCCA methods are unsupervised and therefore cannot identify diagnosis-specific genotype-phenotype associations.
In this paper, we propose a new joint multi-task learning method, named MT-SCCALR, which combines the merits of SCCA and logistic regression. MT-SCCALR learns genotype-phenotype associations of multiple tasks jointly, with each task focusing on identifying the task-specific genotype-phenotype pattern. To ensure interpretability and stability, the model selects SNPs and imaging QTs specific to each diagnostic group while also allowing the selection of those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT-SCCALR yields better or similar canonical correlation coefficients (CCCs) and classification performance. In addition, its canonical weight patterns are considerably more discriminative than those of its competitors. This demonstrates the capability of MT-SCCALR to identify diagnostically heterogeneous genotype-phenotype patterns, which would help in understanding the pathophysiology of brain disorders.

3:40 PM-4:00 PM
POCOVID-Net: Automatic Detection of COVID-19 From a New Lung Ultrasound Imaging Dataset (POCUS)
Format: Pre-recorded with live Q&A

  • Jannis Born, ETH Zurich, Switzerland
  • Gabriel Brändle, Pediatric Emergencies Department, Geneva, Switzerland
  • Manuel Cossio, Biomedical Research Institute August Pi i Sunyer, Barcelona, Spain
  • Marion Disdier, N.A., Switzerland
  • Julie Goulet, Physik Department and Bernstein Center for Computational Neuroscience, Technische Universität München, Germany
  • Jeremie Roulin, N.A., Switzerland
  • Nina Wiedemann, ETH Zurich, Switzerland

Presentation Overview:

With the rapid development of COVID-19 into a global pandemic, there is an urgent need for cheap, fast and reliable tools that assist physicians in diagnosing COVID-19. Medical imaging can play a key role in complementing conventional diagnostic tools. Using CT or X-ray scans, several deep learning models have demonstrated promising performance.
Here, we present the first framework for COVID-19 detection from ultrasound. Ultrasound is cheap, portable, non-invasive and ubiquitous in medical facilities.
Our contribution is threefold.
First, we gather a lung ultrasound dataset consisting of 1103 images (654 COVID-19, 277 bacterial pneumonia and 172 healthy controls). This dataset is by no means exhaustive, but we processed it to feed deep learning models and make it publicly available, thus delivering a starting point for an open-access initiative of lung ultrasound data. Second, we train a deep convolutional neural network (POCOVID-Net) in a 5-fold cross validation on this data and achieve an accuracy of 89%, and, for COVID-19, a sensitivity of 0.96 (specificity 0.79).
Third, we provide an open-access web service at: https://pocovidscreen.org. The website deploys not only the predictive model but also offers a data-sharing interface, simplifying data contribution for researchers and physicians. Dataset and code are available from: https://github.com/jannisborn/covid19_pocus_ultrasound
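
The accuracy, sensitivity and specificity figures quoted above follow from a confusion matrix. The sketch below shows how the three relate for a binary "COVID-19 versus other" reading. The counts are hypothetical and chosen for easy arithmetic, not taken from the POCUS dataset, and the function name is an assumption.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        # Share of true COVID-19 images that the model flags as COVID-19.
        "sensitivity": tp / (tp + fn),
        # Share of non-COVID-19 images that the model correctly leaves unflagged.
        "specificity": tn / (tn + fp),
        # Share of all images that are classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 50 COVID-19 images and 50 other images.
metrics = binary_metrics(tp=45, fn=5, tn=40, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A high sensitivity paired with a lower specificity, as in the abstract, means few COVID-19 cases are missed at the cost of more false alarms among the other classes.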

4:00 PM-4:20 PM
Drug repurposing to improve health and lifespan in humans
Format: Pre-recorded with live Q&A

  • Handan Melike Donertas, EMBL-EBI, United Kingdom
  • Matias Fuentealba Valenzuela, Institute of Healthy Aging (UCL), United Kingdom
  • Linda Partridge, University College London - Institute of Healthy Ageing; MPI for Biology of Ageing, United Kingdom
  • Janet Thornton, EMBL-EBI, United Kingdom

Presentation Overview:

Model organism studies have demonstrated the possibility of lifespan extension up to 10-fold through genetic interventions. Although their effect sizes are smaller, several drugs have also been shown to modulate lifespan and health during ageing in model organisms. Translation of this information to humans, however, is challenging and requires further investigation. In this study, we perform a comparative and integrative analysis of different drug repurposing studies for human ageing together with the known lifespan modulators in model organisms. We use two different approaches that we developed: i) targeting genes which change expression during ageing in humans, and ii) targeting genes associated with an increased risk of multiple late-onset diseases. The first set included a significant number of known lifespan modulators, which also improve health in model organisms. However, drugs targeting multiple diseases did not overlap with the known pro-longevity drugs. This offers new avenues to explore experimentally. Through a systems-level analysis of the targeted pathways and their regulators, we aim to elucidate the mechanisms of lifespan modulation that can also improve health in the elderly.

4:20 PM-4:30 PM
Patient Derived Xenografts Based Pharmacogenomics for Precision Medicine
Format: Pre-recorded with live Q&A

  • Arvind Singh Mer, University of Toronto, Canada
  • Benjamin Haibe-Kains, University Health Network, Canada

Presentation Overview:

Patient-derived xenografts (PDXs) are used as reliable preclinical models for testing anti-cancer therapies in precision medicine. Several academic groups, research institutes, and commercial organizations are generating and distributing PDX models. However, the distributed nature of PDX model generation and the lack of a central repository make it challenging to find and analyze PDX pharmacogenomic data. To overcome these challenges, we have developed Xenograft Visualization & Analysis (Xeva), an open-source software package in R. Xeva allows PDX growth curve visualization and biomarker discovery. We have also developed XevaDB, a database of PDX drug response and genomic profiles. XevaDB is the first resource to allow concurrent visualization of drug response and associated molecular data such as mutation and CNV. XevaDB contains PDXs from >600 individual patients and >70 drugs. Using XevaDB, we have performed a meta-analysis of PDX pharmacogenomic data and have identified 90 pathways significantly associated with response to 53 drugs (FDR < 5%). Our results show that activity of the EGFR signaling pathway is significantly associated with Erlotinib response in lung cancer and that Binimetinib response is associated with the MAPK pathway. The Xeva and XevaDB tool set provides a comprehensive resource to search and explore PDX pharmacogenomic data for precision medicine.
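
The abstract reports an FDR threshold of 5% without naming the correction procedure. As one common possibility, the sketch below applies the Benjamini-Hochberg adjustment to a list of p-values. The choice of Benjamini-Hochberg, the p-values and the function name are all assumptions for illustration and are not taken from Xeva or XevaDB.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pathway-level p-values for one drug.
p_vals = [0.03, 0.5, 0.001, 0.02]
q_vals = benjamini_hochberg(p_vals)

# Keep the associations whose adjusted p-value is below 5%.
significant = [i for i, q in enumerate(q_vals) if q < 0.05]
print(q_vals, significant)
```

Filtering on the adjusted values rather than the raw p-values bounds the expected share of false positives among the reported associations.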

4:30 PM-4:40 PM
Format: Live-stream

5:00 PM-5:10 PM
ReactomeGSA - Efficient Multi-Omics Comparative Pathway Analysis
Format: Pre-recorded with live Q&A

  • Johannes Griss, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Guilherme Viteri, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Kostas Sidiropoulos, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Vy Nguyen, Medical University of Vienna, Austria
  • Antonio Fabregat, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Henning Hermjakob, European Bioinformatics Institute (EMBL-EBI), United Kingdom

Presentation Overview:

Pathway analyses are key methods for analysing ‘omics experiments. The continuous increase in public data offers a great opportunity to extend one's own analyses. This is often complicated by the use of different ‘omics technologies and different species. Therefore, researchers still require considerable bioinformatics knowledge to perform such analyses.
Here we present ReactomeGSA for comparative pathway analyses of multi-omics datasets. ReactomeGSA is integrated in Reactome’s existing web interface and accessible through the ReactomeGSA R Bioconductor package with explicit support for scRNA-seq data. Data from different species is mapped to a common pathway space. Public data from ExpressionAtlas and Single Cell ExpressionAtlas can be directly integrated in the analysis. ReactomeGSA thereby greatly reduces the technical barrier for multi-omics, cross-species, comparative pathway analyses.
We used ReactomeGSA to characterise the role of B cells in anti-tumour immunity. We compared B cell rich and poor human cancer samples from five TCGA transcriptomics and two CPTAC proteomics studies. B cell-rich lung adenocarcinoma samples lack the otherwise present activation through NFkappaB. This may be linked to tumour associated IgG+ plasma cells that lack NFkappaB activation in single-cell RNAseq data from human melanoma. This showcases how ReactomeGSA can derive novel biomedical insights by integrating large multi-omics datasets.

5:10 PM-5:50 PM
Keynote: Precisely Practicing Medicine from 700 Trillion Points of Data
Format: Live-stream

  • Atul Butte, Priscilla Chan and Mark Zuckerberg Distinguished Professor of Pediatrics, Bioengineering & Therapeutic Sciences, and Epidemiology & Biostatistics at UCSF, United States

Presentation Overview:

There is an urgent need to take what we have learned in our new data-driven era of medicine, and use it to create a new system of precision medicine, delivering the best, safest, cost-effective preventative or therapeutic intervention at the right time, for the right patients. Dr. Butte's lab at the University of California, San Francisco builds and applies tools that convert trillions of points of molecular, clinical, and epidemiological data -- measured by researchers and clinicians over the past decade and now commonly termed “big data” -- into diagnostics, therapeutics, and new insights into disease. Dr. Butte, a computer scientist and pediatrician, will highlight his center’s recent work on integrating electronic health records data across the entire University of California, and how analytics on this “real world data” can lead to new evidence for drug efficacy, new savings from better medication choices, and new methods to teach intelligence – real and artificial – to more precisely practice medicine.

5:50 PM-6:00 PM
Closing remarks
Format: Live-stream

  • Maria Secrier, University College London, United Kingdom