Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

TransMed COSI

Attention Presenters - please review the Speaker Information Page available here
Schedule subject to change
All times listed are in UTC
Sunday, July 25th
11:00-11:05
Opening
Format: Live-stream

Moderator(s): Venkata Satagopam

  • Venkata Satagopam
11:05-11:40
TransMed Keynote
Format: Live-stream

Moderator(s): Venkata Satagopam

  • Prof. Dr. Jochen Klucken
11:40-12:00
Multimodal analysis of cell-free DNA whole genome sequencing for pediatric cancers with low mutational burden
Format: Pre-recorded with live Q&A

Moderator(s): Venkata Satagopam

  • Peter Peneder, St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria
  • Adrian Stütz, St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria
  • Christoph Bock, CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
  • Eleni M. Tomazou, St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria

Presentation Overview:

Sequencing of cell-free DNA in the blood of cancer patients (liquid biopsy) provides attractive opportunities for early diagnosis, assessment of treatment response, and minimally invasive disease monitoring. To unlock liquid biopsy analysis for pediatric tumors with few genetic aberrations, we introduce an integrated genetic/epigenetic analysis method and demonstrate its utility on 241 deep whole genome sequencing profiles of 95 patients with Ewing sarcoma and 31 patients with other pediatric sarcomas. Our method achieves sensitive detection and classification of circulating tumor DNA in peripheral blood independent of any genetic alterations. Moreover, we benchmark different metrics for cell-free DNA fragmentation analysis, and we introduce the LIQUORICE algorithm for detecting circulating tumor DNA based on cancer-specific chromatin signatures. Finally, we combine several fragmentation-based metrics into an integrated machine learning classifier for liquid biopsy analysis that is tailored to cancers with low mutation rates while exploiting widespread epigenetic deregulation. Clinical associations highlight the potential value of cfDNA fragmentation patterns as prognostic biomarkers in Ewing sarcoma. In summary, our study provides a comprehensive analysis of circulating tumor DNA beyond recurrent somatic mutations, and it renders the benefits of liquid biopsy more readily accessible for childhood cancers.

12:00-12:20
DriveWays: A Method for Identifying Possibly Overlapping Driver Pathways in Cancer
Format: Pre-recorded with live Q&A

Moderator(s): Venkata Satagopam

  • Ilyes Baali, Antalya Bilim University, Turkey
  • Hilal Kazan, Antalya Bilim University, Turkey
  • Cesim Erten, Antalya Bilim University, Turkey

Presentation Overview:

The majority of previous methods for identifying cancer driver modules output nonoverlapping modules. This assumption is biologically inaccurate, as genes can participate in multiple molecular pathways. This is particularly true for cancer-associated genes, as many of them are network hubs connecting functionally distinct sets of genes. It is important to provide combinatorial optimization problem definitions modeling this biological phenomenon and to suggest efficient algorithms for their solution. We provide a formal definition of the Overlapping Driver Module Identification in Cancer (ODMIC) problem. We show that the problem is NP-hard. We propose a seed-and-extend based heuristic named DriveWays that identifies overlapping cancer driver modules from the graph built from the IntAct PPI network. DriveWays incorporates mutual exclusivity, coverage, and the network connectivity information of the genes. We show that DriveWays outperforms the state-of-the-art methods in recovering well-known cancer driver genes in evaluations performed on TCGA pan-cancer data. Additionally, DriveWays' output modules show a stronger enrichment for the reference pathways in almost all cases. Overall, we show that enabling modules to overlap improves the recovery of functional pathways filtered with known cancer drivers, which essentially constitute the reference set of cancer-related pathways.
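
Editorial illustration: the abstract names coverage and mutual exclusivity as ingredients of DriveWays but does not give its scoring function. The sketch below uses commonly used textbook-style definitions of the two quantities on a toy mutation table. The function names, the data, and the exact formulas are assumptions for illustration and are not taken from DriveWays.

```python
from typing import Dict, Set


def coverage(module: Set[str], mutations: Dict[str, Set[str]], n_patients: int) -> float:
    """Fraction of all patients that carry a mutation in at least one module gene."""
    covered: Set[str] = set()
    for gene in module:
        covered |= mutations.get(gene, set())
    return len(covered) / n_patients


def mutual_exclusivity(module: Set[str], mutations: Dict[str, Set[str]]) -> float:
    """Covered patients divided by the total number of (gene, patient) mutation events.

    The value is 1.0 when no patient carries mutations in two module genes.
    """
    covered: Set[str] = set()
    events = 0
    for gene in module:
        patients = mutations.get(gene, set())
        covered |= patients
        events += len(patients)
    return len(covered) / events if events else 0.0


# Toy data (hypothetical): gene -> set of patients in which the gene is mutated.
toy_mutations = {
    "GENE_A": {"p1", "p2", "p3"},
    "GENE_B": {"p4", "p5"},
    "GENE_C": {"p3", "p6"},  # p3 also has GENE_A mutated, so exclusivity < 1
}
toy_module = {"GENE_A", "GENE_B", "GENE_C"}

cov = coverage(toy_module, toy_mutations, n_patients=8)
mex = mutual_exclusivity(toy_module, toy_mutations)
print(f"coverage={cov:.3f} mutual_exclusivity={mex:.3f}")
```

A seed-and-extend heuristic would typically grow a module gene by gene while such scores stay high. How DriveWays combines them with network connectivity is described in the authors' paper, not here.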

12:40-13:00
Analyzing spatial heterogeneity of tumor mutation burden and immune infiltrates on whole slide images signals correlation with bladder cancer survival
Format: Pre-recorded with live Q&A

Moderator(s): Wei Gu

  • Tae Hyun Hwang, Cleveland Clinic, United States
  • Hongming Xu, Dalian University of Technology, Cleveland Clinic, China
  • Sunho Park, Cleveland Clinic, United States
  • Jean Clemenceau, Cleveland Clinic, United States
  • Jinhwan Choi, Cleveland Clinic, United States
  • Sung Hak Lee, Seoul St. Mary’s Hospital, Korea, Republic of

Presentation Overview:

Recent work has shown that high tumor mutation burden (TMB-H) could result in an increased number of neoepitopes from somatic mutations expressed by a patient’s tumor cells, which can be recognized and targeted by neighboring tumor-infiltrating lymphocytes (TILs). A deeper understanding of the spatial heterogeneity and organization of tumor cells and their neighboring immune infiltrates could provide new insights into the biology of tumor progression and treatment response, including immunotherapy. We developed and applied computational approaches using digital whole slide images (WSIs) to investigate the spatial heterogeneity and organization of regions harboring TMB-H tumor cells and TILs within tumors, and their impact on prognostic and predictive utility. In experiments using WSIs from The Cancer Genome Atlas bladder cancer (BLCA) cohort, our findings show that WSI-based approaches can reliably predict patient-level TMB status, delineate spatial TMB heterogeneity and identify co-organization with TILs. TMB-H patients with low spatial heterogeneity enriched with high TILs showed improved overall survival. Furthermore, we evaluated our models using BLCA patients treated with immunotherapy in a real-world clinical setting. Our results indicate both prognostic and predictive roles for image-based TMB and TILs.

13:00-13:10
Analysis of single nucleus transcriptome profiles from developing human cerebellum reveals potential cellular origins of pediatric brain tumors
Format: Pre-recorded with live Q&A

Moderator(s): Wei Gu

  • Konstantin Okonechnikov, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Mari Sepp, Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, Heidelberg, Germany
  • Piyush Joshi, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Kevin Leiss, Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, Heidelberg, Germany
  • Ioannis Sarropoulos, Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, Heidelberg, Germany
  • Martin Sill, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Natalie Jäger, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • David T.W. Jones, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Marcel Kool, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Lena Kutscher, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  • Henrik Kaessmann, Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, Heidelberg, Germany
  • Stefan M. Pfister, Hopp Children's Cancer Center Heidelberg (KiTZ), Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany

Presentation Overview:

The majority of pediatric brain tumors, such as medulloblastoma, ependymoma or pilocytic astrocytoma, arise in the cerebellum. Various molecular biology techniques, such as methylation profiling, have improved the accuracy of clinical diagnosis. Nevertheless, the lack of knowledge about the tumors’ cellular origin has hampered further breakthroughs in treatment strategies. Such knowledge could help to better understand tumor-specific biology and distinguish potential treatment targets. Single cell sequencing is well suited to solve this task by deconvoluting the cellular composition. We performed global single nucleus sequencing on human cerebellum samples across several developmental time points and generated transcriptome profiles from ~200k single cells. After identifying the cell types forming the developing cerebellum, we established computational approaches for the detailed comparison of normal cell states to bulk pediatric brain tumor transcriptome profiles. Examination of 65 tumor classes allowed us to confirm known relationships and discover novel cerebellar associations, along with shared specific genes that could potentially serve as treatment targets in the future. We further integrated single cell tumor data with correlated cell type lineages to better understand tumor development. All obtained results were combined into an interactive web service (BRAIN-MATCH) that allows users to perform these analyses at different settings and visualize the results.

13:10-13:20
Ranking Cancer Drivers via Betweenness-based Outlier Detection and Random Walks
Format: Pre-recorded with live Q&A

Moderator(s): Wei Gu

  • Cesim Erten, Antalya Bilim University, Turkey
  • Aissa Houdjedj, Antalya Bilim University, Turkey
  • Hilal Kazan, Antalya Bilim University, Turkey

Presentation Overview:

Background: Recent cancer genomic studies have generated detailed molecular data on a large number of cancer patients. A key remaining problem in cancer genomics is the identification of driver genes.
Results: We propose BetweenNet, a computational approach that integrates genomic data with a protein-protein interaction network to identify cancer driver genes. BetweenNet utilizes a measure based on betweenness centrality on patient specific networks to identify the so-called outlier genes that correspond to dysregulated genes for each patient. Setting up the relationship between the mutated genes and the outliers through a bipartite graph, it employs a random-walk process on the graph, which provides the final prioritization of the mutated genes. We compare BetweenNet against state-of-the-art cancer gene prioritization methods on lung, breast, and pan-cancer datasets.
Conclusions: Our evaluations show that BetweenNet is better at recovering known cancer genes based on multiple reference databases. Additionally, we show that the GO terms and the reference pathways enriched in BetweenNet ranked genes and those that are enriched in known cancer genes overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods.
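
Editorial illustration: the abstract describes a random-walk process on a bipartite graph linking mutated genes to patient-specific outlier genes. The following is a minimal sketch of a generic random walk with restart on a toy bipartite graph. The graph, the choice of restart nodes, and the restart probability are assumptions for illustration; BetweenNet's actual walk and its betweenness-based outlier detection are not reproduced here.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_walk_with_restart(adj: Dict[str, Set[str]], seeds: Set[str],
                             restart: float = 0.3, tol: float = 1e-12,
                             max_iter: int = 10_000) -> Dict[str, float]:
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    W spreads the probability of each node evenly over its neighbours.
    """
    nodes: List[str] = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy bipartite graph (hypothetical): mutated genes M* linked to outlier genes O*.
toy_edges = [("M1", "O1"), ("M1", "O2"), ("M1", "O3"), ("M2", "O1"), ("M3", "O3")]
adjacency = build_adjacency(toy_edges)
scores = random_walk_with_restart(adjacency, seeds={"O1", "O2", "O3"})

mutated = ["M1", "M2", "M3"]
ranking = sorted(mutated, key=lambda g: scores[g], reverse=True)
print(ranking)
```

In this toy setup the mutated gene connected to the most outliers receives the highest stationary probability, which is the intuition behind using a walk for prioritization.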

13:20-13:40
Proceedings Presentation: Optimising Blood-Brain Barrier Permeation through Deep Reinforcement Learning for De Novo Drug Design
Format: Pre-recorded with live Q&A

Moderator(s): Wei Gu

  • Tiago Pereira, University of Coimbra, Portugal
  • Maryam Abbasi, University of Coimbra, Portugal
  • José Oliveira, University of Aveiro, Portugal
  • Bernardete Ribeiro, University of Coimbra, Portugal
  • Joel Arrais, University of Coimbra, Portugal

Presentation Overview:

The process of placing new drugs into the market is time-consuming, expensive and complex. The application of computational methods for designing molecules with bespoke properties can contribute to saving resources throughout this process. However, the fundamental properties to be optimised are often not considered or conflict with each other. In this work, we propose a novel approach to consider both the biological property and the bioavailability of compounds through a deep reinforcement learning framework for the targeted generation of compounds. We aim to obtain a promising set of compounds that are selective for the adenosine A2A receptor and, simultaneously, have the necessary properties in terms of solubility and permeability across the blood-brain barrier to reach the site of action. The cornerstone of the framework is the Generator, which is based on a Recurrent Neural Network architecture. It seeks to learn the building rules of valid molecules in order to sample new compounds. In addition, two Predictors are trained to estimate the properties of interest of the new molecules. Finally, the fine-tuning of the Generator was performed with reinforcement learning, integrated with multi-objective optimisation and exploratory techniques to ensure that the Generator is adequately biased.
The biased Generator can generate an interesting set of molecules, with approximately 85% having the two fundamental properties biased as desired. Thus, this approach has transformed a general molecule generator into a model focused on optimising specific objectives. Furthermore, the molecules' synthesisability and drug-likeness demonstrate the potential applicability of the de novo drug design in medicinal chemistry.

13:40-14:00
Reconciling Multiple Connectivity Scores for Drug Repurposing
Format: Pre-recorded with live Q&A

Moderator(s): Wei Gu

  • Kewalin Samart, Michigan State University, United States
  • Phoebe Tuyishime, Michigan State University, United States
  • Stephanie Hickey, Michigan State University, United States
  • Arjun Krishnan, Michigan State University, United States
  • Janani Ravi, Michigan State University, United States

Presentation Overview:

The key principle of recent drug repurposing methods is that an efficacious drug will reverse the disease molecular ‘signature’ with minimal side-effects. This principle was defined and popularized by the influential ‘connectivity map’ study in 2006 regarding reversal relationships between disease- and drug-induced gene expression profiles, quantified by a disease-drug ‘connectivity score.’ Over the past 15 years, several studies have proposed variations in calculating connectivity scores towards improving accuracy and robustness in light of massive growth in reference drug profiles. However, these variations have been formulated inconsistently using various notations and terminologies, even though the various scores are based on a common set of conceptual and statistical ideas. Here, we present a systematic reconciliation of multiple disease-drug similarity metrics and connectivity scores by defining them using consistent notation and terminology. In addition to providing clarity and deeper insights, this coherent definition of connectivity scores and their relationships provides a unified scheme that newer methods can adopt, enabling the computational drug-development community to compare and investigate different approaches easily. This resource will be available as a live document (https://jravilab.github.io/connectivity_scores) coupled with a GitHub repository (https://github.com/JRaviLab/connectivity_scores) to facilitate the continuous and transparent integration of newer methods.
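
Editorial illustration: the reversal idea behind a connectivity score can be shown with a rank correlation between a disease signature and a drug signature, where a strongly negative value suggests that the drug reverses the disease profile. This is only one simple formulation among the many the abstract refers to. The function names and the toy expression values are assumptions and do not come from the authors' resource.

```python
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def connectivity_score(disease: Dict[str, float], drug: Dict[str, float]) -> float:
    """Spearman correlation over shared genes; negative means reversal."""
    genes = sorted(set(disease) & set(drug))
    x = [disease[g] for g in genes]
    y = [drug[g] for g in genes]
    return pearson(average_ranks(x), average_ranks(y))


# Toy log fold-changes (hypothetical values).
disease_sig = {"G1": 2.0, "G2": 1.0, "G3": -0.5, "G4": -1.5}
reversing_drug = {"G1": -1.8, "G2": -0.7, "G3": 0.4, "G4": 1.2}
mimicking_drug = {"G1": 1.5, "G2": 0.9, "G3": -0.2, "G4": -1.1}

score_rev = connectivity_score(disease_sig, reversing_drug)
score_mim = connectivity_score(disease_sig, mimicking_drug)
print(score_rev, score_mim)
```

Under this convention, candidate drugs would be ranked from the most negative score upward. Other published scores use enrichment statistics or weighted sums instead of a correlation.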

14:20-14:40
Identification of transcriptional network disruptions in drug-resistant prostate cancer with TraRe
Format: Pre-recorded with live Q&A

Moderator(s): Maria Secrier

  • Charles Blatti, University of Illinois at Urbana-Champaign, Urbana, IL, United States
  • Jesus De la Fuente Cedeño, TECNUN, University of Navarra, Spain
  • Huanyao Gao, College of Medicine, Mayo Clinic, United States
  • Irene Marin, Computational Biology Program, CIMA University of Navarra, Spain
  • Zikun Chen, University of Illinois at Urbana-Champaign, Urbana, IL, United States
  • Sihai Dave Zhao, University of Illinois at Urbana-Champaign, Urbana, IL, United States
  • Weinshilboum Richard, College of Medicine, Mayo Clinic, United States
  • Krishna R Kalari, College of Medicine, Mayo Clinic, United States
  • Liewei Wang, College of Medicine, Mayo Clinic, United States
  • Mikel Hernaez, Computational Biology Program, CIMA University of Navarra, Spain

Presentation Overview:

The identification of significant changes in Gene Regulatory Networks (GRNs) under different response groups can help discover novel molecular diagnostics and prognostic signatures.

In this work, we present a computational method, TraRe, which combines unsupervised learning and non-parametric testing to mechanistically understand how transcription networks are differentially regulated.

We applied TraRe to RNAseq data of metastatic Castration-Resistant Prostate Cancer (CRPC) patients from the PROMOTE clinical study (NCT 01953640) treated with abiraterone (ABI). Rewired GRNs between ABI responders and non-responders were found to be enriched in genes down-regulated in prostate cancer samples, in transcription factors (TFs) involved in the androgen receptor signaling pathway, and in TFs associated with other cancers. Further, MDX1, a TF that acts as a transcriptional repressor and is a candidate tumor suppressor, is among the top rewiring-specific TFs.

Key rewired TF-target relationships were validated in vitro via qRT-PCR. After knock-down of the top TFs, expression levels of four key genes were significantly changed between parent cell lines and ABI-resistant cell lines.

TraRe efficiently uncovers GRNs from high-throughput sequencing data, performing differential network analysis that unravels phenotype-specific regulatory disruptions.

14:40-15:00
NeDRex - an integrative and interactive network medicine platform for drug repurposing
Format: Pre-recorded with live Q&A

Moderator(s): Maria Secrier

  • Sepideh Sadegh, Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Germany
  • James Skelton, School of Computing, Newcastle University, United Kingdom
  • David B. Blumenthal, Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Germany
  • Elisa Anastasi, School of Computing, Newcastle University, United Kingdom
  • Gihanna Galindez, Division Data Science in Biomedicine, PLRI, TU Braunschweig and Hannover Medical School, Germany
  • Anil Wipat, School of Computing, Newcastle University, United Kingdom
  • Tim Kacprowski, Division Data Science in Biomedicine, PLRI, TU Braunschweig and Hannover Medical School, Germany
  • Jan Baumbach, Chair of Computational Systems Biology, University of Hamburg, Germany

Presentation Overview:

Traditional drug discovery faces a severe efficacy crisis. Repurposing of registered drugs provides an alternative with lower costs, reduced risk, and faster clinical application. The underlying mechanisms of complex diseases are best described by disease modules. These modules represent disease-relevant pathways and contain potential drug targets which can be identified in silico with network-based methods. The data necessary for the identification of disease modules and network-based drug repurposing are scattered across independent databases; moreover, existing studies have been limited to predictions for specific diseases or non-translational algorithmic approaches. Hence, there is an unmet need for adaptable tools allowing biomedical researchers to employ network-based drug repurposing approaches for their specific use cases. We close this gap with NeDRex, an integrative and interactive platform for network-based drug repurposing. NeDRex integrates different data sources covering genes, proteins, drugs, drug targets, disease annotations, and their relationships, resulting in a network with 350,142 nodes and 14,127,004 edges. NeDRex allows for constructing heterogeneous biological networks, mining them for disease modules, and prioritizing drugs targeting disease mechanisms. NeDRex generalizes the approach implemented in our previous work for COVID-19 drug repurposing, CoVex (https://doi.org/10.1038/s41467-020-17189-2), to be applicable to other diseases.

15:00-15:10
A novel feature selection pipeline for identifying predictive targets associated with drug toxicity
Format: Pre-recorded with live Q&A

Moderator(s): Maria Secrier

  • Yun Hao, University of Pennsylvania, United States
  • Jason Moore, University of Pennsylvania, United States

Presentation Overview:

In silico assessment of drug toxicity is becoming a critical step in drug development. Existing models are limited by low accuracy and lack of interpretability. Further, they often fail to explain cellular mechanisms underlying structure-toxicity associations. We addressed these limitations by incorporating the target profile as an intermediate connecting structure to toxicity. To accommodate the high-dimensional feature space, we developed a pipeline that can identify a subset of predictive features. We applied our pipeline to study 569 targets and 815 adverse events. The features identified by our pipeline comprise less than ten percent of the original feature space; nevertheless, they accurately predicted binding outcomes for 377 targets and toxicity outcomes for 36 adverse events. We demonstrated that predictive targets tend to be differentially expressed in the tissue of toxicity. We rediscovered key cellular functions associated with cardiotoxicity from the predictive targets, as well as markers of skin and liver diseases. We found evidence supporting diagnostic/therapeutic applications of some predictive targets in hepatotoxicity and nephrotoxicity. Our findings highlighted the critical role of predictive targets in cellular mechanisms leading to toxicity. In general, our study improved the interpretability of toxicity prediction without sacrificing accuracy. Our novel pipeline may benefit future studies of high-dimensional datasets.

15:10-15:20
Closing of Day 1
Format: Live-stream

Moderator(s): Maria Secrier

Monday, July 26th
11:00-11:40
TransMed Keynote
Format: Live-stream

Moderator(s): Irina Balaur

  • Dr. Serena Scollen
11:40-12:00
Proceedings Presentation: “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases
Format: Pre-recorded with live Q&A

Moderator(s): Irina Balaur

  • Dillon Aberasturi, The University of Arizona, United States
  • Nima Pouladi, The University of Utah, United States
  • Samir Rachid Zaim, The University of Arizona, United States
  • Colleen Kenost, The University of Utah, United States
  • Joanne Berghout, Pfizer, United States
  • Walter W. Piegorsch, The University of Arizona, United States
  • Yves A. Lussier, The University of Utah, United States

Presentation Overview:

Motivation: Identifying altered transcripts between very small human cohorts is particularly challenging and is compounded by the low accrual rate of human subjects in rare diseases or sub-stratified common disorders. Yet, single-subject studies (S3) can compare paired transcriptome samples drawn from the same patient under two conditions (e.g., treated vs pre-treatment) and suggest patient-specific responsive biomechanisms based on the overrepresentation of functionally defined gene sets. These improve statistical power by: (i) reducing the total features tested and (ii) relaxing the requirement of within-cohort uniformity at the transcript level. We propose Inter-N-of-1, a novel method, to identify meaningful differences between very small cohorts by using the effect size of “single-subject-study”-derived responsive biological mechanisms.
Results: In each subject, Inter-N-of-1 requires applying the previously published S3-type N-of-1-pathways MixEnrich to two paired samples (e.g., diseased vs unaffected tissues) for determining patient-specific enriched gene sets: Odds Ratios (S3-OR) and S3-variance using Gene Ontology Biological Processes. To evaluate small cohorts, we calculated the precision and recall of Inter-N-of-1 and that of a control method (GLM+EGS) when comparing two cohorts of decreasing sizes (from 20 vs 20 to 2 vs 2) in a comprehensive six-parameter simulation and in a proof-of-concept clinical dataset. In simulations, the Inter-N-of-1 median precision and recall are >90% and >75% in cohorts of 3 vs 3 distinct subjects (regardless of the parameter values), whereas conventional methods outperform Inter-N-of-1 at sample sizes 9 vs 9 and larger. Similar results were obtained in the clinical proof-of-concept dataset.
Availability: R software is available at Lussierlab.net/BSSD.

12:00-12:20
Assessing the role of Digital Device Technology in Alzheimer’s Disease using Artificial Intelligence
Format: Pre-recorded with live Q&A

Moderator(s): Irina Balaur

  • Holger Fröhlich, Fraunhofer SCAI and University of Bonn, Germany
  • Meemansa Sood, Fraunhofer SCAI and University of Bonn, Germany
  • Mohamed Aborageh, Fraunhofer SCAI and University of Bonn, Germany
  • Robbert Harms, Altoida Inc., 2100 West Loop South, Suite 1450, Houston, TX 77027, United States
  • Maximilian Bügler, Altoida Inc., 2100 West Loop South, Suite 1450, Houston, TX 77027, United States
  • Ioannis Tarnanas, Altoida Inc., 2100 West Loop South, Suite 1450, Houston, TX 77027, United States

Presentation Overview:

In Alzheimer’s Disease (AD), the use of digital technologies has gained a lot of attention because it may help to diagnose the disease in a pre-symptomatic stage. However, before any use in clinical routine, digital measures (DMs) need to be evaluated carefully by assessing their relationship to established clinical scores and understanding their diagnostic benefit. Along these lines, the IMI project RADAR-AD (www.radar-ad.org) evaluates a smartphone-based virtual reality game panel that can help to assess cognitive impairment. In our work, we applied the Variational Autoencoder Modular Bayesian Network (VAMBN) [1] to the virtual reality game data and analysed connections between DMs and cognitive assessments (e.g. Mini Mental State Examination). Based on our model, we then predicted DMs within the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort. This resulted in a network that allowed us to disentangle and quantify the relationship between DMs, established clinical scores, brain volumes as well as molecular mechanisms. Therefore, DMs may have the potential to act as a vital measure in the diagnosis of AD in a pre-symptomatic stage.

[1] Gootjes-Dreesbach L, Sood M, … Fröhlich H (2020) Variational Autoencoder Modular Bayesian Networks for Simulation of Heterogeneous Clinical Study Data. Front. Big Data 3:16. doi: 10.3389/fdata.2020.00016

12:40-13:00
Formulating a Gene Signature for Diagnosis of Autoimmune and Infectious Diseases
Format: Pre-recorded with live Q&A

Moderator(s): Irina Balaur

  • Riya Gupta, Center for Biomedical Informatics Research, Department of Medicine, Stanford University, United States
  • Aditya Rao, Immunology Graduate Program, Department of Medicine, Stanford University, United States
  • Lara Murphy Jones, Division of Critical Care Medicine, Department of Pediatrics, School of Medicine, Stanford University, United States
  • Purvesh Khatri, Center for Biomedical Informatics Research, Department of Medicine, Stanford University, United States

Presentation Overview:

When patients with an underlying autoimmune condition such as juvenile idiopathic arthritis or lupus report life-threatening symptoms, physicians need to quickly determine whether these symptoms are caused by an acute infection or a complication of their autoimmune condition. As immunosuppressive drugs are harmful to someone undergoing an infection, accurate and timely diagnosis is critical. In recent years, host-response-based diagnostics have shown promise in accurately and non-invasively diagnosing a number of infectious and autoimmune diseases.
Here, we collected and curated blood transcriptome profiles of 14,587 patients from 42 countries across 122 independent datasets and grouped them into infectious, autoimmune, and healthy control categories. Using a novel statistical framework, we created two gene signatures from these data: one to differentiate patients with autoimmune or infectious diseases from healthy individuals and another to differentiate between patients with autoimmune and infectious diseases. Both signatures achieve an area under the receiver operating characteristic curve (AUROC) of >0.87 on completely independent datasets. Because our training and testing data included heterogeneity across many factors, these gene signatures can be utilized in diverse clinical populations. Furthermore, these signatures can aid physicians across a broad range of clinical scenarios, where existing diagnostics are invasive, expensive, or non-specific.
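
Editorial illustration: the AUROC quoted above can be read as the probability that a randomly chosen case receives a higher signature score than a randomly chosen control. The sketch below computes it directly from that pairwise definition. The labels and scores are made-up toy values, not data from the study.

```python
from typing import List


def auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example (hypothetical): 1 = infection, 0 = autoimmune flare.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auroc(toy_labels, toy_scores)
print(f"AUROC = {auc:.3f}")
```

The quadratic loop is fine for small cohorts. For large datasets a rank-based computation gives the same value more efficiently.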

13:00-13:20
Proceedings Presentation: Expected 10-anonymity of HyperLogLog sketches for federated queries of clinical data repositories
Format: Pre-recorded with live Q&A

Moderator(s): Irina Balaur

  • Ziye Tao, University of Toronto, Canada
  • Griffin M. Weber, Harvard Medical School, United States
  • Yun William Yu, University of Toronto, Canada

Presentation Overview:

Motivation: The rapid growth of electronic medical records provides immense potential to researchers, but these records are often siloed at separate hospitals. As a result, federated networks have arisen, which allow simultaneous querying of medical databases at a group of connected institutions. The most basic such query is the aggregate count, e.g., how many patients have diabetes? However, depending on the protocol used to estimate that total, there is always a trade-off between the accuracy of the estimate and the risk of leaking confidential data. Prior work has shown that it is possible to empirically control that trade-off by using the HyperLogLog (HLL) probabilistic sketch.
Results: In this article, we prove complementary theoretical bounds on the k-anonymity privacy risk of using HLL sketches, as well as exhibit code to efficiently compute those bounds.
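
Editorial illustration: for readers unfamiliar with HyperLogLog, the following is a minimal sketch of the data structure and of how per-hospital sketches can be merged to estimate a federated count without exchanging patient lists. The class, the hash choice (SHA-256 truncated to 64 bits), the register count, and the patient identifiers are assumptions for illustration. This sketch does not implement the k-anonymity bounds derived in the paper.

```python
import hashlib
import math
from typing import List


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers and a 64-bit hash."""

    def __init__(self, p: int = 10) -> None:
        self.p = p
        self.m = 1 << p
        self.registers: List[int] = [0] * self.m

    def add(self, item: str) -> None:
        h = int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # first p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of the leftmost 1 bit
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        """Return the sketch of the union (element-wise register maximum)."""
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw


# Two hypothetical hospitals with overlapping patients (10,000 distinct overall).
hospital_a = HyperLogLog()
hospital_b = HyperLogLog()
for i in range(0, 6000):
    hospital_a.add(f"patient-{i}")
for i in range(4000, 10000):
    hospital_b.add(f"patient-{i}")

federated = hospital_a.merge(hospital_b)
estimate = federated.estimate()
print(f"estimated distinct patients: {estimate:.0f}")
```

Only the register arrays leave each hospital in this setup. The privacy question studied in the paper is how much those registers can still reveal about individual patients.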

13:20-13:40
Synthesizing realistic patient-level data using multimodal neural differential equations
Format: Pre-recorded with live Q&A

Moderator(s): Venkata Satagopam

  • Holger Fröhlich, Fraunhofer SCAI, Germany
  • Philipp Wendland, Fraunhofer SCAI, Germany
  • Colin Birkenbihl, Fraunhofer SCAI, Germany
  • Maik Kschischo, University of Applied Sciences Koblenz, Germany

Presentation Overview:

Generative artificial intelligence models can utilize patient-level datasets to learn the data's underlying distribution and subsequently generate realistic samples from it. These generated virtual patients ideally maintain the data's inherent signals, such as variable interdependencies and progression trends, and overcome crucial limitations of their real counterparts, like missing values and irregular assessment intervals.
In this work, we present a novel generative model, the Multimodal Neural Ordinary Differential Equation (MultiNODE). MultiNODE was designed to handle multimodal time-dependent and static features (e.g. tremor scores and biological sex). We applied MultiNODE to a longitudinal Parkinson’s disease (PD) clinical dataset and generated synthetic PD cohort data. We compare the generative capabilities of MultiNODE against a previously published generative approach. Finally, we demonstrate the interpolation and extrapolation ability of our model.
MultiNODE successfully created synthetic data that captured the real characteristics of complex longitudinal clinical data. Marginal variable distributions, longitudinal trajectories and the correlation structure exhibited in the synthetic data resembled the original real data. Interpolation and extrapolation beyond the time points used in model training were successful.
Generative models, such as MultiNODE, can support research endeavors and clinical trials through synthesizing control arms, simulating intervention events and generating data with arbitrary observation intervals.

13:40-14:00
Bringing the Algorithms to the Data - Secure Distributed Medical Analytics using the Personal Health Train
Format: Pre-recorded with live Q&A

Moderator(s): Venkata Satagopam

  • Oliver Kohlbacher, University of Tübingen, Germany
  • Marius Herr, University Hospital Tübingen & University of Tübingen, Germany
  • Lukas Zimmermann, University Hospital Tübingen & University of Tübingen, Germany
  • Michael Graf, University Hospital Tübingen, Germany
  • Peter Placzek, University Hospital Tübingen & University of Tübingen, Germany
  • Florian König, University of Tübingen, Germany
  • Mete Akgun, University Hospital Tübingen, Germany
  • Felix Boette, University Hospital Tübingen & University of Tübingen, Germany
  • Tyra Stickel, University Hospital Tübingen & University of Tübingen, Germany
  • Michael Slupina, University Hospital Tübingen, Germany
  • Stephanie Biergans, University Hospital Tübingen, Germany
  • Nico Pfeifer, University of Tübingen & Max Planck Institute for Informatics, Saarbrücken, Germany

Presentation Overview:

Data protection laws force hospitals to create data silos, making it difficult to apply machine learning and artificial intelligence methods across distributed datasets. The Personal Health Train is a paradigm proposed within the GO-FAIR initiative to utilize these methods and improve personalized medicine by enabling the learning of more robust models in a distributed setting. Our deployment-ready and open-source Personal Health Train architecture (Figure 1) enables the execution of arbitrarily complex analysis pipelines with a strong focus on security. Without transferring the data to a central analysis site, container technologies allow the user to run a wide range of algorithms iteratively among participating hospitals. Our framework is dynamically extensible to accommodate the particular needs of researchers and hospitals. After deployment of a station at a hospital, no further installation steps are required. A hospital never gives up control over its data and can independently decide to join in an analysis. We demonstrate our framework's capabilities for raw genomic analysis, including homomorphically encrypted count queries and deep neural networks applied to image data (Figures 2, 3).
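
The idea of bringing the algorithm to the data can be sketched in a few lines. The example below is a toy model of the paradigm only, not the Personal Health Train software or its API: `Station`, `count_step` and `run_train` are hypothetical names, the records are made up, and the running count is passed along in plaintext, whereas the abstract describes homomorphically encrypted count queries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Station:
    """Hypothetical hospital-side node; its records never leave the object."""
    name: str
    _records: List[dict] = field(repr=False)

    def run(self, step: Callable, state: Dict) -> Dict:
        # The analysis step executes locally; only the updated state is returned.
        return step(self._records, state)

def count_step(predicate: Callable[[dict], bool]) -> Callable:
    # Build a step that adds the local number of matching records to the state.
    def step(records: List[dict], state: Dict) -> Dict:
        local = sum(1 for r in records if predicate(r))
        return {
            "count": state["count"] + local,
            "stations_visited": state["stations_visited"] + 1,
        }
    return step

def run_train(stations: List[Station], step: Callable, initial_state: Dict) -> Dict:
    # Send the same step to each station in turn, carrying only the aggregate.
    state = dict(initial_state)
    for station in stations:
        state = station.run(step, state)
    return state

# Made-up example data for two stations.
stations = [
    Station("A", [{"age": 70, "dx": "PD"}, {"age": 55, "dx": "AD"}]),
    Station("B", [{"age": 64, "dx": "PD"}, {"age": 81, "dx": "PD"},
                  {"age": 40, "dx": "MS"}]),
]
result = run_train(
    stations,
    count_step(lambda r: r["dx"] == "PD"),
    {"count": 0, "stations_visited": 0},
)
print(result)  # {'count': 3, 'stations_visited': 2}
```

In this plaintext form each station can infer the previous stations' subtotal from the incoming state, which is one reason an encrypted aggregate is attractive in a real deployment.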

14:20-15:00
TransMed Keynote
Format: Live-stream

Moderator(s): Wei Gu

  • Prof. Kenneth D. Mandl

15:00-15:10
MONET: Multi-omic module discovery by omic selection
Format: Pre-recorded with live Q&A

Moderator(s): Wei Gu

  • Nimrod Rappoport, Tel Aviv University, Israel
  • Roy Safra, Tel Aviv University, Israel
  • Ron Shamir, Tel Aviv University, Israel

Presentation Overview:

Recent advances in experimental biology allow the creation of datasets in which several omics are measured per sample. Integrative analysis of multi-omic datasets in general, and clustering of samples in such datasets specifically, can improve our understanding of biological processes and reveal different disease subtypes. In this work we present MONET, which takes a unique approach to multi-omic clustering. MONET discovers modules of similar samples, such that each module is allowed to have a clustering structure for only a subset of the omics. This approach differs from most existing multi-omic clustering algorithms, which assume a common structure across all omics. We tested MONET extensively on simulated data, on an image dataset, and on ten multi-omic cancer datasets from TCGA. Our analysis shows that MONET compares favorably with other multi-omic clustering methods. We demonstrate MONET's biological and clinical relevance by analyzing its results for Ovarian Serous Cystadenocarcinoma. We also show that MONET is robust to missing data, can cluster genes in multi-omic datasets, and can reveal modules of cell types in single-cell multi-omic data. Our work shows that MONET is a valuable tool that can provide results complementary to those of existing algorithms.

15:10-15:20
Closing
Format: Live-stream

Moderator(s): Irina Balaur



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
