Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

MLSB COSI Track Presentations

Lineage estimation from single-cell RNAseq time-series
Date: Tuesday, July 25
Time: 8:40 AM - 9:30 AM
Room: North Hall
  • Fabian Theis, Helmholtz Center Munich, Germany

Single-cell technologies have gained popularity in developmental and stem cell biology because they allow resolving potential heterogeneities due to asynchronicity of differentiating cells. With technologies slowly becoming mature and cost-efficient, single-cell profiles across multiple conditions, e.g. time points and replicates, are being generated. In this talk I will first show that by modeling the high-dimensional single-cell state space as a diffusion process, we can visualize cell differentiation and estimate lineage formation using pseudotemporal ordering. By including information across multiple time points and, if available, replicates, we can then set up a model motivated by population dynamics but with continuous states that explains cell lineage transitions in real time beyond pseudotime. I will finish by briefly discussing algorithmic and computational challenges in upscaling to "big data" scRNAseq.
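
As a rough illustration of the diffusion-based ordering idea mentioned in this abstract, the following minimal sketch builds a Gaussian-kernel random walk over cells and orders them by the first non-trivial diffusion component. It is not the speaker's method: the function name `diffusion_order`, the bandwidth `sigma`, and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def diffusion_order(X, sigma=0.2):
    """Order cells (rows of X) by the first non-trivial diffusion component.

    Simplified illustration only; `sigma` is an assumed kernel bandwidth.
    """
    # Pairwise squared Euclidean distances between cells.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Gaussian kernel: nearby cells get high transition weight.
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    deg = K.sum(axis=1)
    # Symmetric normalisation; it shares eigenvalues with the
    # row-normalised Markov transition matrix D^-1 K.
    M = K / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # The top eigenvector is trivial (stationary direction); take the next
    # one and convert it to a right eigenvector of D^-1 K.
    psi = vecs[:, -2] / np.sqrt(deg)
    return np.argsort(psi), psi

# Synthetic "cells" lying along a single trajectory in 5 dimensions.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)  # hidden progress along the trajectory
X = np.outer(t, np.ones(5)) + 0.01 * rng.normal(size=(60, 5))

order, psi = diffusion_order(X)
# The sign of an eigenvector is arbitrary, so the recovered ordering may run
# from start to end or from end to start.
print(abs(np.corrcoef(psi, t)[0, 1]))
```

In practice a root cell is needed to fix the direction of the ordering, and real data require choices (normalisation, neighbourhood size, branching detection) that this sketch leaves out.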

Transcriptome-wide splicing quantification in single cells
Date: Tuesday, July 25
Time: 10:00 AM - 10:25 AM
Room: North Hall
  • Yuanhua Huang, School of Informatics, University of Edinburgh, United Kingdom
  • Guido Sanguinetti, School of Informatics, University of Edinburgh, United Kingdom

Single cell RNA-seq (scRNA-seq) has revolutionised our understanding of transcriptome variability, with profound implications both fundamental and translational. While scRNA-seq provides a comprehensive measurement of stochasticity in transcription, the limitations of the technology have prevented its application to dissect variability in RNA processing events such as splicing. Here we present BRIE (Bayesian Regression for Isoform Estimation), a Bayesian hierarchical model which resolves these problems by learning an informative prior distribution from multiple single cells. BRIE combines the mixture modelling approach for isoform quantification with a regression approach to learn sequence features which are predictive of splicing events. We validate BRIE on several scRNA-seq data sets, showing that BRIE yields reproducible estimates of exon inclusion ratios in single cells and provides an effective tool for differential isoform quantification between scRNA-seq data sets. BRIE therefore expands the scope of scRNA-seq experiments to probe the stochasticity of RNA-processing.

Gaussian processes for identifying branching dynamics in single cell data
Date: Tuesday, July 25
Time: 10:25 AM - 10:50 AM
Room: North Hall
  • Alexis Boukouvalas, University of Manchester, United Kingdom
  • James Hensman, Prowler.io, United Kingdom
  • Magnus Rattray, University of Manchester, United Kingdom

Single cell gene quantification allows for the analysis of heterogeneous cell populations and the analysis of the whole transcriptome without the need for a priori gene target selection. Identifying branching dynamics in cell populations undergoing differentiation is computationally challenging due to lack of time course data and high technical and biological noise. We develop the branching Gaussian process (BGP), a non-parametric flexible model that is able to robustly identify branching dynamics on an individual gene level whilst also providing an uncertainty estimate of the branching times.

Data Integration in Computational Biology and Medicine: Current Progress and Future Directions
Date: Tuesday, July 25
Time: 10:50 AM - 11:40 AM
Room: North Hall
  • Anna Goldenberg, The Hospital for Sick Children, Toronto, Canada

There is great potential for machine learning to contribute to understanding and curing complex human diseases. Rapidly evolving biotechnologies are making it progressively easier to collect multiple and diverse genome-scale datasets to address clinical and biological questions. How do we take advantage of this extensive and heterogeneous data to help patients? In this talk I will introduce several very different biological and clinical questions that all call for data integration but require a diverse set of machine learning approaches. First, I will mention Bayesian and discriminative approaches we developed for patient subtyping, then I will talk about identifying disease mechanisms using graphical models, and finally, if time permits, I will talk about drug response prediction via deep learning. I will conclude this talk with a summary of ongoing work in data integration and outline new research directions in this area.

Modeling Post-treatment Gene Expression Change with a Deep Generative Model
Date: Tuesday, July 25
Time: 11:40 AM - 12:05 PM
Room: North Hall
  • Ladislav Rampášek, University of Toronto, Canada
  • Daniel Hidru, University of Toronto, Canada
  • Peter Smirnov, Princess Margaret Cancer Centre, Canada
  • Benjamin Haibe-Kains, Princess Margaret Cancer Centre, Canada
  • Anna Goldenberg, The Hospital for Sick Children, Toronto, Canada

In this paper we present a new Perturbation Variational Autoencoder (PertVAE) that learns a latent representation of the underlying gene states before and after a drug application. PertVAE is a deep generative model based on the Variational Autoencoder. To fit generative and approximate inference distributions for our model, we use a combination of Stochastic Gradient Variational Bayes and Inverse Autoregressive Flow. We tested PertVAE on 19 drugs, predicting post-treatment gene expression. The highest number of cell lines tested across the drugs was 56, which is a very small sample size for training complex models. Nevertheless, PertVAE can at least partially predict drug perturbations for 5 of the 8 drugs for which the most data is available. Furthermore, we found that the correlation of the reconstructed data is better when the size of the latent space is relatively small. We believe that this is a promising result showing that even with a small sample size, deep models are able to learn some level of reconstruction of post-treatment gene expression.

Generative Learning of Dynamic Structures using Spanning Arborescence Sets
Date: Tuesday, July 25
Time: 2:15 PM - 2:40 PM
Room: North Hall
  • Anthony Coutant, Laboratoire d'Informatique de Paris Nord (LIPN - UMR 7030), France
  • Celine Rouveirol, Laboratoire d'Informatique de Paris Nord (LIPN - UMR 7030), France

We are interested in the problem of generative learning of dynamic models from "fat" time series data (high #variables/#individuals ratio), a setting that makes learned models highly sensitive to dataset noise. To this end, we propose a method that computes a mixture of many highly biased but optimal spanning arborescences obtained from many perturbed versions of the original dataset, introducing variance to counterbalance the strong arborescence bias. The method is at the boundary between structure-oriented Bayesian model averaging and recent work on density estimation using mixtures of poly-trees in a perturb-and-combine framework, transposed to a dynamic setting. In practice, preliminary results on the recent DREAM 8 challenge are promising.

Understanding and predicting drug efficacy in cancer: from machine learning to biochemical models
Date: Tuesday, July 25
Time: 2:40 PM - 3:30 PM
Room: North Hall
  • Julio Saez-Rodriguez, RWTH Aachen University, Germany

Large-scale genomic studies are providing unprecedented insights into the molecular basis of cancer, but it remains challenging to leverage this information for the development and application of therapies. We have performed an integrated analysis of the molecular profiles of over 11,000 primary tumours and 1,000 cancer cell lines, along with the response of the cell lines to 265 anti-cancer compounds. This analysis finds alterations in tumours that can confer drug sensitivity or resistance, and sheds light on which data types are most informative to prioritize treatment. Integration of this data with various sources of prior knowledge, in particular signaling pathways and transcription factors, points at molecular processes involved in resistance mechanisms, and offers hypotheses for novel combination therapies. Our own analysis as well as the results of a crowdsourcing effort (DREAM challenge) reveal that prediction of drug efficacy is far from accurate, implying important limitations for personalised medicine. I will argue that an important aspect that needs to be further studied is the dynamics of signaling networks and how they respond to drug treatment. I will show how applying logic models, trained with phosphoproteomic measurements upon perturbations, can further improve our understanding of the molecular basis of drug resistance, thereby providing new treatment opportunities not noticeable by static molecular characterisation.

Kernelized Rank Learning for Personalized Drug Recommendation
Date: Tuesday, July 25
Time: 3:30 PM - 3:55 PM
Room: North Hall
  • Xiao He, ETH Zurich, Switzerland
  • Lukas Folkman, ETH Zurich, Switzerland
  • Karsten Borgwardt, ETH Zurich, Switzerland

Large-scale screenings of cancer cell lines with detailed genomic profiles against libraries of pharmacological compounds are currently being performed in order to gain a better understanding of the genetic component of drug response and to enhance our ability to predict drug sensitivity from genetic profiles. These screens differ from the clinical setting, in which (1) medical records only contain the response of a patient to very few drugs, and (2) selecting the most promising out of all therapies is more important than accurately predicting the sensitivity to a given drug. Current regression models for drug sensitivity prediction fail to account for these two properties. We present a machine learning approach, named Kernelized Rank Learning (KRL), that ranks drugs based on their predicted effect per patient, circumventing the difficult problem of precisely predicting the sensitivity to a given drug. Our approach outperforms several state-of-the-art predictors in drug recommendation, particularly in a clinically relevant case where little training data is available.
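
The distinction this abstract draws between predicting sensitivities accurately and ranking the best drug first can be made concrete with a small sketch. It uses NDCG@k, a standard ranking metric; the abstract does not state which metric the authors use, and the function name `ndcg_at_k` and all numbers below are invented for illustration.

```python
import numpy as np

def ndcg_at_k(true_scores, predicted_scores, k):
    """Normalised discounted cumulative gain of the top-k predicted items."""
    true_scores = np.asarray(true_scores, dtype=float)
    predicted_scores = np.asarray(predicted_scores, dtype=float)
    # Indices of the k items ranked highest by the predictor.
    top = np.argsort(predicted_scores)[::-1][:k]
    # Best achievable gains: the k largest true scores.
    ideal = np.sort(true_scores)[::-1][:k]
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    dcg = np.sum(true_scores[top] * discounts[:len(top)])
    idcg = np.sum(ideal * discounts[:len(ideal)])
    return float(dcg / idcg) if idcg > 0 else 0.0

def mse(a, b):
    return float(np.mean((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2))

# Hypothetical true sensitivities of one patient to four drugs (higher = better).
true = [0.9, 0.1, 0.5, 0.3]
# Predictor A: badly calibrated values, but the correct ordering.
pred_a = [10.0, 2.0, 6.0, 4.0]
# Predictor B: values close to the truth, but the wrong drug ranked first.
pred_b = [0.6, 0.2, 0.7, 0.3]

print("A: mse =", mse(true, pred_a), "ndcg@1 =", ndcg_at_k(true, pred_a, 1))
print("B: mse =", mse(true, pred_b), "ndcg@1 =", ndcg_at_k(true, pred_b, 1))
```

Predictor A has a much larger squared error than predictor B yet recommends the best drug, which is the situation a ranking objective rewards and a regression objective penalises.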

Ask the doctor - Improving drug sensitivity predictions through active expert knowledge elicitation
Date: Tuesday, July 25
Time: 3:55 PM - 4:20 PM
Room: North Hall
  • Iiris Sundin, Aalto University, Finland
  • Tomi Peltola, Aalto University, Finland
  • Muntasir Mamun Majumder, Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland
  • Pedram Daee, Aalto University, Finland
  • Marta Soare, Aalto University, Finland
  • Homayun Afrabandpey, Aalto University, Finland
  • Caroline Heckman, Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland
  • Samuel Kaski, Aalto University, Finland
  • Pekka Marttinen, Aalto University, Finland

Predicting the efficacy of a drug for a given individual, using high-dimensional genomic measurements, is at the core of precision medicine. However, identifying features on which to base the predictions remains a challenge, especially when the sample size is small. Incorporating expert knowledge offers a promising way to improve a prediction model, but collecting such knowledge is laborious for the expert if the number of candidate features is very large. We introduce a probabilistic model that can incorporate expert feedback about the impact of genomic measurements on the sensitivity of a cancer cell to a given drug. We also present two methods to intelligently collect this feedback from the expert, using experimental design and multi-armed bandit models. In a multiple myeloma blood cancer data set (n=51), expert knowledge decreased the prediction error by 8%. Furthermore, the intelligent approaches can be used to reduce the workload of feedback collection to less than 30% on average, compared to a naive approach.