BioVis COSI

Schedule subject to change
All times listed are in CDT
Wednesday, July 13th
10:30-11:30
Keynote Presentation: Once upon a time in Bio-Medical Data Visualization: Reflections on Research Before and During Pandem...
Room: KOPL
Format: Live from venue

Moderator(s): Michael Krone

  • Tatiana Landesberger


Presentation Overview:

Research in visualization is often motivated by the endeavor to improve the illustration of data: to better communicate data to others and to gain deeper insights into complex datasets, possibly drawn from a variety of data sources. In the medical domain, the data can include, for example, patient data and health records as well as biological data such as genomes. Insights to be obtained from data may relate, inter alia, to the spreading of diseases, evolutionary analysis, and virus mutations. The tasks include both retrospective analysis, for finding patient zero and modelling the spreading of a disease, and predictive modelling of virus mutations and future disease spreading. The COVID-19 pandemic confronted this general motivation for our research with a need for practical solutions. Infection control experts needed to quickly gain insights into novel datasets and to communicate those insights to colleagues and to a broader public, requiring quick and efficient visualization solutions.
New methods, tools, and methodologies have emerged from both basic and applied research. New data was collected, and model results were produced that required rapid analysis. Multi-disciplinary teams applied solutions to the new challenges resulting from the pandemic. The rapid response was only possible by leveraging experience and past research. Thus, the talk will take a larger historical perspective and present specific solutions from our own experience, including reflections on the data, task, and user triangle as well as the challenges of multidisciplinary working styles.

11:30-11:50
Polyphony: an Interactive Transfer Learning Framework for Single-Cell Data Analysis
Room: KOPL
Format: Live from venue

Moderator(s): Michael Krone

  • Furui Cheng, The Hong Kong University of Science and Technology, Hong Kong
  • Mark Keller, Harvard Medical School, United States
  • Huamin Qu, The Hong Kong University of Science and Technology, Hong Kong
  • Nils Gehlenborg, Harvard Medical School, United States
  • Qianwen Wang, Harvard Medical School, United States


Presentation Overview:

Reference-based cell-type annotation can significantly reduce time and effort in single-cell analysis by transferring labels from a previously-annotated dataset to a new dataset. However, label transfer is challenging. End-to-end computational methods can fail due to mixing technical variants (e.g., different sequencing batches or techniques) that must be removed and biological variants (e.g., different cells) that must be conserved among datasets. To address this issue, we propose Polyphony, an interactive transfer learning (ITL) framework, to complement biologists' knowledge with advanced computational methods. Polyphony is motivated and guided by domain experts' needs for a controllable, interactive, and algorithm-assisted annotation process, identified through our multi-round expert interviews with six biologists. We introduce anchors, i.e., analogous cell populations across datasets, as a paradigm to explain the computational process and collect users' feedback for model improvement. A set of visualizations and interactions is provided to empower users to add, delete, or modify anchors, resulting in refined cell type annotations. We demonstrate the effectiveness of this approach through two usage scenarios and interviews with two biologists. The results show that our anchor-based ITL method takes advantage of both human and machine intelligence in annotating massive single-cell datasets.
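To make the idea of reference-based label transfer concrete, here is a minimal sketch in Python. It is not Polyphony's method: it assumes both datasets already share a low-dimensional embedding and simply assigns each query cell the label of the nearest reference centroid. The function names and the toy coordinates are hypothetical.

```python
import math
from collections import defaultdict


def label_centroids(points, labels):
    """Average the embedded coordinates of all reference cells per label."""
    sums = {}
    counts = defaultdict(int)
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
        for i, value in enumerate(point):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [value / counts[label] for value in total]
        for label, total in sums.items()
    }


def transfer_labels(ref_points, ref_labels, query_points):
    """Assign each query cell the label of its nearest reference centroid.

    Assumption: reference and query cells live in the same embedding space,
    i.e. technical (batch) differences have already been removed.
    """
    centroids = label_centroids(ref_points, ref_labels)
    return [
        min(centroids, key=lambda label: math.dist(query, centroids[label]))
        for query in query_points
    ]


# Hypothetical toy data: two annotated populations and two unlabeled cells.
reference = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
annotations = ["T cell", "T cell", "B cell", "B cell"]
queries = [(0.2, 0.4), (5.5, 4.8)]
predicted = transfer_labels(reference, annotations, queries)
```

A fully automatic rule like this has no way to tell when the shared embedding itself is wrong, which is the kind of failure the abstract suggests interactive anchor editing is meant to address.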

11:50-12:00
Data Transformations for Effective Visualization of Single-Cell Embeddings
Room: KOPL
Format: Live from venue

Moderator(s): Michael Krone

  • Evan Greene, Ozette Technologies, United States
  • Greg Finak, Ozette Technologies, United States
  • Fritz Lekschas, Ozette Technologies, United States
  • Malisa Smith, Ozette Technologies, United States
  • Leonard A. D'Amico, Fred Hutchinson Cancer Research Center, United States
  • Nina Bhardwaj, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, United States
  • Candice D. Church, Division of Dermatology, Department of Medicine, University of Washington, United States
  • Chihiro Morishima, Division of Dermatology, Department of Medicine, University of Washington, United States
  • Nirasha Ramchurren, Fred Hutchinson Cancer Research Center, United States
  • Janis M. Taube, Johns Hopkins University School of Medicine, United States
  • Paul T. Nghiem, Division of Dermatology, Department of Medicine, University of Washington, United States
  • Martin A. Cheever, Fred Hutchinson Cancer Research Center, United States
  • Steven P. Fling, Fred Hutchinson Cancer Research Center, United States
  • Raphael Gottardo, University of Lausanne and Lausanne University Hospital, Swiss Institute of Bioinformatics, Switzerland


Presentation Overview:

Nonlinear dimensionality reduction (DR) methods are commonly used to create two-dimensional embeddings of high-dimensional data for visualization. Since the effectiveness of learned embeddings can depend markedly on the choice of the DR method’s hyperparameters, prior work has focused on evaluating hyperparameter settings. However, data transformations can be equally important for creating effective embeddings, yet they have received less attention.
In this talk, we present data transformation approaches for the embedding of single-cell data, specifically surface proteomics. Using computationally derived labels for expression groups (e.g., low, medium, high), we can spread out and normalize the expression range of different cell phenotypes. Visually, this allows for the identification of rare and complex cell types that would otherwise be indistinguishable from broad cell phenotypes. Moreover, such an approach effectively eliminates batch effects, which otherwise cause large differences in the lower-dimensional embedding and make sample-by-sample comparisons ineffective. Finally, we show a data transformation approach that uses simulated data to create a generic embedding into which concrete data is mapped. Such an approach enables relative comparison of cluster expression profiles while still providing a global map for broad cluster similarities.

12:00-12:10
Visualizing Cluster-specific Genes from Single-cell Transcriptomics Data Using Association Plots
Room: KOPL
Format: Live from venue

Moderator(s): Michael Krone

  • Elzbieta Gralinska, Max Planck Institute for Molecular Genetics, Germany
  • Clemens Kohl, Max Planck Institute for Molecular Genetics, Germany
  • Bita Sokhandan Fadakar, Max Planck Institute for Molecular Genetics, Germany
  • Martin Vingron, Max Planck Institute for Molecular Genetics, Germany


Presentation Overview:

Visualizing single-cell transcriptomics data in an informative way is a major challenge in biological data analysis. Clustering of cells is a prominent analysis step and the results are usually visualized in a planar embedding of the cells using methods like PCA, t-SNE, or UMAP. Given a cluster of cells, one frequently searches for the genes highly expressed specifically in that cluster. At this point, visualization is usually replaced by studying a list of differentially expressed genes.

We address this bottleneck by presenting Association Plots (APs) adapted to single-cell data. APs are derived from correspondence analysis, a projection method that embeds both genes and cells in a high-dimensional space, where genes associated with a cell cluster lie in a particular direction. By exploiting this feature, APs constitute a dimension-independent visualization of cluster-specific genes from single-cell datasets. Our method is available as a free Bioconductor package, APL.

We demonstrate the application of APs to single-cell RNA-seq data through several examples. First, we show the identification of marker genes using APs. Second, we present how APs aid in cell cluster annotation using a predefined list of marker genes. Finally, we compare results from APs to results from existing differential expression testing tools.

12:10-12:20
Kana: Interactive Single-Cell Analysis in the Browser
Room: KOPL
Format: Live-stream

Moderator(s): Michael Krone

  • Jayaram Kancherla, Genentech, Inc., United States
  • Hector Corrada Bravo, Genentech, Inc., United States
  • Aaron Lun, Genentech, Inc., United States


Presentation Overview:

We present kana, a web application for interactive scRNA-seq data analysis that runs both visualization and computational analysis in the web browser. Kana uses web technologies such as WebAssembly to efficiently perform the relevant computations on the user’s machine, building on C++ libraries whose analysis steps are reusable in non-visualization or client/server settings. As an added benefit of this client-side approach, user data is never transferred or uploaded to a server, avoiding problems with data privacy. Since computations run in the browser, this also removes network latency, providing a smooth interactive experience. Kana provides a streamlined one-click workflow for all steps in a typical scRNA-seq analysis, starting from a count matrix and finishing with marker detection and cell type annotation. Results are rendered progressively as soon as each underlying analysis step completes and are presented in an intuitive web interface for further exploration and iterative analysis. Testing on public datasets shows that kana can analyze over 100,000 cells within 5 minutes on a typical laptop.

The application is hosted on GitHub: http://github.com/jkanche/kana. The preprint is available at https://www.biorxiv.org/content/10.1101/2022.03.02.482701v1

12:20-12:30
Supervised capacity preserving mapping: a clustering guided visualization method for scRNA-seq data
Room: KOPL
Format: Live from venue

Moderator(s): Michael Krone

  • Zhiqian Zhai, Department of Statistics, University of California Los Angeles, United States
  • Yu L. Lei, Department of Periodontics and Oral Medicine, University of Michigan; University of Michigan Rogel Cancer Center, United States
  • Rongrong Wang, Department of Computational Mathematics, Science and Engineering and Department of Mathematics, MSU, United States
  • Yuying Xie, Department of Computational Mathematics, Science and Engineering and Department of Statistics and Probability, MSU, United States


Presentation Overview:

Recently, various visualization methods have been developed to analyze scRNA-seq data. However, current visualization methods, including UMAP and t-SNE, are limited in how accurately they render the geometric relationships between populations with distinct functional states. Most visualization methods are unsupervised, leaving out information from clustering results or given labels. This leads to an inaccurate depiction of the distances between the bona fide functional states. In particular, UMAP and t-SNE are not optimal for preserving the global geometric structure: clusters that appear close together in the embedding may in fact be far apart in the original space. In addition, UMAP and t-SNE cannot track cluster variance: the embedded cluster variance is not only associated with the true variance but also proportional to the sample size.
We present supCPM, a robust supervised visualization method that uses clustering results to separate different clusters, preserve the global structure, and track the cluster variance. In comparisons with existing methods on synthetic and real datasets, supCPM shows improved performance in preserving the global geometric structure and data variance. Overall, supCPM provides an enhanced visualization pipeline to assist the interpretation of functional transitions and to accurately depict population segregation.

14:30-14:50
Microbiome Maps: Hilbert Curve Visualizations of Metagenomic Profiles
Room: KOPL
Format: Live from venue

Moderator(s): Qianwen Wang

  • Camilo Valdes, Lawrence Livermore National Laboratory, United States
  • Vitalii Stebliankin, Bioinformatics Research Group (BioRG), Florida International University, United States
  • Daniel Ruiz-Perez, Bioinformatics Research Group (BioRG), Florida International University, United States
  • Ji In Park, Department of Medicine, Kangwon National University School of Medicine, South Korea
  • Hajeong Lee, Department of Internal Medicine, Seoul National University College of Medicine, South Korea
  • Giri Narasimhan, Bioinformatics Research Group (BioRG), Florida International University, United States


Presentation Overview:

Abundance profiles from metagenomic sequencing data synthesize information from billions of sequenced reads coming from thousands of microbial genomes. Analyzing and understanding these profiles can be a challenge since the data they represent are complex. Particularly challenging is their visualization, and here we present a technique called a "Microbiome Map", which visualizes a microbiome profile using a Hilbert curve.

The maps are created using the Jasper software, which generates colorful 2D images that succinctly visualize a microbiome sequencing profile. Color and location in a microbiome map play a vital role: each location represents a genome from a reference collection (whole-genome sequencing) or a set of OTUs (16S sequencing), and color can represent its relative abundance. Maps can also be interactively explored using Jasper, which integrates with online resources such as Ensembl, GenBank, and UniProt.
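The placement of an ordered list of genomes along a Hilbert curve can be sketched as follows. This is a minimal illustration in Python using the standard index-to-coordinate conversion for Hilbert curves; it is not Jasper's implementation, and the function names, the genome ordering, and the example abundances are assumptions made for illustration.

```python
def hilbert_d2xy(order, d):
    """Convert position d along a Hilbert curve into (x, y) grid coordinates.

    The curve fills a square grid with 2**order cells per side.
    """
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves connect end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def microbiome_map(abundances, order):
    """Place abundances (in a fixed genome order) on a Hilbert-curve grid.

    Returns a list of rows; grid[y][x] holds the abundance of the genome
    whose index along the curve maps to cell (x, y). Unused cells stay 0.0.
    """
    n = 1 << order
    if len(abundances) > n * n:
        raise ValueError("grid too small for the number of genomes")
    grid = [[0.0] * n for _ in range(n)]
    for index, abundance in enumerate(abundances):
        x, y = hilbert_d2xy(order, index)
        grid[y][x] = abundance
    return grid


# Hypothetical relative abundances for four genomes on a 2 x 2 grid.
example = microbiome_map([0.1, 0.2, 0.3, 0.4], order=1)
```

Because consecutive positions along the curve are always adjacent cells, genomes that are neighbors in the chosen ordering end up next to each other in the image. Mapping the grid values to a color scale then yields the picture.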

We discuss how microbiome maps can be a powerful asset for classification and prediction models by visualizing the strain-level abundances of 44K genomes in 328 samples from the Human Microbiome Project, as well as 5K species in 200 fecal samples from a collaboration with Kangwon National University and Seoul National University in South Korea.

More information can be found at www.microbiomemaps.org.

14:50-15:10
Coral: a web-based visual analysis tool for creating and characterizing cohorts
Room: KOPL
Format: Live-stream

Moderator(s): Qianwen Wang

  • Patrick Adelberger, Institute of Computer Graphics, Johannes Kepler University Linz, Linz, A-4040, Austria
  • Klaus Eckelt, Institute of Computer Graphics, Johannes Kepler University Linz, Linz, A-4040, Austria
  • Markus Johann Bauer, Global Computational Biology and Digital Sciences, Boehringer Ingelheim RCV GmbH & Co KG, Vienna, A-1121, Austria
  • Marc Streit, Institute of Computer Graphics, Johannes Kepler University Linz, Linz, A-4040, Austria
  • Christian Haslinger, Global Computational Biology and Digital Sciences, Boehringer Ingelheim RCV GmbH & Co KG, Vienna, A-1121, Austria
  • Thomas Zichner, Global Computational Biology and Digital Sciences, Boehringer Ingelheim RCV GmbH & Co KG, Vienna, A-1121, Austria


Presentation Overview:

A main task in computational cancer analysis is the identification of patient subgroups (i.e. cohorts) based on a rich collection of metadata attributes (patient stratification) or genomic markers of response (biomarkers). Coral is a web-based cohort analysis tool that is designed to support this task: Users can interactively create and refine multiple cohorts, based on quantitative or categorical attributes, which can then be compared, characterized, and inspected down to the level of single items. The characterization includes the possibility for statistical testing between cohorts and provides intuitive access to prevalence information. Coral visualizes the evolution of cohorts as well as their relationships as a graph. Furthermore, findings can be stored, shared, and reproduced via the integrated session management. Coral is pre-loaded with data from over 128 000 samples from the AACR Project GENIE, The Cancer Genome Atlas, the Cell Line Encyclopedia, and two depletion screen datasets.
To demonstrate the usefulness of Coral, we reproduce findings from a published article about KRASG12C somatic mutations in the AACR Project GENIE patients. We analyze the KRASG12C mutation frequencies for Non-Small Cell Lung Cancer (NSCLC) and colorectal cancer patient cohorts with regard to their differences in race and gender.

15:10-15:20
TCGAnalyzeR: A Web Portal for Visualization of Integrated Analysis Of Subcohorts of Pan-Cancer Patients With Molecular and Clinical Data
Room: KOPL
Format: Live from venue

Moderator(s): Qianwen Wang

  • Başak Abak Masud, Istanbul Medipol University, Turkey
  • Talip Zengin, Muğla Sıtkı Koçman University, Turkey
  • Tuğba Önal-Süzek, Muğla Sıtkı Koçman University, Turkey


Presentation Overview:

The Cancer Genome Atlas (TCGA) contains multidimensional molecular data of 11,000 cancer patients across 33 cancer types. In our work, we aimed to present a visual analysis tool that integrates our recently published gene-signature-based low-risk/high-risk TCGA patient cohorts (Zengin T and Önal-Süzek T., 2020; Zengin T and Önal-Süzek T., 2021), along with curatedTCGA patient cohorts, with the single-nucleotide variations (SNVs), copy number variations (CNVs), RNA-seq, and clinical data of TCGA patients across the 33 cancer types. The tool is available at http://tcganalyzer.mu.edu.tr/

Our interactive Shiny-based web platform TCGAnalyzeR enables statistical analysis of big data in 4 main categories, allowing users to interactively select the cancer type, data category (SNV/CNV/DEA/Clinical), mutation type (somatic or all), risk group (low-risk/high-risk), and cohort type (paired/all). Downloadable plots and data tables are provided to interactively visualize data specific to each category. Each plot has its own filtering options. The gene and patient (sample) names given in the tables and plots are selectable, which enables the user to add a gene or patient to the “My genes” or “My patients” panel, respectively, for filtering other plots and copying the selections to the user's clipboard.

For 3 cancer types (LUAD, LUSC, and COAD), we provide pre-clustered low-risk or high-risk cohorts using our gene signature method. For 15 cancer types, patient cohorts from curatedTCGA are integrated, and for 5 cancer types, we computed an iCluster+ based multi-omic patient clustering and integrated it into the web interface, enabling a comparative visual analysis of user-defined subcohorts.

15:20-15:30
PhyloDiver: A Visual Analytics Tool for Tumor Phylogenies
Room: KOPL
Format: Live from venue

Moderator(s): Qianwen Wang

  • Charles Blatti, University of Illinois at Urbana-Champaign, United States
  • Matthew Berry, University of Illinois at Urbana-Champaign, United States
  • Chad Olson, University of Illinois at Urbana-Champaign, United States
  • Lisa Gatzke, University of Illinois at Urbana-Champaign, United States
  • Chuanyi Zhang, University of Illinois at Urbana-Champaign, United States
  • Peter Groves, University of Illinois at Urbana-Champaign, United States
  • Colleen Bushell, University of Illinois at Urbana-Champaign, United States
  • Nicholas Chia, Mayo Clinic, United States
  • Zeynep Madak-Erdogan, University of Illinois at Urbana-Champaign, United States
  • Mohammed El-Kebir, University of Illinois at Urbana-Champaign, United States


Presentation Overview:

Cancer is the result of an evolutionary process, where somatic mutations accumulate over time in a population of cells. As such, a tumor is composed of multiple subpopulations of cells, or clones, with distinct complements of mutations. This intra-tumor heterogeneity is a major driver for resistance to therapy. Researchers use evolutionary trees, or phylogenies, to study intra-tumor heterogeneity and reason about cancer evolution. While many methods have been developed to visualize and interpret tumor phylogenies, these methods often provide either 1) a static image of clonal evolution that does not accommodate user interaction or 2) tree layout interfaces that do not incorporate clonal proportions and mutation details. Here, we introduce PhyloDiver, a novel visual analytics tool that enables end-users to study clonal evolution in an interactive fashion while remaining connected to the underlying annotated mutations.

16:00-16:20
Visual Exploration of Relationships and Structure in Low-Dimensional Embeddings
Room: KOPL
Format: Live-stream

Moderator(s): Helena Jambor

  • Klaus Eckelt, Johannes Kepler University Linz, Austria
  • Andreas Hinterreiter, Johannes Kepler University Linz, Austria
  • Patrick Adelberger, Johannes Kepler University Linz, Austria
  • Conny Walchshofer, Johannes Kepler University Linz, Austria
  • Vaishali Dhanoa, Johannes Kepler University Linz and Pro2Future GmbH, Austria
  • Christina Humer, Johannes Kepler University Linz, Austria
  • Moritz Heckmann, Johannes Kepler University Linz, Austria
  • Christian Steinparz, Johannes Kepler University Linz, Austria
  • Marc Streit, Johannes Kepler University Linz, Austria


Presentation Overview:

We present an interactive visual approach for the exploration and formation of structural relationships in embeddings of high-dimensional data.
These structural relationships, such as item sequences, associations of items with groups, and hierarchies between groups of items, define properties of many real-world datasets. Nevertheless, most existing methods for the visual exploration of embeddings treat these structures as second-class citizens or do not take them into account at all.

In our proposed analysis workflow, users explore enriched scatterplots of the embedding, in which relationships between items and/or groups are visually highlighted. During their exploratory analysis, users can externalize their insights by setting up additional groups and relationships between items and/or groups, for example by dividing a heterogeneous group of patients into several subgroups.

The original high-dimensional data for single items, groups of items, or differences between items and groups are accessible through additional summary visualizations and difference visualizations that complement the embedding with a detailed look at the high-dimensional attributes.
We carefully tailored these summary and difference visualizations to various data types and semantic contexts.

We implemented the approach as a web application, which is open-source and publicly available at https://jku-vds-lab.at/apps/embedding-structure-explorer.

16:20-16:24
Plaice plots - an allele-aware visualization of clonal evolution
Room: KOPL
Format: Live-stream

Moderator(s): Helena Jambor

  • Sarah Sandmann, Institute of Medical Informatics, University of Münster, Germany
  • Yvonne Lisa Behrens, Department of Human Genetics, Hannover Medical School, Germany
  • Gudrun Göhring, Department of Human Genetics, Hannover Medical School, Germany
  • Julian Varghese, Institute of Medical Informatics, University of Münster, Germany


Presentation Overview:

Reconstruction of clonal evolution involves complex integrated analyses. The results are, in addition to classical representation by phylogenetic or clonal evolution trees, commonly visualized using fish plots. In these plots, the development of every individual clone is displayed, considering time on the x-axis, and cancer cell fraction on the y-axis. Thereby, fish-shaped objects are generated.

Despite providing a comprehensive visualization of clonal evolution, fish plots display information only at the clone level, not at the allele level. Biallelic mutations cannot be identified at first sight. However, with respect to disease progression, these mutations play an essential role. To fill this gap, we introduce plaice plots as a derivative of fish plots. The actual 'fish' become flatfish, i.e. plaice, and are mirrored - above and below the y-axis. The upper plot visualizes common clonal development, while the lower plot shows the fraction of remaining healthy alleles. For example, in the case of mutated TP53 and an additional del17p affecting the remaining healthy allele, the fraction of cells with deficient TP53 is marked in the lower plot. Similarly, X-chromosomal mutations in male samples, leading to a loss of the only available healthy allele, are visualized. Thereby, plaice plots allow for immediate identification of double-hit events.

16:24-16:27
An R Shiny app for systematically integrating genetic and pharmacologic cancer dependency maps
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Tapsya Nayak, Greehey Children's Cancer Research Institute, University of Texas Health San Antonio, United States
  • Li-Ju Wang, Greehey Children's Cancer Research Institute, University of Texas Health San Antonio, United States
  • Michael Ning, Department of Computer Science, University of Texas at Austin, United States
  • Yu-Chiao Chiu, UPMC Hillman Cancer Center, University of Pittsburgh, United States
  • Yidong Chen, Greehey Children's Cancer Research Institute, University of Texas Health San Antonio, United States


Presentation Overview:

The rapidly growing cancer dependency maps pave the way to precision oncology by identifying and targeting the “Achilles’ heel” of cancer. There is a pressing need for software that systematically links such genetic (gene knockouts) and pharmacologic dependencies (small compounds). Here we present a web-based R Shiny app that incorporates heterogeneous data from large-scale high-throughput CRISPR screens, pharmacologic screens, and a molecular signatures library, jointly covering 17k genes, 20k drugs, and 1k cell lines. The major goal is to match gene knockouts and drug treatments that induce similar effects on cell viability and/or gene expression perturbation in order to address two fundamental questions: 1) which drugs can be potential surrogates for the knockout of a gene, and 2) which genes are potential targets or mechanisms of action of a drug. The app has four complementary and interconnected modules that address various query scenarios to identify potential druggable genetic vulnerabilities and understand the mechanisms of action of a known or new drug. The results are represented by interactive figures and networks, as well as annotated data tables. In summary, our Shiny app enables easy and systematic navigation, visualization, and integration of the rapidly evolving genetic and pharmacologic dependency maps of cancer.
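One simple way to match a gene knockout with drugs that induce similar viability effects is to correlate their profiles across the same cell lines. The sketch below is in Python and uses Pearson correlation. The abstract does not state which similarity measure the app uses, so the measure, the function names, and the toy values are all assumptions for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long numeric profiles."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_a * var_b)


def rank_drugs(knockout_profile, drug_profiles):
    """Rank drugs by how closely their effects track a gene knockout.

    knockout_profile: viability effect of the knockout, one value per cell line.
    drug_profiles: mapping of drug name to its effect on the same cell lines.
    Returns (drug, correlation) pairs, most similar first.
    """
    scores = {
        drug: pearson(knockout_profile, profile)
        for drug, profile in drug_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical effects measured on four cell lines.
knockout = [1.0, 2.0, 3.0, 4.0]
drugs = {"drug_a": [2.0, 4.0, 6.0, 8.0], "drug_b": [4.0, 3.0, 2.0, 1.0]}
ranking = rank_drugs(knockout, drugs)
```

Swapping the roles, with one drug profile compared against many knockout profiles, gives a candidate list for the second question: which genes might be targets of a drug.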

16:27-16:31
INVESTIGATION AND IDENTIFICATION OF ESSENTIAL FACTORS FOR VISUALISATION TOOLS FOR COMPLEX BIOLOGICAL NETWORKS
Room: KOPL
Moderator(s): Helena Jambor

  • Hanin Alzahrani, Newcastle University, United Kingdom
  • Dr. Sara Fernstad, Newcastle University, United Kingdom


Presentation Overview:

Networks have become a ubiquitous research focus of the biological and biomedical research fields. Complex phenotypes, such as disease vulnerability, result both from single-gene mutations that act in isolation and from the perturbation of a gene’s network context. Interactive visualisation can support interpretation and understanding of the complexities inherent in such biological network data. The research presented here aims to investigate and review existing network visualisation tools in terms of their ability to support human cognition and data exploration, and through this provide guidance for next steps to improve such visualisation.
The effectiveness of the visualisation tools was measured using 25 factors, which were identified from the literature using the systematic review method. Additionally, primary data was gathered using interviews and surveys to capture the data analysts’ experiences, expectations, and opinions about the visualisation tools. Such a mixed-methods approach enables the researcher to juxtapose results from different angles for an accurate conclusion.
The results show that out of all the visualisation factors considered in the research, only “Advanced search” has emerged as a non-essential factor to be included in visualisation applications for complex networks. However, features would reveal more profound insight into the essential factors for visualisation application for complex networks.

16:31-16:34
ECellDive: Exploring Biological Systems in Virtual Reality
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Eliott Jacopin, RIKEN, Center for Biosystems Dynamics Research, Japan
  • Kozo Nishida, Genome Analytics Japan Inc., Tokyo, Japan
  • Kazunari Kaizu, RIKEN, Center for Biosystems Dynamics Research, Japan
  • Koichi Takahashi, RIKEN, Center for Biosystems Dynamics Research, Japan


Presentation Overview:

ECellDive is a virtual environment where users can model, simulate and visualize biological systems in collaboration with their colleagues. In ECellDive, everything is a module representing either data (e.g. a metabolic pathway) or any transform on this data (e.g. a Flux Balance Analysis).
For demonstration purposes, we import the Escher-FBA model into our virtual scene (Zachary A. King et al. 2017, doi:10.1371/journal.pcbi.1004321) and dive into it. Diving transfers us to a new scene containing the metabolic pathway encoded in Escher-FBA. From there, we explore the pathway by strolling around. This is a major improvement over the original web app, where we have to zoom in/out or pan to explore the model. Then, we highlight the structure of the network by grouping modules together automatically or manually. This is particularly effective for contextualizing the model by, for example, visualizing cellular compartments and metabolic subsystems. Next, we perform a Flux Balance Analysis (FBA) of the pathway and update the simulation results by knocking out or activating reactions of interest. Finally, ECellDive is about collaboration: any changes can be exported and shared, and we can also join a session hosted by someone else in real time to modify the same file.

16:34-16:38
VenOmics and Cell Signaling Environment for Studies and BioDiscoveries
Room: KOPL
Format: Live-stream

Moderator(s): Helena Jambor

  • Marcela Ishihara, Programa de Pós Graduação em Toxinologia, Laboratório de Toxinologia Aplicada, Instituto Butantan, Brazil
  • Bruno Ferreira de Souza, Laboratório de Toxinologia Aplicada, Instituto Butantan, Brazil
  • Henrique Cursino Vieira, Laboratório de Toxinologia Aplicada, Instituto Butantan, Brazil
  • Hugo Aguirre Armelin, Laboratório de Ciclo Celular, Instituto Butantan, Brazil
  • Marcelo Silva Reis, Departamento de Ciência da Computação, UNICAMP, Brazil
  • Milton Yutaka Nishiyama-Jr, Laboratório de Toxinologia Aplicada, Instituto Butantan, Brazil


Presentation Overview:

Animal venoms have fascinated humanity for a long time, mainly due to their complex actions and effects. Today, these substances still intrigue humans and represent one of the main drivers for the discovery of novel natural drugs with potential therapeutic, medicinal, and agricultural properties. Venoms vary widely, and their biotechnological relevance is mostly attributed to their complex composition, which comprises a plethora of peptides, enzymes, and other molecular compounds. Owing to the importance of venoms, a new field has emerged: Venomics, which combines high-throughput data from different biological levels with molecular and computational techniques. A better understanding of these substances can aid the generation of more effective antivenoms and the discovery of new biomolecules. Here, we present the VenOmics and Cell Signaling Environment for BioDiscoveries (VEnOmiCS4BD), a novel web-based public database, currently in development, for the storage and integration of multi-level -omics venom data, such as transcriptomics and proteomics, derived from venomous and envenomated organisms, as well as a platform for integrative analysis that allows data exploration of gene expression profiles, crossing experiments, signaling pathways, and knowledge discovery. With VEnOmiCS4BD, we hope to facilitate Venomics research by serving as a common place for the deposition and downstream analysis of heterogeneous biological data.

16:38-16:41
SciViewer- An interactive browser for visualizing large single cell datasets
Room: KOPL
Format: Live-stream

Moderator(s): Helena Jambor

  • Dhawal Jain, Pulmonary Drug Discovery Laboratory, Bayer US LLC. Pharmaceuticals, Research & Development, Boston, MA, United States
  • Sikander Hayat, Institute of Experimental Medicine and Systems Biology, RWTH Aachen, Germany, Germany
  • Michael Cho, Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA, United States
  • Edwin Silverman, Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA, United States
  • Rafael Kramann, Institute of Experimental Medicine and Systems Biology, RWTH Aachen, Germany, Germany
  • Alexis Laux-Biehlmann, Pulmonary Drug Discovery Laboratory, Bayer US LLC. Pharmaceuticals, Research & Development, Boston, MA, United States
  • Joydeep Chakraborty, Product Platform Research, Enterprise Platforms and Infrastructure, Bayer US LLC., Morristown, NJ, United States
  • Xinkai Li, Data Integration and Historians Services, Enterprise Platforms and Infrastructure, Bayer US LLC., Morristown, NJ, United States
  • Hobart Moore, Infrastructure Engineering Services, Enterprise Platforms and Infrastructure, Bayer US LLC., Morristown, NJ, United States
  • Pooja Srinivasa, Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA, United States


Presentation Overview:

Single-cell sequencing improves our ability to understand biological systems at single-cell resolution and can be used to identify novel drug targets and optimal cell-types for target validation. However, tools that can interactively visualize and provide target-centric views of these large datasets are limited. We present SciViewer (Single-cell Interactive Viewer), a novel tool to interactively visualize, annotate and share single-cell datasets. SciViewer allows visualization of cluster, gene and pathway level information such as clustering annotation, differential expression, pathway enrichment, cell-type specificity, cellular composition, normalized gene expression and comparison across datasets. Further, we provide APIs for SciViewer to interact with publicly available pharmacogenomics databases for systematic evaluation of potential novel drug targets. We provide a module for non-programmatic upload of single-cell datasets. SciViewer will be a useful tool for data exploration and target discovery from single-cell datasets. It is available on GitHub (https://github.com/Dhawal-Jain/SciViewer).

16:41-16:45
Gos: a declarative (epi)genomics visualization library for Python
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Trevor Manz, Harvard Medical School, United States
  • Sehi L'Yi, Harvard Medical School, United States
  • Nils Gehlenborg, Harvard Medical School, United States


Presentation Overview:

Existing genomic visualization tools are tailored towards specific tasks and as such are limited in expressiveness. The Gosling visualization grammar defines a set of primitives that specify how genomic datasets can be transformed and mapped to visual properties, providing building-blocks to compose unique scalable and interactive genomic data visualizations. Gosling visualizations are defined via JSON, however, which can be tedious and error-prone to edit manually – especially for complex specifications containing many layered and repeated elements. Additionally, genomic datasets defined by the Gosling grammar are expected to be accessible via HTTP, which poses challenges for users since a simple web-server and/or HiGlass server must be configured separately to view local data. Here we present Gos – a Python library which includes an API designed for computational biologists to quickly compose Gosling visualizations. Gos allows the use of familiar language features (variables, functions, for-loops, etc.) to author validated Gosling specifications (JSON) and additionally implements data-loading utilities to transparently load local data into visualizations, abstracting away the complexity of configuring custom web-servers. Gos is designed for interactive analysis within a computational notebook environment and integrates into Jupyter Notebook, JupyterLab, and Google Colab.
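
The point about authoring JSON with ordinary language features can be illustrated with a minimal sketch. It uses only the Python standard library and does not call the Gos API. The field names follow our reading of the Gosling grammar and are not validated here, and the sample names and URLs are hypothetical.

```python
import json


def bar_track(sample, url, height=40):
    """Build one bar-chart track for a single sample as a plain dict."""
    return {
        "title": sample,
        # Assumed Gosling-style data block; not validated against the schema.
        "data": {"type": "bigwig", "url": url, "column": "position", "value": "peak"},
        "mark": "bar",
        "x": {"field": "position", "type": "genomic"},
        "y": {"field": "peak", "type": "quantitative"},
        "width": 600,
        "height": height,
    }


# Hypothetical sample names and file URLs, for illustration only.
samples = {
    "control": "https://example.org/control.bw",
    "treated_6h": "https://example.org/treated_6h.bw",
    "treated_24h": "https://example.org/treated_24h.bw",
}

# One comprehension replaces three hand-copied, nearly identical JSON blocks.
spec = {
    "title": "Signal per sample",
    "tracks": [bar_track(name, url) for name, url in samples.items()],
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Adding a fourth sample means adding one dictionary entry rather than copying a JSON block. As the abstract describes it, Gos adds validation of the specification and local data loading on top of this idea.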

16:45-16:48
CoSIA: An R Package that Measures and Visualizes Transcriptome Diversity across Model Organisms and Their Tissues
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Anisha Haldar, Heersink School of Medicine, The University of Alabama at Birmingham, United States
  • Elizabeth J. Ramsey, Heersink School of Medicine, The University of Alabama at Birmingham, United States
  • Vishal H. Oza, Heersink School of Medicine, The University of Alabama at Birmingham, United States
  • Brittany N. Lasseigne, Heersink School of Medicine, The University of Alabama at Birmingham, United States


Presentation Overview:

Studying patient variants in model organisms is an active area of research. The key challenge is determining an ideal model organism for modeling and studying the patient variant phenotype. This task requires collaboration between a diverse group of experts and involves complex evaluations across multiple metrics like sequence alignment, human protein, and gene expression. Though there are many challenges in comparing the expression variation of a gene-associated variant, the advent of new databases with preprocessed expression data across species and tissues has prompted the exploration of transcriptome diversity, aiding scientists in selecting a suitable model organism for phenotypic studies. We are developing CoSIA (Cross-Species Investigation and Analysis), an R package that provides researchers with multiple metrics for choosing the most suitable model organism for study by measuring and visualizing a diverse group of gene expression-based metrics. CoSIA uses curated non-diseased wild-type RNA-sequencing expression data from Bgee to visualize a gene’s expression across tissues and model organisms. Additionally, CoSIA provides functions to measure and visualize transcriptome diversity for a gene using median-based coefficients of variation and Shannon entropy calculations. Thus, CoSIA provides researchers with tools to visualize the variation in a gene’s expression profile to determine a suitable model organism.
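
The two diversity measures named above can be sketched in a few lines. This is an illustrative Python sketch, not CoSIA's R interface. It assumes one common median-based definition of the coefficient of variation (median absolute deviation divided by the median) and the base-2 Shannon entropy of the expression shares across tissues. CoSIA's exact formulas may differ, and the expression values are made up.

```python
import math
from statistics import median


def median_cv(values):
    """Median-based coefficient of variation: MAD / median (assumed definition)."""
    med = median(values)
    if med == 0:
        raise ValueError("median is zero, so the ratio is undefined")
    mad = median([abs(v - med) for v in values])
    return mad / med


def shannon_entropy(values):
    """Shannon entropy (bits) of the share of total expression per tissue."""
    total = sum(values)
    if total <= 0:
        raise ValueError("total expression must be positive")
    shares = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in shares)


# Hypothetical expression of one gene across four tissues in two species.
expression = {
    "species_a": [12.0, 11.5, 13.0, 12.5],  # broadly expressed
    "species_b": [40.0, 0.5, 0.2, 0.3],     # concentrated in one tissue
}

for species, values in expression.items():
    print(species, round(median_cv(values), 3), round(shannon_entropy(values), 3))
```

Entropy is highest when expression is spread evenly across tissues and drops towards zero when a single tissue dominates, which is what makes it usable as a tissue-specificity measure.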

16:48-16:52
Interactive Exploration of Tissues and Cells Guided by Visual Pattern Mining
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Qianwen Wang, Harvard University, United States
  • Nils Gehlenborg, Harvard University, United States


Presentation Overview:

Visual patterns of tissues and cells in microscopy images can reveal valuable insights for understanding the human body and treating diseases (e.g., histopathology). Recent advances in spatial omics enable the analysis of tissues at the cellular level and have led to an explosion of research interest. However, current studies rarely discuss visual patterns, partly because the generated multiplexed images, which can have more than 40 channels, are difficult for humans to interpret.

To address this research gap, this study proposes a visual analytics approach that facilitates the visual exploration of tissues and cells through visual pattern mining. Specifically, the proposed method consists of a backend data module and a frontend visualization module. The backend module employs a beta-VAE model and extracts visual patterns by simultaneously considering all channels of the multiplexed images. The frontend module supports users in arranging and grouping items (e.g., cell thumbnails, tissue patches) based on the identified visual patterns. Users can examine the distribution of certain visual patterns and associate the items' visual patterns with their spatial contexts and other types of biological information. A preliminary case study on breast cancer demonstrates the effectiveness of our proposed approach.
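
For readers unfamiliar with the beta-VAE mentioned above, the sketch below shows the general form of its training objective: a reconstruction term plus a KL term weighted by beta. The abstract does not state the loss, the value of beta, or the network architecture, so the squared-error reconstruction term, the default beta and the toy inputs are all assumptions. The input is treated as a flattened list of pixel values across all channels.

```python
import math


def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Reconstruction error plus beta-weighted KL divergence to N(0, I).

    x, x_recon: flattened pixel values over all image channels.
    mu, logvar: mean and log-variance of the diagonal Gaussian posterior.
    """
    # Assumed reconstruction term: sum of squared errors.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    # Closed-form KL between N(mu, exp(logvar)) and the standard normal.
    kl = 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))
    return recon + beta * kl


# Toy example: a 2-pixel, 3-channel patch flattened to 6 values, 2 latent dims.
patch = [0.1, 0.4, 0.3, 0.9, 0.2, 0.5]
reconstruction = [0.1, 0.5, 0.3, 0.8, 0.2, 0.5]
loss = beta_vae_loss(patch, reconstruction, mu=[0.2, -0.1], logvar=[0.0, 0.1])
print(loss)
```

Setting beta above 1 puts more weight on the KL term, which is generally used to encourage each latent dimension to capture a separate factor of variation. Separate latent dimensions of this kind are what a frontend could expose as groupable visual patterns.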

16:52-16:55
Effective visualisation of the tumour microenvironment using glyph-based approaches
Room: KOPL
Format: Live-stream

Moderator(s): Helena Jambor

  • Heba Sailem, University of Oxford, United Kingdom


Presentation Overview:

Visualisation of cancer tissues is important for diagnosis and for identifying driving pathological processes and potential biomarkers. Existing visualisation methods do not represent different tissue components and the tumour microenvironment intuitively and are therefore difficult for pathologists to interpret. Previously, we developed ShapoGraphy (www.shapography.com), a user-friendly web app for the interactive creation of new glyph-based representations. Here we use ShapoGraphy to develop semantically relevant representations of multiplexed tissue image data that facilitate pathological assessment and pattern discovery of tumour microenvironment phenotypes.

We will present the development of our representation and demonstrate its utility using several datasets measuring protein activities in stromal, immune and cancer cells. We will also present our exploration of various glyph design choices that use different shapes and marks to represent different tissue compartments and tumour heterogeneity. To determine the effectiveness of our approach, we reviewed our designs with pathologists and biologists. We found that a representation using compactly arranged hexagons that encode variables through colour and symbols was preferred. Finally, we will discuss general guidelines for producing effective glyph-based representations. In summary, our approach addresses the limitations of other visualisation approaches and provides a flexible way of summarising tissue image data.

16:55-16:58
Using Mapper to Reveal Morphological Relationships in Passiflora Leaves
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Sarah Percival, Michigan State University, United States


Presentation Overview:

As collections of data grow in size, it is increasingly important to have efficient means of analyzing large data sets. Topological data analysis (TDA) uses concepts from the mathematical field of topology not only to examine large data sets efficiently, but also to make inferences related to the "shape" of data. In this project, we use Mapper, a tool from TDA that summarizes data into a graph, to discover an underlying structure relating the shapes of more than 3,300 Passiflora leaves from 40 different species. As the Mapper graph has a structure, or "shape", of its own, we think of it as a "shape of shapes" that provides information on the interplay between the developmental processes determining leaf shape within a single plant and the evolutionary processes between species. In particular, we examine the interactions between leaf species and both leaf age and leaf area by constructing a Mapper graph for each measure. For each node in the resulting graphs, we then compute the average leaf shape to obtain a graph structure that reveals how morphometric differences between species relate to the developmental changes that must occur for those shapes to be realized.
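
The Mapper construction described above can be sketched in plain Python. This is a generic, minimal version under stated assumptions, not the presenter's pipeline. The filter value stands in for a measure such as leaf age or area, clustering is done by linking points closer than a threshold `eps`, and the toy points stand in for leaf shape descriptors. All names and parameter values are hypothetical.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Equal-length intervals covering [lo, hi]; neighbours overlap by a fraction."""
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    cover[-1] = (cover[-1][0], hi)  # guard against floating-point shortfall
    return cover


def clusters(indices, points, eps):
    """Connected components of the graph linking points at distance <= eps."""
    remaining, found = set(indices), []
    while remaining:
        seed = remaining.pop()
        component, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            component |= near
            stack.extend(near)
        found.append(frozenset(component))
    return found


def mapper(points, filter_values, n_intervals=3, overlap=0.5, eps=1.0):
    """Nodes are clusters within each interval; edges join clusters sharing points."""
    nodes = []
    for a, b in interval_cover(min(filter_values), max(filter_values), n_intervals, overlap):
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(clusters(members, points, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2) if nodes[u] & nodes[v]}
    return nodes, edges


def node_mean(node, points):
    """Average descriptor of the points in a node (the 'average shape')."""
    dim = len(points[0])
    return tuple(sum(points[i][d] for i in node) / len(node) for d in range(dim))


# Toy data: ten 2-D descriptors along a line; the filter is the first coordinate.
points = [(float(i), 0.0) for i in range(10)]
filter_values = [p[0] for p in points]
nodes, edges = mapper(points, filter_values)
print(len(nodes), sorted(edges), node_mean(nodes[0], points))
```

On this toy input the graph is a simple chain, reflecting the line the points lie on. Branches or loops in a Mapper graph of real shape descriptors would indicate structure that a single summary statistic would miss.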

17:00-18:00
Keynote Presentation: Machine learning provides a new perspective on protein modification
Room: KOPL
Format: Live from venue

Moderator(s): Helena Jambor

  • Lennart Martens, VIB and Ghent University, Belgium


Presentation Overview:

Over the last two decades, mass spectrometry based proteomics has evolved quite dramatically, levera...