Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

BioVis COSI

Attention Presenters - please review the Speaker Information Page available here
Schedule subject to change
All times listed are in UTC
Monday, July 26th
11:00-11:10
BioVis Opening
Format: Live-stream

11:10-12:00
BioVis Keynote: Inter-disciplinary practices in BioVis
Format: Live-stream

  • Seán O’Donoghue, Garvan Institute of Medical Research, Australia

Presentation Overview:

BioVis involves a diverse community; while many of us are computer scientists, bioinformaticians, or bench biologists, our field also engages designers and illustrators, and our outcomes are used by students, educators, and science communicators. In this talk, I will highlight work practices used in inter-disciplinary collaborations to address key BioVis challenges. In addition to drawing on historical exemplars, I will showcase how I have used these practices in recent collaborative work on (a) human genetic variations, (b) multi-omics time-series studies, (c) the SARS-CoV-2 proteome, and (d) exploring protein structures with extended reality. I will also discuss how these practices influence the emergence of BioVis as a recognized discipline.

12:00-12:20
BioVis 10th Anniversary - Test of Time Award Ceremony
Format: Live-stream

12:40-13:00
Proceedings Presentation: OncoThreads: Visualization of Large Scale Longitudinal Cancer Molecular Data
Format: Pre-recorded with live Q&A

  • Theresa Anisja Harbig, Harvard Medical School, United States
  • Sabrina Nusrat, Harvard Medical School, United States
  • Tali Mazor, Dana-Farber Cancer Institute, United States
  • Qianwen Wang, Harvard Medical School, United States
  • Alexander Thomson, Novartis Institutes for BioMedical Research, United States
  • Hans Bitter, Novartis Institutes for BioMedical Research, United States
  • Ethan Cerami, Dana-Farber Cancer Institute, United States
  • Nils Gehlenborg, Harvard Medical School, United States

Presentation Overview:

Motivation: Molecular profiling of patient tumors and liquid biopsies over time with next-generation sequencing technologies and new immuno-profile assays is becoming part of standard research and clinical practice. With the wealth of new longitudinal data, there is a critical need for visualizations for cancer researchers to explore and interpret temporal patterns not just in a single patient but across cohorts.
Results: To address this need we developed OncoThreads, a tool for the visualization of longitudinal clinical and cancer genomics and other molecular data in patient cohorts. The tool visualizes patient cohorts as temporal heatmaps and Sankey diagrams that support the interactive exploration and ranking of a wide range of clinical and molecular features. This allows analysts to discover temporal patterns in longitudinal data, such as the impact of mutations on response to a treatment, e.g. emergence of resistant clones. We demonstrate the functionality of OncoThreads using a cohort of 23 glioma patients sampled at 2-4 timepoints.
Availability and Implementation: Freely available at http://oncothreads.gehlenborglab.org. Implemented in JavaScript using the cBioPortal web API as a backend.
Contact: nils@hms.harvard.edu
Supplementary Material: Supplementary figures and video.

13:00-13:10
Visualisation of Identical-By-State regions across multiple assembled genomes.
Format: Pre-recorded with live Q&A

  • Ricardo Humberto Ramirez Gonzalez, John Innes Centre, United Kingdom
  • Jemima Brinton, Royal Botanical Gardens, United Kingdom
  • Cristobal Uauy, John Innes Centre, United Kingdom

Presentation Overview:

Multiple genome assemblies from the same species are a powerful tool to explore diversity and the haplotypes that were highly selected during domestication and improvement. This is of particular importance in crop species such as wheat (Triticum aestivum), where regions of agricultural value have been purposely selected, foreign sequence has been introduced through hybridizations with related species, and intensive breeding has reduced genetic diversity. We developed a haplotype-based approach to identify genetic diversity for crop improvement using genome assemblies from 15 bread wheat cultivars.
We developed a relational database and a dynamic visualisation to explore the identical-by-state (IBS) regions in their genomic context. We cover both pseudomolecule-level and scaffold-level assemblies. We use the genes projected from the main reference (cultivar Chinese Spring) to anchor the position of scaffolds. We use the same projections to efficiently produce an approximation of the relative coordinate systems between assemblies. Users can identify which IBS regions are shared among cultivars at a specific position or from the point of view of a cultivar.
Our visualisation provides intuitive tools to explore pangenomes in the context of crop breeding, but the principle can be adapted for other organisms.

13:10-13:20
Visualization of SARS-CoV-2 Genome Atlas
Format: Pre-recorded with live Q&A

  • Aditya Rao, TCS Research, India
  • Thomas Joseph, TCS Research, India
  • Vangala Govindakrishnan Saipradeep, TCS Research, India
  • Naveen Sivadasan, TCS Research, India
  • Rajgopal Srinivasan, TCS Research, India
  • Kavya Vaddadi, TCS Research, India
  • Naina Tiwari, TCS Research, India

Presentation Overview:

Massive growth in the number of publicly available SARS-CoV-2 genome sequences and the large number of mutations in these sequences make large-scale data analytics, including intuitive visual analytics tools, crucial in COVID-19 research. Analysis of viral genomes often involves MSA-based and phylogeny-based analysis of the sequences and their relatedness. Visual analytics tools often use the outcome of the MSA and phylogeny studies for visual exploration of the data. However, these methods are limited by difficulty in scaling to large data and by expensive computations. In this work, we present a fast and inexpensive visualization approach that efficiently extracts variant-level features of the strains and computes a bag-of-variants embedding for visualization, which considers similar variant patterns in the strains. Our approach can easily scale to a large collection of strains, incorporate additional strains efficiently, and support visualizations based on selective genomic regions of interest. It provides an intuitive 2D representation of the data with meaningful clusters, while also capturing the temporal and clade-level evolution of the data in an informative fashion. Our visualization approach can complement the standard approaches for studying large-scale and diverse genome data.
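
The abstract does not spell out how the bag-of-variants features are built, so the following is only a minimal sketch of one plausible reading, by analogy with bag-of-words: each strain becomes a binary vector over a shared variant vocabulary, and strains with similar variant patterns end up with similar vectors. The function names, the Jaccard similarity, and the toy strain data are all assumptions for illustration, not the authors' method; a 2D layout would still require a separate dimensionality reduction step.

```python
def bag_of_variants(strains):
    """Map each strain's variant set to a binary vector over a shared vocabulary.

    `strains` is a dict of strain name -> set of variant labels (toy data here).
    """
    vocabulary = sorted({v for variants in strains.values() for v in variants})
    index = {v: i for i, v in enumerate(vocabulary)}
    vectors = {}
    for name, variants in strains.items():
        row = [0] * len(vocabulary)
        for v in variants:
            row[index[v]] = 1
        vectors[name] = row
    return vocabulary, vectors


def jaccard(a, b):
    """Jaccard similarity of two binary vectors (an assumed similarity measure)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return shared / union if union else 1.0


# Illustrative input only: strain names and variant sets are made up.
strains = {
    "strain_A": {"C241T", "A23403G"},
    "strain_B": {"C241T", "A23403G", "G28881A"},
    "strain_C": {"G11083T"},
}
vocabulary, vectors = bag_of_variants(strains)
print(vectors["strain_B"])                                 # [1, 1, 0, 1]
print(jaccard(vectors["strain_A"], vectors["strain_C"]))   # 0.0
```

Because each strain is vectorized independently of the others, adding a strain only extends the vocabulary, which is consistent with the abstract's claim that new strains can be incorporated without recomputing an alignment or phylogeny.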

13:20-13:30
A look at trails through the pangenome visualization jungle
Format: Pre-recorded with live Q&A

  • Éloi Durant, Institut de Recherche pour le Développement, France
  • Francois Sabot, IRD, France
  • Matthieu Conte, Syngenta Seeds, France
  • Mathieu Rouard, Bioversity, France

Presentation Overview:

Pangenomes are complex and malleable entities, listing common and unique genomic content within a group of genomes. As repertoires of present and absent genes they apply well to bacteria which have almost no ‘wasted’ genomic material. As inventories of all available sequences instead they might be better for more complex genomes (human, plants…) whose intergenic spaces and structural variations have multiple effects on said genes and their expression. Added to this is a diversity of inner properties (most represented parts, ‘openness’…) and usages (as references, genome storage …).
Visualizing pangenomes inherits this complexity, with additional challenges of data-to-available-space ratio and understandability, among others. Earlier representations showed pangenomes as genes shared between sets in Venn diagrams, or as presence/absence matrices of genes, with scalability issues or no support for a sequence-centric definition of pangenomes. Recent efforts describe pangenomes as graphs of sequences, with genomes as paths within them. While faithful to the underlying sequences, they are not easily readable for humans, especially when involving a large amount of data.
In exploring possible visualizations, we worked on linearized representations to enhance readability and explorability. We created Panache, our web-based viewer for browsing through linearized pangenomes.

13:30-13:50
Grammar-Based Interactive Visualization of Genomics Data
Format: Pre-recorded with live Q&A

  • Sehi L'Yi, Harvard Medical School, United States
  • Qianwen Wang, Harvard Medical School, United States
  • Fritz Lekschas, Harvard University, United States
  • Nils Gehlenborg, Harvard Medical School, United States

Presentation Overview:

The combination of diverse data types and analysis tasks in genomics has resulted in the development of a wide range of visualization techniques and tools. However, most existing tools are tailored to a specific problem or data type and offer limited customization, making it challenging to optimize visualizations for new analysis tasks or datasets.

To address this challenge, we designed Gosling, a grammar for interactive and scalable genomics data visualization. Gosling balances expressiveness for comprehensive multi-scale genomics data visualizations with accessibility for domain scientists. For example, Gosling allows users to create complex glyph representations (e.g., gene annotations, lollipop plots, and ideograms), use both linear and circular layouts (i.e., using Cartesian and polar coordinates), and link views and interactive brushes for synchronous visual explorations. Our accompanying JavaScript toolkit called Gosling.js provides scalable and interactive rendering. Gosling.js is built on top of an existing platform for web-based genomics data visualization to further simplify the visualization of common genomics data formats.

We re-implemented a variety of real-world examples to demonstrate the expressiveness of the grammar. Furthermore, we show how Gosling supports the design of novel genomics visualizations. An online editor and examples of Gosling.js and its source code are available at https://gosling.js.org.

13:50-14:00
OmicsTIDE: Interactive Exploration of Trends in Multi-Omics Data
Format: Pre-recorded with live Q&A

  • Theresa Anisja Harbig, University of Tuebingen, Institute for Bioinformatics and Medical Informatics, Germany
  • Julian Fratte, University of Tuebingen, Institute for Bioinformatics and Medical Informatics, Germany
  • Michael Krone, University of Tuebingen, Institute for Bioinformatics and Medical Informatics, Germany
  • Kay Nieselt, University of Tuebingen, Institute for Bioinformatics and Medical Informatics, Germany

Presentation Overview:

The increasing amount of data produced by omics technologies has significantly improved the understanding of how biological information is transferred across different omics layers. Besides data-driven analysis strategies, interactive visualization tools have been developed to make the analysis in the multi-omics field more transparent. However, most state-of-the-art tools do not reconstruct the impact of a given omics layer on the final integration result.

To identify the requirements for a tool addressing this issue, we classified omics data focusing on different aspects of multi-omics data sets, such as data type and experimental design. Based on this classification we developed the Omics Trend-comparing Interactive Data Explorer (OmicsTIDE), an interactive visualization tool. The tool consists of an automated part that clusters transcriptomics and proteomics data to determine trends and an interactive visualization. The trends are visualized as profile plots and are connected by a Sankey diagram that allows an interactive pairwise trend comparison to discover concordant and discordant trends. Moreover, large-scale omics data sets are broken down into small subsets within a few analysis steps. In future work we plan to extend OmicsTIDE to more omics levels, such as metabolomics.

14:20-14:40
State-based Visual Analysis of Disease Progression with ThreadStates
Format: Pre-recorded with live Q&A

  • Qianwen Wang, Harvard Medical School, United States
  • Tali Mazor, Dana-Farber Cancer Institute, United States
  • Theresa Anisja Harbig, University of Tuebingen, Germany
  • Ethan Cerami, Dana-Farber Cancer Institute, United States
  • Nils Gehlenborg, Harvard University, United States

Presentation Overview:

Longitudinal cohort studies collect extensive patient observations across multiple timepoints and provide valuable information for researchers to understand the change of patient status and the progression of diseases. Given the large number of patients, the high dimensionality and heterogeneity of features, and the varying number of timepoints, the analysis of these patient observations is challenging.

To scale the analysis of disease progression, we depict patient status using a number of states that are characterized by value distributions over a set of observed measures (e.g., genomic mutations, blood glucose). We propose a visual analytics approach to assist users in the analysis through two phases: a state identification phase and a transition summarization phase. In state identification, an unsupervised clustering method learns states from patient timepoint features. A novel matrix+glyph design is proposed to effectively communicate the characteristics of different states. In the transition summary view, state transitions are represented using Sankey-based visualizations to capture and highlight the patterns of disease progression. Patients are further grouped based on transition patterns to reduce visual clutter and to reveal potential associations between progression patterns and other variables, including patient-level variables (e.g., age, gender) and event variables (e.g., medications, treatments).
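
As a rough illustration of the transition summarization phase described above, the sketch below assumes that states have already been assigned to each patient timepoint (the clustering step is not shown). It counts state-to-state transitions between consecutive timepoints, which is the kind of table a Sankey-style view could be drawn from, and groups patients that share the same state sequence. The function names and the toy patient data are hypothetical and are not taken from ThreadStates.

```python
from collections import Counter


def count_transitions(patient_states):
    """Count (timepoint, source state, target state) transitions across patients."""
    transitions = Counter()
    for states in patient_states.values():
        for t, (src, dst) in enumerate(zip(states, states[1:])):
            transitions[(t, src, dst)] += 1
    return transitions


def group_by_pattern(patient_states):
    """Group patient ids that follow exactly the same sequence of states."""
    groups = {}
    for patient_id, states in patient_states.items():
        groups.setdefault(tuple(states), []).append(patient_id)
    return groups


# Toy input: state labels per timepoint, with varying numbers of timepoints.
patient_states = {
    "p1": ["S1", "S2", "S2"],
    "p2": ["S1", "S2"],
    "p3": ["S1", "S3", "S2"],
    "p4": ["S1", "S2", "S2"],
}
transitions = count_transitions(patient_states)
groups = group_by_pattern(patient_states)
print(transitions[(0, "S1", "S2")])    # 3
print(groups[("S1", "S2", "S2")])      # ['p1', 'p4']
```

Grouping by exact sequence is the simplest possible choice; the abstract only says that patients are grouped by transition patterns, so a real tool may use a looser notion of similarity.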

14:40-15:00
Vesalius: Image-free extraction and analysis of tissue anatomy by using image processing applied to sequencing based Spatial Transcriptomics
Format: Pre-recorded with live Q&A

  • Patrick Martin, BRIC, Denmark

Presentation Overview:

Sequencing-based spatial transcriptomic methods have provided an avenue to study the cellular heterogeneity of tissues. Conventional clustering approaches do not explicitly account for the spatial location of spots and often miss the finer details of spatially resolved transcriptional profiles. Here, we present Vesalius, a new framework for the analysis of spatial transcriptomics based on image processing and computer vision. Vesalius converts transcriptional profiles into RGB colour images. These images are processed and anatomical structures are easily extracted for further analysis. Available as an R package, Vesalius provides a convenient way to study the finer details of tissue anatomy and to find novel marker genes and micro-environment-specific gene expression.

15:00-15:20
BioVis Highlights Talk: Insights From Experiments With Rigor in an EvoBio Design Study
Format: Live-stream

  • Jen Rogers

Presentation Overview:

Design study is an established approach of conducting problem-driven visualization research. The academic visualization community has produced a large body of work for reporting on design studies, informed by a handful of theoretical frameworks, and applied to a broad range of application areas. The result is an abundance of reported insights into visualization design, with an emphasis on novel visualization techniques and systems as the primary contribution of these studies. In recent work we proposed a new, interpretivist perspective on design study and six companion criteria for rigor that highlight the opportunities for researchers to contribute knowledge that extends beyond visualization idioms and software. In this work we conducted a year-long collaboration with evolutionary biologists to develop an interactive tool for visual exploration of multivariate datasets and phylogenetic trees. During this design study we experimented with methods to support three of the rigor criteria: ABUNDANT, REFLEXIVE, and TRANSPARENT. As a result we contribute two novel visualization techniques for the analysis of multivariate phylogenetic datasets, three methodological recommendations for conducting design studies drawn from reflections over our process of experimentation, and two writing devices for reporting interpretivist design study. We offer this work as an example for implementing the rigor criteria to produce a diverse range of knowledge contributions.

Tuesday, July 27th
11:00-11:10
Benchmarking framework for optimal visualization and interpretability of high-dimensional separable data
Format: Pre-recorded with live Q&A

  • Komlan Atitey, NIH - National Institute of Environmental Health Sciences (NIEHS), United States
  • Benedict Anchang, NIH - National Institute of Environmental Health Sciences (NIEHS), United States

Presentation Overview:

Understanding complex biological mechanisms of carcinogenesis using genomic and clinical data is vital to develop new treatments for patients and improve survival prognosis. High-dimensional single-cell data poses challenges in terms of visualization and interpretability. In studying the performance of the most used linear, nonlinear, and neural network methods, we propose a robust analytical pipeline suitable for benchmarking dimensionality reduction methods for targeted biological questions. We define a multivariate metric for good visualization and interpretability by optimizing five features, characterizing the quality of projection in terms of fidelity of good coverage, uniform spread of the projected data, preserving structure of the original dataset, time dependency of the projected data, and robustness to outliers of dense clusters. To account for the dependency of these features on the accuracy of a method, we build a Bayesian regression model with the metric's features as independent variables and accuracy as the dependent variable. The model predicts the conditional effect of the metrics, which is summarized as performance measures of good projection. By comparing the performance of six models applied to single-cell-based dynamic processes, we confirm that optimizing variational autoencoders preserves the most meaningful properties of a given biological process after data reduction and provides better visualization and interpretability.

11:10-11:30
Explaining Deep Learning Approaches in Drug Repurposing through Interactive Data Visualization
Format: Pre-recorded with live Q&A

  • Qianwen Wang, Harvard Medical School, United States
  • Nils Gehlenborg, Harvard University, United States
  • Kexin Huang, Harvard University, United States
  • Payal Chandak, Columbia University, United States
  • Marinka Zitnik, Harvard Medical School, United States

Presentation Overview:

Deep learning has demonstrated remarkable potential for identifying novel therapeutic uses of existing drugs (i.e., drug repurposing). However, the black-box nature of deep models can severely hinder their use in drug development.
To infuse interpretability into deep drug-repurposing models, we combine interactive visualization with explainable machine learning. Targeted at graph neural networks, we propose a visualization method to extract and present explanations that 1) can be easily interpreted in the biomedical context and 2) can scale well with different granularities of analysis.
We first identify and tackle the mismatch between model-generated explanations and human-level explanations. The former can be characterized as a subset of a knowledge graph while the latter can be formalized as a path in the knowledge graph reflecting a biological mechanism. Interactive visualizations are then developed to enable easy switching between model explanations and human explanations.
We further summarize explanations through meta-paths, which are sequences of relation types summarized from individual explanation paths. Explanations using meta-paths enable analysis and comparison of predictions at different granularities (e.g., individual predictions, groups of predictions for similar diseases).
We demonstrate the proposed approach on a repurposing dataset across the entire range of approved drugs and human diseases.
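
To make the meta-path idea concrete, here is a small sketch under stated assumptions. It represents each explanation path as a list of (source, relation, target) triples, takes the meta-path to be the sequence of relation types along the path (as the abstract describes it), and groups predictions that share a meta-path. The data layout, function names, and the example drugs, genes, and diseases are hypothetical placeholders rather than anything from the authors' system.

```python
from collections import defaultdict


def meta_path(path):
    """Reduce an explanation path to its sequence of relation types."""
    return tuple(relation for _source, relation, _target in path)


def summarize(explanations):
    """Group predictions by the meta-path of their explanation path."""
    groups = defaultdict(list)
    for prediction, path in explanations:
        groups[meta_path(path)].append(prediction)
    return dict(groups)


# Placeholder explanations: (prediction label, path through a knowledge graph).
explanations = [
    ("drug_X -> disease_1",
     [("drug_X", "targets", "gene_A"), ("gene_A", "associated_with", "disease_1")]),
    ("drug_Y -> disease_2",
     [("drug_Y", "targets", "gene_B"), ("gene_B", "associated_with", "disease_2")]),
    ("drug_Z -> disease_1",
     [("drug_Z", "resembles", "drug_X"), ("drug_X", "treats", "disease_1")]),
]
summary = summarize(explanations)
print(summary[("targets", "associated_with")])
# ['drug_X -> disease_1', 'drug_Y -> disease_2']
```

Grouping by meta-path is what lets individual explanations be compared at a coarser granularity: two predictions for different drugs and diseases fall into the same group when their explanations follow the same kinds of relations.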

11:30-11:40
Association Plots reveal cluster-specific genes from high-dimensional transcriptome data
Format: Pre-recorded with live Q&A

  • Elzbieta Gralinska, Max Planck Institute for Molecular Genetics, Germany
  • Martin Vingron, Max Planck Institute for Molecular Genetics, Germany

Presentation Overview:

Visualizing high-dimensional transcriptome data sets in an informative way is a major challenge in biological data analysis. Many questions, however, can be reduced to the search for associations between clusters of conditions and the highly expressed genes the conditions share. Existing embedding methods like PCA, t-SNE, or UMAP can visualize clusters of conditions or cells, but do not readily provide an overview of associated genes. Separate algorithms for delineation of differentially expressed genes need to be invoked, for which only individual results can be overlaid with an embedding.

To address this bottleneck we have developed Association Plots for visualizing condition-specific genes in high-dimensional data. Association Plots are derived from correspondence analysis, a projection method similar to PCA which, however, embeds both genes and conditions simultaneously in such a way that genes associated to a cluster of conditions lie in a particular direction in high-dimensional space. Measuring distances between genes and conditions leads to Association Plots which are independent of the data dimensionality and can aid in delineating marker genes.

We will demonstrate our method on GTEx bulk and 3k PBMC single-cell RNA-seq data sets, with Association Plots highlighting genes that characterize tissues or cell clusters from the data.

11:40-11:50
Color encoding of high-dimensional data using the CIELAB color space and state-of-the-art dimensionality reduction techniques
Format: Pre-recorded with live Q&A

  • Mikaela Koutrouli, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
  • John H. Morris, Resource on Biocomputing, Visualization, and Informatics, University of California, San Francisco, California, United States
  • Lars J. Jensen, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark

Presentation Overview:

Data visualization is essential to discover patterns and anomalies in large high-dimensional datasets. New dimensionality reduction techniques have thus been developed for visualizing omics data, in particular from single-cell studies. However, combining several types of data, e.g. gene expression and networks, remains a challenge. Here, we present a method that addresses this by using UMAP to embed high-dimensional data within the CIELAB color space. As UMAP partially preserves distances and Euclidean distances approximate relative perceptual differences in CIELAB color space, the result is a color encoding that aims to capture much of the structure of the original high-dimensional data. We show how this method can be applied to a single-cell RNA-seq dataset. To validate the results, we mapped the resulting color encoding onto both a UMAP and a co-expression network of the same data, which shows a smooth color transition with color differences corresponding to the relative position of the genes. Finally, we use the method to jointly visualize single-cell RNA-seq data with a completely different type of data, namely a physical protein interaction network.
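
The colour-encoding step can be sketched as follows, assuming a three-dimensional embedding has already been computed (for example by UMAP; that step is not shown, and the coordinates below are made up). The sketch rescales the embedding uniformly so that relative Euclidean distances are kept, places it inside an assumed region of CIELAB, and converts each point to sRGB with the standard D65 formulas. The centre, radius, and function names are illustrative assumptions, and colours that fall outside the sRGB gamut are simply clamped, which distorts them slightly.

```python
def embed_to_lab(points, center=(60.0, 0.0, 0.0), radius=35.0):
    """Uniformly scale 3D points into a cube around `center` in (L*, a*, b*)."""
    dims = range(3)
    mids = [(min(p[d] for p in points) + max(p[d] for p in points)) / 2 for d in dims]
    # One scale factor for all axes, so relative distances are preserved.
    half = max(max(abs(p[d] - mids[d]) for p in points) for d in dims) or 1.0
    scale = radius / half
    return [tuple(center[d] + (p[d] - mids[d]) * scale for d in dims) for p in points]


def lab_to_srgb(L, a, b):
    """Convert CIELAB (D65 white point) to 8-bit sRGB, clamping out-of-gamut values."""
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0

    def f_inv(t):
        delta = 6.0 / 29.0
        return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)

    x = 0.95047 * f_inv(fx)
    y = 1.00000 * f_inv(fy)
    z = 1.08883 * f_inv(fz)

    # XYZ to linear sRGB.
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def encode(c):
        c = min(max(c, 0.0), 1.0)  # clamp before applying the sRGB transfer curve
        c = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055
        return round(c * 255)

    return tuple(encode(c) for c in linear)


# Made-up 3D embedding coordinates for three items.
embedding = [(0.0, 0.0, 0.0), (2.0, 1.0, 1.0), (1.0, 0.5, 0.5)]
colors = [lab_to_srgb(*lab) for lab in embed_to_lab(embedding)]
print(lab_to_srgb(100.0, 0.0, 0.0))  # (255, 255, 255)
```

Using a single scale factor for all three axes is a deliberate choice here: scaling each axis independently would fill more of the colour space but would no longer keep distances in the embedding proportional to distances in CIELAB.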

11:50-12:00
Hierarchical interactive exploration and analysis of single cell RNA-seq datasets
Format: Pre-recorded with live Q&A

  • Jayaram Kancherla, Data Science and Statistical Computing, Genentech, Inc., United States
  • Kazi Tasnim Zinat, Dept. of Computer Science, University of Maryland, College Park, United States
  • Stephanie Hicks, Dept. of Biostatistics, Johns Hopkins Bloomberg School of Public Health, United States
  • Hector Corrada Bravo, Genentech, Inc., United States

Presentation Overview:

A fundamental task in the analysis of single cell RNA-seq is unsupervised (or semi-supervised) clustering to help identify cell types. A purely computational approach to determine an appropriate number of clusters, and subsequently cell types, is usually unsatisfactory. Users need to specify parameters (number of clusters or resolution) and perform post-hoc analysis to determine the number of clusters. This is an exploratory, interactive analysis and approaches for interactivity are critical for effective analysis.

We present scTreeViz, a Bioconductor package to interactively visualize multi-resolution clusterings by exploiting their hierarchical organization using the facetZoom navigation technique. Users can explore finer or coarser resolutions of the hierarchy, remove clusters not pertinent to their analysis and dynamically aggregate measurements at different resolutions. These interactions update multiple linked interactive visualizations: low dimensional embeddings, heatmaps of gene expression, and boxplots of expression of specific genes across cell types.

scTreeViz integrates with `Seurat` and Bioconductor’s `SingleCellExperiment` objects and can be used as part of an analysis workflow. We present examples where scTreeViz is used after clustering to select clusters at multiple resolutions, extract cluster labels, and proceed with downstream analysis, e.g., determining marker genes, using Bioconductor tools.

12:00-12:20
BioVis Highlights Talk: Image visualisation in publications - current status and workflows for improvements
Format: Pre-recorded with live Q&A

  • Helena Jambor

12:40-13:00
Proceedings Presentation: Metaball skinning of synthetic astroglial morphologies into realistic mesh models for in silico simulations and visual analytics
Format: Pre-recorded with live Q&A

  • Marwan Abdellah, Blue Brain Project (BBP) / EPFL, Switzerland
  • Alessandro Foni, Blue Brain Project (BBP) / EPFL, Switzerland
  • Eleftherios Zisis, Blue Brain Project (BBP) / EPFL, Switzerland
  • Nadir Roman Guerrero, Blue Brain Project (BBP) / EPFL, Switzerland
  • Samuel Lapere, Blue Brain Project (BBP) / EPFL, Switzerland
  • Jay S. Coggan, Blue Brain Project (BBP) / EPFL, Switzerland
  • Daniel Keller, Blue Brain Project (BBP) / EPFL, Switzerland
  • Henry Markram, Blue Brain Project (BBP) / EPFL, Switzerland
  • Felix Schürmann, Blue Brain Project (BBP) / EPFL, Switzerland

Presentation Overview:

Motivation: Astrocytes, the most abundant glial cells in the mammalian brain, have an instrumental role in developing neuronal circuits. They contribute to the physical structuring of the brain, the modulation of synaptic activity, and the maintenance of the blood-brain barrier, in addition to other significant aspects that impact brain function. Biophysically detailed astrocytic models are key to unraveling their functional mechanisms via molecular simulations at microscopic scales. Detailed and complete biological reconstructions of astrocytic cells are sparse. Nonetheless, data-driven digital reconstructions of astroglial morphologies that are statistically identical to biological counterparts are becoming available. We use those synthetic morphologies to generate astrocytic meshes with realistic geometries, making it possible to perform these simulations.

Results: We present an unconditionally robust method capable of reconstructing high-fidelity polygonal meshes of astroglial cells from algorithmically synthesized morphologies. Our method uses implicit surfaces, or metaballs, to skin the different structural components of astrocytes and then blend them in a seamless fashion. We also provide an end-to-end pipeline to produce optimized two- and three-dimensional meshes for visual analytics and simulations, respectively. The performance of our pipeline has been assessed with a group of 5,000 astroglial morphologies and the geometric metrics of the resulting meshes are evaluated. The usability of the meshes is then demonstrated with different use cases.

Implementation and availability: Our metaball skinning algorithm is implemented in Blender 2.82 relying on its Python API (Application Programming Interface). To make it accessible to computational biologists and neuroscientists, the implementation has been integrated into NeuroMorphoVis.

13:00-13:10
Interactive multiscale microscopy visualization on the web with Viv
Format: Pre-recorded with live Q&A

  • Nils Gehlenborg, Harvard University, United States
  • Trevor Manz, Harvard Medical School, United States
  • Mark Keller, Harvard Medical School, United States
  • Ilan Gold, Harvard Medical School, United States
  • Chuck Mccallum, Harvard Medical School, Department of Biomedical Informatics, United States

Presentation Overview:

Recent advances in highly multiplexed imaging have enabled the comprehensive profiling of complex tissues in healthy and diseased states, facilitating the study of fundamental biology and human disease in spatially-resolved contexts at sub-cellular resolution. However, current computational infrastructure to distribute and visualize these data on the web remains complex to set up and maintain. To address these limitations, we have developed Viv, an open-source image visualization library for high-resolution multiplexed image data that is implemented in JavaScript and builds on modern web technologies. Viv directly renders both OME-TIFF and modern OME-NGFF (Zarr) data formats in a web browser, enabling the viewing of large data resources without requiring wholesale data download or any software installation.

Viv is modular by design and intended to be embedded within various interactive exploratory or analytical visualization applications. Viv's client-side rendering approach is distinct from existing web-based image viewers that rely on pre-rendering or server-side rendering, allowing greater flexibility and reuse. We integrate Viv within Jupyter Notebooks and pair Viv with a cloud-based data repository to demonstrate the utility of our approach.

13:10-13:30
Loon: Using Exemplars to Visualize Large Scale Microscopy Data
Format: Pre-recorded with live Q&A

  • Devin Lange, University of Utah, United States
  • Eddie Polanco, University of Utah, United States
  • Robert Judson-Torres, University of Utah, United States
  • Thomas Zangle, University of Utah, United States
  • Alexander Lex, University of Utah, United States

Presentation Overview:

Which drug is most promising for a cancer patient? This is a question a new microscopy-based approach for measuring the mass of individual cancer cells treated with different drugs promises to answer in only a few hours. However, the analysis pipeline for extracting data from these images is still far from complete automation: human intervention is necessary for quality control of preprocessing steps such as segmentation, for adjusting filters and removing noise, and for the analysis of the results. To address this workflow, we developed Loon, a visualization tool for analyzing drug screening data based on quantitative phase microscopy imaging. Loon visualizes both derived data, such as growth rates, and imaging data. Since the images are collected automatically at a large scale, manual inspection of images and segmentations is infeasible. However, reviewing representative samples of cells is essential for quality control and data analysis. We introduce a new approach of choosing and visualizing representative exemplar cells that retain a close connection to the low-level data. By tightly integrating the derived data visualization capabilities with the novel exemplar visualization and providing selection and filtering capabilities, Loon is well suited for making decisions about which drugs are suitable for a specific patient.

13:30-13:50
ShapoGraphy: a glyph-oriented visualisation approach for creating pictorial representations of bioimaging data
Format: Pre-recorded with live Q&A

  • Heba Sailem, University of Oxford, United Kingdom

Presentation Overview:

Intuitive visualisation of quantitative microscopy data is crucial for interpreting and discovering new patterns in complex bioimage data. Existing visualisation approaches, such as bar charts, scatter plots and heat maps, do not accommodate the complexity of visual information present in microscopy data. Here we develop ShapoGraphy, a first-of-its-kind method accompanied by a user-friendly web-based application for creating interactive quantitative pictorial representations of phenotypic data and facilitating the understanding and analysis of image datasets (www.shapography.com). ShapoGraphy enables the user to create a structure of interest as a set of shapes. Each shape can encode different variables that are mapped to the shape dimensions, colours, symbols, and stroke features. We illustrate the utility of ShapoGraphy using various image data, including high-dimensional multiplexed data. Our results show that ShapoGraphy allows a better understanding of cellular phenotypes and relationships between variables. In conclusion, ShapoGraphy supports scientific discovery and communication by providing a wide range of users with a rich vocabulary to create engaging and intuitive representations of diverse data types.

13:50-14:00
An Extensive Visualization Suite for Pathway/Genome Databases
Format: Pre-recorded with live Q&A

  • Peter Karp, SRI International, United States
  • Suzanne Paley, SRI International, United States

Presentation Overview:

The BioCyc database collection combines genome data with computational inferences, curated data, and data imported from other databases, for 18,000 genomes. To speed cognitive uptake of these vast data, we have developed an extensive set of visualization tools within the Pathway Tools (PTools) software. PTools generates individual metabolic pathway diagrams and multi-pathway diagrams. It generates a zoomable diagram depicting the full metabolic network of an organism; these diagrams can be combined to depict the metabolism of an organism community. Omics data analysis is supported by painting omics data onto individual pathway diagrams, multi-pathway diagrams, and metabolic-network diagrams. PTools provides a tool for interactively exploring linear paths through a metabolic network. PTools can depict the full regulatory network of an organism, and provides a novel diagram that summarizes all the regulatory influences on a gene and its product. PTools provides a genome browser that zooms from the sequence level to a visualization of a complete genome in one screen.

14:20-15:10
BioVis Keynote: Theories of inference for visualization interactions
Format: Live-stream

  • Jessica Hullman, Northwestern University, United States

Presentation Overview:

Research and development in computer science and statistics have produced increasingly sophisticated software interfaces for interactive and exploratory analysis, optimized for easy pattern finding and data exposure. But design philosophies that emphasize exploration over other phases of analysis risk confusing a need for flexibility with a conclusion that exploratory visual analysis is inherently “model free” and cannot be formalized. We describe how, without a grounding in theories of human statistical inference, research in exploratory visual analysis can lead to contradictory interface objectives and to representations of uncertainty that discourage users from drawing valid inferences. We discuss how the concept of a model check in a Bayesian statistical framework unites exploratory and confirmatory analysis, and how this understanding relates to other proposed theories of graphical inference. Viewing interactive analysis as driven by model checks suggests new directions for software and empirical research around exploratory and visual analysis.

15:10-15:20
BioVis Award Ceremony & Closing
Format: Live-stream



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
