Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

BioVis: Biological Data Visualization

COSI Track Presentations

Schedule subject to change
Monday, July 9th
10:15 AM-10:20 AM
BioVis: Opening Remarks
Room: Columbus IJ
  • Marc Streit
10:20 AM-11:20 AM
Keynote: One successful data exploration—many explanations
Room: Columbus IJ
  • Martin Krzywinski
11:20 AM-11:40 AM
Proceedings Presentation: NeuroMorphoVis: a collaborative framework for visualization and analysis of neuronal morphology skeletons reconstructed from microscopy stacks
Room: Columbus IJ
  • Marwan Abdellah, Blue Brain Project/EPFL, Switzerland
  • Juan Hernando, Blue Brain Project/EPFL, Switzerland
  • Stefan Eilemann, Blue Brain Project/EPFL, Switzerland
  • Samuel Lapere, Blue Brain Project/EPFL, Switzerland
  • Nicolas Antille, Blue Brain Project/EPFL, Switzerland
  • Henry Markram, Blue Brain Project/EPFL, Switzerland
  • Felix Schürmann, Blue Brain Project/EPFL, Switzerland

Motivation: From image stacks to computational models, processing digital representations of neuronal morphologies is essential to neuroscientific research. Workflows involve various techniques and tools, leading in certain cases to convoluted and fragmented pipelines.
The existence of an integrated, extensible and free framework for processing, analysis and visualization of those morphologies is a challenge that is still largely unfulfilled.

Results: We present NeuroMorphoVis, an interactive, extensible and cross-platform framework for building, visualizing and analyzing digital reconstructions of neuronal morphology skeletons extracted from microscopy stacks.
Our framework is capable of detecting and repairing tracing artifacts, allowing the generation of high fidelity surface meshes and high resolution volumetric models for simulation and in silico imaging studies.
The applicability of NeuroMorphoVis is demonstrated with two case studies.
The first simulates the construction of three-dimensional profiles of neuronal somata and the other highlights how the framework is leveraged to create volumetric models of neuronal circuits for simulating different types of in vitro imaging experiments.

Availability and implementation: The source code and documentation are freely available on https://github.com/BlueBrain/NeuroMorphoVis under the GNU public license.
The morphological analysis, visualization and surface meshing are implemented as an extensible Python API (Application Programming Interface) based on Blender, and the volume reconstruction and analysis code is written in C++ and parallelized using OpenMP. The framework features are accessible from a user-friendly GUI (Graphical User Interface) and a rich CLI (Command Line Interface).

11:40 AM-11:50 AM
starmap: Immersive visualisation of single cell data using smartphone-enabled virtual reality
Room: Columbus IJ
  • Andrian Yang, Victor Chang Cardiac Research Institute, Australia
  • Yu Yao, Victor Chang Cardiac Research Institute, Australia
  • Jianfu Li, Victor Chang Cardiac Research Institute, Australia
  • Joshua W. K. Ho, Victor Chang Cardiac Research Institute, Australia

Advances in single-cell RNA-seq technology, flow cytometry and mass cytometry have enabled the expression profiling of a large number of genes and proteins for hundreds of thousands of individual cells. However, current visualisation techniques do not allow for effective display and understanding of the data due to the large number of points and the use of non-immersive flat-screen visualisation. With the widespread availability of low-cost virtual reality (VR) devices, such as Google Cardboard, we propose the use of these devices as an immersive environment for visualising single-cell data in order to improve the navigation and exploration of the large number of cells. We have developed starmap, a VR program for visualising single-cell data designed to work with low-cost VR headsets. starmap offers a number of methods for interaction, such as a wireless controller and voice control, and has a built-in star plot visualisation to allow users to explore features of the cells.

11:50 AM-12:00 PM
Interactive Visual Analysis of Mass Cytometry Data by Hierarchical Stochastic Neighbor Embedding Reveals Rare Cell Types
Room: Columbus IJ
  • Thomas Höllt, Delft University of Technology, Leiden University Medical Center, Netherlands
  • Vincent van Unen, Leiden University Medical Center, Netherlands
  • Nicola Pezzotti, Delft University of Technology, Netherlands
  • Na Li, Leiden University Medical Center, Netherlands
  • Marcel J.T. Reinders, Delft University of Technology, Netherlands
  • Elmar Eisemann, Delft University of Technology, Netherlands
  • Frits Koning, Leiden University Medical Center, Netherlands
  • Anna Vilanova, Delft University of Technology, Netherlands
  • Boudewijn P.F. Lelieveldt, Leiden University Medical Center / Delft University of Technology, Netherlands

Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for the data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a step-wise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.

12:00 PM-12:10 PM
Visual Analysis of Enzyme-Substrate Interactions
Room: Columbus IJ
  • Karsten Schatz, University of Stuttgart, Germany
  • Michael Krone, University of Stuttgart, Germany
  • Valerio Ferrario, University of Stuttgart, Germany
  • Jürgen Pleiss, University of Stuttgart, Germany
  • Thomas Ertl, Visualization Research Center University of Stuttgart, Germany

Enzymes play a crucial role in all living organisms as well as in industrial applications such as biofuel or cheese production. Most enzymes are catalytic proteins, that is, they accelerate a chemical reaction of one or more contacting substrate molecules. In general, these reactions do not happen at arbitrary positions of the enzyme’s surface but at specific binding sites. We present two visualization methods to assist users during the analysis of Molecular Dynamics simulations of such enzyme-substrate interactions. The first one targets the extraction and visualization of cavities, which are often the location of a binding site. Our second visual analysis method targets the substrate in the proximity of the enzyme. We visualize the temporally aggregated locations of the substrate molecules, thereby showing paths travelled by the substrate as well as preferred binding locations. We demonstrate the applicability of our approach for the visual analysis of simulation trajectories with more than 2M time steps, each one containing over 3M atoms, which results in more than 2 TB of hard drive space. Our proposed visualization of aggregated properties can not only be rendered on standard consumer desktop hardware, but also directly summarizes the temporal development during the simulation.

12:10 PM-12:20 PM
BioBlox – A suite of computer games for protein docking for crowd-sourcing scientific solutions and education via mobiles and virtual reality
Room: Columbus IJ
  • Pedro Quijada Leyton, Goldsmiths, University of London, United Kingdom
  • Ioannis Filippis, Imperial College London, United Kingdom
  • Elio de Berardinis, Goldsmiths, University of London, United Kingdom
  • Federico Soncini, Goldsmiths, University of London, United Kingdom
  • Ciaran Wilson, Goldsmiths, University of London, United Kingdom
  • Suhail A Islam, Imperial College London, United Kingdom
  • Pablo Larenas, Goldsmiths, University of London, United Kingdom
  • Alessia David, Imperial College London, United Kingdom
  • Richard Leinfellner, Goldsmiths, University of London, United Kingdom
  • Andy Thomason, Goldsmiths, University of London, United Kingdom
  • William Latham, Goldsmiths, University of London, United Kingdom
  • Frederic Fol Leymarie, Goldsmiths, University of London, United Kingdom
  • Michael J.E. Sternberg, Imperial College London, United Kingdom

We present the BioBlox suite of gaming-inspired programs based around protein docking (www.bioblox.org).

BioBlox3D (www.bioblox3d.org) is a web-based program for desktops tackling the challenge of starting with two unbound protein molecules and predicting the structure of the complex. BioBlox3D provides an easy-to-use approach for biologists to explore protein docking graphically. A motivation is that humans are particularly skilled at visual reasoning, sometimes outperforming computers. We have a gaming scoreboard and will establish a crowd-sourcing approach for protein docking.

BioBloxVR is a virtual reality version of BioBlox3D which provides an inspiring game for exhibitions and outreach meetings. Users can pick up, rotate and dock protein molecules using their Vive controllers in a highly intuitive way.

Inspired by BioBlox3D, we have developed a simple fun-to-play game named BioBlox2D designed for phones and tablets available from the App Store and Google Play. The game models fragment-based drug discovery. Small shapes with charges need to be correctly assembled so they fit into a complementary protein receptor. As the player progresses through the game there are questions in a biomedical quiz posing problems such as identifying glucose as the molecule used by our cells to produce energy.
YouTube videos are at:
https://www.youtube.com/watch?v=2z8y7rUWOos
https://www.youtube.com/watch?v=4X9vgPzk_pM
https://youtu.be/YInXSEVWPvk

12:20 PM-12:30 PM
Visualization methods for RNA-sequencing data analysis
Room: Columbus IJ
  • Lindsay Rutter, Iowa State University, United States
  • Dianne Cook, Monash University, Australia

Motivation: RNA-seq data is biased and accurate detection of differentially expressed genes is not a trivial task. As a result, researchers should analyze RNA-seq data like they would any other biased multivariate data. The most effective approach to modern data analysis is to iterate between models and visuals, and to enhance the appropriateness of models based on feedback from visuals. As it stands, there is a need to make it easier for researchers to use models and visuals in a complementary fashion during RNA-seq data analysis.

Results: We use real RNA-seq data to show that our visualization tools can detect normalization problems, DEG designation problems, and common errors in RNA-seq analysis. We also show that our tools can identify genes of interest that cannot be obtained by models.

Conclusion: In this project, we do not propose that users radically change their approach to RNA-seq analysis. Instead, we propose that users simply modify their usual approach to RNA-seq analysis by quickly assessing the sensibility of their models with multivariate statistical graphics. We plan to serve a role in this solution by publishing a new R software package that includes the useful plotting techniques we introduce in this project.

12:30 PM-12:40 PM
HebbPlot: A new tool for visualizing histone signatures
Room: Columbus IJ
  • Alfredo Velasco II, University of Tulsa, United States
  • Hani Girgis, University of Tulsa, United States

Epigenetics involves the study of histones (proteins that DNA wraps around). Side chains — known as marks — attached to these histones may determine the function of the DNA wrapped around them. Identifying patterns or signatures consisting of 100 marks can be a challenging task. We developed a tool called HebbPlot, which can learn and visualize a signature from thousands of genomic locations that have the same function, e.g. active promoters. HebbPlot obtains vectors that represent overlaps between a set of regions and histone marks. These vectors are fed into a Hebbian network, which outputs a gray scale image of the overlaps between the genetic element and the epigenome. Each pixel represents the presence or absence of a mark. We used HebbPlot in six case studies conducted on 57 cell types. HebbPlots of promoters on the positive and the negative strands are mirror images, indicating the directionality of histone marks around active promoters. We confirmed that some marks are only present in high-CpG promoters in contrast to low-CpG promoters. HebbPlots show clear associations between the abundance of histone marks around coding regions and the level of gene expression. We hope HebbPlot will help biologists decipher the histone code.

12:40 PM-2:00 PM
Lunch Break
2:00 PM-3:00 PM
Keynote: The Changing Nature of Collaboration in Visualization
Room: Columbus IJ
  • Sheelagh Carpendale
3:00 PM-3:20 PM
Proceedings Presentation: The Kappa platform for rule-based modeling
Room: Columbus IJ
  • Pierre Boutillier, Harvard University, United States
  • Mutaamba Maasha, Harvard University, United States
  • Xing Li, Edgewise Networks, United States
  • Hector Medina-Abarca, Harvard University, United States
  • Jean Krivine, Universite Paris 7, France
  • Jerome Feret, Ecole Normale Superieure Paris, France
  • Ioana Cristescu, Harvard University, United States
  • Angus Forbes, University of California at Santa Cruz, United States
  • Walter Fontana, Harvard University, United States

Motivation: We present an overview of the Kappa platform, an integrated suite of analysis and visualization techniques for building and interactively exploring rule-based models. The main components of the platform are the Kappa Simulator, the Kappa Static Analyzer, and the Kappa Story Extractor. In addition to these components, we describe the Kappa User Interface, which includes a range of interactive visualization tools for rule-based models needed to make sense of the complexity of biological systems. We argue that, in this approach, modeling is akin to programming and can likewise benefit from an integrated development environment. Our platform is a step in this direction.

Results: We discuss details about the computation and rendering of static, dynamic, and causal views of a model, which include the contact map, snapshots at different resolutions, the dynamic influence network, and causal compression. We provide use cases illustrating how these concepts generate insight. Specifically, we show how the contact map and snapshots provide information about systems capable of polymerization, such as Wnt signaling. A well-understood model of the KaiABC oscillator, translated into Kappa from the literature, is deployed to demonstrate the dynamic influence network and its use in understanding systems dynamics. Finally, we discuss how pathways might be discovered or recovered from a rule-based model by means of causal compression, as exemplified for early events in EGF signaling.

Availability: The Kappa platform is available via the project website at kappalanguage.org. All components of the platform are open source and freely available through the authors' code repositories.

3:20 PM-3:30 PM
Feature-Centric Visual Exploration of Genome Interaction Maps
Room: Columbus IJ
  • Fritz Lekschas, Harvard University, United States
  • Benjamin Bach, The University of Edinburgh, United Kingdom
  • Peter Kerpedjiev, Harvard University, United States
  • Michael Behrisch, Harvard University, United States
  • Nils Gehlenborg, Harvard Medical School, United States
  • Hanspeter Pfister, Harvard University, United States

We present Scalable Insets, a feature-centric technique to visually explore genome interaction maps with many 2D features. Genome interaction maps present an approximate likelihood of physical interaction of pairs of regions on the genome. These maps contain up to several thousands of 2D features such as compartments, domains, and loops. Visual inspection is critical to generate new hypotheses, evaluate the performance of feature detectors, and stratify features into groups. Exploration of many but sparsely distributed features in traditional pan-and-zoom heatmaps is challenging, as visual representations change across zoom levels, context and navigational cues get lost upon zooming, and navigation is time consuming. Our technique visualizes features too small to be identifiable at certain zoom levels using magnified thumbnail views of the features called insets. Insets support users in searching, comparing, and contextualizing features, while reducing the amount of navigation needed. They are dynamically placed either within the viewport or along the boundary of the viewport to offer a compromise between locality and context preservation. Features are interactively clustered by location and type and are visually represented as a pileup showing the average and variance of features to provide scalable exploration within a single viewport.
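To make the pileup idea concrete, here is a minimal sketch of how a cluster of features could be summarized by a per-pixel average and variance. It assumes every feature has already been cut out as an equally sized 2D snippet of the interaction map; the function name `pileup` and the toy values are hypothetical and are not taken from the Scalable Insets implementation.

```python
from statistics import fmean, pvariance


def pileup(snippets):
    """Summarize equally sized 2D snippets by per-pixel mean and variance.

    `snippets` is a list of 2D lists (rows x columns), one per feature.
    Hypothetical helper for illustration only.
    """
    rows, cols = len(snippets[0]), len(snippets[0][0])
    mean = [[fmean(s[r][c] for s in snippets) for c in range(cols)]
            for r in range(rows)]
    # Population variance: the snippets are treated as the whole cluster.
    var = [[pvariance([s[r][c] for s in snippets]) for c in range(cols)]
           for r in range(rows)]
    return mean, var


# Two toy 2x2 snippets standing in for two loops of the same cluster.
snippets = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 2.0], [5.0, 0.0]],
]
mean, var = pileup(snippets)
```

The mean image shows what the features in a cluster have in common, while the variance image highlights the pixels where they differ.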

3:30 PM-3:40 PM
Scaling Up the Genome Browser
Room: Columbus IJ
  • Danielle Nguyen, Harvard University, United States
  • Peter Kerpedjiev, Harvard University, United States
  • Nils Gehlenborg, Harvard Medical School, United States

The increasing abundance of genome-wide data has created a need for visual comparison between different genomic data sets as well as comparison between different regions in the same genome. We have developed HiGlass (http://higlass.io), a genome visualization tool that supports visualization of genome-wide 1D tracks and 2D interaction matrices in flexible view configurations that users can create on the fly.
Recently, we extended HiGlass to provide a solution for comparing hundreds of genomic datasets efficiently by using a multi-resolution matrix data type with scaling and aggregation in one dimension. The ‘multi-vector’ matrix can store multiple data sets so that they are rapidly retrievable, thus eliminating the need to read and compare data across many files. The advantages of this solution are demonstrated in our use of HiGlass to display 256 ChIP-seq profiles using a heatmap as well as a chromatin state model with 15 states using a stacked bar chart. This data format can also be used to display genomic sequences. Returning tile data instead of pre-rendered image tiles allows for flexibility. This is particularly useful in features we have implemented for our visualizations, such as dynamic scaling, plot type selection, and smooth zooming and panning.
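The following sketch illustrates what scaling and aggregation in one dimension can look like for a matrix whose rows are data sets and whose columns are genomic bins. It is an assumption-laden toy, not the HiGlass file format or API: the averaging rule, the function names (`aggregate_once`, `build_pyramid`, `get_tile`) and the sample values are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # rows = data sets, columns = genomic bins


def aggregate_once(matrix: Matrix) -> Matrix:
    """Halve the resolution along the genomic axis only.

    Adjacent bin pairs are averaged (an assumed aggregation rule);
    the number of rows, i.e. data sets, is left unchanged.
    """
    coarser = []
    for row in matrix:
        pairs = [row[i:i + 2] for i in range(0, len(row), 2)]
        coarser.append([sum(p) / len(p) for p in pairs])
    return coarser


def build_pyramid(matrix: Matrix) -> List[Matrix]:
    """Return zoom levels from finest (index 0) down to one bin per row."""
    levels = [matrix]
    while len(levels[-1][0]) > 1:
        levels.append(aggregate_once(levels[-1]))
    return levels


def get_tile(levels: List[Matrix], zoom: int, start: int, end: int) -> Matrix:
    """Return raw values for all data sets over bins [start, end) at one zoom level."""
    return [row[start:end] for row in levels[zoom]]


# Two toy profiles over four genomic bins.
profiles = [
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 4.0, 8.0],
]
pyramid = build_pyramid(profiles)
tile = get_tile(pyramid, 0, 1, 3)
```

Because a tile holds raw values for every data set rather than a rendered image, the client remains free to draw it as a heatmap, a stacked bar chart, or another plot type.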

3:40 PM-3:50 PM
PopNetD3: An accessible web tool for the analysis and visualization of population structure
Room: Columbus IJ
  • Javi Zhang, University of Toronto, Canada
  • John Parkinson, Hospital for Sick Children, Canada

While the structure of populations has garnered increasing interest as a source of insight into their evolution and behavior, an intuitive and informative method for its visualization remains a challenge to be met. We previously developed PopNet, featuring network graphs and chromosome painting as innovations in visualizing the genomes of populations. To facilitate its use by a wider audience, we present PopNetD3, a browser-based interface to PopNet. Users can submit data to a cloud server for analysis, and view their network in-browser through a D3-based network visualizer. The advantages of PopNetD3 include low user requirements and a stable run environment along with the introduction of new functionalities.

PopNetD3’s functionalities are illustrated in an analysis of 44 whole genome sequence samples of Neisseria gonorrhoeae, the causative agent of gonorrhea. The network graph generated by PopNetD3 reveals 5 subpopulations within the sample set, and represents their relatedness as links between nodes. Geography is shown as an important factor modulated by long-distance travel. As well, segregation between male and female samples points to host-specific adaptations. The chromosome paintings, embedded within each node of the network, depict the degree of shared ancestry between subpopulations.

3:50 PM-4:00 PM
MecCog: A framework for representing human genetic disease mechanisms
Room: Columbus IJ
  • Kunal Kundu, University of Maryland, United States
  • Lipika R. Pal, Institute for Biosciences and Biotechnology Research, United States
  • Lindley Darden, University of Maryland, United States
  • John Moult, University of Maryland, United States

Advances in high-throughput technologies are leading to major new insights into human genetic disease mechanisms. But the many results are scattered throughout the literature and are represented in many different ways, including free text, cartoons, pathway diagrams, and network graphs. Thus there is a need for a framework that is capable of integrating and presenting this knowledge in a coherent, comprehensive, and intuitive way. The MecCog framework utilizes formal concepts of biological mechanism to achieve this goal. In MecCog, a disease mechanism schema consists of a sequence of substate perturbations, starting at the DNA stage of biological organization and progressing through RNA, protein, molecular complex, cell, tissue and organ stages to the disease phenotype. Each substate perturbation produces the next through a mechanism module. Representation of uncertainty, ambiguity and ignorance, as well as inclusion of the evidence supporting each mechanism component, is tightly integrated into the framework. Mechanism schemas are represented by a diagrammatic language and are accessed through an interactive graphical interface (http://www.meccog.org). In addition to providing an integrative framework for disease mechanism, MecCog facilitates prioritization of future experiments, identification of new therapeutic targets, detection of epistatic interactions between loci in complex trait disease, and optimized therapy choice.

4:00 PM-4:40 PM
Coffee Break
4:40 PM-5:20 PM
Invited Talk: Mining Gems from the Data Visualization Literature
Room: Columbus IJ
  • Nils Gehlenborg, Harvard Medical School, United States

Vis Infrastructure, Tools and Libraries

5:20 PM-5:30 PM
MaizeDIG: A mechanism for connecting gene models to phenotypes at MaizeGDB
Room: Columbus IJ
  • Kyoung Tak Cho, Iowa State University, United States
  • John Portwood II, USDA-ARS, United States
  • Elisabeth Harper, USDA-ARS, United States
  • Jack Gardiner, University of Missouri, United States
  • Carolyn Lawrence-Dill, Iowa State University, United States
  • Iddo Friedberg, Iowa State University, United States
  • Carson Andorf, USDA-ARS, United States

An organism can be described by both its observable characteristics (phenotypes) and the underlying genes and genomic data (genotype) that cause the phenotype. There has been tremendous growth in both genomic data and phenotype imaging data. This growth has been seen in the maize research community with sequencing of thousands of maize accessions and the availability of large-scale phenotype data. A challenge at MaizeGDB, the model organism database for maize, is to build connections between the phenotype and genotype data sets. BioDIG (Biological database of images and genomes) is a GMOD project consisting of a web-based software package that allows annotation of genotypic-phenotypic relationships. To integrate the phenotype images and the genomics information at MaizeGDB, we implemented and updated a maize-based version of BioDIG called MaizeDIG. MaizeDIG is enhanced to handle multiple genomes and is integrated with genome browsers to make tracks showing mutant phenotype images within their genomic context. MaizeDIG allows for custom tagging of images to highlight regions related to the phenotypes and to curate and search by gene model, gene symbol, gene name, and allele. MaizeDIG is preloaded with 2,721 mutant phenotype images that are available on seven genome browsers.

5:30 PM-5:40 PM
D3Oncoprint: Standalone software to visualize and dynamically explore annotated genomic mutation files
Room: Columbus IJ
  • Alida Palmisano, National Cancer Institute, DCTD/BRP, United States
  • Yingdong Zhao, National Cancer Institute, DCTD/BRP, United States
  • Richard Simon, National Cancer Institute, DCTD/BRP, United States

Thanks to the reduction in cost of sequencing technologies, many laboratories are collecting enormous amounts of genomic data. Powerful and easy-to-use software applications are needed to translate the information hidden in sequencing data into biological discovery to impact patient care. Available software packages typically require advanced programming knowledge or system administration privileges, or they are web services that force researchers to work on outside servers.
We developed D3Oncoprint to facilitate the interactive exploration of genomic datasets on local machines with no programming skills required. D3Oncoprint is a standalone application used to visualize and dynamically explore annotated genomic mutation files. D3Oncoprint provides links to curated databases (e.g., CIViC, OncoKB, My Cancer Genome, and FDA approved drugs), as well as curated gene lists from BioCarta pathways, and FoundationOne cancer panels to explore commonly investigated biological processes.
D3Oncoprint is free and available for download from the website of the Biometric Research Program (BRP) of the Division of Cancer Treatment and Diagnosis, NCI (https://brb.nci.nih.gov/d3oncoprint/).
The focus on interactive visualization with biological and medical annotation significantly lowers the barriers between complex genomic data and biomedical investigators. D3Oncoprint can help researchers explore their own data, without the need of an extensive computational background.

5:40 PM-5:50 PM
Visualization of Longitudinal Cancer Genomics Data
Room: Columbus IJ
  • Theresa Anisja Harbig, University of Tübingen, Germany
  • Sabrina Nusrat, Harvard Medical School, United States
  • Alexander Thomson, Novartis Institutes of Biomedical Research, Cambridge, MA, United States
  • Hans Bitter, Novartis Institutes of Biomedical Research, Cambridge, MA, United States
  • Tali Mazor, Dana-Farber Cancer Institute, United States
  • Ethan Cerami, Dana-Farber Cancer Institute, United States
  • Nils Gehlenborg, Harvard Medical School, United States

Affordable next-generation sequencing techniques and immuno-profiling assays have made it possible to monitor tumor evolution in patients over time, and to study their response to therapy. To interpret data from such studies, there is a critical need for visualization designed to aid researchers in visualizing and exploring temporal patterns within a single patient and across an entire patient cohort.
In a week-long design sprint, we identified the requirements for a longitudinal cancer genomics visualization tool by interviewing potential end users. This also included the implementation and testing of a first prototype. Inspired by the design of the Domino technique (Gratzl et al., 2014), our approach is based on heterogeneous heatmaps representing patient samples (columns) at different timepoints (blocks). Each sample can be represented by various user-defined variables such as mutation count in genes of interest, or measurements from other assays. Between timepoint blocks, users can add treatment blocks representing information about drug regimens. Within each block, samples can be grouped to show proportions of patients with a particular attribute. With this grouping, the heatmap can be transformed iteratively to a Sankey diagram.
We will report on our visualization approach and insights from our design process.
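As a rough illustration of how grouped samples at consecutive timepoints can feed a Sankey diagram, the sketch below counts how many patients move from one group to another between adjacent timepoint blocks. The data layout (one dictionary per timepoint mapping a patient identifier to a category) and the function name `sankey_links` are assumptions made for this example and do not describe the prototype's actual data model.

```python
from collections import Counter


def sankey_links(timepoints):
    """Count patient flows between groups at consecutive timepoints.

    `timepoints` is a list of dicts {patient_id: category}, ordered in time.
    Returns one Counter per adjacent pair, keyed by (earlier, later) category.
    Hypothetical helper for illustration only.
    """
    links = []
    for earlier, later in zip(timepoints, timepoints[1:]):
        # Only patients sampled at both timepoints contribute a flow.
        shared = earlier.keys() & later.keys()
        links.append(Counter((earlier[p], later[p]) for p in shared))
    return links


# Toy cohort: mutation burden category before and after treatment.
timepoints = [
    {"P1": "high", "P2": "high", "P3": "low"},
    {"P1": "low", "P2": "high", "P3": "low"},
]
links = sankey_links(timepoints)
```

Each counter entry corresponds to one band of the Sankey diagram, with the count giving the band's width.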

5:50 PM-6:00 PM
BioVis: Closing Remarks
Room: Columbus IJ
  • Kay Nieselt, University of Tübingen, Germany