Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide


Posters

Poster Categories
Poster Schedule
Preparing your Poster - Information and Poster Size
How to mount your poster
Print your poster in Basel

View Posters By Category

Session A: (July 22 and July 23)
Session B: (July 24 and July 25)

Presentation Schedule for July 22, 6:00 pm – 8:00 pm

Presentation Schedule for July 23, 6:00 pm – 8:00 pm

Presentation Schedule for July 24, 6:00 pm – 8:00 pm

Session A Poster Set-up and Dismantle
Session A Posters set up: Monday, July 22 between 7:30 am - 10:00 am
Session A Posters should be removed at 8:00 pm, Tuesday, July 23.

Session B Poster Set-up and Dismantle
Session B Posters set up: Wednesday, July 24 between 7:30 am - 10:00 am
Session B Posters should be removed at 2:00 pm, Thursday, July 25.

B-01: Addressing Scientific Reproducibility with Data Sharing Platforms: INTERVALS, a case study
COSI: BioVis COSI
  • Sylvain Gubian, Philip Morris International, Switzerland
  • Manuel C. Peitsch, Philip Morris International R&D, Switzerland
  • Stéphanie Boué, Philip Morris International R&D, Switzerland
  • Julia Hoeng, Philip Morris International R&D, Switzerland
  • Adrian Stan, Philip Morris International R&D, Switzerland

Short Abstract: To address concerns about reproducibility, INTERVALS, an online platform created for scientists by Philip Morris International R&D, was built using the latest standards in data sharing and reproducible research. INTERVALS is driven by the idea of proactively sharing protocols, computational tools, and data from assessment studies. In a single place, it shares results, software, raw data files, and detailed information on the design and protocols used in studies. This facilitates the review process and, at the same time, allows the data to be reused to generate and test new hypotheses. We believe that these traits enhance the transparency of the scientific process several-fold and accelerate scientific research. The scientific community is invited to use the portal to publish their own studies and to share their datasets and results. In this context, the poster addresses live visualizations on data sharing platforms. Using a data warehouse environment, we implemented a RESTful API for visualizing, with the Plotly framework, data published on the INTERVALS platform. We discuss the main difficulties in implementing visualizations on data sharing platforms and present specific examples of online exploration of raw data and of descriptive and inferential statistics.

B-02: Advanced Visualization of Data Comparisons with BiocompR
COSI: BioVis COSI
  • Yoann Pageaud, Division of Cancer Epigenomics, German Cancer Research Center (DKFZ), Germany

Short Abstract: The increasing amount of omics data generated over the last decade has made it necessary to find standardized ways to analyse and represent them. The R programming language has quickly established itself as the reference for statistical analysis and graphics for omics data, jump-started by the Tidyverse packages. Among those, ggplot2 has brought scientists a new grammar for plotting graphs using combinations of independent components. However, there are limited options available for visualising multi-dimensional data. To address this, we introduce BiocompR, an R package built upon ggplot2. BiocompR improves commonly used plots dedicated to data comparison and dataset exploration, and ultimately provides users with versatile and customizable graphics. In addition, we describe craviola plots, which are split and binned violin plots; fused plots, which combine pairwise visualisations of two distinct metrics; and sunset plots, which offer an elegant solution for visualizing multiple variables.

B-03: BioCicle: A Tool for Summarizing and Comparing Taxonomic Profiles out of Biological Sequence Alignments
COSI: BioVis COSI
  • Meili Vanegas-Hernandez, Universidad de los Andes, Colombia
  • Fabio Andres Lopez-Corredor, Universidad de los Andes, Colombia
  • Tiberio Hernandez, Universidad de los Andes, Colombia
  • Alejandro Reyes, Universidad de los Andes, Colombia
  • John Alexis Guerra-Gomez, Northeastern University Silicon Valley, Colombia

Short Abstract: Biological sequence comparison is a crucial step in identifying and cataloging new sequences. To achieve this, computational biologists must compare a new sequence against ever-growing biological databases. This comparison produces a myriad of results, from which extracting useful information is very costly given the lack of tools that provide an overview of the results. Moreover, poor comparison analysis can lead to new sequences being cataloged incorrectly. This project is the outcome of a close collaboration with domain experts and a thorough study of the state of the art. As a result, six analysis tasks commonly performed by bioinformaticians were identified. Each task consists of either summarizing (for single-sequence results) or comparing (for multiple-sequence results) one of the following: regions of interest (AT1), taxonomic reports (AT2), and sequence descriptions (AT3). A user test was carried out with a group of computational biologists, which allowed us to evaluate both the usability and the usefulness of the platform. In brief, this project presents a taxonomy of analysis tasks, a visualization design for AT2a and AT2b, a dummy use case for a subset of a real metagenomics result set, and an open source prototype (available at http://54.208.29.57) presented as a proof of concept.

B-04: GeneDMRs: an R package for Gene-based Differentially Methylated Regions analysis
COSI: BioVis COSI
  • Xiao Wang, Technical University of Denmark, Denmark
  • Haja Kadarmideen, Technical University of Denmark, Denmark

Short Abstract: Calculating methylation levels within a single gene can help investigate and identify Gene-based Differentially Methylated Regions (GeneDMRs). Such GeneDMRs are better than single differentially methylated cytosines (DMCs), as they provide an overall gene methylation profile. The mean methylation of a gene in one treatment group is defined as $\sum_{i=1}^{n} \frac{\sum_{j=1}^{m} MR_{ij}}{\sum_{j=1}^{m} TR_{ij}} \cdot W_{ij}$ with $W_{ij} = \frac{\sum_{j=1}^{m} TR_{ij}}{\sum_{i=1}^{n} \sum_{j=1}^{m} TR_{ij}}$, where $MR_{ij}$ and $TR_{ij}$ are the methylated and total read numbers of the involved CpG/DMC $j$ at a given gene of individual $i$, $n$ is the total number of individuals in one treatment group, $m$ is the total number of CpGs/DMCs involved in this gene, and $W_{ij}$ is the weight of reads. GeneDMRs are defined by comparisons across different treatment groups following the logistic regression model $\ln\left(\frac{\pi_i}{1-\pi_i}\right) = u + \beta T_i + e$, where $\pi_i$ is the mean methylation and $T_i$ is the treatment. GeneDMRs is a user-friendly package that can easily output the required results and figures, such as mean methylation levels of all genes and CpG islands, or of a specific gene or promoter/exon/intron region, with the corresponding boxplot, heat map and correlation matrix, as well as GO terms/pathways in hyper-/hypo-methylated categories. As more features for GeneDMRs are being added, the current offline GeneDMRs package is available from the authors.
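The read-weighted mean methylation defined in this abstract can be illustrated with a small worked example. The sketch below is not taken from the GeneDMRs package: the function name, the input layout (one row of read counts per individual, one column per CpG), and the numbers are assumptions made purely for illustration.

```python
def gene_mean_methylation(mr, tr):
    """Read-weighted mean methylation of one gene in one treatment group.

    mr[i][j]: methylated reads of CpG j in individual i (hypothetical layout)
    tr[i][j]: total reads of CpG j in individual i
    """
    total_reads = sum(sum(row) for row in tr)
    if total_reads == 0:
        raise ValueError("no reads cover this gene")
    mean = 0.0
    for mr_row, tr_row in zip(mr, tr):
        tr_i = sum(tr_row)
        if tr_i == 0:
            continue  # an individual without coverage gets weight zero
        ratio_i = sum(mr_row) / tr_i      # methylation level of individual i
        weight_i = tr_i / total_reads     # share of the group's reads
        mean += ratio_i * weight_i
    return mean


# Two individuals, two CpGs (made-up read counts).
mr = [[8, 2], [5, 5]]
tr = [[10, 10], [10, 30]]
print(gene_mean_methylation(mr, tr))
```

With these numbers, individual 1 contributes 0.5 with weight 1/3 and individual 2 contributes 0.25 with weight 2/3, so the result is 1/3. Because each weight is proportional to the individual's total reads, the weighted sum equals total methylated reads divided by total reads (20/60).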

B-05: A network-based approach for visualization and simplification of single-cell RNA sequencing data
COSI: BioVis COSI
  • Mariia O. Bilous, Université de Lausanne, Switzerland

Short Abstract: Single-cell RNA sequencing (scRNA-seq) is a powerful tool for studying heterogeneous cell populations. In particular, it makes it possible to perform unsupervised and unbiased clustering of cells based on their gene expression profiles. Therefore, scRNA-seq can potentially reveal novel subpopulations of cells, their intermediate differentiation states, new cell type markers, and developmental trajectories. However, together with all of these advantages and promises, it also brings computational challenges related to representing and studying such high-dimensional data. Here we propose an alternative way to visualize and denoise scRNA-seq data with the help of networks. This approach is based on connecting cells that share high transcriptional similarity and building a graph. Using random walks to compute connectivity among cells, we are able to build a graph that better recapitulates the clustering of scRNA-seq data than the widely used graphs based on k-nearest neighbors. In addition, we demonstrate that networks can be used to simplify scRNA-seq data by grouping similar cells into super-cells while preserving most of the information in the initial data. Such simplification can reduce the noise of scRNA-seq data and reveal stable and intermediate states of blood cells undergoing development.

B-06: GenesetPCA for multi-variate model interpretation
COSI: BioVis COSI
  • Rachel Cavill, Maastricht University, Netherlands
  • Nordine Aouni, Maastricht University, Netherlands
  • Luc Linders, Maastricht University, Netherlands
  • David Robinson, Maastricht University, Netherlands
  • Len Vandelaer, Maastricht University, Netherlands
  • Jessica Wiezorek, Maastricht University, Netherlands

Short Abstract: Principal Component Analysis (PCA) and other multi-variate models are often used in the analysis of transcriptomics data. These models contain much information which is not currently easily accessible or interpretable. We have produced a Graphical User Interface (GUI) in Matlab which allows the overlay of geneset information onto the PCA loadings plot and thus improves the interpretability of the PCA model. For each geneset the optimal convex hull, covering a subset of genes from the geneset, is found and displayed. The significance of each geneset is calculated using several measures of significance. Each of these measures, along with other geneset features such as size and percentage of genes from the geneset which were covered by the optimal convex hull, are represented on sliders, which allow the user to adjust the visible genesets overlaid in the loadings plot. In addition to PCA models the software can also accept pre-calculated loadings from other multivariate models, for instance Partial Least Squares (PLS). The software is available for download and is free for academic use.

B-07: Nightingale: A library of re-usable data visualisation components
COSI: BioVis COSI
  • Xavier Watkins, European Bioinformatics Institute, United Kingdom
  • Daniel Rice, European Bioinformatics Institute, United Kingdom
  • Aurelien Luciani, European Bioinformatics Institute, United Kingdom
  • Gustavo A Salazar, InterPro, EMBL-EBI, United Kingdom

Short Abstract: “Nightingale” is a library of reusable data visualisation components which provides tools to display protein features (ProtVista), protein interactions and 3D structures, with many more components to come. Initially developed as an initiative of EMBL-EBI developers and now adopted by UniProt, InterPro, PDBe, and soon Open Targets, it is implemented using established web standards (web components) and designed with flexibility in mind. These components can easily be added to any web resource, allowing users to display data from the Proteins API as well as their own APIs.

B-08: Evidente – A visual analytics tool for data enrichment in SNP-based phylogenetic trees
COSI: BioVis COSI
  • Mathias Witte Paz, University of Tübingen, Center for Bioinformatics, Germany
  • Alexander Seitz, University of Tübingen, Center for Bioinformatics, Germany
  • Kay Nieselt, Center for Bioinformatics Tübingen, University of Tübingen, Germany

Short Abstract: In recent years, developments in next-generation sequencing technologies have enabled genome resequencing projects covering many individuals within one species. The genomes are often analysed with respect to single-nucleotide polymorphisms (SNPs) or small indels. This makes it possible to reconstruct a phylogenetic tree of all individuals based on the detected mutations. Given such a tree, a common task is to identify clade-specific SNPs within the reconstructed phylogeny, i.e. SNPs that support the computed topology. One then often wishes to analyse these mutations in more detail, for example to retrieve the functional consequences that a SNP may have on the organism or to compute the enrichment of certain features within the phylogenetic tree. Here, we present ongoing work on the visual analytics tool Evidente for the annotation and analysis of metadata in SNP-based phylogenetic trees. Besides visualizing a phylogenetic tree, Evidente gives the user a visual overview of the distribution of SNPs across all samples as well as of clade-specific SNPs within the tree. Furthermore, Evidente allows the user to run an enrichment analysis, for example for Gene Ontology (GO) annotations.

B-09: Cliques of single-cell RNA-seq profiles reveal insights into cell ecology during development and differentiation
COSI: BioVis COSI
  • Baihan Lin, Columbia University, United States

Short Abstract: The lack of a formal link between cell-cell cohabitation and its emergent dynamics into cliques during development has hampered our understanding of how cell populations proliferate, differentiate, and compete, i.e. the cell ecology. With the advancement of single-cell RNA sequencing (RNA-seq), we have now come closer to describing such a link by taking cell-specific transcriptional programs into account, constructing network graphs that reflect the similarity of gene expression, and analyzing these graphs using algebraic topology. Applying this approach to single-cell gene expression profiles from local networks of cells in different developmental stages with different outcomes revealed a previously unseen topology of cellular ecology. These networks contain an abundance of cliques of single-cell profiles bound into cavities that guide the emergence of more complicated habitation forms. We can visualize these ecological patterns through the topological simplicial architectures of these networks, compared with random graphs. Benchmarked on single-cell RNA-seq data of zebrafish embryogenesis covering 25 cell types and 12 time steps, our approach highlights gastrulation as the most critical stage, consistent with the consensus in developmental biology. As a nonlinear, model-independent, and unsupervised framework, our approach can also be applied to tracing multi-scale cell lineage, identifying critical stages, or creating pseudo-time series.

B-10: Cellxgene: enabling performant, interactive exploration of single-cell transcriptomics data
COSI: BioVis COSI
  • Sidney Bell, Chan Zuckerberg Initiative, United States
  • Jeremy Freeman, Chan Zuckerberg Initiative, United States
  • Fiona Griffin, Chan Zuckerberg Initiative, United States
  • Genevieve Haliburton, Chan Zuckerberg Initiative, United States
  • Justin Kiggins, Chan Zuckerberg Initiative, United States
  • Bruce Martin, Chan Zuckerberg Initiative, United States
  • Colin Megill, Chan Zuckerberg Initiative, United States
  • Charlotte Weaver, Chan Zuckerberg Initiative, United States

Short Abstract: Single-cell transcriptomics holds great promise for furthering our understanding of cellular characteristics in health and disease. However, these datasets often consist of millions of cells (observations) and thousands of genes (features). Enabling all scientists to explore and utilize these large, high-dimensional datasets requires innovative computational approaches to analysis and visualization. Here, we present cellxgene, a highly performant visualization tool for interactive exploration of single-cell transcriptomics data. Cellxgene features a rich, user-friendly interface that enables scientists to explore, select, and annotate their data based on metadata, low-dimensional embeddings, and analytical outputs such as pseudotime and cluster assignments. Cellxgene is developed using a modern web stack and highly performant architecture to enable visualization of datasets of up to 1 million cells. While cellxgene was developed for single-cell transcriptomics data, its design and infrastructure are largely domain-agnostic, enabling potential application to other high-dimensional data types. With this project, we hope to both enable biologists to collaborate and explore their data, and to demonstrate general, scalable, and reusable patterns for scientific data visualization.

B-11: iBioProVis: Interactive Visualization and Analysis of Compound Bioactivity Space
COSI: BioVis COSI
  • Ahmet Sureyya Rifaioglu, Middle East Technical University, Turkey
  • Maria Jesus Martin, EMBL-EBI, United Kingdom
  • Ataberk Donmez, Middle East Technical University, Turkey
  • Aybar Can Acar, Middle East Technical University, Turkey
  • Rengül Atalay, Middle East Technical University, Turkey
  • Tunca Dogan, European Bioinformatics Institute, Turkey
  • Mehmet Volkan Atalay, Middle East Technical University, Turkey

Short Abstract: Visualization and interpretation of the high-dimensional chemical compound and target space is critical for a better understanding of the mechanisms of the bioactivity space and of the drug discovery process. Here, we describe iBioProVis, which projects and visualizes compounds in 2D space based on their structural features in the context of their cognate targets. The inputs are pairs of ChEMBL target identifiers and the output is the 2D projection plot of the active compounds of the input targets. By looking at the distribution of compounds (i.e., points) in a projection, the user can infer that compounds that are close to each other may possess similar binding characteristics. An additional feature is that the user can also provide a list of SMILES strings as input. In this way, the user can observe the projection of these compounds along with the projections of previously reported active compounds of the selected targets. iBioProVis provides an interactive environment where users can select different compounds and retrieve information about them. iBioProVis also provides cross-references to well-known databases so that users can easily relate the entities and navigate to those databases via clickable links. iBioProVis is freely available at http://ibioprovis.kansil.org/.

B-12: The gene Expression and Analysis Resource (gEAR) Portal
COSI: BioVis COSI
  • Joshua Orvis, University of Maryland School of Medicine - Institute for Genome Sciences, United States
  • Brian Gottfried, University of Maryland School of Medicine - Institute for Genome Sciences, United States
  • Dustin Olley, University of Maryland School of Medicine - Institute for Genome Sciences, United States
  • Jayaram Kancherla, University of Maryland, United States
  • Beatrice Milon, University of Maryland School of Medicine, United States
  • Kevin Rose, University of Maryland School of Medicine, United States
  • Yang Song, University of Maryland School of Medicine - Institute for Genome Sciences, United States
  • Anup Mahurkar, University of Maryland School of Medicine - Institute for Genome Sciences, United States
  • Ronna Hertzano, University of Maryland School of Medicine, United States
  • Hector Corrada Bravo, University of Maryland, College Park, United States

Short Abstract: The gEAR portal (umgear.org) is an online tool for multi-omic and multi-species data visualization, sharing, and analysis. Originally designed for auditory researchers, the gEAR portal has now been expanded for general use. The gEAR is unique in its ability to allow users to upload, view and analyze their own data in the context of previously published datasets, as well as to share their data confidentially with collaborators prior to publication. It is also unique in combining not only multiple species but also multiple data types, including bulk RNA-seq, sorted cell RNA-seq, single cell RNA-seq (scRNA-seq) and epigenomics, in a one-page, user-friendly, browsable format. Individual expression datasets can be displayed alongside each other in a variety of ways, including interactive bar, line or violin plots, colorized anatomical SVGs, and tSNE and PCA plots. We have integrated a scRNA-seq workbench into the gEAR which provides access both to the raw data of scRNA-seq datasets and to saved expert analyses in which cell types have already been assigned, giving researchers rapid insight into gene expression in their cell type of interest. This presentation functions as a step-by-step introduction to the gEAR portal, now a mainstream multi-omic data source for the ear research community.

B-13: Interoperable web-based data visualisation components for the future: Working towards BioJS 3.0
COSI: BioVis COSI
  • Dennis Schwartz, Repositive, United Kingdom
  • Yochannah Yehudi, Department of Genetics, University of Cambridge, United Kingdom

Short Abstract: Visualisation is a fundamental part of presenting biological data in a clear and understandable manner. BioJS has been providing a library and directory of reusable and interoperable JavaScript components for data visualisation on the web since 2014 and has grown to include more than 200 modules by hundreds of collaborators. While some of the registered components are widely used, such as Cytoscape.js and the UniProt ProtVista component, many are more niche. Following our goal to become the most comprehensive, reusable and easy-to-use directory of browser-based visualisation tools for biological data, we report progress and changes to the BioJS core infrastructure and team over recent years and propose an interface for the next generation of BioJS components (BioJS 3.0) based on web component technology. We present a completely rebuilt and redesigned website and component registry, a badge system to encourage improved interoperability and coding standards, as well as draft details for BioJS 3.0, with the aim of lowering the barrier to entry for non-programmers.

B-14: ImmuneRegulation: A web based tool for identifying human immune regulatory elements
COSI: BioVis COSI
  • Selim Kalayci, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, United States
  • Myvizhi Esai Selvan, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, United States
  • Irene Ramos, Department of Microbiology and Global Health & Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, United States
  • Chris Cotsapas, Department of Neurology, Yale University, United States
  • Eva Harris, Division of Infectious Diseases and Vaccinology, School of Public Health, University of California, Berkeley, United States
  • Ruth R. Montgomery, Section of Rheumatology, Department of Internal Medicine, Yale School of Medicine, United States
  • Gregory Poland, Mayo Clinic, United States
  • Bali Pulendran, Emory Vaccine Center/Yerkes National Primate Research Center at Emory University, United States
  • John S. Tsang, Multiscale Biology Section, Laboratory of Immune System Biology, NIAID & NIH Center for Human Immunology, NIH, United States
  • Robert J. Klein, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, United States
  • Zeynep H. Gümüş, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, United States

Short Abstract: Humans vary considerably in their healthy immune phenotypes and in their immune responses to various stimuli. Recent high-throughput studies are contributing to an improved understanding of immune cell function and regulation. Extensive datasets are publicly available through multiple large consortiums, including the Human Immunology Project Consortium, which includes measurements of the healthy and activated human immune system, coupled with detailed clinical phenotyping in well-characterized cohorts. However, there is currently no central resource to interactively explore this wealth of data. We developed a user-friendly open-access web portal, ImmuneRegulation, that enables users to interactively explore immune regulatory elements. ImmuneRegulation currently provides the largest centrally integrated resource on human transcriptome regulation across whole blood and blood cell types, including (i) ~43,000 genotyped individuals with associated gene expression data from ~51,000 experiments, yielding associations on ~220 million eQTLs; (ii) 14 million transcription factor binding region hits extracted from 1,945 ChIP-seq studies; and (iii) the latest GWAS catalog with 67,230 published variant-trait associations. In its front-end, a visually intuitive web interface enables querying, browsing and interaction with large volumes of data, including user-supplied data. For the queried gene(s), visual, interactive summaries of regulatory elements are returned to help explore and communicate results. ImmuneRegulation is available at https://icahn.mssm.edu/immuneregulation.

B-15: Proactive Visual and Statistical Analysis of Genomic Data in Epiviz
COSI: BioVis COSI
  • Jayaram Kancherla, University of Maryland, United States
  • Zhe Cui, University of Maryland, United States
  • Niklas Elmqvist, University of Maryland, United States
  • Hector Corrada Bravo, University of Maryland, College Park, United States

Short Abstract: Integrative analysis of genomic data that combines statistical methods with visual exploration has gained widespread adoption. Many existing methods involve a combination of tools and resources: user interfaces that provide visualization of large genomic datasets, and computational environments that focus on data analyses over various subsets of a given dataset. Over the last few years, we have developed Epiviz as an integrative and interactive genomic data analysis tool that tightly couples visualization with a state-of-the-art statistical analysis framework. We present a proactive and automatic visual analytics system integrated with Epiviz that alleviates the burden of manually executing the data analysis required to test biologically meaningful hypotheses. Results of potential interest that are proactively identified by server-side computations are listed as notifications in a feed. The feed turns genomic data analysis into collaborative work between the analyst and the computational environment, which shortens the analysis time and allows the analyst to explore results efficiently. This effort provides initial work on systems that substantially expand how computational and visualization frameworks can be tightly integrated to facilitate interactive genomic data analysis.

B-16: Flud: a hybrid crowd-algorithm approach for visualizing biological networks
COSI: BioVis COSI
  • T. M. Murali, Virginia Tech, United States
  • Aditya Bharadwaj, Virginia Tech, United States
  • David Gwizdala, Bridgewater Associates, United States
  • Yoonjin Kim, Virginia Tech, United States
  • Kurt Luther, Virginia Tech, United States

Short Abstract: Network biologists use graphs to understand the protein interactions that underlie processes that take place in the cell. In order to present and analyze these graphs, researchers require aesthetic layouts that clearly convey the relevant biological information. However, the problem remains challenging due to multiple conflicting aesthetic criteria and complex domain-specific constraints. In this research, we have developed Flud, an online game with a purpose (GWAP) that allows humans with no expertise to design biologically meaningful graph layouts with the help of algorithmically generated suggestions. The goal of the players is to create a layout that optimises a weighted score based on previously defined aesthetic considerations and a new biologically inspired criterion: “maximize the number of downward pointing.” Further, we propose a novel hybrid approach for graph layout wherein crowd workers and a simulated annealing algorithm build on each other's progress. To showcase the effectiveness of Flud, we recruited crowd workers on Amazon Mechanical Turk to lay out complex protein networks that represent signaling pathways. Our results show that the proposed hybrid approach outperforms state-of-the-art techniques for graphs with a large number of feedback loops.

B-17: Density-Preserving Visualization of Single Cells
COSI: BioVis COSI
  • Bonnie Berger, Massachusetts Institute of Technology, United States
  • Ashwin Narayan, Massachusetts Institute of Technology, United States
  • Hyunghoon Cho, Massachusetts Institute of Technology, United States

Short Abstract: Exploratory analysis of single-cell omics datasets begins with visualizing the data in low dimensions to reveal structural insights that can be probed in future experiments. For these insights to be biologically meaningful, it is crucial that the visualizations be as faithful to the source dataset as possible. However, as we demonstrate both theoretically and empirically, widely used methods for single-cell data visualization, including t-SNE and UMAP, entirely ignore local density information, leading to visualizations where the sizes and densities of clusters have no bearing on the true transcriptional variability of the underlying cell states. To address this problem, we present density-aware t-SNE (da-SNE), which produces visualizations whose density landscape better correlates with that of the source dataset. We achieve this property by incorporating a differentiable metric for the density of a point in a dataset into the t-SNE objective function to obtain a joint optimization problem. On simulated and real datasets, we demonstrate that da-SNE not only allows researchers to accurately compare cluster sizes and densities in the visualizations, but also represents overlapping clusters more faithfully and is more robust to parameter selection than t-SNE. Our approach is broadly applicable to other data science domains where the local density of data encodes valuable information.

B-18: Visualising Variant Data on 3D Structure Using Jalview
COSI: BioVis COSI
  • Mungo Carstairs, University of Dundee, United Kingdom
  • James Procter, University of Dundee, United Kingdom
  • Geoff Barton, University of Dundee, United Kingdom

Short Abstract: As the volume of annotated genomic variant data continues to grow, so does the challenge of visualisation and interpretation across multiple levels of biological organisation (DNA, coding sequence, protein sequence and alignments, secondary and 3D structure). The Jalview alignment workbench (www.jalview.org) now supports parsing, display and analysis of variant data in VCF format (text or tab-indexed), using the htsjdk Open Source library. VEP (variant consequence) data is also parsed if present. Jalview employs BioJava's Sequence Ontology API to allow filters and visual encodings to be applied at different levels of the SO hierarchy. Variants can be rendered on sequences with simple colours, or shaded according to numerical attributes such as allele frequency. Filters allow complex queries to be created: for instance, “SIFT score is ‘deleterious’ or PolyPhen score is ‘damaging’”. Visual and filter settings can be exported and re-applied to different datasets, interactively or via Jalview's command line. Genomic variant features may be shown at corresponding positions in encoded proteins, and on available 3D structure. Protein features (such as domain or binding site) are also mapped back to codons on coding sequences. These capabilities support rapid visual exploration of de novo variant data, helping Jalview users gain insight into functional impact.

B-19: ExploSig: Hypothesis-driven Exploration of Mutation Signature Etiology
COSI: BioVis COSI
  • Mark Keller, University of Maryland, United States
  • Welles Robinson, University of Maryland, United States
  • Mark Leiserson, University of Maryland, United States

Short Abstract: Mutational processes operative in tumors leave signatures of single-base-substitutions and small insertions/deletions. Since the development of computational methods for extraction of mutation signatures from cancer genomes, researchers have validated 30 signatures. Half of these signatures have been attributed to known factors, with implications for cancer prevention, detection, and personalized medicine. Uncovering etiologies of the remaining signatures has become an urgent priority. Individual studies have associated elevated signature exposure with tumor-level features (tobacco smoking, BRCA1 mutations) in piecemeal fashion and required computational sophistication, limiting the ability of DNA damage and repair experts to test their hypotheses on signature etiology. We present a web-based visualization tool, ExploSig, (https://explosig.lrgr.io) for analyzing mutation signatures and tumor-level features, generalizing the piecemeal approach for characterizing signature etiology. Informed by HCI principles, ExploSig allows users to perform core interaction tasks. Experts in particular damage/repair pathways can look for associations between signatures and genes via mutation, expression, and CNA visualization. ExploSig creates reproducible workflows, enabling saving and sharing. Findings associating particular signatures with tumor-level features can be reproduced, enabled by our efforts to collect/standardize datasets containing clinical/demographic variables for over 10,000/2,500 exomes/genomes. We anticipate ExploSig being useful for investigating etiologies of those signatures that remain unknown.

B-20: A CUSTOMIZED FORCE-DIRECTED LAYOUT ALGORITHM WITH GENETIC ALGORITHM TECHNIQUES FOR BIOLOGICAL GRAPHS
COSI: BioVis COSI
  • Fırat Aksoydan, Middle East Technical University, Turkey
  • Rengül Atalay, Middle East Technical University, Turkey
  • Mehmet Volkan Atalay, Middle East Technical University, Turkey

Short Abstract: A pathway can be visualized as a graph whose layout is drawn by a force-directed algorithm. In our previous study, we described EClerize, a customized and improved Kamada-Kawai force-directed algorithm for visualizing pathways that contain nodes with EC numbers as attributes. EClerize creates clusters of nodes with enzymes that belong to the same EC class. Here, we use a genetic algorithm to obtain a global optimum solution for EClerize, and we integrate undirected graph layout drawing with a genetic algorithm. To provide diversity, 6 techniques are employed in the mutation phase and 2 techniques in crossover. In mutation, vertices of a selected graph are moved randomly within a limited area, or selected edges/vertices are exchanged according to the routines of a technique. In crossover, the operation of exchanging vertices is performed between two selected graphs. In each iteration, fitness values of individuals are calculated by 6 different fitness measurements ranging from edge crossing number to drawing area. Overall relative fitness values are used to choose parent individuals. We have applied this method to 3 pathways, and the results are better than those of the base study, with a reasonably longer execution time.
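
The mutation, crossover, and fitness steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it shows one mutation operator (moving a single vertex within a limited radius), one crossover operator (swapping the positions of half the vertices between two layouts), and one fitness measure (edge crossing count). All function names and the `radius` parameter are hypothetical.

```python
import random


def segments_cross(p, q, r, s):
    """Return True if segments pq and rs properly intersect."""
    def orient(a, b, c):
        # Sign of the cross product (b - a) x (c - a).
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (orient(p, q, r) * orient(p, q, s) < 0
            and orient(r, s, p) * orient(r, s, q) < 0)


def edge_crossings(layout, edges):
    """Fitness measure: number of edge pairs that cross (lower is better)."""
    count = 0
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            (a, b), (c, d) = edges[i], edges[j]
            if len({a, b, c, d}) < 4:
                continue  # edges sharing a vertex are not counted
            if segments_cross(layout[a], layout[b], layout[c], layout[d]):
                count += 1
    return count


def mutate(layout, rng, radius=1.0):
    """Move one randomly chosen vertex within a limited square area."""
    child = dict(layout)
    v = rng.choice(sorted(child))
    x, y = child[v]
    child[v] = (x + rng.uniform(-radius, radius),
                y + rng.uniform(-radius, radius))
    return child


def crossover(a, b, rng):
    """Exchange the positions of half of the vertices between two layouts."""
    nodes = sorted(a)
    chosen = set(rng.sample(nodes, len(nodes) // 2))
    child_a = {v: (b[v] if v in chosen else a[v]) for v in nodes}
    child_b = {v: (a[v] if v in chosen else b[v]) for v in nodes}
    return child_a, child_b
```

A layout here is a dict mapping each vertex to an `(x, y)` position, so an individual in the population is one complete drawing of the graph. Passing a seeded `random.Random` instance as `rng` keeps runs repeatable.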

B-21: Enhancing gene set enrichment using networks
COSI: BioVis COSI
  • Michael Prummer, NEXUS Personalized Health Technologies, ETH Zurich, Switzerland

Short Abstract: Differential gene expression (DGE) studies often suffer from poor interpretability of their primary results, i.e., thousands of differentially expressed genes. This has led to the introduction of gene set analysis (GSA) methods that aim at identifying interpretable effects by grouping genes into sets of high-level context, such as molecular pathways, biological function or cell type composition. In practice, GSA often results in hundreds of differentially regulated gene sets. Similar to the genes they contain, gene sets are regulated in a correlative fashion because they share many of their members or they describe related processes. Using this kind of neighborhood information to construct networks of gene sets makes it possible to differentiate large, highly connected general-purpose clusters from smaller, more specialized gene set archipelagos. Singletons, i.e., gene sets with negligible overlap with any other regulated set, are likely to represent either new biology or false positives. We show here how topological information and other network features can be used to filter and prioritize gene sets in routine DGE studies. Community detection in combination with automatic labeling and the network representation of gene set clusters further constitute an appealing and intuitive visualization of GSA results beyond saw-tooth plots of enrichment scores.
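
The idea of building a network from shared gene set members and then separating large clusters from singletons can be sketched as follows. This is an illustration under assumptions the abstract does not state: overlap is measured with the Jaccard index, an edge is drawn at an arbitrary threshold of 0.25, and plain connected components stand in for the community detection step. All names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_network(gene_sets, threshold=0.25):
    """Connect two gene sets when their Jaccard overlap reaches the threshold."""
    adjacency = {name: set() for name in gene_sets}
    for u, v in combinations(gene_sets, 2):
        if jaccard(gene_sets[u], gene_sets[v]) >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def clusters(adjacency):
    """Connected components of the network, largest first."""
    seen, found = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        found.append(component)
    return sorted(found, key=len, reverse=True)


def singletons(adjacency):
    """Gene sets with no neighbour, i.e. negligible overlap with any other set."""
    return sorted(name for name, nbrs in adjacency.items() if not nbrs)
```

With this representation, a cluster's size and a node's degree are directly available as simple topological features for filtering or ranking gene sets.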

B-22: Advances in Neuronal Morphology Analysis, Meshing and Visualization with NeuroMorphoVis
COSI: BioVis COSI
  • Marwan Abdellah, Blue Brain Project / EPFL, Switzerland
  • Samuel Lapere, Ecole Polytechnique Fédérale de Lausanne, Switzerland
  • Felix Schürmann, Ecole Polytechnique Fédérale de Lausanne, Switzerland
  • Henry Markram, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, Switzerland

Short Abstract: NeuroMorphoVis is an interactive framework for reconstruction, visualization and analysis of digital neuronal morphologies segmented from optical microscopy stacks. This poster highlights the recent features integrated in NeuroMorphoVis and demonstrates their applicability with several use cases and applications in in silico neuroscience. A fast sketching method, using the Blender mesh editing API (bmesh), is added to plot the skeleton as a connected list of samples, in order to highlight and repair artifacts in the morphology that are either reported by the morphology analysis or noticed manually. The analysis module is redesigned to improve its extensibility, allowing users to define a kernel function and have it seamlessly applied to each neurite individually and globally to the entire skeleton. Three novel meshing methods are integrated in the framework to create various mesh models of the neurons with different quality and level of detail. They produce meshes for several applications, including machine learning, skeletonization, in silico imaging and visualization of electrophysiological simulations at various scales. New materials are designed and added to create high-quality renderings of the neurons. All the features are released at https://github.com/BlueBrain/NeuroMorphoVis.

B-23: Developing web tools for mass cytometry data analysis
COSI: BioVis COSI
  • Gabor Beke, Institute of Molecular Biology SAS, Bratislava, Slovakia, Slovakia
  • Lubos Klucar, Institute of Molecular Biology SAS, Bratislava, Slovakia, Slovakia
  • Dana Cholujova, Cancer Research Institute, Biomedical Research Center, SAS, Bratislava, Slovakia, Slovakia
  • Jana Jakubikova, Cancer Research Institute, Biomedical Research Center, SAS, Bratislava, Slovakia, Slovakia

Short Abstract: Mass cytometry is a new technology based on inductively coupled plasma and time-of-flight mass spectrometry. Bioinformatics analysis of high-throughput cytometry data requires sophisticated algorithms like SPADE, which can process multidimensional data and display them on two-dimensional plots. We analysed 84 protein markers from multiple myeloma (MM) and Waldenström macroglobulinemia (WM) cells with the SPADE algorithm. To visualize the SPADE results and for further downstream analyses, we developed a dynamic web portal using R and its additional packages (among others shiny, visNetwork, DT). visNetwork allowed us to visualize the SPADE results in real time (e.g. moving the SPADE tree nodes) and identify different cell populations on the SPADE trees. An automated script merges and normalizes the number of cells and markers' intensities in the previously identified cell populations. Results can be visualized in different ways: SPADE trees, heat maps, dot plots and box-and-whisker plots. We modified the SPADEVizR R package to statistically compare two or more groups (based on various biological and/or clinical data). This dynamic web portal helps to identify differences between healthy and tumour cells and between individual stages of the cancer diseases. This work was supported by grant APVV-16-0484 and VEGA grant 2/0076/17.

B-24: The nd_scatter widget in Jupyter and other web contexts
COSI: BioVis COSI
  • Aaron Watters, Simons Foundation, United States

Short Abstract: Many systems biology studies derive datasets with hundreds or thousands of dimensions. For example, microbiome assays can produce relative abundance measurements for thousands of microorganisms, and ATAC-seq (assay for accessible chromatin) or ChIP-seq (chromatin immunoprecipitation) assays can produce gene accessibility and expression levels for hundreds of genes. This poster presents the nd_scatter widget -- an interactive visualization extending methods developed in GGobi for exploring and presenting multidimensional data using three-dimensional projections. The widget allows the user to select components, examine and adjust the projection vectors for the data, and apply common projection methods such as 3-dimensional principal component analysis or t-distributed Stochastic Neighbor projections. The nd_scatter widget is a standalone JavaScript component built using HTML 5 canvas technology. It is based on the jp_doodle package (https://github.com/AaronWatters/jp_doodle) and it is designed to be easily embedded as an interactive Jupyter widget using jp_proxy_widgets (https://github.com/AaronWatters/jp_proxy_widget). The Jupyter notebook embedding is useful for including the widget visualization in scientific workflows or other computational narratives. The widget is published as an open source project and is currently under active development. View the latest version of the widget as a standalone JavaScript component here: https://aaronwatters.github.io/jp_doodle/nd_scatter.html.

B-25: GO Cluster Finder: Visualization tool to identify gene clusters that are composed of functionally related genes
COSI: BioVis COSI
  • Mingeun Ji, Dept. of Multimedia Engineering, Dongguk University, Seoul, Korea, South Korea
  • Jaehee Jung, Dept. of General Education, Hongik University, Seoul, Korea, South Korea
  • Jeongkyu Kim, Dept. of Multimedia Engineering, Dongguk University, Seoul, Korea, South Korea
  • Gangman Yi, Dept. of Multimedia Engineering, Dongguk University, Seoul, Korea, South Korea

Short Abstract: As Next Generation Sequencing (NGS) has advanced, more research has turned to analyzing genes using microarray data or metabolic pathway databases. Most outcomes of state-of-the-art research in this area are complicated results in text format, which are difficult to analyze and need additional processing before their biological meaning can be understood. We present GO Cluster Finder (GOCF), a visualization tool for analyzing functionally similar genes on different chromosomes. GOCF searches for clusters that are composed of functionally similar genes, or clusters containing specific genes that are closely related in the Gene Ontology (GO). The method is based on the C-Hunter algorithm, which identifies gene clusters in eukaryotic genomes that are defined by GO terms. We provide a method to visualize gene clusters of eukaryotic genomes, which are supplied in FASTA format together with the GO vocabulary. In experiments, the proposed tool identifies gene clusters for eight representative genomes, allowing us to visually analyze how functionally related genes are distributed within the genomes. In conclusion, outcomes such as the cluster locations and the member genes of clusters can provide further biological insight from the visualized information.

B-26: Vitessce - Building a Modular Tool for Single-Cell Spatial Omics and Image Visualization
COSI: BioVis COSI
  • Chuck McCallum, Harvard University, United States
  • Peter Kharchenko, Harvard University, United States
  • Nils Gehlenborg, Harvard University, United States

Short Abstract: Technologies to measure single-cell data in human and other tissues are on the rise. Several large-scale projects such as NIH HuBMAP and the Human Cell Atlas combine several of these technologies, which results in complex data sets that integrate sequencing, mass spectrometry, and optical microscopy data. They pose new challenges for data visualization due to the spatial component that requires the integration of two- and three-dimensional imaging data with omics measurements. To address these challenges, we are building Vitessce, a tool for single-cell spatial omics and image visualization. Vitessce is implemented with a modular architecture designed to be reusable in other projects, such as data portals. The focus of our development so far has been the spatial component: The Uber deck.gl library, developed for geospatial applications, is used to render polygons for cells and points for molecules, and these are overlaid on multiple independently selectable layers of images. Cells can be recolored based on user-selected criteria such as expression levels or cell types, and subsets can be selected, with coloring and selection state linked between all components, which also include typical single-cell visualizations such as projections and heatmaps. A demo is available at https://hms-dbmi.github.io/vitessce.

M-66: Trend Analysis of Biological Big-data in NABIC
COSI: BioVis COSI
  • Dong-Jun Lee, National Institute of Agricultural Sciences(NAS), South Korea
  • Tae-Ho Lee, National Institute of Agricultural Sciences(NAS), South Korea
  • Hye-Jin Lee, National Institute of Agricultural Sciences(NAS), South Korea
  • Do-Kyung An, National Institute of Agricultural Sciences(NAS), South Korea
  • Yil-Cho Cho, National Institute of Agricultural Sciences(NAS), South Korea
  • Mi-Sun Lee, National Institute of Agricultural Sciences(NAS), South Korea

Short Abstract: NABIC (National Agricultural Biotechnology Information Center) operates an integrated management system for biological big data, collecting and analyzing omics data resources in South Korea and providing various data such as genome, proteome, transcriptome and metabolome data. NABIC offers functions to submit, search and visualize many types of omics data, and it is a representative biological omics data service that continues to be developed and improved. Presently, the total amount of data submitted to our system is about 40 TB, of which 36.91 TB is provided as a service.