Short Abstract: To understand genome evolution, it is important to detect and visualize colinear blocks between genomes as well as within a genome. In particular, the visualization of multiple colinear blocks helps in interpreting evolution among organisms. So far, only a few web applications have been developed to visualize multiple homologous blocks. However, these applications are not fully convenient for biologists: some only draw an image of multiple colinear blocks, while others determine colinear blocks from similarity information alone, without algorithms designed specifically to detect colinear blocks. We introduce a web service, MultiSyn, that determines and visualizes multiple colinear blocks at once from two types of files. As an application of the service, we determined colinear blocks for the tomato genic region that includes PSY1 (phytoene synthase 1), which is important in lycopene biosynthesis, and show that the colinear blocks are well conserved in many other plants.
Short Abstract: There are many reasons why genomic data is difficult to analyze and interpret, and chief among them is its size. The genomes of vertebrates contain hundreds of millions to billions of nucleotides. Functional elements are dispersed across wide swathes of this sequence space and absent in many others. To answer biological questions, researchers need to be able to quickly navigate and interpret data across large regions of the genome. To put their observations into context, they need to be able to compare across not only samples but also locations and scales. In our poster, we describe our implementation of “composable linked views” for creating arrangements of views that show different datasets and are linked by location and zoom level. We propose using composable, linked views to create customizable displays of large genomic data sets. These displays consist of a collection of views supporting continuous zooming and panning within a coordinate system containing billions of base pairs on each axis. Each view can show multiple datasets, which are displayed along common axes. Individual views can be synchronized with other views by linking attributes such as location, zoom level and value scale. Together, these views constitute a navigable arrangement, or composition, which can be used to navigate multiple “too large to display at once” datasets while enforcing common locations, zoom levels or data scalings across selected views. We show the utility of these compositions by demonstrating their use with chromosome capture data as well as with the ubiquitous genomic profile data found in most genome browsers.
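The linking of views by location and zoom level described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `View` and `Link` classes, their attribute names, and the convention that each zoom level halves the visible span are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """One view onto a shared genomic coordinate system (hypothetical model)."""
    name: str
    center: float = 0.0  # genomic position at the middle of the view, in bp
    zoom: int = 0        # assumed convention: each level halves the visible span

    def visible_span(self, full_span: float) -> float:
        """Width in bp currently shown, given the width at zoom level 0."""
        return full_span / (2 ** self.zoom)


@dataclass
class Link:
    """Keeps one attribute ('center' or 'zoom') identical across its views."""
    attribute: str
    views: list = field(default_factory=list)

    def update(self, value) -> None:
        # Propagate the new value to every view that participates in the link.
        for view in self.views:
            setattr(view, self.attribute, value)


# Two contact maps share location and zoom; a profile track shares location only.
map_a, map_b, profile = View("map A"), View("map B"), View("profile")
location_link = Link("center", [map_a, map_b, profile])
zoom_link = Link("zoom", [map_a, map_b])

location_link.update(1_500_000.0)
zoom_link.update(4)
print([(v.name, v.center, v.zoom) for v in (map_a, map_b, profile)])
```

Because each link covers a single attribute, a composition can enforce a common location across all views while leaving the zoom level of some views independent.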
Short Abstract: High-dimensional biological data are being generated from numerous sources and in unprecedented amounts, which creates the need for their automated processing and analysis. To gain meaningful and deep insights into these data, a number of common tasks have emerged, among the most important of which are (1) imputation and denoising, (2) clustering, and (3) visualization of low-dimensional embeddings underlying the high-dimensional structure of the data. We develop a unified framework of methods based on deep autoencoder neural networks to perform these tasks in a scalable and efficient fashion. The framework allows us to harness in biology the power of deep learning, a field that has delivered exceptional performance gains in numerous other fields, including pattern recognition and computer vision. Our Deep Clustering Autoencoder (DCAE) is able to detect clusters in a sparse encoding layer, produce a visual embedding in the 2-dimensional bottleneck layer, and recover imputed, denoised data in its final decoder layer. For imputation, we show that our method is able to recover missing values and reconstruct imputed, more complete datasets. We show that DCAE can also find clusters in learnt embeddings using a unique loss function that encourages cluster separation. To demonstrate their effectiveness, we compare our methods to their state-of-the-art counterparts and analyze the biological significance of their results on novel single cell mass cytometry data.
Short Abstract: To visualize how many samples share a subset of up to five features, it is common to present a Venn diagram. Samples sharing the same feature are enclosed by the same circle (up to 3 dimensions) or ellipse (up to 5 dimensions). The advent of big data and of statistics on many covariates, across all scientific disciplines, has increased the demand for higher-dimensional Venn diagrams. To this end, we extended the Venn diagrams of the gplots R package to present 6- and 7-dimensional polyominoes, i.e. an approximate flat representation of a graph connecting neighbouring areas that omits edges. The underlying graph structure is exported as an igraph object for an unlimited set of features, which may be subject to further external processing or draw on external information for a mapping, such as geographical annotation.
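The quantity a Venn diagram displays, the number of samples in each combination of features, can be computed independently of any drawing code. The following is a minimal sketch in Python rather than R; the function name `venn_region_counts` and the input layout (feature name mapped to a set of sample identifiers) are assumptions for illustration and are not part of gplots.

```python
from collections import Counter


def venn_region_counts(feature_sets):
    """Count samples per exact feature combination.

    feature_sets maps a feature name to the set of samples that have it.
    The result maps each non-empty combination (a frozenset of feature
    names) to the number of samples having exactly that combination.
    """
    all_samples = set().union(*feature_sets.values())
    counts = Counter()
    for sample in all_samples:
        region = frozenset(
            name for name, members in feature_sets.items() if sample in members
        )
        counts[region] += 1
    return counts


sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
for region, n in sorted(venn_region_counts(sets).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(region), n)
```

With n features there are up to 2^n - 1 non-empty regions, which is why diagrams beyond five features need a different layout than circles or ellipses.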
Short Abstract: Point Mutant Epistatic MiniArray Profile (pE-MAP) (Braberg, et al., 2013) is a new technique that builds on the E-MAP (Collins, et al., 2010) approach for measuring genetic interactions. In traditional E-MAP experiments, genes are deleted or knocked down singly and in pairs, and some measure of fitness resulting from that perturbation is measured (usually cell growth). The genetic interaction is measured as the difference between the expected fitness of the double-deletion mutants, assuming the paired genes are independent, and the measured fitness. A pE-MAP experiment proceeds in a similar fashion except that instead of looking at pairs of gene deletions, the cross is between a gene deletion and a point mutant on a large structure. This provides a sensitive readout of the interactions between genes and particular residues or regions of large structures such as RNA polymerase II and histones in the nucleosome. Traditional E-MAP data is clustered using hierarchical clustering, and the resulting dendrogram and heatmap are displayed in JTreeView (Saldanha, 2004) or a similar tool. In a traditional E-MAP, both axes are genes and the resulting heatmap is symmetrical across the diagonal (Figure 1). The colors typically are shown using a cyan (negative) to yellow (positive) gradient, where negative indicates a synthetic lethal (i.e. the cell fitness is worse than expected) and positive indicates an epistatic interaction (i.e. the cell fitness is better than expected). In a pE-MAP, one axis represents point mutants and one axis represents deleted (or knocked-down) genes. Typically, what is of interest is the correspondence between the similarity of the interaction profile and the location on the structure. For example, if a series of point mutants all cluster close to each other (have a similar interaction with deleted genes), it would strongly indicate a similar physiological effect.
If all of the point mutants are in a similar location on the structure, it might indicate a possible location for binding, post-translational modifications, or a region of important structural stability. All of these could be important for understanding the underlying biological impact of perturbations to that region. Recently, we began looking at ways to visualize this data in a manner that would leverage the visualizations researchers were accustomed to, but extend them to support the integration of a structural view. To achieve this, we integrate four elements: the original clustered heatmap, a node-link diagram of the residues in the structure (Residue Interaction Network, RIN), a node-link diagram that connects the genes to the interacting residues in the RIN, and a 3D-structural view using UCSF Chimera (Pettersen, et al., 2004). The genes in the integrated node-link diagram are positioned to reflect the dendrogram, thus preserving the profile of the genes, and the links (edges) between the genes and the interacting residues are colored to reflect the values in the heatmap. All of the views are integrated: selection of a residue in the node-link view will immediately select that residue in the 3D view. Selection of a gene in the heatmap will select that gene in the node-link view. The power of the system, however, comes from the ability to select one or more genes and "paint" the interaction scores onto the 3D structure. This is done by adding side chains to the interacting residues in the 3D view, then coloring the spheres by the interaction score (blue for synthetic lethal, yellow for epistatic). When there are multiple scores, they are averaged if they are in the same direction, but if they are in different directions, the residues are colored green. This approach allows researchers to quickly explore the impact of individual genes, or entire complexes, on the structure, and whether all of the genes in the complex exhibit similar profiles.
Furthermore, a heatmap that represents only the interacting genes and residues is provided to give researchers more of a "zoom" view of the interaction. Researchers also requested the ability to limit the view based on the number of interactions, so only genes with more than a certain number of interactions are shown. The implemented system is shown in Figure 2, including all of the various views. In typical use, the full heatmap is rarely, if ever, used. Researchers tend to rely on the heatmap summary in the main window. The system is implemented as a Cytoscape (Shannon, et al., 2003; Cline, et al., 2007) app, stEMAPApp, that utilizes several other existing apps, including structureViz (Morris, et al., 2007), RINalyzer (Doncheva, et al., 2011), clusterMaker2 (Morris, et al., 2011), and setsApp (Morris, et al., 2014). The app itself is available on GitHub (https://github.com/RBVI/StEMAPApp) and all dependent components are available on the Cytoscape app store (http://apps.cytoscape.org).
References
Braberg H, Jin H, Moehle EA, Chan YA, Wang S, Shales M, Benschop JJ, Morris JH, Qiu C, Hu F, Tang LK, Fraser JS, Holstege FC, Hieter P, Guthrie C, Kaplan CD, Krogan NJ. From structure to systems: high-resolution, quantitative genetic analysis of RNA polymerase II. Cell. 2013 Aug 15;154(4):775-88. doi: 10.1016/j.cell.2013.07.033. Epub 2013 Aug 8. PubMed ID: 23932120.
Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, Hanspers K, Isserlin R, Kelley R, Killcoyne S, Lotia S, Maere S, Morris J, Ono K, Pavlovic V, Pico AR, Vailaya A, Wang PL, Adler A, Conklin BR, Hood L, Kuiper M, Sander C, Schmulevich I, Schwikowski B, Warner GJ, Ideker T, Bader GD. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366-82. PubMed ID: 17947979.
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics. 2011 Nov 9;12:436. doi: 10.1186/1471-2105-12-436. PubMed ID: 22070249.
Collins SR, Roguev A, Krogan NJ. Quantitative genetic interaction mapping using the E-MAP approach. Methods Enzymol. 2010;470:205-31. doi: 10.1016/S0076-6879(10)70009-4. Epub 2010 Mar 1. PubMed ID: 20946812.
Doncheva NT, Klein K, Domingues FS, Albrecht M. Analyzing and visualizing residue networks of protein structures. Trends Biochem Sci, 36:4 (179-82). 2011 Apr. PubMed ID: 21345680.
Morris JH, Huang CC, Babbitt PC, Ferrin TE. structureViz: linking Cytoscape and UCSF Chimera. Bioinformatics. 2007 Sep 1;23(17):2345-7. Epub 2007 Jul 10. PubMed ID: 17623706.
Morris JH, Lotia S, Wu A, Doncheva NT, Albrecht M, Pico AR, Ferrin TE. setsApp for Cytoscape: Set operations for Cytoscape Nodes and Edges. F1000Res, 3: (149). 2014. PubMed ID: 25352980.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem. 2004 Oct;25(13):1605-12. PubMed ID: 15264254.
Saldanha AJ. Java Treeview--extensible visualization of microarray data. Bioinformatics 2004; 20 (17): 3246-3248. PubMed ID: 15180930.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res, 13:11 (2498-504). 2003 Nov. PubMed ID: 14597658.
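The rule the pE-MAP abstract gives for "painting" scores onto a residue (average when all scores point the same way, green when they conflict) can be written down as a short sketch. The function name `residue_color`, its return format, and the treatment of scores that are exactly zero are assumptions for illustration; this is not code from stEMAPApp.

```python
def residue_color(scores):
    """Pick a display color for one residue from the scores of the selected genes.

    Returns (color, mean_score). Negative scores are synthetic lethal (blue),
    positive scores are epistatic (yellow). Scores in conflicting directions
    give green with no averaged value.
    """
    if not scores:
        return (None, None)  # assumption: a residue without scores is left uncolored
    has_negative = any(s < 0 for s in scores)
    has_positive = any(s > 0 for s in scores)
    if has_negative and has_positive:
        return ("green", None)
    mean = sum(scores) / len(scores)
    if mean < 0:
        return ("blue", mean)
    if mean > 0:
        return ("yellow", mean)
    return (None, mean)  # assumption: all-zero scores are treated as no interaction


print(residue_color([-2.0, -4.0]))
print(residue_color([1.0, 3.0]))
print(residue_color([-1.0, 2.0]))
```

Returning the mean alongside the color name leaves room for mapping the magnitude to a gradient, as the heatmap does.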
Short Abstract: Multiple myeloma (MM) is considered a cancer of plasma cells. MM cells produce paraprotein, which is mostly made of IgG and light and/or heavy chains of antibodies. MM is a relatively uncommon cancer with an incidence rate of about 60 cases per million people per year in the United States. Waldenström macroglobulinemia (WM) is a cancer of both plasma cells and lymphocytes, and it is very similar to MM. WM cells make large amounts of IgM, which is known as a macroglobulin. WM is rare, with an incidence rate of about 3 cases per million people per year in the United States. To better understand WM and MM, it is necessary to measure changes in tumour cell composition. We analysed 857 samples from MM and WM with the SPADE algorithm. To visualize the results of the SPADE analyses, we developed a dynamic web portal using R and its additional packages, mainly shiny, igraph and Cairo. We used these dynamic visualizations to identify different cell populations on the SPADE trees. After that, we merged the cell populations and normalized the number of cells and the intensities of markers in them to the total number of cells. To visualize and analyse the normalized results, we developed a second dynamic web portal, where we can visualize the data in different ways: SPADE trees, heat maps, dot plots or box-and-whisker plots. On this portal we can also perform statistical analyses of SPADE results, using a modified SPADEVizR R package. These data will be used to identify differences between healthy and tumour cells. This study was supported by SASPRO 0064/01/02, Transcan-2 TRS-2015-00000170, VEGA 2/0076/17, VEGA 2/0100/17, and the Slovak R&D Agency APVV-16-0484.
Short Abstract: Over the last two decades, a large number of seemingly unrelated and vastly diverse systems have been described in terms of networks. An important factor in the success of networks is that they allow for a very intuitive visualization of very complex datasets. Such visualizations, however, are limited to relatively small networks. For many networks studied in biology, such as the interactome (currently around 15,000 nodes and 300,000 interactions), conventional two-dimensional visualizations on a simple computer screen are impractical, both for computational reasons and because of fundamental limitations in the amount of information that can be perceived visually on a relatively small screen. Here, we present a Virtual Reality (VR) platform that allows us to explore huge networks. Based on VR gaming technology, our platform provides us with the necessary tools to visualize and manipulate massive amounts of data in a highly immersive and interactive virtual environment. Using a VR headset, the user can walk around in spatial representations of the network and inspect every detail of its internal structure in a very intuitive fashion. The program can be connected to a database, so that displayed annotations can be updated and filtered from within the VR application. We also explore possibilities for remote collaboration of multiple users in a shared virtual space.
Short Abstract: Synthetic Biology (SynBio) is the application of engineering principles to biological systems. The idea of Synthetic Biology in the context of gene regulatory networks is to modify or integrate gene components to create systems of living cells synthesizing biological products. Currently, SynBio often lacks detailed information about such synthetic circuits. When placed in different genomic contexts, even of the same species, synthetic circuits often lose their functionality, and as of today, comparative studies of synthetic circuits across species are rare (1,2). Promising candidates to fill this gap are Extracytoplasmic Function σ factors (ECFs), the largest group of alternative σ factors, representing a major mechanism of bacterial signal transduction. Based on ECFs, it is possible to implement highly orthogonal regulatory switches and circuits (3). The ECFDesigner is a web-based platform to explore the world of ECFs. The software is developed within an international research project to identify orthologous ECFs, which can be transferred across species without loss of function and combined into complex networks of switches. Based on expert knowledge, the ECFDesigner identifies proteins as ECFs or predicts ECFs in genomes of interest. Moreover, it provides additional information through protein domain and promoter motif analyses, as well as function analysis. Via automated homology-based function predictions, information provided by biological data warehouses is bundled together and delivered via an easy-to-understand visual information system. Natural biological switches can be combined to form complex circuits, comparable with electronic circuits. For the future, it is planned to provide a toolbox allowing users to design their own regulatory switches and circuits.
(1) Guet, Călin C., et al. "Combinatorial synthesis of genetic networks." Science 296.5572 (2002): 1466-1470.
(2) Staroń, Anna, et al. "The third pillar of bacterial signal transduction: classification of the extracytoplasmic function (ECF) σ factor protein family." Molecular microbiology 74.3 (2009): 557-581.
(3) Rhodius, Virgil A., et al. "Design of orthogonal genetic switches based on a crosstalk map of σs, anti‐σs, and promoters." Molecular systems biology 9.1 (2013): 702.
Short Abstract: The steadily declining cost of sequencing and assembling multiple genomes from a single species has created a growing need for intuitive, visual, comparative exploration of pangenome data. Currently available genome viewers limit representations by fixing one genome as a global reference, against which all others are compared. This can make it difficult to identify and compare subpopulations to which the global reference does not belong. Recent efforts have focused on variant graph representations, in which individual genomes are defined by paths through a graph that encodes genomic subsequences as nodes. Here, we present the Augmented Genome Graph Visualization (AGV), a web-based visualization server for the efficient exploration of variants of hundreds of individuals. In contrast to existing genome browsers, its data structure is based on the open-source project variation graph, which efficiently stores hundreds of genomes using lossless compression. AGV enables a deep and efficient exploration of huge genome population datasets, including the intuitive display of large structural variants spanning thousands of base pairs. In this web-based client-server model, data storage and intensive computation are performed on the server, while visualizations are rendered in the client’s web browser. Information can be shared between collaborators without installing any local software. AGV includes lift-over of reference annotations and the calculation and visualization of haplotype blocks, i.e. regions of strong linkage disequilibrium that indicate genome regions of low recombination frequency.
Short Abstract: Next Generation Sequencing has moved the Big Data phenomenon into the Biological Sciences, making the understanding of biological data a computational challenge. Consequently, it is important to create tools that exploit human visual skills in the interpretation of this ever-increasing information. However, transforming genomic data into an image with biological meaning is particularly difficult because the information is not contained in a single variable but in a set of them. The distribution of genomic composition embedded in k-mer frequencies (frequencies of all possible substrings of size k) is a suitable approach, since it allows us to obtain a specific signature of different organisms in order to classify and visualize them. The main goal of this study was to develop an R function to transform a genomic sequence into a specific 2D image based on k-mer frequencies. The function fragments a genome, reduces the dimensionality of the genomic composition measurements and assigns a specific color (RGB) to each fragment, transforming it into an image pixel. This function was applied to 59 genomes from two different domains of life, and it was observed that related organisms presented similar color patterns across family, class and phylum. In addition, a Mantel test was performed on two distinct matrices, one from pixel features and another from a traditional 16S-based phylogenetic tree, in order to assess the statistical similarity of the obtained 2D images to classical phylogeny. In conclusion, image-based tools can help improve genomic comparisons by exploiting human visual capabilities.
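The first two steps named in the abstract above, fragmenting a genome and measuring k-mer frequencies per fragment, can be sketched briefly. The sketch is in Python rather than R and is not the authors' function; the names `fragment` and `kmer_frequencies`, the fixed fragment size, and the choice to ignore k-mers containing characters other than A, C, G and T are assumptions. The dimensionality reduction and the mapping to RGB are omitted because the abstract does not specify them.

```python
from collections import Counter
from itertools import product


def fragment(seq, size):
    """Split a sequence into consecutive pieces of at most `size` characters."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def kmer_frequencies(seq, k=2):
    """Relative frequency of every possible DNA k-mer, in lexicographic order.

    Overlapping k-mers are counted. K-mers containing symbols other than
    A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


genome = "ACGTACGTAAAACCCCGGGGTTTT"
for piece in fragment(genome, 8):
    print(piece, [round(f, 2) for f in kmer_frequencies(piece, k=1)])
```

Each fragment yields a vector of length 4^k, which is the high-dimensional composition measurement that would then be reduced to three color channels.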
Short Abstract: In metagenomic studies, one is interested in finding out which species are present in a sample. The complexity of metagenomic samples can make it difficult to get an overview of the species. Here we present MicroWineBar, a new interactive visualization tool that aids the exploration of relative abundances by using bar graphs. The focus is on looking at one taxonomic rank at a time for one or several samples. Further, one can perform correlation analysis between species or higher taxa. Two groups of samples can be compared in various plots, and the species that drive the difference between the two groups can be identified. MicroWineBar is an interactive visualization tool for relative abundances of taxa such as species in metagenomic samples.
Short Abstract: In recent decades, 16S rRNA sequencing data have accumulated in massive publicly available databases, such as Genomes Online Database, SILVA, GreenGenes, and the Ribosomal Database Project. Many of these sequences are tagged with geo-locations for sample sites but have never been used scientifically. Moreover, researchers currently lack a user-friendly tool to analyze microbial distribution in a location-specific context. BioAtlas is an interactive web application that closes this gap between sequence databases, taxonomy profiling, and geo/body-location information. It enables users to browse taxonomically annotated sequences across 1) the world map, 2) human body maps, and 3) user-defined maps. It further allows for 4) uploading of own sample data, which can be placed on existing maps to 5) browse the distribution of the associated taxonomies. Finally, BioAtlas enables users to 6) contribute custom maps (e.g. for plants or animals) and to map taxonomies to pre/user-defined map locations. To summarize, BioAtlas facilitates browsing of maps for public 16S rRNA sequence data and analyses of user-provided sequences without requiring manual mapping to taxonomies and existing databases.
Short Abstract: Nanopore sequencing holds great promise in the field of metagenomics – the study of genomes of microbial organisms sharing the same environment. Its low cost, the length of reads and the size of the sequencing device MinION, which can easily fit into a palm, make nanopore sequencing the ideal fit for metagenomics. However, one of the major questions in metagenomics still stays unresolved – after a shotgun sequencing run, how are we able to determine which sequences belong to the same taxonomical unit? The current techniques trying to solve this issue are based on extraction of selected features from each of the sequences followed by a suitable clustering algorithm, which in an ideal case produces clusters containing sequences that belong to the same operational taxonomic unit (OTU). This process is commonly called binning. Based on the number of extracted features, the visualization of the sequences in the form of points in 2D or 3D space can be included in the final pipeline to make the results human-comprehensible. Here we present a novel method for visualization of metagenomic sequences, which works with the original electric current signals, so-called “squiggles”, produced by nanopore sequencing, instead of working with the final character string sequence obtained after base-calling. Apart from the fact that we do not need the prior base-calling, the algorithm runs with linear time complexity, which allows almost instant results. We tested the algorithm on a simplified metagenome available in the EBI Metagenomics database and obtained reliable results.
Short Abstract: Modifying and visualizing well-defined biological pathways are routine tasks during biomedical data analysis. We describe a simple tool called PathwayEditor for facilitating rapid pathway editing. PathwayEditor has capabilities that support creating, editing, and visualizing biochemical pathways. A simple and intuitive XML schema is used to represent biological pathways. PathwayEditor efficiently renders XML files for editing and visualization by converting well-defined pathway markup files from the KEGG and WikiPathways databases. In addition, we have used an improved automatic layout algorithm to graphically represent a pathway based on XML files. PathwayEditor also enables users to visualize metabolic pathways with their transporting systems by integrating transport proteins from public databases. These functions make PathwayEditor a quick and easy tool for routine pathway editing. PathwayEditor is implemented in the Qt GUI development toolkit with a user-friendly interface. We present this easy-to-use graphic editing tool for converting KEGG KGML files and efficiently adding a small number of nodes to a well-defined biological pathway. PathwayEditor is freely available at our website: http://soft.bioinfo-minzhao.org/pathwayeditor/.
Short Abstract: Systems-level visualization is an important first step towards understanding high-dimensional molecular data and generating biological hypotheses for downstream analysis. However, the existing open-source solutions for -omics data visualization lack crucial features such as scalability of visualization to >10,000 features and the ability to further manipulate and query subsets of data based on the user’s own hypotheses. Enabling such interactive communication with the user requires a proper architecture. Here we present SLIDE, which allows for scalable visualization and interactive exploration of high-throughput datasets. SLIDE has been designed to visualize gene-level expression data across various study designs and experimental conditions, including static and time-course experiments. The tool allows users to sift through the gene-level data in many different ways. A scrollable view allows visualizing the data at a fixed resolution where individual gene expression levels are apparent. The tool also performs hierarchical clustering for very high-dimensional data with a large number of samples (e.g. 50,000 genes with 200 samples), allowing users to explore data in the branches of the hierarchical tree (dendrogram) at different granularities. The tool provides capabilities for searching and annotating individual genes as well as functional groups (biological pathways and gene ontologies), with the gene groups being marked on the heatmaps in real time. More importantly, the user can select genes of interest, which can be visualized separately for further in-depth inspection of the data. In addition, a separate module allows visualization of data at the level of biological functions, using pathways and gene ontology as the basis of gene groups.
We demonstrate SLIDE using data from a murine model of influenza infection (Brandes et al., Cell, 2013) with a highly complex time-course design, featuring 45,281 genes and 133 gene expression microarray experiments.
Short Abstract: iCLiKVAL is an open-access web platform for crowdsourcing scientific curation, in which researchers can annotate literature and other media to highlight knowledge present in these media and to share it with the rest of the community. Almost 2 billion annotations of over 24 million media have been collected so far. The iCLiKVAL web site allows users to easily create accounts, add annotations, form communities and groups, and effectively search the media or the annotations themselves. The iCLiKVAL network view attempts to represent the content of the knowledge database via a user-driven network. The focus of this project is the relations between media. The network is built to connect media that share common annotations. The more annotations two media share, the heavier the link between them. In this way, a scientist can see at a glance which media have related information and then use these hints to determine what to investigate next. Why “user-driven”? The huge quantity of data makes it difficult to represent the full network at once. In addition, annotations come from broad domains. They range from meta-data such as author names, through general topics such as the species involved, to specific scientific content such as molecule names. Because this variety among the annotations would quickly result in a confusing all-to-all network, we allow users to select the specific annotations they want to represent; in this way, each user can obtain a custom view of the full network that fits their personal requirements.
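The network construction described above, connecting media that share annotations and weighting each link by the number of shared annotations, can be sketched as follows. The function name `shared_annotation_network`, the representation of an annotation as a (key, value) tuple, and the `selected` filter are assumptions for illustration and do not reflect the iCLiKVAL code or API.

```python
from itertools import combinations


def shared_annotation_network(annotations, selected=None):
    """Build weighted links between media that share annotations.

    annotations maps a media identifier to a set of (key, value) annotations.
    selected, if given, restricts which annotations are allowed to form links,
    mirroring the user-driven selection described in the abstract.
    Returns a dict mapping (media_a, media_b) to the number of shared annotations.
    """
    edges = {}
    for a, b in combinations(sorted(annotations), 2):
        shared = annotations[a] & annotations[b]
        if selected is not None:
            shared &= selected
        if shared:
            edges[(a, b)] = len(shared)
    return edges


media = {
    "m1": {("species", "human"), ("author", "Smith")},
    "m2": {("species", "human"), ("author", "Smith"), ("molecule", "p53")},
    "m3": {("molecule", "p53")},
}
print(shared_annotation_network(media))
print(shared_annotation_network(media, selected={("species", "human")}))
```

The pairwise loop is quadratic in the number of media, so at the scale quoted in the abstract an inverted index from annotation to media would be the more practical starting point.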
Short Abstract: Krona allows the complex, quantitative hierarchies of metagenomic data to be explored intuitively with multi-layered, interactive pie charts. While useful for quality control and hypothesis-free discovery in single samples, Krona has lacked the facility for the direct comparison of two metagenomic samples, a common desire for controlled experiments. Here, we present enhancements to Krona that allow two metagenomes to be explored simultaneously, allowing differences to be observed easily while preserving the depth and detail inherent in Krona’s radial, space-filling displays. Krona is available at github.com/marbl/Krona/wiki.
Short Abstract: Existing tools for genomic data visualization (UCSC Genome Browser, Ensembl Genome Browser, IGV, etc.) share the principle of displaying data in the physical coordinates of a reference genome, providing the user with pan and zoom controls for navigation. This approach is convenient for exploring low-level sequencing data (e.g. BAM files) but is ineffective for analyzing sparse genomic annotation data and interpreting sequencing results, where the presence and number of features or feature relations is more important than physical size and exact location. In Concentrate we implement a novel approach to genomic data visualization that is based on a fixed visual scale unit. Every displayed object (annotation feature, genetic variant, etc.) is scaled to the minimal size sufficient to visualize all its interactions (overlap, inclusion, etc.) with other objects, so any isolated element equals one unit and every interaction site increases the element size by one. This ensures the most efficient use of screen space and makes element interaction events straightforward to detect. A base-pair scale is displayed in parallel to provide information on the physical size of objects. Concentrate also retrieves object metadata and provides the ability to filter objects on the fly with arbitrarily complex queries, removing the need to use external tools and reload the processed dataset into the browser. Concentrate has a client-server architecture and can run both as a desktop genome browser and as a web service for visualizing and sharing genomic and annotation data, both within an organization and worldwide. Concentrate requires only a web browser and Java 8 to operate. Source code is licensed under GNU AGPLv3 and available on GitHub.
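The fixed visual scale unit can be illustrated with a simplified reading of the rule stated above: an isolated object is one unit wide, and each interaction adds one unit. The sketch below counts every other object that overlaps a feature as one interaction site. That counting rule, the function name `visual_widths`, and the half-open (start, end) interval convention are assumptions for illustration, not the layout algorithm used by Concentrate.

```python
def visual_widths(features):
    """Assign each feature a width in visual units instead of base pairs.

    features maps a name to a half-open interval (start, end) in bp.
    Width = 1 unit + 1 unit per other feature that overlaps it.
    """
    widths = {}
    for name, (start, end) in features.items():
        interactions = sum(
            1
            for other, (o_start, o_end) in features.items()
            if other != name and start < o_end and o_start < end
        )
        widths[name] = 1 + interactions
    return widths


features = {
    "gene": (100, 5_000),
    "variant_1": (200, 201),
    "variant_2": (3_000, 3_001),
    "distant_gene": (1_000_000, 2_000_000),
}
print(visual_widths(features))
```

In this example the megabase-long isolated gene takes one unit while the much shorter gene carrying two variants takes three, which is the inversion of physical scale that the abstract argues for.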
Short Abstract: The UniProt Knowledgebase is a central knowledge repository with individual protein entries outlining all known aspects of the protein’s biology. The entries include an Interaction section that details the protein’s interactions with other proteins or protein complexes. Protein interactions can help users characterise protein complexes, understand the protein’s function and infer its involvement in biological processes. UniProt initially displayed binary protein interactions in a table view, which made it difficult for users to see the interactivity and complexity. Hence we needed to find an effective and intuitive solution. The UniProt Knowledgebase currently contains 18,413 with varying number of interactors. Initial trials with the traditional node/edge graph showed that we quickly ended up with “hairballs”, making it difficult to identify the relationships between nodes. After some research, we decided to settle on an adjacency graph, which, although less conventional than node/edge graphs, allows a much clearer display of larger numbers of interactions. Following a user-centred design approach, we optimized the graph further by showing only one side of the symmetrical graph at a time and then adding more nodal information on hover and on click. User feedback showed that we could add value by overlaying more annotation types onto the interaction data. Thus we added filters for interactors from certain subcellular locations or involved in certain diseases. As a result, we are now not only interactively visualizing interactions that were in a static table, but also showing additional secondary interactions and a layer of biological annotations to create a multi-pronged solution.
Short Abstract: Dementia is a disease for which early detection and treatment are considered highly important, and its progression may vary substantially among individual patients. We suggest a visualization that facilitates a more detailed analysis by subdividing characteristics within patient groups. Our research goal is to help build a diagnosis system according to the characteristics of individual patients by proposing an analysis tool that subdivides the diagnosis stages. This paper thus examined data from various perspectives with a multi-dimensional visualization tool, and studied the differences between scores at each diagnosis stage by comparing the diagnosis results. Based upon the refined CREDOS (Clinical Research Center for Dementia of South Korea) data, we visualized the distribution of dementia diagnosis assessment results through two techniques: 3D Radvis with scatter plots in three dimensions, and parallel coordinates, which show each result as a line graph. Next, through a case study, key variables were extracted to subdivide the patients. We also designed an analysis method to subdivide a group with a specific disease stage when the chronic grade within the group varies. As a result, variables in cognitive assessments were identified as more important indicators than physical examinations. If the chronic grades vary within a group of AD and SVD, it appeared to be effective to consider the total score of current functioning and latent ability in S-IADL, CGA-NPI, K-MMSE and CDR grade. Such findings are expected to be beneficial for medical professionals aiming to diagnose patients according to individual characteristics. (https://youtu.be/Z5NvvdBJx0o)
Short Abstract: Reactome (http://reactome.org) is a free, open-source, curated and peer-reviewed knowledgebase of biomolecular pathways. Reactome models pathway space as a hierarchy of increasingly detailed pathways. While we provide a hierarchical pathway browser as a key element of the Reactome web interface, the relationships and connectivity between high level pathways were previously not represented well. In addition, options for re-use of the manually laid out low level pathway diagrams were limited, as they were only downloadable as PNG images. Following intensive User Experience testing by external users, we implemented a series of major visual enhancements, to make Reactome more interactive and user-friendly: 1: In the detailed pathway diagrams, sub-pathways are now visually highlighted through shaded boxes. 2: Detailed pathway diagrams are now downloadable as PowerPoint™ slides, with pathway elements rendered as connected PowerPoint™ objects, allowing scientists to edit, modify, and re-use them to present their own pathway-related research results in presentations and publications. 3: The relationships between high level nodes in the Reactome hierarchy, for example between Adaptive Immune System, Innate Immune System, and Cytokine Signalling in Immune System, are now visualised through textbook-style diagrams developed by a professional illustrator. However, these diagrams are not static PNG images, but dynamic SVG graphics, allowing fast zooming and navigation, clicking to link to sub-pathways, as well as overlay of aggregated pathway analysis results. Diagrams as well as their components are open data and have been released as a library re-useable for biomolecular visualisation by the scientific community.
Short Abstract: Reactome (http://reactome.org) is a free, open-source, open-data, curated and peer-reviewed knowledge base of biomolecular pathways. For the higher levels of its pathway hierarchy, Reactome now provides scalable, interactive textbook-style diagrams in SVG format, which are also freely downloadable and editable. These diagrams are developed by professional illustrators in collaboration with Reactome curators. To ensure consistency in the visual representation of pathway diagrams, we have built up a library of icons in SVG format, ranging from simple protein labels to representations of organelles, receptors, and cell types. The Reactome Icon Library is freely accessible under a CC-BY licence for re-use by the community, suitable for a broad range of purposes, from schematic pathway sketches in scientific presentations and publications to grant proposal illustrations. We invite contributions to the library from the community, and provide detailed instructions to achieve technical and artistic consistency. To encourage community engagement, each icon is attributed to the author through a web link and/or ORCID id. As of May 2017, the Reactome Icon Library contains 371 elements and is accessible at http://reactome.org/icon-lib.
Short Abstract: High throughput data is often challenging to interpret given the huge amount of information it generates. Therefore, data integration and visualisation are essential to interpret -omics data in a biologically meaningful way. We have developed MitoXplorer, a web-based integration and visualisation platform to analyse mitochondrial variation in -omics data. MitoXplorer integrates expression and mutation data with a hand-curated mitochondrial interactome, which is composed of all proteins associated with mitochondrial functions. The user interface allows visualisation and comparative analysis of expressional changes and mutational events in mitochondria. We tested the analytical and predictive power of MitoXplorer on a set of aneuploid cell lines containing a defined, unbalanced chromosome content. RNA-seq and proteomics data from five aneuploid cell lines were integrated using MitoXplorer and analysed for expressional changes in different mitochondrial functions. We observed a significant disruption of mitochondrial translational machinery in a cell line carrying trisomy 21 (RPE1 21/3), caused by the down-regulation of one of the subunits of mitochondrial 28S ribosome, MRPS21. As a consequence, a substantial number of protein subunits from the electron transport chain are down-regulated, leading to a severely altered mitochondrial metabolism, as shown by Seahorse™ extracellular flux analysis. In summary, MitoXplorer proved to be a practical web tool with an intuitive interface for users who wish to gain insight from -omics data in mitochondrial functions. The implementation of the versatile and flexible D3 library also allows the extension of this tool to examine other biological problems or fields of interest.
Short Abstract: The computational study of the dynamic and stochastic natures of gene regulatory networks is a challenging topic in systems biology. Visualizing ensemble time-evolving probability landscapes of stochastic gene networks can further biologists’ understanding of phenotypic behavior associated with specific genes. We present a web-based visual analysis tool for the exploration of peak distributions over state space and simulation time in such stochastic networks, and the comparison of peak distributions between multiple simulations. Our approach combines multiple linked views to capture ensemble time-evolving probability landscapes. A peak trajectory cube provides users with an overview of peak spatiotemporal distributions across six simulations. A peak projection map shows the exact peak locations of multiple simulations at the user-selected time. At a more detailed level, users can inspect a particular state in the peak projection map to view, for each simulation, both the probability values over time and the local probability landscape shapes. This information is displayed in a small multiple using two glyphs: profile glyphs and arrow glyphs. The arrow glyph indicates that a state is a peak when all eight of the glyph's arrows point towards the glyph center. In the figure, a disagreement between the arrow glyphs and the peak projection map demonstrates that probability distributions over genes in this system are not independent of each other. Our visual analysis tool allows bioinformatics researchers to explore and compare the time-evolving changes of probability landscapes from multiple simulations efficiently, without running many small scripts and computing all characteristics separately.
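The arrow-glyph criterion for a peak can be sketched in a few lines. The abstract does not say how the arrows are computed, so this example assumes a two-dimensional state space stored as a grid of probabilities, with one arrow per neighbouring state in each of the eight directions; an arrow points towards the center when that neighbour has a strictly lower probability. The function names and the sample landscape are hypothetical.

```python
def is_peak(grid, r, c):
    """Return True if every in-bounds neighbour of (r, c) has a strictly lower value."""
    rows, cols = len(grid), len(grid[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the state itself
            nr, nc = r + dr, c + dc
            # A neighbour that is at least as probable means its arrow points away.
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] >= grid[r][c]:
                return False
    return True

def find_peaks(grid):
    """List the (row, column) positions of all peaks in row-major order."""
    return [(r, c)
            for r in range(len(grid))
            for c in range(len(grid[0]))
            if is_peak(grid, r, c)]

# Hypothetical probability landscape at one simulation time step.
landscape = [
    [0.01, 0.02, 0.01, 0.00],
    [0.02, 0.30, 0.02, 0.01],
    [0.01, 0.02, 0.05, 0.10],
    [0.00, 0.01, 0.10, 0.32],
]
peaks = find_peaks(landscape)
```

Border states are compared only against the neighbours that exist, which is one possible convention; a real tool might treat boundaries differently.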
Short Abstract: Visualization of Next Generation Sequencing (NGS) data is vital to allow researchers to explore and understand the data from experiments or large-scale datasets. A common process involves processing NGS files to create suitable input files (e.g. files in BED format) to be used by visualization software like Circos (http://circos.ca/). Depending on the visualization application, the creation of complex configuration files may additionally be required. This task is time-consuming and repetitive, and requires constant input from the user to adjust the views to fit the demands. Here we present NGS-InVi, an integration and visualization tool that aims at automating the process of producing visual representations of complex data using intuitive point-and-click support.
Short Abstract: The web application Escher is a versatile tool for building biochemical network maps and visualizing experimental or simulated data on these networks. Three key features make Escher a uniquely effective tool for pathway visualization. First, users can rapidly design new pathway maps. Escher provides pathway suggestions based on user data and genome-scale models, so users can draw pathways in a semi-automated way. Second, users can visualize data related to genes or proteins on the associated reactions and pathways, using rules that define which enzymes catalyze each reaction. Thus, users can identify trends in common genomic data types (e.g., RNA-Seq, proteomics, ChIP) in conjunction with metabolite- and reaction-oriented data types (e.g., metabolomics, fluxomics). Third, Escher harnesses the strengths of web technologies (SVG, D3, developer tools) so that visualizations can be rapidly adapted, extended, shared, and embedded. We present recently developed features, including support for reading and writing systems biology standards (SBML-Layout and SBGN-ML) and fully customizable tooltips.
Short Abstract: Spatial and temporal brain transcriptomics have recently emerged as invaluable data sources for molecular neuroscience. However, the complexity of such data poses challenges for both analysis and visualization. We present BrainScope: a web portal for fast, interactive visual exploration of the Allen Atlases of the adult and developing human brain transcriptome. The portal shows the transcriptional relationships between brain regions, and the spatial co-expression patterns of genes, while linking these patterns to the anatomical context. To achieve this, we make use of a dual t-SNE analysis. We show that clusters in t-SNE maps of the brain samples coincide with anatomical regions, and that clusters in t-SNE maps of the genes represent gene co-expression modules. The topography of the gene t-SNE maps reflects brain region-specific gene functions, enabling hypothesis- and data-driven research. We demonstrate the discovery potential of BrainScope through three examples: (i) analysis of cell type specific gene sets, (ii) analysis of a set of stable gene co-expression modules across the adult human donors and (iii) analysis of the evolution of co-expression of oligodendrocyte specific genes over developmental stages. BrainScope is publicly accessible at www.brainscope.nl.
Short Abstract: Discovery of chimeric RNAs, which are produced by chromosomal translocations as well as the joining of exons from different genes by trans-splicing, has added a new level of complexity to our study and understanding of the transcriptome. The enhanced ChiTaRS-3.1 database (http://chitars.md.biu.ac.il) has been designed to make widely accessible a wealth of mined data on chimeric RNAs, with easy-to-use analytical tools built-in. The database comprises 34 922 chimeric transcripts along with 11 714 cancer breakpoints. In this latest version, we have included multiple cross-references to GeneCards, iHop, PubMed, NCBI, Ensembl, OMIM, RefSeq and the Mitelman collection for every entry in the ‘Full Collection’. In addition, for every chimera, we have added a visualization system for the predicted chimeric protein–protein interaction (ChiPPI) network, which allows for easy visualization of protein partners of both parental and fusion proteins for all human chimeric proteins. Finally, the database contains a comprehensive annotation for 34 922 chimeric transcripts from eight organisms, and includes the manual annotation of 200 sense-antiSense (SaS) chimeras. Thus, the ChiTaRS-3.1 database aims to provide state-of-the-art visualization of the novel ChiPPI networks of the fusion proteins in different types of cancers. Our database efficiently resolves and presents online more than 11 000 cancer fusion networks using our novel method of ChiPPI network visualization by means of the domain-domain co-occurrence scores in the interacting pairs of proteins. This study has been published recently in the Nucleic Acids Research journal.
Short Abstract: Human diseases such as cancer are routinely characterized by high-throughput molecular technologies, and multi-level omics data are accumulated in public databases at an increasing rate. Retrieval and visualization of these data in the context of molecular network maps can provide insights into the pattern of regulation of molecular functions reflected by an omics profile. In order to make this task easy, we developed NaviCom, a Python package and web platform for visualization of multi-level omics data on top of biological network maps. NaviCom bridges the gap between cBioPortal, the most used resource of large-scale cancer omics data, and NaviCell, a data visualization web service that contains several molecular network map collections. NaviCom proposes several standardized modes of data display on top of molecular network maps, allowing users to address specific biological questions. We illustrate how users can easily create interactive network-based cancer molecular portraits via the NaviCom web interface using the maps of the Atlas of Cancer Signalling Network (ACSN) and other maps. Analysis of these molecular portraits can help in formulating a scientific hypothesis on the molecular mechanisms deregulated in the studied disease. NaviCom is available at https://navicom.curie.fr
Short Abstract: Background: We present a software workflow capable of building large scale, highly detailed and realistic volumetric models of neocortical circuits from the morphological skeletons of their digitally reconstructed neurons. The limitations of the existing approaches for creating those models are explained, and then, a multi-stage pipeline is discussed to overcome those limitations. Starting from the neuronal morphologies, we create smooth piecewise watertight polygonal models that can be efficiently utilized to synthesize continuous and plausible volumetric models of the neurons with solid voxelization. The somata of the neurons are reconstructed on a physically-plausible basis relying on the physics engine in Blender. Results: Our pipeline is applied to create 55 exemplar neurons representing the various morphological types that are reconstructed from the somatosensory cortex of a juvenile rat. The pipeline is then used to reconstruct a volumetric slice of a cortical circuit model that contains ∼210,000 neurons. The applicability of our pipeline to create highly realistic volumetric models of neocortical circuits is demonstrated with an in silico imaging experiment that simulates tissue visualization with brightfield microscopy. The results were evaluated with a group of domain experts to address their demands and also to extend the workflow based on their feedback. Conclusion: A systematic workflow is presented to create large scale synthetic tissue models of the neocortical circuitry. This workflow is fundamental to enlarge the scale of in silico neuroscientific optical experiments from several tens of cubic micrometers to a few cubic millimeters.
Short Abstract: Protein structures have been in the focus of structural biology research for decades. The study of their interactions with other smaller molecules (known as ligands) helps to reveal the fundamentals of biochemical processes that are taking place in living cells. Such reaction between a protein and a ligand often takes place deep inside the protein structure. In this paper, we present a novel visualization method for interactive exploration of protein tunnels, which connect the protein surface with deeply buried reaction sites. The tunnel geometry together with biochemical properties of surrounding amino acids and their changes over time determine whether a ligand can pass through the tunnel and reach the reaction site. The proposed visualization abstracts from the common 3D tunnel representation. The complex geometry of a tunnel is simplified by straightening the tunnel and depicting only its width profile and its changes over time. This frees up the visual space and allows the expert to directly depict the key physico-chemical properties of individual amino acids surrounding the tunnel in a single view. In our representation, each amino acid is depicted as a set of colored lines showing the spatial and temporal impact of the amino acid on the tunnel. The vertical ordering communicates the importance of amino acids with respect to selected criteria. It helps biochemists to select the most interesting candidates for further examination. The representation was developed in tight collaboration with domain experts and their feedback is presented on case studies.
Short Abstract: The roles of fungi in the forest are sophisticated. Some fungi are able to convert dead plants into consumable carbon sources. Others are symbiotic and exchange carbon, phosphorus and nitrogen with trees and soils. Such fungal interactions give us hints for efficient carbon recycling systems for innovative green technologies. However, measuring fungal genome-wide omics activities is challenging. Capturing just a single time point of transcriptomic activity involves over ten thousand genes showing various transcription levels. The addition of proteomic/secretomic information gives an extra layer of complexity. To extract biologically meaningful patterns from such high-dimensional omics data, we have developed the multi-omics profiling platform, Self-organizing map Harboring Informative Nodes with Gene Ontology (SHIN+GO). Genome-wide omics models constructed with the platform are designed to pinpoint biological activities of interest that would otherwise be buried in the high-dimensional data. As one of the key components of this platform, the self-organizing map (SOM) is an algorithm that constructs a neural network from given input data in an unsupervised manner. It has the unique property of producing two-dimensional maps suitable for large-scale data visualization. We used our integrative omics platform (SHIN+GO) to examine the dynamics of fungal omics responses to the growth conditions. It enabled us to compress layers of biological information into simple heatmaps, allowing for visual inspection of the data. Our omics-combining methods and related biological findings may contribute to the knowledge of fungal systems biology.
Short Abstract: Background: A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. Results: To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Conclusions: Intervene and its web application companion provide an easy command line, and an interactive web interface to compute intersections of multiple genomic and list sets. They also have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene, with the web application available at https://asntech.shinyapps.io/intervene.
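The kind of pairwise comparison that underlies a clustered heat map of list sets can be sketched as follows. This is not Intervene's code or its command line interface: it is a minimal illustration that assumes the Jaccard index as the similarity measure and uses hypothetical gene lists and function names.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_matrix(named_sets):
    """Return sorted set names and a square matrix of pairwise Jaccard indices."""
    names = sorted(named_sets)
    matrix = [[jaccard(named_sets[x], named_sets[y]) for y in names] for x in names]
    return names, matrix

# Hypothetical gene lists from three experiments.
gene_lists = {
    "exp1": {"TP53", "BRCA1", "MYC", "EGFR"},
    "exp2": {"TP53", "MYC", "KRAS"},
    "exp3": {"PTEN"},
}
names, matrix = pairwise_matrix(gene_lists)
```

The resulting matrix is symmetric with ones on the diagonal, so it can be passed directly to any heat map or hierarchical clustering routine. Genomic regions would need an interval-overlap test in place of plain set intersection.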
Short Abstract: Advances in chromosome conformation capture technologies reveal that a highly organized three-dimensional (3D) arrangement of chromosomes underpins the structural and functional basis of the genome. However, visualisation of the 3D spatial organization of chromosomes together with epigenome data remains a significant challenge. We developed Rondo - a web-based, interactive tool for fast, intuitive exploration of chromatin interaction data in the context of genomic and epigenetic features. Rondo uses novel ‘spatial connectivity maps’ to allow fast, intuitive exploration of 16 published Hi-C datasets, plus a novel set of ‘Quasi-Universal’ spatial connections common across 5 human cell-lines. Rondo allows any molecular biologist to use Hi-C data to gain insights into genomic and epigenomic processes. Currently, the Quasi-Universal dataset comprises 38,482 conserved chromosomal interactions across 5 cell types. These data are a powerful instrument for studying the high-order genome structure required for normal cell function, and they open a whole new way of looking at Hi-C data, similar to how the concept of housekeeping genes changed our knowledge of gene regulation. Moreover, because they arise from intersecting datasets, the connections present in the Quasi-Universal set gain a substantial increase in confidence, making it perhaps the only means currently available to assess trans-interactions.
Session A: (July 22 and July 23)
- High Throughput Sequencing Algorithms and Applications (HitSeq)
- Machine Learning Systems Biology (MLSB)
- Translational Medicine (TransMed)