Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide


Posters - Schedules

Posters at ISMB/ECCB 2021 will be presented virtually. Authors will pre-record their poster talk (5-7 minutes) and upload it to the virtual conference platform, along with a PDF of their poster, beginning July 19 and no later than July 23. All registered conference participants will have access to the posters and presentations throughout the conference, and the content remains available until October 31, 2021. Q&A is available through a chat function, and poster presenters can schedule small group discussions with up to 15 delegates during the conference.

Information on preparing your poster and poster talk is available at: https://www.iscb.org/ismbeccb2021-general/presenterinfo#posters

Ideally authors should be available for interactive chat during the times noted below:


Session A: Sunday, July 25 between 15:20 - 16:20 UTC
Session B: Monday, July 26 between 15:20 - 16:20 UTC
Session C: Tuesday, July 27 between 15:20 - 16:20 UTC
Session D: Wednesday, July 28 between 15:20 - 16:20 UTC
Session E: Thursday, July 29 between 15:20 - 16:20 UTC

Alignment-free protein distances and phylogenetic estimation
COSI: EvolCompGen
  • Edward Braun, University of Florida, United States

Short Abstract: Multiple sequence alignment is necessary for most traditional methods of phylogenetic analysis, but there is no exact algorithm that can yield biologically realistic multiple sequence alignments. This has led to an interest in the development of methods for phylogenetic estimation that bypass the multiple sequence alignment step. However, these alignment-free methods for phylogenetics have not achieved broad adoption by the community despite two decades of work. One of the most general approaches for phylogenetic estimation without sequence alignment involves two steps: 1) evolutionary distances are calculated based on the conditional Kolmogorov complexity of the genome or proteome of one organism given the genome or proteome of another organism; and 2) those distances are used to generate a phylogenetic tree. Conditional Kolmogorov complexity can be approximated using data compression. Herein, I show that the simplest versions of these compression distances are highly susceptible to long-branch attraction. However, a relatively simple modification motivated by the commonly used gamma distance can yield much more realistic estimates of evolutionary distances. Alignment-free distance estimates for pairs of proteins can be further improved by recoding amino acids based on their physicochemical properties. This approach can greatly improve the performance of alignment-free phylogenetic estimation.
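
As a rough illustration of the compression-distance idea described above, the following minimal sketch computes a normalized compression distance between two protein sequences, optionally after recoding amino acids into physicochemical classes. The abstract does not say which compressor, distance formula, or recoding scheme was used, so zlib, the normalized compression distance formula, and the six-class Dayhoff grouping are all assumptions made for illustration. The gamma-motivated correction mentioned in the abstract is not implemented here.

```python
import zlib

# Six-class Dayhoff grouping, used here only as an illustrative recoding.
# The abstract does not specify which physicochemical classes were used.
DAYHOFF6 = {
    **dict.fromkeys("AGPST", "a"),
    **dict.fromkeys("C", "b"),
    **dict.fromkeys("DENQ", "c"),
    **dict.fromkeys("FWY", "d"),
    **dict.fromkeys("HKR", "e"),
    **dict.fromkeys("ILMV", "f"),
}


def recode(protein: str) -> str:
    """Map each amino acid to its class label; unknown symbols become 'x'."""
    return "".join(DAYHOFF6.get(aa, "x") for aa in protein.upper())


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed text (a proxy for complexity)."""
    return len(zlib.compress(text.encode("ascii"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Calling `ncd(recode(p1), recode(p2))` gives the recoded variant. A matrix of such pairwise distances could then be passed to any distance-based tree builder, which corresponds to step 2 in the abstract.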

Build a Better Bootstrap and the RAWR Shall Beat a Random Path to Your Door: Phylogenetic Support Estimation Revisited
COSI: EvolCompGen
  • Wei Wang, Michigan State University, United States
  • Ahmad Hejasebazzi, Michigan State University, United States
  • Julia Zheng, Michigan State University, United States
  • Kevin Liu, Michigan State University, United States

Short Abstract: The standard bootstrap method is used throughout science and engineering to perform general-purpose non-parametric resampling and re-estimation. Among the most widely cited and widely used such applications is the phylogenetic bootstrap method, which Felsenstein proposed in 1985 as a means to place statistical confidence intervals on an estimated phylogeny (or estimate “phylogenetic support”). A key simplifying assumption of the bootstrap method is that input data are independent and identically distributed (i.i.d.). However, the i.i.d. assumption is an over-simplification for biomolecular sequence analysis, as Felsenstein noted.

In this study, we introduce RAWR (or “RAndom Walk Resampling”), a new sequence-aware non-parametric resampling technique. RAWR consists of random walks that synthesize and extend the standard bootstrap method and the “mirrored inputs” idea of Landan and Graur. We apply RAWR to the task of phylogenetic support estimation. RAWR’s performance is compared to the state of the art using synthetic and empirical data that span a range of dataset sizes and evolutionary divergence. We show that RAWR support estimates offer comparable or typically superior type I and type II error compared to phylogenetic bootstrap support. We also conduct a re-analysis of large-scale genomic sequence data from a recent study of Darwin’s finches.
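
For readers unfamiliar with the baseline that RAWR is compared against, here is a minimal sketch of one replicate of the standard phylogenetic bootstrap: alignment columns are sampled with replacement, treating sites as i.i.d. This is not RAWR itself, whose random-walk procedure is not specified in the abstract. The function name and the dictionary representation of an alignment are hypothetical choices for illustration.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap replicate of a multiple sequence alignment.

    `alignment` maps taxon names to aligned sequences of equal length.
    Columns are drawn uniformly with replacement, which is exactly the
    i.i.d.-sites assumption the abstract calls an over-simplification.
    """
    n_sites = len(next(iter(alignment.values())))
    sampled_columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in sampled_columns)
        for taxon, seq in alignment.items()
    }
```

Support for a branch is then typically estimated as the fraction of replicates whose re-estimated tree contains that branch.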

Comparative genomic analyses reveal an ancestral feedback loop in gene regulatory networks underlying axis formation in metazoa
COSI: EvolCompGen
  • Rohit Dnyansagar, University of Salzburg, Austria

Short Abstract: Brachyury is a founding member of the T-Box family of transcription factors, members of which play vital roles in the developmental process. Brachyury traces its origin even before the metazoan lineage but its ancestral function(s) in animals without canonical mesoderm is largely unexplored. We present a genome-wide Brachyury targets comparison across strategically selected metazoan lineages, using in-house and publicly available ChIP-seq data.
Most transcription factors do not function independently but rather in concerted fashion with other transcription factors, and our motif analysis approach gives a good indication of putative collaborating transcription factors. We also addressed the key challenge of correctly identifying the target of each binding site by using orthologous ChIP-seq targets from other species.
Our analyses revealed conservation of a feedback loop involving Brachyury, FoxA, and canonical Wnt signaling. We demonstrate that this feedback loop is most likely involved in axial patterning that predates the split of cnidarians and bilaterians about 700 million years ago. Moreover, our analysis suggests that Brachyury regulation of neuro-mesodermal progenitors existed even before the bilaterian lineage. Taken together, our systematic analyses thus enabled a cross-species elucidation of feedback loops in gene regulatory networks.

CONET: Copy number event tree model of evolutionary tumor history for single-cell data
COSI: EvolCompGen
  • Magda Markowska, University of Warsaw, Poland
  • Tomasz Cąkała, University of Warsaw, Poland
  • Błażej Miasojedow, University of Warsaw, Poland
  • Dilafruz Juraeva, Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Germany
  • Johanna Mazur, Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Germany
  • Edith Ross, Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Germany
  • Eike Staub, Merck Healthcare KGaA, Translational Medicine, Oncology Bioinformatics, Germany
  • Ewa Szczurek, University of Warsaw, Poland

Short Abstract: Copy number alterations constitute important phenomena in tumor evolution. Whole genome single cell sequencing gives insight into copy number profiles of individual cells, but is highly noisy. Here, we propose a novel approach for Copy Number Event Tree (CONET) inference and copy number calling. CONET fully exploits the signal in scDNA-seq, as it relies directly on both the per-breakpoint and per-bin data. The model jointly infers the structure of an evolutionary tree on copy number events and copy number profiles of the cells, gaining statistical power in both tasks. The nodes of the evolutionary tree are copy number events, which are allowed to overlap. CONET employs an efficient MCMC procedure to search the space of possible model structures and parameters, with a range of model priors and penalties for efficient regularization. Results on simulated data and 260 cells from a xenograft breast cancer sample demonstrate the excellent performance of CONET in inferring both the copy number evolutionary history in cancer tissue and integer copy number profiles for each cell. Taken together, the proposed approach is a step towards a better understanding of copy number evolution in cancer.
CONET implementation is available at github.com/tc360950/CONET.

Design Principles in Viral Proteome: Steal, Optimize, Duplicate, Invent and Pileup
COSI: EvolCompGen
  • Dan Ofer, The Hebrew University of Jerusalem, Israel
  • Michal Linial, The Hebrew University of Jerusalem, Israel

Short Abstract: Since the initial discovery of the first giant virus in 2003, tens of giant viruses that comprise the order Megavirales have been identified. While some viruses encode as many as 1000-2000 proteins, only a small fraction is shared with other giant viruses. For assessing the functional novelty of giant viruses, we collected over 350 functional terms (e.g., transport, iron-binding, translation) whose occurrence is markedly different relative to all known viruses. We found significant enrichment in translation, ubiquitination, and metabolic enzymes. We further show that giant virus proteomes are extremely enriched with repeats such as Ankyrin repeats (ANK), which are found in 195 of the 909 proteins of the mimivirus APMV proteome. We propose that in giant viruses, in contrast to purifying selection that shapes the structure and function of proteins, evolution is driven by an adaptation toward the host physiology. For successful entry to their host (e.g., amoeba), viruses must increase their physical size. A strategy for genome expansion includes "stealing" genetic materials and local duplications of short repeats (e.g., BTB/POZ, ANK). Furthermore, we observed an increase in proteome diversity through the innovation of short unstructured proteins. Maintaining poorly conserved proteins reflects the opportunistic nature of giant viruses in expanding their genome content and length.

Developing a machine learning approach to determine gene/protein features to classify bacterial groups
COSI: EvolCompGen
  • Vignesh Sridhar, Pathobiology and Diagnostic Investigation, Michigan State University, United States
  • Joseph T Burke, Michigan State University, United States
  • Karn Jongnarangsin, Computer Science Engineering, Michigan State University, United States
  • Arjun Krishnan, Computational Mathematics, Science and Engineering, Michigan State University, United States
  • Janani Ravi, Pathobiology and Diagnostic Investigation, Michigan State University, United States

Short Abstract: Comparative genomics and pangenomics enable researchers to identify genomic features across different bacterial groups, such as pathogens/nonpathogens, or host-specific species. Such analyses are particularly helpful in identifying genes that are conserved and unique to different subsets of organisms. While characterizing these genes further will provide valuable molecular insights for the original questions of interest, currently, there is a dearth of tools that provide automatic identification of such genome-/protein-level features. We are, therefore, developing a computational approach that takes the results of typical comparative genomic and pangenomic workflows and generates accurate hypotheses about gene/protein functions of relevant genes. Our approach works by integrating multiple scales of data (i.e., from protein motifs/domains to genome organization) across the tree of life. These data become input features for a supervised machine learning model designed to classify bacterial groups or phenotypic attributes of interest. Examining the importance of all the features in the trained model will help us identify and characterize genes/proteins of interest in an unbiased and comprehensive manner. We are applying this approach to study pathogen-/host-specific and antibiotic-resistant vs susceptible features. This computational framework will be especially valuable to tackle emerging and understudied zoonotic pathogens and infectious diseases, e.g., Staphylococci, Mycobacteria.

EASEL (Efficient, Accurate, Scalable Eukaryotic modeLs), a tool for improvement of eukaryotic genome annotation
COSI: EvolCompGen
  • Peter Richter, University of Connecticut, United States
  • Jill Wegrzyn, University of Connecticut, United States
  • Bikash Shrestha, University of Connecticut, United States
  • Sumaira Zaman, University of Connecticut, United States
  • Vidya Vuruputoor, University of Connecticut, United States
  • Jeremy Bennett, University of Connecticut, United States
  • Daniel Monyak, University of Connecticut, United States

Short Abstract: Improvements in high-throughput sequencing platforms, accompanied by massive cost reductions, have proliferated draft genomes across diverse eukaryotic lineages. The rapid generation of these resources has highlighted the challenges associated with structural gene prediction, as the existing models struggle to provide efficient and accurate solutions with transcriptomic data. A major challenge in eukaryotic gene annotation is accurate prediction of complex structures, including identifying untranslated regions and coding regions. Prediction of coding regions is further complicated by difficulty in precise identification of translation initiation sites (TIS), exons, and splice sites, including alternative TIS, if present. Current methods generally employ ab initio gene prediction algorithms, such as hidden Markov models, together with external RNA-seq data or an assembled transcriptome for accurate prediction of the gene structure. However, these methods remain inadequate and susceptible to errors in genome annotation. This, in addition to requiring software that is challenging and time consuming to run, hinders high quality annotations and limits opportunities to improve annotations in future genome releases. Here, we present the framework for EASEL (Efficient, Accurate, Scalable Eukaryotic modeLs), which integrates both RNA folding and functional annotation into the model to enhance gene prediction accuracy.

Evolution of Staphylococcal Antibiotic Resistance Systems Across Gram-Positive Bacteria
COSI: EvolCompGen
  • Elliot Majlessi, Lyman Briggs College, Michigan State University, United States
  • Vignesh Sridhar, Pathobiology and Diagnostic Investigation, Michigan State University, United States
  • Neal Hammer, Microbiology & Molecular Genetics, Michigan State University, United States
  • Janani Ravi, Pathobiology and Diagnostic Investigation, Michigan State University, United States

Short Abstract: The bacterial cell envelope serves as the primary defense mechanism against external environmental threats, including antibiotics. Bacteria continuously evolve and adapt to their environment — when pathogens encounter antibiotics administered to the host, they evolve further, giving rise to antibiotic resistance. Envelope stress-response systems (ESRs), which protect and maintain cellular integrity, are critical for antibiotic resistance. The evolution of ESRs in the Staphylococci has led to disparate lineages and new antibiotic-resistant strains. Here, we are applying a computational approach to uncover the evolution of antibiotic resistance mechanisms in Staphylococci. This approach is based on a novel computational framework for characterizing bacterial genomes and operons using comparative genomics, pangenomics, molecular evolution, and phylogeny. Specifically, we are studying the conservation and modularity of Staphylococcal ESRs involved in antibiotic resistance by mapping the constituent domains in terms of their i) ancestry, ii) lineage- and environment-specific variations, and iii) diverse specialized protein/operon functions. Our findings will establish the nature and course of evolution of Staphylococcal antibiotic resistance systems within Staphylococci and key Gram-positive lineages.

Evolutionary and Comparative genomics of Bacterial Non-homologous End Joining Repair
COSI: EvolCompGen
  • Mohak Sharda, National Centre for Biological Sciences, Bangalore, India
  • Anjana Badrinarayanan, National Centre for Biological Sciences, Bangalore, India
  • Aswin Sai Narain Seshasayee, National Centre for Biological Sciences, Bangalore, India

Short Abstract: DNA double-strand breaks (DSBs) are a threat to genome stability. DSBs are repaired either faithfully via homologous recombination or in an error-prone manner via non-homologous end joining (NHEJ). Unlike recombination-based repair, NHEJ is only sporadically present in prokaryotes. Towards understanding why many prokaryotes lack it, we used comparative genomics and phylogenetic approaches to show that multiple independent gain and loss events, along with extensive horizontal gene transfers, have shaped the evolutionary history of NHEJ. We also highlight the association of NHEJ with three genome characteristics: GC content, genome size and growth rate. Given the central role these traits play in determining the ability to carry out recombination, it is possible that the evolutionary history of bacterial NHEJ may have been shaped by the requirement for efficient DSB repair. Approaches used in our study could be extended to other repair pathways in order to understand how they might have contributed to bacterial evolution or vice versa.

Identification and analysis of the genes under positive selection in E. coli: Towards a better understanding
COSI: EvolCompGen
  • Negin Malekian, TU Dresden, Germany
  • Amay Ajaykumar Agrawal, TU Dresden, Germany
  • Michael Schroeder, TU Dresden, Germany

Short Abstract: Bacterial phenotypes are usually shaped by strong positive selection. Antibiotic resistance, a global health threat, is one of the best examples of such phenotypes. Therefore, we would like to have a deep understanding of targets of positive selection in E. coli and see if positive selection in E. coli targets the genes involved in the resistance mechanism. Here, we examine a highly diverse wastewater E. coli dataset, collected and sequenced by our group. Interestingly, we found that many genes showing highly significant evidence of positive selection are linked to the resistance mechanism or the basic functioning of bacteria. For example, OmpC and OmpA are outer membrane porin diffusion channels that allow drug molecules to diffuse inside the cell, and downregulation or mutation in them leads to altered permeation and thus resistance. As another example, YfaL plays a role in biofilm formation, and biofilm formation is a known mechanism for bacteria to develop resistance. To summarize, we provide a deep understanding of the targets of positive selection in E. coli. Our results show that the targets of positive selection in E. coli are diverse. Among them, we found important genes involved in the antibiotic resistance mechanism of bacteria.

Identification of inter-phylogroup horizontal gene transfer of type III secretion system cluster genes between distant Pseudomonas syringae strains
COSI: EvolCompGen
  • Janet Lorv, University of Waterloo, Canada
  • Brendan McConkey, University of Waterloo, Canada

Short Abstract: The broad host range of the phytopathogen P. syringae (Psy) is attributed in part to its type III effectome diversity. This variability arises from the high levels of gene exchange in the effector loci of the type III secretion system (T3SS) gene cluster, allowing for precise host specificity. However, the core T3SS genes in the cluster are considered highly static. In this work, we clustered our two sequenced Psy genomes with 423 other Psy genomes using PIRATE. Both strains were identified as sub-phylogroup 2c strains, a distinct monophyletic group containing an atypical variant (AT-PAI) of the hrp/hrc-1 T3SS subtype. An analysis of T3SS gene families revealed that only the conserved core genes clustered together across all T3SS variants, albeit poorly (<60% id). When non-conserved genes of the AT-PAI variant were analyzed, we found highly similar (60-80% id) AT-PAI core genes in phylogroup 13 strains. To confirm the homology between distant AT-PAI strains, our MLSA of all conserved core T3 genes grouped these strains as a monophyletic group distant from the other strains. This incongruence between the core genome phylogeny and core T3SS MLSA suggests that inter-phylogroup gene transfer of the core T3SS genes can occur between distantly related Psy strains.

Identifying novel biological pathways through phylogenetic profiling based network analysis
COSI: EvolCompGen
  • Dana Sherill-Rofe, Hebrew University, Israel
  • Idit Bloch, Hebrew University, Israel
  • Yuval Tabach, Hebrew University, Israel

Short Abstract: Despite the sequencing revolution, the majority of genes are poorly annotated. As almost none of the unannotated genes evolved uniquely in humans, their evolution across hundreds of organisms can be the anchor for their functional characterization.
Although identifying uncharacterized pathways represents an extremely difficult challenge, novel pathways were recently identified after extensive analysis of gene groups showing similar phenotypes, interactions, or expression. Given the extensive growth in genomic data, a powerful approach to predict gene function and its interactions is phylogenetic profiling.
Phylogenetic profiling (PP) is an unbiased approach to predict gene function and its interactions. The main assumption is that genes sharing a similar PP are also functionally coupled. Recently, we established that integrating information from different clades can optimize co-evolution signals, and improve gene function discovery. We generated a network of all genes divided into paralogous groups for 12 clades containing almost 2000 species and identified clusters. We found several known pathways, but the annotated clusters corresponded to only 22% of the overall clusters, thus the remaining clusters may represent undiscovered biology. Using data integration and biological validation we intend to identify novel biological pathways. Characterization of even a single novel pathway is of paramount importance.
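
The core assumption of phylogenetic profiling, that genes with similar presence/absence patterns across species are functionally coupled, can be sketched as follows. This is a minimal illustration using Pearson correlation between binary profiles. The similarity measure is an assumption, and the clade-wise integration described in the abstract is not reproduced here.

```python
from math import sqrt


def profile_similarity(p: list[int], q: list[int]) -> float:
    """Pearson correlation between two presence/absence profiles.

    Each profile has one entry per species: 1 if an ortholog of the gene
    is present in that species, 0 otherwise. Returns 0.0 when a profile
    is constant, because correlation is undefined in that case.
    """
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_q = sum((b - mean_q) ** 2 for b in q)
    if var_p == 0 or var_q == 0:
        return 0.0
    return cov / sqrt(var_p * var_q)
```

Gene pairs whose similarity exceeds a chosen threshold can be connected as edges of a co-evolution network, on which clusters are then sought.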

Inferring single-cell trees alongside cell-state transition dynamics from lineage tracing and RNAseq data
COSI: EvolCompGen
  • Sophie Seidel, ETH Zurich, Switzerland
  • Ashley Maynard, ETH, Switzerland
  • Zhisong He, ETH, Switzerland
  • Barbara Treutlein, ETH, Switzerland
  • Tanja Stadler, ETH, Switzerland

Short Abstract: A central goal of developmental biology is to understand the building of a complex tissue from an initial cell. Single-cell lineage tracing and expression data have the potential to elucidate the cell-state transitions during that process. However, lineage tracing data is typically analysed using methods based on parsimony. A computational framework quantifying the transition dynamics by integrating both data sources is lacking.

We derived a novel substitution model that approximates the editing process of frequently used lineage tracing systems (LINNAEUS, ScarTrace). Using this substitution model, we perform Bayesian inference of single-cell lineage trees. Compared to state-of-the-art maximum parsimony methods, this allows us to take recurring editing outcomes and phylogenetic uncertainty into account. Alongside the trees, we estimate each cell state's growth and state transition rates. We implemented our model as a package within the BEAST2 platform and validated it on simulated data. We apply it to cerebral organoid data from multiple time points to investigate the cell-state transitions from neural progenitor to neuron cells.

We provide a framework to estimate cell-state transition rates based on lineage tracing and RNAseq data. This framework will enable a more quantitative understanding of development in health and disease.

matUtils: Resources and Tools for MAT Phylogenetic Analysis
COSI: EvolCompGen
  • Jakob Mcbroome, University of California, Santa Cruz, United States
  • Bryan Thornlow, University of California, Santa Cruz, United States
  • Russell Corbett-Detig, University of California, Santa Cruz, United States
  • Angie Hinrichs, University of California, Santa Cruz, United States
  • Yatish Turakhia, University of California, San Diego, United States

Short Abstract: Phylogenetic analyses are at the core of tracing viral evolution and interpreting local and global transmission networks, but the vast scale of available SARS-CoV-2 genome sequences has revealed deep inadequacies in existing phylogenetic workflows, data sharing platforms and data formats. A comprehensive and publicly available global phylogenetic dataset is needed to bring consistency to different research groups and enable rapid contact tracing and analysis. New data formats, such as mutation-annotated tree (MAT) (Turakhia, Thornlow, et al., 2020), can fulfill this need. We present a database of public MATs representing SARS-CoV-2 phylogenies, along with a new toolkit for the efficient querying, manipulation, and analysis of MAT structures, matUtils. Our database is updated daily with new publicly-available sequences and is annotated with Pangolin and Nextstrain clades. To support the use of this database, our toolset includes basic functionality like parsimony statistics, subtree selection, and conversion from MAT to standard formats. matUtils can additionally analyze placement uncertainty and calculate heuristics for phylogeographic analysis. Overall, our package provides support for public, accessible, and rapid phylogenetic analysis, including contact tracing and the tracking of variants of concern of SARS-CoV-2.

Phylogenetic profiling of PAF1 complex-interacting proteins during germ cell development
COSI: EvolCompGen
  • Hisashi Takatsuka, Ritsumeikan University, Japan
  • Yukihiko Kubota, Ritsumeikan University, Japan
  • Masahiro Ito, Ritsumeikan University, Japan

Short Abstract: PAF1 (RNA polymerase-II associated factor-1) complex (PAF1C) is a transcriptional regulator that exists as a hetero-pentamer comprising PAF1, CTR9, LEO1, CDC73, and RTF1. PAF1C is widely conserved from yeast to human and is suggested to regulate various biological processes.
RNAi knockdown of genes encoding PAF1C components in hermaphroditic Caenorhabditis elegans (C. elegans) resulted in oogenesis defects, suggesting the involvement of PAF1C in germ cell development. In this study, we performed evolutionary analysis of PAF1C-interacting proteins and assessed their gene expression in C. elegans and humans.
We constructed phylogenetic profiles based on the presence or absence of orthologs of 458 proteins that directly interact with PAF1C components in humans, across 542 species with available whole genome sequences. The resulting phylogenetic profiles were categorized into five classes based on the presence of orthologs. Fourteen of the 34 proteins that exhibited ovary-specific expression in humans grouped together when their C. elegans homologs were clustered by similarity of gene expression during development. Moreover, ten proteins were conserved throughout eukaryotes. These findings suggest that the molecular basis of PAF1C-related oogenesis is conserved in protozoa and provide a direction for mechanistic elucidation.

Quantifying negative selection on synonymous variants
COSI: EvolCompGen
  • Mikhail Gudkov, Victor Chang Cardiac Research Institute and St Vincent's Clinical School, Australia
  • Loïc Thibaut, Victor Chang Cardiac Research Institute and School of Mathematics and Statistics UNSW Sydney, Australia
  • Eleni Giannoulatou, Victor Chang Cardiac Research Institute and St Vincent's Clinical School, Australia

Short Abstract: Most disease sequencing studies tend to focus primarily on potential loss-of-function variants at the expense of other classes of mutations. In particular, synonymous genetic variants, that is, those single-nucleotide variants (SNVs) that do not alter the produced amino acid sequence, are routinely considered to be non-deleterious. However, the role of these so-called ‘silent mutations’ is potentially more important than was previously thought. For instance, synonymous SNVs (sSNVs) may create nonoptimal codons, thus affecting the stability of the produced mRNA and the overall translational efficiency.

It has also been shown that optimality-reducing sSNVs undergo purifying selection, the extent of which, nonetheless, remains unknown. The latter presents a significant limitation for variant prioritization and, consequently, for finding the true causes of genetic disorders.

Here we quantify the intensity of the negative selection acting on all possible sSNVs. Using MAPS, a recently developed metric of deleteriousness, we found that optimality-reducing sSNVs are subject to stronger selection than optimality-increasing ones, which was confirmed using conservation and site-frequency spectrum analyses. Furthermore, we found that purifying selection affects sSNVs in a gene- and amino acid-dependent manner, with glutamine being particularly intolerant to such mutations. We also propose an improved version of MAPS for sSNVs.

Restriction-modification systems in Acinetobacter baumannii
COSI: EvolCompGen
  • Anna Ershova, Department of Microbiology, School of Genetics and Microbiology, Trinity College Dublin, Ireland
  • Carsten Kröger, Department of Microbiology, School of Genetics and Microbiology, Trinity College Dublin, Ireland

Short Abstract: Acinetobacter baumannii is a nosocomial pathogen with a high level of genomic plasticity and the ability to colonise different environments. An analysis of the R-M system and solitary DNA methyltransferase (MTase) distribution in A. baumannii will shed light on horizontal gene flow and gene regulation in these bacteria.
By comparing R-M system proteins from 265 genomes, we discovered that only one MTase homolog (>95% identity) was found in almost all (264 of 265) genomes. Deleting the gene of this MTase resulted in impaired growth and decreased motility of the mutant strain in comparison to the wild type. Additionally, we identified two protein clusters that are widely distributed (119 and 21 genomes) but never found in the same genome; the remaining 125 genomes encode neither of these proteins. A comparison of all R-M protein clusters in these three groups of genomes showed that the first two groups have fewer common clusters than each of them with the third group. Taken together, R-M proteins can be associated with phenotypic traits of A. baumannii and affect the genomic structure. This project has received funding from the European Union’s Horizon 2020 programme under the Marie Sklodowska-Curie grant agreement No 896441.

SequenceServer: A modern graphical user interface for custom BLAST databases
COSI: EvolCompGen
  • Anurag Priyam, Queen Mary University of London, United Kingdom
  • Yannick Wurm, Queen Mary University of London, United Kingdom

Short Abstract: Comparing newly obtained and previously known nucleotide and amino-acid sequences underpins modern biological research. BLAST is a well-established tool for such comparisons, but it can be challenging to use on new data sets. We thus created SequenceServer, a tool for running BLAST searches and visually inspecting BLAST results for biological interpretation.

SequenceServer has a flexible feature set to support varying researcher requirements. It includes a graphical and tabular overview of matching sequences for each query, a length histogram of matching sequences for each query, Kablammo visualisation for each query-hit pair to reveal large insertion and deletion events, a circos-style plot of queries and their top hits to reveal conserved synteny and duplications, the ability to download matching sequences and regions in FASTA format, the ability to download raw BLAST output in XML and tabular formats, and the ability to use command-line BLAST parameters to customise search results. Results are stored for a user-configurable amount of time and can be shared easily with colleagues. SequenceServer is used by hundreds of researchers worldwide. We expect our software will continue to empower many more.

SIEVE: Joint Inference of Tumor Phylogeny and Variant Calling from Single-cell DNA Sequencing Data
COSI: EvolCompGen
  • Senbai Kang, University of Warsaw, Poland
  • Nico Borgsmüller, ETH Zurich, Switzerland
  • Monica Valecha, University of Vigo, Spain
  • Jack Kuipers, ETH Zurich, Switzerland
  • Niko Beerenwinkel, ETH Zurich, Switzerland
  • David Posada, University of Vigo, Spain
  • Ewa Szczurek, University of Warsaw, Poland

Short Abstract: Understanding intra-tumor heterogeneity is the cornerstone of developing effective cancer treatment and precision medicine. The development of single-cell DNA sequencing technology has remarkably increased the resolution of DNA profiles to the single-cell level. This facilitates the inference of phylogenetic trees with individual tumor cells as leaves, providing an evolutionary model of the mechanism behind intra-tumor heterogeneity. However, most of the methods proposed for tree reconstruction from single-cell data are based on the infinite-sites assumption, which is often violated in reality due to evolutionary events such as loss of heterozygosity.

Here, we develop a novel computational model, called Sieve, to jointly infer tumor phylogeny and call variants under the finite-sites assumption from single-cell data. We propose a novel rate matrix, with states representing genotypes corresponding to heterozygous and homozygous mutations. To properly integrate the noisy single-cell sequencing data, we develop a Dirichlet-Multinomial based probabilistic model of the sequencing coverage and nucleotide read counts. The model accounts for allelic dropouts. To obtain accurate branch lengths, acquisition bias correction is applied. We show that Sieve outperforms existing approaches on simulated data, especially regarding branch lengths and calling homozygous mutations. Sieve is then applied to publicly available real datasets. Sieve is implemented as a package of BEAST 2.
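For readers unfamiliar with the Dirichlet-Multinomial distribution mentioned above, the following Python sketch evaluates its standard log probability mass function for a vector of read counts. It is a generic textbook formula, not Sieve's actual parameterisation; the function name and the example concentration parameters are assumptions made for illustration.

```python
from math import exp, lgamma


def dirichlet_multinomial_logpmf(counts, alphas):
    """Log-probability of `counts` under a Dirichlet-Multinomial.

    counts: non-negative integer counts, e.g. reads supporting A, C, G, T.
    alphas: positive concentration parameters, one per category.
    """
    n = sum(counts)
    a = sum(alphas)
    # log[ n! * Gamma(a) / Gamma(n + a) ]
    logp = lgamma(n + 1) + lgamma(a) - lgamma(n + a)
    # plus sum over categories of log[ Gamma(x + alpha) / (x! * Gamma(alpha)) ]
    for x, alpha in zip(counts, alphas):
        logp += lgamma(x + alpha) - lgamma(x + 1) - lgamma(alpha)
    return logp


# Hypothetical site: 20 reads, mostly one nucleotide, with a few errors.
read_counts = [18, 1, 0, 1]
# Hypothetical parameters favouring the first nucleotide.
concentration = [9.0, 0.3, 0.3, 0.3]
print(dirichlet_multinomial_logpmf(read_counts, concentration))
```

Compared with a plain multinomial, the extra dispersion allowed by the concentration parameters is what makes this family a common choice for overdispersed count data such as sequencing reads.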

Super-Reconciliation with Horizontal Gene Transfers
COSI: EvolCompGen
  • Mattéo Delabre, University of Montreal, Canada
  • Nadia El-Mabrouk, University of Montreal, Canada

Short Abstract: During evolution, genes are mutated, duplicated, lost, and passed on to other organisms through speciations and horizontal gene transfers (HGTs) and, over time, form families of homologous genes. Reconciliation is a longstanding model used for inferring the evolutionary history of such families, explaining incongruence between the tree of a given family and the corresponding species tree by evolutionary events. A major drawback of reconciliations is that gene families are considered to evolve independently from one another. This assumption is not suited for explaining the evolution of chromosomal segments of genes that evolved together. The super-reconciliation model was the first attempt to generalize the reconciliation approach to several gene families organized into syntenies. This model was, however, limited to duplications and losses and did not allow for HGTs. In this presentation, I will present new algorithmic results for the super-reconciliation problem, extending it to include HGT events and seeking the DTL-distance. I will show that duplications, HGTs, and full losses cannot be inferred independently from segmental losses if an optimal solution is sought. I will also present an algorithm for the DTL-distance. This extended model can enable future studies on the evolution of HGT-shaped syntenies, such as operons in bacteria.

Trees trump traits for ancestral genome reconstruction
COSI: EvolCompGen
  • Yuting Xiao, Carnegie Mellon University, United States
  • Maureen Stolzer, Carnegie Mellon University, United States
  • Dannie Durand, Carnegie Mellon University, United States

Short Abstract: Reconstruction of genome evolution is the foundation of many studies in computational biology. Wagner parsimony and tree reconciliation are parsimony-based approaches for reconstruction of ancestral genome content. Wagner parsimony infers ancestral content by minimizing gains and losses along the branches of the species tree. Tree reconciliation, on the other hand, additionally takes transfer events and phylogenetic information into consideration.

Here, we demonstrate that these two methods can lead to very different evolutionary inferences. We used Notung, tree reconciliation software developed by the Durand Lab, to reconcile 10,628 gene families in selected Cyanobacteria species. These results were compared with those of Wagner parsimony in terms of inferred family gain/loss and genome size changes. We found that Wagner parsimony tends to overestimate the number of gain events and underestimate the number of losses. Moreover, it suggests that the family sizes of present-day species resulted from a gradual expansion of family gains. In contrast, results obtained from Notung show that reconciliation is unbiased with respect to event type. Further, instead of gradual expansion, Notung infers a combination of genome expansion, streamlining and turnover. Our results show that model choice can dramatically affect downstream analyses when inferring the history of genome evolution.
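To make the Wagner parsimony criterion concrete, here is a minimal Python sketch that finds the smallest total number of gains and losses for one gene family on a rooted species tree, using a Sankoff-style dynamic programme with cost |parent size - child size| per branch. It assumes equal unit costs for gains and losses; the tree encoding (nested tuples with leaf names as strings), the function name, and the example counts are hypothetical, and this is neither Notung nor the analysis pipeline used in the study.

```python
def wagner_parsimony(tree, leaf_counts):
    """Return (minimum total change, one optimal root family size).

    tree: nested tuples of leaf names, e.g. (("A", "B"), ("C", "D")).
    leaf_counts: {leaf name: observed family size in that species}.
    """
    max_state = max(leaf_counts.values())
    states = range(max_state + 1)
    inf = float("inf")

    def cost_vector(node):
        # cost[s] = least total change within this subtree if node has size s.
        if isinstance(node, str):
            return [0 if s == leaf_counts[node] else inf for s in states]
        total = [0] * (max_state + 1)
        for child in node:
            child_cost = cost_vector(child)
            for s in states:
                # Best child size t, paying |s - t| gains or losses on the branch.
                total[s] += min(child_cost[t] + abs(s - t) for t in states)
        return total

    root_cost = cost_vector(tree)
    best = min(root_cost)
    return best, root_cost.index(best)


# Made-up species tree and family sizes, for illustration only.
species_tree = (("A", "B"), ("C", "D"))
family_sizes = {"A": 2, "B": 2, "C": 0, "D": 1}
result = wagner_parsimony(species_tree, family_sizes)
print(result)
```

Restricting ancestral sizes to the range 0 to the largest observed size is sufficient here, because an ancestral value outside the range of the leaf values can never lower the total absolute change.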

Utilization of TFmiR2 for the identification of conserved hotspot nodes
COSI: EvolCompGen
  • Maryam Nazarieh, NA, Germany

Short Abstract: Identifying evolutionarily conserved transcription responses across species is an intriguing challenge. Given that biological processes are conserved across species, it is reasonable to look for the evolutionarily conserved set of genes that work together to govern a particular cellular identity, such as the cell cycle process. In terms of diseases, it is challenging to investigate whether the underlying mutations in the DNA sequence are preserved for the set of conserved differentially expressed genes (DEGs).
Although TFmiR2 was initially introduced as a web server for constructing and analyzing disease-, tissue- and process-specific transcription factor and microRNA co-regulatory networks, it can also be used for comparative genomic analysis. It takes as input a set of DEGs for a particular organism selected by the user and constructs a co-regulatory network as well as disease-, tissue- and process-specific co-regulatory networks. It then applies a set of tools such as MDS and MCDS to find hotspot nodes. Therefore, it can be used to identify evolutionarily conserved hotspot genes in different organisms by considering the same tissue, process, and disease in each comparison.
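
As a rough illustration of the hotspot idea, and assuming that MDS refers to a minimum dominating set of the regulatory network, the Python sketch below applies a simple greedy heuristic to a directed network: a node covers itself and its direct targets. The greedy result is a valid dominating set but is not guaranteed to be minimum, and this is not TFmiR2's implementation; the function name and the toy network are hypothetical.

```python
def greedy_dominating_set(edges):
    """Greedily choose regulators until every node is chosen or regulated.

    edges: iterable of (regulator, target) pairs of a directed network.
    """
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)
        targets.setdefault(dst, set())

    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Pick the node covering the most uncovered nodes (ties: by name).
        best = max(sorted(targets),
                   key=lambda n: len(({n} | targets[n]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | targets[best]
    return chosen


# Toy co-regulatory network with made-up node names.
network = [("TF1", "g1"), ("TF1", "g2"), ("TF1", "miR1"),
           ("miR1", "g3"), ("TF2", "g3")]
hotspots = greedy_dominating_set(network)
print(hotspots)
```

An exact minimum dominating set would typically require an integer linear programme or exhaustive search; the greedy version is shown only because it fits in a few lines.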



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
