Posters at ISMB 2020 will be presented virtually. Authors will pre-record their poster talk (5-7 minutes) and upload it to the virtual conference platform along with a PDF of their poster.
All registered conference participants will have access to the posters and presentations through the conference platform until October 31, 2020. A chat function provides Q&A opportunities for interaction between presenters and participants.
Preliminary information on preparing your poster and poster talk is available at: https://www.iscb.org/ismb2020-general/presenterinfo#posters
Ideally authors should be available for interactive chat during the times noted below:
Poster Session A: July 13 and July 14, 7:45 am - 9:15 am Eastern Daylight Time
Poster Session B: July 15 and July 16, 7:45 am - 9:15 am Eastern Daylight Time
Short Abstract: An Immune Exposure is the process by which components of the immune system first encounter a potential trigger. Data resources that house scientific data related to the immune response need to describe the details of the Immune Exposure process consistently. This need was met through the development of a structured model for Immune Exposures. The model was created during curation of the immunology literature, resulting in a robust model capable of meeting the requirements of such data. We present this model with the hope that overlapping projects will adopt and/or contribute to this work.
Short Abstract: Genome-wide association studies (GWAS) have been widely used to identify potentially causative variants of a genetic disease or trait given patient phenotypes. However, GWAS generally cannot present the complete picture, particularly of how the studied trait relates to other similar traits, because often not all of the available phenotype information is exploited in the analyses.
Here, we propose to use Ontology-Wide Genome Associations Study (OWAS) to complete the phenotype profiles of diseases and perform GWAS on UK Biobank.
More specifically, with OWAS, we use the phenotype information that exists in the literature and in semantic resources to extend GWAS to cases that are not explicitly associated with the phenotypes. Our initial results show that this approach has the potential to increase the statistical power of GWAS and to identify associations for phenotypes that have not been explicitly observed.
Short Abstract: The Experimental Factor Ontology (EFO) is an application ontology that represents experimental variables and offers custom extensions for EMBL-EBI data. One focus area is the provision of a unified vocabulary of disease/phenotype terms used to annotate data collected by Open Targets for drug/target validation. After the EFO architecture was improved in 2018, a large increase in potentially redundant associations was seen in Open Targets data, which made it necessary to rearrange the top level of the EFO disease hierarchy. In doing so, EFO was aligned with clinical disease hierarchies rather than the previous hierarchy derived from anatomical systems, and the disease hierarchy as a whole improved.
Short Abstract: Defects in the activity of lysosomal hydrolases involved in glycosaminoglycan degradation lead to a group of lysosomal storage diseases called mucopolysaccharidoses (MPS). In MPS, secondary cell disturbance affects pathways common to cancer. This work aims to investigate the differential expression of cancer pathways in MPS datasets available in public functional genomics data repositories such as GEO (Gene Expression Omnibus) and ArrayExpress. We performed Gene Ontology analysis with QuickGO and KEGG annotations, and the network was constructed with ShinyGO (v0.61) based on the hypergeometric distribution followed by FDR correction. We selected 12 datasets: 2 from MPS I, 1 from MPS II, 1 from MPS IIIA, 3 from MPS IIIB, 1 from MPS VI, and 4 from MPS VII. The most abundant ontologies found were the PI3K-Akt, MAPK, Ras, Rap1, and cAMP signaling pathways. We hypothesize that these signaling pathways are altered because glycosaminoglycans play an essential role in the composition of the extracellular matrix, helping to regulate processes such as metabolic signaling, apoptosis, cell migration, adhesion, and antigen presentation. Understanding these biological processes may help us look for new biomarkers for these rare metabolic diseases.
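The enrichment statistics named in this abstract (a hypergeometric test followed by FDR correction) can be illustrated with a minimal sketch. This is not ShinyGO's implementation: the function names, the choice of the Benjamini-Hochberg procedure as the FDR method, and all counts in the usage section are assumptions made for illustration only.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: number of background genes
    successes:  background genes annotated to the pathway
    draws:      size of the query gene list
    k:          query genes annotated to the pathway
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (hits in query, pathway size) per pathway,
    # with an assumed background of 20000 genes and a query list of 300.
    pathways = {"pathway_a": (25, 350), "pathway_b": (8, 290), "pathway_c": (3, 230)}
    pvals = [hypergeom_sf(k, 20000, n, 300) for k, n in pathways.values()]
    for name, p, q in zip(pathways, pvals, benjamini_hochberg(pvals)):
        print(name, p, q)
```

The one-sided upper tail is used because enrichment asks whether a pathway appears in the query list more often than chance would predict.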
Short Abstract: Diagnosis of COVID-19 is critical to the control of the COVID-19 pandemic. Common diagnostic methods include symptom identification, chest imaging, serological testing, and RT-PCR. However, the sensitivity and specificity of different diagnostic methods differ. In this study, we ontologically represent different aspects of COVID-19 diagnosis using the community-based Coronavirus Infectious Disease Ontology (CIDO), an OBO Foundry library ontology. CIDO includes many new terms and also imports many relevant terms from existing ontologies. The high-level hierarchy and design pattern of CIDO are introduced to support COVID-19 diagnosis. Knowledge reported in the literature and in reliable resources such as the FDA website is ontologically represented. We modeled and compared over 20 SARS-CoV-2 RT-PCR assays, which target different gene markers in SARS-CoV-2. The sensitivity and specificity of different methods are discussed.
Short Abstract: The Open Biological and Biomedical Ontology (OBO) Foundry is a collective of ontology developers committed to collaboration and shared principles. The OBO Foundry mission is to develop a family of logically well-formed and scientifically accurate interoperable ontologies. Participants voluntarily adhere to, and contribute to the development of, an evolving set of principles including open use, collaborative development, non-overlapping and strictly scoped content, and common syntax and relations. OBO provides services to the community such as hosting persistent URLs and ontology files, recording metadata, and supporting discussion forums and regular calls.
We developed a set of key top-level ontology terms that unify the many OBO Foundry ontologies, termed the Core Ontology for Biology and Biomedicine (COB). COB simplifies the identification of ontology terms, which in turn simplifies navigation across OBO projects. It includes logic that links ontologies together, allowing interoperability problems to be detected and corrected. Related terms from multiple ontologies can be viewed at the same time, illustrating how OBO ontologies and their terms are related and helping to ensure interoperability.
COB is still in active development; we are eager to obtain community feedback. We want to collect actionable suggestions on what users most want in COB and what this community would find most useful to their daily practices.
Short Abstract: Many organizations face challenges in managing and analyzing data, especially when such data is obtained from multiple sources, created using diverse methods or protocols. Analyzing heterogeneous, structured datasets requires rigorous tracking of their interrelationships and provenance. This task has long been a Grand Challenge of data science, and has more recently been formalized in the FAIR principles: that all data be Findable, Accessible, Interoperable and Reusable, both for machines and for people. Adherence to these principles is necessary for proper stewardship of information, for testing regulatory compliance, for measuring efficiency, and for effectively being able to reuse data analytical frameworks. Interoperability and reusability are especially challenging to implement in practice, to the extent that scientists acknowledge a “reproducibility crisis” across many fields of study. We developed CORAL, a framework for organizing the large diversity of datasets that are generated and used by moderately complex organizations. CORAL features a web interface for bench scientists to upload and explore data, as well as a Jupyter notebook interface for data analysts, both backed by a common API. We describe the CORAL data model and associated tools, and discuss how they greatly facilitate adherence to all four of the FAIR principles.
Short Abstract: Ontologies and knowledge graphs now play an important role in Open Data and Big Data applications, and the life sciences are no exception.
Under development since 1989, the international ImMunoGeneTics information system® (IMGT®) today brings together several rich relational databases (sequence, genome, structure and monoclonal antibody databases), software tools, and multiple unstructured resources such as HTML pages and PDF documents. IMGT’s strength is the provision of a standard vocabulary for the data and tools in the system: the IMGT-ONTOLOGY, refined over the years. In 1999, we laid the foundation of IMGT-ONTOLOGY, and a first implementation in the RDF+OWL language became available in 2010 through the BioPortal (bioportal.bioontology.org/). However, this formalization takes into account only a small part of the data in the system. As a result, we cannot currently describe a sequence with the IMGT-ONTOLOGY, because the description concepts are missing from it.
Our work aims to propose a generalized description model that covers all of the system’s data by integrating the IMGT DESCRIPTION axiom. This will allow us to structure all the data in the form of a knowledge graph. Subsequently, the goal is to apply machine learning methods to discover new knowledge from the structured data.
Short Abstract: Many protein function databases are built on automated or semi-automated curation and can contain various annotation errors. Correcting such misannotations is critical to improving the accuracy and reliability of the databases. We propose a new approach to detect potentially incorrect Gene Ontology (GO) annotations by comparing the ratio of annotation rates (RAR) for the same GO term across different taxonomic groups, where those with a relatively low RAR usually correspond to incorrect annotations. As an illustration, we applied the approach to 20 commonly studied species in two recent UniProt-GOA releases and identified 250 potential misannotations in the 2018-11-6 release, of which only 25% were corrected in the 2019-6-3 release. Importantly, 56% of the misannotations are “Inferred from Biological aspect of Ancestor (IBA)”, i.e. reviewed computational annotations based on phylogenetic analysis. This is in contrast to previous observations that attributed misannotations mainly to “Inferred from Sequence or structural Similarity (ISS)”, and probably reflects a shift in error sources due to new developments in function annotation databases. The results demonstrate a simple but efficient misannotation detection approach that is useful for large-scale comparative protein function studies. The code and list of identified misannotations are available at zhanglab.ccmb.med.umich.edu/RAR.
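To make the idea of comparing annotation rates across taxonomic groups concrete, here is a minimal sketch. The abstract does not give the exact RAR formula, so everything below is an assumption: the annotation rate is taken as the fraction of a group's proteins annotated with a GO term, the RAR of a group is taken as its rate divided by the highest rate among all groups, and the threshold value and group names are hypothetical. The authors' actual definition is in the code at the URL above.

```python
def annotation_rates(annotated_counts, protein_totals):
    """Fraction of proteins in each taxonomic group annotated with one GO term."""
    return {
        group: annotated_counts.get(group, 0) / total
        for group, total in protein_totals.items()
        if total > 0
    }


def ratio_of_annotation_rates(rates):
    """Assumed RAR: each group's rate divided by the highest rate of any group."""
    top = max(rates.values(), default=0.0)
    if top == 0:
        return {group: 0.0 for group in rates}
    return {group: rate / top for group, rate in rates.items()}


def flag_low_rar(rars, threshold=0.01):
    """Groups where the term is annotated, but far more rarely than elsewhere.

    The threshold is a hypothetical cut-off chosen for this sketch.
    """
    return sorted(group for group, value in rars.items() if 0 < value < threshold)


if __name__ == "__main__":
    # Toy counts for a single GO term; not real UniProt-GOA data.
    annotated = {"mammals": 500, "plants": 2}
    totals = {"mammals": 1000, "plants": 1000, "fungi": 1000}
    rars = ratio_of_annotation_rates(annotation_rates(annotated, totals))
    print(flag_low_rar(rars))
```

Groups with no annotations at all are not flagged, since there is no annotation to review in that case.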
Short Abstract: We present the following components of an integrated biomedical entity recognition system:
- The Bio Term Hub, an aggregator of terminologies sourced from reference ontologies, such as Cellosaurus, Cell Ontology, ChEBI, CTD, EntrezGene, GO, NCBI taxonomy, Protein Ontology, Sequence Ontology, SwissProt, Uberon.
- OGER, a fast and accurate named entity recognition system based on the terminologies derived by the Bio Term Hub. OGER has been shown to be extremely fast and of great practical utility in a number of shared task evaluations in the biomedical text mining field.
- A state-of-the-art deep learning model which greatly improves the precision of the OGER system.
Since all the components of the system have been described in previous publications, we will focus the presentation on practical aspects, with the aim of promoting adoption of such a powerful tool by potential end users.
Short Abstract: Medical practitioners record a patient's condition through qualitative and quantitative observations. The measurement of vital signs and molecular parameters in the clinic gives a complementary description of abnormal phenotypes associated with the progression of a disease. The Clinical Measurement Ontology (CMO) is used to standardize annotations of these measurable traits. However, researchers have no way to describe how these quantitative traits relate to phenotype concepts in a machine-readable manner. Using the WHO clinical case report form standard for the COVID-19 pandemic, we modeled quantitative traits and developed OWL axioms to formally relate clinical measurement terms to anatomical entities, biomolecular entities, and phenotypes annotated with the Uber-anatomy ontology (Uberon), Chemical Entities of Biological Interest (ChEBI), and the Phenotype and Trait Ontology (PATO). The formal description of these relations allows interoperability between clinical and biological descriptions and facilitates automated reasoning for the analysis of patterns over quantitative and qualitative biomedical observations.
Short Abstract: SARS-CoV-2 is the pathogen that causes COVID-19. It is commonly agreed that SARS-CoV-2 originated from some animal host. However, the exact origin of SARS-CoV-2 remains unclear. The origins of other human coronaviruses, including SARS-CoV and MERS-CoV, are also unclear. This study focuses on the collection, ontological modeling and representation, and analysis of the hosts of various human coronaviruses, with a focus on SARS-CoV-2. Over 20 natural and laboratory animal hosts were found to be able to host human coronaviruses. All the viruses and hosts were classified using the NCBITaxon ontology. The related terms were also imported into the Coronavirus Infectious Disease Ontology (CIDO), and the relations between human coronaviruses and their hosts were linked using an axiom in CIDO. Our ontological classification of all the hosts also allowed us to hypothesize that human coronaviruses use only mammals as their hosts.
Short Abstract: In the poorly studied field of physician suicide, various factors can contribute to misinformation or information distortion, which in turn can influence evidence-based policies and the prevention of suicide in this unique population. Here, we report on the use of nanopublications as a scientific publishing approach to establish a citation network of claims drawn from a variety of media concerning the rate of suicide among US physicians. Our work integrates these various claims and enables the verification of non-authoritative assertions, thereby better equipping researchers to advance evidence-based knowledge and make informed statements in the advocacy of physician suicide prevention.
Short Abstract: RGD (rgd.mcw.edu) is a multi-species knowledgebase which provides a substantial corpus of genomic, genetic, phenotypic and disease-related data and an innovative suite of tools for analyzing these data. Researchers can leverage cross-species manual annotations from RGD and annotations imported from external sources to search for an appropriate model. As an example, a researcher studying Wilson disease can find a list of associated genes using RGD's OLGA tool. An integrated toolbox facilitates submission of gene lists to other analysis tools to explore annotations across ontologies and across species. In analyses related to Wilson disease, ATP7B is one gene which commonly appears. The association between ATP7B and Wilson disease is well-documented at RGD via disease and phenotype annotations and associated pathogenic variants. Links on the gene page provide access to data for other species, such as an extensive list of mouse phenotypes. For rat, RGD's PhenoMiner tool provides related quantitative measurement data. RGD's strain record details a large Atp7b deletion in the LEC/Hok strain, a Wilson disease model. RGD's Variant Visualizer provides functionality to explore pathogenic or damaging variants for human, rat and dog.
Short Abstract: Background: Variation graphs are a novel way to describe genomic variation across a population. Variation graph tools present a significant improvement in mitigating reference bias compared to the linear reference ecosystem. Existing toolkits focus on algorithms processing pangenome graphs. Yet, they have limited capabilities in integrating various annotations of the biology and providing an interface for large scale visualizations.
Description: Interpreting the biological meaning of variation graphs requires integrating various kinds of annotations for further analysis, which in turn calls for FAIR data interchange formats. A borderless technology such as the Semantic Web allows variation graph toolkits and pangenome tools to focus on their core competence while allowing bioinformaticians to integrate, analyze, and visualize the data.
Result: We demonstrate how we can represent a graphical pangenome with pangenome ontologies using a standard declarative graph query language. Then we show how the vg RDF and Pantograph RDF can represent data ready for the Semantic Web and how we can combine existing data from INSDC and UniProt, without conversions or loss of information, into a single Variation and Knowledge Graph.
Short Abstract: With the advances in Next Generation Sequencing (NGS) technologies, a huge volume of clinical genomic data has become available. Efficient exploitation of such data requires linkage to a patient's complete phenotype profile. Current resources providing disease-phenotype associations are not comprehensive, and they often do not cover all of the diseases from OMIM and particularly from ICD10, which are the primary terminologies used in clinical settings. Here, we propose a text-mining system which utilizes semantic relations in the phenotype ontologies and statistical methods to extract disease-phenotype associations from the literature. We compare our findings against established disease-phenotype associations and also demonstrate its utility in covering mouse gene-disease associations from Mouse Genome Informatics (MGI). Such associations serve as necessary information blocks for understanding underlying disease mechanisms and developing or repurposing drugs.
Short Abstract: The Gene Ontology is a standardized vocabulary used to assign functions to genes and their products. Our lab has developed a suite of tools called GOMAP (Gene Ontology Meta Annotator for Plants; doi.org/10.1101/809988, doi.org/10.1002/pld3.52). GOMAP generates high-confidence, high-coverage, and reproducible functional annotations for plants by combining multiple functional prediction approaches. Eighteen crop plant genomes have been annotated by the GOMAP pipeline (dill-picl.org/projects/gomap/gomap-datasets/), making it possible to use these datasets in a comparative manner across species. We used the union of all functions represented by each annotated genome to generate parsimony and distance matrices for these 18 species, then created dendrograms to visualize the relationships. These dendrograms were compared to well-established species-level phylogenies to determine whether trees derived through analysis of the functions contained in each genome agree with known evolutionary histories, which they largely do. Where the dendrograms do not agree with the known evolutionary histories of these species, we have begun to investigate the source of the discrepancies. We find that this methodology has a potential application in identifying novel candidate genes for traits of interest, and we outline future plans for its application within and across species.
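The step of turning per-genome sets of GO terms into a distance matrix can be sketched as follows. The abstract does not state which distance was used, so the Jaccard distance here is an assumption, and the species names and GO identifiers are toy placeholders rather than GOMAP output.

```python
from itertools import combinations


def jaccard_distance(terms_a, terms_b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of GO term identifiers."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return 1.0 - len(terms_a & terms_b) / len(union)


def distance_matrix(term_sets):
    """Symmetric nested-dict distance matrix keyed by genome name."""
    names = sorted(term_sets)
    matrix = {row: {col: 0.0 for col in names} for row in names}
    for first, second in combinations(names, 2):
        dist = jaccard_distance(term_sets[first], term_sets[second])
        matrix[first][second] = dist
        matrix[second][first] = dist
    return matrix


if __name__ == "__main__":
    # Toy term sets; identifiers are placeholders, not real annotations.
    genomes = {
        "species_a": {"GO:0000001", "GO:0000002", "GO:0000003"},
        "species_b": {"GO:0000002", "GO:0000003", "GO:0000004"},
        "species_c": {"GO:0000005"},
    }
    for name, row in distance_matrix(genomes).items():
        print(name, row)
```

A matrix of this shape can then be passed to a hierarchical clustering or neighbor-joining routine to produce a dendrogram for comparison with a known phylogeny.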