Bio-Ontologies

Attention presenters: please review the Speaker Information Page.
Schedule subject to change
All times listed are in CEST
Monday, July 24th
13:50-14:00
COSI Opening Remarks
Room: Salle Rhone 3a
Format: Live from venue

  • Núria Queralt Rosinach
14:00-15:00
Invited Presentation: Ontology Alignment for Life Sciences
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Ernesto Jimenez-Ruiz, City, University of London, United Kingdom


Presentation Overview:

The semantic web and life sciences research communities have extensively investigated the problem of defining correspondences between independently developed ontologies, usually referred to as the ontology alignment problem. This effort has produced a growing number of ontology matching systems and large mapping repositories. The ontology matching community also runs an annual evaluation campaign, the Ontology Alignment Evaluation Initiative (OAEI), to benchmark ontology alignment systems over different matching tasks. Despite some joint efforts, the OAEI could be better aligned with real-world challenges from the life sciences, not only to improve automated ontology alignment systems but also to provide useful outcomes that can be (re)used in practice.
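
For readers unfamiliar with the output of such systems, the sketch below shows one common way an alignment is represented: a set of correspondences between entities of two ontologies, each carrying a relation and a confidence score. The class IRIs and values are invented for illustration and do not come from any OAEI track.

```python
# Illustrative sketch (not any particular OAEI system): representing an ontology
# alignment as a list of correspondences and filtering by confidence.
# All IRIs and scores below are hypothetical.
from dataclasses import dataclass

@dataclass
class Correspondence:
    source: str       # entity IRI in the source ontology
    target: str       # entity IRI in the target ontology
    relation: str     # usually equivalence ("=") or subsumption ("<", ">")
    confidence: float

alignment = [
    Correspondence("http://example.org/ontoA#HeartDisease",
                   "http://example.org/ontoB#CardiacDisorder", "=", 0.93),
    Correspondence("http://example.org/ontoA#Myocarditis",
                   "http://example.org/ontoB#CardiacDisorder", "<", 0.71),
]

# Keep only high-confidence mappings, as a downstream application might do.
accepted = [c for c in alignment if c.confidence >= 0.8]
for c in accepted:
    print(f"{c.source} {c.relation} {c.target} ({c.confidence:.2f})")
```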

15:00-15:30
Proceedings Presentation: KR4SL: knowledge graph reasoning for explainable prediction of synthetic lethality
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Ke Zhang, ShanghaiTech University, China
  • Min Wu, I2R, A*STAR, Singapore
  • Yong Liu, Nanyang Technological University, Singapore
  • Yimiao Feng, ShanghaiTech University, China
  • Jie Zheng, ShanghaiTech University, China


Presentation Overview:

Motivation: Synthetic lethality (SL) is a promising strategy for anti-cancer therapy, as inhibiting SL partners of genes with cancer-specific mutations can selectively kill cancer cells without harming normal cells. Wet-lab techniques for SL screening have issues such as high cost and off-target effects. Computational methods can help address these issues. Previous machine learning methods leverage known SL pairs, and using a knowledge graph (KG) can significantly enhance prediction performance. However, the subgraph structures of the KG have not been fully explored. Moreover, most machine learning methods lack interpretability, which is an obstacle to the wide application of machine learning to SL identification.
Results: We present a model named KR4SL to predict SL partners for a given primary gene. It captures the structural semantics of a KG by efficiently constructing and learning from relational digraphs in the KG. To encode the semantic information of the relational digraphs, we fuse textual semantics of entities into the propagated messages and enhance the sequential semantics of paths using a recurrent neural network. Moreover, we design an attentive aggregator to identify, as explanations, the critical subgraph structures that contribute the most to the SL prediction. Extensive experiments under different settings show that KR4SL significantly outperforms all baselines. The explanatory subgraphs for the predicted gene pairs can unveil the prediction process and the mechanisms underlying synthetic lethality. The improved predictive power and interpretability indicate that deep learning is practically useful for SL-based cancer drug target discovery.
Availability: The source code is freely available at https://github.com/JieZheng-ShanghaiTech/KR4SL
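
As a rough illustration of the kind of architecture the abstract describes (not the authors' KR4SL code), the sketch below scores a candidate SL pair by encoding paths between the two genes with a recurrent network and pooling them with attention, so the attention weights can serve as a path-level explanation. It assumes PyTorch is available; the entity indices and paths are made up.

```python
# Toy sketch of path encoding with a GRU plus attentive pooling; not KR4SL itself.
import torch
import torch.nn as nn

class PathScorer(nn.Module):
    def __init__(self, n_entities, dim=32):
        super().__init__()
        self.emb = nn.Embedding(n_entities, dim)        # entity embeddings
        self.gru = nn.GRU(dim, dim, batch_first=True)   # sequential path semantics
        self.att = nn.Linear(dim, 1)                    # attentive aggregator over paths
        self.out = nn.Linear(dim, 1)                    # final SL score

    def forward(self, paths):                           # paths: (n_paths, path_len) entity ids
        h = self.emb(paths)                             # (n_paths, path_len, dim)
        _, last = self.gru(h)                           # last hidden state per path
        path_vec = last.squeeze(0)                      # (n_paths, dim)
        w = torch.softmax(self.att(path_vec), dim=0)    # attention weights over paths
        pooled = (w * path_vec).sum(dim=0)              # weighted sum -> pair representation
        return torch.sigmoid(self.out(pooled)), w       # score and per-path weights (explanation)

paths = torch.tensor([[0, 5, 9], [0, 3, 9], [0, 7, 9]])  # toy paths from gene 0 to gene 9
score, weights = PathScorer(n_entities=10)(paths)
print(float(score), weights.squeeze(-1).tolist())
```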

16:00-16:40
A 20-year journey developing the disease open science ecosystem
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Lynn Schriml, University of Maryland School of Medicine, United States
  • J. Allen Baron, Institute for Genome Sciences, United States
  • Claudia Marie Sanchez-Beato Johnson, Institute for Genome Sciences, United States
  • Dustin Olley, Institute for Genome Sciences, United States
  • Lance Nickel, Institute for Genome Sciences, United States
  • Mike Schor, Institute for Genome Sciences, United States


Presentation Overview:

The Human Disease Ontology (DO) has established rigorous quality control and release procedures to enhance data rigor and discovery across the human disease open science data ecosystem. As a CC0 resource, the DO Knowledgebase (DO-KB) develops and shares models of complex diseases, software for capturing resource usage, ML-ready datasets, and novel mechanisms for querying and retrieving disease datasets. Modeling best practices for ontology development over the past 20 years, the DO has led the field in engaging data contributors, collaborating with other data repositories, and supporting software development for projects that use the DO to analyze disease-gene networks, perform disease repurposing, represent animal models of human diseases, and develop application ontologies that mine the DO's content and structure to formulate novel data structures for new ontological purposes.

16:40-17:00
ChemoOnto, an ontology to qualify the course of chemotherapies
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Alice Rogier, PhD, France
  • Bastien Rance, Inserm, Inria, APHP, France
  • Adrien Coulet, Inserm, Inria, France


Presentation Overview:

Chemotherapies follow well-defined standard regimens (or protocols) recommended by scientific societies. These are organized in cycles in which the cytotoxic molecules, doses and days of administration are precisely specified. In real life, however, treatment may not go as planned: toxicity events, holidays and other factors lead to changes in doses and delays in administration, which may impact the effect of the treatment. Modeling both protocols and their real-world implementation in a single framework would facilitate further comparisons.
To this end, we propose an ontology named ChemoOnto to represent both protocols and treatment courses. ChemoOnto provides 10 classes, 16 object properties and 24 data properties to model the complexity of chemotherapy and to cover both standard and administered courses. ChemoOnto reuses several domain ontologies, in particular the Time Ontology and a drug knowledge graph named Romedi. We instantiated ChemoOnto with 1,973 chemotherapy protocols and treatment data from 3,923 patients. We added toxicity events detected in previous work to our knowledge graph and applied temporal reasoning with SWRL rules to detect toxicity events occurring during patients' chemotherapies.
ChemoOnto is an original model that may support various applications for understanding and analyzing chemotherapy courses and responses while accounting for the complexity of their description.
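
As an illustration of the temporal check that such SWRL rules encode (with hypothetical IRIs, not the actual ChemoOnto classes and properties), the sketch below builds a toy RDF graph with rdflib and asserts an occursDuring link when a toxicity event falls within a cycle's interval.

```python
# Simplified, hypothetical sketch of the temporal containment check; not ChemoOnto itself.
from datetime import date
from rdflib import Graph, Namespace, Literal, RDF
from rdflib.namespace import XSD

EX = Namespace("http://example.org/chemo#")
g = Graph()
g.add((EX.cycle1, RDF.type, EX.ChemotherapyCycle))
g.add((EX.cycle1, EX.hasBeginning, Literal(date(2021, 3, 1), datatype=XSD.date)))
g.add((EX.cycle1, EX.hasEnd, Literal(date(2021, 3, 21), datatype=XSD.date)))
g.add((EX.tox1, RDF.type, EX.ToxicityEvent))
g.add((EX.tox1, EX.occursOn, Literal(date(2021, 3, 10), datatype=XSD.date)))

start = g.value(EX.cycle1, EX.hasBeginning).toPython()
end = g.value(EX.cycle1, EX.hasEnd).toPython()
tox_date = g.value(EX.tox1, EX.occursOn).toPython()

if start <= tox_date <= end:
    # In the ontology, a SWRL rule would assert this link instead of imperative code.
    g.add((EX.tox1, EX.occursDuring, EX.cycle1))
print(g.serialize(format="turtle"))
```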

17:00-17:20
First Layperson Translation of the Sickle Cell Disease Ontology – Making SCD-Centred eHealth Platforms more Accessible
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Jade Hotchkiss, Division of Human Genetics, Department of Pathology, University of Cape Town, South Africa
  • Victoria Nembaware, Division of Human Genetics, Department of Pathology, University of Cape Town, South Africa
  • Wilson Mupfururirwa, Division of Human Genetics, Department of Pathology, University of Cape Town, South Africa
  • Nicole Vasilevsky, Critical Path Institute, Tucson, Arizona, United States
  • Melissa Haendel, University of Colorado Anschutz Medical Campus, United States
  • Ambroise Wonkam, McKusick-Nathans Institute and Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, United States
  • Nicola Mulder, Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, South Africa


Presentation Overview:

Sickle Cell Disease (SCD) is one of the world’s most common monogenic pathologies, with the majority of sufferers living in African countries where healthcare services are typically inadequate, leaving SCD management largely the responsibility of patients and their communities. The SCD Ontology (SCDO) is being used to standardise data collection across multiple research sites in Africa; however, SCDO terms are generally too technical and inaccessible to laypeople.
We adapted a workflow we had previously developed for creating the French SCDO into a novel workflow, which we used to produce the first English layperson SCDO. A subset of the SCDO layperson terms has already been used in a mobile health application prototype developed by the SickleInAfrica Consortium for SCD patients.
We aim to produce a French layperson SCDO and layperson versions of other future translations of the SCDO, to be used in making SCD-centred eHealth platforms more accessible to a broader audience. Furthermore, SCDO layperson terms can be used to facilitate the retrieval of information from layperson sources, potentially leading to the discovery of effective novel alternative therapies employed by SCD patients. Notably, our novel workflow can be reused by ontologists to produce layperson versions of their own ontologies.
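
For illustration only (the term IRI, label and synonym below are invented, not taken from the SCDO release), a layperson wording can be attached to an ontology class as a synonym annotation, following the oboInOwl convention also used by ontologies such as the HPO:

```python
# Hypothetical example of adding a layperson synonym annotation with rdflib.
from rdflib import Graph, Namespace, Literal, URIRef, RDFS

OIO = Namespace("http://www.geneontology.org/formats/oboInOwl#")
term = URIRef("http://example.org/scdo#VasoOcclusiveCrisis")  # made-up IRI

g = Graph()
g.add((term, RDFS.label, Literal("Vaso-occlusive crisis")))
g.add((term, OIO.hasExactSynonym, Literal("Sickle cell pain crisis")))  # layperson wording
print(g.serialize(format="turtle"))
```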

17:20-17:40
Cell Taxonomy: a curated repository of cell types with multifaceted characterization
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Zhang Zhang, Beijing Institute of Genomics Chinese Academy of Sciences (China National Center for Bioinformation), China
  • Shuai Jiang, Beijing Institute of Genomics Chinese Academy of Sciences (China National Center for Bioinformation), China


Presentation Overview:

Single-cell studies have delineated cellular diversity and uncovered increasing numbers of previously uncharacterized cell types in complex tissues. Thus, synthesizing growing knowledge of cellular characteristics is critical for dissecting cellular heterogeneity, developmental processes and tumorigenesis at single-cell resolution. Here, we present Cell Taxonomy (https://ngdc.cncb.ac.cn/celltaxonomy), a comprehensive and curated repository of cell types and associated cell markers encompassing a wide range of species, tissues and conditions. Combined with literature curation and data integration, the current version of Cell Taxonomy establishes a well-structured ontology for 3,143 cell types and houses a comprehensive collection of 26,613 associated cell markers in 257 conditions and 387 tissues across 34 species. Based on 4,299 publications and single-cell transcriptomic profiles of ∼3.5 million cells, Cell Taxonomy features multifaceted characterization for cell types and cell markers, involving quality assessment of cell markers and cell clusters, cross-species comparison, cell composition of tissues and cellular similarity based on markers. Taken together, Cell Taxonomy represents a fundamentally useful reference to systematically and accurately characterize cell types and thus lays an important foundation for deeply understanding and exploring cellular biology in diverse species.

17:40-18:00
Presenter Q&A
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Núria Queralt-Rosinach
Tuesday, July 25th
10:30-11:30
Invited Presentation: Ontology-based Interpretability for Large Predictive Models
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Janna Hastings, University of Zurich, Switzerland


Presentation Overview:

Bio-ontologies include consensus hierarchical classifications of entities in their domain and are a key resource for biomedical data-driven discovery research. While the potential uses of large-scale machine learning models for biomedical discoveries are growing, there are well-known challenges when such models operate as 'black boxes' that cannot be inspected: the model may learn spurious associations that lack generalisability, and the opacity may limit the utility of a prediction in discovery research, where novel findings need to be understood. Interpretability is provided for such models either extrinsically, by determining feature importance for predictions, or intrinsically, by inspecting internal model parameters such as Transformer attention weights. Ontologies can support natural explanations at different hierarchical levels, but they are not used by most current approaches that provide interpretability for machine learning models. Through case studies in metabolism and in physical activity, I will show how ontologies can be used to supplement and extend existing approaches that enable interpretability of large machine learning models.
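
A toy sketch of the general idea (not the presenter's method): per-feature importance scores can be rolled up a class hierarchy so an explanation can be read at a more general level. The hierarchy and scores below are invented.

```python
# Roll per-feature importance scores up a small, hand-written ontology-like hierarchy.
parent = {                      # child class -> parent class
    "glucose": "carbohydrate metabolism",
    "lactate": "carbohydrate metabolism",
    "alanine": "amino acid metabolism",
    "carbohydrate metabolism": "metabolism",
    "amino acid metabolism": "metabolism",
}
importance = {"glucose": 0.40, "lactate": 0.15, "alanine": 0.10}  # e.g. feature attributions

rolled_up = {}
for feature, score in importance.items():
    node = feature
    while node is not None:                 # propagate the score to every ancestor
        rolled_up[node] = rolled_up.get(node, 0.0) + score
        node = parent.get(node)

for cls, score in sorted(rolled_up.items(), key=lambda kv: -kv[1]):
    print(f"{cls}: {score:.2f}")
```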

11:50-12:10
Navigating the rare diseases landscape: a comprehensive approach to identify gene therapy targets based on cell type-phenotype associations
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Brian Schilder, Imperial College London, United Kingdom
  • Kitty Murphy, Imperial College London, United Kingdom
  • Bobby Gordon-Smith, Imperial College London, United Kingdom
  • Jai Chapman, Imperial College London, United Kingdom
  • Momoko Otani, Imperial College London, United Kingdom
  • Nathan Skene, Imperial College London, United Kingdom


Presentation Overview:

Rare diseases (RDs) are individually uncommon, but collectively they contribute to an enormous global disease burden. Yet we still do not understand the biological mechanisms through which most of these diseases act. We therefore utilised Human Phenotype Ontology gene annotations and single-cell transcriptomic atlases to identify the cell types underlying >6,000 phenotypes associated with >8,000 RDs. Our results both confirm well-known cell type-phenotype relationships and reveal previously unknown connections. We also demonstrate that the particular cell types underlying phenotypes (e.g. neonatal hypotonia, brachydactyly) predict differential clinical outcomes (age of death, severity) across diseases, opening avenues for mechanism-driven differential diagnosis in the clinic. Next, we identified candidate gene therapy targets based on phenotype severity, onset, and viral vector compatibility. Top candidates included respiratory failure (alveolar cells via CCNO), mental deterioration (neurons via APOE/CSTB), and coma (islet endocrine cells via INS/KCNJ11). Finally, we provide a user-friendly web app to enable clinicians, researchers, and patients to trace disease mechanisms down to the level of symptoms, cell types and genes. In summary, our findings have important implications for understanding disease biology at multi-scale resolution and for the development of gene therapies that treat patients in a more targeted, mechanism-driven manner.
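
A toy sketch of the core scoring idea (not the authors' pipeline): each cell type is scored for a phenotype by averaging the expression specificity of the phenotype's annotated genes in that cell type. All gene names and numbers below are invented.

```python
# Score cell types for a phenotype by mean expression specificity of its gene set.
specificity = {                  # gene -> {cell type: fraction of expression in that type}
    "GENE_A": {"neuron": 0.7, "alveolar cell": 0.1, "islet cell": 0.2},
    "GENE_B": {"neuron": 0.6, "alveolar cell": 0.2, "islet cell": 0.2},
    "GENE_C": {"neuron": 0.1, "alveolar cell": 0.8, "islet cell": 0.1},
}
phenotype_genes = {"Neonatal hypotonia": ["GENE_A", "GENE_B"]}  # toy HPO-style annotation

for phenotype, genes in phenotype_genes.items():
    cell_types = specificity[genes[0]].keys()
    scores = {ct: sum(specificity[g][ct] for g in genes) / len(genes) for ct in cell_types}
    best = max(scores, key=scores.get)
    print(phenotype, "->", best, round(scores[best], 2))
```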

12:10-12:30
COSI Closing Remarks and Awards
Room: Salle Rhone 3a
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Núria Queralt-Rosinach
13:50-14:10
Session: Joint Session with BOSC
The Research Software Ecosystem: an open software metadata commons
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Nomi Harris

  • Hans Ienasescu, Technical University of Denmark, Denmark
  • Salvador Capella-Gutiérrez, Barcelona Supercomputing Center (BSC), Spain
  • Frederik Coppens, Ghent University, Belgium
  • José Mª Fernández, Barcelona Supercomputing Center (BSC), Spain
  • Alban Gaignard, Institut du Thorax, University of Nantes, France
  • Carole Goble, The University of Manchester, United Kingdom
  • Bjoern Gruening, Uni-Freiburg, Germany
  • Johan Gustafsson, Australian Biocommons, Australia
  • Josep Ll Gelpi, Dept. Bioquimica i Biologia Molecular. Univ. Barcelona, Spain
  • Jennifer Harrow, ELIXIR, United Kingdom
  • Steven Manos, Australian BioCommons, Australia
  • Kota Miura, Bioimage Analysis & Research, Japan
  • Steffen Möller, Rostock University Medical Center, Germany
  • Stuart Owen, The University of Manchester, United Kingdom
  • Perrine Paul-Gilloteaux, Institut du Thorax, University of Nantes, France
  • Hedi Peterson, University of Tartu, Estonia
  • Manthos Pithoulias, ELIXIR Europe, United Kingdom
  • Jonathan Tedds, ELIXIR Europe, United Kingdom
  • Dmitri Repchevsky, Barcelona Supercomputing Center (BSC), Spain
  • Federico Zambelli, Department of Biosciences, University of Milan, Milano, Italy
  • Oleg Zharkov, University of Freiburg, Germany
  • Matúš Kalaš, Computational Biology Unit, Department of Informatics, University of Bergen, Norway
  • Herve Menager, Institut Pasteur, Université Paris Cité, France


Presentation Overview:

Research software is a critical component of computational research, and being able to discover, understand and adequately use it is essential. Many existing services facilitate these tasks, all of them relying heavily on software metadata. The continued upkeep of such large and complex sets of metadata comes at the cost of multiple curation efforts, and the resulting metadata are often sparse and inconsistent.
The Research Software Ecosystem (RSEc) aims to act as a proxy for maintaining and preserving high-quality metadata describing research software. These metadata are retrieved from, and synchronized with, many major software-related services, within and beyond the ELIXIR Tools Platform. The EDAM ontology enables the semantic description of the scientific function of the described software.
The RSEc central repository is a GitHub repository that aggregates software metadata, mostly related to the life sciences, spanning the multiple aspects of software discovery, evaluation, deployment and execution. Aggregating metadata in a centralized, open, and version-controlled repository enables the cross-linking of services, the validation and enrichment of software metadata, the development of new services, and the analysis of these metadata.
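
For illustration, the sketch below reads a simplified, hand-written metadata record loosely shaped like a bio.tools entry and extracts its EDAM operation annotations; it does not fetch anything from the RSEc repository, and the record's structure is an assumption made for the example.

```python
# Minimal, hypothetical example of extracting EDAM annotations from a software metadata record.
record = {
    "name": "ExampleAligner",
    "description": "Aligns example sequences.",
    "function": [
        {"operation": [{"uri": "http://edamontology.org/operation_0292",
                        "term": "Sequence alignment"}]}
    ],
}

# Pull out the EDAM operations that describe what the software does.
operations = [op["term"]
              for fn in record.get("function", [])
              for op in fn.get("operation", [])]
print(record["name"], "->", operations)
```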

14:10-14:30
Session: Joint Session with BOSC
The Linked data Modeling Language (LinkML): a general-purpose data modeling framework
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Nomi Harris

  • Sierra Moxon, LBNL, United States
  • Harold Solbrig, solbrig informatics, United States
  • Deepak Unni, Swiss Institute of Bioinformatics, Switzerland
  • Mark Miller, LBNL, United States
  • Patrick Kalita, LBNL, United States
  • Sujay Patil, LBNL, United States
  • Kevin Schaper, Anschutz Medical Campus, University of Colorado, United States
  • Tim Putman, Anschutz Medical Campus, University of Colorado, United States
  • Corey Cox, Anschutz Medical Campus, University of Colorado, United States
  • Harshad Hegde, LBNL, United States
  • J. Harry Caufield, LBNL, United States
  • Justin Reese, LBNL, United States
  • Melissa Haendel, Anschutz Medical Campus, University of Colorado, United States
  • Christopher J. Mungall, LBNL, United States


Presentation Overview:

The Linked data Modeling Language (https://linkml.io) is a data modeling framework that provides a flexible yet expressive standard for describing many kinds of data models, from value sets and flat, checklist-style standards to complex normalized data structures that use polymorphism and inheritance. It is purposefully designed so that software engineers and subject matter experts can communicate effectively in the same language, while also providing the semantic underpinnings that make data conforming to LinkML schemas easier to understand and reuse computationally. The LinkML framework includes tools to serialize data models in many formats, including but not limited to JSON Schema, OWL, SQL DDL, and Python Pydantic classes. It also includes tools to convert instance data between different model serializations (LinkML runtime), convert schemas from one framework to another (LinkML convert), validate data against a LinkML schema (LinkML validate), retrieve model metadata (LinkML schemaview), bootstrap a LinkML schema from another framework (LinkML schema automator), and auto-generate documentation and schema diagrams. LinkML is an open, extensible modeling framework that allows computers and people to work cooperatively, making it easy to model, validate, and distribute data that is reusable and interoperable.
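
A minimal sketch of working with a LinkML schema programmatically, assuming the linkml-runtime package is installed and that SchemaView accepts an inline YAML schema string (it also accepts a file path). The two-attribute schema is a made-up example.

```python
# Load a tiny, made-up LinkML schema and introspect it with SchemaView.
from linkml_runtime import SchemaView

schema_yaml = """
id: https://example.org/person-schema
name: person-schema
prefixes:
  linkml: https://w3id.org/linkml/
imports:
  - linkml:types
default_range: string
classes:
  Person:
    attributes:
      id:
        identifier: true
      age:
        range: integer
"""

sv = SchemaView(schema_yaml)
print(list(sv.all_classes()))      # expected: ['Person']
print(sv.class_slots("Person"))    # expected: ['id', 'age']
```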

14:30-14:50
Session: Joint Session with BOSC
KG-Hub: a framework to facilitate discovery using biological and biomedical knowledge graphs
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Nomi Harris

  • J. Harry Caufield, Lawrence Berkeley National Laboratory, United States
  • Harshad Hegde, Lawrence Berkeley National Laboratory, United States
  • Sierra Moxon, Lawrence Berkeley National Laboratory, United States
  • Marcin Joachimiak, Lawrence Berkeley National Laboratory, United States
  • Chris Mungall, Lawrence Berkeley National Laboratory, United States
  • Justin Reese, Lawrence Berkeley National Laboratory, United States


Presentation Overview:

Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and extracting new knowledge using techniques such as graph machine learning, and they have been successful in biomedicine and many other domains. However, a framework for the FAIR construction and exchange of KGs has been absent, resulting in redundant effort, a lack of KG reuse, and insufficient reproducibility. KG-Hub is an open-source framework created to address these challenges by standardizing and facilitating KG assembly. KG-Hub enables consistent extract-transform-load (ETL) processing, ensures compliance with the Biolink Model (a data model for standardizing biological concepts and relationships), and produces versioned and automatically updated builds with stable URLs for graph data and other artifacts. The resulting graphs are easily integrated with any OBO (Open Biological and Biomedical Ontologies) ontology. KG-Hub also includes web-browsable storage of KG artifacts on cloud infrastructure, easy reuse of transformed subgraphs across projects, automated graph machine learning on KGs using a YAML-based framework, and a visualization dashboard and manifest file for quickly assessing KG contents.
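
For illustration (the file names and minimal column set are assumptions), graphs of this kind are typically exchanged as node and edge tables, which can be loaded into a graph library such as networkx:

```python
# Sketch only: load a node table and an edge table (tab-separated) into a directed multigraph.
import pandas as pd
import networkx as nx

nodes = pd.read_csv("example_nodes.tsv", sep="\t")   # e.g. columns: id, category, name
edges = pd.read_csv("example_edges.tsv", sep="\t")   # e.g. columns: subject, predicate, object

g = nx.MultiDiGraph()
for row in nodes.itertuples(index=False):
    g.add_node(row.id, category=row.category, name=row.name)
for row in edges.itertuples(index=False):
    g.add_edge(row.subject, row.object, predicate=row.predicate)

print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```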

14:50-14:55
Session: Joint Session with BOSC
The SPHN Semantic Interoperability Framework: From clinical routine data to FAIR research data
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Vasundra Touré, Swiss Institute of Bioinformatics SIB, Switzerland
  • Deepak Unni, Swiss Institute of Bioinformatics SIB, Switzerland
  • Sabine Österle, Swiss Institute of Bioinformatics SIB, Switzerland
  • Katrin Crameri, Swiss Institute of Bioinformatics SIB, Switzerland


Presentation Overview:

The Swiss Personalized Health Network (SPHN) is an initiative funded by the Swiss government for building a nationwide infrastructure for sharing clinical and health-related data in a secure and FAIR (Findable, Accessible, Interoperable, Reusable) manner. One goal is to ensure that data coming from different sources is interoperable between stakeholders. The priority was to develop a purpose-independent description of existing knowledge rather than relying on existing data models which are focused on specific use cases.

Together with partners at the University Hospitals, we have developed the SPHN Semantic Interoperability Framework, which encompasses:
- semantic definitions for data standardization
- data format specifications for data exchange
- software tools to support data providers and users
- training to facilitate knowledge sharing with stakeholders

Well-defined concepts connected to machine-readable semantic standards (e.g., SNOMED CT and LOINC) function as reusable, universal building blocks that can be connected with each other to represent information. By adopting semantic web technologies, we have built a specific schema that encodes the semantic concepts with given rules and conventions.

This framework is implemented in all Swiss university hospitals and forms the basis for future data-driven research projects with clinical and other health-related data.
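
As a toy illustration of the building-block idea (hypothetical namespace and property names, not the actual SPHN RDF schema), the sketch below expresses a single data point as an RDF instance whose code points to a SNOMED CT concept.

```python
# Hypothetical sketch of a terminology-coded data point in RDF; not the SPHN schema itself.
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/sphn-demo#")
SCT = Namespace("http://snomed.info/id/")

g = Graph()
g.add((EX.bp1, RDF.type, EX.BloodPressureMeasurement))
g.add((EX.bp1, EX.hasCode, SCT["75367002"]))   # SNOMED CT: blood pressure (observable entity)
g.add((EX.bp1, EX.hasValue, Literal(120)))
print(g.serialize(format="turtle"))
```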

14:55-15:00
Session: Joint Session with BOSC
OMEinfo: global geographic metadata for -omics experiments
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Matthew Crown, Northumbria University, United Kingdom
  • Matthew Bashton, Northumbria University, United Kingdom


Presentation Overview:

Microbiome classification studies increasingly associate geographical features such as rurality and climate with microbiomes. However, microbiologists and bioinformaticians often struggle to access and integrate rich geographical metadata from sources such as GeoTIFFs, and inconsistent definitions of, for example, rurality can hinder cross-study comparisons. To address this, we present OMEinfo, a Python-based tool for automated retrieval of consistent geographical metadata from user-provided location data. OMEinfo leverages open data sources such as the Global Human Settlement Layer, Köppen-Geiger climate classification models and the Open-Data Inventory for Anthropogenic Carbon dioxide to ensure metadata accuracy and provenance.

OMEinfo's Dash application enables users to visualise their sample metadata on an interactive map and to investigate the spatial distribution of metadata features; this is complemented by numerical data visualisations for analysing patterns and trends in the geographical data before further analysis. The tool is available as a Docker container, providing a portable, lightweight solution for researchers. Through its standardised metadata retrieval approach and incorporation of FAIR and open data principles, OMEinfo promotes reproducibility and consistency in microbiome metadata. As the field continues to explore the relationship between microbiomes and geographical features, tools like OMEinfo will prove vital in developing a robust, accurate, and interconnected understanding of these interactions.
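
Not OMEinfo itself, but a sketch of the underlying operation: reading the value of a geographic raster (GeoTIFF) at each sample's coordinates, here with rasterio. The file name and coordinates are placeholders, and the raster is assumed to use WGS84 longitude/latitude.

```python
# Sample a raster value (e.g. a settlement or climate layer) at each sample location.
import rasterio

samples = {"sample_01": (4.84, 45.76), "sample_02": (18.42, -33.92)}  # (lon, lat) placeholders

with rasterio.open("settlement_layer.tif") as raster:   # placeholder file name
    coords = list(samples.values())
    for name, value in zip(samples, raster.sample(coords)):
        print(name, "->", value[0])   # one element per raster band
```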

15:00-15:05
Session: Joint Session with BOSC
FAIR-BioRS: Actionable guidelines for making biomedical research software FAIR
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Bhavesh Patel, FAIR Data Innovations Hub, California Medical Innovations Institute, United States
  • Hervé Ménager, Institut Pasteur, Université Paris Cité, France
  • Sanjay Soundarajan, FAIR Data Innovations Hub, California Medical Innovations Institute, United States


Presentation Overview:

We present the first actionable guidelines for making biomedical research software Findable, Accessible, Interoperable, and Reusable (FAIR), in line with the FAIR Principles for Research Software (FAIR4RS principles). The FAIR4RS principles are the outcome of a large-scale global initiative to adapt the FAIR data principles to research software. They provide a framework for optimizing the reusability of research software and encourage open science. The FAIR4RS principles are, however, aspirational: practical guidelines that biomedical researchers can easily follow to make their research software FAIR have been lacking. To fill this gap, we established the first minimal and actionable guidelines that researchers can follow to easily make their biomedical research software FAIR. We designate these guidelines as the FAIR Biomedical Research Software (FAIR-BioRS) guidelines. They provide actionable step-by-step instructions that clearly specify the relevant standards, best practices, metadata, and sharing platforms to use. We believe that the FAIR-BioRS guidelines will empower and encourage biomedical researchers to adopt FAIR and open practices for their research software. We present here our approach to establishing these guidelines, summarize their major evolution through community feedback since the first version was presented at BOSC 2022, and explain how the community can benefit from and contribute to them.
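
As one concrete example of the kind of step such guidelines involve (illustrative field values only, not the full FAIR-BioRS checklist), standard machine-readable software metadata such as a CodeMeta file can be generated programmatically:

```python
# Write a minimal CodeMeta metadata file; the field values are placeholders.
import json

codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "example-tool",
    "description": "Example biomedical research software.",
    "license": "https://spdx.org/licenses/MIT",
    "codeRepository": "https://example.org/example-tool",
    "programmingLanguage": "Python",
}

with open("codemeta.json", "w") as fh:
    json.dump(codemeta, fh, indent=2)
```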

15:05-15:25
Session: Joint Session with BOSC
BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Nomi Harris

  • Jackson Callaghan, Scripps Research, United States
  • Colleen Xu, Scripps Research, United States
  • Jiwen Xin, Scripps Research, United States
  • Marco Cano, Scripps Research, United States
  • Eric Zhou, Scripps Research, United States
  • Rohan Juneja, Scripps Research, United States
  • Yao Yao, Scripps Research, United States
  • Madhumita Narayan, Scripps Research, United States
  • Kristina Hanspers, Gladstone Institutes, United States
  • Ayushi Agrawal, Gladstone Institutes, United States
  • Alexander Pico, Gladstone Institutes, United States
  • Chunlei Wu, Scripps Research, United States
  • Andrew Su, Scripps Research, United States


Presentation Overview:

Knowledge graphs are an increasingly common data structure for representing biomedical information. They can easily represent heterogeneous types of information, and many algorithms and tools exist for operating on them. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, such graphs are constructed as a single structural entity by centralizing and integrating data from multiple disparate sources. We present BioThings Explorer, an application that can query a virtual, federated knowledge graph representing the aggregated information of many disparate biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs of each resource and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. More information can be found at https://explorer.biothings.io, and code is available at https://github.com/biothings/biothings_explorer.
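
A toy illustration of the federation idea (mock local functions rather than real web services, and not the BioThings Explorer codebase): a registry maps input/output types to callables, and a multi-hop query is answered by chaining registered calls.

```python
# Chain mock "API" calls through a registry keyed by (input type, output type).
def gene_to_pathway(gene):           # stand-in for one annotated web service
    return {"TP53": ["p53 signaling"]}.get(gene, [])

def pathway_to_chemical(pathway):    # stand-in for another web service
    return {"p53 signaling": ["nutlin-3"]}.get(pathway, [])

registry = {
    ("Gene", "Pathway"): gene_to_pathway,
    ("Pathway", "ChemicalEntity"): pathway_to_chemical,
}

def query(start, hops):
    """Follow a list of (input type, output type) hops from a starting entity."""
    frontier = [start]
    for hop in hops:
        frontier = [result for entity in frontier for result in registry[hop](entity)]
    return frontier

print(query("TP53", [("Gene", "Pathway"), ("Pathway", "ChemicalEntity")]))
```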

15:25-15:30
Session: Joint Session with BOSC
Open Time for Questions
Room: Salle Rhone 3b
Format: Live from venue

Moderator(s): Núria Queralt Rosinach

  • Nomi Harris