RO-Crate: Capturing FAIR research outputs in bioinformatics and beyond
Confirmed Presenter: Phil Reed, The University of Manchester, United Kingdom
Room: 03A
Format: In person
Moderator(s): Karsten Hokamp
Authors List:
- Eli Chadwick, The University of Manchester, United Kingdom
- Stian Soiland-Reyes, The University of Manchester, United Kingdom
- Phil Reed, The University of Manchester, United Kingdom
- Claus Weiland, Leibniz Institute for Biodiversity and Earth System Research, Germany
- Dag Endresen, University of Oslo, Norway
- Felix Shaw, Earlam Institute, United Kingdom
- Timo Mühlhaus, RPTU Kaiserslautern-Landau, Germany
- Carole Goble, The University of Manchester, United Kingdom
Presentation Overview:
RO-Crate is a mechanism for packaging research outputs with structured metadata, providing machine-readability and reproducibility following the FAIR principles. It enables interlinking methods, data, and outputs with the outcomes of a project or a piece of work, even where distributed across repositories.
Researchers can distribute their work as an RO-Crate to ensure their data travels with its metadata, so that key components are correctly tracked, archived, and attributed. Data stewards and infrastructure providers can integrate RO-Crate into the projects and platforms they support, to make it easier for researchers to create and consume RO-Crates without requiring technical expertise.
Community-developed extensions called “profiles” allow the creation of tailored RO-Crates that serve the needs of a particular domain or data format.
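As a rough illustration of what "structured metadata" means here, the sketch below builds a minimal `ro-crate-metadata.json` document using only the Python standard library. It assumes the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset in a JSON-LD `@graph`). The function name, file paths, and example values are hypothetical and are not taken from the presentation.

```python
import json


def build_ro_crate_metadata(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate 1.1 metadata document as a dict.

    `files` is a list of (relative_path, human_readable_name) pairs.
    All argument values are illustrative placeholders.
    """
    graph = [
        # Metadata descriptor: identifies this file and points at the root dataset.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset: describes the crate as a whole and lists its parts.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "hasPart": [{"@id": path} for path, _ in files],
        },
    ]
    # One File entity per packaged file, so the data travels with its metadata.
    for path, label in files:
        graph.append({"@id": path, "@type": "File", "name": label})
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


crate = build_ro_crate_metadata(
    name="Example analysis",
    description="Hypothetical crate packaging one input and one result file.",
    date_published="2025-01-01",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=[("data/input.csv", "Input table"), ("results/summary.txt", "Summary")],
)
print(json.dumps(crate, indent=2))
```

Writing the printed JSON to a file named `ro-crate-metadata.json` alongside the listed files would give a directory laid out as a basic crate. Real crates usually carry richer metadata (authors, provenance, profile conformance) than this sketch shows.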
Current uses of RO-Crate in bioinformatics include:
∙ Describing and sharing computational workflows registered with WorkflowHub
∙ Creating FAIR exports of workflow executions from workflow engines and biodiversity digital twin simulations
∙ Enabling an appropriate level of credit and attribution, particularly in currently under-recognised roles (e.g. sample gathering, processing, sample distribution)
∙ Capturing plant science experiments as Annotated Research Contexts (ARC), complex objects which include workflows, workflow executions, inputs, and results
∙ Defining metadata conventions for biodiversity genomics
This presentation will outline the RO-Crate project and highlight its most prominent applications within bioinformatics, with the aim of increasing awareness and sparking new conversations and collaborations within the BOSC community.
PheBee: A Graph-Based System for Scalable, Traceable, and Semantically Aware Phenotyping
Confirmed Presenter: David Gordon, Office of Data Sciences at Nationwide Children's Hospital, United States
Room: 03A
Format: In person
Moderator(s): Karsten Hokamp
Authors List:
- David Gordon, Office of Data Sciences at Nationwide Children's Hospital, United States
- Max Homilius, Office of Data Sciences at Nationwide Children's Hospital, United States
- Austin Antoniou, Office of Data Sciences at Nationwide Children's Hospital, United States
- Connor Grannis, Office of Data Sciences at Nationwide Children's Hospital, United States
- Grant Lammi, Office of Data Sciences at Nationwide Children's Hospital, United States
- Adam Herman, Office of Data Sciences at Nationwide Children's Hospital, United States
- Ashley Kubatko, Office of Data Sciences at Nationwide Children's Hospital, United States
- Peter White, Office of Data Sciences at Nationwide Children's Hospital, United States
Presentation Overview:
The association of phenotypes and disease diagnoses is a cornerstone of clinical care and biomedical research. Considerable work has gone into standardizing these concepts in ontologies like the Human Phenotype Ontology and Mondo, and in developing interoperability standards such as Phenopackets. However, managing subject-term associations in a traceable and scalable way that enables semantic queries and bridges clinical and research efforts remains a significant challenge.
PheBee is an open-source tool designed to address this challenge by using a graph-based approach to organize and explore data. It allows users to perform powerful, meaning-based searches and supports standardized data exchange through Phenopackets. The system is easy to deploy and share thanks to reproducible setup templates.
The graph model underlying PheBee captures subject-term associations along with their provenance and modifiers. Queries leverage ontology structure to traverse semantic term relationships. Terms can be linked at the patient, encounter, or note level, supporting temporal and contextual pattern analysis. PheBee accommodates both manually assigned and computationally derived phenotypes, enabling use across diverse pipelines. When integrated downstream of natural language processing pipelines, PheBee maintains traceability from extracted terms to the original clinical text, enabling high-throughput, auditable term capture.
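The idea of querying through ontology structure can be sketched in a few lines. The example below is not PheBee's implementation or API. It is a toy in-memory model that assumes a hypothetical is-a hierarchy with placeholder term identifiers (`EX:...`), a hypothetical `Association` record carrying provenance and one modifier, and a query that returns subjects linked to a term or to any of its descendants.

```python
from dataclasses import dataclass

# Hypothetical is-a hierarchy (child term -> parent terms); placeholder IDs,
# not real ontology identifiers.
PARENTS = {
    "EX:focal_seizure": ["EX:seizure"],
    "EX:generalized_seizure": ["EX:seizure"],
    "EX:seizure": ["EX:neuro_abnormality"],
    "EX:neuro_abnormality": [],
}


@dataclass(frozen=True)
class Association:
    subject: str            # subject (patient) identifier
    term: str               # phenotype term linked to the subject
    source: str             # provenance, e.g. a note or encounter reference
    excluded: bool = False  # modifier: the phenotype was explicitly ruled out


def ancestors_or_self(term):
    """Return the term together with all of its ancestors in PARENTS."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def subjects_with(term, associations):
    """Subjects with a non-excluded link to `term` or to one of its descendants."""
    return sorted(
        {
            a.subject
            for a in associations
            if not a.excluded and term in ancestors_or_self(a.term)
        }
    )


associations = [
    Association("subj-1", "EX:focal_seizure", source="note-17"),
    Association("subj-2", "EX:generalized_seizure", source="note-42", excluded=True),
    Association("subj-3", "EX:neuro_abnormality", source="encounter-9"),
]

print(subjects_with("EX:seizure", associations))
print(subjects_with("EX:neuro_abnormality", associations))
```

Because each association keeps its `source`, a match can be traced back to the note or encounter it came from, which is the kind of traceability the abstract describes. A production system would store this in a graph database rather than in Python dictionaries.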
PheBee is currently being piloted in internal translational research projects supporting phenotype-driven pediatric care. Its graph foundation also enables future feature development, such as natural language querying using retrieval-augmented generation, or genomic data integration to identify subjects with variants in phenotypically relevant genes.
PheBee advances open science in biomedical research and clinical support by promoting structured, traceable phenotype data.
The role of the Ontology Development Kit in supporting ontology compliance in adverse legal landscapes
Confirmed Presenter: Damien Goutte-Gattat, University of Cambridge, United Kingdom
Room: 03A
Format: In person
Moderator(s): Karsten Hokamp
Authors List:
- Damien Goutte-Gattat, University of Cambridge, United Kingdom
Presentation Overview:
Ontologies, like code, are a form of speech. As such, they can be subject to laws and other regulations that attempt to control how freedom of speech is exercised, and ontology editors may find themselves legally compelled to introduce changes in their ontologies for the sole purpose of complying with the laws that apply to them.
Therefore, developers of tools used for ontology editing and maintenance need to ponder whether their tools should provide features to facilitate the introduction of such legally mandated changes, and how.
As developers of the Ontology Development Kit (ODK), one of the main tools used to maintain ontologies of the OBO Foundry, we will consider both the moral and technical aspects of allowing ODK users to comply with arbitrary legal restrictions. The overall approach we are envisioning, in order to contain the impact of such restrictions to the jurisdictions that mandate them, is a “split world” system, where the ODK would facilitate the production of slightly different editions of the same ontology.