Biomedicine and informatics publications, together with their content contributors and
publishers, play a critical role in furthering the FAIR Principles of Findability, Accessibility, Interoperability, and
Reusability. In this session, our speakers will explore how partnerships in this setting (e.g., between publishers and
content contributors) can best support FAIR; the current and future publishing models for enabling FAIR data and
other digital objects; the implementation challenges and specific barriers, both technical and societal in nature; and
the opportunities and major efforts to further FAIR science through infrastructure and policy initiatives. Key players
in the publishing realm and the NIH will share their expertise and experiences, followed by a lively panel discussion
highlighting the current and future state of FAIR science. The panel discussion will offer opportunities for a
community conversation on topics related to data, tool, and model sharing initiatives by journals; the response of
scientific communities to these efforts; standards, documentation, licensing, and appropriate credit to data creators;
as well as threats to FAIR becoming commonplace.
10:15 AM-10:20 AM
Introduction to the BD2K Special Sessions
Room: Osaka / Samarkand (3rd Floor)
Susan Gregurick, NIH/NIGMS, United States
10:20 AM-10:40 AM
NIH’s Strategic Vision for Data Science: Enabling a FAIR-Data Ecosystem
Room: Osaka / Samarkand (3rd Floor)
Susan Gregurick, NIH/NIGMS, United States
10:40 AM-11:00 AM
The University – Publisher Partnership as it Relates to FAIR
Room: Osaka / Samarkand (3rd Floor)
Phil Bourne, University of Virginia, PLoS, United States
11:00 AM-11:20 AM
Encouraging Rigor and Reproducibility through FAIRness in JAMIA Open
Room: Osaka / Samarkand (3rd Floor)
Neil Sarkar, Brown University, JAMIA Open, United States
11:20 AM-11:40 AM
Nature Research Publications and Springer Nature: Supporting FAIR
Room: Osaka / Samarkand (3rd Floor)
Grace Baynes, VP of Research Data & New Product Development, Springer Nature, United Kingdom
• What resources do you know of that are currently good models for implementing and enabling FAIR?
o How is the scientific community using these tools? Is uptake a problem?
• What is the role of funding agencies and major institutions in advancing FAIR?
o What should be the relationship between funding agencies, major institutions, and publishers in
advancing FAIR?
• Are there ways to evaluate or quantify the FAIRness of datasets and other shared digital objects (e.g., via
FAIRShake)?
• What are the current barriers to making different components of disseminated scientific research FAIR?
Which of these issues are technical and require infrastructure vs. social/policy changes?
o While nearly everyone endorses FAIR in principle, what threats do you see to FAIR becoming common practice?
o Is there a need for standards across the growing number of platforms for data/code/tool sharing?
§ What minimum documentation and meta-information should be captured? (An illustrative sketch of a
minimal metadata record follows this question list.)
o How can different communities foster the required ethos to make sure FAIR happens?
• Do you know of any major efforts by other groups to provide infrastructure solutions for FAIR? What
about overarching policy moves to encourage authors, developers, and scientists to follow FAIR?
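As a concrete reference point for the "minimum documentation" question above, the sketch below shows what a minimal machine-readable metadata record can look like, assuming Python and the widely used schema.org Dataset vocabulary serialized as JSON-LD. Every field value (dataset name, DOI, creator, URLs) is a hypothetical placeholder, not a real resource.

```python
import json

# A minimal sketch of machine-readable dataset metadata using the
# schema.org "Dataset" vocabulary serialized as JSON-LD, one common
# convention for FAIR findability. All values below are hypothetical.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example clinical text corpus",            # hypothetical
    "description": "De-identified case reports collected for text mining.",
    "identifier": "https://doi.org/10.0000/example",   # hypothetical DOI
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": [{"@type": "Person", "name": "A. Researcher"}],  # hypothetical
    "version": "1.0",
    "keywords": ["clinical case reports", "text mining", "FAIR"],
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.org/corpus.csv",  # hypothetical
    },
}

# Serialize for embedding in a landing page or repository record.
print(json.dumps(record, indent=2))
```

This record captures the minimum elements the question alludes to: a persistent identifier, a license, creator credit, versioning, and an access route, each of which maps onto one or more of the F, A, I, and R criteria.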
Session 2: "Demystifying FAIR Science: Examples, Tools and Use Cases"
Room: Osaka / Samarkand (3rd Floor)
Ishwar Chandramouliswaran, NIH/NIAID, United States
FAIR Principles entail Findability, Accessibility, Interoperability, and Reusability of data,
metadata, tools/software, and other digital objects. In this session and panel discussion, we aim to illuminate the
landscape of FAIR science by highlighting the specific resources and infrastructure required for implementation,
including systems for storing and sharing scientific resources, approaches for evaluating and quantifying the
FAIRness of datasets and other digital objects, and tools, software, and infrastructure from major institutions and
funding agencies that enable FAIR science. The panel discussion will offer opportunities for a community
conversation on topics related to existing resources for implementing, enabling, and evaluating FAIR; scientific
communities’ use and uptake of these tools; the role of funding agencies and major institutions in advancing FAIR;
current barriers to making different components of disseminated scientific research FAIR; and the ways in which
different communities can foster the ethos required to ensure the expansion and adoption of the FAIR Principles.
2:00 PM-2:20 PM
Enabling FAIR at NIH: Figshare and Beyond
Room: Osaka / Samarkand (3rd Floor)
Jennie Larkin, NIH/NIDDK, United States
2:20 PM-2:40 PM
FAIR Data with Impact
Room: Osaka / Samarkand (3rd Floor)
Henning Hermjakob, EMBL-EBI, United Kingdom
2:40 PM-3:00 PM
Establishing Commons to Foster FAIR Research Data
Room: Osaka / Samarkand (3rd Floor)
Ian Fore, NIH/NCI, United States
3:00 PM-3:20 PM
Building Metadata to Render Clinical Case Reports FAIR
Room: Osaka / Samarkand (3rd Floor)
Harry Caufield, University of California, Los Angeles, United States
• What resources do you know of that are currently good models for implementing and enabling FAIR?
o What are the data/tool/model sharing initiatives by journals?
o What is the scientific community doing, and how has it responded to these journal and institutional
initiatives?
• What are the current barriers to making different components of disseminated scientific research FAIR?
Which of these issues are technical and require infrastructure vs. social/policy changes?
o While nearly everyone endorses FAIR in principle, what threats do you see to FAIR becoming common practice?
o Is there a need for standards across the growing number of platforms for data/code/tool sharing?
§ What minimum documentation and meta-information should be captured?
o How can different communities foster the required ethos to make sure FAIR happens?
• Do you know of any major efforts by other groups to provide infrastructure solutions for FAIR? What
about overarching policy moves to encourage authors, developers, and scientists to follow FAIR?
• What do you think of the role of preprints and public/pre-submission comments in biomedical research, and of their relation to FAIR?
• How can we ensure that FAIR standards include appropriate credit to data creators?
o What efforts exist to retain credit and recognition for data creators, and how should those be reflected when data, tools, or other digital objects are updated or modified?
o Is there anything we can learn from licensing systems, such as the Creative Commons?
o What should be in the public domain?
Session 3: "New Frontiers in Information Extraction and Knowledge Discovery"
Room: Osaka / Samarkand (3rd Floor)
A rapidly growing body of diverse biomedical text embodies our expanding
knowledge of biological phenomena, including human health and disease. Pioneering text mining tools that
accurately identify concepts, characterize biological processes and clinical events, and define novel relationships
among them hold great promise for biomedical discovery and precision health. At the same time, interpreting and
understanding biomedical text datasets requires specific biomedical domain expertise. We have assembled a
session to highlight current progress in both technology development and its applications to biomedical
investigation. This endeavor presents both opportunities and challenges. Extracting valuable information demands
considerable technical effort as well as domain knowledge to customize methods, train models, and validate
results. Furthermore, the lack of a common representation for the outputs of different tools limits their compatibility
and interoperability. Our panel discussion is designed to brainstorm these technical issues and identify potential
solutions in biomedical text mining. We aim to foster a broad community conversation about best practices for
creating text mining tools, effective approaches to learning from massive text data, and the community
engagement efforts required to move the field forward.
4:40 PM-5:00 PM
Natural Language Processing: Bridging the Gap Between Human Intelligence and Machine Intelligence
Room: Osaka / Samarkand (3rd Floor)
Hongfang Liu, Mayo Clinic, United States
5:00 PM-5:20 PM
How User Intelligence is Improving PubMed
Room: Osaka / Samarkand (3rd Floor)
Zhiyong Lu, NIH/NLM/NCBI, United States
5:20 PM-5:40 PM
Design, Implementation, and Operation of a Rapid, Robust Named Entity Recognition Web Service
Room: Osaka / Samarkand (3rd Floor)
Lars Juhl Jensen, The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
• How can we ensure new methods are (sufficiently) generalizable?
• How do we create more synergy between the information extraction (IE) community and people who want
to apply IE? Computational challenge competitions help, but what else can be done?
• How do we avoid bias in the application of IE methods (and, more broadly, machine learning/artificial
intelligence approaches) to biomedicine? How do we avoid inadvertently under- or over-representing a
certain feature in our data? For example, methods may perform well on common diseases but poorly on
rare diseases, or even ignore them entirely. Or, if patient data are used as training data and the patients
are homogeneous (e.g., in ethnicity or age), the resulting methods may perform poorly on other patient
populations.
• In clinical environments, there is the concept of alarm fatigue, in which numerous, potentially false-
positive alerts become overwhelming. We may see a similar problem when extracting information from
biomedical data: users may have difficulty determining how useful or trustworthy extracted information is.
How might we address this problem? In other fields, statistical measures such as the false discovery rate
are commonly used, but they are less common in IE and text mining, potentially due to the lack of
sufficiently large labeled datasets. More broadly, the information retrieval notion of "relevance" may be
important, but for biomedical text, where the appropriate precision/recall trade-off depends on the task,
relevance alone may not be appropriate. Are there other metrics to consider? (A toy worked example of
these metrics follows this list.)
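As a concrete anchor for the metrics discussion above, the toy calculation below, assuming Python and entirely hypothetical entity names, shows how precision, recall, and false discovery rate are derived from a comparison of extracted entities with a labeled gold standard. It illustrates the definitions only; it is not material from any speaker.

```python
# Toy illustration: scoring hypothetical extracted entities against a
# gold-standard annotation set. All entity names are placeholders.
gold = {"BRCA1", "TP53", "EGFR", "KRAS"}     # annotated gold standard
extracted = {"BRCA1", "TP53", "MYC", "ALK"}  # hypothetical tool output

tp = len(extracted & gold)   # true positives  (BRCA1, TP53) -> 2
fp = len(extracted - gold)   # false positives (MYC, ALK)    -> 2
fn = len(gold - extracted)   # false negatives (EGFR, KRAS)  -> 2

precision = tp / (tp + fp)   # 2/4 = 0.50
recall = tp / (tp + fn)      # 2/4 = 0.50
fdr = fp / (tp + fp)         # 2/4 = 0.50; note FDR = 1 - precision

print(f"precision={precision:.2f}, recall={recall:.2f}, FDR={fdr:.2f}")
```

At this level the false discovery rate is simply the complement of precision; the open questions raised in the panel prompt are how to estimate such rates without large labeled corpora, and which precision/recall trade-off a given biomedical task actually warrants.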