Databases that encode associative relationships between biomedical entities serve as background knowledge that is leveraged for a range of purposes. For example, disease-phenotype associations are used for differential diagnosis and variant prediction, while gene-function associations are used in gene set enrichment analyses.
In the ontology world, these associative knowledgebases lie somewhere between the conceptualisation and instance spaces, defining foundational knowledge that is often probabilistic, associative, or uncertain, rather than axiomatic. They are formed through some combination of manual curation from expert knowledge, experimental data, and analysis of co-occurrence in literature text. Because of this aetiology, existing databases represent a particular perspective on biomedical knowledge, one that differs from the perspectives that might be derived from analysis of other sources, such as clinical data, public discussion, or alternative modularisations of literature text.
We will explore the similarities and differences between associative knowledgebases derived from these contexts, including methodological concerns, hypothesis generation, characterisation, and implications for downstream applications.