SEMA: a semantic literature annotator

Alex Garcia1, John Cleary2, Mark A. Ragan, Yi-Ping Phoebe Chen
1a.Garcia@imb.uq.edu.au, Institute for Molecular Bioscience; 2jcleary@reeltwo.com, Reel Two

SEMA is a systematic literature annotator for molecular biology databases. We use a machine-learning algorithm implemented in GO-KDS (Gene Ontology Knowledge Discovery System, Reel Two) to complement the literature citation fields of each SwissProt entry. GO-KDS links MEDLINE abstracts to relevant GO terms and so finds abstracts relevant to a given protein. SEMA organizes this newly retrieved information and builds a navigable conceptual map, presented to the user as a flat or hyperbolic tree. From this map the user can redefine queries over the same database or over other information sources. The abstracts give context to the set of possible new queries and, together with the GO classification, provide a framework that guides the search.
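The organizing step described above can be illustrated with a minimal sketch. It is not the SEMA or GO-KDS implementation: the term hierarchy, the abstract identifiers, and the function names build_map and render_flat are all hypothetical placeholders, and the classifier output is simply hard-coded. The sketch only shows how abstracts assigned to ontology terms might be grouped and rendered as a flat (indented) tree.

```python
from collections import defaultdict

# Toy is-a links (child -> parent). Illustrative only, not real GO edges.
PARENT = {
    "binding": None,
    "catalytic activity": None,
    "nucleic acid binding": "binding",
    "DNA binding": "nucleic acid binding",
}

# Hypothetical classifier output: abstract id -> terms judged relevant.
# In the system described, this role is played by GO-KDS.
ABSTRACT_TERMS = {
    "abstract-1": ["DNA binding"],
    "abstract-2": ["binding", "catalytic activity"],
    "abstract-3": ["DNA binding", "nucleic acid binding"],
}


def build_map(abstract_terms, parent):
    """Index child terms by parent and abstracts by term."""
    children = defaultdict(list)
    abstracts = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)
    for abstract_id, terms in abstract_terms.items():
        for term in terms:
            abstracts[term].append(abstract_id)
    return children, abstracts


def render_flat(children, abstracts, node=None, depth=0):
    """Return the map as indented lines, one term per line."""
    lines = []
    for term in sorted(children.get(node, [])):
        ids = ", ".join(sorted(abstracts.get(term, [])))
        lines.append("  " * depth + f"{term} [{ids}]")
        lines.extend(render_flat(children, abstracts, term, depth + 1))
    return lines


if __name__ == "__main__":
    children, abstracts = build_map(ABSTRACT_TERMS, PARENT)
    print("\n".join(render_flat(children, abstracts)))
```

Each line of the rendered tree pairs a term with the abstracts attached to it, so a user could pick a node and use its term and abstracts as the starting point for a new query.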