16th Annual
International Conference
Intelligent Systems
for Molecular Biology

Metro Toronto Convention Centre (South Building)
Toronto, Canada

Accepted Posters
Category 'S' - 'text mining'
Poster S01
PubMeth: a cancer methylation database, based on text-mining
Maté Ongenaert- UGent
Leander Van Neste (UGent, Molecular Biotechnology); Tim De Meyer (UGent, Molecular Biotechnology); Gerben Menschaert (UGent, Molecular Biotechnology); Sofie Bekaert (UGent, Molecular Biotechnology); Wim Van Criekinge (UGent, Molecular Biotechnology);
Short Abstract: None On File

Poster S02
Leveraging the structure of the Semantic Web to enhance information retrieval for proteomics
Andrew Smith- Yale University
Kei Cheung (Yale University, Center for Medical Informatics); Michael Krauthammer (Yale University, Department of Pathology); Martin Schultz (Yale University, Department of Computer Science); Mark Gerstein (Yale University, Department of Molecular Biophysics and Biochemistry);
Short Abstract: None On File

Poster S03
EB-eye: The EBI search engine
Franck Valentin- European Bioinformatics Institute
Mickael Goujon (European Bioinformatics Institute, External Services); Rodrigo Lopez (European Bioinformatics Institute, External Services);
Short Abstract: The EB-eye is an Apache Lucene-based search engine that provides unified access to the EBI databases. The system generates indices from a condensed but meaningful subset of the original data and returns summary information and links to the original data, as well as all EBI-specific database cross-references.

Poster S04
Extraction of Facts and Relationships Relevant to Molecular Mechanisms of Bacterial Pathogenesis through Natural Language Processing
David Pot- SRA International, Inc.
Sam Zaremba (SRA International, Inc., Global Health); Mila Ramos-Santacruz (SRA International, Inc., Global Health); Thomas Hampton (SRA International, Inc., Global Health); Panna Shetty (SRA International, Inc., Global Health); Joel Fedorko (SRA International, Inc., Global Health); Jon Whitmore (SRA International, Inc., Global Health); Nicole Perna (University of Wisconsin, Genome Center); Jeremy Glasner (University of Wisconsin, Genome Center); Guy Plunkett III (University of Wisconsin, Laboratory of Genetics); Matthew Shaker (SRA International, Inc., Global Health); John Greene (SRA International, Inc., Global Health);
Short Abstract: To help understand bacterial pathogenesis, the Enteropathogen Resource Integration Center (ERIC) Bioinformatics Resource Center offers its user community a proven text mining application that extracts Gene-Role, Mutation-Phenotype, and Organism-Pathogenesis relationships from PubMed abstracts. The application and search tools are available online.

Poster S05
Extracting protein complexes from biomedical literature
Wagied Davids- University of Toronto
No additional authors
Short Abstract: We have begun extracting protein complexes from the biomedical literature to provide an updated resource on protein complexes, covering not only yeast but also other species. Our online resource lets users query protein complexes and also provides annotation tools for extracting protein complexes from PubMed.

Poster S06
Using textual context for improving OCR performance in biomedical literature retrieval
Songhua Xu- Yale University
Martin Schultz (Yale University, Computer Science); Michael Krauthammer (Yale University, Pathology & Yale Center for Medical Informatics);
Short Abstract: Today’s information retrieval (IR) techniques are mostly text-based and fail when textual information is not easily accessible, as in biomedical images and figures. We propose to augment IR with optical character recognition (OCR) capabilities, and describe a context-based method for boosting OCR performance.

Poster S07
Surveying the Biomedical Literature Using Automatically Mined Gene-related Key Terms
Catalina Tudor- University of Delaware
K. Vijay-Shanker (University of Delaware, Computer Science); Carl Schmidt (University of Delaware, Animal and Food Sciences);
Short Abstract: We developed eGIFT, a system that helps annotators quickly find articles describing gene functions and helps scientists surveying the results of high-throughput experiments quickly extract information relevant to their hits. eGIFT users can learn about a gene by consulting a list of relevant key terms automatically mined from text.

Poster S08
A literature-based dissimilarity measure to explore genome-wide gene relatedness and pathways
Zuoshuang Xiang- University of Michigan
Zhaohui Qin (University of Michigan, Biostatistics); Yongqun He (University of Michigan, Unit for Laboratory Animal Medicine, and Microbiology and Immunology);
Short Abstract: A MeSH-based dissimilarity score (MeSHdisc) is developed to assess the relatedness between two genes based on the frequency of MeSH terms in the literature that refers to each gene. Studies based on Brucella and E. coli genes demonstrate that MeSHdisc can reveal gene relatedness and pathways among bacterial genomes.

Poster S09
PubCurator - a text analysis platform.
Kai Schlamp- Johannes Gutenberg University Mainz
Markus Krupp (Johannes Gutenberg University Mainz, Department of Medicine I); Peter R. Galle (Johannes Gutenberg University Mainz, Department of Medicine I); Andreas Teufel (Johannes Gutenberg University Mainz, Department of Medicine I);
Short Abstract: PubCurator is a biomedical text analysis platform providing extensive support for text mining, especially of the NCBI databases. Text analysis may be performed in manual or automatic mode with a full-featured graphical frontend built upon the Eclipse RCP. The application is freely available from our website.

Poster S10
Mining Protein Interactions from Text using Convolution Kernels
Ramanathan Narayanan- Northwestern University
Alok Choudhary (Northwestern University, EECS); Simon Lin (Northwestern University, Northwestern Medical School); Sanchit Misra (Northwestern University, EECS);
Short Abstract: We examine the problem of identifying protein-protein interactions in biomedical literature databases by combining NLP and text mining techniques. We propose a hierarchical framework to reduce the search space and introduce convolution kernels in Support Vector Machines to identify these interactions accurately.

Poster S11
Entropy and enrichment-based approaches for annotating protein clusters using literature
Shirley Wu- Stanford University
Russ Altman (Stanford University, Bioengineering);
Short Abstract: Clustering algorithms produce groups of proteins that are similar in ways that may not be immediately apparent. We show that computational, literature-based approaches focused on term entropy and enrichment can derive comprehensive and informative terms describing clusters of proteins.

Poster S12
Visualizing evolution and impact of biomedical fields
Murat Cokol- Harvard Medical School
Raul Rodriguez-Esteban (Columbia University);
Short Abstract: We describe a tool for visualization of more than 200,000 biomedical scientific trends. The method captures variations in scientific impact over time to allow comparison of the relative significance and evolution of fields, similar to a financial market scorecard.

Poster S13
PubMeth: reviewed methylation database in cancer based on text-mining
Maté Ongenaert- Ghent University
Leander Van Neste (Ghent University, Molecular Biotechnology); Tim De Meyer (Ghent University, Molecular Biotechnology); Gerben Menschaert (Ghent University, Molecular Biotechnology); Sofie Bekaert (Ghent University, Molecular Biotechnology); Wim Van Criekinge (Ghent University, Molecular Biotechnology);
Short Abstract: PubMeth is a cancer methylation database that includes genes reported to be methylated in various cancer types. A query can be based either on genes or on cancer types. PubMeth is based on text-mining of PubMed abstracts, combined with manual reading and expert annotation of preselected abstracts.

Poster S14
Towards Mining Images from Full-Text Articles: Associating Images with Reference Text
Hong Yu- University of Wisconsin-Milwaukee
Shashank Agarwal (UWM, Medical Informatics); Mary Shimoyama (UWM, Medical Informatics);
Short Abstract: Images are an important part of the experimental results reported in bioscience full-text articles. However, image mining poses an important research challenge. Here we report our investigation and annotation work on associating reference text with images. Our work is an important first step towards developing automated approaches for mining images in full-text biomedical articles.

Poster S15
Searching PubMed articles queried by multiple articles
Katsuhiko Murakami- Japan Biological Informatics Consortium
Yoshiharu Sato (National Institute of Advanced Industrial Science and Technology, Biomedicinal Information Research Center); Tadashi Imanishi (National Institute of Advanced Industrial Science and Technology, Biomedicinal Information Research Center); Takashi Gojobori (National Institute of Advanced Industrial Science and Technology, Biomedicinal Information Research Center);
Short Abstract: To obtain appropriate articles from PubMed, we developed a PubMed article search system that takes multiple articles as input. The system can suggest directions for query optimization, such as splitting the query or deleting outliers. The system helps users find more appropriate articles.

Poster S16
BioLexicon: Towards a reference terminological resource in the biomedical domain
Dietrich Rebholz-Schuhmann- European Bioinformatics Institute
Piotr Pezik (EMBL-EBI, Rebholz group); Vivian Lee (EMBL-EBI, Rebholz group); Jung-Jae Kim (EMBL-EBI, Rebholz group); Riccardo del Gratta (CNR, ILC); Yutaka Sasaki (University of Manchester, NaCTeM); Jock McNaught (University of Manchester, NaCTeM); Simonetta Montemagni (CNR, ILC); Monica Monachini (CNR, ILC); Nicoletta Calzolari (CNR, ILC); Sophia Ananiadou (University of Manchester, NaCTeM);
Short Abstract: The BioLexicon is a publicly available large-scale terminological resource which brings together potential terms from several resources representing selected semantic types (genes, proteins, chemicals, species, enzymes, selected ontological terms). The schema of the BioLexicon enables improved resolution of term ambiguity and follows lexical standards for terminological resources.

Poster S17
Web-based literature mining tool for target identification and functional enrichment analysis
Junguk Hur- University of Michigan, Ann Arbor
Tim Wiggin (National Center for Integrative Biomedical Informatics, Bioinformatics); Alex Ade (National Center for Integrative Biomedical Informatics, Bioinformatics); Eva Feldman (University of Michigan, Ann Arbor, Neurology); David States (University of Michigan, Ann Arbor, Bioinformatics Program);
Short Abstract: JUMiner is a web-based, dictionary- and rule-based literature mining tool that works on full-text literature. Name conflicts are resolved by a scoring scheme based on co-occurrence of symbols and descriptions. It also features functional enrichment tests to find enriched targets, GO terms, MeSH terms, pathways, and protein-protein interactions.
