Conference on Semantics in Healthcare and Life Sciences (CSHALS)

Call for Presentations (updated June 24, 2010)

The ISCB Conference on Semantics in Healthcare and Life Sciences (CSHALS) is a forum for the presentation and discussion of semantics-based approaches and challenges currently faced in the healthcare and life sciences industry. The conference features keynote lectures, invited talks, selected oral presentations and posters.

Conference organizers are accepting abstract submissions for poster and oral presentations in the following topic areas:

  • Biologics, Compounds & Chemistry – use of semantic technologies for any aspect of work with chemical compounds and biologics.
  • Biomolecular Semantics – use of semantic technologies for any aspect of work with biological pathways and molecular interaction networks.
  • Clinical Harmonization – use of semantic technologies for clinical trials and other highly regulated environments related to healthcare.
  • Emerging & Established Standards – the development and practical application of content and knowledge representation standards for use with semantic technologies.
  • Genomics & Genetics – semantic applications to biomolecular systems, from genomic, transcriptomic, proteomic, epigenomic, or any other biomolecular profiling.
  • Linked Data – projects, results, and best practices in using Linked Data principles to apply Semantic Web technologies in healthcare and life sciences.
  • Ontologies & Knowledge Bases – practical application of knowledge engineering principles and techniques.
  • Safety, Efficacy & Outcomes – semantic efforts in support of drug safety, efficacy and outcome prediction within clinical trials, as well as post-market surveillance.
  • Translational Medicine – practical application of semantic technologies across multiple stages of the typical pharmaceutical R&D pipeline.
  • Other – New and Innovative.


Poster Presentations

Updated April 19, 2011


Wednesday – February 23, 2011

4:00 pm - 5:00 pm - Poster (Author) Set-up
5:00 pm - 7:00 pm - Poster Reception


Poster 1: Contextual Understanding of Experimental Data Via Formal Semantic Integration of NLP-extracted Content with other Semantically Integrated Resources

Robert Stanley
Jason Eshleman
Erich Gombocz
IO Informatics, Inc.
Berkeley, CA   US

David Milward
Linguamatics Ltd., UK

Biological systems are inherently complex. Experimental results, especially if they cover multiple experimental modalities or diverse biological responses, are difficult to interpret out of context. This is a key area for the application of semantic technologies. The first step is the integration of analytical results under a well-formed application ontology. Extensible semantic integration standards such as RDF, N3 and OWL are used to create coherent, dynamically extensible and remappable triples-based data models. This first step supports the rapid creation of coherent experimental correlation networks and provides a statistically relevant view of system perturbations. However, it does not necessarily provide a better understanding of the biological functions involved. To achieve contextual understanding, these networks need to be further enriched with mechanistic knowledge, which requires the ability to bring in resources relevant to biological functions, either through direct connections or via queries to SPARQL endpoints. Adding information about interactions, pathways or other prior observations is relevant to describing biological processes that might otherwise be missed. Natural Language Processing (NLP) can be used to extract relationships between concepts from resources such as scientific journal articles, collaborations, comparative studies and clinical trials. When the NLP-extracted relationships are semantically integrated with experimental findings, the resulting view of the biological system is enhanced. Using thesauri to harmonize classes and relationships from these extracts, and merging them into a dynamically extended application ontology, results in functionally connected experimental results.
This approach makes it possible to apply biomarker patterns or molecular signatures derived from the network to answer complex biological questions, and also to apply them actively for screening and decision support. This poster describes a use case in which multiple experimental datasets (micro-RNA, sequencing, gene expression, drug target assays) were semantically integrated, enriched with public knowledge resources (tissue-specific gene expression and regulation [TIGER], human RNA drug targets [TargetScan], miRBase, Microcosm, Diseasome) and supplemented with NLP-extracted relationships concerning specific diseases (in this case, severe renal failure) from a variety of articles and other sources. Tools used in this scenario were IO Informatics’ Sentient Knowledge Explorer for the semantic integration of experimental data, ontology import, network visualization and graphical SPARQL queries, in conjunction with relationships extracted from MEDLINE abstracts by Linguamatics’ I2E enterprise text-mining platform. The resulting semantic network provides a reliable qualification of drug targets with broader applications. The kidney-disease-related profiles generated in this example are based on contextual understanding of the biological functions involved in the disease and their manifestation in grounded experimental observations, as well as on verification against mined content from trusted resources. Such a methodology can significantly affect how life-sciences and drug-discovery research is conducted, leading towards more effective drugs and broader use in personalized medicine to improve quality of life.
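The integrate-then-query pattern described above can be illustrated with a minimal, self-contained sketch. This is plain Python rather than an RDF toolkit, and the gene, disease, and predicate names are hypothetical placeholders, not taken from the poster's actual datasets:

```python
# Minimal illustration of triple-based integration: experimental results and
# NLP-extracted literature relationships are merged into one graph of
# (subject, predicate, object) triples, then queried together.

experimental = {
    ("miR-21", "upregulated_in", "renal_failure_cohort"),
    ("GeneX", "differentially_expressed_in", "renal_failure_cohort"),
}
nlp_extracted = {
    ("miR-21", "represses", "GeneX"),               # mined from literature
    ("GeneX", "participates_in", "fibrosis_pathway"),
}

# Semantic merge: one coherent, extensible data model.
graph = experimental | nlp_extracted

def query(g, s=None, p=None, o=None):
    """Return triples matching a (subject, predicate, object) pattern;
    None acts as a wildcard, like a variable in a SPARQL basic graph pattern."""
    return [(ts, tp, to) for (ts, tp, to) in g
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Which pathways are reachable from an experimentally observed marker?
for _, _, target in query(graph, s="miR-21", p="represses"):
    for _, _, pathway in query(graph, s=target, p="participates_in"):
        print(target, "->", pathway)   # prints: GeneX -> fibrosis_pathway
```

The point of the sketch is only that the literature-derived triple ("miR-21", "represses", "GeneX") supplies the mechanistic link that connects an experimental observation to a pathway; neither dataset alone supports that traversal.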



Poster 2: DrugTree: A Phylogenetic Platform to Study Protein-ligand Binding Relationships in the Drug Discovery Process

Katherine Herbert
Nina Goodey
D. Jason Seraydarian
Roberto Suarez
Shreya Achar
Department of Computer Science
Montclair State University
Montclair, NJ   US

In the News: a CSHALS poster contributor is listed on the F1000 Posters Bank.

The discovery of drugs that have the desired pharmacological profiles is critical for human health and survival, yet time-consuming and expensive. Consequently, we must aim to obtain maximum benefit from those medicinal compounds that have already been identified and found to have favorable properties. The DrugTree Project creates a toolkit for scientists interested in understanding the broader implications of the relationship between phylogenetics and the binding between a homologous set of enzymes and their corresponding ligands and inhibitors. Phylogeny is a useful context in which to view these relationships: as a protein evolves, one feature that changes is the binding pocket, and hence binding specificity. Consequently, evolutionary relationships can provide predictive power to establish the binding between a given ligand and a homolog based on known binding relationships within a protein family. Insight may also be gained into which phylogenetically prevalent amino acid changes within the binding site are responsible for different ligand specificities among the homologs in a family. The DrugTree Project has completed a prototype Web-based computing system that integrates phylogenetic data and analyses about enzymes with known information about their ligands and inhibitors. Currently, no single data repository integrates curated drug-target and protein-ligand datasets with a large, popular protein database like UniProt and then provides tools that allow users to view these datasets in a phylogenetically meaningful context. The DrugTree tool integrates data from the UniProt (www.uniprot.org/), BindingDB (www.bindingdb.org/) and BRENDA (www.brenda-enzymes.org/) databases, allowing the user to create trees that combine data from UniProt, with its massive non-redundant database, with data from the known-inhibitor repositories. The system first integrates these three datasets into a local repository.
Via a Web interface, it allows a user to create a phylogenetic tree for a homologous set of enzymes. It then enables the user to perform phylogenetic reconstruction via parsimony techniques, with a selection of phylogenetic reconstruction algorithms. Finally, the tool allows the user to view the compounds that inhibit or bind each homolog next to the enzyme name. This poster introduces the DrugTree tool and demonstrates its effectiveness through analysis of a subset of dihydrofolate reductase proteins and some of the set’s known inhibitors. Dihydrofolate reductase is both an important target and a good model system: the enzyme has recently been of interest as a drug target in global health, including the treatment of parasitic diseases such as malaria, African sleeping sickness and Chagas disease, as well as tuberculosis. Many sequences and crystal structures are available for dihydrofolate reductase, and purification is easy owing to the commercial availability of an affinity chromatography resin specific for this enzyme. It is therefore ideal for verifying our results. The poster also discusses our future development plans for the DrugTree platform.
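The predictive idea behind DrugTree, inferring likely ligands for an unannotated homolog from the closest annotated relative in the tree, can be sketched in a few lines. The tree topology, branch lengths, enzyme names and ligands below are illustrative placeholders, not DrugTree data:

```python
# Tree as child -> (parent, branch length) edges (hypothetical topology).
parent = {
    "DHFR_human": ("node1", 0.1), "DHFR_mouse": ("node1", 0.2),
    "node1": ("root", 0.3), "DHFR_pfal": ("root", 0.8),
}
# Ligands with known binding, for a subset of the homologs (illustrative).
known_ligands = {
    "DHFR_human": {"methotrexate"},
    "DHFR_pfal": {"pyrimethamine"},
}

def path_to_root(leaf):
    """Map each ancestor of `leaf` (and the leaf itself) to its cumulative
    branch-length distance from the leaf."""
    dist, node, out = 0.0, leaf, {leaf: 0.0}
    while node in parent:
        node, d = parent[node]
        dist += d
        out[node] = dist
    return out

def tree_distance(a, b):
    """Patristic distance: sum of distances to the closest shared ancestor."""
    pa, pb = path_to_root(a), path_to_root(b)
    return min(pa[n] + pb[n] for n in pa if n in pb)

def predict_ligands(query_leaf):
    """Transfer the ligand set of the nearest annotated homolog."""
    nearest = min(known_ligands, key=lambda leaf: tree_distance(query_leaf, leaf))
    return known_ligands[nearest]

print(predict_ligands("DHFR_mouse"))  # nearest annotated homolog is DHFR_human
```

Nearest-neighbor transfer is only the simplest possible use of the tree; the poster's parsimony reconstructions additionally let ancestral binding states be inferred along internal branches.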



Poster 3: Data Driven Derivation of Canonical Eligibility Criteria for Clinical Trials

Saranya Krishnamoorthy
Dinakarpandian Deendayal
Yugyung Lee
University of Missouri   
Kansas City, Missouri   US

Recruitment of subjects for clinical trial research is currently an inefficient and time-consuming process in the development of a new drug. Recruitment challenges are particularly difficult for studies involving vulnerable populations, especially those with psychiatric disorders. The other major hurdle to automating the process is that eligibility criteria are written in free text that cannot be reliably parsed or processed computationally. To overcome these obstacles, we created an intelligent online system with two goals: helping recruiters develop and specify a standardized representation of eligibility criteria, and automating the selection of candidates for mental health research studies. As proof of concept, the methodology was developed and validated on a corpus of 701 clinical trials on Generalized Anxiety Disorder containing 2,765 redundant inclusion criteria and 4,411 redundant exclusion criteria. A combination of pairwise matching of ontological terms and clustering is used to present a semantically non-redundant set of eligibility criteria. Text mining techniques such as removal of punctuation, breaking criteria into individual sentences, excision of stop words, stemming, and conversion of phrases to single terms were used to remove noise from the free text. Finally, only the ontological terms (terms from SNOMED CT, MeSH and LOINC) in each criterion were extracted for symmetric pairwise scoring. MCL (Markov Chain Clustering) was then applied to the resulting output. The clusters obtained were transformed into ontological concepts using the TF-IDF terms of each cluster and mapping the concepts to the terminological hierarchies of SNOMED CT and the NCI Thesaurus. Each cluster’s concepts are in turn linked dynamically to database queries. Protégé was used for ontology creation, the Jena API to interact with the ontology, and SPARQL to construct queries. Finally, these queries are used to retrieve patient records.
Results for a particular study are ranked by the percentage of the criteria list satisfied. The recruiter receives suggestions for creating criteria through associative rule mining of eligibility concepts. The total numbers of non-redundant inclusion and exclusion concepts obtained were 126 and 175, respectively. Clustering accuracy, determined using the F-measure, was 0.93 for inclusion and 0.95 for exclusion. The first 15 inclusion and 23 exclusion concepts (selected by cluster size) covered 85% of the redundant criteria set. Using the ontology developed, any new eligibility criterion related to GAD can be mapped to a cluster ontology and in turn to a database query used to search the patient database. The patient recruitment process can thus be largely automated. This paper presents a complementary data-driven approach that helps find a minimal non-redundant representation of an arbitrary collection of clinical trial eligibility criteria and automates the recruitment of patients for clinical trials. Our system thus allows recruiters the flexibility of using free text while the semantics of the criteria are captured for computer readability. We would like to acknowledge the National Institute of Mental Health for funding this project (1R43MH085372-01A1).
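The text-normalization and TF-IDF cluster-labeling steps described above can be sketched with standard-library Python. The criteria strings and stop-word list are illustrative stand-ins, not the study's corpus, and a real pipeline would add stemming and restrict terms to SNOMED CT/MeSH/LOINC vocabularies:

```python
import math
import re
from collections import Counter

# Tiny stand-in stop-word list for illustration.
STOP = {"of", "the", "a", "with", "no", "any", "history"}

def tokens(text):
    """Lowercase, keep alphabetic words, drop stop words (a crude stand-in
    for the punctuation-removal / stop-word-excision steps)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]

criteria = [
    "History of generalized anxiety disorder.",
    "No history of substance abuse.",
    "Diagnosis of generalized anxiety disorder.",
]
docs = [tokens(c) for c in criteria]

def tfidf(doc, corpus):
    """TF-IDF scores for one tokenized criterion against the corpus."""
    n = len(corpus)
    tf = Counter(doc)
    return {w: (tf[w] / len(doc)) * math.log(n / sum(w in d for d in corpus))
            for w in tf}

# The top-scoring term of each criterion serves as a crude cluster label.
labels = []
for d in docs:
    scores = tfidf(d, docs)
    labels.append(max(scores, key=scores.get))
```

Terms shared by many criteria ("generalized anxiety disorder") score low, while terms that distinguish a criterion ("substance", "diagnosis") score high, which is what makes TF-IDF terms usable as cluster labels before mapping to the terminology hierarchy.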



Poster 4: Integrating Multi-Dimensional Genomic, Proteomic and Clinical Data of Inflammation and Injury
Wenzhong Xiao
Massachusetts General Hospital
Stanford Genome Technology Center   
Stanford,  CA   US

Recent developments in high-throughput technologies have enabled direct studies of patients’ genomic responses to diseases and treatments, and new computational methods need to be developed to translate the large amounts of genomic, proteomic and clinical data into new knowledge in medicine. Over the past nine years, the Inflammation and the Host Response to Injury Glue Grant Consortium has utilized multiple experimental tools to study the temporal immune-inflammatory response in blood leukocytes and sub-populations from over 500 severely injured patients, together with their comprehensive clinical information. We are developing semantics-based approaches to integrating these genomic and proteomic data with patients’ clinical information to elucidate disease mechanisms and predict patient outcomes.



Poster 5: Translational Medicine in Action: Linking and Visualizing a Network of Biomedical Research Scientists using Nexus

Janos Hajagos
Erich Bremer
Tammy Diprima
Stony Brook University School of Medicine   
Stony Brook, NY   US


The goal of translational medicine is to translate basic science research into advances in clinical medicine. One way to meet this goal is to pair basic scientists with clinical researchers who share common research interests. The challenge is that the terms used by each group do not align perfectly. To demonstrate the utility of semantic web technology in translational medicine, we apply it to interconnect the research interests of clinical and basic scientists. The research interests of SUNY Reach faculty were obtained from the MeSH terms of publication data and are expressed in the VIVO ontology, normalized to the UMLS. The VIVO ontology is part of the NIH-funded VIVO project to interlink research scientists across different institutions. To explore novel interconnections in the network of research scientists, the Nexus visualization environment was used. Nexus, a locally developed project, is a semantic web visualization tool built on the OpenSimulator platform. It allows collaborative, real-time viewing and annotation of RDF data in a 3D environment.
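The matching step, pairing basic and clinical scientists by the MeSH terms attached to their publications, reduces to scoring the overlap of term sets. The following sketch uses Jaccard similarity; the names and MeSH term sets are hypothetical placeholders, not SUNY Reach data:

```python
# Hypothetical researchers with MeSH terms harvested from their publications.
basic = {
    "Dr. A": {"Signal Transduction", "Neoplasms", "Apoptosis"},
    "Dr. B": {"Zebrafish", "Gene Expression"},
}
clinical = {
    "Dr. C": {"Neoplasms", "Apoptosis", "Drug Therapy"},
    "Dr. D": {"Heart Failure", "Gene Expression"},
}

def jaccard(a, b):
    """Overlap of two MeSH term sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

# Rank candidate basic-clinical collaborations by shared-interest score.
pairs = sorted(
    ((jaccard(t1, t2), b, c)
     for b, t1 in basic.items() for c, t2 in clinical.items()),
    reverse=True,
)
best_score, best_basic, best_clinical = pairs[0]
print(best_basic, "<->", best_clinical, round(best_score, 2))
```

Raw set overlap is the simplest scoring choice; normalizing terms to the UMLS, as the poster does, additionally lets near-synonymous terms from the two communities match rather than being counted as disjoint.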


 


Contacts

Please direct inquiries about CSHALS to:

Steven Leard
ISCB Conferences Director
Telephone: +1-780-414-1663




Keynote Speaker - Dr. Charles Mead

Dr. Charles Mead
National Cancer Institute
Center for Biomedical Informatics and Information Technology (CBIIT)
Rockville, USA

Presentation Title: Next-generation Architecture for caBIG

Presentation slides (.pdf)

Abstract: The caBIG project now has more than six years of experience with the challenges and benefits of defining, designing, developing, deploying, and evolving a distributed infrastructure to support collaborative data sharing across the translational medicine continuum.  As a direct result of both the successes of the first generation of caBIG and the increasingly complex requirements of the caBIG stakeholder community, around not only data sharing but also more complex analytical and cross-process behavior coordination, the NCI Center for Biomedical Informatics and Information Technology (NCI CBIIT) has begun working on its next-generation architecture for caBIG.  This presentation focuses on a detailed enumeration of the distributed processing requirements for the next generation of caBIG tools and technologies.  It then discusses the core architecture strategies that have been adopted to satisfy these requirements.  Of particular importance is the adoption and adaptation of a number of standards to support interoperability and automated decision making, including the management of cross-platform service-level security, ad hoc and distributed queries, and computationally assembled workflows.  In general, the overarching development strategy for the next generation of caBIG combines the experience gained and lessons learned within the caBIG community over the last six-plus years with those of the larger internet community as it develops Web 2.0 strategies, technologies, and tools.

Biography: Dr. Mead has over 35 years of experience in digital signal processing and algorithm development, complex software systems and architectures, and healthcare and life sciences informatics.  Dr. Mead has experience in clinical trials methodologies and data management systems, application of the Unified Process, and fundamental healthcare and life sciences informatics issues including terminology management, application of the Health Level Seven (HL7) Reference Information Model (RIM), use of Clinical Data Interchange Standards Consortium (CDISC) standards such as SDTM and ODM, the JANUS data model, and Oracle’s HTB development framework.  Dr. Mead is currently Chair of the HL7 Architecture Board, past Chair of the Open Health Tools Architecture Project Team, and a current member of the CDISC Board of Directors.



 


Keynote Speaker - Toby Segaran


Toby Segaran
Data Magnate
Metaweb Technologies
San Francisco, CA, USA

Presentation Title: How to Argue for Semantics

Presentation slides (.pdf)

Abstract: Many people who could benefit from the techniques used by the semantic web community remain unaware of the advantages conveyed by graphs, URIs and ontologies.  In this talk I'll explore perceptions of the semantic web, the kinds of problems people frequently encounter that can be solved with these techniques, and how to explain semantic technology to the uninitiated.

Biography:
Toby Segaran is a software developer and the author of the acclaimed O'Reilly title Programming Collective Intelligence and two newer books, Programming the Semantic Web and Beautiful Data. Formerly the Director of Software Development at Genstruct, Toby is now a Data Magnate at Metaweb Technologies, where he develops techniques to retrieve, parse and reconcile large public datasets. He loves applying data-mining algorithms to everything from pharmaceutical trials to social networks and online dating.

