Conference on Semantics in Healthcare and Life Sciences (CSHALS)

Technical Talks

(updated March 1, 2010)

Tech Talks showcase products and services of relevance to the CSHALS audience. Each Tech Talk is 10 minutes in length and designed to allow organizations to create awareness of new technologies, services, etc., in an informational presentation format.

For organizations interested in presenting a Tech Talk, please go to our Sponsor Opportunities page (click here) for further information.


Thursday, February 25, 2010 - Technical Talk 1
Noon - 12:10 p.m.

Jan Aasman, Franz Inc.
"Exploring Schema Spaces in Linked Open Data"


Presentation slides - .pdf: click here

Abstract:
There is an explosion of linked RDF datasets in the life sciences domain. A typical RDF dataset published on the web covers one particular domain and contains an ontology of the data, a set of instances, and possibly some explicit owl:sameAs relations to instances in other datasets. Most interesting problems require one to combine a large number of these datasets and then create queries and analysis programs that touch multiple sources.

In practice, exploring these data sources is far from trivial. First, the domain expert has to study each dataset to find out what classes it contains and what properties each class has. Unfortunately, not all datasets come with full ontologies that make this easy, so one has to reverse engineer which property belongs to which class and what datatype a typical property has. Second, the user has to figure out the linkage between particular datasets, which might be through instance names or through owl:sameAs relationships. Again, the user has to reverse engineer these links, mostly manually.

We will discuss a number of tools that help the user explore linked datasets. [1] For datasets that lack even simple class descriptions, we examine tools that use simple data mining techniques to cluster instances into groups one might call classes, and provide metadata about each class. [2] For datasets that do contain at least rdf:type statements, we add further metadata to help the user figure out what properties a class has. [3] For a combination of datasets, we show how to use the existing ontologies and the output of [1] and [2] to create a schema space. We then mine the entire dataset to enrich this schema space with links between the classes. The schema space can then be navigated with visual tools to quickly understand what is in the various datasets and how they are linked, and one might even use it to automatically create SPARQL queries that give the user an idea of how the data might be queried.
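
As a rough illustration of steps [2] and [3], the following Python sketch infers per-class property usage and class-to-class links from typed instances, then builds a starter SPARQL query from the result. It is not the tooling presented in the talk: the function names, the ex: sample data, and the representation of triples as plain string tuples are all assumptions made for this example, and the generated query assumes the rdf: and ex: prefixes are declared elsewhere.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical sample data; prefixed names are kept as plain strings.
TRIPLES = [
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:aspirin", "ex:name", '"Aspirin"'),
    ("ex:aspirin", "ex:targets", "ex:COX1"),
    ("ex:COX1", "rdf:type", "ex:Protein"),
    ("ex:COX1", "ex:name", '"COX-1"'),
]


def infer_schema(triples):
    """Return (class_properties, class_links) inferred from (s, p, o) triples."""
    # Pass 1: record the declared classes of every typed instance.
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)

    # Pass 2: count how often each property is used by instances of each
    # class, and how often it points at an instance of another class.
    class_properties = defaultdict(lambda: defaultdict(int))
    class_links = defaultdict(int)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for cls in types.get(s, ()):
            class_properties[cls][p] += 1
            for target in types.get(o, ()):
                class_links[(cls, p, target)] += 1
    return class_properties, class_links


def sparql_for_class(cls, properties):
    """Build a starter query listing instances of cls with their properties.

    Prefix declarations are omitted; they are assumed to be added by the caller.
    """
    props = sorted(properties)
    variables = " ".join(f"?v{i}" for i in range(len(props)))
    lines = [f"SELECT ?s {variables} WHERE {{", f"  ?s a {cls} ."]
    for i, prop in enumerate(props):
        lines.append(f"  OPTIONAL {{ ?s {prop} ?v{i} . }}")
    lines.append("} LIMIT 10")
    return "\n".join(lines)


class_properties, class_links = infer_schema(TRIPLES)
print(sparql_for_class("ex:Drug", class_properties["ex:Drug"]))
```

Counting usage rather than only recording presence gives a crude indication of how typical a property is for a class, which is the kind of metadata a schema browser could display next to each class.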


Thursday, February 25, 2010 - Technical Talk 2
12:10 p.m. - 12:20 p.m.

Erich A. Gombocz, IO Informatics, Inc.
"Predictive Models for Biology in Personalized Medicine: Are We There Yet?"


Presentation slides - .pdf:  click here

Abstract:
Semantic data integration of OMICs (genomics, proteomics, and metabolomics) experimental results with tissue analytics and clinical endpoints has opened new doors to understanding complex biological functions. This has led to the creation of mechanistically enhanced Applied Semantic Knowledgebases (ASK™), which combine correlations from quantitative experimental observations and public domain knowledge on pathway dependencies and mechanistic insights into a rich, systems-oriented decision support tool for biological functions.

This talk will demonstrate how such knowledge can be used in real-life scenarios, using arrays of SPARQL network queries to make confident decisions in toxicity assessment, pre-symptomatic prediction of organ failure, tumor classification, therapy selection, and patient stratification for clinical trials. Weighting and scoring of combinatorial biomarkers identified via such methods has dramatically improved the confidence and applicability of these approaches, making them available for patient screening for the first time. Applications of this technology span the full cycle from efficacy-driven drug discovery to the selection of optimal responders for the right drug at the right dose in the treatment of well-characterized disease stages.

About the speaker:
Dr. Erich Gombocz has over 30 years of experience in life science research, laboratory automation, and data management in scientific and distributed systems environments, plus more than 25 years of programming experience in instrumentation control, user interfaces, database design, scientific analysis, and on-line laboratory automation. He is also a developer of innovative software algorithms and architectures. Focusing on semantic data integration and knowledge management in life sciences, he founded IO Informatics in 2003 together with Bob Stanley to apply systems biology approaches to challenges in pharmaceutical and clinical decision-making.

Dr. Gombocz has authored over 60 scientific publications and currently holds more than 40 biotechnology- and software-related US and international patents. He is an international expert in separation science and bioinformatics, a member of several professional organizations, and serves on the editorial boards of a number of scientific journals.


Friday, February 26, 2010 - Technical Talk 3
12:00 p.m. - 12:10 p.m.

Rusty Bobrow, BBN Technologies
Eric Neumann, Clinical Semantics Group

"Visualizing the Web of Linked Life Science Data -- The S*QL Plugin for Cytoscape"


Presentation slides - .pdf:  click here

Abstract:
Visual analytics is a powerful paradigm that can be applied to RDF-structured data. Many applications have been created that interact with RDF: some query RDF via SPARQL endpoints, while others help visualize RDF data. Coupling both functions is essential for interactive analytics of RDF content. S*QL integrates SPARQL and SQL network data access mechanisms and analytic scripts with the open-source Cytoscape application, a powerful visualization engine with a strong following in the life sciences. S*QL supports visual analytics for complex knowledge graphs ranging from gene/protein/disease relations to drug/clinical trials data. We'll provide a live demonstration.
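
To make the coupling of querying and visualization concrete, here is a minimal Python sketch that turns SPARQL query results into a network edge list. It does not reflect how the S*QL plugin is implemented. It assumes results arrive in the standard SPARQL JSON results format and that the network is handed to Cytoscape as tab-delimited SIF text (source, interaction, target); the function name, variable names, and sample data are hypothetical.

```python
def bindings_to_sif(results, source="s", interaction="p", target="o"):
    """Convert SPARQL JSON results into tab-delimited SIF edge lines."""
    wanted = (source, interaction, target)
    lines = []
    for row in results["results"]["bindings"]:
        # Skip rows where an OPTIONAL variable was left unbound.
        if not all(var in row for var in wanted):
            continue
        lines.append("\t".join(row[var]["value"] for var in wanted))
    return "\n".join(lines)


# Hypothetical result of: SELECT ?s ?p ?o WHERE { ?s ?p ?o }
SAMPLE_RESULTS = {
    "head": {"vars": ["s", "p", "o"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri", "value": "http://example.org/BRCA1"},
                "p": {"type": "uri", "value": "http://example.org/associatedWith"},
                "o": {"type": "uri", "value": "http://example.org/BreastCancer"},
            },
            {
                # Incomplete row: no object bound, so no edge is emitted.
                "s": {"type": "uri", "value": "http://example.org/TP53"},
                "p": {"type": "uri", "value": "http://example.org/associatedWith"},
            },
        ]
    },
}

print(bindings_to_sif(SAMPLE_RESULTS))
```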

