Conference on Semantics in Healthcare and Life Sciences

Tech Talks

Updated February 19, 2013

Tech Talks showcase products and services relevant to the CSHALS audience. Each Tech Talk runs 20 minutes (a 15-minute presentation followed by up to 5 minutes of Q&A) and gives organizations an informational format in which to raise awareness of new technologies and services.

Organizations interested in presenting a Tech Talk can find further information on our Sponsor Opportunities page.

Thursday – February 28, 2013

11:20 am - 11:40 am

Linking Data with Agile Text Mining

Presenter: David Milward, Chief Technology Officer, Linguamatics

Much of the knowledge we have resides as unstructured text. How can we exploit this to make connections and create new knowledge?

For some years, ontology-based text mining has been used to connect information from different documents to generate new hypotheses, for example by finding indirect relationships such as a link from a compound to a disease via an interaction with a gene. Different terminologies or ontologies can be exploited to bridge different communities, such as clinical and scientific research. We will discuss the similarities with semantic web approaches, as well as some differences.
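The indirect-relationship idea can be illustrated with a minimal sketch. It assumes relations have already been extracted as (subject, relation, object) triples; the entity names, relation names, and the `indirect_links` helper are hypothetical and are not taken from the talk or from any Linguamatics product.

```python
from collections import defaultdict

# Hypothetical relations, in the form a text-mining step might produce.
compound_gene = [
    ("compound_A", "inhibits", "GENE1"),
    ("compound_B", "activates", "GENE2"),
]
gene_disease = [
    ("GENE1", "associated_with", "disease_X"),
    ("GENE2", "associated_with", "disease_Y"),
    ("GENE1", "associated_with", "disease_Y"),
]


def indirect_links(first_hop, second_hop):
    """Join two relation lists on the shared middle entity."""
    # Group second-hop relations by their subject for fast lookup.
    by_subject = defaultdict(list)
    for subject, relation, obj in second_hop:
        by_subject[subject].append((relation, obj))

    paths = []
    for subject, relation, middle in first_hop:
        for relation2, target in by_subject.get(middle, []):
            paths.append((subject, relation, middle, relation2, target))
    return paths


hypotheses = indirect_links(compound_gene, gene_disease)
for path in hypotheses:
    print(" -> ".join(path))
```

Each printed path is only a candidate hypothesis: the two hops may come from different documents, so a path still needs review before it is treated as knowledge.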

Finally, we will show how we can export unstructured data in a structured format, whether RDF or BEL, to integrate unstructured and structured data. We will also discuss the consequences of extracting relationships and then curating them versus extracting hypotheses directly.

Biography: David Milward is CTO of Linguamatics and has over 20 years of experience in product development, consultancy and research in natural language processing. After receiving a PhD from the University of Cambridge, he was a researcher and lecturer at the University of Edinburgh. He has published in the areas of information extraction, spoken dialogue, parsing, syntax and semantics. He is a pioneer of interactive text mining, and a founder of Linguamatics.

Thursday – February 28, 2013

11:45 am - 12:05 pm

Semantic Indexing of Unstructured Documents Using Taxonomies and Ontologies

Presenter: Jans Aasman, Franz Inc., Oakland, CA, US

Life science companies and healthcare organizations use RDF/SKOS/OWL based vocabularies, thesauri, taxonomies and ontologies to organize enterprise knowledge. There are many ways to use these technologies but one that is gaining momentum is to semantically index unstructured documents through ontologies and taxonomies.

In this talk we will demonstrate two projects where we use a combination of SKOS/OWL based taxonomies and ontologies, entity extraction, fast text search and an RDF triplestore to create a semantic retrieval engine for unstructured documents.
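As a rough illustration of semantic indexing, the sketch below tags documents with taxonomy concepts and also indexes them under every broader concept, so a query for a general term retrieves documents that mention only a specific one. The taxonomy, labels, documents, and helper names are all hypothetical, plain dictionaries stand in for the SKOS vocabulary and the triplestore, and substring matching stands in for real entity extraction.

```python
from collections import defaultdict

# Hypothetical SKOS-like taxonomy: concept -> broader concept.
broader = {
    "aspirin": "nsaid",
    "ibuprofen": "nsaid",
    "nsaid": "drug",
}

# Hypothetical labels (preferred and alternative) -> concept.
labels = {
    "aspirin": "aspirin",
    "acetylsalicylic acid": "aspirin",
    "ibuprofen": "ibuprofen",
}


def ancestors(concept):
    """Return the concept followed by all of its broader concepts."""
    chain = [concept]
    while chain[-1] in broader:
        chain.append(broader[chain[-1]])
    return chain


def build_index(documents):
    """Map each concept to the ids of documents that mention it or a narrower concept."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        lowered = text.lower()
        for label, concept in labels.items():
            if label in lowered:  # naive stand-in for entity extraction
                for c in ancestors(concept):
                    index[c].add(doc_id)
    return index


documents = {
    "doc1": "Acetylsalicylic acid reduced fever in the trial.",
    "doc2": "Patients received ibuprofen twice daily.",
    "doc3": "No medication was administered.",
}
index = build_index(documents)
print(sorted(index["nsaid"]))
```

Neither doc1 nor doc2 contains the string "nsaid"; they are found because the alternative label and the broader-concept links carry the match.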

Biography: Jans Aasman started his career as an experimental and cognitive psychologist, earning his PhD in cognitive science with a detailed model of car driver behavior using Lisp and Soar. He has spent most of his professional life in telecommunications research, specializing in intelligent user interfaces and applied artificial intelligence projects. From 1995 to 2004, he was also a part-time professor in the Industrial Design department of the Technical University of Delft. Jans is currently the CEO of Franz Inc., the leading supplier of commercial, persistent, and scalable RDF database products that provide the storage layer for powerful reasoning and ontology modeling capabilities for Semantic Web applications.


Friday – March 01, 2013
2:20 pm - 2:40 pm

Enabling Drug Discovery Applications Through a Linked Data Platform

Presenter: Alasdair J G Gray, University of Manchester, UK

We present the Open PHACTS linked data platform that is being developed to support a wide range of novel drug discovery applications. The functionality offered by the platform has been drawn from a collection of prioritised drug discovery business questions created as part of the Open PHACTS project, a collaboration of research institutions and major pharmaceutical companies.

The discovery of new medicines requires pharmacologists to interact with a number of data sources, ranging from data on chemical compounds to data on their interactions with targets. The linked data platform provides an integrated view over data retrieved from several complementary, but overlapping, data sources.

Key features of the Open PHACTS linked data platform are:
1) Domain-specific API making drug discovery linked data available for a diverse range of applications without requiring the application developers to become knowledgeable about semantic web standards such as SPARQL;
2) Just-in-time identity resolution and alignment across datasets, enabling a variety of entry points to the data and ultimately supporting different integrated views of the data;
3) Centrally cached copies of public datasets to support interactive response times for user-facing applications.
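The identity-resolution feature can be pictured with a small sketch. This is not the Open PHACTS implementation: it is a generic union-find structure, chosen here only to show how equivalence links between identifiers from different datasets can be resolved at query time. The `IdentityResolver` class and all identifiers are hypothetical.

```python
class IdentityResolver:
    """Groups identifiers that have been declared equivalent (union-find)."""

    def __init__(self):
        self.parent = {}

    def find(self, ident):
        """Return the representative identifier of ident's equivalence set."""
        self.parent.setdefault(ident, ident)
        root = ident
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[ident] != root:
            self.parent[ident], ident = root, self.parent[ident]
        return root

    def link(self, a, b):
        """Record that identifiers a and b refer to the same entity."""
        self.parent[self.find(a)] = self.find(b)

    def equivalents(self, ident):
        """Return all known identifiers equivalent to ident, sorted."""
        root = self.find(ident)
        return sorted(i for i in list(self.parent) if self.find(i) == root)


resolver = IdentityResolver()
# Hypothetical cross-dataset equivalence links.
resolver.link("dsA:compound/1", "dsB:mol/42")
resolver.link("dsB:mol/42", "dsC:drug/7")
resolver.link("dsA:compound/2", "dsB:mol/99")

print(resolver.equivalents("dsC:drug/7"))
```

Because any identifier in a set resolves to the same group, an application can enter the data through whichever dataset's identifier it happens to hold.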

The Open PHACTS platform is hosted by OpenLink using the Virtuoso triplestore. This enables us to provide the security and privacy guarantees required by pharmaceutical companies. We have recently begun beta testing of the platform with our associated partners and anticipate a full public roll-out later in 2013.

The utility of the linked data platform is demonstrated by the variety of drug discovery applications being built to access the integrated data.

Biography: Alasdair is a researcher in the MyGrid team at the University of Manchester. He is currently working on the Open PHACTS project which is building an Open Pharmacological Space to integrate drug discovery data. Alasdair gained his PhD from Heriot-Watt University, Edinburgh. He has spent the last 10 years working on novel knowledge management projects investigating issues of relating data sets.


Friday – March 01, 2013
2:45 pm - 3:05 pm

Practical Usage of Linked Data and Semantic Annotations by the Enterprise

Presenter: Vassil Momtchev, Group leader, Ontotext, Bulgaria

Linked data and ontology-driven text processing (also known as semantic annotation) are becoming mainstream technologies. Although their benefits are well understood, it is difficult to point to a significant number of established semantic systems used in production in the life sciences and health care domain.

In this talk, we will present Ontotext's solution, implemented on top of a native RDF infrastructure capable of efficiently combining semantic annotations with a large repository of biomedical linked data. We will demonstrate how to implement semantic document searches that disambiguate concepts according to their context and how to browse documents using semantic annotations. Both background and extracted information are modelled as RDF and further exposed as linked data that can be indexed via a powerful search interface included as part of the system. The search interface allows indexing of locally processed data and data exposed via remote SPARQL endpoints.

The solution is built on a public linked data service called Linked Life Data, which integrates more than 25 popular life sciences and biomedical data sources in an RDF warehouse of more than 10 billion statements, all accessible via a single SPARQL endpoint and updated regularly.

Biography: Vassil Momtchev is a board member of Ontotext and a passionate software engineer with over 12 years of experience in the development of large-scale knowledge management solutions for the life sciences, pharmaceutical and biotechnology industries. He joined Ontotext in 2005 and coordinated several European-funded research projects in the areas of knowledge representation, reasoning and life sciences. He has practical experience in product development, software architecture and research in linked data, RDF, natural language processing and semantic databases.