DAVID ASTLING, PhD
Beyond Genomics: Deriving Actionable Health Insights from the Human Proteome
The circulating human proteome offers a unique and dynamic view of a person's physiological and health status and presents a great opportunity for rapid and accurate health diagnostics. Genomics, by contrast, falls short in applications that require diagnostic fingerprints of environmental impact, disease progression, or infection. SomaLogic’s proteomic assay uses a library of over 5,000 SOMAmers to measure thousands of protein analytes simultaneously in a single blood sample. SomaLogic has shown that analysis of the proteome can indicate a patient's risk of a secondary cardiovascular event. To further this work, SomaLogic has embarked on a collaboration with major academic institutions to discover indicators of primary cardiovascular events, type 2 diabetes, kidney function, and lifestyle characteristics of pre-diabetic patients, as targets for incorporation into actionable, medically significant insights. Machine learning and statistical modeling techniques are used to develop insights that can give patients rapid feedback to inform strategies for managing aspects of cardio-metabolic syndrome. Additional collaborations are underway to discover insights for other disease states, physiological indicators of health and wellness, and non-blood sample types. This presentation will examine co-regulatory networks to deepen our understanding of the existing models and of the biomarkers underlying each disease model.
BENJAMIN M. GOOD, PhD
Lawrence Berkeley National Labs
Integrating Pathway Databases with Gene Ontology Causal Activity Models
The Gene Ontology (GO) Consortium (GOC) is developing a new knowledge representation approach called ‘causal activity models’ (GO-CAM). A GO-CAM describes how one or several gene products contribute to the execution of a biological process. In these models (implemented as OWL instance graphs anchored in Open Biological Ontology (OBO) classes and relations), gene products are linked to molecular activities via semantic relationships like ‘enables’, molecular activities are linked to each other via causal relationships such as ‘positively regulates’, and sets of molecular activities are defined as ‘parts’ of larger biological processes. This approach provides the GOC with a more complete and extensible structure for capturing knowledge of gene function. It also allows for the representation of knowledge typically seen in pathway databases.
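The structure described above can be pictured as a small graph of subject-relation-object triples. The following is a minimal sketch only: real GO-CAMs are OWL instance graphs typed by GO and OBO classes, whereas here every node label (`gene_product_A`, `kinase_activity_1`, and so on) and both helper functions are hypothetical names chosen for illustration. Only the three relation labels are taken from the description.

```python
# Hypothetical toy model; labels below are illustrative placeholders,
# not real GO identifiers or GO-CAM content.
TRIPLES = [
    ("gene_product_A", "enables", "kinase_activity_1"),
    ("gene_product_B", "enables", "transcription_factor_activity_1"),
    ("kinase_activity_1", "positively regulates", "transcription_factor_activity_1"),
    ("kinase_activity_1", "part of", "signaling_process_1"),
    ("transcription_factor_activity_1", "part of", "signaling_process_1"),
]


def objects_of(triples, subject, relation):
    """Return all objects linked to `subject` by `relation`, in input order."""
    return [o for s, r, o in triples if s == subject and r == relation]


def processes_for_gene_product(triples, gene_product):
    """Follow 'enables' then 'part of' to find the processes a gene product contributes to."""
    processes = set()
    for activity in objects_of(triples, gene_product, "enables"):
        processes.update(objects_of(triples, activity, "part of"))
    return sorted(processes)


def downstream_activities(triples, gene_product):
    """Activities positively regulated by any activity the gene product enables."""
    downstream = set()
    for activity in objects_of(triples, gene_product, "enables"):
        downstream.update(objects_of(triples, activity, "positively regulates"))
    return sorted(downstream)


if __name__ == "__main__":
    print(processes_for_gene_product(TRIPLES, "gene_product_A"))
    print(downstream_activities(TRIPLES, "gene_product_A"))
```

Walking the triples this way only mimics simple graph traversal; it does not perform the OWL description logic reasoning that the GO-CAM representation supports.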
Here, we present details and results of a rule-based transformation of pathways represented using the BioPAX exchange format into GO-CAMs. We have automatically converted all Reactome pathways into GO-CAMs and are currently working on the conversion of additional resources available through Pathway Commons. By converting pathways into GO-CAMs, we can leverage OWL description logic reasoning over OBO ontologies to infer new biological relationships and detect logical inconsistencies. Further, the conversion helps to standardize the representation of biological entities and processes. The products of this work can be used to improve source databases, for example by inferring new GO annotations for pathways and reactions, and can help with the formation of meta-knowledge bases that integrate content from multiple sources.
AARON VON HOOSER, PhD
Principal Scientist, Computational Biology
Building a Learning System that Helps Individuals to Thrive by Connecting Their Experiences and Goals with Molecular Measures of Health
Through the PatientsLikeMe (PLM) network, patients connect with others who have the same disease or condition and track and share their experiences. In the process, they generate data about the real-world nature of disease that help researchers, pharmaceutical companies, regulators, providers, and non-profits develop more effective products, services and care. Studies have shown that members of PLM report tangible benefits from the connectedness and sharing that is part of the PLM community experience. With more than 500,000 members, PLM is a trusted source for real-world disease information and a clinically robust resource that has published more than 60 peer-reviewed research studies.
The Biocomputing team at PLM is combining digitized person-generated experiential data with deep molecular analyses and machine learning to help patients understand and evaluate their own molecular biology and how they may be able to change their daily lives to optimally thrive. To this end, participants in PLM’s DigitalMe™ program have donated thousands of biospecimens, building a massive health data set that spans dozens of disease conditions, including SLE, fibromyalgia, MS, ALS, PD, and RA, and is captured on an ever-increasing list of big data platforms, including DNAseq, RNAseq, metabolomics, proteomics, and antibody immunosignatures. Here, we report results from several pilot “n of 1” studies that provide deep molecular biological characterization of longitudinal timepoints from the same individuals and track normal physiological systems perturbed by health interventions. We also report indications that a spectrum of processes tightly associated with specific disease activities may be perturbed in “healthy” individuals under various circumstances.
KIRK E. JORDAN, Ph.D.
IBM Distinguished Engineer
Data Centric Solutions
IBM T.J. Watson Research
Chief Science Officer
IBM Research UK
Algorithm Exploitation & Evolving AI/Cognitive Examples on IBM’s Data Centric Systems
The volume, variety, velocity and veracity of data are changing how we think about computer systems. IBM Research’s Data Centric Solutions organization has been developing systems that handle large data sets, shortening time to solution. This group has created a data centric architecture, initially delivered to the DoE labs at the end of 2017 and being completed in 2018. Now that these systems include various features to improve data handling, we need to rethink algorithms and their implementations to exploit those features. This data centric view is also relevant for Artificial Intelligence (AI) and Machine Learning (ML). In this talk, I will briefly describe the architecture and point out some of the hardware and software features ready for exploitation. As case studies, I will show how we are using these data centric AI/cognitive computing systems to address some challenges in the life sciences in new ways.
JOSLYNN S. LEE, Ph.D.
Science Education Fellow
Howard Hughes Medical Institute
Training and Engaging URM Undergraduate Students in Genomics Research Through a Place-based Microbiome Research Project
The participation of American Indian/Alaskan Native (AIAN) people and other underrepresented minority (URM) populations in STEM fields remains shockingly low. In the computational field, it is even lower. AIAN people face various barriers that impede them from pursuing or continuing careers in genomics. At the same time, there is a demand for integrating bioinformatics and data science into the life sciences curriculum. I am presenting a training workshop that allows students to gain hands-on laboratory and computational experience to understand the diversity of local environmental microbiomes in Colorado and New Mexico. This workshop targets early-career undergraduate students from Southwest regional PUIs, two-year colleges, and tribal colleges. Core competencies incorporated in the workshop are computational concepts (algorithms and file formats), statistics, accessing genomic data, and running bioinformatics tools to analyze data. I will discuss some of the successes and pitfalls that I have encountered and the adaptation of the workshop for a one-semester course.
ZHIYONG LU, Ph.D.
Deputy Director for Literature Search
National Center for Biotechnology Information (NCBI)
Senior Investigator, NCBI/NLM/NIH
Machine Learning in Biomedicine: from PubMed Search to Autonomous Disease Diagnosis
The explosion of biomedical big data and information in the past decade or so has created new opportunities for discoveries to improve the treatment and prevention of human diseases. But this large body of knowledge, which mostly exists as free text in journal articles for humans to read, presents a grand new challenge: individual scientists around the world are increasingly overwhelmed by the sheer volume of research literature and are struggling to keep up to date and to make sense of this wealth of textual information. Our research aims to break down this barrier and to empower scientists towards accelerated knowledge discovery. In this talk, I will present our work on developing open-source NLP and image analysis tools based on machine learning. Moreover, I will demonstrate their uses in some real-world applications such as improving PubMed searches, scaling up human curation for precision medicine, and enabling image-based autonomous disease diagnosis.
DEBORAH L. MCGUINNESS, PhD
Tetherless World Senior Constellation Chair
Professor of Computer Science and Cognitive Science
Rensselaer Polytechnic Institute
New York, USA
Semantic Data Resources Enabling Science: Building, Using, and Maintaining Ontology-Enabled Biology Data Resources
Ontologies are seeing a resurgence of interest and usage as big data proliferates, machine learning advances, and integration of data becomes increasingly important. The previous models of sometimes labor-intensive, centralized ontology construction and maintenance do not mesh well with today’s interdisciplinary world, which is in the midst of a big data, information extraction, and machine learning explosion. Today many high-quality ontologies exist that can and should be utilized. We will describe our approach to building maintainable and reusable semantics-enabled health and life science data ecosystems. We will introduce our method in the context of our National Institute of Environmental Health Sciences-funded Child Health Exposure Analysis Resource, and we will describe how our community built and maintains a broad interdisciplinary ontology that spans exposure science and health and integrates with numerous long-standing, well-used ontologies. We will also describe how this ontology powers an integrated data resource, and give examples of how the same methodology is being used in an IBM-funded Health Empowerment using Analysis, Learning and Semantics project as well as a semantics-aware drug repurposing effort. We will conclude by discussing today’s requirements for choosing, reusing, and interlinking existing, evolving resources, and the resulting requirements for new methodologies and systems that can be used and maintained by large, diverse communities to accelerate scientific discovery.
NICOLE A. VASILEVSKY, Ph.D.
Research Assistant Professor
Department of Medical Informatics and Clinical Epidemiology (DMICE)
Oregon Health & Science University
LOINC2HPO: Improving Translational Informatics by Standardizing EHR Phenotypic Data Using the Human Phenotype Ontology
Electronic Health Record (EHR) data are often encoded using Logical Observation Identifiers Names and Codes (LOINC), which is a universal standard for coding clinical laboratory tests. LOINC codes encode clinical tests and not the phenotypic outcomes, and multiple codes can be used to describe laboratory findings that may correspond to one phenotype. As a result, LOINC-encoded data are an untapped resource in the context of deep phenotyping with the Human Phenotype Ontology (HPO). The HPO describes phenotypic abnormalities encountered in human diseases, and is primarily used for research and diagnostic purposes. As part of the Center for Data to Health (CD2H)’s effort to make EHR data more translationally interoperable, our group developed a curation tool that is used to convert EHR observations into HPO terms for use in clinical research. To date, over 1,000 LOINC codes have been mapped to HPO terms. To demonstrate the utility of these mapped codes, we performed a pilot study with de-identified data from asthma patients. We were able to convert 70% of real-world laboratory tests into HPO-encoded phenotypes. Analysis of the LOINC2HPO-encoded data showed that the HPO term eosinophilia was enriched in patients with severe asthma and prednisone use. This preliminary evidence suggests that LOINC data converted to HPO can be used for machine learning approaches to support genomic phenotype-driven diagnostics for rare disease patients, and to perform EHR-based mechanistic research.
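The core idea of mapping a coded laboratory test plus its result to a phenotype term can be sketched as a lookup table. This is an illustration under stated assumptions, not the LOINC2HPO tool itself: the test codes (`EXAMPLE-EOS-COUNT`, `EXAMPLE-EOS-PERCENT`, `EXAMPLE-UNMAPPED`) are placeholders rather than real LOINC codes, the single-letter interpretation flags (`H` for high, `N` for normal) are an assumed convention, and the function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Observation = Tuple[str, str]  # (test code, interpretation flag)

# Hypothetical annotation table. Two different test codes map to the same
# phenotype label, mirroring the point that several codes can describe
# findings corresponding to one phenotype.
ANNOTATIONS = {
    ("EXAMPLE-EOS-COUNT", "H"): "Eosinophilia",
    ("EXAMPLE-EOS-PERCENT", "H"): "Eosinophilia",
}


def to_hpo(test_code: str, interpretation: str) -> Optional[str]:
    """Return the phenotype label for an observation, or None if unmapped."""
    return ANNOTATIONS.get((test_code, interpretation))


def conversion_rate(observations: Iterable[Observation]) -> float:
    """Fraction of observations that could be converted to a phenotype label."""
    observations = list(observations)
    if not observations:
        return 0.0
    mapped = sum(1 for code, flag in observations if to_hpo(code, flag) is not None)
    return mapped / len(observations)


if __name__ == "__main__":
    sample = [
        ("EXAMPLE-EOS-COUNT", "H"),
        ("EXAMPLE-EOS-PERCENT", "H"),
        ("EXAMPLE-EOS-COUNT", "N"),
        ("EXAMPLE-UNMAPPED", "H"),
    ]
    print([to_hpo(code, flag) for code, flag in sample])
    print(conversion_rate(sample))
```

A fuller treatment would also decide how normal results are represented, for example as an explicitly excluded phenotype rather than as an unmapped observation. That choice is left open here.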