Late Breaking Research Presentation Schedule


Presenting Authors are shown in bold:

LBR01 - Transcriptome analysis reveals thousands of targets of nonsense-mediated mRNA decay that offer clues to the mechanism in different species
Date: Sunday, July 13, 10:30 a.m. - 10:55 a.m.
Room: 302

Author(s):
Steven Brenner, University of California, Berkeley, United States
Courtney E. French, University of California, Berkeley, United States
Gang Wei, Fudan University, China
Angela Brooks, Broad Institute of MIT and Harvard, United States
Thomas Gallagher, Ohio State University, United States
Li Yang, Partner Institute for Computational Biology, China
Brenton Graveley, University of Connecticut Health Center, United States
Sharon Amacher, Ohio State University, United States

Session Chair: Bonnie Berger
Abstract

Nonsense-mediated mRNA decay (NMD) is an RNA surveillance system that degrades isoforms containing a premature termination codon (PTC). NMD coupled with alternative splicing is a mechanism of post-transcriptional gene regulation. The canonical model of defining a PTC in mammals is the 50nt rule: a termination codon more than 50 nucleotides upstream of an exon-exon junction is premature and triggers degradation. There is evidence that this rule holds in Arabidopsis but not in other eukaryotes such as Drosophila. There is also evidence that a longer 3’ UTR triggers NMD in plants, flies, and mammals.
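
The 50nt rule stated above can be written as a simple positional check. The following is a minimal sketch, not the authors' pipeline: the function name, the transcript-coordinate convention (distance measured from the end of the stop codon to the last exon-exon junction), and the example exon lengths are all assumptions made for illustration.

```python
def violates_50nt_rule(exon_lengths, stop_end, threshold=50):
    """Return True if the stop codon counts as premature under the 50nt rule.

    exon_lengths: exon lengths in transcript order (hypothetical input format).
    stop_end: number of transcript bases up to and including the stop codon.
    threshold: distance in nucleotides; 50 in the canonical rule.
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    # Position of the last exon-exon junction in spliced-transcript coordinates.
    last_junction = sum(exon_lengths[:-1])
    # Premature only if the stop lies MORE than `threshold` nt upstream.
    return (last_junction - stop_end) > threshold


# Illustrative transcript with three exons; the last junction is at position 300.
exons = [100, 200, 150]
print(violates_50nt_rule(exons, stop_end=200))  # stop 100 nt upstream of junction
print(violates_50nt_rule(exons, stop_end=260))  # stop 40 nt upstream of junction
```

Checking only the last junction is sufficient here, because any junction more than 50 nt downstream of the stop implies the last junction is as well.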

To survey the targets of NMD genome-wide in human, zebrafish, and fly, we performed RNA-Seq analysis on cells where NMD has been inhibited via knockdown of UPF1, a critical NMD protein. We found that thousands of genes produce alternative isoforms degraded by NMD in the three species. We found that the 50nt rule is a strong predictor of NMD degradation in human, and has an effect in zebrafish and in fly. In contrast, we found little correlation between the likelihood of degradation by NMD and 3' UTR length in any of the three species.


LBR02 - Leveraging network structure to discover genetic interactions in genome-wide association studies
Date: Sunday, July 13, 10:30 a.m. - 10:55 a.m.
Room: 306

Author(s):
Wen Wang, University of Minnesota, United States
Gang Fang, Icahn School of Medicine at Mount Sinai, United States
Vanja Paunic, University of Minnesota, United States
Xiaoye Liu, University of Minnesota, United States
Benjamin Oatley, University of Minnesota, United States
Majda Haznadar, University of Minnesota, United States
Michael Steinbach, University of Minnesota, United States
Brian Van Ness, University of Minnesota, United States
Nathan Pankratz, University of Minnesota, United States
Vipin Kumar, University of Minnesota, United States
Chad Myers, University of Minnesota, United States

Session Chair: Jason Ernst
Abstract

Genetic interactions (epistasis) are important factors in complex diseases that may contribute to unexplained heritability in genome-wide association studies (GWAS). However, existing methods for identifying genetic interactions, which mainly focus on testing individual locus pairs, lack statistical power. We proposed a novel computational approach for discovering disease-specific, pathway-pathway genetic interactions from GWAS data. The key motivation, derived from the extensive analysis of genetic interaction networks in yeast, is that genetic interactions tend to occur between functionally compensatory modules rather than between isolated pairs of genes. We developed a method that explicitly searches for such large structures, guided by established sets of genes belonging to characterized pathways. We applied this approach to a Parkinson's disease (PD) GWAS and found 50 statistically significant (FDR ≤ 0.25) pathway-level interactions, suggesting that large genetic interaction structures indeed exist and can be discovered by leveraging structural properties with prior information on pathways. Interestingly, many of the discovered interactions are associated with reduced disease risk, while a substantially smaller number are associated with increased disease risk. A significant fraction of them are validated in two independent cohorts. Our study highlights specific insights derived from analysis of the PD interactions and, more broadly, provides a general framework for systematic detection of genetic interactions from GWAS data.


LBR03 - Heterogeneous Network Link Prediction Prioritizes Disease-Associated Genes.
Date: Sunday, July 13, 11:00 a.m. - 11:25 a.m.
Room: 306

Author(s):
Daniel Himmelstein, University of California, San Francisco, United States
Sergio Baranzini, University of California, United States

Session Chair: Jason Ernst
Abstract

The first decade of Genome Wide Association Studies (GWAS) has uncovered a wealth of disease-associated variants. Two important tasks will be translating this information into a multiscale understanding of pathogenic variants, and increasing the power of existing and future studies through prioritization. We show that heterogeneous network link prediction accomplishes both these tasks. First, we constructed a network with 22 node types and 24 edge types from high-throughput, publicly available resources. From this network we extracted features describing the topology between specific genes and diseases. Using a machine learning approach that relies on GWAS-discovered associations for positives, we predicted the probability of association between each protein-coding gene and each of 23 diseases. These predictions achieved a testing AUROC of 0.845 and a 200-fold enrichment in precision at 10% recall. We compared the informativeness of each included network component. The full model outperformed any individual domain, highlighting the benefit of integrative approaches. For multiple sclerosis (MS), we predicted 5 novel susceptibility genes, 4 of which (JAK2, TNFAIP3, REL, RUNX3) achieved Bonferroni validation on a 9,772-case GWAS masked from our analysis. Regions containing two of these genes were uncovered in a recent MS ImmunoChip-based study, highlighting our ability to identify the causal gene within a locus.


LBR04 - Genome annotation of multiple cell types and chromatin architecture using graph-based regularization
Date: Sunday, July 13, 3:05 p.m. - 3:30 p.m.
Room: 306

Author(s):
Maxwell Libbrecht, University of Washington, United States
Michael Hoffman, Princess Margaret Cancer Centre, Canada
Ferhat Ay, University of Washington, United States
David Gilbert, Florida State University, United States
Jeffrey Bilmes, University of Washington, United States
William Noble, University of Washington, United States

Session Chair: Dana Pe'er
Abstract

Semi-automated genome annotation algorithms are widely used to summarize functional genomics data (such as ChIP-seq) into human-interpretable form. We present a single solution to two seemingly quite different problems that existing algorithms fail to address: (1) performing genome annotation in multiple cell types and (2) integrating 3D genome architecture data into the annotation. Our solution uses an analytic framework based on the idea of a pairwise prior, which states that we have a prior belief that certain pairs of genomic positions should be more likely to receive the same label in our annotation. We developed a novel convex optimization method, called graph-based regularization (GBR), which admits efficient inference in the presence of a pairwise prior. We applied GBR in both settings mentioned above and, by comparing our annotations to functional genomics experiments not used in training, we demonstrated that GBR improves the quality of the resulting annotations in both cases.


LBR05 - Deconvolution of massively-parallel reporter assays tiling 15,000 human regulatory regions reveal activating and repressive regulatory sites at nucleotide-level resolution.
Date: Sunday, July 13, 3:35 p.m. - 4:00 p.m.
Room: 306

Author(s):
Jason Ernst, UCLA, United States
Tarjei Mikkelsen, Broad Institute, United States
Manolis Kellis, Massachusetts Institute of Technology, United States

Session Chair: Dana Pe'er
Abstract

Massively parallel reporter assay designs have been demonstrated that test a large number of regulatory elements or discover specific activating and repressive bases for a small number of regulatory elements, but effectively doing both simultaneously has been a limitation. Here, we overcome this limitation, and present a new Bayesian tiling deconvolution approach, which combines experimental tiling of regulatory regions using 31 sequences of length 145bp at 5bp intervals with computational deconvolution of the resulting signal to infer a nucleotide-level view of regulatory activity across thousands of regulatory regions. By exploiting the multiple overlapping sequences in a probabilistic framework, our method is also robust to noisy or missing measurements, and enables high resolution inferences with a very small number of tested sequences per target region. This enables the de novo discovery of individual binding sites, and inference of their activating or repressive action in a single experiment across thousands of candidate regions. We apply this method in two cell types to more than 15,000 regions in the human genome selected based on chromatin data to provide the first nucleotide-level view of activating and repressive sites across a sizeable fraction of the regulatory human genome.
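
The tiling geometry mentioned above (31 sequences of 145 bp at 5 bp intervals) can be made concrete with a short calculation. This sketch covers only the coverage arithmetic, not the Bayesian deconvolution itself, and it assumes the tiles start at consecutive 5 bp offsets from a common origin; the function name is hypothetical.

```python
def tile_coverage(n_tiles=31, tile_len=145, step=5):
    """Count how many tiles overlap each base of the tiled region.

    Assumes tile i covers bases [i * step, i * step + tile_len).
    """
    region_len = (n_tiles - 1) * step + tile_len
    coverage = [0] * region_len
    for i in range(n_tiles):
        start = i * step
        for pos in range(start, start + tile_len):
            coverage[pos] += 1
    return coverage


cov = tile_coverage()
print(len(cov))   # total tiled span in bp
print(max(cov))   # most tiles overlapping any single base
print(min(cov))   # fewest tiles overlapping any single base
```

Under these assumptions the tiled span is 295 bp, and bases near the center are measured by up to 29 overlapping sequences, which is the redundancy a deconvolution step can exploit.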


LBR06 - Linking tumor mutations to drug responses via a quantitative chemical-genetic interaction map
Date: Sunday, July 13, 4:05 p.m. - 4:30 p.m.
Room: 306

Author(s):
Sourav Bandyopadhyay, University of California, San Francisco, United States

Session Chair: Dana Pe'er
Abstract

There is an urgent need in oncology to link molecular aberrations in tumors with therapeutics that can be administered in a personalized fashion. One approach identifies synthetic-lethal genetic interactions or emergent dependencies that cancer cells acquire in the presence of specific mutations. Using engineered isogenic cells, we generated an unbiased, quantitative chemical-genetic interaction map that measures the influence of 51 aberrant cancer genes on 90 drug responses. The dataset strongly predicts drug responses found from profiling cancer cell lines, indicating that it accurately models more complex cellular contexts. Applied to triple-negative breast cancer, we interrogate several clinically actionable synthetic lethal interactions with the MYC oncogene, providing new drug and biomarker pairs for clinical investigation. This scalable approach enables the prediction of drug responses from patient data and can be used to accelerate the development of new genotype-directed therapies.


LBR07 - Linking Signaling Pathways to Transcriptional Programs in Breast Cancer
Date: Monday, July 14, 2:10 p.m. - 2:35 p.m.
Room: 306

Author(s):
Hatice Osmanbeyoglu, MSKCC, United States
Raphael Pelossof, MSKCC, United States
Jacqueline F. Bromberg, MSKCC, United States
Christina Leslie, MSKCC, United States

Session Chair: Sourav Bandyopadhyay
Abstract

Cancer cells acquire genetic and epigenetic alterations that often lead to dysregulation of oncogenic signal transduction pathways, which in turn alters downstream transcriptional programs. Numerous methods attempt to deduce aberrant signaling pathways in tumors from mRNA data alone, but these pathway analysis approaches remain qualitative and imprecise. Here, we present a statistical method to link upstream signaling to downstream transcriptional response by exploiting reverse phase protein arrays and mRNA expression arrays in The Cancer Genome Atlas breast cancer project. Formally, we use an algorithm called affinity regression to learn an interaction matrix between upstream signal transduction proteins and downstream transcription factors (TFs) that explains target gene expression. The trained model can then predict the TF activity given a tumor sample’s protein expression profile or infer the signaling protein activity given a tumor sample’s gene expression profile. Breast cancers comprise molecularly distinct subtypes that respond differently to pathway-targeted therapies. We trained our model on the breast cancer data set and identified subtype-specific and common TF regulators of gene expression. Finally, inferred protein activity predicted clinical outcome within the METABRIC Luminal A cohort, identifying high- and low-risk patient groups within this heterogeneous subtype.


LBR08 - Beyond Argonaute: understanding microRNA dysregulation in cancer and its effect on protein interaction and transcriptional regulatory networks
Date: Monday, July 14, 2:40 p.m. - 3:05 p.m.
Room: 306

Author(s):
Sara Gosline, MIT, United States
Coyin Oh, Massachusetts Institute of Technology, United States
Ernest Fraenkel, Massachusetts Institute of Technology, United States

Session Chair: Sourav Bandyopadhyay
Abstract

microRNAs (miRNAs) cause changes in gene expression through repression of target mRNA and are highly dysregulated in cancer. However, many effects of miRNA changes cannot be attributed to direct miRNA-mRNA interactions. We therefore propose an integrative approach that characterizes the effect miRNAs can have on protein-protein interaction networks, with the hope of identifying proteins and pathways that correlate with patient prognosis.


LBR09 - Extensive trans and cis-QTLs revealed by large scale cancer genome analysis
Date: Monday, July 14, 3:10 p.m. - 3:35 p.m.
Room: 306

Author(s):
Kjong-Van Lehmann, Memorial Sloan-Kettering Cancer Hospital, United States
Andre Kahles, Memorial Sloan Kettering Cancer Center, United States
Cyriac Kandoth, Memorial Sloan Kettering Cancer Center, United States
William Lee, Memorial Sloan Kettering Cancer Center, United States
Nikolaus Schultz, Memorial Sloan Kettering Cancer Center, United States
Robert Klein, Memorial Sloan Kettering Cancer Center, United States
Oliver Stegle, EBI, United Kingdom
Gunnar Rätsch, Memorial Sloan-Kettering Cancer Center, United States

Session Chair: Sourav Bandyopadhyay
Abstract

While population structure can be one of the most severe confounding factors in QTL analysis, tumor samples pose many additional challenges. Tumor-specific somatic mutations and recurrence patterns are known to explain large amounts of the observed transcriptome variation, and sample heterogeneity can lead to spurious associations. We have developed a new strategy to perform a common variant association study (CVAS) using mixed models on tumor samples, which enables us to account for tumor-specific genotypic and phenotypic heterogeneity as well as population structure. We apply this strategy to investigate the relationship between germline and somatic variants as well as splicing patterns and expression changes in order to discover determinants of transcriptome variation. Due to sample size constraints, many QTL studies have been limited to the analysis of cis-associated variants. We use whole genome, exome and RNA-seq data from the TCGA project to overcome this limitation and discover trans-associated variants as well. A rare variant association study (RVAS) using variants from whole genome and exome sequencing data is being used to investigate the basis of rare mutations.


LBR10 - Utilizing a Phylogeographic Generalized Linear Model for Identifying Predictors Driving H5N1 Diffusion within Egypt
Date: Monday, July 14, 3:40 p.m. - 4:05 p.m.
Room: 306

Author(s):
Matthew Scotch, Arizona State University, United States
Daniel Magee, Arizona State University, United States
Rachel Beard, Arizona State University, United States

Session Chair: Sourav Bandyopadhyay
Abstract

Egypt has become an epicenter of highly pathogenic avian influenza H5N1 transmission. As with many viruses, the diffusion of H5N1 is a highly complicated process that depends on a large number of factors, most of which are poorly understood. We adopted a Bayesian phylogeographic GLM, as developed by Lemey et al., in which viral diffusion patterns are reconstructed while predictors are simultaneously assessed.


LBR11 - Clonality Inference in Multiple Tumor Samples Using Phylogeny
Date: Tuesday, July 15, 10:30 a.m. - 10:55 a.m.
Room: 306

Author(s):
Nilgun Donmez, Simon Fraser University, Canada
Salem Malikic, Simon Fraser University, Canada
Andrew McPherson, British Columbia Cancer Agency, Canada
Cenk Sahinalp, Indiana University, United States

Session Chair: Teresa Przytycka
Abstract

Most human tumors exhibit a large degree of heterogeneity that is not only apparent in histology but also presents itself in various features such as genomic copy number alterations and structural rearrangements as well as other aberrations. While the origins of the intra-tumor heterogeneity are still debated, research suggests that this diversity is likely to have clinical implications and may be linked to metastatic potential and drug response.

Although a multi-clonal nature is common to most tumor samples, determining the clonal subpopulations is a challenging process. Currently, single-cell sequencing has a prohibitive cost at the scales that would be necessary to representatively sample a tumor tissue. Furthermore, methods such as Fluorescence in Situ Hybridization (FISH) or Silver in Situ Hybridization (SISH) can only assess a small number of probes in individual cells of a tumor sample.

In silico separation of the clonal subpopulations may provide a viable alternative to these aforementioned methods. Despite the importance of clonal diversity and its clinical implications, relatively few computational methods have been developed to date.

To accurately determine subclonal frequencies in tumors as well as their evolutionary history, we have developed a novel combinatorial algorithm named CITUP (Clonality Inference in Tumors Using Phylogeny). CITUP can exploit multiple samples from the same patient to achieve more accurate estimates and works on a variety of point mutations, such as small indels and single nucleotide variants, as well as structural alterations. Through an efficient and robust multi-dimensional clustering approach, our method can handle a large number of mutations per patient. In addition to its exact Quadratic Integer Programming (QIP) formulation, CITUP also employs an approximate iterative module that achieves accuracy comparable to the QIP module while producing solutions faster.

Using extensive simulations where we experiment with a variety of phylogenetic trees with differing number of subclones and model parameters, we evaluated the performance of CITUP and compared it to the performance of other state-of-the-art tools. In these simulations, we used a comprehensive set of evaluation measures ranging from the ability to infer the correct evolutionary trajectory of the tumor to identifying mutational profile and relative abundance of the subclones. These measures show that CITUP consistently outperforms the other tools in estimating the subclonal frequencies and inferring phylogenetic relationships.


LBR12 - Expansion of biological pathways based on evolutionary inference
Date: Tuesday, July 15, 11:00 a.m. - 11:25 a.m.
Room: 306

Author(s):
Sarah Calvo, Broad Institute, United States
Yang Li, Harvard University, United States
Roee Gutman, Brown University, United States
Jun Liu, Harvard University, United States
Vamsi Mootha, HHMI and Massachusetts General Hospital, United States

Session Chair: Teresa Przytycka
Abstract

One approach to predict gene function is to identify modules of genes that have been lost together multiple times across evolution. We developed CLIME, a principled “phylogenetic profiling” algorithm that clusters an input gene-set into modules based on shared evolutionary history, and then expands each module with additional genes that likely arose under the inferred model of evolution. CLIME models evolution of the input gene set using a Bayesian mixture of tree-based hidden Markov models (simultaneously learning module number and membership via Markov Chain Monte Carlo sampling for Dirichlet process mixture models). Using data from 138 diverse eukaryotic species, we applied CLIME to 1000 human pathways/complexes as well as to the entire genomes of three model organisms (yeast, malaria parasite, and red alga). These analyses revealed unexpected evolutionary modularity even in well-studied pathways and many novel, co-evolving components.


LBR13 - Accurate prediction of mitochondrial presequences and their cleavage sites with MitoFates identifies hundreds of novel human mitochondrial protein candidates
Date: Tuesday, July 15, 11:30 a.m. - 11:55 a.m.
Room: 306

Author(s):
Yoshinori Fukasawa, University of Tokyo, Japan
Kenichiro Imai, The National Institute of Advanced Industrial Science and Technology, Japan
Junko Tsuji, University of Tokyo, Japan
Szu-Chin Fu, University of Tokyo, Japan
Kentaro Tomii, The National Institute of Advanced Industrial Science and Technology, Japan
Paul Horton, The National Institute of Advanced Industrial Science and Technology, Japan

Session Chair: Teresa Przytycka
Abstract

Mitochondria provide numerous essential functions for cells, and their dysfunction causes diseases such as neurodegenerative diseases. Thus, obtaining a complete mitochondrial proteome should be a crucial step towards understanding the roles of mitochondria. Many mitochondrial proteins have been identified, but a complete list is not available. Unfortunately, the accuracy of existing predictors is far from perfect and has not improved significantly for a decade! Here, we report MitoFates, a predictor to accelerate the discovery of mitochondrial proteins. In developing MitoFates we introduced novel presequence features: a modified hydrophobic moment, novel motifs and a refined PWM for the cleavage site. We combined those with classical features and presented them to an SVM. According to our benchmarks on a non-redundant test set of proteins, MitoFates achieves significantly higher performance than the well-known predictors TargetP, Predotar and MitoProtII. To investigate the utility of MitoFates, we looked for undiscovered mitochondrial proteins in the human proteome. MitoFates predicts 1231 genes, and 633 of these were annotated as “mitochondria” in neither UniProt nor GO. Interestingly, these include candidate regulators of Parkin translocation to damaged mitochondria, a trigger of degradation of dysfunctional mitochondria. This suggests that careful investigation of other predictions will be helpful in elucidating the functions of mitochondria in health and disease.

LBR14 - Stable identifiability of the human microbiome based on metagenomic hitting sets
Date: Tuesday, July 15, 12:00 p.m. - 12:25 p.m.
Room: 306

Author(s):
Eric Franzosa, Harvard School of Public Health, United States
Katherine Huang, The Broad Institute, United States
James Meadow, University of Oregon, United States
Dirk Gevers, The Broad Institute, United States
Katherine Lemon, The Forsyth Institute, United States
Brendan Bohannan, University of Oregon, United States
Curtis Huttenhower, Harvard School of Public Health, United States

Session Chair: Morris Quaid
Abstract

Recent large-scale investigations of the human microbiome have revealed great variability in body site-specific microbial community structure across healthy individuals. However, it remains unknown if this variability is sufficient to uniquely identify individuals within a large population, or if it is sufficiently stable to continue uniquely identifying individuals at later times. We investigated these questions by developing a hitting set-based coding algorithm and applying it to individuals from the Human Microbiome Project cohort. Specifically, our approach defined metagenomic fingerprints: sets of microbial taxa or genes that distinguished individuals from a background population, with features prioritized based on predicted stability. Fingerprints based on clade-specific marker genes were able to distinguish almost all individuals. However, at most body sites, these fingerprints uniquely identified their owners in only ~30% of cases when re-assessed after a period of 30-300 days (due to microbial strain loss). The gut microbiome was an exception, as over 80% of its marker gene-based fingerprints remained stable and unique at later times. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work places an upper bound on the identifiability of human-associated microbial communities over mid-to-long time scales, a result with important ethical implications for future microbiome study design.
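
The idea of a hitting set-based fingerprint can be illustrated with a greedy heuristic: pick features the individual carries until every other person in the background lacks at least one of them. This is a minimal sketch under stated assumptions, not the authors' coding algorithm. In particular it ignores the stability-based feature prioritization described above, and the function name and toy data are hypothetical.

```python
def greedy_fingerprint(target, background):
    """Greedily choose features of `target` that together rule out everyone else.

    target: set of features (e.g. marker genes) present in the individual.
    background: dict mapping other individuals to their feature sets.
    Returns a list of features, or None if the individual is not identifiable.
    """
    remaining = set(background)   # individuals not yet distinguished
    candidates = set(target)
    fingerprint = []
    while remaining:
        # Feature that is absent from the most still-undistinguished people.
        best = max(
            sorted(candidates),
            key=lambda f: sum(1 for b in remaining if f not in background[b]),
            default=None,
        )
        if best is None:
            return None
        excluded = {b for b in remaining if best not in background[b]}
        if not excluded:
            return None  # no remaining feature separates the rest
        fingerprint.append(best)
        remaining -= excluded
        candidates.discard(best)
    return fingerprint


people = {"p1": {"a"}, "p2": {"a", "b"}, "p3": {"b"}}
print(greedy_fingerprint({"a", "b", "c"}, people))
```

A greedy choice does not guarantee the smallest possible fingerprint; it is used here only because it is short and easy to follow.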


LBR15 - Novel Computational Approach for Integration of Omics-platforms with Application to Hypertension in Recombinant Rat Strains
Date: Tuesday, July 15, 2:00 p.m. - 2:25 p.m.
Room: 306

Author(s):
Stefka Tyanova, Max Planck Institute of Biochemistry, Germany
Kathrine Sylvestersen, The Novo Nordisk Foundation Center for Protein Research, Proteomics, Denmark
Matthias Mann, Max Planck Institute of Biochemistry, Germany
Michael Lund Nielsen, The Novo Nordisk Foundation Center for Protein Research, Proteomics, Denmark
Juergen Cox, Max Planck Institute of Biochemistry, Germany

Session Chair: Morris Quaid
Abstract

We propose a novel computational approach for efficient integration of genomic, transcriptomic and proteomic data. We investigate the disease phenotypes characterizing the set of recombinant rat strains HXB/BXH, which is highly relevant to metabolic and cardiovascular diseases. We employ quantitative proteomic and transcriptomic measurements of the founding and the recombinant strains in combination with a genetic marker map of the recombinant strains. First, the molecular feature spaces at the proteome and transcriptome levels are orthogonally transformed, and components accounting for the variability explaining the phenotype of interest are extracted. This defines a quantitative measure of the disease phenotype across the recombinant strains. To incorporate genetic information, the map of alleles from the recombinant strains is transformed to a numeric matrix, assigning 1 if the recombinant region comes from the diseased founding strain and 0 otherwise. Support Vector Machine Regression (SVR) is then used to build a model that can correctly assign the phenotypic association of the strains based on their genetic characteristics. To identify disease-related genetic loci, we tested different feature selection strategies based on mutual information and SVR and measured their performance for various combinations of features. We identified a small number of genetic markers that are strongly associated with the disease phenotypes.
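
The allele-encoding step described above can be sketched in a few lines. This is only an illustration of the 0/1 transformation; the founder labels "D" (diseased) and "H" (healthy), the function name, and the toy strain data are assumptions, and the SVR model itself is not shown.

```python
def encode_alleles(allele_map, diseased_label):
    """Turn a strain-by-marker allele map into a 0/1 matrix.

    allele_map: list of rows, one per recombinant strain; each entry names
                the founder strain a marker region was inherited from.
    diseased_label: label of the diseased founding strain (hypothetical).
    """
    return [
        [1 if allele == diseased_label else 0 for allele in row]
        for row in allele_map
    ]


# Toy map: 3 recombinant strains x 4 markers.
strains = [
    ["D", "H", "H", "D"],
    ["H", "H", "D", "D"],
    ["D", "D", "D", "H"],
]
for row in encode_alleles(strains, diseased_label="D"):
    print(row)
```

Each row of the resulting matrix can then serve as the feature vector for one strain in a regression model.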


LBR16 - Systematic Evaluation of the Prognostic Impact and Intratumour Heterogeneity of clear cell Renal Cell Carcinoma biomarkers
Date: Tuesday, July 15, 2:30 p.m. - 2:55 p.m.
Room: 306

Author(s):
Sakshi Gulati, Cancer Research UK - London Research Institute, United Kingdom
Marco Gerlinger, Cancer Research UK - London Research Institute, United Kingdom
Charles Swanton, Cancer Research UK - London Research Institute, United Kingdom
Paul A Bates, Cancer Research UK - London Research Institute, United Kingdom

Session Chair: Morris Quaid
Abstract

Prediction of prognosis in clear cell Renal Cell Carcinoma (ccRCC) patients currently relies on clinical parameters such as tumour stage and grade. Multiple molecular predictors have been published, but the majority of them have not been validated.


LBR17 - Utilizing Docking Score Distributions to Identify Novel Protein-Drug Interactions
Date: Tuesday, July 15, 3:00 p.m. - 3:25 p.m.
Room: 306

Author(s):
Ariel Feiglin, Bar-Ilan University, Israel
Olga Leiderman, Bar-Ilan University, Israel
Ron Unger, Bar-Ilan University, Israel
Yanay Ofran, Bar-Ilan University, Israel

Session Chair: Morris Quaid
Abstract

Predicting whether a given protein and drug interact is an important yet largely unresolved goal. We introduce a fast and computationally inexpensive approach for determining whether proteins and drugs bind each other. This is accomplished by training a machine learning algorithm to differentiate between docking results of real protein-drug pairs and docking results of pairs that do not interact. The features used for training include structural and biophysical features of specific poses. However, the “secret ingredient” is the use of features derived from the distribution of the docking scores across all proposed binding modes for a given protein-drug pair. We used this approach to identify real protein-drug interactions from a pool of 488 real complexes and 194,770 presumably false ones with a precision of 0.6 (i.e., 60% of the predicted interactions were true) at a recall of 0.2. This is >500-fold better than random and >30-fold better than the precision that would be obtained by using only the docking score of the best pose. Applying this method to a large dataset of proteins and FDA-approved drugs, we identified novel protein-drug interactions and validated them experimentally. We also show that our predicted interactions are significantly enriched in a large dataset of known protein-drug interactions.
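
To make the notion of distribution-derived features concrete, the sketch below summarizes the docking scores of all poses for one protein-drug pair. The specific features (best score, mean, standard deviation, z-score of the best pose) are illustrative assumptions rather than the feature set used in the work above, as is the convention that lower scores indicate better binding.

```python
import statistics


def score_distribution_features(scores):
    """Summarize the docking scores of all poses for one protein-drug pair.

    scores: docking scores, one per proposed binding mode.
            Assumption: lower (more negative) means better binding.
    """
    best = min(scores)
    mean = statistics.fmean(scores)
    stdev = statistics.pstdev(scores)
    # How far the best pose stands out from the rest of the distribution.
    best_z = (best - mean) / stdev if stdev > 0 else 0.0
    return {"best": best, "mean": mean, "stdev": stdev, "best_z": best_z}


# Toy example: one pose is clearly better than the others.
print(score_distribution_features([-9.0, -5.0, -5.0, -5.0]))
```

A feature vector like this could be passed to any standard classifier alongside pose-specific structural features.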


LBR18 - Interactomics: Computational Analysis of Novel Drug Opportunities
Date: Tuesday, July 15, 3:30 p.m. - 3:55 p.m.
Room: 306

Author(s):
Gaurav Chopra, University of California, San Francisco, United States
Ram Samudrala, University of Washington, United States

Session Chair: Morris Quaid
Abstract

We have developed a Computational Analysis of Novel Drug Opportunities (CANDO) platform, funded by a 2010 NIH Director’s Pioneer Award, that analyses compound-proteome interaction signatures to determine drug behaviour, in contrast to traditional single (or few) target approaches. The platform uses similarity of interaction signatures across all proteins as indicative of similar functional behaviour and dissimilar signatures for off- and anti-target (side) effects, in effect inferring homology of compound/drug behaviour at a proteomic level. This results in an interaction matrix between 3,733 human FDA approved drugs and supplements × 48,278 proteins using our hierarchical chem- and bio-informatic fragment-based docking with dynamics protocol (>500 million predicted interactions total) with a benchmarking success for over 650 indications/diseases. Using signatures, we ranked compounds for existing indications and prospectively validated “high value” predictions in vitro, in vivo, and by clinical studies for more than twenty indications, including dental caries, dengue, tuberculosis, ovarian cancer, and cholangiocarcinomas, among many others. Of the validations done thus far, 49/82 show better activity than an existing drug or micromolar inhibition at the cellular level, and these compounds may serve as novel repurposeable therapies. Our approach is applicable to any compound and includes models to enable personalization, foreshadowing a new era of faster, safer, better and cheaper drug discovery.