Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

TOP 10 PAPERS READING LIST 2013 - 2014

AS SELECTED AT RSG 2014


  1. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position Buenrostro et al. (LA: Greenleaf), Nat Methods. 2013 Dec;10(12):1213-8.

  2. Single-molecule dynamics of enhanceosome assembly in embryonic stem cells Chen et al. (LA: Liu) Cell. 2014 Mar 13;156(6):1274-85.

  3. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals Battle et al. (LA: Koller) Genome Res. 2014 Jan;24(1):14-24.

  4. Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data Li et al. (LA: Brenner) Genome Res. 2014 Jul;24(7):1086-101.

  5. Determination and inference of eukaryotic transcription factor sequence specificity Weirauch et al. (LA: Hughes) Cell. 2014 Sep 11;158(6):1431-43.

  6. A community effort to assess and improve drug sensitivity prediction algorithms Costello et al. (LA: Stolovitzky) Nat Biotechnol. 2014 Jun 1. doi: 10.1038/nbt.2877.

  7. Coregulation of transcription factor binding and nucleosome occupancy through DNA features of mammalian enhancers Barozzi et al. (LA: Natoli) Mol Cell. 2014 Jun 5;54(5):844-57.

  8. Enhancer loops appear stable during development and are associated with paused polymerase Ghavi-Helm et al. (LA: Furlong) Nature. 2014 Aug 7;512(7512):96-100.

  9. TFBSshape: a motif database for DNA shape features of transcription factor binding sites Yang et al. (LA: Rohs) Nucleic Acids Res. 2014 Jan;42(Database issue):D148-55.

  10. Genotype-environment interactions reveal causal pathways that mediate genetic effects on phenotype Gagneur et al. (LA: Steinmetz) PLoS Genet. 2013;9(9):e1003803.


KEYNOTE SPEAKERS' ABSTRACTS & BIOGRAPHIES

Updated Nov 6, 2014


DREAM Challenges


Andrea Califano
Clyde and Helen Wu Professor of Chemical Systems Biology
Columbia University
New York, United States
https://systemsbiology.columbia.edu/faculty/andrea-califano



Title: From Functional Regulatory Modules to the Genetic Determinants of Human Disease

Abstract:
Identification of driver mutations in human diseases is often limited by cohort size and availability of appropriate statistical models. We propose a novel framework for the systematic discovery of genetic alterations that are causal determinants of disease, by prioritizing genes upstream of functional disease drivers, within regulatory networks inferred de novo from experimental data. We tested this framework by identifying the genetic determinants of the mesenchymal subtype of glioblastoma. Our analysis uncovered KLHL9 deletions as upstream activators of two previously established master regulators of the subtype, C/EBPβ and C/EBPδ. Rescue of KLHL9 expression induced proteasomal degradation of C/EBP proteins, abrogated the mesenchymal signature, and reduced tumor viability in vitro and in vivo. Deletions of KLHL9 were confirmed in >50% of mesenchymal cases in an independent cohort, thus representing the most frequent genetic determinant of the subtype. The method generalized to study other human diseases, including breast cancer and Alzheimer’s disease.

Biography: Andrea Califano is director of the Columbia University Department of Systems Biology, director of the JP Sulzberger Columbia Genome Center, director of the Center for Multiscale Analysis of Genetic Networks (MAGNet), and associate director of bioinformatics at Columbia University’s Irving Cancer Research Center. He also currently serves as a member of the Board of Scientific Advisors of the National Cancer Institute.

In collaboration with colleagues in the Columbia scientific community, the Califano Lab was the first to publish fully context-specific molecular interaction networks (interactomes) for normal and tumor cells in humans, including neoplastic malignancies of lymphoma and glioma subtypes. Ongoing projects in the lab aim to define the regulatory networks for neoplastic states of the breast, ovary, prostate, germ cell, colon, and lung; for the study of pluripotency and lineage differentiation in stem cells; and for the mechanisms associated with the onset and progression of neurodegenerative diseases.

Dr. Califano is a principal investigator on several major center of excellence grants, including MAGNet, the Library of Integrated Network-Based Cellular Signatures (LINCS), and the Cancer Target Discovery and Development Center (CTD2).

...............................................................................................................................

William C. Hahn
Chief, Division of Molecular and Cellular Oncology
Deputy Chief Scientific Officer
Department of Medical Oncology
Dana-Farber Cancer Institute
Boston, United States
http://research4.dfci.harvard.edu/hahnlab/


Title: Systematic functional genomics and cancer

Abstract:
Large scale characterization of cancer genomes has led to information regarding the identity, number, and types of alterations found in human tumors. However, the large number of mutations identified to date requires complementary approaches to understand the function of these mutated genes and to elucidate pathways involved in cancer initiation and progression. Over the past several years, we have developed genome-scale approaches to perform loss-of-function and gain-of-function somatic cell genetics for the systematic evaluation of genes involved in cancer initiation and maintenance. We have used these tools to dissect known and novel pathways involved in malignant transformation. In particular, we have dissected the signaling pathways related to beta catenin and KRAS, two well-known cancer drivers that have proven difficult to target therapeutically. These approaches have identified new components of these pathways, which may provide a foundation for therapeutic strategies.

Biography: Dr. William C. Hahn is a medical oncologist in the Department of Medical Oncology at the Dana-Farber Cancer Institute and a Senior Associate Member of the Broad Institute of MIT and Harvard. He co-directs the Center for Cancer Genome Discovery, is the Chief of the Division of Molecular and Cellular Oncology and is the Deputy Chief Scientific Officer at the Dana-Farber Cancer Institute.

Dr. Hahn has made numerous seminal discoveries that have informed our current molecular understanding of cancer, defined new conceptual paradigms, and formed the foundation of new translational studies. His laboratory has helped develop widely adopted experimental models and genome-scale tools, all of which he has made openly available to the research community. Dr. Hahn and his colleagues helped demonstrate that activation of the reverse transcriptase telomerase plays an essential role in malignant transformation. This observation provided the means to create novel experimental model systems to identify and characterize the cooperative genetic interactions that lead to malignant transformation. Together with his colleagues at the Broad Institute, he helped develop genome-scale tools and technology (RNAi and open reading frame collections) to perform somatic cell genetics in human cells. His laboratory has pioneered the use of integrated functional genomic approaches to identify and validate cancer targets. Using these approaches, his laboratory has discovered several new oncogenes including IKBKE, CRKL, CDK8 and SOX2 as well as targets (TBK1 and CYCLOPS genes) that are essential in specific genetic contexts, which will pave the way for new therapeutic approaches. The tools, models and approaches that his laboratory has developed are already widely used worldwide to discover and validate molecularly targeted cancer therapies. Dr. Hahn and his collaborators are now engaged in clinical trials testing whether inhibition of these new oncogenes or synthetic lethal partners will lead to clinical responses.

...............................................................................................................................

Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
Toronto, Canada
http://oicr.on.ca/person/oicr-investigator/lincoln-stein



Title: The Future of Genomic Databases

Abstract: One of the most enlightened aspects of the modern era of biological research is the idea that the experimental data sets that underlie published data must be available for use by the research community for the purposes of replication and extension. Nowhere has this been truer than in genomics, where raw nucleotide sequencing data is routinely deposited into open community databases. However, the revolution in "next generation" sequencing technologies is straining both the capacity of the nucleotide databases and the network bandwidth required to download and compute on the data. In this talk, I will discuss the promise of cloud computing for archiving, distributing and computing over large genomics datasets. I will describe the technological, ethical and policy hurdles that affect genomic cloud computing, and discuss lessons learned from the cloud-based genomics projects that I have participated in.

Biography: Dr. Lincoln Stein leads the OICR's Informatics and Bio-computing Program, which undertakes the management and analysis of large integrative cancer research projects including the International Cancer Genome Consortium (ICGC) and its Data Coordination Centre. His research focuses on using network and pathway-based analysis to identify common mechanisms in multiple cancer types and to devise prognostic and predictive signatures to aid in patient management. In addition, his group works on problems relating to the management of, and access to, cancer genomes and other large biomedical data sets.

Regulatory Genomics


Nancy Cox
Department of Human Genetics
The University of Chicago, United States
http://genes.uchicago.edu/contents/faculty/cox-nancy.html




Title: Systems Approaches to Integration of Regulatory Information for Characterizing Genome Variation Affecting Complex Human Traits

Abstract: Although it has been effective to layer information on regulatory variation, such as expression quantitative trait loci (eQTLs), on data from association studies to inform our understanding of genome variation affecting complex human traits, I argue here that more comprehensive efforts at data integration and systems based approaches can provide more useful information. I will describe several approaches we are using to integrate regulatory variation directly into analysis of complex traits and illustrate the application of these systems approaches to disease and drug response phenotypes.

Biography: Nancy Cox is a quantitative human geneticist with a research program based on identifying and characterizing genome variation affecting complex human traits and common diseases. Recent research has focused on integration of regulatory variation into studies of genome variation affecting risk of common disease and complex human traits. In addition to working in the GTEx Consortium since its inception, Dr. Cox is active in research in pharmacogenomics, type 2 diabetes and diabetic complications, breast cancer, and neuropsychiatric phenotypes including Tourette Syndrome, obsessive compulsive disorder, autism, schizophrenia, and bipolar disorder.

...............................................................................................................................

Ellen Rothenberg

Albert Billings Ruddock Professor of Biology
California Institute of Technology
Pasadena, United States
www.bbe.caltech.edu/content/ellen-rothenberg



Title: Genome-wide Transcriptional Machinery of a Cell-fate Choice: The Early T-cell Pathway

Abstract: The stages through which a multipotent precursor becomes committed to a T-cell fate have been well characterized for a decade, but the underlying transcriptional machinery is only now becoming clear. Distinct groups of transcription factors dominate successive stages of early T-cell development, maintaining metastable network links but letting new regulators accumulate that eventually drive the cells to the next stages. The presentation will describe the roles of several key regulators that control these state switches.

Biography: Ellen Rothenberg earned a bachelor's degree in Biochemical Sciences from Harvard University in 1972 (summa cum laude). She received her Ph.D. in 1977 from Massachusetts Institute of Technology, carrying out her thesis research in the laboratory of David Baltimore. In 1977 she began her postdoctoral research in immunogenetics at Memorial Sloan-Kettering Cancer Center with Edward Boyse. In 1979, she became Assistant Research Professor at The Salk Institute for Biological Studies, Department of Cancer Biology. In 1982, she moved to the California Institute of Technology, Division of Biology, where she was promoted to Associate Professor of Biology in 1988, Professor of Biology in 1994, and the Albert Billings Ruddock Professor of Biology in 2007. She has won several teaching awards at Caltech and has also taught internationally in advanced courses on immunology, developmental biology, and gene regulatory networks. She is a member of several institutional Scientific Advisory Boards for US and international institutes and is a current or past member of the Editorial Boards of several prominent immunology journals. She has also served on Program, Award, and Nominating committees for the American Association of Immunologists and on grant review panels for the US Government agencies (NIH and NASA) and three private foundations, and she has founded or served on the organizing committees for multiple international conferences in immunology and systems biology. Her group’s research is at the interface of immunology, stem cell developmental biology, systems biology, and genomics.

...............................................................................................................................

Jay Shendure

Associate Professor
Genome Sciences
University of Washington
Seattle, United States
www.gs.washington.edu/faculty/shendure.htm


Title: Next Generation Functional Analysis of Genetic Variants

Abstract: Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation, particularly in non-coding regions. To address this challenge, we are developing computational methods for objective, data-rich, and quantitative integration of genomic annotations, as well as new experimental frameworks for multiplex empirical measurement of the functional effects of both coding and non-coding variation, e.g. massively parallel reporter assays. In this talk, I will describe recent progress from our lab in these areas as well as the more general progress of the field towards measuring, modeling and predicting the functional consequences of genetic variants.

Biography: Jay Shendure is an Associate Professor of Genome Sciences at the University of Washington. Dr. Shendure's 2005 PhD included one of the first successful demonstrations of massively parallel or next generation DNA sequencing. His research group in Seattle has made significant contributions to technologies including exome sequencing and its application to identify the basis of Mendelian disorders and autism spectrum disorders; genome-wide experimental haplotyping and its application to non-invasive whole genome sequencing of a human fetus; massively parallel functional analysis of cis-regulatory elements; and contact probability maps for de novo genome assembly. He is the recipient of the 2012 Curt Stern Award from the American Society of Human Genetics, the 2013 FEDERAprijs, a 2013 NIH Director's Pioneer Award, and the 2014 HudsonAlpha Life Sciences Prize.

...............................................................................................................................

Brendan J. Frey
The Centre for Cellular and Biomolecular Research
University of Toronto, Canada
genes.toronto.edu



Title: Precision Medicine Using Computational Regulatory Models

Abstract: How should we assign a phenotype score to a genetic variant? Association studies and databases of labeled variants can be misleading, due to confounding factors and ascertainment biases. A very different approach is to derive computational regulatory models that can predict the causal effects of genetic variation on gene expression. We trained a computational model of splicing regulation that takes as input a DNA sequence and a tissue label, and outputs the distribution over spliced transcripts. By scoring variants by how much they are predicted to alter the transcript distribution, we identified previously unknown genetic determinants of autism, spinal muscular atrophy and cancers. An evaluation of over 650,000 variants, including deep intronic mutations and synonymous exonic mutations, reveals widespread splicing misregulation, especially among variations previously linked with disease. These results point toward a new, regulatory code-based era of precision medicine.

Biography: Brendan Frey is a Professor at the University of Toronto, with appointments in Engineering and Medicine. He conducts research in the fields of genome biology and machine learning, and is best known for his work on the splicing code, affinity propagation and factor graphs. Brendan holds the Canada Research Chair in Biological Computation, and is a Fellow of the Canadian Institute for Advanced Research, the American Association for the Advancement of Science, and the Institute of Electrical and Electronics Engineers. He has received several distinctions, including the John C Polanyi Award, the EWR Steacie Fellowship, and Canada’s Top 40 Leaders Under 40 Award. He has consulted for several industrial research and development laboratories in Canada, the United States and England, and he is currently on the Technical Advisory Board of Microsoft Research. His former students and postdoctoral fellows include professors, industrial researchers and developers at universities and industrial laboratories from across Canada, the United States and Europe.

...............................................................................................................................

Amos Tanay
Department of Computer Science and Applied Mathematics
The Weizmann Institute
Rehovot, Israel
http://compgenomics.weizmann.ac.il/tanay/



Title: Inferring Gene Regulation in the Single Cell Era

Abstract: Single cell methods for characterizing the transcriptional, epigenomic and chromosomal states of complex cell populations within tissues pose exciting challenges and opportunities for computational biologists. First, the new experiments take snapshots of individual regulatory states instead of averaging states over millions of cells and are therefore compatible with computational approaches to infer regulatory networks. Second, data on the epigenetic markup and three-dimensional conformations of functional elements within chromosomes generate strong mechanistic priors for model inference. Methodologies for characterizing cell states using single-cell datasets are therefore important, but their development is still challenging, as illustrated by recent experiments performed in our group.

Biography: Amos Tanay is an Associate Professor and Kimmel investigator in the department of Computer Science and the department of Biological Regulation at the Weizmann Institute. Amos’s background is in Mathematics, and he spent several years in the Israeli start-up industry before coming back to Tel-Aviv University and completing his PhD in Computational Biology. He did postdoctoral training at Rockefeller University and later established his own research group at Weizmann. The Tanay group is combining computational and experimental work to study genomic and epigenomic regulation at multiple scales, from the nucleotide level and up to the physical conformations of entire chromosomes. By developing quantitative, high-resolution experiments, the group explores how heterogeneous populations of single cells within tissues acquire, memorize, and later modify their functional states.

...............................................................................................................................

Gene Yeo
Cellular and Molecular Medicine
University of California, San Diego
United States
http://yeolab.ucsd.edu/yeolab/Gene_Yeo.html



Title: Insights from transcriptome-wide RNA binding protein-RNA networks

Abstract: RNA binding proteins mediate post-transcriptional RNA regulation that, when altered, results in severe neurological diseases. I will discuss my lab's ongoing efforts in mapping these protein-RNA interactions and our process of studying how misregulation results in neurodegenerative disorders.

Biography: Dr. Gene Yeo is an expert in the area of RNA, genomics and neurological diseases. Dr. Yeo obtained a bachelor of science in chemical engineering and a bachelor of arts in economics from the University of Illinois, Urbana-Champaign (1998) and a master's degree in business administration from the Rady School of Management at the University of California, San Diego (2008). Funded by the prestigious Lee Kuan Yew Graduate Fellowship from Singapore, Dr. Yeo earned a Ph.D. in Computational Neuroscience (2005) from the Massachusetts Institute of Technology under the joint guidance of Dr. Tomaso Poggio and Dr. Christopher Burge. Using comparative genomics and statistical learning theory, Dr. Yeo pioneered new computational approaches to attack the problem of splicing and splicing-mediated gene regulation. In 2005 Dr. Yeo was appointed the first Junior Fellow at the Crick-Jacobs Center for Theoretical and Computational Biology at the Salk Institute under the mentorship of Dr. Fred Gage and Dr. Sean Eddy. Dr. Yeo’s collaborative nature has generated successful projects and grants with experts in neuroscience and neurodegeneration (Dr. Fred Gage and Dr. Don Cleveland), RNA processing (Nobel Laureate Dr. Phillip Sharp, Dr. Manuel Ares, Jr, Dr. Brenton Graveley, Dr. Xiangdong Fu and Dr. Amy Pasquinelli) and virology (Dr. Deborah Spector). In late 2008, Dr. Yeo was appointed an Assistant Professor in the Department of Cellular and Molecular Medicine at UCSD. In 2011, Dr. Yeo was awarded the Alfred P Sloan Fellowship in recognition of his work in computational molecular biology. In 2014, Dr. Yeo was promoted with tenure to Associate Professor at UCSD. Since 2003, Dr. Yeo has authored over 60 peer-reviewed publications, invited book chapters and review articles in the areas of neurodegeneration, RNA processing, computational biology and stem cell models. Dr. Yeo has successfully authored 4 and co-authored 2 grants from the California Institute of Regenerative Medicine totaling $8.5 million. The National Institutes of Health, ALS Association, Genentech and Roche Pharmaceuticals fund Dr. Yeo’s work. Dr. Yeo actively serves as a bioinformatics and business consultant to biotech and pharmaceutical companies, and has been involved with start-ups in the biotechnology space. Dr. Yeo is on the Editorial Board of the journals Cell Reports and Cell Research. Dr. Yeo is a Visiting Associate Professor at the National University of Singapore, an Adjunct Senior Research Scientist at the Genome Institute of Singapore and a visiting researcher at the Molecular Engineering Laboratory under Nobel Laureate Sydney Brenner’s auspices. Dr. Yeo was a Sword of Honor recipient (the highest honor) in Officer Cadet School in 1999 and has served in the Singapore Navy as a Naval officer.

Systems Biology



James R. Heath
Elizabeth W. Gilloon Professor of Chemistry
California Institute of Technology
Pasadena, United States
www.its.caltech.edu/~heathgrp/Members.html


Title: Single Cell Functional Proteomics: Experimental Methods and Analysis Algorithms that Draw from the Physico-Chemical Laws

Abstract: Certain single cell biology tools, especially those that measure cellular function (e.g. functional proteins or metabolites), and are truly quantitative (molecules are measured in copy numbers per cell) can provide a conduit between the complexity of biology and the simplicity and predictive nature of the physicochemical laws. Using cancer cells and tumor models as an example, I will describe experimental approaches and associated computational algorithms for investigating cellular transitions, such as the chemically induced carcinogenesis transition in epithelial cells. The algorithms utilize statistical physics methods that are derived from thermodynamics considerations. I will extend this approach towards tools designed to anticipate the development of therapy resistance in tumor models, and to identify combination therapies that can avoid such resistance, leading to more favorable outcomes.

Biography: Jim Heath is the Elizabeth Gilloon Professor and Professor of Chemistry at Caltech, and Professor of Molecular and Medical Pharmacology at UCLA. He directs the National Cancer Institute funded NSB Cancer Center. He received his Ph.D. in 1988 from Rice University where he was the principal graduate student involved in the discovery of C60 and the fullerenes. He was a Miller Fellow at UC Berkeley before joining the research staff at IBM Watson Labs in 1991. He took a faculty position at UCLA in 1994, and moved to Caltech in 2003. He has received the Irving Weinstein Award from the AACR and the Sackler Prize in the Physical Sciences. He was named by Forbes in 2009 as one of the top 7 innovators in the world.

...............................................................................................................................
Garry Nolan
Baxter Laboratory in Genetic Pharmacology

Department of Microbiology and Immunology
Stanford University, United States
www.stanford.edu/group/nolan/



Title: High Parameter Single Cell Analysis Defines a Structured Immune System & Cancer Hierarchies


Abstract: High parameter single cell analysis has driven deep understanding of immune processes. Using a next-generation single-cell “mass cytometry” platform we quantify surface and cytokine or drug responsive indices of kinase target with 45 or more parameter analysis (e.g. 45 antibodies, viability, nucleic acid content, and relative cell size). We have recently extended this parameterization to mRNA with the capability to measure down to 5 molecules per cell in combination with any other set of previously created markers. I will present evidence of deep internal order in immune functionality demonstrating that differentiation and immune activities have evolved with a definable “shape”. This shape is altered during immune surveillance and “imprinted” during, and after, pathogen attack, traumatic injury, or auto-immune disease. Hierarchies of functionally defined trans-cellular modules are observed that can be used for mechanistic and clinical insights. Similarly, such order can be discerned in cancer, and the boundary conditions for cancers can be readily defined. Approaches such as those presented for single cell proteomics will eventually be applicable to single cell genomics, and especially so when the latter technologies can also accomplish 50,000 to 10^6 cells analyzed per experiment.

Biography: Dr. Nolan is the Rachford and Carlota A. Harris Professor in the Department of Microbiology and Immunology at Stanford University School of Medicine. He trained with Leonard Herzenberg (for his Ph.D.) and Nobelist Dr. David Baltimore (for postdoctoral work on the first cloning/characterization of NF-κB p65/RelA and the development of rapid retroviral production systems). He has published over 180 research articles, holds 17 US patents, and has been honored as one of the top 25 inventors at Stanford University.

Dr. Nolan is the first recipient of the Teal Innovator Award (2012) from the Department of Defense (a $3.3 million grant for advanced studies in ovarian cancer) and the first recipient of an FDA BAAA “Bio-agent protection” grant ($3 million) for a “Cross-Species Immune System Reference”. He also received the award for “Outstanding Research Achievement in 2011” from the Nature Publishing Group for his development of CyTOF applications in the immune system.

Dr. Nolan is an outspoken proponent of translating public investment in basic research to serve public welfare. He was the founder of Rigel Inc. (NASDAQ: RIGL) and Nodality, Inc., a diagnostics development company, serves on the Boards of Directors of several companies, and consults for other biotechnology companies. DVS Sciences, for which he was Chair of the Scientific Advisory Board, was recently sold to Fluidigm for $207 million (1/2014) on an investment of $14 million.

His areas of research include hematopoiesis, cancer and leukemia, autoimmunity and inflammation, and computational approaches for network and systems immunology. Dr. Nolan’s most recent efforts are focused on a single cell analysis advance using a mass spectrometry-flow cytometry hybrid device, the so-called “CyTOF”. The approach uses an advanced ion plasma source to determine the levels of tagged reagents bound to cells, enabling a vast increase in the number of parameters that can be measured per cell. His laboratory has already begun a large scale mapping of the hematopoietic hierarchy in healthy human bone marrow at an unprecedented level of detail. Dr. Nolan’s efforts aim to enable a deeper understanding not only of normal immune function, trauma, and other inflammatory events but also of the detailed substructures of leukemias and solid cancers, which should in turn support better management of disease and improved clinical outcomes.

------------------------------------------------------------------------------------------------------------

Our areas of research include hematopoiesis, cancer and leukemia, autoimmunity and inflammation, and computational approaches for network and systems immunology. Our most recent efforts are focused on a single cell analysis advance using a mass spectrometry-flow cytometry hybrid device, the so-called “CyTOF”. The approach uses an advanced ion plasma source to determine the levels of tagged reagents bound to cells, enabling a vast increase in the number of parameters that can be measured per cell. Our laboratory has already begun a large scale mapping of the hematopoietic hierarchy in healthy human bone marrow at an unprecedented level of detail. Our efforts aim to enable a deeper understanding not only of normal immune function, but also of the detailed substructures of leukemias and solid cancers, as well as of autoimmunity and pathogen effects upon the immune system.

...............................................................................................................................

Dana Pe'er
Department of Biological Sciences and Systems Biology

Columbia University

New York, United States

www.c2b2.columbia.edu/danapeerlab/html/


Title: Computational Dissection of Phenotypic and Functional Heterogeneity in Cancer

Abstract: Cells within a single tumor are known to display extensive phenotypic and functional heterogeneity. Many life-threatening features of cancer, including drug resistance, metastasis and relapse, are facets of intratumor heterogeneity. With emerging single-cell measurement technologies, the field is poised to make important strides in understanding and controlling this heterogeneity. However, these technologies require advances in analytical methods to interpret the complex data they produce.

Using mass cytometry, which measures single cells in ~31 simultaneous proteomic features, we developed novel methods for analyzing phenotypic heterogeneity in cancer. We use AML as an example to demonstrate the power of our approach. The heart of our approach is Phenograph, a graph-based representation of single cells which represents the phenotypic structure of the sample and can be partitioned into subsets of densely interconnected nodes, called communities. Using Phenograph, we deconstructed several AML samples into discrete phenotypes. Analyzing the resulting subpopulations provided insights into the functional heterogeneity of AML. Phenograph can be applied to characterize heterogeneity and primitive subpopulations in additional cancers.
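The graph-based idea described above can be illustrated with a toy sketch. It assumes each cell is a short vector of marker intensities, links every cell to its k nearest neighbours, and then groups cells by connected components. Connected components are used here only as a deliberately crude stand-in for community detection; the abstract does not describe Phenograph's actual algorithm, and the function names and toy coordinates below are hypothetical.

```python
import math


def knn_graph(points, k):
    """Undirected graph linking each point to its k nearest neighbours (Euclidean)."""
    edges = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        # Sort all other points by distance to point i.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges[i].add(j)
            edges[j].add(i)
    return edges


def connected_components(edges):
    """Group nodes that are reachable from one another (crude stand-in for communities)."""
    seen, groups = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            group.add(v)
            stack.extend(edges[v] - seen)
        groups.append(group)
    return groups


# Toy data: two well-separated groups of "cells" measured on two hypothetical markers.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
graph = knn_graph(cells, k=2)
groups = connected_components(graph)
```

On real data the neighbourhood graph is usually a single connected piece, which is why a proper community-detection step would be needed in place of the component search.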

Biography: Dana Pe’er is an associate professor in the Departments of Biological Sciences and Systems Biology. Our team develops computational methods that integrate diverse high-throughput data to provide a holistic, systems-level view of molecular networks. Currently we have two key focuses: developing computational methods to interpret single cell data and understand cellular heterogeneity; and modeling how genetic and epigenetic variation alters regulatory network function and subsequently phenotype in health and disease.

This path has led us to explore how systems biology approaches can be used to personalize cancer care. Dana is a recipient of the Burroughs Wellcome Fund Career Award, NIH Director's New Innovator Award, NSF CAREER award, and Stand Up To Cancer Innovative Research Grant, and is a Packard Fellow in Science and Engineering. Dana received the ISCB 2014 Overton Award at its annual ISMB Conference.

...............................................................................................................................

Ilya Shmulevich
Institute for Systems Biology
Seattle, United States
http://shmulevich.systemsbiology.net/




Title: Integrative Analysis of Data from The Cancer Genome Atlas

Abstract: The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to improve our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. I will describe our efforts, as a Genome Data Analysis Center within TCGA, to integrate the highly heterogeneous molecular and clinical data collected from thousands of cancer patients spanning over 30 tumor types. This work involves the identification of statistical associations in the data and the development of web-based tools to interactively explore these associations. We further integrate the interdependencies in the data with other information from public biomedical resources by constructing and analyzing large heterogeneous graphs. These analyses are helping to accelerate the scientific progress of the disease working groups in TCGA and are providing unprecedented opportunities to use these comprehensive data sets for clinical and therapeutic applications, with the ultimate goal of improving our ability to diagnose, treat and prevent cancer.

Biography: Ilya Shmulevich received his Ph.D. in Electrical and Computer Engineering from Purdue University, West Lafayette, IN, in 1997. From 1997-1998, he was a postdoctoral researcher at the Nijmegen Institute for Cognition and Information at the University of Nijmegen and National Research Institute for Mathematics and Computer Science at the University of Amsterdam in The Netherlands, where he studied computational models of music perception and recognition. In 1998-2000, he worked as a senior researcher at the Tampere International Center for Signal Processing at the Signal Processing Laboratory in Tampere University of Technology, Tampere, Finland. From 2001-2005, he was an Assistant Professor at the Cancer Genomics Laboratory in the Department of Pathology at The University of Texas M. D. Anderson Cancer Center and an Adjunct Professor in the Department of Statistics in Rice University. Presently, he is a Professor at The Institute for Systems Biology, where he directs a Genome Data Analysis Center that is part of The Cancer Genome Atlas (TCGA) project. He is an Affiliate Professor in the Departments of Bioengineering and Electrical Engineering at the University of Washington, Department of Signal Processing in Tampere University of Technology, Finland, and Department of Electronic and Electrical Engineering in Strathclyde University, Glasgow, UK. He is an Associate Editor of EURASIP Journal on Bioinformatics and Systems Biology and a Senior Member of the IEEE. His research interests include systems biology, nonlinear signal and image processing, and computational learning theory.



ACCEPTED PAPERS

Updated Oct 28, 2014


The following papers will be presented as talks during the conference.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

A validated gene regulatory network and GWAS to identify early transcription factors in T-cell associated diseases

Mika Gustafsson1, Danuta Gawel1, Sandra Hellberg1, Aelita Konstantinell1, Daniel Eklund1, Jan Ernerudh1, Antonio Lentini1, Robert Liljenström1, Johan Mellergård1, Hui Wang2, Colm E. Nestor1, Huan Zhang1 and Mikael Benson1

1Linköpings Universitet, 2MD Anderson Cancer Center

The identification of early regulators of disease is important for understanding disease mechanisms, as well as finding candidates for early diagnosis and treatment. Such regulators are difficult to identify because patients generally present when they are symptomatic, after early disease processes. Here, we present an analytical strategy to systematically identify early regulators by combining gene regulatory networks (GRNs) with GWAS. We hypothesized that early regulators of T-cell associated diseases could be found by defining upstream transcription factors (TFs) in T-cell differentiation. Time-series expression profiling identified upstream TFs of T-cell differentiation into Th1/Th2 subsets enriched for disease associated SNPs identified by GWAS. We constructed a Th1/Th2 GRN based on integration of expression, DNA methylation profiling and sequence-based predictions data using LASSO algorithm. The GRN was validated by ChIP-seq and siRNA knockdowns. GATA3, MAF and MYB were prioritized based on GWAS and the number of GRN predicted targets. The disease relevance was supported by differential expression of the TFs and their targets in profiling data from six T-cell associated diseases. We tested if the three TFs or their splice variants changed early in disease by exon profiling of two relapsing diseases, namely multiple sclerosis and seasonal allergic rhinitis. This showed differential expression of splice variants of the TFs during relapse-free asymptomatic stages. Potential targets of the splice variants were validated based on expression profiling and siRNA knockdowns. Those targets changed during symptomatic stages. Our results show that combining construction of GRNs with GWAS can be used to infer early regulators of disease.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Are all genetic variants in DNase I sensitivity regions functional?

Gregory A. Moyerbrailean1, Chris T. Harvey1, Cynthia A. Kalita1, Xiaoquan Wen2, Francesca Luca1, Roger Pique-Regi1

1Wayne State University, 2University of Michigan

A detailed mechanistic understanding of the direct functional consequences of DNA variation on gene regulatory mechanism is critical for a complete understanding of complex trait genetics and evolution. Here, we present a novel approach that integrates sequence information and DNase I footprinting data to predict the impact of a sequence change on transcription factor binding. Applying this approach to 653 DNase-seq samples, we identified 3,831,862 regulatory variants predicted to affect active regulatory elements for a panel of 1,372 transcription factor motifs. Using QuASAR, we validated the non-coding variants predicted to be functional by examining allele-specific binding (ASB). Combining the predictive model and the ASB signal, we identified 3,217 binding variants within footprints that are significantly imbalanced (20% FDR). Even though most variants in DNase I hypersensitive regions may not be functional, we estimate that 56% of our annotated functional variants show actual evidence of ASB. To assess the effect these variants may have on complex phenotypes, we examined their association with complex traits using GWAS and observed that ASB-SNPs are enriched 1.22-fold for complex traits variants. Furthermore, we show that integrating footprint annotations into GWAS meta-study results improves identification of likely causal SNPs and provides a putative mechanism by which the phenotype is affected.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

A scalable method for molecular network reconstruction identifies properties of targets and mutations in acute myeloid leukemia

Edison Ong1, Anthony Szedlak2, Yunyi Kang, Peyton Smith1, Nicholas Smith1, Madison McBride3, Darren Finlay3, Kristiina Vuori3, James Mason4, Edward D. Ball5, Carlo Piermarocchi2, Giovanni Paternostro3

1Salgomed, 2Michigan State University, 3Sanford-Burnham Medical Research Institute, 4Scripps Health, San Diego, 5University of California, San Diego

A key aim of systems biology is the reconstruction of molecular networks. However, we do not yet have networks that integrate information from all datasets available for a particular clinical condition. This is in part due to the limited scalability, in terms of required computational time and power, of existing algorithms. Network reconstruction methods should also be scalable in the sense of allowing scientists from different backgrounds to efficiently integrate additional data.

We present a network model of acute myeloid leukemia (AML). In the current version (AML 2.1) we have used gene expression data (both microarray and RNA-seq) from five different studies comprising a total of 771 AML samples and a protein-protein interactions dataset. Our scalable network reconstruction method is in part based on the well-known property of gene expression correlation among interacting molecules. The difficulty of distinguishing between direct and indirect interactions is addressed by optimizing the coefficient of variation of gene expression, using a validated gold standard dataset of direct interactions. Computational time is much reduced compared to other network reconstruction methods. A key feature is the study of the reproducibility of interactions found in independent clinical datasets.

An analysis of the most significant clusters, and of the network properties (intraset efficiency, degree, betweenness centrality, and PageRank) of common AML mutations demonstrated the biological significance of the network. A statistical analysis of the response of blast cells from eleven AML patients to a library of kinase inhibitors provided an experimental validation of the network. A combination of network and experimental data identified CDK1, CDK2, CDK4, CDK6, and other kinases as potential therapeutic targets in AML.
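Two of the network properties named above, degree and PageRank, can be sketched in a few lines. This is a minimal illustration on a hypothetical four-node undirected graph, not the authors' implementation; the damping factor of 0.85, the fixed iteration count, and the node names are all assumptions.

```python
def degree(adj):
    """Number of neighbours of each node in an adjacency dict."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on an adjacency dict {node: set_of_neighbours}."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline "teleport" share.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            nbrs = adj[v]
            if nbrs:
                share = damping * rank[v] / len(nbrs)
                for u in nbrs:
                    new[u] += share
            else:
                # A node without neighbours spreads its rank evenly.
                for u in nodes:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank


# Hypothetical star-shaped network: one hub linked to three other nodes.
toy_network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
degrees = degree(toy_network)
ranks = pagerank(toy_network)
```

In this toy graph the hub has both the highest degree and the highest PageRank. In larger networks the two measures can rank nodes differently.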

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

A cell lineage-specific regulatory network inferred using limited expression data of erythropoiesis

Fan Zhu1, Lihong Shi1, James Engel1, Yuanfang Guan1

1University of Michigan

Modeling regulatory networks using expression data observed in a differentiation process may help identify context-specific interactions. Despite intensive research efforts on this topic, the outcome of the current algorithms highly depends on the quality and quantity of a single time-course data, and the performance may be compromised for data with a limited number of samples. In this work, we report a novel multi-layer graphical model that is capable of leveraging heterogeneous, generic, publicly available time-course datasets, as well as limited cell lineage-specific data to model regulatory networks specific to a differentiation process. First, a collection of network inference methods are used to predict the regulatory relationships in individual datasets. Then, the inferred relationships are weighted and integrated together by evaluating against the cell lineage-specific data. To test the accuracy of this algorithm, we collected a time-course RNA-Seq dataset during human erythropoiesis to infer regulatory relationships specific to this differentiation process. The resulting erythroid-specific regulatory network reveals novel regulatory relationships activated in erythropoiesis, which were further validated by genome-wide TR4 binding studies using ChIP-seq. These erythropoiesis-specific regulatory relationships were not identifiable by single dataset-based methods or context-independent integrations. Analysis of the predicted targets reveals that they are all closely associated with hematopoietic lineage differentiation. In summary, this paper develops an integrative strategy that is capable of leveraging a limited, cell type-specific expression dataset and large-scale, generic time-course datasets to infer regulatory networks specific to a differentiation process, which is applicable to other cell lineages.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


cDREM: inferring dynamic combinatorial gene regulation

Aaron Wise1, Ziv Bar-Joseph1

1Carnegie Mellon University

Motivation: Genes are often combinatorially regulated by multiple transcription factors (TFs). Such combinatorial regulation plays an important role in development and facilitates the ability of cells to respond to different stresses. While a number of approaches have utilized sequence and ChIP based datasets to study combinational regulation, these have often ignored the combinational logic and the dynamics associated with such regulation.

Results: Here we present cDREM, a new method for reconstructing dynamic models of combinatorial regulation. cDREM integrates time series gene expression data with (static) protein interaction data. The method is based on a hidden Markov model and utilizes the sparse group Lasso to identify small subsets of combinatorially active TFs, their time of activation and the logical function they implement. We tested cDREM on yeast and human data sets. Using yeast we show that the predicted combinatorial sets agree with other high throughput genomic datasets and improve upon prior methods developed to infer combinatorial regulation. Applying cDREM to study human response to flu we were able to identify several combinatorial TF sets, some of which were known to regulate immune response while others represent novel combinations of important TFs.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Multi-species network inference improves gene regulatory network reconstruction for early embryonic development in Drosophila

Anagha Joshi1, Yvonne Beck1, Tom Michoel1

1The Roslin Institute, University of Edinburgh

Gene regulatory network inference uses genome-wide transcriptome measurements in response to genetic, environmental or dynamic perturbations to predict causal regulatory influences between genes. We hypothesized that evolution also acts as a suitable network perturbation and that integration of data from multiple closely related species can lead to improved reconstruction of gene regulatory networks. To test this hypothesis, we predicted networks from temporal gene expression data for 3,610 genes measured during early embryonic development in six Drosophila species, and compared predicted networks to gold standard networks of ChIP-chip and ChIP-seq interactions for developmental transcription factors in five species. We found that (i) the performance of single-species networks was independent of the species where the gold standard was measured; (ii) differences between predicted networks reflected the known phylogeny and differences in biology between the species; (iii) an integrative consensus network which minimized the total number of edge gains and losses with respect to all single-species networks performed better than any individual network. Our results show that in an evolutionarily conserved system, integration of data from comparable experiments in multiple species improves the inference of gene regulatory networks. They provide a basis for future studies on the numerous multi-species gene expression datasets for other biological processes available in the literature.
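The consensus idea in point (iii) can be illustrated with a small sketch. Assuming each single-species network is simply a set of directed (regulator, target) edges, and that every edge gain or loss costs the same (both assumptions, since the abstract does not specify the representation or any phylogenetic weighting), the edge set that minimizes the total number of gains and losses relative to all input networks is a per-edge majority vote. The function names and gene labels below are hypothetical.

```python
from collections import Counter


def consensus_network(networks):
    """Edge set minimizing the summed symmetric difference to all input networks."""
    counts = Counter(edge for net in networks for edge in net)
    n = len(networks)
    # Keep an edge when strictly more than half of the networks contain it.
    # Edges present in exactly half are dropped; keeping them would cost the same.
    return {edge for edge, c in counts.items() if 2 * c > n}


def total_edits(candidate, networks):
    """Total number of edge gains and losses between candidate and each network."""
    return sum(len(candidate ^ net) for net in networks)


# Three hypothetical single-species networks over placeholder gene names.
species_networks = [
    {("tf1", "g1"), ("tf1", "g2"), ("tf2", "g3")},
    {("tf1", "g1"), ("tf1", "g2")},
    {("tf1", "g1"), ("tf2", "g3"), ("tf3", "g2")},
]
consensus = consensus_network(species_networks)
cost = total_edits(consensus, species_networks)
```

The majority vote is optimal under these assumptions because each edge contributes to the total independently of every other edge.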

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Reconstruction of gene regulatory networks based on repairing sparse low-rank matrices

Young Hwan Chang1, Roel Dobbe1, Palak Bhushan1, Joe W. Gray2, Claire J. Tomlin1

1University of California, Berkeley, 2Oregon Health and Science University

With the growth of high-throughput proteomic data, in particular time series gene expression data from various perturbations, a general question that has arisen is how to organize inherently heterogenous data into meaningful structures. Since biological systems such as breast cancer tumors respond differently to various treatments, little is known about exactly how these gene regulatory networks (GRNs) operate under different stimuli. For example, when we apply a drug-induced perturbation to a target protein, we often only know that the dynamic response of the specific protein may be affected. We do not know by how much, how long and even whether this perturbation affects other proteins or not. Challenges due to the lack of such knowledge not only occur in modeling the dynamics of a GRN but also cause bias or uncertainties in identifying parameters or inferring the GRN structure. This paper describes a new algorithm which enables us to estimate bias error due to the effect of perturbations and correctly identify the common graph structure among biased inferred graph structures. To do this, we retrieve common dynamics of GRN subject to various perturbations. We refer to the task as “repairing” inspired by “image repairing” in computer vision. The method can automatically correctly repair the common graph structure across perturbed GRNs, even without precise information about the effect of the perturbations. We evaluate the method on synthetic data sets and demonstrate advantages over l1-regularized graph inference by advancing our understanding of how these networks respond across different targeted therapies.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Pathways on demand: automated reconstruction of human signaling networks

Anna Ritz1, Christopher Poirel1, Allison Tegge1, Nicholas Sharp1, Allison Powell1, Kelsey Simmons1, Shiv Kale1, T.M. Murali1

1Virginia Polytechnic Institute and State University

Signaling pathways are a cornerstone of systems biology. Several databases store representations of these pathways that are amenable for automated analyses. Despite painstaking manual curation, significant variations exist between databases. To overcome these limitations, we present PathLinker, a new computational method that can reconstruct a signaling pathway from a background protein interaction network given only the identities of the receptors and transcription factors and regulators in that pathway. We demonstrate that PathLinker can reconstruct the Wnt pathway in the NetPath database with much higher precision and recall than several state-of-the-art algorithms, recovering non-canonical branches that appear only in this pathway's representation in other databases. PathLinker suggests a surprising role for CFTR, a chloride ion channel transporter of the ABC class, in Wnt/beta-catenin signaling, which we validate using siRNA experiments. We extend our computational results to accurately reconstruct a comprehensive set of signaling pathways in the NetPath database. We demonstrate that PathLinker can bridge differing representations of the same pathway between databases.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Inferring the genome-wide functional modulatory network: a case study on the NF-κB/RelA transcription factor

Xueling Li1, Min Zhu2, Allan Brasier1, Andrzej Kudlicki1

1University of Texas Medical Branch at Galveston, 2Hefei Institutes of Physical Science, Chinese Academy of Sciences

How different pathways lead to the activation of a specific transcription factor with specific effects is not fully understood. A modulatory network is composed of triplets of a specific transcription factor, target genes and modulators. Modulators usually affect the activity of the specific transcription factor at the post-transcription level in a target gene-specific manner (action mode), which may be classified as enhancement, attenuation and inversion of the activation or inhibition. Reconstructing such a modulatory network will help to interpret how transcription factors produce distinct gene responses to different stimuli. As a case study, here we inferred, from a large collection of expression profiles, all potential modulations of NF-κB/RelA. The predicted modulators include many proteins previously not reported as physically binding to RelA. The functions of the predicted modulators are consistent with biological activities of NF-κB/RelA, including RNA processing, alternative splicing, cell cycle, mitochondrion, ubiquitin-dependent proteolysis and ribosome biogenesis, and are consistent with binding modulators in our previous study. The predicted genome-wide RelA modulators from different enriched pathways or processes exert specific prevalent action modes on distinct pathways through RelA. Also, the modulators from non-coding RNA (ncRNA), RNA binding proteins, transcription factors, cytoskeleton, and kinases modulate the NF-κB/RelA activity with specific action modes consistent with their molecular functions and modulation level. Finally, we analyzed the modulatory network of NF-κB/RelA in the context of TGFB1 induced epithelial-mesenchymal transition (EMT). Here modulators of NF-κB/RelA included those involved in extracellular matrix (FBN1), cytoskeletal regulation (ACTN1) and tumor suppression (FOXP1).

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Systematic study of synthetic transcript features in S. cerevisiae exposes gene-expression determinants

Tuval Ben-Yehezkel1, Shimshi Atar2, Tzipy Marx1, Rafael Cohen1, Alon Diament2, Alexandra Dana2, Anna Feldman2, Ehud Shapiro1, Tamir Tuller2

1Weizmann Institute of Science, 2Tel Aviv University

A major challenge in functional genomics is understanding how different parts of the transcript affect aspects of its expression. Heterologous gene expression can potentially contribute to this research topic, but has rarely been studied systematically, specifically in eukaryotes. Here, we use a synthetic biology approach to study the distinct and causal effect of different parts of the transcript in the eukaryote S. cerevisiae. We generated three distinct reporter libraries of the viral HRSVgp04 gene for studying the effect of three distinct regions in the transcript; (1) the 5'UTR, (2) the first 40 codons, and (3) codons 42-81 of the ORF. Each of the three libraries contained variants with multiple, rationally designed synonymous mutations, totaling 383 distinct variants tested individually for gene expression. Our results show that while synonymous mutations in each of the three regions can have a dramatic effect on protein abundance, those closer to the 5’end of the ORF are the most effective modulators of protein abundance. Additionally, while weaker local mRNA folding at the beginning of the ORF (codons 1-8) increases protein abundance, it decreases protein abundance when present in downstream codons, reinforcing previous evolutionary studies demonstrating the selection of folding strength in different parts of the ORF. Finally, we show that the mean relative codon decoding time, based on ribosomal densities in endogenous genes, significantly correlates with our measured protein abundance (correlation up to r = 0.6175; p=0.0013). While this report provides an improved understanding of transcript evolution and gene expression regulation, it also suggests relatively simple rules for engineering synthetic gene expression in a eukaryote.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

A canonical correlation analysis based dynamic Bayesian network prior to infer gene regulatory networks from multiple types of biological data

Brittany Baur1, Serdar Bozdag1

1Marquette University

One of the challenging and important computational problems in systems biology is to infer gene regulatory networks of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer gene regulatory networks. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions with high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in gene regulatory network inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in a dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Disease gene prioritization using network and feature

Bingqing Xie1, Gady Agam1, Sandhya Balasubramanian2, Jinbo Xu3, Natalia Maltsev2, Conrad Gilliam2, Daniela Boernigen2

1Illinois Institute of Technology, 2University of Chicago, 3Toyota Technological Institute of Chicago

Using traditional experimental methods to identify the most promising candidate genes contributing to disease phenotypes among the large lists of variations produced by high-throughput genomics is time-consuming and costly. Computational approaches that use existing biological knowledge to prioritize such candidate genes can therefore improve the efficiency and accuracy of the analysis of biomedical data. They can also reduce the cost of studies by avoiding experimental validation of irrelevant candidates. To prioritize candidate genes contributing to a disease or phenotype of the user’s interest for further testing, in this study, we present a novel algorithm that utilizes both types of information sources, gene annotations and gene interactions, simultaneously, while preserving their original representation using a Conditional Random Field (CRF) model. We further improve the accuracy and efficiency of our proposed approach by assigning enrichment scores to the annotation feature factors within the model. To estimate the performance of our approach, we evaluated it on two independent benchmark studies, ranking the candidate genes by both network and feature knowledge. Our results overall had high Area Under Curve (AUC) values and high partial AUC (pAUC) values on various disease benchmarks and revealed a higher accuracy and precision at the top predictions (10%) as compared with other prioritization tools. Additionally, we applied our method to a case study for the prediction of molecular mechanisms contributing to intellectual disability and autism. Our method was able to recover additional genes related to both disorders and provide suggestions for possible candidates based on their rankings and functional categories.
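The evaluation measures mentioned above can be made concrete with a short sketch. It computes AUC as the probability that a known disease gene is ranked above a non-disease gene, and precision among the top 10% of ranked candidates. Partial AUC is not covered. The scores and labels are invented toy values, and the function names are hypothetical; this is not the benchmark code used in the study.

```python
def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def precision_at_top(scores, labels, fraction=0.10):
    """Fraction of true positives among the top-ranked `fraction` of candidates."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    return sum(1 for _, y in ranked[:k] if y) / k


# Toy ranking of ten candidate genes; label 1 marks a known disease gene.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_top = precision_at_top(toy_scores, toy_labels)
```

The pairwise AUC computation is quadratic in the number of candidates, which is fine for a toy example; a rank-based formula would be preferable for long gene lists.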



 

CYTOSCAPE WORKSHOPS
Sold Out (for both workshops)

Updated Nov 4, 2014


November 09, 2014

1:00 pm – 5:00 pm

Subject to space availability as these workshops will sell out.

To register follow instructions available at: http://www.iscb.org/recomb-regsysgen2014-register

Choose between either Cytoscape User or Cytoscape App Developer workshops:

  • Workshop 01: The Cytoscape User Workshop (Max attendance: 45)

    Room: East Coast

    The network perspective on biology aims to bring meaningful context to high-throughput data for exploratory analysis, interpretation and hypothesis generation. As a free and open source tool, Cytoscape has become one of the most popular network visualization and analysis tools in the biological sciences. It is now cited in over 500 publications per year and downloaded close to 8,000 times per month. This workshop will provide a general introduction to network biology studies and Cytoscape concepts, including a hands-on session for universal data import and a demonstration of a few of the over 200 freely available apps contributed by the Cytoscape developer community.

    By the end of this workshop, you should be able to:
    •     Import any tabular data in Cytoscape
    •     Integrate your data with public sources of networks, pathways and other datasets
    •     Master layout and data visualization

    Instructor:  Alexander R. Pico, Gladstone Institutes, San Francisco, United States


  • Workshop 02: The App Developer Workshop (Max attendance: 20)

    Room:  Embarcadero

    Cytoscape's real power lies in the ecosystem of community-developed apps. The most common types of apps provide access to third-party biological databases, customize data import for domain-specific data sets, and perform custom analyses and workflows. Browse the full collection at http://apps.cytoscape.org. During this workshop, we will demonstrate how to develop apps for Cytoscape, targeting individuals who want to take advantage of the network visualization and analysis capabilities of Cytoscape and extend it for custom use cases.

    By the end of this workshop, you should be able to:
    •     Navigate the complete Cytoscape API
    •     Set up an app development environment and cycle
    •     Start your own app development project from scratch
    •     Edit and contribute to other open source Cytoscape app projects
    Instructor:  John "Scooter" Morris,  University of California San Francisco, United States


TRAVEL AWARDS


FASEB MARC Funding (USA Citizen/Permanent Resident of the USA required)

The FASEB MARC Program provides funding for travel awards to support the participation of graduate students and postdoctoral fellows selected to give poster or platform (oral) presentations at the RECOMB / ISCB  RSG DREAM 2014.  The travel awards help to defray meeting registration and travel-related expenses (lodging, transportation, per diem) for eligible graduate students and postdoctoral fellows from underrepresented groups in the biomedical and behavioral sciences.  Full-time graduate students and postdoctoral fellows meeting the citizenship/residency requirements at accredited postsecondary minority institutions are also eligible to apply for the travel awards.

The FASEB MARC Travel Award Application details are available at: www.faseb.org/MARC-and-Professional-Development/Travel-Awards.aspx

Applications to the FASEB MARC Program are encouraged from all who meet the FASEB MARC eligibility criteria.


 

 

CONFERENCE COMMITTEE


CONFERENCE CHAIRS

  • Andrea Califano
    Columbia University, New York, United States
  • Manolis Kellis
    Massachusetts Institute of Technology
    Cambridge, United States
  • Gustavo Stolovitzky
    IBM Computational Biology Center
    Yorktown Heights, United States

PROGRAM COMMITTEE CO-CHAIRS

  • Christina Leslie
    Memorial Sloan-Kettering Cancer Center
    New York, United States
  • Lonnie Welch
    Ohio University
    Athens, United States

PROGRAM COMMITTEE

Updated August 05, 2014


  • Stein Aerts, University of Leuven
  • M. Madan Babu, MRC Laboratory of Molecular Biology
  • Ziv Bar-Joseph, Carnegie Mellon University
  • Panayiotis (Takis) Benos, University of Pittsburgh
  • Mark Biggin, Lawrence Berkeley National Laboratory
  • Michael R. Brent, Washington University in St. Louis
  • Harmen Bussemaker, Columbia University
  • Jim Collins, Boston University
  • Diego Di Bernardo, Telethon Institute of Genetics and Medicine
  • Barry Demchak, University of California San Diego
  • Finn Drablos, Norwegian University of Science and Technology
  • Dean Felsher, Stanford University
  • Ana Freitas, INESC-ID
  • Sridhar Hannenhalli, University of Pennsylvania
  • Alex Hartemink, Duke University
  • Uri Keich, University of Sydney
  • Seungchan Kim, Translational Genomics Research Institute & Arizona State University
  • Yuval Kluger, Yale University
  • Reinhard Laubenbacher, University of Connecticut Health Center
  • Avi Ma'Ayan, Mount Sinai School of Medicine
  • Satoru Miyano, University of Tokyo
  • Quaid Morris, University of Toronto
  • T. M. Murali, Virginia Polytechnic Institute and State University
  • Alexander Pico, Gladstone Institutes
  • Dana Pe'er, Columbia University
  • Theodore Perkins, Ottawa Health Research Institute
  • Jan Prins, University of North Carolina
  • Miguel Angel Pujana, Institut d'Investigació Biomédica de Bellvitge (IDIBELL)
  • Raul Rabadan, Columbia University
  • Isidore Rigoutsos, Thomas Jefferson University
  • Sushmita Roy, The University of Wisconsin, Madison
  • Alexander Schliep, Rutgers University
  • Ron Shamir, Tel Aviv University
  • Ilya Shmulevich, Institute for Systems Biology
  • Saurabh Sinha, University of Illinois
  • Pavel Sumazin, Baylor College of Medicine
  • Martin Vingron, Max Planck Institute for Molecular Genetics
  • Weixiong Zhang, Washington University in St. Louis
  • Sheng Zhong, University of California, San Diego


SYSTEMS BIOLOGY PRESENTATIONS & ABSTRACTS

Presented Tuesday, November 11 and Wednesday, November 12



TUESDAY, NOVEMBER 11



1:55 pm – 2:15 pm


SB T01
A cell lineage-specific regulatory network inferred using limited expression data of erythropoiesis


Fan Zhu1, Lihong Shi1, James Engel1, Yuanfang Guan1

1University of Michigan

Modeling regulatory networks using expression data observed in a differentiation process may help identify context-specific interactions. Despite intensive research efforts on this topic, the outcome of current algorithms depends heavily on the quality and quantity of a single time-course dataset, and performance may be compromised for data with a limited number of samples. In this work, we report a novel multi-layer graphical model that is capable of leveraging heterogeneous, generic, publicly available time-course datasets, as well as limited cell lineage-specific data, to model regulatory networks specific to a differentiation process. First, a collection of network inference methods is used to predict the regulatory relationships in individual datasets. Then, the inferred relationships are weighted and integrated by evaluating them against the cell lineage-specific data. To test the accuracy of this algorithm, we collected a time-course RNA-Seq dataset during human erythropoiesis to infer regulatory relationships specific to this differentiation process. The resulting erythroid-specific regulatory network reveals novel regulatory relationships activated in erythropoiesis, which were further validated by genome-wide TR4 binding studies using ChIP-seq. These erythropoiesis-specific regulatory relationships were not identifiable by single dataset-based methods or context-independent integrations. Analysis of the predicted targets reveals that they are all closely associated with hematopoietic lineage differentiation. In summary, this paper develops an integrative strategy that is capable of leveraging a limited, cell type-specific expression dataset and large-scale, generic time-course datasets to infer regulatory networks specific to a differentiation process, and that is applicable to other cell lineages.

...............................................................................................................................
Tuesday, November 11
2:15 pm – 2:35 pm

SB T02

FAST-SL: An efficient algorithm to identify synthetic lethal reaction/gene sets in metabolic networks

Aditya Pratapa1, Shankar Balachandran1, Karthik Raman1

1Indian Institute of Technology Madras

Synthetic lethal reaction/gene sets are sets of reactions/genes where only the simultaneous removal of all reactions/genes in the set abolishes growth of an organism. In silico, synthetic lethal sets can be identified by simulating the effect of removal of reaction/gene sets from the reconstructed genome-scale metabolic network of an organism. Previous approaches to identifying synthetic lethal reactions in genome-scale metabolic networks have built on the framework of Flux Balance Analysis (FBA), extending it either to exhaustively analyze all possible combinations of reactions, or to formulate the problem as a bi-level Mixed Integer Linear Programming (MILP) problem.

FAST-SL circumvents the complexity of both exhaustive enumeration and the bi-level MILP by iteratively reducing the search space and the computational time involved in identification of synthetic lethal reaction sets. FAST-SL, while considering all possible phenotypes and all parts of metabolism, efficiently identifies the targeted phenotypes. Our algorithm shows more than a 4000-fold reduction in search space over exhaustive enumeration of triple lethal sets for the Escherichia coli iAF1260 model. Unlike the previous methods used for identification of lethal reaction sets, FAST-SL uses the sparsest solution obtained by solving the flux balance constraints of a metabolic network, which is a linear programming problem, to eliminate reaction combinations that do not lead to a lethal phenotype, thereby reducing the search space for identifying lethal reaction sets.

As our algorithm finds application in the identification of combinatorial drug targets, in this study we performed synthetic reaction and gene lethality analysis for genome-scale reconstructions of Salmonella enterica typhimurium and Mycobacterium tuberculosis. We validated the reaction lethals obtained using FAST-SL against exhaustive enumeration of reaction deletions up to the order of two for these organisms. The triple lethal reactions obtained for Escherichia coli using FAST-SL exactly match the results of exhaustive enumeration performed on a high-performance computer cluster. Our results also completely agree with those of the SL Finder algorithm (Suthers, P.F. et al (2009). Mol Syst Biol, 5:301); notably, our algorithm is substantially faster. Further, we also present a mathematical proof for the correctness of our algorithm.

Overall, FAST-SL is a powerful tool to identify the lethal reaction/gene sets, through a massive reduction in the search space over an exhaustive enumeration approach and the SL Finder algorithm. We believe that our algorithm presents an important advance and can enable the rapid enumeration of synthetic lethal reaction/gene sets in genome-scale metabolic networks.

Availability: The MATLAB implementation of our algorithm (compatible with the COBRA toolbox v2.0, a popular toolbox for constraint-based analysis of metabolic networks) is freely available from: https://home.iitm.ac.in/kraman/lab/research/fast-sl.
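
The search-space reduction described in this abstract rests on one observation: if a flux distribution that supports growth carries zero flux through a reaction, deleting that reaction leaves the distribution feasible, so the deletion cannot be lethal. The sketch below illustrates that pruning on a made-up five-reaction network using SciPy's `linprog`. It is not the authors' MATLAB implementation: the network, bounds, and 1% growth cutoff are hypothetical, and it uses whichever optimal solution the solver returns rather than the sparsest one.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns are reactions R0..R4, rows are metabolites A, B, C:
# R0: -> A, R1: A -> B, R2: A -> B (parallel route), R3: B -> C, R4: C -> (biomass)
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  1, -1,  0],   # B
    [0,  0,  0,  1, -1],   # C
], dtype=float)
BIOMASS = 4
N_RXN = S.shape[1]


def fba(deleted=()):
    """Maximize biomass flux subject to S v = 0; deleted reactions are fixed to 0."""
    bounds = [(0.0, 0.0) if j in deleted else (0.0, 10.0) for j in range(N_RXN)]
    c = np.zeros(N_RXN)
    c[BIOMASS] = -1.0                      # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    return -res.fun, res.x


def active(flux, tol=1e-9):
    """Non-biomass reactions carrying flux in a given solution."""
    return {j for j in range(N_RXN) if j != BIOMASS and abs(flux[j]) > tol}


def lethal_sets_pruned(cutoff_frac=0.01):
    """Single and double lethals, testing only reactions active in some FBA solution."""
    wt_growth, wt_flux = fba()
    cutoff = cutoff_frac * wt_growth
    support = active(wt_flux)
    singles = {i for i in support if fba((i,))[0] < cutoff}
    doubles = set()
    for i in support - singles:
        _, flux_i = fba((i,))              # a solution that survives deleting i
        for j in active(flux_i) - singles - {i}:
            if fba((i, j))[0] < cutoff:
                doubles.add(frozenset((i, j)))
    return singles, doubles


def lethal_sets_exhaustive(cutoff_frac=0.01):
    """Reference answer: test every single deletion and every pair of viable ones."""
    wt_growth, _ = fba()
    cutoff = cutoff_frac * wt_growth
    rxns = [j for j in range(N_RXN) if j != BIOMASS]
    singles = {j for j in rxns if fba((j,))[0] < cutoff}
    viable = [j for j in rxns if j not in singles]
    doubles = {frozenset(p) for p in combinations(viable, 2) if fba(p)[0] < cutoff}
    return singles, doubles


print(lethal_sets_pruned())
print(lethal_sets_exhaustive())
```

On this toy network both functions should report R0 and R3 as single lethals and the parallel pair {R1, R2} as the only double lethal. The pruned version skips any reaction or pair that an existing growth-supporting solution already shows to be dispensable, which is where the savings over exhaustive enumeration come from on genome-scale models.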

...............................................................................................................................
Tuesday, November 11
2:35 pm – 2:55 pm

SB T03
Trafficking and signaling interplay modeling after serotonin receptor activation

Aurélien Rizk1, Mauno Schelb1, Milica Bugarski1, Maysam Mansouri1, Gebhard Schertler1, Philipp Berger1

1Paul Scherrer Institute

Despite the physiological and pharmacological importance of G protein-coupled receptors (GPCRs), receptor activation and its translation into cytoplasmic trafficking and cellular response remain elusive. In this project, we study the interplay between signaling and trafficking of serotonin receptors 5-HT2c after stimulation. We use RAB GTPases as markers of intracellular compartments to monitor the dynamic distribution of receptors after stimulation, and ERK phosphorylation to monitor signaling output. To obtain statistically significant trafficking data at high temporal resolution, we developed the "Squassh" image analysis software for automatic vesicle segmentation, counting, and colocalization computation [Rizk et al., Nature Protocols 2014]. Based on the receptor localization data, signaling data, and previous work on the modeling of GPCR-activated signaling pathways [Heitzler et al., MSB 2012], we developed an ordinary differential equation model combining signaling with receptor internalization and transport to early, recycling, and late endosomes. This is, to our knowledge, the first attempt to develop a dynamic trafficking model for a GPCR. We evaluate the influence of trafficking on signaling by conducting global sensitivity analysis, and use the model to test hypotheses on receptor constitutive internalization, trafficking regulation, and signaling from endosomes.

...............................................................................................................................
Tuesday, November 11
2:55 pm – 3:15 pm


SB T04
Joint learning over drugs improves prediction of cancer drug response


Ivan Paskov1, Han Yuan2, Hristo Paskov1, Alvaro Gonzalez2, Christina Leslie2

1Stanford University, 2Memorial Sloan Kettering Cancer Center

The ultimate goal of precision medicine is to predict the best personalized therapeutic option from patient-specific genomic data. In cancer, precision medicine seeks to leverage new targeted therapies that work only in a subset of tumors where the targeted pathway is suitably altered; in general we cannot predict drug response from the mutation or copy number status of the target alone. Here we use publicly available drug response data sets in cancer cell lines, including the Cancer Cell Line Encyclopedia (CCLE) and the National Cancer Institute NCI-60, and develop a multi-task strategy to predict drug sensitivity by jointly learning across many drugs at once. We use a nuclear norm regularization approach with a highly efficient ADMM (alternating direction method of multipliers) optimization algorithm that readily scales to large data sets. For the CCLE data set, we used cross-validation to train on 445 cell lines with 50,000 genomic features (gene expression, copy number, and mutation status) and jointly learn prediction models for 24 drugs. For all drugs, our multi-task learning approach outperformed elastic net single-task learning in a transductive cross-validation setting, where the features of all cell lines are seen across tasks but the drug response values for each task’s test set are held out. The mean square error (MSE) of multi-task learning is on average 33% smaller than the MSE of single-task learning. For the NCI-60 data set, we trained on 60 cell lines, around 60,000 genomic features, and 309 FDA-approved drugs. Here, multi-task learning outperformed elastic net single-task learning for 226 out of 309 drugs in a transductive cross-validation setting, with a mean improvement in MSE of 14.1%. Moreover, our joint training approach led to more interpretable drug response models, where drugs with similar mechanisms of action had similar regression models, and where enrichment analysis of regression coefficients revealed the mechanism of action.
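
Nuclear norm regularization penalizes the sum of the singular values of the coefficient matrix (features by drugs), which pushes the per-drug models toward a shared low-rank structure. In ADMM-style solvers the nuclear-norm sub-step is commonly handled by singular value thresholding: shrink every singular value by a threshold and drop those that fall below zero. The sketch below shows only that one operator on made-up data. It is an illustration under those assumptions, not the authors' solver; the matrix sizes, noise level, and threshold are hypothetical.

```python
import numpy as np


def singular_value_threshold(W, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)        # shrink, then clip at zero
    return (U * s_shrunk) @ Vt                 # scale columns of U, then recombine


# Hypothetical coefficient matrix: 6 genomic features x 4 drugs.
# All drugs share one feature pattern (rank 1), plus small drug-specific noise.
rng = np.random.default_rng(0)
shared = np.outer(np.ones(6), [1.0, 2.0, 3.0, 4.0])
W = shared + 0.01 * rng.standard_normal((6, 4))

W_low_rank = singular_value_threshold(W, tau=0.5)
print(np.linalg.matrix_rank(W), np.linalg.matrix_rank(W_low_rank))
```

After thresholding, only the dominant shared pattern should survive, so the columns (one regression model per drug) become multiples of a common direction. This is one way to see why jointly trained models for drugs with similar behavior end up looking similar.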

...............................................................................................................................
Tuesday, November 11
3:40 pm – 4:00 pm

SB T05
A first truly systems level mechanistic model – unravelling the gene regulation of Th2 differentiation

Mattias Köpsén1, William Lövfors1, Sören Bruhn2, Gunnar Cedersund1, Mikael Benson1, Mika Gustafsson1

1Linköping University, 2Karolinska Institute

Recent and ongoing revolutions in measurement technologies imply completely new possibilities for genome research; today, time-resolved, quantitative, and systems-level data are available. Nevertheless, without a corresponding revolution in methods for data analysis, these new data tend to drown researchers and doctors, rather than provide clear and useful insights. Such new methods are developed within the field of systems biology. Systems biology has two main approaches: mechanistically detailed and well-determined simulation models for small subsystems, and more approximate statistical models for the entire genome. However, there are few, if any, methods that combine the strengths of these two approaches. Herein, we present LASSIM, a new simulation-based approach, which can be applied to systems of the size of the entire genome. The superior performance of LASSIM is demonstrated in three examples: i) an example with simulated data shows that, unlike traditional large-scale methods, LASSIM correctly identifies the true behavior between measured data-points, ii) LASSIM outperforms the winner of a previous DREAM challenge, the most competitive benchmarking approach available, iii) based on new data from Th2 differentiation, LASSIM identifies a first mechanistic model for the entire genome. The key predictions of this model are typically enriched for DNA binding, which suggests that most predicted interactions are direct. Moreover, in silico knockdowns were experimentally validated. In summary, LASSIM opens the door to a new type of model-based data analysis: models that combine the strengths of reliable mechanistic models with truly systems-level data.

...............................................................................................................................
Tuesday, November 11
4:00 pm – 4:20 pm

SB T06
Simulation predicts IGFBP2-HIF1α interaction drives glioblastoma growth

Ka Wai Lin1, Angela Liao1, Amina Qutub1

1Rice University

Introduction: Recent clinical studies show that both obese and type II diabetic patients have faster tumor progression and decreased survival rates from brain cancers compared to cancer patients without obesity or type II diabetes [1]. Though studies suggest that interactions between insulin-like growth factor I (IGFI), insulin-like growth factor binding protein 2 (IGFBP2), and hypoxia-inducible factor 1 alpha (HIF1α) correlate with tumor growth and invasiveness, the detailed mechanisms of these interactions are unknown. Computational modeling can address the complexity of these interactions by identifying sets of key signaling regulators and characterizing the architecture of their signaling pathways. Here we present a computational model relating IGFBP2, IGFI, and HIF1α to the growth of glioblastoma cells. Many drugs that have targeted the insulin-like growth factor receptor (IGFIR) have shown promising results in vitro but have failed in late stage clinical trials. Results from our model identified a potential target in the insulin signaling pathway that will guide the design of new drugs for glioblastoma.

Materials and Methods: The interactions linking IGFBP2, IGFI, HIF1α, and oxygen levels to glioblastoma growth were summarized from an extensive literature search. A chemical-kinetic model containing 5 ordinary differential equations was created and simulated using Matlab. The parameters were found by fitting the rate constants to in vitro data from existing literature. IGFI and IGFBP2 were fitted using data that observed IGFBP2 concentration as a function of IGFI stimulation over time [2]. HIF1α was fitted using data that observed changes in HIF1α concentration as a function of oxygen levels [3]. The HIF1α signaling is linked to the growth of the glioblastoma, which was fitted to in vitro assays performed in our lab, in which U87 glioblastoma cells were cultured into spheroids by the hanging drop approach and the spheroid diameter was measured at different time points. An extensive series of sensitivity analyses was conducted on all parameters in the model.

Results and Discussion: The results from the model showed that the downstream signal from IGFI to HIF1α was less sensitive to change as compared to the feedback of IGFBP2 to HIF1α. Results from our glioblastoma growth reduction analysis showed that when the feedback loop was removed there was a greater decrease in glioblastoma diameter as compared to removing the downstream IGFI to HIF1α signal.

Conclusions: The simulations from the computational model are representative of the in vitro system and they are in agreement with known literature of in vitro growth of glioblastoma. Model sensitivity analysis highlighted that feedback from IGFBP2 to HIF1α is more integral to the sustained growth of the glioblastoma spheroid than the downstream signaling from IGFI to HIF1α. This implicates a more significant potential drug target as compared to the current IGFI targets. Ongoing studies in the lab are following the potential of these targets of glioblastoma in vitro.

References:

1. Chambless LB et al. Type 2 diabetes mellitus and obesity are independent risk factors for poor outcome in patients with high-grade glioma. Journal of neuro-oncology. 2012;106(2):383-9.

2. Slomiany MG et al. IGF-1-induced VEGF and IGFBP-3 secretion correlates with increased HIF-1 alpha expression and activity in retinal pigment epithelial cell line D407. Invest Ophthalmol Vis Sci. 2004;45(8):2838-47.

3. Jiang BH et al. Hypoxia-inducible factor 1 levels vary exponentially over a physiologically relevant range of O2 tension. Am J Physiol. 1996;271(4 Pt 1):C1172-80.

...............................................................................................................................
Tuesday, November 11
4:20 pm – 4:40 pm

SB T07
Ensemble-based design of experiments for gene regulatory networks

Erica Manesso1, Rudiyanto Gunawan1

1ETH Zurich

Model-based discovery in systems biology is an iterative process that integrates wet-lab experiments, parameter estimation, in silico analysis, and optimization. Many challenges remain in performing this iterative process. The bottleneck is often the estimation of unknown kinetic parameters from experimental data. Estimating kinetic parameters by fitting model simulations to biological data is usually ill-posed: there is often no single (best-fit) solution to the data fitting problem, and instead one can find many parameter combinations, i.e., an ensemble of parameters, that fit the data statistically equally well. The parameter ensemble represents the uncertainty of the model parameters. However, this describes only one type of uncertainty in the mathematical modeling of biological systems. Other factors also contribute to model uncertainty, including structural and dynamical uncertainty. In the context of gene regulatory networks, the use of ensemble models has been very limited and focused mainly on network structure. In practical applications, it is often desirable and necessary to reduce the size of the ensemble by performing additional experiments and gathering new data. The goal of the present work is to design experiments that lead to a significant reduction in the ensemble size, taking into account different aspects of model uncertainty.

Modern model-based experimental design aims at obtaining the most informative data from an experiment in order to validate the predictions of a model (e.g., gene expression profiles). For this purpose, the experimental conditions are usually optimized to obtain the maximum information from the data. Only recently has the uncertainty associated with the model parameters been taken into account, using the Approximate Bayesian Computation Design (ABCD), where the parameter uncertainty is employed as a priori information. In this work we adapt the ABCD method such that both parameter and structure uncertainties can be considered. The resulting Ensemble-based Design Of Experiments (EDOE) gives the optimal experimental condition that simultaneously reduces the ensemble of structures and parameters. Briefly, the procedure consists of: (1) select the initial experimental design ξ(0); (2) draw a sample (s(0), θ(0)) from the ensemble of structure and parameters (i.e., a sample from the prior distribution p(s,θ), where s is the structure and θ is the vector of parameters); (3) evaluate model prediction y(0), thus producing a sample from the joint distribution p(y|s,θ;ξ(0)); (4) compute the design criterion h(0) according to the joint distribution p(y|s,θ;ξ(0)); (5) propose a new design ξ(c) from a Metropolis-Hastings probing density; (6) draw another sample (s(c), θ(c)) from the prior distribution p(s,θ) and evaluate y(c) from the joint distribution p(y|s,θ;ξ(c)); (7) compute the design criterion h(c) according to the joint distribution; (8) replace the existing design ξ(0) with the proposed design ξ(c) if the acceptance criterion is satisfied; (9) repeat steps 5-8 until convergence. We demonstrated the utility of the EDOE on a case study of gene regulatory networks: the GATA-1, GATA-2, and PU.1 circuit governing the differentiation of myeloid progenitors.
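
The procedure above is a stochastic search over experimental designs: score the current design by sampling structures and parameters from the ensemble, propose a new design, score it the same way, and accept or reject. The sketch below follows that loop on a deliberately tiny example. Everything specific in it is an assumption made for illustration: the two candidate structures, the uniform parameter prior, the use of prediction variance across the ensemble as the design criterion h, the Gaussian random-walk proposal, and the acceptance ratio h(c)/h. The abstract does not specify any of these.

```python
import random


def predict(structure, theta, xi):
    """Toy steady-state expression of a target gene at input level xi."""
    if structure == 0:
        return theta * xi / (1.0 + xi)             # non-cooperative activation
    return theta * xi ** 2 / (1.0 + xi ** 2)       # cooperative activation (Hill coefficient 2)


def sample_prior(rng):
    """Draw one (structure, parameter) pair from the hypothetical ensemble."""
    return rng.choice((0, 1)), rng.uniform(0.5, 1.5)


def criterion(xi, rng, n_samples=200):
    """Assumed design criterion: variance of ensemble predictions under design xi."""
    ys = [predict(*sample_prior(rng), xi) for _ in range(n_samples)]
    mean = sum(ys) / n_samples
    return sum((y - mean) ** 2 for y in ys) / n_samples


def edoe_search(xi0=0.1, lo=0.0, hi=5.0, n_iter=300, step=0.5, seed=0):
    """Random-walk search over a scalar design; returns the best design visited."""
    rng = random.Random(seed)
    xi, h = xi0, criterion(xi0, rng)               # score the initial design
    best_xi, best_h = xi, h
    for _ in range(n_iter):
        xi_c = xi + rng.gauss(0.0, step)           # propose a new design
        if not lo <= xi_c <= hi:
            continue                               # out-of-range proposals are rejected
        h_c = criterion(xi_c, rng)                 # fresh ensemble samples for the proposal
        if h == 0.0 or rng.random() < h_c / h:     # assumed acceptance rule
            xi, h = xi_c, h_c
        if h > best_h:
            best_xi, best_h = xi, h
    return best_xi, best_h


best_design, best_score = edoe_search()
print(best_design, best_score)
```

A design at xi = 0 scores exactly zero here because every ensemble member predicts no expression, so no measurement at that condition could tell the members apart. A real application would replace the variance criterion with one that measures the expected reduction of the ensemble.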

...............................................................................................................................
Tuesday, November 11
5:25 pm – 5:45 pm

SB T08
Reconstruction of gene regulatory networks based on repairing sparse low-rank matrices

Young Hwan Chang1, Roel Dobbe1, Palak Bhushan1, Claire Tomlin1

1University of California, Berkeley

With the growth of high-throughput proteomic data, in particular time series gene expression data from various perturbations, a general question that has arisen is how to organize inherently heterogeneous data into meaningful structures. Since biological systems such as breast cancer tumors respond differently to various treatments, little is known about exactly how these gene regulatory networks (GRNs) operate under different stimuli. For example, when we apply a drug-induced perturbation to a target protein, we often only know that the dynamic response of the specific protein may be affected. We do not know by how much, how long, and even whether this perturbation affects other proteins or not. Challenges due to the lack of such knowledge not only occur in modeling the dynamics of a GRN but also cause bias or uncertainties in identifying parameters or inferring the GRN structure.

This paper describes a new algorithm that enables us to estimate the bias error due to the effect of perturbations and to correctly identify the common graph structure among biased inferred graph structures. To do this, we retrieve the common dynamics of a GRN subject to various perturbations. We refer to this task as “repairing”, inspired by “image repairing” in computer vision. The method can automatically and correctly repair the common graph structure across perturbed GRNs, even without precise information about the effect of the perturbations. We evaluate the method on synthetic data sets and demonstrate advantages over l1-regularized graph inference by advancing our understanding of how these networks respond across different targeted therapies.

............................................................................................................................
Tuesday, November 11
5:45 pm – 6:05 pm

SB T09
Understanding multicellular function and disease with human tissue-specific networks

Arjun Krishnan1, Casey Greene2, Aaron Wong1, Emanuela Ricciotti3, Rene Zelaya2, Daniel Himmelstein4, Daniel Chasman5, Garret Fitzgerald3, Kara Dolinski1, Tilo Grosser3, Olga Troyanskaya1

1Princeton University, 2Dartmouth College, 3University of Pennsylvania, 4University of California, San Francisco, 5Harvard Medical School

Tissue and cell-type identity lie at the core of human physiology and disease. Therefore, understanding the genetic underpinnings of complex tissues and individual cell lineages is crucial for developing improved diagnostics and therapeutics. Yet we still lack tools to systematically explore the landscape of genes and interactions that shape specialized cellular functions across hundreds of tissue types and cell lineages in the body. Here we present genome-wide functional interaction networks specific for each of 144 human tissues and cell types developed using an integrative data-driven methodology. Our approach integrates thousands of diverse genome-scale datasets by simultaneously using both tissue-specific and functional contexts. This technique effectively leverages signals detected by distinct technologies from experiments spanning both tissues and disease states. The tissue networks predict lineage-specific response of genes to perturbation, reveal changing functional roles of genes depending on tissue context, and illuminate meaningful disease-disease associations. We show that genes with nominally significant p-values in genome-wide association studies (GWAS) can be used in conjunction with tissue-specific networks to identify biologically important disease-gene associations, a procedure we term NetWAS. NetWAS identifies disease-associated genes more accurately than GWAS alone or an approach using a non-tissue-specific functional network. Our webserver, GIANT (http://giant.princeton.edu), provides an interface to human tissue networks with multi-gene query capability, network visualization, analysis tools, and downloadable networks. GIANT also enables NetWAS reprioritization of users’ GWAS results.

............................................................................................................................
Tuesday, November 11
6:05 pm – 6:25 pm

SB T10
Pathway-based biomarkers specifically and robustly classify diverse multiple diseases

David Amar1, Tom Hait1, Ron Shamir1

1Tel Aviv University

Background: Gene expression signatures, serving as biomarkers, have been used successfully for prognosis, diagnosis, and patient stratification in cancer. However, such signatures are often not robust and sometimes perform poorly on new datasets. Moreover, standard case-control studies may yield a signature that is not specific to the tested disease. In addition, the set of genes constituting a signature often provides little insight about the disease etiology.

Methods: We compiled a compendium of annotated expression profiles from 174 gene expression studies from GEO, covering 13,314 samples from 17 different array technologies, and 1,699 RNA-Seq samples from TCGA. The RNA-Seq samples were used for validation only. Overall, our compendium covers 48 diseases, each covered by at least five different studies. Each sample was manually annotated with Disease Ontology terms. We used the compendium to learn a multi-label classifier. In order to avoid batch effects, leave-dataset-out cross validation was used to test the performance of the classifier on each disease. Previous studies sought classification of cases of one disease vs. all other samples. However, since many diseases originating from different tissues are included in the compendium, such classifiers may actually predict the tissue well but not the disease status. Thus, we take a more stringent approach: for each disease we test both for overall separation of cases vs. the rest, and for separation of cases vs. the 'disease controls' that were included in the same studies. In such an analysis, a good signature would produce biomarkers that are disease-specific and consistently distinguish the cases from the disease controls and the background samples.

Results: Our strategy produced high performance classifiers for 16 diseases, including cardiovascular disease, gastric, breast, and immune system cancers. For these diseases, accurate classification was obtained in cross validation and even on new datasets produced by RNA-seq. In addition, our overall multi-label classifier outperformed previous studies while using a simpler classifier (e.g., Huang et al. (PNAS 2010) report 82% recall at 20% precision, while we get 90% recall at 20% precision, and 32% precision at 82% recall). We constructed a gene signature for each disease by including genes that (1) had a high importance score in the disease classifier, and (2) were markedly differential in the disease patients compared to both healthy controls from the same study and to samples of other diseases. Reassuringly, our cancer gene signature is enriched with pathways from the hallmarks of cancer: e.g., cell cycle, cell cycle checkpoints, and DNA replication. Similarly, for leukemia and cardiovascular disease, the gene signature is highly enriched for pathways that were previously associated with these diseases. Of note, 43% of our signature genes are not part of known pathways, suggesting that they can lead to new biological hypotheses. Moreover, our leukemia signature contains two known targets of leukemia drugs. In gastric cancers, a new up-regulated druggable gene, NR1I2, is suggested.

Conclusions: A judicious analysis of a large, heterogeneous compendium of expression profiles produces disease-specific diagnostic signatures and reveals previously unknown disease genes.

............................................................................................................................
Tuesday, November 11
6:25 pm – 6:45 pm

SB T11
Network modeling reveals key features of epithelial-to-mesenchymal transition dynamics in liver cancer invasion

Steven Steinway1, Jorge Gomez Tejeda Zañudo2, Thomas Loughran1, Reka Albert2

1University of Virginia, 2Penn State University

Epithelial-to-mesenchymal transition (EMT) is a developmental process hijacked by cancer cells to leave the primary tumor site, invade surrounding tissue, and establish distant metastases. A hallmark of EMT is the loss of E-cadherin expression, and one major signal for the induction of EMT is transforming growth factor beta (TGFβ), which is dysregulated in up to 40% of hepatocellular carcinoma (HCC). We previously constructed and experimentally validated an EMT network of 69 nodes and 134 edges by integrating the signaling pathways involved in developmental EMT and known dysregulations in invasive HCC (Steinway et al., Cancer Research, 2014). Currently, we are analyzing perturbations (through computation and experiments) to our network that suppress TGFβ-driven EMT, with the ultimate goal of identifying therapeutic interventions which suppress tumor invasion. We noticed that some perturbations produce steady states that differed substantially from previously identified epithelial and mesenchymal steady states in our model. Further analysis revealed that these perturbations lead to states that are intermediate to epithelial and mesenchymal phenotypes. Similar so-called “EMT hybrid” states have been described in the literature. Quantitative analysis of these attractors reveals that these hybrid states form a subset of steady states that are distinct from epithelial and mesenchymal steady states. Lastly, our results suggest that combinatorial inhibition can effectively suppress EMT. Out of 2346 possible combinations (of two nodes), our model predicts that 9 nodes in combinations (SMAD, ERK, SOS1, GRB2, RAS, DLL, NOTCH, CSL) will robustly suppress TGFβ-driven EMT. We have demonstrated experimentally that expression of these nodes is enriched in mesenchymal relative to epithelial phenotype HCC cell lines. Furthermore, we demonstrate that these knockout combinations act by disrupting feedback loops that drive the EMT process. 
We are currently working to validate these findings experimentally. These results support network modeling as an important tool to identify critical mediators in complex biological processes. We further propose network modeling as a tool to discover therapeutic targeting strategies within complex disease pathways, specifically in liver cancer invasion.
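
As a quick check of the arithmetic, the 2346 two-node combinations quoted above are the number of unordered pairs that can be drawn from a 69-node network. The sketch below counts and enumerates such pairs; the node labels are placeholders, not the actual network nodes.

```python
from itertools import combinations
from math import comb

N_NODES = 69  # network size stated in the abstract

# Number of unordered pairs of distinct nodes: 69 * 68 / 2.
n_pairs = comb(N_NODES, 2)

# Placeholder labels (node_0 ... node_68); the real node names are not used here.
nodes = [f"node_{i}" for i in range(N_NODES)]
candidate_pairs = list(combinations(nodes, 2))

print(n_pairs, len(candidate_pairs))
```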

............................................................................................................................



WEDNESDAY, NOVEMBER 12

 


9:45 am – 10:05 am

SB T12
A canonical correlation analysis based dynamic Bayesian network prior to infer gene regulatory networks from multiple types of biological data

Brittany Baur1, Serdar Bozdag1

1Marquette University

One of the challenging and important computational problems in systems biology is to infer gene regulatory networks of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer gene regulatory networks. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions with high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in gene regulatory network inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in a dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.

............................................................................................................................
Wednesday, November 12
10:05 am – 10:25 am

SB T13
Variability in B-vitamin dependencies in the human microbiome genomes

Matvei Khoroshkin1, Andrei Osterman2, Dmitry Rodionov2

1Institute for Information Transmission Problems, Russian Academy of Sciences, 2Sanford-Burnham Medical Research Institute

B vitamins are biochemical cofactors essential for all living systems. The human microbiota is the complex and dynamic community of commensal, symbiotic, and pathogenic microorganisms that are present on and within the human body, and it has an enormous impact on humans. We investigate the ability of bacteria from the human microbiome to produce and salvage B vitamins. We selected a reference set of 1143 bacterial genomes from 7 phyla out of those sequenced in the course of the Human Microbiome Project (HMP). By using the metabolic subsystems approach (as implemented in the SEED database) and analyzing genomic context and regulons, we have reconstructed biochemical pathways for synthesis of eight B vitamins (thiamin, riboflavin, niacin, biotin, pyridoxine, cobalamin, pantothenate, folate) and predicted putative vitamin transporters in the reference HMP genomes. Using the reconstructed metabolic pathways, we have classified the HMP organisms with respect to their B-vitamin prototrophy or auxotrophy and their vitamin transport capabilities. Preferred patterns of vitamin dependency were attributed to a number of taxonomic units. For instance, the Bacteroides are mostly prototrophs that are capable of synthesizing all B vitamins except cobalamin. In contrast, the Lactobacillales are auxotrophs for all vitamins except folate. The reference HMP genomes show a relatively high level of conservation of vitamin synthesis phenotypes at the genus level: only 25% of the studied genera demonstrate variability of phenotypes for individual vitamins. We have also identified patterns of vitamin dependency for a number of body sites. The gastrointestinal tract generally shows a prevalence of vitamin-prototrophic bacteria, whereas the oral cavity, urogenital tract, and blood are largely populated by vitamin auxotrophs.
This work is important for understanding the role of B vitamins in maintaining homeostasis of human microbiome community structures and for the future development of specific vitamin diets.

............................................................................................................................
Wednesday, November 12
10:25 am – 10:45 am

SB T14
Master regulators of luminal and basal subtypes of breast cancer

Archana Iyer1, Celine Lefebvre2, Yishai Shimoni1, Mukesh Bansal1, Mariano Alvarez1, Jose Silva3, Andrea Califano1

1Columbia University, 2Gustave Roussy, 3Mount Sinai Medical Center

Breast cancer is a heterogeneous group of diseases that can be stratified into several subgroups based on their molecular signature. Understanding the regulators of these molecular subtypes will allow us to make them more amenable for targeted therapies or personalized medicine. Here we present our discovery of master regulators that are important in the transcriptional regulation of the two major subtypes: basal and luminal. We reverse engineered a breast-cancer specific transcriptional network using large-scale gene expression datasets in breast cancer (TCGA, METABRIC, UNC-300) to create a breast cancer interactome. Using MARINa (Master Regulator Inference Algorithm) we have identified specific transcription factors that regulate basal and luminal subtypes of breast cancer. We further validated these master regulators experimentally by performing a pooled shRNA screen on six independent cell lines (2 luminal, 3 basal, and one normal mammary epithelial cell line). The pooled shRNA screen was sampled at days 0, 10, 18, and 25, and genomic DNA was barcoded and sequenced using the Illumina MiSeq technology. Both computational predictions and our experimental results from the deconvolution of the shRNA screen validate the luminal transcription factors (FOXA1, ESR1, and GATA3). In addition, we discover novel luminal-specific transcription factors like TFAP2C. For the basal subgroup, while we discover novel regulators, we also find that this group is more heterogeneous compared to the luminal. Importantly, we can effectively target this subgroup with a combination of master regulators.

............................................................................................................................
Wednesday, November 12
11:10 am – 11:30 am

SB T15
Inferring the genome-wide functional modulatory network: a case study on the NF-κB/RelA transcription factor

Xueling Li1, Min Zhu2, Allan Brasier1, Andrzej Kudlicki1

1The University of Texas Medical Branch at Galveston, 2Hefei Institutes of Physical Science, Chinese Academy of Sciences

How different pathways lead to the activation of a specific transcription factor with specific effects is not fully understood. A modulatory network is composed of triplets of a specific transcription factor, target genes, and modulators. Modulators usually affect the activity of the specific transcription factor at the post-transcriptional level in a target gene-specific manner (action mode), which may be classified as enhancement, attenuation, and inversion of the activation or inhibition. Reconstructing such modulatory networks will help to interpret how transcription factors produce distinct gene responses to different stimuli. As a case study we inferred, from a large collection of expression profiles, all potential modulations of NF-κB/RelA. The predicted modulators include many proteins previously not reported as physically binding to RelA. The functions of the predicted modulators are consistent with biological activities of NF-κB/RelA, including RNA processing, alternative splicing, cell cycle, mitochondrion, ubiquitin-dependent proteolysis, and ribosome biogenesis, and are consistent with binding modulators in our previous study. The predicted genome-wide RelA modulators from different enriched pathways or processes exert specific prevalent action modes on distinct pathways through RelA. Also, the modulators from noncoding RNA (ncRNA), RNA binding proteins, transcription factors, cytoskeleton, and kinases modulate the NF-κB/RelA activity with specific action modes consistent with their molecular functions and modulation level. Finally, we analyzed the modulatory network of NF-κB/RelA in the context of TGFB1-induced epithelial-mesenchymal transition (EMT). Here modulators of NF-κB/RelA included those involved in extracellular matrix (FBN1), cytoskeletal regulation (ACTN1), and tumor suppression (FOXP1).

............................................................................................................................
Wednesday, November 12
11:30 am – 11:50 am

SB T16
The gene expression cascade connecting p53 dynamics to cell fates

Antonina Hafner1, Jeremy Purvis2, Galit Lahav3

1Harvard University, 2University of North Carolina at Chapel Hill, 3Harvard Medical School

The dynamics of transcription factors have been shown to play important roles in a variety of biological systems. However, the mechanisms by which these dynamics are decoded to trigger specific responses are not well understood. Our study focuses on the tumor suppressor protein p53 and how its temporal dynamics control gene expression and cell fate decisions. We measured the genome-wide transcriptional response to different p53 dynamics with the goal of identifying the input-output relationship between p53 and its target genes.

In response to γ-irradiation, cells exhibit pulses in p53 protein levels. The number of p53 pulses is proportional to the irradiation dose, with higher doses leading to more pulses and pushing cells toward permanent cell cycle arrest (senescence). Changing the pulses into sustained p53 levels leads to senescence independently of the radiation dose. This suggests that a given temporal behavior profile of p53 can trigger a specific cellular outcome. p53 mainly functions as a transcription factor. To understand the relationship between p53 dynamics and the expression of its target genes, we exposed cells to pulsed or sustained p53 signaling and measured gene expression at a high temporal resolution and over several days.

Our analysis revealed clusters of genes with distinct temporal characteristics: induction versus repression, immediate versus delayed response, and a transient versus sustained expression. I will present our data and analysis for dissecting the properties of each cluster and the role of various molecular mechanisms (e.g., p53 DNA binding, chromatin states, and post-transcriptional regulation of mRNA) for connecting p53 dynamics and the downstream gene expression response. Understanding the mechanisms decoding p53 dynamics into cellular outcomes will enable us to identify and test novel methods for pushing cells toward a specific fate.

............................................................................................................................
Wednesday, November 12
1:55 pm – 2:15 pm

SB T17
Pathways on demand: automated reconstruction of human signaling networks

Anna Ritz1, Christopher Poirel1, Allison Tegge1, Nicholas Sharp1, Allison Powell1, Kelsey Simmons1, Shiv Kale1, T. M. Murali1

1Virginia Polytechnic Institute and State University

Signaling pathways are a cornerstone of systems biology. Several databases store representations of these pathways that are amenable to automated analyses. Despite painstaking manual curation, significant variations exist between databases. To overcome these limitations, we present PathLinker, a new computational method that can reconstruct a signaling pathway from a background protein interaction network given only the identities of the receptors and transcription factors and regulators in that pathway. We demonstrate that PathLinker can reconstruct the Wnt pathway in the NetPath database with much higher precision and recall than several state-of-the-art algorithms, recovering non-canonical branches that appear only in this pathway's representation in other databases. PathLinker suggests a surprising role for CFTR, a chloride ion channel transporter of the ABC class, in Wnt/beta-catenin signaling, which we validate using siRNA experiments. We extend our computational results to accurately reconstruct a comprehensive set of signaling pathways in the NetPath database. We demonstrate that PathLinker can bridge differing representations of the same pathway between databases.

............................................................................................................................
Wednesday, November 12
2:15 pm – 2:35 pm

SB T18
Cell-to-cell variability in overcoming a caspase-8 activity threshold explains fractional killing by TRAIL

Marc Hafner1, Jeremie Roux1, Samuel Bandara1, Joshua J. Sims1, Diana Chai2, Peter K. Sorger1

1Harvard Medical School, 2Merrimack Pharmaceuticals

Ligands and DR4/5-receptor agonist antibodies such as TRAIL or Apomab trigger apoptosis in tumor cells. Although promising, drugs targeting this pathway have stalled in Phase II/III of clinical trials because of variable efficacy. Many mechanisms of resistance have been proposed to explain patient-to-patient variability but no quantitative model has been built to evaluate and compare these different hypotheses. In this work, we developed a quantitative model of the death-inducing signaling complex (DISC) to identify dynamical features that predict cell fate. We used this model to understand how resistance genes prevent apoptosis and found drug combinations that overcome this resistance.
Using live cell microscopy and HeLa cells engineered with a FRET reporter of Bid cleavage, we monitored caspase-8 activity at the single cell level after exposure to TRAIL. For each cell (a few hundred per condition), we derived parameters characterizing DISC activity. We found that the maximal value of FRET ratio (i.e., integrated caspase-8 activity) is not different between surviving and dying cells, but its derivative (a surrogate of instantaneous caspase-8 activity) is significantly lower in surviving cells; only cells with a caspase-8 activity reaching a specific threshold θ will die. Based on a simple mechanistic model, the maximal caspase-8 activity is the product of k, the rate of caspase-8 activation, and τ, the duration of this activation. Therefore, the three parameters k, τ, and θ determine cell fate at a single cell level and explain the fate divergence across an isogenic population with more than 70% accuracy.
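
The threshold rule above can be written down in a few lines. The sketch below is one simplified reading of the abstract, not the authors' model: it assumes maximal caspase-8 activity equals k × τ and that a cell dies when this value reaches θ. The function names, the use of "greater than or equal", and all numeric values are illustrative assumptions in arbitrary units.

```python
def max_caspase8_activity(k, tau):
    """Maximal caspase-8 activity as the product of activation rate and duration."""
    return k * tau


def predicted_fate(k, tau, theta):
    """Return "dies" if the maximal activity reaches the threshold, else "survives"."""
    return "dies" if max_caspase8_activity(k, tau) >= theta else "survives"


THETA = 1.5  # illustrative threshold, arbitrary units

# Made-up cells: a fast activator, a slow activator, and the same slow
# activator with a longer activation window (moving along the tau axis).
fast_cell = predicted_fate(k=0.5, tau=4.0, theta=THETA)
slow_cell = predicted_fate(k=0.2, tau=4.0, theta=THETA)
slow_cell_longer_tau = predicted_fate(k=0.2, tau=8.0, theta=THETA)

print(fast_cell, slow_cell, slow_cell_longer_tau)
```

Under this reading, raising either k or τ can push a cell over the threshold, which is the intuition behind combining treatments that act on different parameters.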

Higher doses of TRAIL increase the rate k and consequently induce more killing. This relation is also true for the DR4/5 antibodies Mapatumumab and Apomab, although the caspase-8 activation rate is lower, which results in less than 50% and 10% killing, respectively, at saturating doses. We found that clustering these antibodies significantly increases the rate k beyond the saturation value and therefore induces more cell death. Using Bortezomib, we were also able to move cells along the τ-axis. As predicted by our model, for treatments where the rate k is high enough, Bortezomib strongly synergizes with the DR agonists to induce apoptosis. In particular, Apomab, which has low efficacy when used as a single agent (<8% killing), can kill the majority of cells when combined with the clustering agent and Bortezomib.

Using this framework, we studied how the FLICE-inhibitory proteins (FLIP-L or -S for the long or short forms) induce resistance to TRAIL. We found that FLIP overexpression is correlated with a decreased caspase-8 activation rate and prevents death even at the highest doses of TRAIL. Based on the values of k and the half-life of each FLIP isoform, we predicted that only FLIP-L overexpressing cells will be sensitized by Bortezomib, whereas cells with high levels of FLIP-S will not because of the stronger inhibitory effect of FLIP-S on caspase-8 activity. Our results confirmed these predictions and validated our model. In conclusion, we developed a framework to quantitatively understand the mechanisms of resistance to TRAIL and DR4/5 antibodies, and showed that it can be used to find drugs working in synergy with DR agonists.

............................................................................................................................
Wednesday, November 12
2:35 pm – 2:55 pm

SB T19
Synthesizing signaling pathways from temporal phosphoproteomic data

Ali Sinan Köksal1, Anthony Gitter2, Kirsten Beck3, Aaron McKenna3, Saurabh Srivastava4, Nir Piterman5, Rastislav Bodík1, Alejandro Wolf-Yadlin3, Ernest Fraenkel6, Jasmin Fisher7

1University of California, Berkeley, 2University of Wisconsin-Madison, 3University of Washington, 420n, 5University of Leicester, 6Massachusetts Institute of Technology, 7Microsoft Research Cambridge

Advances in proteomic measurements reveal that even the best-curated pathways fail to capture a large fraction of signaling events. Here we propose a synthesis approach to produce precise models of signal transduction from temporal phosphoproteomic data. We first integrate the time series data with a protein-protein interaction network to produce an initial undirected graph. Using program synthesis techniques, we exhaustively explore all possible signaling pathways that are consistent with the proteomic data and the initial graph without explicitly enumerating all models. These pathways must satisfy several logical constraints. Most notably, a chain of events initiating at the source of the stimulation, the epidermal growth factor (EGF) in this case, must explain the activation or inhibition of each phosphorylated protein. In addition, the timing of all events must agree with the temporal data such that upstream proteins are not activated or inhibited after their downstream neighbors. This approach identifies parts of the network that are consistent with all possible pathway models. We are able to determine the direction of interactions, whether edges activate or inhibit, and the times at which proteins are activated.
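
The timing constraint described above (an upstream protein must not be activated or inhibited after its downstream neighbor) can be illustrated with a simple consistency check on a single candidate pathway. This is only a sketch of that one constraint, not the program synthesis procedure itself; the function name, protein labels, and time points are hypothetical.

```python
def timing_violations(directed_edges, event_time):
    """Return the edges whose upstream protein changes after its downstream target.

    directed_edges: iterable of (upstream, downstream) pairs.
    event_time: dict mapping a protein to the time of its activation or
                inhibition; proteins without a measured time are skipped.
    """
    violations = []
    for upstream, downstream in directed_edges:
        t_up = event_time.get(upstream)
        t_down = event_time.get(downstream)
        if t_up is None or t_down is None:
            continue  # no temporal evidence for this edge
        if t_up > t_down:
            violations.append((upstream, downstream))
    return violations


# Hypothetical candidate pathway and event times in minutes after stimulation.
candidate_edges = [("EGF", "P1"), ("P1", "P2"), ("P3", "P2"), ("P2", "P4")]
times = {"EGF": 0, "P1": 2, "P2": 4, "P3": 8}

bad_edges = timing_violations(candidate_edges, times)
print(bad_edges)
```

A candidate model would be rejected, or the offending edge reoriented, whenever this list is non-empty.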

Using new mass spectrometry data of the temporal EGF response in EGFR Flp-In HEK-293 cells, we show that nearly all proteins that change significantly in phosphorylation (89 to 98% depending on the database) are absent from canonical maps of the epidermal growth factor receptor (EGFR) signaling. Our computational approach reconstructs and summarizes all valid pathway models that explain how proteins are activated or inhibited by EGF. Collectively, these models account for 83% of the significant proteins and contain 413 protein-protein interactions, of which 200 can be confidently assigned a direction. In all cases where we predict a directed interaction between two EGFR pathway nodes, the prediction is correct. We use three natural language processing (NLP) tools to search for literature support for 54 predicted pathway edges that are peripheral to known EGFR pathway interactions. Manually verifying the results, we find that the direction is correct for 15 of the 16 predictions for which there is a definitive direction in the literature. Overall, of the 200 predicted directed pathway edges, 82 are supported by the canonical EGFR pathway, NLP, or kinase-substrate interactions (whose directions are included as prior knowledge). We are presently testing several predictions experimentally by assessing whether kinase inhibitors disrupt phosphorylation of the predicted substrate at the specific times proposed by our model. In summary, our computational approach identifies many previously unrecognized components of a well-studied signaling pathway. Our technique is broadly applicable to systems where dynamic proteomic data is available and has great potential for constructing pathway maps in conditions that alter classic signaling cascades, such as in diseased cells.

............................................................................................................................
Wednesday, November 12
2:55 pm – 3:15 pm

SB T20
Dissecting germ cell metabolism through network modeling

Leanne Whitmore1, Ping Ye1

1Washington State University

Metabolic pathways are increasingly postulated to be vital in programming cell fate, including stemness, differentiation, proliferation, and apoptosis. The commitment to meiosis is a critical fate decision for mammalian germ cells, and involves a key metabolite, retinoic acid (RA). Recent evidence suggests that a pulse of RA is generated in the male mouse, thereby triggering meiotic commitment. However, enzymes and reactions that regulate this RA pulse have yet to be identified in germ cells. We developed a genome-scale mouse metabolic network with a refined RA pathway. Using this network, we implemented flux balance analysis throughout the initial synchronized wave of spermatogenesis to elucidate important reactions and enzymes for the generation and degradation of RA. Our results indicated that the primary RA source is from the extracellular region and the major RA sink is nuclear transport. We further performed in silico knockouts of genes and reactions in the RA pathway and discovered that retinol binding to proteins is crucial for successful meiosis commitment. Finally, we examined the activity of other metabolic pathways in the genome-scale network and found that fatty acid synthesis and oxidation are the primary sources of energy in germ cells. This study predicts enzymes, reactions, and pathways that are most important for germ cell commitment to meiosis. Findings from this study help to enhance our understanding of the metabolic control of germ cell fate, results that will be critical for guiding future experiments to improve reproductive health.

............................................................................................................................
Wednesday, November 12
3:40 pm – 4:00 pm

SB T21
A scalable method for molecular network reconstruction identifies properties of targets and mutations in acute myeloid leukemia

Edison Ong1, Anthony Szedlak2, Yunyi Kang, Peyton Smith1, Nicholas Smith1, Madison McBride3, Darren Finlay3, Kristiina Vuori3, James Mason4, Edward D. Ball5, Carlo Piermarocchi2, Giovanni Paternostro3

1Salgomed, 2Michigan State University, 3Sanford-Burnham Medical Research Institute, 4Scripps Health, San Diego, 5University of California, San Diego

A key aim of systems biology is the reconstruction of molecular networks. However, we do not yet have networks that integrate information from all datasets available for a particular clinical condition. This is in part due to the limited scalability, in terms of required computational time and power, of existing algorithms. Network reconstruction methods should also be scalable in the sense of allowing scientists from different backgrounds to efficiently integrate additional data.

We present a network model of acute myeloid leukemia (AML). In the current version (AML 2.1) we have used gene expression data (both microarray and RNA-seq) from five different studies comprising a total of 771 AML samples and a protein-protein interactions dataset. Our scalable network reconstruction method is in part based on the well-known property of gene expression correlation among interacting molecules. The difficulty of distinguishing between direct and indirect interactions is addressed by optimizing the coefficient of variation of gene expression, using a validated gold standard dataset of direct interactions. Computational time is much reduced compared to other network reconstruction methods. A key feature is the study of the reproducibility of interactions found in independent clinical datasets.

An analysis of the most significant clusters, and of the network properties (intraset efficiency, degree, betweenness centrality, and PageRank) of common AML mutations demonstrated the biological significance of the network. A statistical analysis of the response of blast cells from eleven AML patients to a library of kinase inhibitors provided an experimental validation of the network. A combination of network and experimental data identified CDK1, CDK2, CDK4, CDK6, and other kinases as potential therapeutic targets in AML.

............................................................................................................................
Wednesday, November 12
4:00 pm – 4:20 pm

SB T22
Enhancer poising and regulatory locus complexity shape transcriptional programs in hematopoietic differentiation

Alvaro Gonzalez1, Manu Setty2, Christina Leslie1

1Memorial Sloan Kettering Cancer Center, 2Columbia University

We carried out an integrative analysis of enhancer landscape and gene expression dynamics in hematopoietic differentiation using DNase-seq, histone mark ChIP-seq, and RNA-seq in order to model how enhancer poising and regulatory locus complexity together govern gene expression changes at cell state transitions. We found that high complexity genes – i.e., those with a large total number of DNase-mapped enhancers across the lineage – differ architecturally and functionally from low complexity genes, achieve larger expression changes, and are enriched for both cell-type specific and “transition” enhancers, which are established in HSPCs and maintained in one differentiated cell fate but lost in others. We then developed a quantitative model to predict gene expression changes from the DNA sequence content and lineage history of active enhancers. Our method accurately predicts expression changes for high complexity genes during differentiation, suggests a novel mechanistic role for PU.1 at transition peaks in B cell specification, and can be used to correct enhancer-gene assignments.

............................................................................................................................
Wednesday, November 12
5:05 pm – 5:25 pm

SB T23
Disease gene prioritization using network and feature

Bingqing Xie1, Gady Agam1, Sandhya Balasubramanian2, Jinbo Xu3, Natalia Maltsev2, Conrad Gilliam2, Daniela Boernigen2

1Illinois Institute of Technology, 2University of Chicago, 3Toyota Technological Institute at Chicago

Identification of the most promising candidate genes contributing to disease phenotypes among large lists of variations produced by high-throughput genomics using traditional experimental methods is time-consuming and costly. Therefore, using computational approaches that utilize existing biological knowledge for the prioritization of such candidate genes will enhance the efficiency and accuracy of the analysis of biomedical data. It will also reduce the cost of the studies by avoiding experimental validations of irrelevant candidates. To prioritize candidate genes contributing to a disease or phenotype of interest to the user for further testing, in this study we present a novel algorithm that utilizes both types of information sources, gene annotations and gene interactions, simultaneously, while preserving their original representation using a Conditional Random Field (CRF) model. We further improve the accuracy and efficiency of our proposed approach by assigning enrichment scores to the annotation feature factors within the model. To estimate the performance of our approach, we evaluated it on two independent benchmark studies, ranking the candidate genes by both network and feature knowledge. Our results overall had high Area Under Curve (AUC) values and high partial AUC (pAUC) values on various disease benchmarks and revealed a higher accuracy and precision at the top predictions (10%) as compared with other prioritization tools. Additionally, we applied our method on a case study for the prediction of molecular mechanisms contributing to intellectual disability and autism. Our method was able to recover additional genes related to both disorders and provide suggestions for possible candidates based on their rankings and functional categories.
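
For readers unfamiliar with the AUC metric mentioned above, the sketch below computes it for a ranked gene list using the standard pairwise (Mann-Whitney) formulation: the fraction of (disease gene, non-disease gene) pairs in which the disease gene receives the higher score, with ties counted as half. The scores and labels are made up, and this is not the evaluation code used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: prioritization score per candidate gene (higher = more likely).
    labels: 1 for a known disease gene, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up scores for five candidate genes; two are labeled as disease genes.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
example_labels = [1, 0, 1, 0, 0]
example_auc = roc_auc(example_scores, example_labels)
print(round(example_auc, 4))
```

An AUC of 1.0 means every disease gene outranks every non-disease gene, while 0.5 corresponds to a random ranking.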

............................................................................................................................
Wednesday, November 12
5:25 pm – 5:45 pm

SB T24
Elucidating compound mechanism of action by network dysregulation analysis in perturbed cells

Mukesh Bansal1, Jung Hoon Woo1, Yishai Shimoni1, Archana Iyer1, Andrea Califano1, Charles Karan2, Gonzalo Lopez1, Paola Nicoletti1, Maria Rodriguez-Martinez3, Prem Subramaniam1, Wan Seok Yang1, Ronald Realubit1, Brent R. Stockwell1, Michela Mattioli4

1Columbia University, 2Columbia University Medical Center, 3IBM, 4Istituto Italiano di Tecnologia (IIT)

Genome-wide identification of small-molecule compound targets and effectors, within specific tissues, represents a highly relevant yet equally elusive question, with critical implications for the assessment of compound efficacy and potential toxicity in drug discovery. Experimental approaches are labor-intensive, mostly in vitro, and limited to specific protein classes (e.g., protein kinases and other enzymes), thus potentially missing proteins responsible for undesired toxicity. Computational approaches are virtually non-existent. We introduce a new regulatory-network-based algorithm for elucidating compound mechanism of action (MoA). Experimental validation, using large collections of molecular profiles following compound perturbations, confirmed its ability to correctly identify established MoA proteins for >80% of tested compounds, including specific effectors of drug toxicity, such as SIK1 for doxorubicin. Several newly predicted effectors were experimentally validated, including RPS3A, VHL, and CCNB1 as effectors of the mitotic spindle inhibitor vincristine and JAK2 as a novel modulator of mitomycin C sensitivity. Finally, the algorithm was effective in identifying specific proteins responsible for compound MoA similarity, such as GPX4, an established effector of sulfasalazine that was also inferred and validated as a direct target of altretamine, and which accounts for the compounds’ MoA similarity through an increase in lipid ROS levels. This suggests that regulatory networks can provide novel mechanistic insight into drug activity, thus contributing to the characterization of potent, non-toxic small-molecule inhibitors.

............................................................................................................................
Wednesday, November 12
5:45 pm – 6:05 pm

SBT25
Network Infusion to infer information sources in networks

Soheil Feizi1, Ken Duffy2, Muriel Medard1, Manolis Kellis1

1Massachusetts Institute of Technology, 2Hamilton Institute

Several models exist for the diffusion of signals across biological, social, or engineered networks. However, the inverse problem of identifying the source of such propagated information seems, on the surface, intractable, even in the presence of multiple network snapshots, and especially in the single-snapshot case, given the many overlapping paths in real-world networks. Mathematically, this problem can be approached using a diffusion kernel that represents diffusion processes in a given network, but computing this kernel is generally intractable.

Here, we introduce a modified diffusion kernel that relaxes the path-coupling constraints by considering only k independent shortest paths between pairs of nodes, assuming an exponential time distribution for node-to-node spreading. We use the resulting Erlang network diffusion kernel to solve the inverse diffusion problem using both likelihood maximization and error minimization. We apply this framework to both single-source and multi-source diffusion, to both single-snapshot and multi-snapshot observations, and with both uninformative and informative prior probabilities for candidate source nodes.
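
The kernel idea in the paragraph above can be illustrated with a small Python sketch. This is not the Network Infusion implementation. It assumes the simplest setting (k = 1, a single source, a single snapshot, an uninformative prior, and unit spreading rate), and it treats nodes as independent given the source. The function names (`bfs_distances`, `erlang_cdf`, `infer_source`), the toy path graph, and the clipping constant `eps` are hypothetical choices for illustration. The sketch relies only on the standard fact that the sum of d independent exponential hop times with a common rate follows an Erlang distribution with shape d.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop counts of shortest paths from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def erlang_cdf(d, rate, t):
    """P(sum of d i.i.d. Exp(rate) hop times <= t), the Erlang(d, rate) CDF."""
    if d == 0:
        return 1.0
    x = rate * t
    return 1.0 - math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(d))


def infer_source(adj, infected, t, rate=1.0, eps=1e-12):
    """Pick the infected node that maximizes the snapshot log-likelihood.

    Simplifying assumptions (not from the abstract): one shortest path per
    node pair (k = 1) and independent node states given the source.
    """
    best_node, best_ll = None, -math.inf
    for s in sorted(infected):  # sorted for a deterministic tie-break
        dist = bfs_distances(adj, s)
        ll = 0.0
        for v in adj:
            if v == s:
                continue
            # Unreachable nodes get infection probability 0 before clipping.
            p = erlang_cdf(dist[v], rate, t) if v in dist else 0.0
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            ll += math.log(p) if v in infected else math.log(1.0 - p)
        if ll > best_ll:
            best_node, best_ll = s, ll
    return best_node, best_ll


# Toy path graph 0-1-2-3-4-5-6 with nodes 2, 3, 4 observed as infected.
path_graph = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
estimated_source, log_likelihood = infer_source(path_graph, {2, 3, 4}, t=1.0)
```

Using more than one path per node pair would replace the single Erlang term with the probability that at least one of the k paths has delivered the signal by time t, under the independence assumption stated in the paragraph above.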

We apply Network Infusion (NI) to identify disease-causing genes of several human diseases including T1D, Parkinson’s, MS, SLE, CVD, CAD, psoriasis, and schizophrenia, and show that NI infers candidate disease-causing genes that are biologically relevant and often not distinguishable using the raw p-values. In a second application, we identify the news sources for 3553 stories in the Digg social news network, and validate our results based on annotated information that was not provided to our algorithm. We also apply NI to several synthetic networks and compare its performance to centrality-based and distance-based methods for Erdos-Renyi graphs, power-law networks, symmetric grids, and asymmetric grids.

We also provide proofs that under a standard susceptible-infected (SI) diffusion model, (1) the maximum-likelihood Network Infusion method is mean-field optimal for tree structures or sufficiently sparse Erdos-Renyi graphs, (2) the minimum-error algorithm is mean-field optimal for regular tree structures, and (3) for sufficiently-distant sources, our multi-source solution is mean-field optimal in the regular tree structure.


