Accepted Posters

Attention Conference Presenters - please review the Speaker Information Page available here.

If you need assistance please contact submissions@iscb.org and provide your poster title or submission ID.

Category O - 'Systems Biology and Networks'

O01 - Exploring Ikaros Regulation Networks in Leukemia

Xiaokang Pan, Pennsylvania, United States

Short Abstract: IKZF1 (Ikaros) is a gene regulator and an important tumor suppressor in leukemia. It binds DNA with unique forms. Ikaros also regulates gene expression and epigenetic changes via chromatin remodeling. To explore the mechanism of Ikaros tumor suppression in leukemia, we conducted ChIP-Seq and RNA-Seq experiments on Ikaros and histone modifications in several different human and mouse leukemia cell lines. We then developed a pipeline to analyze the data. This pipeline includes two key components. The first computes the binding sites of Ikaros and other associated factors in the region around the TSS of differentially expressed genes and visualizes them in a two-dimensional graphical view on the web. The second performs correlation analysis and permutation tests to quantitatively identify the relationships among gene expression values, binding sites in promoters or entire genes, and binding factors, and then displays these relationships as network diagrams on the web to analyze Ikaros regulation in leukemia in different human and mouse cells. In human Nalm6 cells, the resulting Ikaros regulation network shows that Ikaros regulates histone modifications by recruiting HDAC1, and that histone modifications function cooperatively in Ikaros regulation of gene expression.

O02 - Identification of Structurally Disordered Kinases as Potential Kinome Hubs

Jaymin Kathiriya, University of South Florida, United States

Short Abstract: Since aberrant cell signaling pathways underlie the majority of pathophysiological morbidities, kinase inhibitors are routinely used for pharmacotherapy. However, most kinase inhibitors suffer from adverse off-target effects. Inhibition of one kinase in a pathogenic signaling pathway elicits multiple compensatory feedback signaling loops, reinforcing the pathway rather than inhibiting it and leading to chemoresistance. Thus, it is imperative to develop novel computational strategies that provide predictive evidence for inhibiting a specific set of kinases to mitigate an aberrant signaling pathway with minimal side effects. Herein, we first demonstrate that most kinases contain intrinsically disordered regions, which facilitate kinome-wide protein-protein interactions and play significant roles in driving pathogenic signaling. Second, we employ a kinome-wide approach to identify structural disorder and streamline a methodology that can be useful in therapeutically targeting kinase cascades in a novel way to treat diseases. Further, we show that the kinases with extensive intrinsic disorder, by virtue of their high topological significance, play critical roles as kinome modulators. Third, using network analysis, we demonstrate that the 5 kinases that emerge as topologically most significant form kinome sub-networks comprising other kinases and transcription factors that are known drivers of disease pathogenesis. To support these findings, we have biologically validated the interplay between the kinome modulators SRC and AKT and uncovered their novel function in regulating transcription factors of the SMAD family. Taken together, we identify novel kinome modulators driven by structural disorder, demonstrating that therapeutic disruption of the function of kinome modulators engaged in regulatory cross-talk between disparate pathways can lead to reduced oncogenic potential in cancer cells.

O03 - Bayesian Network Analysis of Signaling Pathways Crosstalk in Colorectal Cancer

Pranav Srinivas, Monta Vista High School, United States

Short Abstract:

In this study we use a Bayesian network (BN) learned from Colon and Rectum Adenocarcinoma RPPA data to make inferences regarding network structure and pathway crosstalk. Instead of relying on genomic alterations (mutation, CNV and methylation) to deduce active pathways and crosstalk, we use conditional probability queries with evidence of low levels of the apoptotic signals Caspase-9 and Caspase-8 to derive active and collaborative pathways in the tumor phenotype. Our study shows that the JAK2/STAT5 and TGF-β/SMAD4 pathways are frequently deregulated in colon tumors with low CASP9 and CASP8 levels. GO term enrichment analysis of the hub proteins STAT5, SMAD4, BRAF and KRAS in the inferred causal network using BiNGO further confirms the role of the STAT5 and SMAD4 pathways in negative regulation of apoptosis.

We use a bootstrap sampling technique to learn 500 causal Bayesian networks from RPPA protein phosphorylation data and use model averaging to build a representative Bayesian network with high statistical significance. We then use conditional probability queries on the representative averaged BN to test conditions for low apoptosis and derive biologically meaningful pathway crosstalk. The bootstrap sampling and averaging technique was chosen primarily to minimize the impact of local optima on learning and subsequent causal inference. Such averaged networks are known to have better predictive performance than a single high-scoring network.
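The bootstrap-and-average step described above can be illustrated with a minimal sketch. The abstract does not say which structure learner or edge threshold was used, so `learn_edges` below is a deliberately simple stand-in (undirected edges from thresholded Pearson correlation), and the names `bootstrap_average`, `cutoff`, and `keep` are hypothetical. Only the resampling and edge-frequency averaging reflect the technique named in the text.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def learn_edges(rows, names, cutoff=0.5):
    """Stand-in structure learner (assumption, not a Bayesian network
    learner): link two variables when |correlation| exceeds `cutoff`."""
    cols = {n: [r[i] for r in rows] for i, n in enumerate(names)}
    return {
        (a, b)
        for a, b in combinations(names, 2)
        if abs(pearson(cols[a], cols[b])) > cutoff
    }


def bootstrap_average(rows, names, n_boot=500, keep=0.8, seed=0):
    """Learn one network per bootstrap resample and keep the edges that
    appear in at least a fraction `keep` of the learned networks."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # resample with replacement
        for edge in learn_edges(sample, names):
            counts[edge] = counts.get(edge, 0) + 1
    return {e: c / n_boot for e, c in counts.items() if c / n_boot >= keep}
```

Swapping `learn_edges` for a real score-based or constraint-based Bayesian network learner would leave the averaging loop unchanged. An edge that is an artifact of one resample rarely reaches the `keep` frequency.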

O04 - General principles underlying viral pathogenesis in humans

Hampapathalu Adimurthy Nagarajaram, Centre for DNA Fingerprinting and Diagnostics, India

Short Abstract: Viruses are a major cause of infirmity and death across the globe. During infection, viral proteins make several physical interactions with host proteins, as they depend on the host for survival and propagation. It is well known that the human protein-protein interaction (Hu-PPI) network is robust to random attacks owing to its inherent scale-free nature but can collapse in the event of targeted attacks. Viruses are known to target hubs, bottlenecks, and proteins involved in cell cycle regulation, transcription and signalling pathways, implying that the interactions between viral and host proteins are akin to a targeted attack by viruses on the Hu-PPI network. In order to decipher the general principles associated with viral interactions with human proteins, we integrated known Hu-PPI with viral-human PPI and investigated the topological roles and sequence-structural properties of the human proteins targeted by viral proteins (hVIPs) in Hu-PPI. Our studies revealed that the hVIPs occupy central positions in the Hu-PPI network. The hVIPs are associated with a large number of splice variants and multiple pathways, and are expressed ubiquitously, abundantly and in multiple cellular locations. Furthermore, the hVIPs are slow-evolving, mostly unstructured proteins enriched with eukaryotic linear motifs (ELMs). These findings confirm that viruses have evolved to target host proteins bestowed with conformational flexibility, information control, diversity and variation in Hu-PPI.

O05 - Revealing Missing Parts of the Interactome via Link Prediction

Yuriy Hulovatyy, University of Notre Dame, United States

Short Abstract: Protein interaction networks (PINs) are often used to "learn" new biological function from their topology. Since current PINs are noisy, their computational de-noising via link prediction (LP) could improve the learning accuracy. LP uses the existing PIN topology to predict missing and spurious links. Existing LP relies on shared immediate neighborhoods of the nodes to be linked. As such, it has limitations. In order to comprehensively study which topological properties of nodes in PINs dictate whether the nodes should be linked, we introduce novel sensitive LP measures that are expected to overcome the limitations of the existing methods. We systematically evaluate the new and existing LP measures by introducing "synthetic" noise into PINs and measuring how accurately the measures reconstruct the original PINs. Also, we use the LP measures to de-noise the original PINs, and we measure the biological correctness of the de-noised PINs with respect to functional enrichment of the predicted interactions. Our main findings are: LP measures that favor nodes which are both "topologically similar" and have large shared extended neighborhoods are superior; using more network topology often, though not always, improves LP accuracy; and LP improves the biological correctness of the PINs. Ultimately, we are focused less on identifying a superior method than on showing that LP improves the biological correctness of PINs, which is its ultimate goal in computational biology. But we note that our new methods outperform each of the existing ones with respect to at least one evaluation criterion.
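For readers unfamiliar with the baseline that the abstract contrasts against, the sketch below scores unlinked node pairs by their shared immediate neighborhoods using the Jaccard index. This is only one common shared-neighbor measure, chosen here as an illustration. It is not one of the new measures the authors introduce, and the function name `jaccard_scores` is hypothetical.

```python
from itertools import combinations


def jaccard_scores(edges):
    """Score every unlinked node pair by the Jaccard index of the two
    nodes' immediate neighborhoods (shared / combined neighbors)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(nbrs), 2):
        if v in nbrs[u]:
            continue  # already linked, nothing to predict
        union = nbrs[u] | nbrs[v]
        scores[(u, v)] = len(nbrs[u] & nbrs[v]) / len(union)
    return scores
```

To mimic the synthetic-noise evaluation described above, one could delete a known edge, rescore, and check how highly the deleted pair ranks. A pair with no common neighbors always scores zero under this measure, which is the kind of limitation that extended-neighborhood measures aim to address.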

O06 - Kinetic modeling of NF-kappaB signaling pathway

Kentaro Inoue, RIKEN Center for Integrative Medical Sciences (IMS-RCAI), Japan

Short Abstract: Background: Systems biology aims to understand rational design principles of complex biological systems. Mathematical modeling is a powerful tool to identify molecular mechanisms of the system and to predict and control the dynamical behaviors. In mammalian signaling systems, extra-cellular information is transmitted to intra-cellular signaling pathways followed by activation of transcription factors that control gene expression. As the transcription factor nuclear factor-kappa B (NF-kappaB) plays important roles in cell fate decision and is associated with several diseases, understanding the regulatory mechanism is important. Two major features are well-known for NF-kappaB dynamics, a switch-like response and oscillations in response to tumour necrosis factor (TNF). We recently found similar behavior in B cells upon BCR engagement. Currently, there is no mathematical model representing both phenomena in B cells.
Method: In an earlier study, we proposed a model to explain the switch-like response of NF-kappaB. Here we build on that work to show that the interplay of transcriptionally inducible negative-feedback loops can explain the observed oscillatory behavior.
Results: We simulated the time course and stimulus dose response for an expanded model of NF-kappaB activation in response to BCR engagement. The simulations showed both a switch-like response and oscillations in NF-kappaB activity. Furthermore, the simulations reproduced the experimentally observed effects of several genetic mutations.
Conclusion: The current study proposes the first model of the NF-kappaB signaling pathway that can reproduce both the switch-like response and the oscillatory behavior. This model provides important insights for understanding the regulatory mechanisms in the NF-kappaB signaling system.

O07 - Integrative Network Approaches Reveal Shared and Specific Organization of Network Modules in Arabidopsis Immune Response

Xiaobao Dong, China Agricultural University, China

Short Abstract: Pattern-triggered immunity (PTI) and effector-triggered immunity (ETI) are the two main forms of plant immune response to counter pathogen invasion. Large-scale mRNA expression profiling studies have suggested that ETI is a more efficient version of PTI with a faster and stronger immune response; however, the underlying mechanisms that give rise to this quantitative difference remain poorly defined. Beyond that, recent evidence also demonstrates distinct gene regulation in PTI and ETI signaling networks, indicating a more complex landscape of plant immunity. To systematically characterize and compare PTI and ETI signaling networks, we developed a network-based computational analysis framework that integrates a gene regulatory network with gene expression data of immune response to identify active network components in Arabidopsis PTI and ETI, respectively. We compared PTI and ETI at three network resolutions. First, at the single-gene level, our analysis successfully detected multiple known key regulators of plant immune response and also predicted LOV1 as a new immune regulator for both PTI and ETI. Second, at the subnetwork level, we found that PTI and ETI share a common subnetwork which is also a hotspot of pathogen virulence proteins, representing part of the core components of immune signaling networks. Finally, we constructed modular network models for PTI and ETI, respectively, to help explain the quantitative difference between them. Together, our results provide a comprehensive comparison between the signaling networks of PTI and ETI, and demonstrate the value of integrative network approaches for studies of plant pathology.

O08 - Dynamic networks reveal key players in aging

Tijana Milenkovic, University of Notre Dame, United States

Short Abstract: Since susceptibility to diseases increases with age, studying aging gains importance. Analyses of gene expression or sequence data, which have been indispensable for investigating aging, have been limited to studying genes and their protein products in isolation, ignoring their connectivities. However, proteins function by interacting with other proteins, and this is exactly what biological networks (BNs) model. Thus, analyzing the proteins' BN topologies could contribute to understanding of aging. Current methods for analyzing systems-level BNs deal with their static representations, even though cells are dynamic. For this reason, and because different data types can give complementary biological insights, we integrate current static BNs with aging-related gene expression data to construct dynamic, age-specific BNs. Then, we apply sensitive measures of topology to the dynamic BNs to study cellular changes with age.

While global BN topologies do not significantly change with age, local topologies of a number of genes do. We predict such genes as aging-related. We demonstrate credibility of our predictions by: 1) observing significant overlap between our predicted aging-related genes and "ground truth" aging-related genes; 2) observing significant overlap between functions and diseases that are enriched in our aging-related predictions and those that are enriched in "ground truth" aging-related data; 3) providing evidence that diseases which are enriched in our aging-related predictions are linked to human aging; and 4) validating our high-scoring novel predictions in the literature.
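One simple way to picture the integration of a static network with age-related expression data is sketched below. The abstract does not specify how the two data types are combined, so this assumes the simplest rule: an interaction is kept at a given age only if both genes are expressed at that age. The names `age_specific_networks` and `degree`, and the idea of a per-age set of expressed genes, are hypothetical.

```python
def age_specific_networks(static_edges, expressed_by_age):
    """For each age, keep the static interactions whose two endpoints are
    both in that age's set of expressed genes (assumed integration rule)."""
    return {
        age: {(u, v) for u, v in static_edges if u in active and v in active}
        for age, active in expressed_by_age.items()
    }


def degree(edges, gene):
    """Number of interactions in `edges` that involve `gene`."""
    return sum(gene in edge for edge in edges)
```

Comparing a local measure such as `degree` for the same gene across the age-specific networks gives a crude version of the "local topology changes with age" signal. The study itself applies more sensitive topological measures than node degree.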

O09 - MAGNA: Maximizing Accuracy in Global Network Alignment

Vikram Saraph, University of Notre Dame, United States

Short Abstract: Biological network alignment aims to identify similar regions between networks of different species. Existing methods compute node similarities to rapidly identify, from the possible alignments, the high-scoring alignments with respect to the overall node similarity. But the accuracy of the alignments is then evaluated with some other measure that is different from the node similarity used to construct the alignments. Typically, one measures the number of conserved edges. Thus, the existing methods align similar nodes between networks, hoping to conserve many edges (after the alignment is constructed!).

Instead, we introduce MAGNA to directly optimize edge conservation while the alignment is constructed, without decreasing the quality of node mapping. MAGNA uses a genetic algorithm and our novel function for "crossover" of two "parent" alignments into a "child" alignment to simulate a "population" of alignments that "evolves" over time; the "fittest" alignments survive and proceed to the next "generation", until the alignment accuracy cannot be optimized further. While we optimize our new and superior measure of edge conservation, MAGNA can optimize any alignment accuracy measure, including a combined measure of both node and edge conservation. In systematic evaluations against existing state-of-the-art methods (IsoRank, MI-GRAAL, and GHOST) on both synthetic networks and real-world biological data, MAGNA improves both node and edge conservation, as well as both topological and biological alignment accuracy, over all existing methods.
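To make "edge conservation" concrete, the sketch below computes one simple and widely used score: the fraction of edges in the first network that map onto edges in the second under a given alignment. This is an illustrative fitness function only. The abstract does not define the new measure that MAGNA optimizes, and the function name `edge_correctness` is an assumption.

```python
def edge_correctness(edges1, edges2, alignment):
    """Fraction of edges in network 1 whose aligned endpoints are also
    connected in network 2. `alignment` maps network-1 nodes to
    network-2 nodes; edges are treated as undirected."""
    target = {frozenset(e) for e in edges2}
    conserved = sum(
        frozenset((alignment[u], alignment[v])) in target for u, v in edges1
    )
    return conserved / len(edges1)
```

A search procedure such as a genetic algorithm could use a score like this directly as its fitness function, which is the design choice the abstract argues for: the quantity used to judge the alignment is the quantity being optimized.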

O10 - Exploring causal genes responsible for a phenotype of interest by the nonparanormal intervention-calculus

Reiji Teramoto, Forerunner Pharma Research Co., Ltd., Japan

Short Abstract: Intervention in gene function by knockdown or overexpression is widely used to identify genes that play important roles in many aspects of cellular functions and phenotypes. However, the conventional methods rely heavily on the assumption of normality, and they often give incorrect results when the assumption does not hold. To relax the Gaussian assumption in causal inference, we introduce the nonparanormal method to test conditional independence in the PC-algorithm. Then, we present NPN-IDA (Nonparanormal intervention-calculus when the DAG (directed acyclic graph) is absent), which incorporates the cumulative nature of effects through a cascaded pathway via causal inference for ranking causal genes against a phenotype, using the nonparanormal method for estimating DAGs. We demonstrate that causal inference with the nonparanormal method significantly improves the performance in estimating DAGs on synthetic data in comparison with the original PC-algorithm. Moreover, we show that NPN-IDA outperforms the conventional methods in exploring the regulators of flowering time in Arabidopsis thaliana and the regulators controlling the browning of white adipocytes in mouse. Our results show that improved performance in estimating DAGs contributes to improved estimation of causal effects. Despite being a simple alternative procedure, our proposed method enables us to design efficient intervention experiments and, owing to its generality, can be applied to a wide range of research purposes, including drug discovery.
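The core idea of the nonparanormal approach is to replace each variable with a monotone transform that makes it approximately Gaussian, after which Gaussian conditional-independence tests can be applied. The sketch below shows a rank-based version of that transform. It uses `rank / (n + 1)` as the empirical distribution estimate, which is a common simplification and an assumption here. The abstract does not state which estimator NPN-IDA uses, and the function name `normal_scores` is hypothetical.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map each value to the standard-normal quantile of rank / (n + 1).
    Ties are broken by original order (a simplification)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = NormalDist().inv_cdf(rank / (n + 1))
    return out
```

Because the transform depends only on ranks, a heavily skewed expression variable and its logarithm produce identical scores, which is why the subsequent independence tests become insensitive to monotone distortions of the data.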

O11 - Metabolic Network Approaches for the functional division of Bacterial Communities

Shany Ofaim, Technion, Israel

Short Abstract: Rapid advances in metagenomics and genome sequencing have led to the accumulation of vast amounts of empirical ecological data such as 16S, RNA-Seq etc. With the increase in ecological data production, the need for robust automated approaches to functional community analysis rises, creating an information-analysis gap. An accumulating body of evidence now supports the reliability of metabolic analysis, such as metabolic networks, as a tool for processing genomic data into information describing the 'lifestyle' of microbial species and the network of interactions they form within such communities. Metabolic networks are comprised of interconnected chains of chemical reactions that occur in a living organism to maintain life. It has been demonstrated that metabolic network approaches are a sufficient tool for the prediction of cellular activity and growth capacity under changing conditions at the single-species level. The integration of multiple single-species networks into a communal metabolic network representation allows for the investigation of the functional division between its participants, showing the metabolic hierarchy in the sampled environment. Here we demonstrate the use of such metabolic network approaches in the functional division analysis of communities in the rhizosphere and bulk soil environments, based on RNA-Seq data collected for cucumber and wheat crops in Israel. By applying the Expansion algorithm, we simulate community activity following the systematic deletion of community members, which allows us to delineate each member's unique contribution to community metabolism. For example, we identify species' contributions to N-, S- and glycan metabolism.
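The deletion experiment described above can be sketched with a basic network-expansion procedure: starting from a set of seed metabolites, repeatedly fire every reaction whose substrates are all available and add its products. The data layout (each reaction as a `(substrates, products)` pair, the community as a mapping from member name to its reactions) and the function names are assumptions made for illustration. The abstract does not describe the authors' implementation.

```python
def expand(reactions, seeds):
    """Return every metabolite reachable from `seeds`: repeatedly fire any
    reaction whose substrates are all available and add its products."""
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= scope and not set(products) <= scope:
                scope |= set(products)
                changed = True
    return scope


def unique_contribution(community, seeds, member):
    """Metabolites the community can no longer produce without `member`."""
    full = expand([r for rs in community.values() for r in rs], seeds)
    rest = expand(
        [r for name, rs in community.items() if name != member for r in rs],
        seeds,
    )
    return full - rest
```

For a toy two-member community in which one species reduces nitrate to nitrite and the other converts nitrite to ammonia, removing the first species removes both products, while removing the second removes only ammonia. Differences of this kind are what reveal a metabolic hierarchy among community members.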

O12 - Subtype-specific Pathway Crosstalk in Breast Cancer

Michael Seiler, Boston University, United States

Short Abstract: Traditionally, functional gene or protein-based pathways in systems biology are thought of as functional modules representing isolated, context-dependent biological processes. However, due to factors such as gene coregulation and gene sharing between modules, the activity of one module can influence the activity of another, which we call pathway “crosstalk.” We have developed a method to identify pathway crosstalk using condition-specific mRNA expression information in conjunction with a known human functional linkage network (FLN), a network which connects human genes based on the strength of their functional associations. We apply this method to results derived from RNA sequencing data obtained from breast cancer tumors as well as normal breast tissue. Using this method, we build independent pathway crosstalk networks for multiple clinical subtypes of breast cancer as well as normal breast, observing changes in network topology related to carcinogenesis. Further, we show that pathways in each subtype identified as either dysregulated or “driver” pathways in cancer, using expression and mutation information, respectively, are significantly more likely to exhibit crosstalk interconnectivity not found in the normal breast tissue network. All breast tumor pathway crosstalk networks are available for download at visant.bu.edu.

O13 - Module Cover – A New Approach to Genotype-Phenotype Studies

Yoo-Ah Kim, National Institutes of Health, United States

Short Abstract: Uncovering and interpreting phenotype/genotype relationships is among the most challenging open problems in disease studies. Set cover approaches are explicitly designed to provide a disease signature set for diverse disease samples and thus are valuable in studies of heterogeneous datasets. At the same time, pathway-centric methods have emerged as key approaches that significantly empower studies of genotype-phenotype relationships in complex diseases such as cancer. Combining the utility of set cover with the power of network-centric approaches, we designed a novel approach that extends the concept of set cover to network module cover. We developed two alternative methods to solve the module cover problem: (i) an integrated method that simultaneously determines network modules and optimizes the coverage of disease cases; and (ii) a two-step method in which we first determine a candidate set of network modules and subsequently select the modules that provide the best coverage of the disease cases. The integrated method showed superior performance in the context of our application. We applied our algorithm to several cancer datasets and demonstrated the utility of the module cover approach for the identification of groups of related genes whose activity is perturbed in a coherent way by specific genomic alterations, allowing the interpretation of the heterogeneity of cancer cases.
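The selection step of the two-step method can be illustrated with the classic greedy heuristic for set cover: repeatedly pick the module that covers the most still-uncovered disease cases. The greedy rule is a standard substitute chosen for this sketch. The abstract does not say how the authors solve the selection problem, and the names `greedy_cover`, `modules`, and `cases` are hypothetical.

```python
def greedy_cover(modules, cases):
    """Greedy set cover. `modules` maps a module name to the set of
    disease cases it covers. Returns (chosen module names, cases left
    uncovered)."""
    uncovered = set(cases)
    chosen = []
    while uncovered and modules:
        best = max(modules, key=lambda m: len(modules[m] & uncovered))
        if not modules[best] & uncovered:
            break  # no module covers any of the remaining cases
        chosen.append(best)
        uncovered -= modules[best]
    return chosen, uncovered
```

The integrated method described above differs in that the modules themselves are determined while coverage is optimized, rather than being fixed in advance as they are in this sketch.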

O14 - ResponseNet, TissueNet and myGeneNet: new ways to create and explore molecular networks

Omer Basha, Ben Gurion University in the Negev, Israel

Short Abstract: Knowledge of protein-protein interactions (PPIs) is important for identifying the functions of proteins and the processes they are involved in. Although data on human PPIs are easily accessible through several public databases, these databases do not specify the human tissues in which these PPIs take place, or point out possible pathways through them. In addition, in order to create a network from these databases one needs to download, parse and consolidate them. The network biology tool suite aims to solve most of these problems. The TissueNet database of human tissue PPIs associates each interaction with the human tissues that express both pair mates. The myGeneNet webserver lets you build protein-protein networks automatically and filter them by gene expression data for a specific tissue. Lastly, the ResponseNet2.0 webserver identifies high-probability pathways in your networks connecting your proteins and genes. All of our applications use our web-based graphical representation of networks, designed to present additional data in an intuitive way. Thus, the network biology suite provides a unique platform for assessing the roles of proteins and their interactions across tissues and diseases.

O15 - Enhancing ChEBI to support systems biology and metabolic modelling

Janna Hastings, European Bioinformatics Institute

Short Abstract: ChEBI (http://www.ebi.ac.uk/chebi) is a curated database and ontology of biologically relevant small molecules. It is widely used as a reference for chemicals in the context of biological data such as protein interactions, pathways, and models. Systems biology brings together a wide range of information about cells, genes and proteins, as well as the small molecules that act on and within these biological structures. It gives a holistic perspective aiming to track and eventually simulate the entire functioning of biological systems. Chemical information plays a key role in these models and simulations. Within this context, efforts are under way to enhance ChEBI for the systems biology and metabolic modelling communities. Enhancements include the addition of a library for comprehensive programmatic access (libChEBI), provision of automatic ontology classification and support for bulk submissions, curation of the known metabolomes across four major species, and the introduction of novel visualizations of relevance to the systems biology community. The libChEBI library will include the facility to specify concrete instances of generic metabolites and to extract biologically relevant groupings or clusters of compounds, the ability to calculate important physicochemical properties and determine relationships such as redox between molecules, and the facility to detect chemical and semantic errors in models and metabolic reconstructions. This poster will present these new features of ChEBI.

O16 - Integrated high throughput phenotype profiling for metabolic network refinement of C. reinhardtii

Amphun Chaiboonchoe, New York University Abu Dhabi, United Arab Emirates

Short Abstract: We previously reported a genome-scale metabolic network reconstruction of the unicellular green alga Chlamydomonas reinhardtii (Chang et al., Mol Syst. Biol., 2011); the model was experimentally validated. In this study, we further refine the model by integrating the cellular phenotypes identified using the OmniLog high-throughput Phenotype MicroArray™ (PM). The OmniLog is an in vivo assay that measures the respiration of cells as a function of time in thousands of microwells simultaneously. Each PM plate contains 96 wells pre-filled with different metabolites and is monitored automatically over time via the OmniLog system. Metabolite utilization by the cell is determined by the amount of color development produced by a tetrazolium-based redox dye. Here, we have adapted the OmniLog platform for use in algal research and developed protocols for various strains. Nearly 1000 assays were conducted for carbon, phosphorus and sulfur source utilization, peptide nutrient stimulation, osmotic stresses and pH tolerance. Data analysis was performed using the opm package (Vaas et al., 2013). Our results validated the presence of a large number of reactions accounted for in the model. We also identified a significant number of novel metabolites that are not described in our current in silico model, for instance, in novel sulfur and nitrogen utilization pathways. This study describes a high-throughput method to bridge genomics with metabolomics and characterize phenotypic states of algal cells.

O17 - A Dynamic Bayesian Network Framework to Infer Gene Regulatory Networks from Multiple Types of Biological Data

Serdar Bozdag, Marquette University, United States

Short Abstract: One of the challenging and important computational problems in systems biology is to infer gene regulatory networks of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. Due to the limitations of the gene expression data, these approaches resulted in many spurious regulatory interactions. Additional information such as copy number and DNA methylation could be integrated to improve the accuracy of the results. Here, we report the first dynamic Bayesian network-based framework that integrates gene expression, copy number and DNA methylation to infer gene regulatory networks. We show that DNA methylation between genes that have a regulatory relationship is more correlated than DNA methylation between other genes. We also show that there is a higher correlation between copy number and gene expression of genes that have a regulatory relationship than correlation between copy number and gene expression of other genes. Our results show that the integration of copy number and DNA methylation as a network prior improves the accuracy of the network construction. Our approach could easily integrate any number of other biological datasets such as transcription factor binding and/or literature.

O18 - Region-specific molecular signatures in the human and mouse brain

Emma Myers, Boston University, United States

Short Abstract: A great deal is known about shared neuroanatomy between humans and other species. The extent to which neuroanatomical homologies reflect similarities in the molecules expressed within brain areas is less well understood. Recently available databases of gene expression with high spatial resolution afford an opportunity to study the molecular underpinnings of brain regionalization within and across species. We have implemented an approach to identify region-specific molecular signatures that are consistent between the human and mouse brain; i.e., to determine gene sets with (i) consistent expression in a specific brain region for the mouse and human brain, and (ii) distinct expression from other regions. Data are collated from the Allen Brain Atlases for mouse and human using ~3000 orthologous genes. Human data are six large microarray datasets from neurologically normal adults. Mouse data are from high-resolution in situ hybridization studies in adult mice. By defining an expression profile for a human region and correlating this profile with each voxel in the mouse data, we obtain similarity maps, depicting cross-species correspondences that tend to strongly reflect anatomical homologies. We use data-driven methods to partition the full gene set into subsets, and quantify the extent to which these subsets meet our criteria for cross-species region-specific signatures using the similarity maps and a novel scoring system. We are examining overrepresented gene functions and pathways to determine if our results suggest hypotheses about conservation of molecular processes that underlie common functional roles for these brain areas in mouse and human.

O19 - MINE: A Novel Computational Algorithm for Gene Network Identification

Michael Molla, EnBiotix Inc, United States

Short Abstract: A perturbation to a biological system results in changes to molecular processes, signaling networks, and the constituent genes. These changes reflect the mode-of-action (MOA) of the perturbation and, if properly characterized, can be used to gain insights into how the perturbation acts. When gene-expression measurements are coupled with appropriate computational methods, the efficiency and accuracy of MOA determination can be dramatically improved by reducing the number of targets that need to be probed experimentally for definitive MOA determination. Our novel MINE approach (Mode-of-action by Iterative Network Expansion) leverages the strengths of two well-established computational methods: MNI (Mode-of-action by Network Identification) and CLR (Context Likelihood of Relatedness). MNI employs ordinary differential equations to interrogate a data compendium and reverse-engineer a gene network at the resolution of "Metagenes" (groups of genes with similar expression profiles). CLR utilizes the information-theoretic measure of mutual information to determine pairwise gene-to-gene connections. MINE iteratively uses both methods to allow for enhanced characterization of the biological processes underlying a perturbation. An expansion/pruning algorithm enables MINE to identify the sub-networks of influence and derive refined insight from large, compiled compendia of cross-platform gene-expression data. Details of this approach as well as initial results in microbial and mammalian datasets will be presented. These in silico methods can elucidate regulatory mechanisms and possible metabolic changes associated with perturbations to any system, which will ultimately lead to improved strategies for targeted therapies in many fields.
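The CLR component mentioned above scores a gene pair by how unusual their mutual information (MI) is relative to each gene's own background of MI values. The sketch below follows the commonly described form of that scoring: a per-gene z-score, negatives clipped to zero, combined as the square root of the sum of squares. It assumes a precomputed symmetric MI matrix and excludes the diagonal from each background, which is a choice made here for illustration. It does not attempt to reproduce MNI or the iterative expansion/pruning that MINE adds, and the name `clr_scores` is hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def clr_scores(mi):
    """Turn a symmetric mutual-information matrix into CLR-style scores:
    z-score each MI value against both genes' background distributions,
    clip negatives to zero, and combine as sqrt(z_i**2 + z_j**2)."""
    n = len(mi)
    stats = []
    for i in range(n):
        row = [mi[i][j] for j in range(n) if j != i]  # background for gene i
        stats.append((mean(row), pstdev(row)))

    def z(i, j):
        mu, sd = stats[i]
        return max(0.0, (mi[i][j] - mu) / sd) if sd > 0 else 0.0

    return {
        (i, j): sqrt(z(i, j) ** 2 + z(j, i) ** 2)
        for i in range(n)
        for j in range(i + 1, n)
    }
```

The background correction is what distinguishes this from ranking raw MI values: a pair only scores highly if its MI stands out for both genes, which suppresses genes that share moderate MI with everything.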

O20 - Template-Based Interventions in Boolean Network Models of Biological Systems

Michael Verdicchio, The Citadel, United States

Short Abstract: A grand challenge in the modeling of biological systems is the identification of key variables which can act as targets for intervention. Boolean networks are among the simplest of models, yet have been shown to adequately model many complex dynamics of biological systems. In our recent work we identified quality single-variable targets for intervention from smaller Boolean network models. However, as the number of variables in a network increases, it becomes more likely that a successful intervention strategy will require multiple variables. Thus, for larger networks, a multiple-variable approach is required in order to identify more complex intervention strategies while working within the limited view of the network's state space.
We introduce a multiple-variable intervention target called a template and show through simulation studies of random networks that these templates are able to identify top intervention targets in increasingly large Boolean networks. We first show that when other methods show drastic loss in performance, template methods show no significant performance loss between fully explored and partially sampled Boolean state spaces. We also show that, when other methods show a complete inability to produce viable intervention targets in sampled Boolean state spaces, template methods maintain significantly consistent success rates even as state space sizes increase exponentially with larger networks. Finally we show the utility of the template approach on a real world Boolean network modeling T-LGL leukemia.
Our results demonstrate how template-based approaches can now take over for single-variable approaches and produce quality intervention targets in larger Boolean network models.
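To make the contrast between single-variable and multiple-variable interventions concrete, here is a minimal sketch on a hypothetical three-variable Boolean network. The update rules, the synchronous update scheme, and the reading of a template as a partial assignment clamping several variables are all assumptions for illustration, not the authors' definitions.

```python
from itertools import product

def step(state, clamp):
    """One synchronous update of a toy network; clamped variables keep their forced value."""
    x0, x1, x2 = state
    nxt = [
        x0 and not x1,  # x0 is inhibited by x1
        x1 and not x0,  # x1 is inhibited by x0
        x0 and x1,      # x2 (the desired output) needs both
    ]
    for idx, val in clamp.items():
        nxt[idx] = val
    return tuple(bool(v) for v in nxt)

def attractor(state, clamp):
    """Follow the trajectory until a state repeats; return the states on the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, clamp)
    return seen[seen.index(state):]

def success_rate(clamp, target=2):
    """Fraction of initial states whose attractor keeps the target variable on."""
    states = list(product([False, True], repeat=3))
    hits = 0
    for s in states:
        s = list(s)
        for idx, val in clamp.items():
            s[idx] = val  # apply the intervention to the initial state too
        if all(st[target] for st in attractor(tuple(s), clamp)):
            hits += 1
    return hits / len(states)

rate_none = success_rate({})
rate_single = success_rate({0: True})
rate_template = success_rate({0: True, 1: True})
```

In this toy network neither single-variable clamp can sustain x2, while clamping x0 and x1 together succeeds from every initial state. For larger networks the full enumeration in `success_rate` would be replaced by a sample of initial states.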

O21 - Modulation of Gene Expression Regulated by Transcription Factor

Xueling Li, University of Texas Medical Branch, United States

Short Abstract: In the post-genomic era, one important task is to uncover the functional involvement of molecular constituents perturbed in a cell or tissue under different processes. An integrated molecular network is a critical approach to completing this task. By integrating heterogeneous data, including genome-wide protein-protein interactions and a gene regulatory network of 204 transcription factors based on a large compendium of gene expression profiles, we constructed a systems-scale modulatory network. We found that, even without protein-protein constraints, our modulatory network can recover the underlying transcriptional pathways and reveal new transcriptional pathways. By mapping the differentially expressed genes in published data, including mesenchymal transition processes, to the modulatory network, we can uncover the important molecular systems in these processes, which consist of transcription factor drivers, signaling protein hubs and the involved regulatory pathways. We noticed different groups of modulators, including RNA-binding proteins, chromatin remodeling proteins, cytoskeleton proteins and mitochondrion proteins, which perform characteristic functions in gene expression modulation. The results will shed light on the regulatory mechanisms and provide testable hypotheses in the epithelial-mesenchymal transition process, which provides important therapeutic intervention targets in cancer metastasis and progression.

O22 - Systematic and fair evaluation of global network aligners

Joseph Crawford, University of Notre Dame, United States

Short Abstract: Analogous to genomic sequence alignment, biological network alignment identifies topologically and functionally conserved regions between networks of different species. Then, biological function can be transferred from well- to poorly-annotated species between aligned network regions. Network alignment consists of two algorithmic components, a node cost function and an alignment strategy. Since different existing network alignment methods use both different node cost functions and different alignment strategies, it is not clear whether the superiority of a method comes from its node cost function, its alignment strategy, or both. Thus, here we fairly evaluate state-of-the-art network alignment approaches by mixing and matching their node cost functions and alignment strategies, in hope that a combination of the node cost function of one existing method and the alignment strategy of another existing method would beat all existing methods. While doing so, we approach an additional important research question that has not been asked systematically in the context of network alignment thus far: we ask how much of network topology should be considered within the node cost function (for example, is considering "k hop" neighborhood around a node more desirable than considering its "k-1 hop" neighborhood), as well as how much of the node cost function information should come from protein sequence data compared to network topology data.
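The mix-and-match idea can be sketched as two independent pieces: a node scoring function that blends topological and sequence information, and an alignment strategy that consumes the scores. In the sketch below, the weight `alpha`, the use of similarities rather than costs, the greedy strategy, and all node names and values are assumptions for illustration; they do not correspond to any specific published aligner.

```python
def combined_score(topo_sim, seq_sim, alpha):
    """Blend two node-pair similarities: alpha from topology, (1 - alpha) from sequence."""
    return {
        pair: alpha * topo_sim[pair] + (1 - alpha) * seq_sim.get(pair, 0.0)
        for pair in topo_sim
    }

def greedy_align(scores):
    """A simple alignment strategy: take the best-scoring unused pair first."""
    aligned = {}
    used_targets = set()
    for (u, v), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if u not in aligned and v not in used_targets:
            aligned[u] = v
            used_targets.add(v)
    return aligned

# Hypothetical similarities between nodes of network A (a1, a2) and network B (b1, b2).
topo = {("a1", "b1"): 0.9, ("a1", "b2"): 0.2, ("a2", "b1"): 0.6, ("a2", "b2"): 0.5}
seq = {("a1", "b1"): 0.1, ("a1", "b2"): 0.8, ("a2", "b1"): 0.9, ("a2", "b2"): 0.1}

topology_only = greedy_align(combined_score(topo, seq, alpha=1.0))
sequence_only = greedy_align(combined_score(topo, seq, alpha=0.0))
```

Because the scoring function and the strategy are separate, either can be swapped without touching the other, which is what makes a like-for-like comparison possible. In this toy data the topology-only and sequence-only settings produce opposite alignments.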

O23 - DRUGMNEM: An optimization strategy for targeted combination of drugs using single-drug screening single cell data

Benedict Anchang, Stanford University, United States

Short Abstract: Accumulating evidence implicates intratumor heterogeneity as an important challenge to cancer treatment. To address this challenge, there has been a recent surge in clinical trials of combination therapy for various cancer types; however, a principled framework for rational and unbiased design of optimal drug combinations is needed. We reason that by simultaneously targeting multiple key pathways across different cell types at onset, we will decrease the likelihood of emerging resistant populations. Our goal is to develop an optimization framework for effective combination therapy using cell population data that reveals heterogeneity in inter- and intra-cellular signaling at the level of single cells. Here we introduce an algorithm named DRUGMNEM that combines cell state identification, nested effect modeling and link analysis to optimize possible novel combination therapeutic strategies from experimentally derived known multi-drug screening, single cell data that may yield better prognostic results compared to the known monotherapies. We apply DRUGMNEM to normal hematopoietic drug screening single cell data comprising 10 surface markers and 14 intracellular protein expression responses measured 30 minutes after administration of 5 Jak and Bcr inhibitors at 2 dose levels (maximum and no dose) under different stimulations. Under each stimulation, we show that DRUGMNEM is optimized across the entire single cell drug screening data and produces a reduced drug combination set that has the potential for maximally targeting different signals in all major and rare cell types. Experimental validation by comparing the progression of the cells after treatment with the derived optimized drug combination cocktail is ongoing.

O24 - Progressive promoter element combinations classify conserved plant gene expression modules

Sandra Smieszek, Royal Holloway, University of London,

Short Abstract: We aimed to test the proposal that progressive combinations of multiple promoter elements acting in concert may be responsible for the full range of phases observed in plant circadian output genes. In order to allow reliable selection of informative phase groupings of genes for our purpose, intrinsic cyclic patterns of expression were identified using a novel, non-biased method for the identification of circadian genes. Our non-biased approach identified two dominant, inherent orthogonal circadian trends underlying publicly available microarray data from plants maintained in constant conditions. Furthermore, these trends were highly conserved across several plant species. Four phase-specific modules of circadian genes were generated by projection onto these trends and, in order to identify potential combinatorial promoter elements that might classify genes into these groups, we used a random forest pipeline, which merged data from multiple decision trees, to look for the presence of element combinations. We identified a number of regulatory motifs which aggregated into coherent clusters capable of predicting the inclusion of genes within each phase module with very high fidelity. These motif combinations changed in a consistent, progressive manner from one phase module group to the next, providing strong support for our hypothesis.

O25 - A Novel Method for Elucidating microRNA and Transcription Factor Co-regulatory Networks in Prostate Cancer

Youngmi Yoon, Gachon University, Korea, Rep

Short Abstract: MicroRNAs (miRNAs) and transcription factors (TFs) are critical gene regulators. Their mechanisms may provide insights into fundamental aspects of cancer biology and help discover potential therapeutic targets in cancers. However, miRNA and TF co-regulatory mechanisms are difficult to study by experimental methods. Even with computational approaches, miRNA and TF co-regulatory networks are too complicated to fully reveal their condition-specific functions, for instance the interplay between regulators and feedback and feedforward regulation. Here, we propose a new method that integrates mRNA and miRNA expression data, chromatin immunoprecipitation with sequencing data, computational miRNA and TF target predictions, and protein-protein interactions (PPIs). For RNA sequencing data sets, miRNA-mapped and mRNA-mapped reads were used to quantify miRNA and mRNA expression, respectively. The expression of mRNA was modeled as a linear combination of expression values of mapped regulators and genes. This regression model was used to construct a prostate cancer-specific co-regulatory network. Furthermore, functional clusters were extracted from the co-regulatory network, and we evaluated and analyzed these clusters. The results elucidate the complex gene regulatory mechanisms for prostate cancer, and the discovered interactions and molecular functions are examined against the literature. In addition, we identified several significant interactions which can be potential candidates for experimental validation.

O26 - Reductive evolution in Mycobacterium leprae by comparison with Mycobacterium tuberculosis using functional protein-protein interaction networks

Richard Akinola, University of Cape Town, South Africa

Short Abstract: Mycobacterium leprae (MLP) and Mycobacterium tuberculosis (MTB) share a common ancestor. Although the two organisms are similar, the former has a reduced genome compared to the latter. The genome of Mycobacterium leprae is approximately 1.4 Mb smaller than that of Mycobacterium tuberculosis. The reduction in the genome of MLP is due to the inactivation of some genes to pseudogenes as well as the deletion of genes. In an earlier study, functional networks for both organisms were generated and extensive computational analyses were conducted to reveal the biological organization of the organisms on the basis of their network topological properties. In this work, we use the generated networks and ortholog data downloaded from public databases to identify 59 non-orthologous MTB hub proteins (and their neighbours) with high betweenness and high GC content that were deleted from the MLP proteome, thus causing a reduction in its genome; this might explain why MLP has the lowest GC content of all known mycobacteria. We show that reductive evolution in MLP was a result of deletions of important proteins from neighbours of the corresponding orthologous MTB protein; that is, while each orthologous MTB protein had insertions in most instances, the corresponding MLP protein suffered deletions. Out of the 1277 orthologous proteins in both networks, we identified 1188 and 89 proteins with conserved and divergent functional classes, respectively, in both organisms.

O27 - Structural Toll-like Receptor Pathway May Illuminate Its Roles in Inflammation and Cancer Crosstalk

Emine Guven-Maiorov, Koc University, Turkey

Short Abstract: Inflammation is crucial for defending against pathogens, maintaining homeostasis and healing wounds. Inflammation should be strictly regulated; if not finely tuned, it can lead to oncogenesis. The TLR pathway orchestrates both innate and adaptive immune systems and has an essential role in inflammation. Although extremely useful, the classical representation of pathways in terms of nodes and edges is incomplete: it shows which proteins interact but not how. Atomic details of interactions also elucidate which parallel pathways can co-exist and how mutations affect protein interactions, change the cellular outcome and support malignancies. The TLR pathway plays a central role in inflammation and cancer crosstalk, and construction of its structural pathway provides insights into its mechanism of action in the tumor microenvironment. Here, we constructed the structural TLR pathway by employing a powerful algorithm, PRISM (PRotein Interactions by Structural Matching), mapped clinically observed oncogenic mutations of the key adaptor molecules and checked their effect on the interactions. Remarkably, we found that parallel pathways of the TLR network are mutually exclusive: TRAF6, TRAF3, and FADD – which induce pro-inflammatory cytokines, interferons and the anti-inflammatory cytokine IL-10, and apoptosis, respectively – compete to bind to overlapping interfaces on MyD88. We also found that the C27* nonsense mutation on the FADD protein abolishes its interaction with MyD88 and thus prevents apoptosis. If FADD can no longer occupy the MyD88 binding site, TRAF6 is free to bind, allowing constitutive activation of MAPKs and production of pro-inflammatory cytokines. This may explain how the C27* mutation on FADD contributes to the initiation or progression of tumors.

O28 - Inferring the Transcriptional Regulatory Network in Maurer’s Cleft Pathway of Plasmodium falciparum

Itunuoluwa Isewon, Covenant University, Nigeria

Short Abstract: Biological networks are representations of molecular interactions within a cell. Recent technical advances in high throughput technologies and computational techniques have made it possible to survey these complex interactions by modeling them as networks. Transcriptional regulatory networks (TRNs) can be inferred from the combinatorial interaction between trans-acting transcription factors (TFs), the regulators, and cis-regulatory DNA elements (target genes). The Maurer's clefts (MCs) are very important to the survival of Plasmodium falciparum (P.f.) within an infected cell, as they are induced by the parasite itself in the erythrocyte for protein trafficking. Virulence proteins as well as other proteins used to remodel the erythrocyte are exported. Here, we describe and analyze the transcriptional network controlling the formation of MCs within the cell using an integrated approach. Our approach combines prior knowledge of known TFs and their interacting regulatory elements (experimentally confirmed and computationally predicted ones) with newly predicted ones. Our in-silico prediction was done using reverse engineering methods in combination with data gathered from high throughput gene expression profiling. Presently, about 113 genes are known to belong to this pathway in P.f., and regulation for 110 genes was found. New TFs with corresponding target genes have been predicted, and a comprehensive catalog of these interactions was compiled in order to construct a TRN for this pathway. The constructed TRN provides a network-level understanding of transcriptional regulation within this pathway, which we believe will help in the quest to eradicate the parasite.

O29 - PheNetic: “Omics” Interpretation Using Interaction Networks

Bram Weytjens, KU Leuven, Belgium

Short Abstract:

Dries De Maeyer, Bram Weytjens, Joris Renkens, Luc De Raedt, Kathleen Marchal

Omics experiments are becoming a standard in wet-lab practice to study molecular phenotypes. Such experiments usually yield lists of genes involved in the phenotype of interest. Analysis of these lists, commonly done by enrichment analysis, is useful for gaining a first idea of the processes involved but provides only limited molecular insight into the origin of the phenotype. By overlaying the in-house data with publicly available information, a much richer interpretation can be obtained (Cloots et al., 2011).

Therefore we present PheNetic (De Maeyer et al., 2013), a network-based approach that uses interaction networks constructed from publicly available omics data to guide the data interpretation. The probabilistic querying algorithm underlying PheNetic allows 1) selecting key genes involved in the process under study which were not recovered from the experimental analysis and 2) providing insight into the molecular mechanism of the phenotype under study.

To illustrate the usefulness of our network-based approach in prokaryotic model organisms, we show how PheNetic can be used to interpret large scale differential expression measurements and RNA-seq time series in E. coli and S. enterica strains.

O30 - Guidance for RNA-seq co-expression network construction and analysis: safety in numbers

Sara Ballouz, Cold Spring Harbor Laboratory, United States

Short Abstract: RNA-seq offers profound biological and technical advantages over microarray technologies, most usefully being able to detect the whole transcriptome. Although differential expression analysis is a more common means for interpreting transcriptomic data, co-expression analysis is far more routine in the context of meta-analysis, with thousands of expression profiles aggregated to generate robust signatures using repurposed data. Co-expression methods are already available to allow the meta-analysis of disparate datasets with quite different properties, subsuming most of the ambiguities that still exist in analyzing RNA-seq data.
Co-expression analysis is typically based on the correlation (or similar) of expression levels from microarray data sets and we applied a parallel approach to characterizing the extant public RNA-seq data. We used 50 separate RNA-seq experiments across 1,970 individual samples and a union of 30,705 RNA species (20,027 coding and 10,678 non-coding) to generate a reference co-expression network. Each node of the network represents an RNA species, and the edges are weighted by their correlation of expression.
We demonstrate that the network generated from RNA-seq encodes known biology (as captured by GO, KEGG and Reactome) through systematic recapitulation of functional connectivity. Perhaps surprisingly, we find the known dependency in microarray co-expression on sample sizes is almost identical in RNA-seq, suggesting hundreds to thousands of samples are necessary to obtain strong co-expression network performance. Further to this, we also find a high dependence of co-expression on the read depth per sample.
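The network construction described above, with nodes as RNA species and edges weighted by correlation of expression, can be sketched in a few lines. This minimal example assumes Pearson correlation on already normalized expression values; the gene names and numbers are hypothetical, and the aggregation across experiments is not shown.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_network(expr):
    """Return {(gene1, gene2): correlation} for every unordered gene pair."""
    return {
        (g1, g2): pearson(expr[g1], expr[g2])
        for g1, g2 in combinations(sorted(expr), 2)
    }

# Toy expression profiles across four samples (illustrative values only).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
toy_network = coexpression_network(toy_expr)
```

With only four samples, correlations like these are extremely noisy, which is the point the abstract makes about needing hundreds to thousands of samples.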

O31 - Optimal Design of Experiments for Gene Regulatory Network Inference

Rudiyanto Gunawan, ETH Zurich, Switzerland

Short Abstract: The inference of gene regulatory networks (GRNs) from microarray data is an important problem that has remained unsolved. The inference is known to be underdetermined, and there can be many equivalent solutions. Much of the challenge can be attributed to the lack of information in the data to differentiate direct from indirect regulations. We have recently developed a framework to identify an ensemble of GRNs that are indistinguishable based on a given expression dataset from gene knock-out (KO) or perturbation experiments. The size of the ensemble thus represents the uncertainty in the network inference. Here, we present an algorithm utilizing the predicted ensemble to propose KO experiments. The design of experiments is based on evaluating the number of direct and indirect gene regulations within the ensemble that can be potentially verified using expression data from a gene KO experiment. We have used the concept of graph separators and a genetic algorithm to rank experiments, each involving KO of a set of genes, satisfying pre-specified constraints (e.g. the maximum number of gene KOs in a single experiment and the exclusion of essential genes). The experiments can be performed either sequentially following the ranking or iteratively, in which case the ranking is updated using data from new experiments. We have applied the algorithm to the DREAM4 challenge and E. coli GRNs and implemented the experiments in silico using GeneNetWeaver. We demonstrate that our algorithm vastly outperforms a systematic KO study in the number of required experiments and network accuracy, as measured by AUROC and AUPR.
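For readers unfamiliar with the accuracy measure, AUROC for an inferred network can be computed from edge confidence scores and a gold-standard edge list. The sketch below uses the pairwise-comparison definition (the probability that a true edge is scored above a non-edge, with ties counted as half); the scores and labels are hypothetical, and this is not the evaluation code used by the authors or by DREAM4.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: confidence assigned to each candidate edge.
    labels: 1 if the edge is in the gold-standard network, else 0.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Four candidate edges with illustrative confidences and gold-standard labels.
edge_scores = [0.9, 0.8, 0.4, 0.3]
edge_labels = [1, 0, 1, 0]
toy_auroc = auroc(edge_scores, edge_labels)
```

Here three of the four (true edge, non-edge) pairs are ranked correctly, giving an AUROC of 0.75. The quadratic pairwise loop is fine for a sketch, but a rank-based computation would be preferable for genome-scale edge lists.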

O32 - Identification of differentially expressed gene modules from protein-protein interactions and transcriptional correlations in primary and metastatic endometrial lesions

Kanthida Kusonmano, University of Bergen, Norway

Short Abstract: We propose a network-based approach to identify differentially expressed subnetworks between aggressive primary tumors and metastases in endometrial cancer, the most common pelvic gynecological malignancy. The subnetworks are built by combining protein-protein interaction data and microarray-based transcriptional profiles for 66 primary tumors with metastasis at diagnosis or later recurrent disease within 3 years of observation, and for 42 metastatic lesions. First, subnetworks were identified by a modified Weighted Gene Co-expression Network Analysis integrating protein-protein interaction information. While a scale-free topology property was still maintained in the generated subnetworks, a link between genes/proteins represents both a correlation of gene expression within primary or metastatic lesions and a physical protein-protein interaction. Second, we evaluated the identified subnetworks by applying Gene Set Enrichment Analysis to indicate the discriminatory power of the gene expression between primary tumors and metastases in input subnetworks. The candidate subnetworks with significant enrichment scores are provided as the results for interpretation. We found several metastasis-related genes in the identified subnetworks, such as TCF4, RAC1, and TWIST2. By deriving transcriptional-correlation-based subnetworks from each group of primary and metastatic lesions separately, it is also possible to distinguish the subnetworks existing in both groups, which show core modules between cancer types, from subnetworks that are present in only one group, possibly indicating unique mechanisms in that group. As a supplement to the traditional approach of identifying individual genes that discriminate between sample groups, this approach allows the identification of discriminatory submodules, facilitating a more contextual/systemic understanding of cancer biology.

O33 - Expanding views of the human interactome network

Marc Vidal, Dana-Farber Cancer Institute, United States

Short Abstract: Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks, which underlie most cellular properties, will be critical to fully understand genotype-phenotype relationships. Despite sustained efforts at mapping interactions, both in small-scale experiments reported in the literature and more systematic high-throughput screens, many fundamental properties of the human interactome remain unknown. Here, we describe the largest systematic map of binary protein-protein interactions ever reported, doubling the number of high-quality interactions from the literature. This unprecedented coverage reveals a strikingly “broader” human interactome network that includes thousands of proteins within uncharted territories of previous maps, of which many are associated with human disease. Interaction detectability correlates with co-functionality, suggesting that the biophysical interactome network is more functionally heterogeneous than originally anticipated. Known and candidate cancer genes are highly connected in the interactome network, providing the first unbiased evidence for an expanded functional cancer landscape. Reference maps of the full human interactome network are now within reach.

O34 - Towards a reference map of the human protein-protein interactome

Marc Vidal, Dana Farber Cancer Institute, United States

Short Abstract: Interactions between macromolecules govern most biological processes. Comprehensive knowledge of all interactions of a cell, the interactome, is essential to understand cellular function and explore disease mechanisms. Current interaction maps are estimated to cover only a small fraction of the interactome. Here, we present our progress in providing the first reference map of the human binary protein-protein interactome to the community. Our pipeline is based on yeast two-hybrid (Y2H) screens followed by validation with orthogonal assays. We have completed one full screen of 17,500 x 17,500 human genes covering 90% of the human genome representing more than 150 million pairwise combinations of proteins. We obtained ~14,300 verified protein interactions that together with previously published data from the Center for Cancer Systems Biology (CCSB) form a network of more than 27,000 binary protein interactions. This dataset (available for download) largely exceeds the number of similar quality small scale interactions from the literature. By integrating our interaction network with spatial, temporal, and structural data we aim to explore dynamics in cell signaling at a systems level. We are performing additional screens using variants of the Y2H assay as well as alternative binary assays and project our interactome to contain ~60,000 interactions by the end of 2015. This represents the largest effort in systematic protein-protein interaction mapping undertaken so far and will provide an unprecedented resource to guide focused biological studies as well as to understand cellular behavior and organization at a systems level.
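As a quick check of the search-space figure quoted above, the count of pairwise combinations for a 17,500 x 17,500 screen can be worked out directly. Whether the abstract counts ordered pairs, unordered pairs, or self-pairs is not stated, so the unordered-pair reading below is an assumption.

```python
n_genes = 17_500

# Every cell of the 17,500 x 17,500 matrix (both orientations, including self-pairs).
ordered_pairs = n_genes * n_genes

# Distinct protein pairs, ignoring orientation and self-pairs: n * (n - 1) / 2.
unordered_pairs = n_genes * (n_genes - 1) // 2
```

The unordered count is 153,116,250, which matches the stated "more than 150 million", while the full matrix has 306,250,000 cells.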

O35 - Using genetic alteration patterns to identify functionally and clinically relevant co-altered modules for clear cell Renal Cell Carcinoma

Sakshi Gulati, Cancer Research UK - London Research Institute,

Short Abstract: Cancer, as we understand it, is a systems disease. Studying a specific somatic alteration in isolation is likely to give a limited view of cancer biology, which is further complicated by the fact that cancer cells are highly heterogeneous. Considering the complex nature of cancer development, here we study two major types of genetic variation in cancer – somatic mutations and somatic copy number alterations (SCNAs) – and use network models to identify co-altered modules that are likely to drive the growth of clear cell renal cell carcinoma (ccRCC). We apply a probabilistic model, which takes into account the likelihood of co-occurrence of genetic alterations in patients, and combine it with a network search algorithm to identify co-altered modules in a background gene interaction network. We use this protocol to analyse a cohort of 350 ccRCC patients for whom somatic mutation, SCNA and RNA-Seq data are all available. We identified over 500 co-altered module pairs that are significantly associated with patient prognosis. Some of the module pairs are enriched specifically in one of the two main subgroups of ccRCC (ccA and ccB), where ccB patients are known to have a worse prognosis than the ccA subgroup. However, a considerable proportion of our module pairs do not have an obvious correlation with either the ccA or ccB subtype but still perform well in terms of distinguishing patients with high or low survival rates. Our results as a whole provide explanations for the molecular mechanisms behind the different prognoses of ccRCC.

O36 - Identifying Adverse Outcome Pathways through the integration of high-throughput in vitro assays and corresponding in vivo data: A case study in integration of ToxCast, ToxRefDB and a medium-throughput in vivo zebrafish screen.

Noffisat Oki, Oak Ridge Institute for Science and Education/ U.S. Environmental Protection Agency, United States

Short Abstract: High-throughput screening (HTS) results are an efficient way of gathering toxicity information for a variety of chemicals, but the connection of these assays with biologically relevant adverse outcomes is often unknown. The Adverse Outcome Pathway (AOP) framework is a useful tool for describing the mechanisms driving these biological connections, starting at the molecular level and ending in an adverse in vivo outcome.
We integrated information from the U.S. EPA ToxCast dataset (designed as a high throughput in vitro screening mechanism for chemicals commonly found in the environment) and in vivo data from a zebrafish developmental study to determine which assays are associated with endpoints from zebrafish and rodent endpoints from the U.S. EPA ToxRefDB in vivo data. The experimental design of the zebrafish dataset allows the use of these data as an intermediary for finding candidate associations between the ToxCast assays and the ToxRefDB rat phenotypes. We use Frequent Itemset Mining to make our associations between datasets by using the chemicals as our aggregating variable for the analysis.
Using the zebrafish in vivo data as an intermediary, we associated phenotypes between the two organisms and also found assays predictive for these shared associations. Our results show a separation of assays with regards to phenotypic endpoints for both in vivo datasets while also suggesting zebrafish assays that may be candidates for follow-up testing of certain HTS results.
Disclaimer: The views expressed in this presentation are those of the authors and do not necessarily reflect the views or policies of the U.S. EPA.
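The use of chemicals as the aggregating variable in Frequent Itemset Mining can be sketched as follows: each chemical is treated as a transaction whose items are the assays it was active in and the in vivo endpoints it produced. The chemical names, item names, support threshold and restriction to item pairs are all hypothetical choices for illustration, not the authors' pipeline.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return {(item1, item2): support} for pairs seen in at least min_support of transactions."""
    counts = Counter()
    for items in transactions.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Each hypothetical chemical maps to the set of assays/endpoints where it was active.
toy_chemicals = {
    "chem1": {"assay_X", "zf_edema", "rat_liver"},
    "chem2": {"assay_X", "zf_edema"},
    "chem3": {"assay_X", "zf_edema", "rat_liver"},
    "chem4": {"assay_Y"},
}
toy_pairs = frequent_pairs(toy_chemicals, min_support=0.6)
```

With a threshold of 0.6, only the pair (assay_X, zf_edema) survives, since it co-occurs in three of the four toy chemicals. A full analysis would extend this to larger itemsets, for example with an Apriori-style search.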

O37 - Transcriptional Regulatory Networks, Sp1 and Usf2: Environmental and Genetic Influences on the Incidence of Alzheimer’s Disease

George Acquaah-Mensah, Massachusetts College of Pharmacy and Health Sciences, United States

Short Abstract: Transcriptional regulatory events in the Alzheimer’s Disease (AD) hippocampus have the potential to provide etiological insights. Microarray data are a valuable resource for identifying transcriptional regulatory relationships among genes. However, there is a paucity of brain region-specific high throughput gene expression data. The Allen Brain Atlas in situ hybridization (ISH) data represent a valuable source of high throughput gene expression data. In this study, two high-performing network inference algorithms, Context Likelihood of Relatedness (CLR) and GEne Network Inference with Ensemble of trees (GENIE3), were used to explore the Allen Brain Atlas mouse brain data for insights.

Unique ISH data in the hippocampal fields were extracted. Focusing on 275 genes relevant to neurodegeneration, transcriptional regulatory networks were learned using CLR and GENIE3. Human post-mortem hippocampal microarray AD data was analyzed, and the results superposed on the networks.

A network module with several genes differentially expressed in the AD hippocampus shows that Sp1 and Usf2 are notable drivers of gene expression changes in the disease. Usf2 is a regulator of the expression of several key genes, many of which (such as Sod1, Nfe2l1, Tkt, Snap25, Syn2, Dnm1l, etc.) are down-regulated in the AD hippocampus. Other Usf2 targets (such as Ldb1, Vamp2, Bag4, Ep300, Pigt, Rtn4, etc.) are up-regulated in the AD hippocampus. Both Sp1 and Usf2 have decreased gene methylation in the presence of methionine/choline/folate deficiency. In addition, Sp1 expression/activity is affected by a wide range of toxicants. These findings shed light on possible environmental and genetic influences on the incidence of AD.

O38 - An Integrative Network-Based Approach to Prioritising Genome-Wide Association Study Results

Alex Cornish, Imperial College London,

Short Abstract: Over the past decade, genome-wide association studies (GWAS) have identified more than 12,600 single nucleotide polymorphisms (SNPs) associated with various human traits. However, the heritability of many complex diseases, such as coronary artery disease, is still poorly understood. There is currently a requirement for tools that are able to integrate GWAS results with additional genomic, proteomic and transcriptomic data in order to identify genes and pathways that warrant further analysis. We have developed a novel method that combines multiple biological interaction networks with GWAS data in order to achieve this goal. Biological networks, detailing the physical and functional interactions that occur within the cell, have been mapped using a range of technologies in different cell types and under different conditions. Each of these networks represents a unique image of the cell in a certain state. Our method measures the distribution of trait-associated genes across the networks in order to identify which networks best explain the mechanisms that underlie the trait of interest. Integration of only the most informative networks reduces the noise within the system and allows for more accurate predictions to be made. The addition of pathway data aids in the interpretation of the results and helps to highlight possible therapeutic targets.

O39 - Curation, Visualization and Analysis of Biological Pathways

Martina Kutmon, Maastricht University, Netherlands

Short Abstract: Pathway diagrams are found everywhere: in textbooks, in review articles, on posters and on whiteboards. Their utility to biologists as conceptual models is obvious. They have also become immensely useful for computational analysis and interpretation of large-scale experimental data when properly modeled. We will highlight the latest developments and newest features of WikiPathways (www.wikipathways.org), a community curated pathway database that enables researchers to capture rich, intuitive models of pathways. WikiPathways and the associated tools PathVisio and pathvisio.js are developed as open source projects with a lot of community engagement.

The new interactive JavaScript-based pathway viewer, pathvisio.js (https://github.com/wikipathways/pathvisiojs/), is integrated in the WikiPathways website and enables users to zoom in and click on pathway elements to show linkouts to other databases. In the future pathvisio.js will replace the Java applet editor and introduce a quick and simple way to curate and edit pathways.

The standalone pathway editor, analysis and visualization tool, PathVisio (www.pathvisio.org), was refactored to achieve a better, modular system that can be easily extended with plugins. Plugins are accessible through the new plugin repository and can be installed through the plugin manager from within the application. This is an important aspect of usability that will allow users to build an application with all the necessary modules relevant for their work. The WikiPathways plugin of PathVisio allows searching and browsing WikiPathways from within PathVisio. Furthermore, users can upload new pathways or update existing pathways.

O40 - Construction of gene networks for growth traits and genetic profiling of Nelore beef cattle of Brazil

Mauricio Mudadu, Embrapa, Brazil

Short Abstract: The Nelore is the major beef cattle breed in Brazil, and the Brazilian beef market is one of the largest in the world, with more than 200 million head. Growth and beef quality traits are of interest in animal breeding programs, and markers associated with genes involved in these traits could be used to assist in selection programs. Genome-wide association studies (GWAS) are a common way to associate markers and genome regions with growth and meat quality traits. AWM/PCIT is an alternative to simple GWAS that constructs gene interaction networks by integrating results from several GWAS, using Association Weight Matrices (AWM) and Partial Correlation and Information Theory (PCIT). In this work we used high density genotyping data of 780 Nelore animals (34 half-sib families derived from the most commercially frequent and unrelated-prone sires from Brazil) to evaluate the genetic profile of Brazilian Nelore cattle. Results suggest lower variability than expected between the individuals studied. We also performed multiple GWAS with eight traits related to growth and meat quality and constructed an AWM/PCIT gene network using a growth trait as the key phenotype. We selected the most connected trio of transcription factors from this network and derived a sub-network that contained several genes involved in growth and meat quality phenotypes. We expect this gene network to be useful in characterizing the genes and gene networks that are most influential in growth and meat quality traits in Nelore cattle.
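
The abstract does not describe how PCIT is computed. As a minimal sketch of the building block that partial-correlation methods rest on, the first-order partial correlation of two variables given a third can be written as below. The function name and the example values are hypothetical and are not taken from the study.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z.

    Inputs are ordinary (Pearson) correlation coefficients between the
    three pairs of variables.
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        # x or y is perfectly correlated with z, so nothing is left to explain
        raise ValueError("undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denom


# Illustrative values only (assumed, not from the poster)
print(partial_correlation(0.6, 0.5, 0.4))
```

A value close to zero suggests that the association between the two genes may be explained by the third, which is the general intuition behind pruning edges in a correlation network.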

O41 - Modeling additive effects of platelet microRNA on endothelial cells

David Feldman, South Texas Veterans Healthcare System, United States

Short Abstract: Bioinformatic models of microRNA targets are used as a first step for in vitro studies that dissect transcriptional regulatory networks. Focusing on the recently identified pathway of intercellular delivery of microRNA from platelets to endothelial cells, we analyzed publicly available transcriptome data and developed a model of the additive effects of platelet microRNA on gene expression.

The microRNA dataset consisted of differentially expressed genes in hyper-reactive platelets (n=10) and hypo-reactive platelets (n=7). Three datasets of endothelial cell transcriptomes were assessed: acidotic endothelium (30 genes); tumor-associated angiogenic endothelium (14 genes) and physiologic angiogenic endothelium (10 genes). The microRNAs targeting each gene were identified using the TarMir 1.0 software tool, which queried 9 different algorithms that predict microRNA targets. For each gene, the list of targeting microRNAs was searched to identify which are platelet microRNAs. We found that various endothelial cell genes were targeted by 0-6 microRNAs from our dataset of platelet microRNA. Comparing hyper-reactive platelets to hypo-reactive platelets, the following genes in acidotic endothelium have 5 more microRNA binding sites: ICAM1, TNSFR9, and CXCL3. Half of the genes in acidotic endothelium were inhibited by microRNA from hypo-reactive platelets, but not from hyper-reactive platelets. A similar pattern was seen with angiogenic endothelium. Tumor-associated endothelium also showed high level targeting of TNSFR9 by hyper-reactive platelets. This model of the additive effects of platelet microRNA is deductive and could be applied to a larger amount of genomic data; however, in vivo correlation is needed to confirm the biological role.
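
The counting step described above (for each gene, how many of its predicted targeting microRNAs occur in a platelet set) can be sketched with set intersections. This is only an illustration: the function name, gene labels and microRNA labels are placeholders, and the sketch does not call TarMir or any prediction algorithm.

```python
def count_platelet_mirnas(
    targeting: dict[str, set[str]], platelet_mirnas: set[str]
) -> dict[str, int]:
    """Count, per gene, the predicted targeting microRNAs found in a platelet set."""
    return {gene: len(mirnas & platelet_mirnas) for gene, mirnas in targeting.items()}


# Placeholder data (assumed for illustration, not from the poster)
targeting = {
    "GENE_A": {"miR-1", "miR-2", "miR-3"},
    "GENE_B": {"miR-4"},
    "GENE_C": set(),
}
hyper_reactive = {"miR-1", "miR-2"}
hypo_reactive = {"miR-2", "miR-4"}

hyper_counts = count_platelet_mirnas(targeting, hyper_reactive)
hypo_counts = count_platelet_mirnas(targeting, hypo_reactive)

# Positive values mean more targeting microRNAs from the hyper-reactive set
difference = {gene: hyper_counts[gene] - hypo_counts[gene] for gene in targeting}
print(difference)
```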

O42 - Analysis of Microbes

Meghan Drake, Oak Ridge National Laboratory, United States

Short Abstract: The Microbial Sciences component of the KBase project has three overall goals: 1) to enable the generation of predictive models for metabolism and gene regulation to facilitate the manipulation of microbial function; 2) to vastly increase the capability of the scientific community to communicate and utilize existing data; and 3) to enable the planning of effective experiments and to maximize our understanding of microbial system functions. To achieve these goals we have focused on unifying existing ‘omics datasets and modeling toolsets within a single integrated framework that will enable users to move seamlessly from the genome assembly and annotation process through to a reconciled metabolic and regulatory model that is linked to all existing experimental data for a particular organism. The results are hypotheses for such things as gene-function matching and the use of comparative functional genomics to perform higher quality evidence-based annotations. KBase offers tools for reconciling the models against experimental growth phenotype data, and using them to predict phenotypes in novel environments or under genetic perturbations.
To prioritize the development of the microbial science area and enable new science, we are focusing on developing prototype analysis workflows that will be most useful to microbial scientists. To date we have developed KBase capabilities and demonstration workflows for: (1) genome annotation and metabolic reconstruction, (2) regulon reconstruction, (3) metabolic and regulatory model reconstruction, and (4) reconciliation with experimental phenotype and expression data.

O43 - Genome-wide mapping and computational analysis of non-B DNA structures in vivo

Damian Wojtowicz, National Institutes of Health, United States

Short Abstract: The canonical right-handed double helical structure of B-DNA may undergo various deformations and adopt alternative conformations including single-stranded DNA, Z-DNA, G-quadruplex, H-DNA, and cruciform. Previous studies confirmed the existence of some non-B DNAs in a few gene promoters (e.g. c-myc, ADAM-12) and implicated their role in gene regulation, but it is not known how abundant the alternative DNA conformations might be at the genomic level. Computer-based studies uncovered a large number of sequences across the mammalian genomes that can potentially form non-B DNA structures and play functional roles in regulating DNA transactions.

We developed a new experimental technique, which combines chemical and enzymatic techniques with high-throughput sequencing, to map non-B DNA conformations at the genomic scale in vivo. The new protocol was applied to identify in vivo formation of non-B DNA structures in mouse and human. We performed a genome-wide analysis of occurrences of these alternative DNA conformations and compared the experimental data to genomic regions computationally predicted to have a propensity to form non-B DNA conformations. We showed a significant enrichment of alternative signal near computationally predicted non-B DNA motifs. Moreover, each type of predicted non-B DNA structure also has a distinctive experimentally derived signature. This strong evidence for the in vivo formation of alternative structures provides the first look at their genome-wide landscape. This newly developed technique will help to comprehensively characterize DNA conformations in different cells to establish their role in the regulation of DNA transactions.

O44 - Leveraging network structure to discover genetic interactions in genome-wide association studies

Wen Wang, University of Minnesota, United States

Short Abstract: Genetic interactions (epistasis) are important factors in complex diseases that may contribute to unexplained heritability in genome-wide association studies (GWAS). However, existing methods for identifying genetic interactions, which mainly focus on testing individual locus pairs, lack statistical power. We proposed a novel computational approach for discovering disease-specific, pathway-pathway genetic interactions from GWAS data. The key motivation, derived from the extensive analysis of genetic interaction networks in yeast, is that genetic interactions tend to occur between functionally compensatory modules rather than between isolated pairs of genes. We developed a method that explicitly searches for such large structures, guided by established sets of genes belonging to characterized pathways. We applied this approach to a Parkinson's disease (PD) GWAS study and found 50 pathway level interactions that were statistically significant (false discovery rate ≤0.25), suggesting large genetic interaction structures indeed exist and can be discovered by leveraging structural properties with prior information on pathways. None of the SNPs involved in the discovered pathway-pathway interactions are significant either based on single-locus association tests or independent tests of SNP pairs, and thus these signals would not have been discovered by traditional GWAS analyses. Interestingly, many of the discovered interactions are associated with reduced disease risk while a substantially smaller number are associated with increased disease risk. A significant fraction of them are validated in two independent cohorts. Our study highlights specific insights derived from analysis of the PD interactions and, more broadly, provides a general framework for systematic detection of genetic interactions from GWAS studies.
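
The abstract reports a false discovery rate threshold of 0.25 but does not name the procedure used to estimate it. Purely as an illustration of how such a threshold is commonly applied, the sketch below assumes the Benjamini-Hochberg adjustment; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four candidate pathway-pathway interactions
p_values = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(q_values) if q <= 0.25]
print(q_values, significant)
```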

O45 - Transcriptional regulatory networks of single cells during in vitro hepatic differentiation of human pluripotent stem cells

Rathi Thiagarajan, University of California San Diego, United States

Short Abstract: Directed differentiation of pluripotent stem cells (PSCs) into specific cell types in vitro begins with a relatively homogeneous population of undifferentiated cells but often produces heterogeneous cell populations and low yields of the desired cell type. By studying the different trajectories of single cells during in vitro differentiation and the transcriptional regulatory networks required to specify and maintain cellular fate, we aim to understand the factors involved in in vivo differentiation and to develop strategies to improve the efficiency of in vitro differentiation.
Using an established directed differentiation protocol, we employed single cell RNA-sequencing to profile the transcriptome during hepatic differentiation of human PSCs. Cells were collected from three critical phases of hepatic differentiation: undifferentiated PSCs, definitive endoderm (day 3), and hepatic progenitor cells (day 8). In all cells, lineage-specific transcripts representative of each differentiation phase were detected. However, principal component analysis clustering results indicated increasing heterogeneity among the cells from a given stage as differentiation progressed. Inspection of transcriptional regulatory networks using transcription factors (TFs) identified from an unsupervised clustering analysis of single cells, in-silico TF motif predictions of distal promoters, and permutation analysis of network connectivity suggested combinatorial use of key lineage-specific TFs. These initial results highlight that lineage markers alone are not sufficient to characterize differentiation, and that detailed molecular characterization can highlight critical differences and help describe a variety of cellular trajectories. This study demonstrates the application of single cell sequencing to understand in vitro differentiation and identify key TF networks involved in cell fate decisions.

O46 - Prediction of genetic interactions by decision tree model

Esther Camilo, University of São Paulo state, Brazil

Short Abstract: A genetic interaction between two genes is observed when the phenotype of a double mutant differs from what would be expected from the product of each individual mutant phenotype. The knowledge of such interactions is fundamentally important to understand the structure and function of genetic pathways and the evolutionary dynamics of complex genetic systems. These interactions can be quantified through the S-score (S), defined by the expression S = Wab - Wa·Wb, where Wab is the colony size of the double mutant strain and Wa and Wb are the colony sizes of each single mutant. When S is negative, the interaction is said to be aggravating. Conversely, when S is positive, the interaction is said to be alleviating. Currently, few works address the prediction of genetic interactions in prokaryotes. Hence, we devised an in silico model, based on machine learning and an integrated network of genes (ING), that is able to predict the genetic interaction type with more than 69% precision. This model is decision tree based, where the attributes are, among others, the centrality measures calculated from the Escherichia coli ING and the classes are assigned based on the S-scores of known genetic interactions. As we applied the undersampling technique to overcome the unbalanced nature of the training data, we obtained a large set of decision tree models. To narrow down the number of trees to be analyzed and better determine the rules governing the status of genetic interactions, we also devised a method to extract a representative decision tree based on clustering analysis.
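
The S-score definition above translates directly into a few lines of code. The sketch below follows the stated expression; the "neutral" label for scores within a tolerance of zero, the function names and the example colony sizes are assumptions added for illustration.

```python
def s_score(w_ab: float, w_a: float, w_b: float) -> float:
    """S = Wab - Wa * Wb, using (normalised) colony sizes."""
    return w_ab - w_a * w_b


def interaction_type(s: float, tol: float = 0.0) -> str:
    """Label an interaction by the sign of its S-score.

    The 'neutral' class and the tolerance are assumptions; the abstract
    only defines the aggravating and alleviating cases.
    """
    if s < -tol:
        return "aggravating"
    if s > tol:
        return "alleviating"
    return "neutral"


# Hypothetical colony sizes relative to wild type
print(interaction_type(s_score(0.3, 0.8, 0.7)))  # double mutant smaller than expected
print(interaction_type(s_score(0.9, 0.8, 0.7)))  # double mutant larger than expected
```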

O47 - BioRica simulation of hierarchical SBML models: including composition and randomness

Rodrigo Assar, University of Chile, Chile

Short Abstract: In recent years, many efforts have been made to establish a common format for specifying biological systems, allowing models to be reused and combined in a flexible and unambiguous way. Currently, the SBML format is accepted by the scientific community as the specification format, but some gaps remain. Although a recent version of the SBML specification includes hierarchical composition, to our knowledge there are no frameworks that allow the simulation of such composed models. In addition, stochastic delays and multiple transition possibilities have not been sufficiently addressed by the SBML format.
We face these challenges by using the BioRica framework, which allows the unambiguous specification and simulation of processes described by interacting continuous and discrete dynamics. We introduce a new version of BioRica, which allows SBML models to be reused and composed automatically with good simulation performance. Through an SBML-BioRica parser, we translate SBML models into BioRica, and with QSS solver libraries we tackle the problems associated with different interacting timescales.
We illustrate our approach and its utility through the well-known Guyton model, which describes the functioning of the human circulatory system. We fill gaps in the SBML model components and integrate all of them into a hierarchical, BioRica-simulable model that allows the effect of input changes on the whole system to be tested in silico. To illustrate the inclusion of non-determinism and randomness in event transitions, we considered some models describing the control of cell decisions to divide and to differentiate, with a view to testing medical treatments in silico.

O48 - Developing putative AOPs from high content data

Shannon Bell, Oak Ridge Institute for Science and Education, United States

Short Abstract: The adverse outcome pathway (AOP) framework provides a high-level description of the biological processes connecting molecular perturbations in response to an exposure event to an adverse health endpoint affecting whole organisms or populations of individuals. Development of detailed AOPs from traditional experimental results is a slow, tedious process and is unrealistic when covering the breadth of perturbations in response to the >83,000 chemicals in commerce. Large toxicogenomic screening studies, such as the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation system (TG-GATEs), offer an opportunity to link molecular changes in response to chemical exposure to an adverse outcome. In this work we test the hypothesis that putative AOPs can be developed from high content assays using the TG-GATEs rat liver microarray and pathology data. Frequent itemset mining identified and prioritized top candidate associations based on differential expression of biological pathways. Integration of pathology and transcriptomics data enabled the identification of a putative AOP for nonalcoholic steatohepatitis. Short term effects (≤24 h) such as necrosis were distinguishable from the regenerative proliferation that presented from repeat chemical exposure. Putative biomarkers distinguishing different toxicological pathways are described. This work highlights the utility of toxicogenomic data for AOP discovery and in the identification of candidates for high throughput screening.
This is an abstract or a proposed presentation and does not necessarily reflect EPA policy.
Mention of trade names or commercial products does not constitute endorsement or recommendation for use.
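
The abstract above names frequent itemset mining without describing the implementation. As a minimal sketch of the core idea (counting how often items co-occur and keeping pairs above a support threshold), restricted here to pairs for brevity, one could write the following. The function name, the item labels and the threshold are all hypothetical and do not come from TG-GATEs.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(
    transactions: list[set[str]], min_support: float
) -> dict[tuple[str, str], float]:
    """Return item pairs whose support (fraction of transactions) meets the threshold."""
    n = len(transactions)
    counts: Counter = Counter()
    for items in transactions:
        # Sort so that each pair has a single canonical ordering
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Each set stands for one treated sample: altered pathways plus pathology findings
transactions = [
    {"pathway_X_up", "necrosis"},
    {"pathway_X_up", "necrosis", "pathway_Y_down"},
    {"pathway_Y_down"},
    {"pathway_X_up", "necrosis"},
]
result = frequent_pairs(transactions, min_support=0.5)
print(result)
```

Full frequent itemset algorithms such as Apriori extend the same counting to larger itemsets while pruning candidates whose subsets are already infrequent.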

O49 - KBase Overview: An Integrated Knowledgebase for Predictive Biology and Environmental Research

Elizabeth Glass, Argonne National Laboratory, United States

Short Abstract: Systems biology is driven by the ever-increasing wealth of data resulting from new generations of genomics-based technologies. The advancement of systems biology relies not only on sharing the results of projects through traditional methods of peer review and publication, but also on sharing the datasets, workflows, software, models, best practices, and other essential knowledge that made those published results possible. Establishing a common framework for managing this knowledge could save time, reduce duplicative effort, and increase the scientific return on investment in systems biology research. This framework for precisely tracking what was done to achieve a certain outcome also will empower researchers to reproduce published results and review projects more effectively.

The KBase team is developing an open-source, open-architecture framework for reproducible and collaborative computational systems biology. KBase's primary scientific aim is to push multiple types of functional data towards increasingly specific models of metabolic and regulatory behavior of microbes, plants and their communities. We have brought together data and tools that allow probabilistic modeling of gene function, which can be used in turn to produce experimentally testable models of cellular metabolism and gene regulation.
A new component that will enable these complex, iterative analyses is KBase's new prototype Narrative Interface, which provides a transparent, reproducible, and persistent view of the computational steps and thought processes leading to a particular conclusion or hypothesis. These “active publications” enable researchers anywhere in the world to re-use a workflow, follow chains of logic, and experiment with changing input data and parameters.

O50 - Developing in KBase

Elizabeth Glass, Argonne National Laboratory, United States

Short Abstract: The KBase team is developing an open-source, open-architecture framework for reproducible and collaborative computational systems biology. One of the key operating principles of KBase is to allow the scientific community to incorporate their own algorithms into the system to make them available to others easily; to avail themselves of the KBase computational architecture; and to make use of the KBase data sources. The KBase team aims to make this process simple and to provide an easy route for dissemination of new tools and comparison to existing tools in a common framework.

The KBase system design is based on several sound best practice principles including consistent code use, code re-use, and the decoupling of modular system components. We have established standard software engineering processes for version control, software and data builds, testing, QA/QC, deployments and releases. These enable the deployment of a large number of services by a relatively small release engineering team.

To prepare for potential future services contributed by the community, we provide developer training materials on our website as well as in hands-on developer training sessions called bootcamps. In the future, we plan to offer a wider range of bootcamps and webinars to target different types of developers and different scientific focus areas. In the meantime, prospective developers and computational biologists can find information about KBase service design at http://kbase.us/developer-zone/.

O51 - Bladder cancer gene regulatory networks inferred from large-scale RNAseq, Bead and Oligo microarray gene expression data sets

Ricardo de Matos Simoes, Queen's University Belfast,

Short Abstract: Bladder cancer is a highly heterogeneous complex disease that has been difficult to study on the basis of single genes. The underlying network structure and regulatory programs driving urothelial pathogenesis are largely unknown on the molecular level. In this study, we infer three gene regulatory networks by the application of the BC3net inference algorithm to large-scale transitional cell carcinoma gene expression data sets from Illumina RNAseq (169 samples), Illumina beadarrays (165 samples) and Affymetrix microarrays (188 samples). We provide a detailed structural and functional analysis of the networks, identify highly co-regulated genomic regions and hub genes, and study the network properties of known bladder cancer specific genes and biomarkers. The bladder cancer gene regulatory networks are highly enriched for biological processes that are associated with known cancer hallmarks and involved in cell cycle, immune response, signalling, differentiation and translation. The most significant co-regulated chromosomal locations of the networks are 17q21.2, 8q24.3 and 1q23.3, which represent well-known genomic regions that have been shown to be frequently aberrant in bladder cancer and also in other cancer types. Hub genes of the bladder cancer gene regulatory networks are enriched for transmembrane proteins and represent target mediators of cellular activities and signaling processes. Our results shed new light on the analysis and integration of large-scale data sets to aid in the identification and development of novel diagnostic targets in bladder cancer. To our knowledge, this is the first study investigating genome-scale bladder cancer networks.

O52 - Refined Integrative Model of Regulation and Metabolism Improves Prediction for Phenotype in Saccharomyces Cerevisiae

Zhuo Wang, Institute for Systems Biology, United States

Short Abstract: Integrating transcriptional regulatory networks (TRN) with metabolic networks (MN) is useful to uncover the relationship between regulatory interactions and downstream phenotype. Probabilistic Regulation of Metabolism (PROM) is an approach to predict metabolic phenotypes under TF knockouts from an integrative model of TRN and MN. Here we developed a refined PROM regulatory-metabolic model framework for Saccharomyces cerevisiae by integrating the metabolic model Yeast 6 with transcriptional regulatory interaction information extracted from the Yeastract database and inferred from the algorithm Inferelator. We find no significant correlation between experimental growth data of TF knockouts and corresponding TF knockout growth rates predicted by PROM models using either Inferelator-based or Yeastract-based TRNs. However, we find significant prediction correlation (r = 0.4325, p-value= 0.0014) arises from applying constraints to metabolism based only on a core set of regulatory interactions that exist both in validated binding interactions from Yeastract and in the Inferelator-based network, compared with the correlation of 0.2110 (p-value=0.0459) when only using Yeastract. The PROM model growth rate prediction consistency is further improved by accounting for activator and inhibitor status of TFs, and by using the Inferelator interaction false discovery rate metric to guide the probabilistic influence of the TFs on the target metabolic genes. In conclusion, integrating Yeastract interactions with an Inferelator-based network yielded a core set of regulatory influences that improved regulatory-metabolic growth rate predictions.

O53 - Network models for multi-target treatments of Triple Negative Breast Cancer

Francesca Vitali, University of Pavia, Italy

Short Abstract: Triple-negative breast cancer (TNBC) is an aggressive, heterogeneous type of breast cancer whose biology is poorly understood. The absence of specific molecular targets and the limited response to single-drug therapy contribute to the poor clinical outcome associated with this disease. In this scenario, network-based approaches may improve the clinical outcomes by automatically identifying candidate combinations of drug targets, which can be hit in a synergistic approach. From a recent mutation study of TNBC, we have constructed an undirected Protein-Protein Interaction (PPI) network related to the disease. Next, the application of a novel score called “Topological Score of Drug Synergy” (TSDS) resulted in a set of best-ranked target combinations of approved drugs. This score measures the capability of a given target combination to reach all the proteins involved in the disease. Finally, we resorted to a recent approach that orients the interaction edges, to gain further insights on therapeutic, as well as off-target, effects. The proposed integrated network approach can potentially aid in defining new polypharmacological strategies. Among the targets selected as candidates by applying the proposed approach, none are currently used in ongoing clinical trials. In order to propose novel potential therapies, we retrieved the approved drugs for the selected targets from DrugBank. Preliminary results indicate that these drugs are used for other diseases associated with breast cancer, opening the possibility to analyze them for drug repurposing.
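
The abstract does not give the TSDS formula. As a rough stand-in for the idea of a target combination "reaching" the disease proteins, the sketch below computes the fraction of disease proteins within a fixed number of steps of any target in an undirected network. This is not the authors' score; the function names, the step limit and the toy network are assumptions.

```python
from collections import deque


def reachable_within(
    adjacency: dict[str, set[str]], sources: set[str], max_steps: int
) -> set[str]:
    """Multi-source breadth-first search limited to max_steps edges."""
    seen = set(sources)
    frontier = deque((s, 0) for s in sources)
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_steps:
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return seen


def coverage(
    adjacency: dict[str, set[str]],
    targets: set[str],
    disease_proteins: set[str],
    max_steps: int,
) -> float:
    """Fraction of disease proteins reachable from the target combination."""
    if not disease_proteins:
        return 0.0
    reached = reachable_within(adjacency, targets, max_steps)
    return len(reached & disease_proteins) / len(disease_proteins)


# Toy undirected PPI network: P1 - P2 - P3 - P4, with P5 isolated
adjacency = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3"},
    "P5": set(),
}
disease = {"P1", "P2", "P3", "P4", "P5"}
print(coverage(adjacency, {"P1"}, disease, max_steps=2))
print(coverage(adjacency, {"P1", "P5"}, disease, max_steps=2))
```

Comparing the two calls shows how adding a second target can raise coverage, which is the kind of effect a combination score would be expected to reward.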

O54 - Codon-usage is a universal assessment measure of gene coexpression

Takeshi OBAYASHI, Tohoku University, Japan

Short Abstract: Databases of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. To develop methods for constructing better coexpression data, an appropriate measure to quantify the quality of coexpression is necessary. Gene annotation is one way to assess coexpression data, but the amount of gene annotation differs greatly among species, so inter-species comparisons of coexpression data are difficult. In addition, gene annotations are not of equal quality across genes, even within a single species. Other potential quality measures, such as comparison with genome-wide data (such as Y2H) or coexpression conservation among closely related species, are promising. However, both also suffer from species dependence. Here, we propose a universal measure to assess coexpression data using genome information. Since coexpression relationships are intrinsically coded in genome sequences, comparison between coexpression and genomic features is reasonable. Among various genomic features, we used similarity networks among the codon usages of every gene. Comparisons between this quality measure and other standard measures are presented.
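
The abstract does not state how codon usage similarity is computed. A minimal sketch, assuming relative codon frequencies per coding sequence and cosine similarity between them (both choices are assumptions, as are the function names and toy sequences), could look like this:

```python
import math
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Relative codon frequencies of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def cosine_similarity(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse codon-frequency vectors."""
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy sequences (not real genes)
gene_1 = codon_usage("ATGGCTGCT")
gene_2 = codon_usage("ATGGCTGCC")
print(cosine_similarity(gene_1, gene_2))
```

Linking gene pairs whose similarity exceeds a chosen cutoff would then yield a similarity network of the kind the abstract mentions.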

O55 - Network-based Transcript Quantification with RNA-Seq Data

Wei Zhang, University of Minnesota Twin Cities, United States

Short Abstract: High-throughput mRNA sequencing (RNA-Seq) provides valuable information for accurate transcript quantification. In this project, we introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) to integrate protein domain-domain interaction information with short read alignment for transcript abundance estimation. Based on the observation that the abundances of the neighboring transcripts by domain-domain interactions in the network are positively correlated, Net-RSTQ models the expression of the neighboring transcripts as Dirichlet priors on the likelihood of the observed read alignments against the transcripts in one gene. The transcript abundances of all the genes are then jointly estimated with a heuristic alternating optimization algorithm. We demonstrate in the experiments that (1) qRT-PCR confirmed that Net-RSTQ achieves better transcript quantification accuracy with RNA-Seq data from a stem cell line and an ovarian cancer cell line compared with the models without using transcript network; and (2) the transcript abundances estimated by Net-RSTQ are more informative for patient sample classification tested on the RNA-Seq data of ovarian cancer, breast cancer and lung cancer in The Cancer Genome Atlas (TCGA). Availability: http://arxiv-web3.library.cornell.edu/abs/1403.5029

O56 - rBiopaxParser - an R package to parse, modify and visualize BioPAX data

Frank Kramer, University Medical Center Göttingen, Germany

Short Abstract: Biological pathway data, stored in structured databases, is a useful source of knowledge for a wide range of bioinformatics algorithms and tools. The Biological Pathway Exchange (BioPAX) language has been established as a standard to store and annotate pathway information. However, use of these data within statistical analyses can be tedious. On the other hand, the statistical computing environment R has become the standard for bioinformatics analysis of large-scale genomics data. With our package, we enable R users to work with BioPAX data and make use of the ever-increasing amount of biological pathway knowledge within data analysis methods.
rBiopaxParser is an open-source software package that provides a comprehensive set of functions for parsing, viewing and modifying BioPAX pathway data within R. These functions enable the user to access and modify specific parts of the BioPAX model. Furthermore, it allows users to generate and lay out regulatory graphs of controlling interactions and to visualize BioPAX pathways.

O57 - Modeling human immunity through transcriptomics and metabolomics

Shuzhao Li, Emory University, United States

Short Abstract: The blood transcriptome is the bellwether of host immunity in both healthy and disease states, but the data complexity also poses a great challenge to its biological interpretation. Canonical molecular pathways, covering limited immunology and ill fitted for complex tissues, often fail to meet this challenge. We undertook an approach to infer high quality gene networks from large amounts of public transcriptomic data, and to construct context specific gene modules (BTM, Blood Transcription Modules), which also integrated public databases and expression patterns of different immune cell types. This novel BTM method was applied to a series of vaccine studies, and revealed distinctive transcriptional programs three days after vaccination that were predictive of antibody response a month later. These results shed light on the early events that orchestrate vaccine immunity in humans, and demonstrate the power of integrated network modeling. We will also present additional applications of BTM to systems medicine and the integration with metabolomics.

O58 - Prediction of protein interaction types based on sequence and network features

Florian Goebels, Technical University of Munich, Germany

Short Abstract: Protein interactions mediate a wide spectrum of functions in various cellular contexts. It has become customary to distinguish between obligate and non-obligate interactions depending on whether or not the protomers can exist independently. In terms of spatio-temporal control, protein interactions can be either simultaneously possible (SP) or mutually exclusive (ME). So far, different types of interactions have been distinguished based on the properties of the corresponding binding interfaces derived from known three-dimensional structures of protein complexes. Here we present PiType, an accurate 3D structure-independent computational method for classifying protein interaction types. Our classifier exploits features of the binding partners predicted from amino acid sequence, their functional similarity, and network topology. We find that the constituents of non-obligate complexes possess a higher degree of structural disorder, more short linear motifs, and lower functional similarity compared to obligate interaction partners, while SP and ME interactions are characterized by significant differences in network topology.

O59 - A systems biology approach to characterize inflammation in wounds

Jaques Reifman, Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, United States

Short Abstract: Inflammation is the primary physiological reaction initiated by the innate immune system in response to infection or injury. Dysregulation of inflammatory pathways is a key cause of impaired wound healing and other pathological states. The functional complexity of the inflammatory response is widely recognized, and new approaches are needed to integrate existing knowledge and to guide hypothesis-driven experimental research. We have developed a computational model that represents the kinetics of 16 key cellular and molecular components, and more than 50 processes, constituting the innate immune response. Our modeling predictions reflected essential qualitative and (semi-)quantitative aspects of acute and chronic inflammation. By simulating and analyzing 10,000 distinct inflammatory scenarios, we identified altered macrophage influx and efflux rates as the main mechanistic drivers of chronic inflammation, and IL-6, TGF-β, and PDGF as its reliable molecular indicators. These predictive results are consistent with in vivo data.

O60 - Network clustering made easy: finding overlapping modules from multiple types of biological networks

Yongjin Park, Massachusetts Institute of Technology, United States

Short Abstract: Network clustering provides valuable insights into complex biological systems. Vertices of biological networks correspond to proteins, genes, or DNA elements, and blocks of vertices are translated to co-occurring protein complexes or co-regulated functional modules. Stochastic block models and their variants have been excellent tools for summarizing a full adjacency matrix as a block matrix. However, stochastic block models are limited in modeling large-scale networks: under the model, groups embedded in the core of a network remain undetectable in practice and in theory. Therefore, discovery of modules has had to be restricted to well-separated groups residing in the periphery, leaving the core clumped.

We propose a different perspective on network clustering in which algorithms identify groups of edges. For a nonnegative edge-by-vertex incidence matrix, we modeled the probability that the endpoints of each edge observe shared vertices as determined by a linear combination of edge membership and a group-specific vertex propensity parameter. We employed a nonparametric Bayesian method to infer membership assignments and model parameters without prescribing model complexity. The inference algorithm scales linearly in the number of edges.

We applied our new method to multiple types of biological networks. We found that our link clustering method indeed dissects clumped cores of protein-protein interaction networks and generalizes to joint analysis with genetic interactions without modification of the algorithm. We also analyzed ENCODE chromatin networks (regulatory modules of DNA elements), and were able to relate the model parameters to tissue-specific epigenetic features extracted by a restricted Boltzmann machine trained on Epigenomics Roadmap data sets.
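
The edge-centred view described in this abstract can be illustrated with a small sketch. It is not the authors' nonparametric Bayesian inference; it only shows, under assumed names (`incidence_matrix`, `vertex_modules`) and an assumed, hand-assigned edge grouping, how an edge-by-vertex incidence matrix is built and why grouping edges rather than vertices naturally yields overlapping modules.

```python
def incidence_matrix(edges, vertices):
    """Build a 0/1 edge-by-vertex incidence matrix as a list of rows.

    Row i has a 1 in the columns of the two endpoints of edge i.
    """
    index = {v: i for i, v in enumerate(vertices)}
    rows = []
    for u, v in edges:
        row = [0] * len(vertices)
        row[index[u]] = 1
        row[index[v]] = 1
        rows.append(row)
    return rows


def vertex_modules(edges, edge_groups):
    """Derive vertex modules from a grouping of edges.

    A vertex belongs to every group that contains one of its edges,
    so modules may overlap. The edge grouping itself is taken as given
    here (an assumption); in the abstract it is inferred by the model.
    """
    modules = {}
    for (u, v), group in zip(edges, edge_groups):
        modules.setdefault(group, set()).update((u, v))
    return modules


# Two triangles sharing vertex "C" (toy data, purely illustrative).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"), ("D", "E"), ("C", "E")]
vertices = ["A", "B", "C", "D", "E"]
matrix = incidence_matrix(edges, vertices)
modules = vertex_modules(edges, [0, 0, 0, 1, 1, 1])
```

In this toy case vertex "C" appears in both modules, which a partition of vertices into disjoint blocks could not express.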

O61 - A dynamic network analysis in time-varying molecular regulatory networks

Jeong-Rae Kim, University of Seoul, Korea, Rep

Short Abstract: Objective: The concept of network analysis has been widely considered to understand large-scale complex biomolecular regulatory networks. In most cases, time-invariant ‘static’ networks have been considered implicitly so far. However, such approaches cannot be used to reveal time-specific ‘dynamic’ biological traits and hence may not be applicable to, for example, development, where temporal regulation of gene expression is an indispensable characteristic.

Results: We propose the concept of a ‘dynamic network kernel’, a sequence of network kernels in active subnetworks constructed over time, and investigate its usefulness for the analysis of the developmental regulatory network of Drosophila melanogaster. We found a sequence of network kernels that changes according to developmental stage. Interestingly, the network kernels that are found from specific developmental stages cannot be identified from a static network analysis. Moreover, we showed that the dynamic network kernel corresponding to each developmental stage can be used to describe the pivotal developmental events. The functional roles of the dynamic network kernels are investigated based on computational simulations.

Conclusion: We applied the dynamic network kernel approach to the developmental network of D. melanogaster, and showed that it can be used to investigate the dynamics of complex biological systems. We also suggest that dynamic network kernel analysis is a useful method for predicting key molecules in large-scale biological networks.

O62 - Integrative network analysis of Candida albicans-zebrafish interaction

Yu-Chao Wang, National Yang-Ming University, Taiwan

Short Abstract: Candida albicans is an opportunistic fungal pathogen responsible for many life-threatening infections in humans. Although some virulence factors of C. albicans have been recognized, the complex interaction between C. albicans and its host has not yet been fully elucidated. Unlike many other studies that investigated C. albicans-host interaction using cell lines, this study used zebrafish (Danio rerio) as the in vivo host model. The transcriptional dynamics of the host-pathogen interaction were monitored with time-course microarrays for both pathogen and host. Based on the simultaneously quantified transcriptomes and other omics data from public databases, the intracellular and interspecies co-expression networks and protein-protein interaction networks were constructed, respectively. The similarity network fusion method was then applied to fuse these networks into an integrative network. Network comparisons during the host-pathogen interaction were investigated to explore the pathogenic mechanism of C. albicans and the defensive mechanism of zebrafish. Using the proposed systems biology approach, some important genes/proteins and the crucial functional modules associated with pathogenesis and immune response during C. albicans-zebrafish interaction were identified. The results not only reveal potential mechanisms in C. albicans-zebrafish interaction that can be further validated by experiments, but also provide new directions for potential therapeutic strategies and drug discovery to treat C. albicans infections.

O63 - Data driven modeling of spatially resolved single cell signaling transduction networks using highly multiplexed Imaging Mass Cytometry.

Denis Schapiro, University of Zurich, Switzerland

Short Abstract: Current cancer research focuses on the hallmarks of cancer, such as tumor-promoting inflammation, escape from immune destruction, and sustained proliferative signaling. Central to many hallmarks is a deregulation of cell-to-cell interactions and communication among tumor and normal cells within the so-called tumor microenvironment (TME). Therefore, an in-depth analysis of the TME requires the analysis of cell phenotypes and signaling states at the single cell level with spatial resolution. Imaging Mass Cytometry enables such an analysis by imaging up to 100 proteins and phosphorylation sites simultaneously at sub-cellular resolution.
To analyze the imaging mass cytometry data, we developed a computational pipeline to reveal all phenotypes, single cell features and spatially resolved signaling transduction networks. Based on this pipeline, we currently use data driven modeling and discrete logic modeling to reveal phenotypes, cell-to-cell interactions and signaling states which might be correlated with a metastatic cell state (e.g. cells undergoing epithelial-to-mesenchymal transition (EMT)). Thus, Imaging Mass Cytometry is a powerful tool to generate spatially resolved single cell signaling transduction networks which can be linked to phenotypes and cell features for identification of driving forces of metastasis and disease outcome.

O64 - Transcriptome analysis reveals thousands of targets of nonsense-mediated mRNA decay that offer clues to the mechanism in different species

Steven Brenner, University of California, Berkeley, United States

Short Abstract: Nonsense-mediated mRNA decay (NMD) is an RNA surveillance system that degrades aberrant isoforms containing a premature termination codon. This pathway is conserved throughout eukaryotes. NMD coupled with alternative splicing is a mechanism of post-transcriptional gene regulation that is known to be particularly important in the splice factor regulatory network. The canonical model of defining a premature termination codon in mammals is the 50nt rule: a termination codon more than 50 nucleotides upstream of an exon-exon junction triggers degradation by NMD. There is also evidence that a longer 3' UTR triggers NMD in plants, flies, and mammals.
To survey the targets of NMD genome-wide in human, zebrafish, and fly, we performed RNA-Seq analysis on cells where NMD has been inhibited via knockdown of UPF1, a critical protein in NMD. We found that thousands of genes produce alternative isoforms degraded by NMD in the three species, including over 20% of genes alternatively spliced in the HeLa cell line. We found significant enrichment of ultraconserved elements in the human NMD targets. Additionally, we found that the 50nt rule is a strong predictor of NMD degradation in human cells, and has an effect in zebrafish and, surprisingly, in fly. In contrast, we found little correlation between the likelihood of degradation by NMD and 3' UTR length in any of the three species. Ultimately, our findings demonstrate that gene expression regulation through NMD is widespread in human, zebrafish, and fly, and that NMD is strongly predicted by the 50nt rule but not by 3' UTR length.
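
The 50nt rule stated in this abstract can be written as a simple coordinate check. The following is a minimal sketch, not the authors' analysis pipeline: the function names, the representation of an isoform as a list of exon lengths, and the convention that `stop_codon_end` is the 0-based position just after the stop codon in spliced mRNA coordinates are all assumptions made for illustration.

```python
from itertools import accumulate


def exon_junction_positions(exon_lengths):
    """Return exon-exon junction positions in spliced mRNA coordinates.

    A junction at position p lies between nucleotides p-1 and p (0-based),
    so junctions are the cumulative exon lengths, excluding the mRNA end.
    """
    return list(accumulate(exon_lengths))[:-1]


def is_predicted_nmd_target(exon_lengths, stop_codon_end, threshold=50):
    """Apply the 50nt rule to one isoform.

    Returns True if the stop codon ends more than `threshold` nucleotides
    upstream of at least one exon-exon junction.
    """
    junctions = exon_junction_positions(exon_lengths)
    return any(j - stop_codon_end > threshold for j in junctions)


# Hypothetical isoform with three exons; junctions fall at 100 and 220.
exons = [100, 120, 200]
early_stop = is_predicted_nmd_target(exons, stop_codon_end=150)  # 70 nt upstream
near_stop = is_predicted_nmd_target(exons, stop_codon_end=180)   # 40 nt upstream
last_exon_stop = is_predicted_nmd_target(exons, stop_codon_end=300)
```

Only the first of the three hypothetical stop positions is flagged: it lies 70 nucleotides upstream of the last junction, whereas the other two are either within 50 nucleotides of it or downstream of every junction.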

O65 - Spectral network algorithms reveal conserved human, fly and worm regulatory pathways

Soheil Feizi, MIT, United States

Short Abstract: Rewiring of regulatory networks is a primary driver of evolutionary change, but the paucity of comprehensive catalogs of regulatory genomics datasets has hindered its study in animal genomes. Here, we leverage genome-wide functional genomics datasets from ENCODE and modENCODE to infer and compare regulatory networks and modules across human, fly, and worm, using two novel spectral algorithms for edge-level and module-level network alignment. We first infer and validate regulatory networks in each species by integration of transcription factor binding, gene expression, and regulatory motif information. We then use our edge-level network alignment algorithm to identify conserved connectivity patterns using both matching and non-matching interactions in a spectral decomposition framework. We next discover conserved modules using a new module-level network alignment algorithm based on spectral partitioning trees. Despite large evolutionary distances spanned, we find strong conservation of modules and centrally-connected genes, especially for human-fly comparisons. Our network analysis and methods are general and applicable beyond regulatory networks studied here.

O66 - Identification of master regulators of GWAS traits

Gerald Quon, Massachusetts Institute of Technology, United States

Short Abstract: Genome-wide association studies (GWAS) have identified thousands of single nucleotide variants (SNVs) that are associated with variation in a diverse range of human complex traits. Understanding how these variants combine to give rise to trait variation and disease is an open challenge. We designed a probabilistic model to test the hypothesis that top-scoring variants collectively disrupt the regulation of gene regulatory modules, and that these regulatory modules are defined by the combinatorial action of a small set of transcription factors.

Our model is guided by an independently inferred network of regulatory interactions between 2,757 transcriptional regulators and ~20,000 target genes. Our model intersects associated GWAS variants with the regulatory regions to infer genes whose regulation is disrupted, and uses the TF-gene network to simultaneously identify other genes participating in the same regulatory modules, as well as the transcription factors that define those modules.

We predicted that for lipid, cardiovascular, and immune-mediated disorders, non-coding associated SNVs collectively disrupt regulatory modules defined by fewer than 40 transcription factors, and found our predicted regulators replicate across studies. Predicted regulatory modules and transcription factors involved in HDL and LDL cholesterol levels are most highly expressed in liver cell types, and gene knockouts reported in the MGI database led to abnormalities including perturbed circulating lipid levels and susceptibility to atherosclerosis. Modules and regulators predicted for multiple sclerosis and Crohn’s disease are most highly expressed in CD4+ and CD8+ cells, and knockouts led to defects in B-cell and NK-cell morphology and circulating levels.

O67 - Extraction of Phenotypic Complexity by Image Analysis

Derek Kelly, University of Missouri, United States

Short Abstract: Imaging technologies can rapidly produce high quality experimental data in the physical and life sciences if meaningful objects can be identified. Because what is meaningful is often problem-specific, techniques for image analysis must be tailored to an application while ensuring all salient variations are detected. In genetics, segmenting images of complex phenotypes can be especially difficult because the phenotypes vary along many dimensions. A classic example is the disease lesion mimic mutants of Zea mays, which produce irregularly shaped regions of chlorotic and necrotic tissue on otherwise healthy leaves. These lesions vary considerably in size, shape, color, distribution, and many other measures as a function of the plant's genotype and environmental conditions. To characterize the phenotypes, the lesions must be separated from healthy tissue in images of the leaves. We segment leaf images using a cascade of adaptive algorithms that allow identification of lesions without prior knowledge of their size, shape, type, location, density, or the number of subpopulations of lesions. We first perform a multiresolution transformation by wavelet decompositions at different scales to capture lesions of different shapes and sizes. Next, a gradient vector diffusion is applied to the set of grayscaled, transformed images to delimit putative lesions. These are passed to an active contours algorithm for final boundary adjustment. The set of lesions extracted from each leaf is then quantitatively characterized along multiple geometric and color dimensions. This approach's generality suggests it could analyze other forms of lesions, or problems of similar complexity.

O68 - Interactive Exploration of Spatial Distribution in Mass Spectrometry Imaging

Jan Kölling, Bielefeld University, Germany

Short Abstract: Mass spectrometry imaging generates a series of mass spectra from discrete positions on a tissue or thin-film, thereby providing comprehensive information on molecular composition and spatial distribution in a single experiment. This allows for an untargeted and simultaneous measurement of a wide variety of molecules.
The resulting data is commonly interpreted as a stack of images where each image shows the intensity distribution of one m/z value across all measured positions.
However, untargeted manual inspection of those images is not feasible due to the size of the data sets, and subsequent analysis frequently targets only a few preselected molecules.
QUIMBI is a web-browser-based visualization tool that enhances manual inspection with interactive aggregation and overview capabilities. The similarity between a selected seed spectrum and the spectra of all positions is computed and encoded with a color scale to generate a pseudo color map of the sample. The researcher can change the seed position interactively by hovering over positions in the image to see which regions are similar to the current spectrum, and can limit the comparison to selected m/z ranges. To enable interactive frame rates, both the computations and the rendering are done on the graphics processing unit. If more than one spectrum is selected, the pseudo color maps are combined and the similarities are encoded as a fusion image. The researcher can also select regions of interest in the image to display their mean spectra or select ranges in the spectrum viewer to get the mean intensity image.
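
The seed-based similarity map described above can be sketched in a few lines. This is an illustrative CPU version, not QUIMBI's GPU implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption, as are the function names and the representation of the data set as a dictionary from pixel position to an intensity list over shared m/z bins.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_map(spectra, seed, mz_range=None):
    """Compare the seed spectrum with the spectrum at every position.

    spectra:  dict mapping (x, y) -> list of intensities per m/z bin
    seed:     (x, y) position of the seed spectrum
    mz_range: optional (start, stop) bin indices limiting the comparison
    """
    start, stop = mz_range if mz_range is not None else (0, None)
    seed_spectrum = spectra[seed][start:stop]
    return {
        position: cosine_similarity(seed_spectrum, spectrum[start:stop])
        for position, spectrum in spectra.items()
    }


# Toy data set with three pixels and three m/z bins (hypothetical values).
spectra = {(0, 0): [1, 0, 2], (0, 1): [2, 0, 4], (1, 0): [0, 3, 0]}
full_map = similarity_map(spectra, seed=(0, 0))
first_bin_map = similarity_map(spectra, seed=(0, 0), mz_range=(0, 1))
```

For nonnegative intensities the values lie between 0 and 1, so each value can be mapped directly onto a color scale to give the pseudo color map; changing `seed` corresponds to hovering over a different position.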

O69 - The network of pain: connecting known protein interactions

Ben Sidders, Pfizer, Neusentis,

Short Abstract: Understanding the molecular mechanisms associated with disease is a central goal of modern medical research. As such, many thousands of experiments have been published that detail individual molecular events that contribute to a disease. Here we use a semi-automated text mining approach to accurately and exhaustively curate the primary literature for chronic pain states. In so doing, we create a comprehensive network of 1,002 contextualised protein-protein interactions (PPIs) associated with pain. The PPIs form a highly interconnected and coherent structure, and the resulting network is more relevant and complete compared to networks derived from gene expression and manually curated datasets. We exploit the contextual data associated with our interactions to analyse sub-networks specific to inflammatory and neuropathic pain, and to various anatomical regions. Here, we identify potential targets for further study and several drug-repurposing opportunities. This overall approach has applicability to any biomedical research field and has potential to enhance the value to be derived from the existing literature.

O70 - Intracellular information processing through encoding and decoding of dynamic signaling features

Hirenkumar Makadia, Thomas Jefferson University, United States

Short Abstract: Cell signaling and transcriptional regulatory responses are variable within isogenic cells responding to the same stimulus. Studies thus far have considered the information transfer between the signaling and transcriptional domains based on instantaneous relationships between the molecular abundances. These studies predict a limited binary on/off encoding mechanism that does not explain the complexity of biological information processing. Here we pursue a novel strategy that reformulates the information transfer problem as involving dynamic features of signaling rather than molecular abundances. We developed a computational framework to test if and how the transcriptional regulatory activity patterns can be informative of the temporal history of signaling. Our analysis revealed a novel encoding-decoding mechanism through (1) the dynamic features of signaling that significantly alter transcriptional regulatory patterns (encoding), and (2) the temporal history of signaling that can be inferred from single cell scale snapshots of transcriptional activity (decoding). Immediate early gene expression patterns were informative of delay in signaling kinetics, whereas transcription factor activity patterns were informative of cross-talk, rate-limiting kinetics and history of signaling. We further developed information transfer maps to unravel the dynamic multiplexing of signaling features for each network component. Unsupervised clustering of the maps revealed two groups that aligned with the network motifs distinguished by transcriptional feedforward vs feedback interactions. Our new computational methodology impacts single cell scale experiments by identifying the snapshot measures required for inferring specific dynamic aspects of upstream signaling in individual cells, with broad implications for the study of intracellular information processing via signaling dynamics.

O71 - Multi-scale control of liver regeneration: integrating molecular regulation, cell phenotype dynamics, and physiological response

Daniel Cook, University of Delaware, United States

Short Abstract: Following damage, the liver initiates a recovery program inducing hepatocytes to enter the cell cycle and recover lost mass. Dynamic molecular changes begin as early as 30 seconds after injury and continue for ~1 week, when liver mass is fully restored. Yet much remains to be understood as to how regulation of multiple molecular factors is coordinated to control liver repair mechanisms. Additionally, cell types within the liver become activated to multiple distinct phenotypes contributing to or inhibiting repair. The present study takes a systems-based approach to investigate how multi-scale balances in cell phenotypes and molecular regulation impact liver regeneration.

We developed a computational model to synthesize the intrinsically multi-scale nature of liver regeneration by simulating connections between physiological-scale dynamics, activation phenotypes of non-parenchymal cells, and molecular signaling networks. Model analysis showed that shifting balances between populations of non-parenchymal cell activation phenotypes was sufficient to alter regeneration dynamics and overall tissue recovery following partial hepatectomy. As a perturbation to regeneration phenotype, we simulated alcohol-mediated suppression of liver regeneration by fitting our model to experimental data of liver recovery following chronic alcohol consumption and partial hepatectomy. Based on the model simulations, we predict that chronic alcohol consumption acts at the cellular scale by shifting populations of non-parenchymal cells to anti-proliferative phenotypes. At the molecular scale, these changes are paralleled by dynamic increases in anti-inflammatory cytokine production and high levels of anti-regenerative molecules. We tested these predictions at multiple scales using high-throughput mRNA and protein measurements following partial hepatectomy in alcohol-fed rats and controls.
