RSG POSTER ABSTRACTS - 21 through 41

...............................................................................................................................

Poster: P21

Epigenetic dysregulation of human myogenesis affects time regulated eRNA and associated transposable element expression

Loqmane Seridi, King Abdullah University of Science and Technology, Saudi Arabia
Yanal Ghosheh, King Abdullah University of Science and Technology, Saudi Arabia
Beatrice Bodega, Istituto Nazionale Genetica Molecolare ‘Romeo ed Enrica Invernizzi’, Italy
Gregorio Alanis-Lobato, King Abdullah University of Science and Technology, Saudi Arabia
Timothy Ravasi, King Abdullah University of Science and Technology, Saudi Arabia
Valerio Orlando, King Abdullah University of Science and Technology, Saudi Arabia

Transcriptional regulation is a complex process that involves the interaction of transcription factors, promoters, enhancers, noncoding RNAs, transposable elements and chromatin states. To understand the transcriptional regulome, spatiotemporal measurements of its components are necessary. Myogenesis is a model system for studying transcriptional regulation because the factors driving the process are well known and evolutionarily conserved. However, most time-course studies of myogenesis are limited to a few time points and cell lines. Here, using RNA-Seq and CAGE, we deep-sequenced a high-resolution time course of the myogenesis transcriptome from human primary cells of healthy donors and donors affected by Duchenne Muscular Dystrophy (DMD). We compiled a full catalog of coding and non-coding RNAs, promoters, enhancers, and active transposable elements. Comparative analysis of the two time courses suggests a major change in the epigenetic landscape in DMD, leading to global dysregulation of coding and non-coding genes, enhancers, and full-length transposable elements. It also indicates a high correlation between enhancer and transposable element activities.

...............................................................................................................................

Poster: P22
Characterizing the dynamics of enzyme localization


Pablo Meyer, IBM T.J.Watson Research Center, United States
Stacey Gifford, IBM T.J.Watson Research Center, United States


To better understand how enzyme localization affects enzyme activity, we used time-lapse microscopy to study the cellular localization of enzymes necessary for cell wall synthesis (MurA and MurG) in the bacterium Bacillus subtilis. The enzymes localize during exponential growth, and measuring the diffusion coefficient of their complex shows that it diffuses actively around the cell in an antibiotic-dependent manner. Point mutations in the helical domain of one of the proteins, which disrupt its localization to the membrane, caused severe sporulation defects but neither affected localization nor caused detectable defects during exponential growth. We found a lipid-dependent mechanism for MurG localization: in strains where the cardiolipin-synthesizing genes were deleted, MurG levels were diminished at the forespore. These results support localization as a critical factor in the regulation of proper enzyme function and catalysis.

...............................................................................................................................

Poster: P23
The Systems Toxicology Computational Challenge: Identification of Exposure Response Markers


Vincenzo Belcastro, PMI, Switzerland
Carine Poussin, PMI, Switzerland
Stephanie Boue, PMI, Switzerland
Florian Martin, PMI, Switzerland
Alain Sewer, PMI, Switzerland
Bjoern Titz, PMI, Switzerland
Manuel C Peitsch, PMI, Switzerland
Julia Hoeng, PMI, Switzerland

Risk assessment in the context of 21st century toxicology relies on the identification of specific exposure response markers and the elucidation of mechanisms of toxicity, which can lead to adverse events. As a foundation for this future predictive risk assessment, a diverse set of chemicals or mixtures is tested in different biological systems, and datasets are generated using high-throughput technologies. However, the development of effective computational approaches for the analysis and integration of these datasets remains challenging. The sbv IMPROVER (Industrial Methodology for Process Verification in Research; http://sbvimprover.com/) project aims to verify methods and concepts in systems biology research via challenges posed to the scientific community. In fall 2015, the 4th sbv IMPROVER computational challenge will be launched; it is aimed at evaluating algorithms for the identification of specific markers of chemical mixture exposure response in the blood of humans or rodents. Blood is an easily accessible matrix but remains a complex biofluid to analyze. This computational challenge will address questions related to the classification of samples based on transcriptomics profiles from well-defined sample cohorts. Moreover, it will address whether gene expression data derived from human or rodent whole blood are sufficiently informative to identify human-specific or species-independent blood gene signatures predictive of the exposure status of a subject to chemical mixtures (current/former/non-exposure). Participants will be provided with high-quality datasets to develop predictive models/classifiers, and the predictions will be scored by an independent scoring panel. The results and post-challenge analyses will be shared with the scientific community and will open new avenues in the field of systems toxicology.

...............................................................................................................................

Poster: P24
E-Flux2 and SPOT: Validated methods for inferring intracellular metabolic flux distributions from transcriptomic data

Min Kyung Kim, Rutgers University, United States
Anatoliy Lane, Rutgers University, United States
James Kelly, Rutgers University, United States
Desmond Lun, Rutgers University, United States

Several methods have been developed to predict system-wide intracellular metabolic fluxes by integrating transcriptomic data with genome-scale metabolic models. While powerful in many ways, existing methods have several shortcomings, and because of limited validation against experimentally measured intracellular fluxes, it is unclear which method has the best accuracy in general.

We present a general strategy for inferring intracellular metabolic flux distributions using transcriptomic data coupled with genome-scale metabolic reconstructions. It consists of two different template models called DC (determined carbon source model) and AC (all possible carbon sources model) and two new methods called E-Flux2 (E-Flux method combined with minimization of l2 norm) and SPOT (Simplified Pearson cOrrelation with Transcriptomic data), which can be chosen and combined depending on the availability of knowledge on carbon source or objective function. This enables our strategy to be applied to a broad range of experimental conditions. We examined E. coli and S. cerevisiae as representative prokaryotic and eukaryotic microorganisms, respectively. The predictive accuracy of our algorithm was validated by calculating the uncentered Pearson correlation between predicted fluxes and measured fluxes. To this end, we compiled 20 experimental conditions (11 in E. coli and 9 in S. cerevisiae) of transcriptome measurements coupled with corresponding central carbon metabolism intracellular flux measurements determined by 13C metabolic flux analysis (13C-MFA), which is the largest dataset assembled to date for the purpose of validating inference methods for predicting intracellular fluxes. In both organisms, our method achieves an average correlation coefficient ranging from 0.59 to 0.87, outperforming a representative sample of competing methods. Easy-to-use implementations of E-Flux2 and SPOT are available as part of the open-source package MOST (http://most.ccib.rutgers.edu/).

Our method represents a significant advance over existing methods for inferring intracellular metabolic flux from transcriptomic data. It not only achieves higher accuracy, but it also combines into a single method a number of other desirable characteristics including applicability to a wide range of experimental conditions, production of a unique solution, fast running time, and the availability of a user-friendly implementation.
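
The validation metric named in this abstract, the uncentered Pearson correlation, can be illustrated with a minimal sketch. It is computed like the ordinary Pearson correlation but without subtracting the means, which makes it equal to the cosine similarity of the two flux vectors. The function name and the flux values below are hypothetical and are not taken from MOST or from the authors' dataset.

```python
import math

def uncentered_pearson(predicted, measured):
    """Uncentered Pearson correlation (cosine similarity) of two flux vectors."""
    if len(predicted) != len(measured):
        raise ValueError("flux vectors must have the same length")
    dot = sum(p * m for p, m in zip(predicted, measured))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_m = math.sqrt(sum(m * m for m in measured))
    if norm_p == 0 or norm_m == 0:
        raise ValueError("correlation is undefined for an all-zero vector")
    return dot / (norm_p * norm_m)

# Hypothetical flux values, for illustration only
predicted_fluxes = [10.0, 4.0, 0.0, 2.5]
measured_fluxes = [9.0, 5.0, 0.5, 2.0]
score = uncentered_pearson(predicted_fluxes, measured_fluxes)
```

Because the means are not subtracted, two flux vectors that differ only by a positive scale factor score exactly 1.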

...............................................................................................................................

Poster: P25
SetRank: A highly specific tool for pathway analysis


Cedric Simillion, Bern University, Switzerland
Robin Liechti, SIB Swiss Institute of Bioinformatics, Switzerland
Heidi Lischer, University of Zurich, Switzerland
Vassilios Ioannidis, SIB Swiss Institute of Bioinformatics, Switzerland
Rémy Bruggmann, Bern University, Switzerland

The purpose of gene set enrichment analysis (GSEA) is to find general trends in the huge lists of genes or proteins generated by many functional genomics techniques and bioinformatics analyses. We present SetRank, an advanced GSEA algorithm which is able to eliminate many false positive hits. The key principle of the algorithm is that it discards gene sets that have initially been flagged as significant, if their significance is only due to the overlap with another gene set. The algorithm is explained in detail and its performance is compared to that of other methods using objective benchmarking criteria. The benchmarking results show that SetRank is a highly specific and accurate tool for GSEA. Furthermore, we show that the reliability of results can be improved by taking sample source bias into account. SetRank and the accompanying visualization tools are available both as R/Bioconductor packages and through an online web interface.
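
The key principle stated above (discarding a gene set whose significance comes only from its overlap with another set) can be sketched as follows. This is a simplified illustration and not the SetRank implementation: it assumes a plain hypergeometric over-representation test, a fixed significance threshold, and a single pairwise comparison. All function names, the threshold, and the toy gene sets are hypothetical.

```python
from math import comb

def enrichment_p(gene_set, hits, universe_size):
    """Upper-tail hypergeometric p-value for the overlap of gene_set and hits."""
    k = len(gene_set & hits)
    n_set, n_hits = len(gene_set), len(hits)
    upper = min(n_set, n_hits)
    favourable = sum(
        comb(n_set, i) * comb(universe_size - n_set, n_hits - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(universe_size, n_hits)

def significant_only_through_overlap(set_a, set_b, hits, universe_size, alpha=0.05):
    """True if set_a is significant, but not after removing the genes it shares with set_b."""
    if enrichment_p(set_a, hits, universe_size) >= alpha:
        return False  # set_a was never flagged as significant
    remainder = set_a - set_b
    return enrichment_p(remainder, hits, universe_size) >= alpha

# Toy example: a universe of 100 genes g0..g99, of which g0..g9 are hits
hits = {f"g{i}" for i in range(10)}
set_b = {f"g{i}" for i in range(8)} | {"g50", "g51"}
set_a = {f"g{i}" for i in range(6)} | {"g60", "g61", "g62", "g63"}  # all of its hits lie in set_b
set_c = {"g0", "g8", "g9", "g70"}  # keeps two hits outside set_b
```

In this toy data set_a would be discarded in favour of set_b, while set_c keeps enough hits of its own to stay significant.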

...............................................................................................................................

Poster: P26
The role of genome accessibility in transcription factor binding in bacteria


Antonio Gomes, Columbia University, United States
Harris Wang, Columbia University, United States


ChIP-seq enables the identification of regulatory regions that govern gene expression at genome scale. However, the biological insights generated from ChIP-seq analysis have been limited to predictions of binding sites and cooperative interactions. Furthermore, ChIP-seq data often correlate poorly with in vitro measurements or predicted motifs, highlighting that binding affinity alone is insufficient to explain transcription factor (TF) binding in vivo. A more comprehensive biophysical representation of TF binding will improve our ability to understand, predict, and alter gene expression. Here, we show that genome accessibility is a key parameter that impacts TF binding in bacteria. We developed a thermodynamic model that parameterizes ChIP-seq coverage in terms of genome accessibility and binding affinity. The role of genome accessibility is validated using a large-scale ChIP-seq dataset of the M. tuberculosis regulatory network. We find that accounting for genome accessibility leads to a model that explains 69% of the ChIP-seq profile variance, while a model based on motif conservation alone explains only 46% of the variance. Moreover, our framework enables de novo ChIP-seq peak prediction and is useful for inferring TF-binding peaks in new experimental conditions, reducing the need for additional experiments. We observe that the genome is more accessible in intergenic regions, and that increased accessibility is positively correlated with gene expression and anti-correlated with distance to the origin of replication. Our biophysical model provides a more comprehensive description of TF binding in vivo from first principles, towards a better representation of gene regulation in silico, with promising applications in systems biology.

...............................................................................................................................

Poster: P27
Does the overall shape of gene networks differ between cancer and normal states? Towards a comprehensive understanding of cancer system biology by meta-analysis of various cancer transcriptomes

Pegah Khosravi, School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran (Islamic Republic of)
Esmaeil Ebrahimie, Department of Genetics and Evolution, School of Biological Sciences, The University of Adelaide, Adelaide, Australia

Recent advances in computational biology have made it possible to formulate the characteristics of gene networks in terms of network topology statistics. The aim of the present study is to find network topology rules that can distinguish different types of cancer from the normal state. To this end, meta-analysis is employed to analyse the gene regulatory networks of 8 different types of cancer (breast, cervical, esophageal, head and neck, leukemia, prostate, rectal, lung and two subtypes of lung cancer (small cell lung and non-small cell lung)) in comparison to the normal state. Microarray data were downloaded from the GEO database, NCBI. Gene regulatory networks were constructed using the ARACNE algorithm through the Cyni toolbox; subsequently, 20 network statistics were calculated using the NetworkAnalyzer plugin for Cytoscape. These statistics mainly describe the number of edges, clustering coefficient, connected components, network diameter, network centralization, characteristic path length, average number of neighbors, number of nodes, network density, and heterogeneity of the networks. Discriminant function analysis shows that number of edges, network diameter, and average number of neighbors are the main network topology statistics that discriminate cancer networks from normal ones. Cancer networks have fewer edges, a shorter diameter, and a lower average number of neighbors, which confirms the extensive network rewiring during cancer progression. Discriminant function analysis is able to distinguish cancer gene networks from normal ones with 70% accuracy according to a cross-validation test. PCA demonstrates the similarity in network statistics between cervical cancer and breast cancer. Lung cancer has a distinctly different network pattern, with low network centralization and diameter. This study demonstrates the possibility of finding a universal pattern across different types of cancer based on network topological statistics. It also shows that decision tree models (pattern recognition) are successful in finding the pattern of cancer induction based on the important network statistics.
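
A few of the topology statistics named in this abstract (number of nodes and edges, average number of neighbors, density, diameter) can be computed directly from an edge list. The sketch below is plain Python rather than the NetworkAnalyzer plugin the authors used. It assumes an undirected network, and for a disconnected network it reports the largest distance found between reachable nodes, which is a choice made here for illustration. The function name and the toy network are hypothetical.

```python
from collections import defaultdict, deque

def topology_stats(edges):
    """Basic topology statistics of an undirected network given as (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) // 2

    def eccentricity(source):
        # Breadth-first search: largest shortest-path distance from source
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return max(dist.values())

    return {
        "nodes": n,
        "edges": m,
        "avg_neighbors": 2 * m / n if n else 0.0,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "diameter": max((eccentricity(v) for v in adj), default=0),
    }

# Toy network: a path a-b-c-d
stats = topology_stats([("a", "b"), ("b", "c"), ("c", "d")])
```

Vectors of such statistics, one per network, are the kind of input a discriminant function analysis would take.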

...............................................................................................................................

Poster: P28
Comparative Assessment Suite for Transcription Factor Binding Motifs


Caleb Kipkurui Kibet, Department of Computer Science and Research Unit in Bioinformatics (RUBi), Rhodes University, Grahamstown, South Africa
Philip Machanick, Department of Computer Science and Research Unit in Bioinformatics (RUBi), Rhodes University, Grahamstown, South Africa

Predicting transcription factor (TF) binding sites remains an active challenge due to degeneracy and multiple potential binding sites in the genome. The advent of high throughput sequencing has seen several experimental approaches, including ChIP-seq, DNase-seq and ChIP-exo, and dozens of algorithms developed to address the challenge. An increasing number of motif models has been published and those in databases have more than doubled in the last two years. However, there is no standardized means of motif assessment let alone a computational tool to rank the available motifs for a given TF. This makes it hard to choose the best models and for algorithm developers to benchmark, test, quantify and improve on their tools. We introduce a web server hosting a suite of tools that assesses PWM-based motif models using scoring, comparison and enrichment approaches. Given that there is no agreed standard for motif quality assessment, we present a range of measures so users can apply their own judgement. An assess-by-scoring approach uses motif models to score benchmark data partitioned into positive and background sets, then uses AUC, Pearson, MNCP and Spearman’s rank statistics to quantify their performance – scoring functions are energy, GOMER, sum occupancy and sum log-odds. An assess-by-comparison approach seeks to rank, for a given TF, motifs based on similarity to all available motifs in the database using TOMTOM’s Euclidean distance function and FISim. It assumes the best model should be representative of information in the others, provided a variety of data and algorithms is used. This is a quick data-independent approach that has proved to be powerful, reproducing assessment-by-score ranks with over 0.7 average correlation. A web interface to the tools uses the Django framework with a MySQL back end. The database contains 6,530 human and mouse motif models and benchmark data derived from available databases and publications. 
A user-entered test motif for a given TF is ranked against motifs for the same TF in the database using the available benchmark data as well as user-supplied data in BED or FASTA format. Results are returned in interactive visuals providing further information on motif clustering, similarity and ranks, with options to download publication-ready figures and ranked motif data. We have demonstrated the benefit of our web server in motif choice and ranking as well as in motif discovery. Web server and command-line versions are available (link to be added once available, estimated mid-October 2015).
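
The assess-by-scoring idea (score a positive set and a background set with a motif model, then summarize the separation with AUC) can be illustrated with a small sketch. It computes AUC as the fraction of positive/background pairs in which the positive sequence scores higher, counting ties as half. The function name and the scores are hypothetical; the web server's own scoring functions (energy, GOMER, sum occupancy, sum log-odds) are not reproduced here.

```python
def auc(positive_scores, background_scores):
    """Probability that a random positive outscores a random background sequence."""
    wins = 0.0
    for p in positive_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(background_scores))

# Hypothetical motif scores for bound (positive) and background sequences
example_auc = auc([0.9, 0.4], [0.5, 0.1])
```

An AUC of 0.5 means the motif model separates the two sets no better than chance, and 1.0 means every positive sequence outscores every background sequence.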

...............................................................................................................................

Poster: P29
Loregic: A method to characterize the cooperative logic of regulatory factors


Daifeng Wang, Yale University, United States
Koon-Kiu Yan, Yale University, United States
Cristina Sisu, Yale University, United States
Chao Cheng, Dartmouth Medical School, United States
Joel Rozowsky, Yale University, United States
William Meyerson, Yale University, United States
Mark Gerstein, Yale University, United States

Gene expression is controlled by various gene regulatory factors. These factors work cooperatively, forming a complex genome-wide regulatory circuit. Disruption of regulatory cooperativity may lead to abnormal gene expression, as in cancer. Traditional experimental methods, however, can only identify small-scale regulatory activities. Thus, to systematically understand the cooperativity between and among different types of regulatory factors, we need efficient and systematic computational methods. Regulatory circuits have been found to behave analogously to electronic circuits, in which a wide variety of electronic elements work in coordination to function correctly. Recently, an increasing amount of next-generation sequencing data has provided great resources for studying regulatory activity, so it is possible to go beyond this and systematically study regulatory circuits in terms of logic elements. To this end, we developed Loregic, a computational method integrating gene expression and regulatory network data, to characterize the cooperativity of regulatory factors for the first time in cancers such as acute myeloid leukemia, which provided unprecedented insights into the gene regulatory logic of complex biological systems [1]. Loregic uses all 16 possible two-input-one-output logic gates (e.g. AND or XOR) to describe triplets of two factors regulating a common target. We attempt to find the gate that best matches each triplet’s observed gene expression pattern across many conditions. We make Loregic available as a general-purpose tool (loregic.gersteinlab.org). We validate it with known yeast transcription-factor knockout experiments. Next, using human ENCODE ChIP-Seq and TCGA RNA-Seq data, we are able to demonstrate how Loregic characterizes complex circuits involving both proximally and distally regulating transcription factors (TFs) and also miRNAs. 
Furthermore, we show that MYC, a well-known oncogenic driving TF, can be modeled as acting independently from other TFs (e.g., using OR gates) but antagonistically with repressing miRNAs. Finally, we inter-relate Loregic’s gate logic with other aspects of regulation, such as indirect binding via protein-protein interactions, feed-forward loop motifs and global regulatory hierarchy.

[1] Daifeng Wang, Koon-Kiu Yan, Cristina Sisu, Chao Cheng, Joel Rozowsky, William Meyerson, Mark Gerstein, "Loregic: A method to characterize the cooperative logic of regulatory factors," PLoS Computational Biology 11(4): e1004132, April 2015
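
The gate-matching step described in this abstract can be sketched in a few lines: enumerate all 16 two-input-one-output truth tables and pick the one that agrees with the most observations for a triplet. This is a minimal illustration that assumes expression has already been binarized to 0/1 and that uses simple agreement counting. The published Loregic scoring may differ, and the function name and sample data below are hypothetical.

```python
from itertools import product

# The four possible input combinations: (0,0), (0,1), (1,0), (1,1)
INPUTS = list(product((0, 1), repeat=2))

# Each gate is a truth table mapping an input combination to one output bit;
# four output bits give 2**4 = 16 possible gates.
GATES = [dict(zip(INPUTS, outputs)) for outputs in product((0, 1), repeat=4)]

def best_gate(samples):
    """Return (gate, fraction matched) for binarized (factor1, factor2, target) samples."""
    def score(gate):
        return sum(gate[(a, b)] == t for a, b, t in samples) / len(samples)
    best = max(GATES, key=score)
    return best, score(best)

# Hypothetical binarized expression of two factors and their target in six conditions
samples = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1), (1, 1, 1), (0, 1, 0)]
gate, fit = best_gate(samples)
```

For this toy data the target is on only when both factors are on, so the selected truth table is the AND gate.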

...............................................................................................................................

Poster: P30
Affymetrix Probesets as Proxies for Mature MicroRNAs


Rebecca Tagett, Wayne State University, United States
Sorin Draghici, Wayne State University, United States

Motivation:
MicroRNAs are small non-coding RNAs that regulate mRNA abundance post-transcriptionally, and have been implicated in many contexts, from development to disease. Primary microRNAs (pri-miRNAs) from intergenic regions, which represent around half of the known population, are independently transcribed, 5-prime capped and poly-adenylated. Like precursor messenger RNAs, they can be kilobases in length, and must undergo extensive processing.
Previous studies have suggested that the expression of some intergenic pri-miRNAs can be used as a surrogate for the expression of mature microRNAs. Little is known about which pri-miRNAs have this property, and sequence annotation is unavailable for the majority of pri-miRNAs.
The Affymetrix HG U133 Plus 2.0 array includes probesets for 19,612 protein-coding genes and 15,943 non-coding, mostly unannotated transcripts. With over 3000 public data series, these experiments cover a large variety of human tissues, conditions and disease states. We identify pri-miRNAs among the non-protein-coding probesets, and show that some of them can be used as proxies for mature microRNAs, opening up the possibility of combined microRNA-mRNA studies of mRNA abundance in relation to the presence of microRNAs.

Methods:
Using sequence similarity, we leverage the Unigene database to connect microRNA stem-loop sequences to target sequences from U133 Plus 2.0, identifying over 250 probeset-microRNA pairs. Fifty of these are one-to-one matches between a Unigene cluster and a probeset, with one or several associated stem-loop precursors. Having compiled a set of public datasets from experiments where samples are run both on a microRNA platform and on U133 Plus 2.0, we identify pri-miRNA probesets that correlate with mature microRNA. We select two well-studied microRNAs from these results and screen all public U133 Plus 2.0 series to find which sets express the pri-miRNA of interest. We then validate the selected pri-miRNAs based on published literature, and perform functional analyses on the gene sets that are co-expressed and anti-co-expressed with these microRNAs.

Results:
We present the set of probesets from U133 Plus 2.0 which target pri-microRNA transcripts, highlighting which are acceptable surrogates for mature microRNA abundance. Those that do not correlate to mature microRNA abundance may be useful to study microRNA processing regulation, tissue specificity, co-transcription and transcription factor activity. Those that are proxies for mature microRNAs can be studied within the pool of all coding mRNAs, over the vast repository of U133 Plus 2.0 chips, to generate new, testable hypotheses regarding microRNA function. We present some intriguing results for the two pri-miRNAs selected for in-depth analysis.

...............................................................................................................................

Poster: P31
Non-coding isoforms of coding genes in B cell development and malignancies

Irtisha Singh, Memorial Sloan Kettering Cancer Institute, United States
Shih-Han Lee, Memorial Sloan Kettering Cancer Institute, United States
Christina Leslie, Memorial Sloan Kettering Cancer Institute, United States
Christine Mayr, Memorial Sloan Kettering Cancer Institute, United States

Alternative cleavage and polyadenylation (ApA) is most often viewed as the selection of alternative pA signals in the 3'UTR, generating 3'UTR isoforms that code for the same protein. However, ApA events can also occur in introns, generating either non-coding transcripts or truncated protein-coding isoforms due to the loss of C-terminal protein domains, leading to diversification of the proteome. Since previous studies have demonstrated the cell type and condition specific expression of 3'UTR isoforms, we decided to investigate the cell type specificity and potential functional consequences of isoforms generated by intronic ApA. We therefore carried out an analysis of 3'-seq and RNA-seq profiles from chronic lymphocytic leukemia (CLL) and multiple myeloma (MM) samples as compared to mature human B cells (naïve and CD5+) and plasma cells, respectively, together with our previous 3'-seq atlas generated from a wide variety of tissues and cell lines. This analysis showed that intronic ApA is a normal and regulated process, most widely used in immune cells, with intronic ApA events enriched near the start of the transcription unit, yielding non-coding transcripts or messages with minimal coding sequence (CDS). These early intronic ApA events preferentially occur in transcription factors, chromatin regulators, and ubiquitin pathway genes. De novo assembly of RNA-seq data supports ~60% of the intronic ApA events from plasma cells and MM samples, leading to >2000 candidate alternative transcripts arising from intronic ApA, with ~900 transcripts ending near the start of the transcription unit, retaining less than 25% of the coding sequence. Our analysis showed that two thirds of these intronic ApA isoforms have minimal coding potential, likely generating non-coding isoforms from protein coding genes. CLL cells increase the expression of early intronic ApA events relative to mature B cells, while MM cells decrease the expression of these events relative to plasma cells. 
For a fraction of genes, increased expression of isoforms generated by intronic ApA coincides with reduced expression of the full length mRNA in CLLs compared to mature B cells; conversely, lower expression of intronic ApA events coincides with higher full length mRNA expression for some genes in MM samples compared to plasma cells. In these genes, expression of the intronic event may function as a switch to alter full-length mRNA expression. The other fraction of these non-coding isoforms may potentially act as scaffolds for recruiting regulatory factors to the locus.

...............................................................................................................................

Poster: P32
Predicting Metabolic Networks through Pairwise Rational Kernels


Abiel Roche-Lima, Medical Science Campus, University of Puerto Rico, Puerto Rico

Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of chemical reactions, in which the product from one reaction serves as the input to another reaction. Many pathways remain incompletely characterized, and in some of them not all enzyme components have been identified. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This leads to error accumulation when the pathways are predicted from incorrectly annotated genes.

Pairwise kernel frameworks have been used in supervised learning approaches, e.g., Pairwise Support Vector Machines (SVMs), to predict relationships among two pairs of entities. Pairwise kernel methods are computationally expensive in terms of processing, especially when used to manipulate pairs of sequences, for example to predict metabolic networks. Rational kernels are based on transducers to manipulate sequence data, computing similarity measures between sequences or automata. Rational kernels take advantage of the smaller and faster representation and algorithms of weighted finite-state transducers. They have been effectively used in problems that handle large amounts of sequence information, such as protein essentiality, natural language processing and machine translation.

We propose a new framework, Pairwise Rational Kernels (PRKs), to manipulate pairs of sequence data, as pairwise combinations of rational kernels. We develop experiments using SVM with PRKs applied to metabolic pathway predictions in order to validate our methods. As a result, we obtain faster execution times with PRKs than similar pairwise kernels, while maintaining accurate predictions. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We also obtain a new type of Pairwise Rational Kernels based on automaton and transducer operations. In this case, we define new operations over two pairs of automata to obtain new rational kernels. We also develop experiments to validate these new PRKs to predict metabolic networks. As a result, we obtain the best execution times when we compare them with pairwise kernels and the previous PRKs.
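
To make the idea of combining a sequence kernel into a kernel over pairs concrete, here is a minimal sketch under stated assumptions. It uses an n-gram count kernel as the base kernel (a commonly cited example of a rational kernel, computed here with plain counters rather than weighted finite-state transducers) and the tensor product construction as the pairwise combination. The abstract does not say which base kernels or pairwise combinations the PRK framework uses, so both choices, along with all names below, are illustrative assumptions.

```python
from collections import Counter

def ngram_kernel(s, t, n=3):
    """Sum over shared n-grams of the product of their counts in s and t."""
    counts_s = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    counts_t = Counter(t[i:i + n] for i in range(len(t) - n + 1))
    return sum(count * counts_t[gram] for gram, count in counts_s.items())

def pairwise_kernel(pair_a, pair_b, base=ngram_kernel):
    """Tensor product pairwise kernel: symmetric in the order of each pair."""
    (x1, x2), (y1, y2) = pair_a, pair_b
    return base(x1, y1) * base(x2, y2) + base(x1, y2) * base(x2, y1)

# Toy sequences, for illustration only
k_same = pairwise_kernel(("ABCD", "BCDE"), ("ABCD", "BCDE"))
k_swapped = pairwise_kernel(("BCDE", "ABCD"), ("ABCD", "BCDE"))
```

Adding the cross term makes the kernel value independent of the order of the two sequences within a pair, which suits undirected relationships such as "these two enzymes are adjacent in a pathway".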

...............................................................................................................................

Poster: P33
Measuring and interpreting similarity between scale-free biological networks


Qian Peng, Department of Computer Science, Illinois Institute of Technology, United States
Bingqing Xie, Department of Computer Science, Illinois Institute of Technology, United States
Gady Agam, Department of Computer Science, Illinois Institute of Technology, United States

Biological networks, such as metabolic networks, protein-protein interaction networks, and gene regulation networks, are used in numerous applications to reveal the functions of genes, proteins, and molecules. Measuring the similarity between such networks is important for both clustering algorithms and the validation of algorithms. In clustering algorithms, the similarity measure is used to determine which networks can be grouped in the same cluster. In the validation of algorithms, the similarity measure is used to determine the similarity between a synthetic network and an actual one. Synthetic networks are useful for algorithm validation because they can be synthesized in a manner where the ground truth is known (e.g., a network where the clusters are known).

Existing methods for measuring network similarity, such as NetSimile, do not target biological networks specifically and lack absolute interpretation of their measurements. In this paper we propose a principled metric using machine learning which consistently measures the similarity between biological networks and apply it to measure the similarity between actual networks and synthesized ones. In addition to improved performance, similarity in our approach has a meaning of edge rewiring percentage which makes interpreting absolute similarity results easier.

Our similarity classifier uses several network features, such as maximum degree, local and global clustering coefficients, degree exponent, degree distribution, and some other geometric features. To train our model, we used a set of 140 actual biological networks, for which we generated perturbed versions at various levels by randomly rewiring their edges. We use random forest regression with 10-fold cross-validation. Our cross-validation results show that we can accurately estimate the known percentage of edge rewiring with an average accuracy of roughly 5.5%.

After training the model, we measured its performance on predicting the similarity between synthesized networks. In this evaluation, we synthesized 100 networks using the Barabási scale-free network synthesis algorithm. The objective was to measure whether our model can accurately estimate the difference between networks generated with various synthesis parameters (number of vertices and degree distribution). We compared the proposed approach to a standard implementation of NetSimile.
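For readers unfamiliar with scale-free synthesis, the following is a minimal sketch of the standard preferential-attachment procedure. The abstract does not say which implementation or parameters were used, so the function name `barabasi_albert` and the parameters `n` (number of vertices) and `m` (edges added per new vertex) are assumptions made for illustration.

```python
import random

def barabasi_albert(n, m, rng):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    edges = []
    targets = list(range(m))  # the first new node links to the m seed nodes
    repeated = []             # each node appears once per incident edge
    for new in range(m, n):
        edges.extend((t, new) for t in targets)
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Sampling from `repeated` is sampling proportionally to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges
```

Varying `n` and `m` produces networks with different sizes and degree distributions, which is the kind of parameter difference the evaluation above refers to.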

We observed that, in the proposed approach, the produced measure increased monotonically as the difference in synthesis parameters increased, whereas in NetSimile this was not always the case. The paper provides the full details of this evaluation.

...............................................................................................................................

Poster: P34
Global Functional Annotation and Visualization of the 2015 Yeast Genetic Interaction Network


Anastasia Baryshnikova, Princeton University, United States
Michael Costanzo, University of Toronto, Canada
Chad Myers, University of Minnesota, United States
Brenda Andrews, University of Toronto, Canada
Charlie Boone, University of Toronto, Canada

Large-scale biological networks map functional relationships between most genes in the genome and can potentially uncover high level organizing principles governing cellular functions. Despite the availability of an incredible wealth of network data, our current understanding of their functional organization is very limited and nearly inaccessible for biologists. To facilitate the discovery of functional structure and advance its biological interpretation, we developed a systematic quantitative approach to determine which functions are represented in a network, which parts of the network they are associated with and how they are related to one another. Our method, named Spatial Analysis of Functional Enrichment (SAFE), detects network regions that are statistically overrepresented for a functional group or a quantitative phenotype of interest, and provides an intuitive visual representation of their relative positioning within the network. Using SAFE, we examined the most recent genetic interaction network from budding yeast Saccharomyces cerevisiae, which was derived from the quantitative growth analysis of over 20 million double mutants. By annotating the genetic interaction network with GO biological process, protein localization and protein complex membership data, SAFE showed that the network is structured hierarchically and reflects the functional organization of the yeast cell at many different levels of resolution. In addition, we analyzed the network using a large-scale chemical genomics dataset and generated a global view of the yeast cellular response to chemical treatment. This view recapitulated the known modes-of-action of chemical compounds and identified a potentially novel mechanism of resistance to the anti-cancer drug bortezomib. Our results demonstrate that SAFE is a powerful tool for annotating biological networks and a unique framework for understanding the global wiring diagram of the cell.
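The idea of testing network regions for functional overrepresentation can be illustrated with a toy example. The abstract does not state how SAFE defines neighborhoods or which statistic it uses, so the sketch below assumes direct neighbors and a hypergeometric test purely for illustration; `hypergeom_sf` and `neighborhood_enrichment` are hypothetical names.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n nodes are drawn from N, of which K are annotated."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def neighborhood_enrichment(adj, annotated):
    """Enrichment p-value of the annotation in each node's neighborhood."""
    N = len(adj)
    K = len(annotated)
    pvals = {}
    for node, nbrs in adj.items():
        hood = set(nbrs) | {node}       # the node plus its direct neighbors
        k = len(hood & annotated)       # annotated nodes in the neighborhood
        pvals[node] = hypergeom_sf(k, N, K, len(hood))
    return pvals
```

Nodes with small p-values mark regions where the annotation is concentrated; in a real analysis these values would need multiple-testing correction.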

...............................................................................................................................

Poster: P35
Memory of Inflammation in Regulatory T Cells


Joris van der Veeken, Memorial Sloan Kettering Cancer Center, United States
Alvaro J. González, Memorial Sloan Kettering Cancer Center, United States
Hyunwoo Cho, Memorial Sloan Kettering Cancer Center, United States
Aaron Arvey, Memorial Sloan Kettering Cancer Center, United States
Christina S. Leslie, Memorial Sloan Kettering Cancer Center, United States
Alexander Y. Rudensky, Memorial Sloan Kettering Cancer Center, United States

Regulatory T (Treg) cells are a specialized lineage of suppressive CD4 T cells that act as critical negative regulators of inflammation in various biological contexts. Treg cells exposed to an inflammatory environment undergo numerous transcriptional and epigenomic changes, acquire highly enhanced suppressive capacity, and show altered tissue homing potential. Whether these changes represent stable differentiation akin to memory T cells, or a transient adaptation to the inflammatory environment, is currently unclear.

We used an inducible lineage tracing system to analyze the long-term stability of inflammation-induced transcriptional, epigenomic, and functional changes in inflammation-experienced Treg cells. To this end, we performed an integrative computational analysis of ATAC-seq, histone modification (H3K27ac, H3K27me3, H3K4me1) ChIP-seq, and RNA-seq profiles of Treg cells before, during, and two months after exposure to an acute inflammatory environment. We found that Treg cells, in contrast to memory T cells, showed a striking ability to revert activation-induced transcriptional and epigenomic changes and maintained only a selective and specific memory of inflammation. Genes undergoing stable expression changes underwent qualitatively similar but more dramatic chromatin remodeling than genes undergoing transient changes. Stable gene expression changes were further reinforced during secondary Treg cell activation, while genes undergoing transient expression changes were similarly regulated during primary and secondary responses. Moreover, transiently expressed genes did not maintain stable chromatin modifications that would facilitate their reactivation. Importantly, while the activation-induced increase in Treg cell suppressive function was transient, inflammation-experienced Treg cells acquired a stable non-lymphoid tissue preference characterized by differential expression of tissue homing molecules. These data suggest that memory of inflammation allows Treg cells to preferentially localize to non-lymphoid organs to dampen ongoing tissue inflammation, without becoming stably hyperactive and causing an immunosuppressed state.

...............................................................................................................................

Poster: P36
Enhancing the detection of genomic rearrangements to better understand cancer pathology


Francesca Cordero, Department of Computer Science, University of Torino, Italy
Marco Beccuti, Department of Computer Science, University of Torino, Italy
Maddalena Arigoni, University of Torino, Italy
Raffaele Calogero, University of Torino, Italy

Among genome structural variants, genomic rearrangements are one of the major sources of genetic diversity in human cancer. Chimeric (or fusion) genes arise from recombination events in which two DNA sequences break and re-join. Detecting these rearrangements is intrinsically difficult because of both the experimental protocols used and the computational methods implemented to detect fusion events. Fusion events occurring in a specific cell type are usually detected at the transcript level, and the results generated in different laboratories overlap only partially. A prototypical example is fusion detection in the MCF7 cell line: Edgren et al. (Genome Biology, 2011), Kangaspeska et al. (PLoS One, 2012), Sakarya et al. (PLoS Computational Biology, 2012), Maher et al. (PNAS, 2009), and Inaki et al. (Genome Research, 2011) used different tools, sequencing was done with both Illumina and SOLiD technologies, and the starting material was either polyA-selected RNA or total RNA.

To understand the effect of library preparation on fusion detection, we compared different sequencing protocols: polyA selection, the ACCESS protocol, and ribosomal-depleted total RNA.

Our data indicate that sequencing polyA-selected RNA is the least effective method for detecting known MCF7 fusions, while sequencing ribosomal-depleted total RNA is the most effective.

Taking our data together with those previously published, the MCF7 cell line represents an ideal model for evaluating whether specific genomic rules define the sites involved in aberrant translocations.

However, RNA-seq does not provide information on the actual genomic region in which the translocation is located. Thus, we sequenced the MCF7 genome at 35X and detected the breakpoint regions of the known MCF7 fusions. Preliminary data indicate the presence of some patterns that are associated with these events. We are currently evaluating whether we can identify genomic rules that are also observed in translocations annotated in the COSMIC database.

...............................................................................................................................

Poster: P37
In-silico Analysis of Circular RNA as Regulators of miRNA


Nicholas Akers, Icahn School of Medicine at Mount Sinai, United States
Xintong Chen, Icahn School of Medicine at Mount Sinai, United States
Eric Schadt, Icahn School of Medicine at Mount Sinai, United States
Bojan Losic, Icahn School of Medicine at Mount Sinai, United States

MicroRNA (miRNA) are well characterized as important non-coding regulators of cellular gene expression. Less well understood are the mechanisms that regulate miRNA. Recently, circular RNA (cRNA) have been described as a well-expressed, non-coding, tissue-specific RNA product with an ambiguous cellular function. One plausible hypothesis for the function of cRNA is that they specifically bind cellular regulators such as miRNA. This hypothesis was informed by the discovery that the cRNA ciRS-7 contains numerous binding sites for the miRNA miR-7, allowing the cRNA to attenuate the effects of the miRNA. This finding has led to speculation that the cellular role of many cRNA is to ‘sponge’ miRNA as a tier of control over gene expression. To investigate this hypothesis, we used BLAST to align the sequences of all known miRNA with all reported cRNA and looked for enrichment of miRNA binding sites. As a negative control, we also created a database of random-nucleotide miRNA and aligned it to our cRNA database. Our results cover ~236M cRNA/miRNA pairs and indicate that published cRNA are 41% more likely to contain a binding site for a known miRNA than for a randomly generated miRNA (p < 2.2 x 10^-16). In addition, ciRS-7 and miR-7 are among the strongest cRNA/miRNA pairs, residing in the top 0.001% of all combinations in binding sites per base pair. The next step in this experiment is to examine the effects of cRNA expression on miRNA targets. We are analyzing the effect of expression of the cRNA with the most miRNA binding sites on the mRNA targets of these miRNA in a publicly available RNA-Seq dataset. If the hypothesis that cRNA attenuate the effects of miRNA is true, we expect the abundance of mRNA targeted for miRNA-mediated degradation to increase with greater cRNA expression. The results of this analysis will be presented in order to provide clarity on the role of cRNA in regulating mRNA within the cell.
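The real-versus-random comparison described above can be illustrated with a much simpler stand-in for BLAST: counting perfect matches to the miRNA seed region. This is a swapped-in technique, not the authors' pipeline. The seed definition (nucleotides 2 to 8), the function names, and the toy sequences in the usage example are all assumptions made for this sketch.

```python
import random

COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna):
    """Target-site sequence (5'->3') complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    return seed.translate(COMPLEMENT)[::-1]

def count_sites(target, mirna):
    """Count (possibly overlapping) perfect seed matches in a target RNA."""
    site = seed_site(mirna)
    count = 0
    start = target.find(site)
    while start != -1:
        count += 1
        start = target.find(site, start + 1)
    return count

def random_mirna(length, rng):
    """Random-nucleotide control sequence of the given length."""
    return "".join(rng.choice("ACGU") for _ in range(length))

# Toy usage: a made-up miRNA and a made-up circular RNA fragment.
toy_mirna = "AACGGUCAUCGGAUUCAGC"
toy_crna = "GGUGACCGUAAACUGACCGUCC"
control = random_mirna(len(toy_mirna), random.Random(0))
print(count_sites(toy_crna, toy_mirna), count_sites(toy_crna, control))
```

Repeating the count over many real and random miRNA and comparing the two distributions gives the kind of enrichment estimate the abstract reports, although a real analysis would also need to handle the back-spliced junction of circular sequences.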

...............................................................................................................................

Poster: P38
A parallel negative feedback motif exhibits bidirectional control based on differential kinetics in cytokine regulatory networks


Warren Anderson, Thomas Jefferson University, United States
Hirenkumar Makadia, Thomas Jefferson University, United States
Andrew Greenhalgh, McGill University, Canada
James Schwaber, Thomas Jefferson University, United States
Samuel David, McGill University, Canada
Rajanikanth Vadigepalli, Thomas Jefferson University, United States

Negative feedback is critical for maintaining homeostasis within and between cells. Inflammatory immune diseases involve aberrant negative feedback interactions within cytokine regulatory networks. We developed a computational model of a macrophage cytokine interaction network to study the regulatory mechanisms of macrophage-mediated inflammation. We established a literature-based cytokine network, including TNFα, TGFβ, and IL-10, and fitted a mathematical model to published data from LPS-treated microglia (brain macrophages). We evaluated the validity of our model by testing whether it could recapitulate the experimentally determined “tolerance” response to the endotoxin LPS. We applied two doses of LPS and determined the gain of the peak TNFα responses. Our results were consistent with published experimental data demonstrating tolerance to LPS. Global sensitivity analysis revealed that TGFβ- and IL-10-mediated inhibition of TNFα was critical for regulating network behavior. Further analysis revealed that TNFα exhibited adaptation to sustained LPS stimulation. We simulated the effects of functionally inhibiting TGFβ and IL-10 on TNFα adaptation. Our analysis showed that TGFβ and IL-10 knockouts (TGFβ KO and IL-10 KO) exert divergent effects on adaptation. TGFβ KO attenuated TNFα adaptation, whereas IL-10 KO enhanced TNFα adaptation. We experimentally tested the hypothesis that IL-10 KO enhances TNFα adaptation in murine macrophages and found supporting evidence. Next, we tested the effect of IL-10 and TGFβ KO on tolerance using our computational model. Surprisingly, we found that IL-10 KO enhanced tolerance of the TNFα response to sequentially applied LPS doses. In contrast, TGFβ KO repressed LPS tolerance. These opposing effects could be explained by differential kinetics of negative feedback. Inhibition of IL-10 reduced early negative feedback, which resulted in enhanced TNFα-mediated TGFβ expression.
To further assess whether the relative effects of IL-10 and TGFβ could be explained by their differential kinetics, we adapted our macrophage model to a 3-node system. We found that the 3-node parallel negative feedback topology supported robust adaptation and tolerance. Inhibition of relatively fast negative feedback enhanced adaptation and tolerance. In contrast, inhibition of relatively slow negative feedback attenuated adaptation and tolerance. We propose that differential kinetics in parallel negative feedback loops constitute a novel mechanism underlying the complex and non-intuitive pro- versus anti-inflammatory effects of individual cytokine perturbations. Based on the data from our reduced 3-node network, we posit that parallel negative feedback motifs with differential kinetics can be tuned for bi-directional control (i.e., negative and positive influences) in contexts ranging from intracellular biochemical signaling to inter-cellular interactions.
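A parallel negative feedback motif of this general shape can be explored with a toy simulation. The equations, parameter values, and names below are illustrative assumptions and not the authors' fitted model: an output `x` is produced in response to a sustained stimulus, and induces one fast and one slow inhibitor that both reduce its production. Integration uses a plain forward Euler step.

```python
def simulate(stimulus=1.0, k_fast=2.0, k_slow=2.0,
             tau_fast=2.0, tau_slow=20.0, n_steps=20000, dt=0.01):
    """Return the time course of the output under a sustained stimulus."""
    x = fast = slow = 0.0
    trace = []
    for _ in range(n_steps):
        # Both inhibitors act in parallel on production of the output.
        production = stimulus / (1.0 + k_fast * fast + k_slow * slow)
        dx = production - x
        dfast = (x - fast) / tau_fast   # fast feedback arm
        dslow = (x - slow) / tau_slow   # slow feedback arm
        x += dt * dx
        fast += dt * dfast
        slow += dt * dslow
        trace.append(x)
    return trace

def adaptation_index(trace):
    """Fractional drop from the peak response to the final level."""
    peak = max(trace)
    return (peak - trace[-1]) / peak

print(round(adaptation_index(simulate()), 3))
```

Setting `k_fast` or `k_slow` to 0 removes the corresponding loop, which is one way to explore the kind of knockout comparison the abstract describes; whether this toy system reproduces the reported directions of effect depends on the assumed equations and parameters.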

...............................................................................................................................

Poster: P39
Sequence biases in CLIP experimental data are incorporated in protein RNA-binding models


Yaron Orenstein, MIT, United States
Bonnie Berger, MIT, United States

Protein-RNA interactions play important roles in many processes in the cell. CLIP-based methods measure protein-RNA binding in vivo in a high-throughput manner on a genome-wide scale. In these technologies, the protein is cross-linked to the RNA and pulled down. The protein is then removed following cleavage of the RNA by a ribonuclease. Later, bound RNA segments are sequenced and mapped back to the genome, where they are called as peaks.

Here, we present a newly identified bias in CLIP peaks, which we call the ‘terminating G’. Most called peaks terminate in a G, since RNase T1 cleaves at accessible G’s much more strongly than at other nucleotides. The fact that most raw sequences do not terminate at a G implies that this bias is introduced in the peak-calling process. Unfortunately, protein RNA-binding preferences are not easily disentangled from enzyme specificities. Thus, we call for an appropriate experimental control to measure the specificities of the cleaving enzyme. These should later be incorporated as covariates in the peak-calling process to identify unbiased binding sites. Then, better algorithms may be developed to predict binding sites more accurately.
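Checking a dataset for this kind of bias amounts to tabulating the last nucleotide of each sequence. The sketch below is a minimal illustration; the function name and the toy sequences are hypothetical, and sequences are assumed to be given 5' to 3' as plain strings (with T treated as U).

```python
from collections import Counter

def terminal_base_fractions(sequences):
    """Fraction of sequences ending in each nucleotide (A, C, G, U)."""
    counts = Counter(
        seq[-1].upper().replace("T", "U") for seq in sequences if seq
    )
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGU"}
    return {base: counts[base] / total for base in "ACGU"}

# Toy usage with made-up sequences.
called_peaks = ["AUCG", "GGAG", "UUCA", "CCGG"]
raw_reads = ["AUCA", "GGAU", "UUCG", "CCGC"]
print(terminal_base_fractions(called_peaks)["G"])
print(terminal_base_fractions(raw_reads)["G"])
```

Comparing the fraction for called peaks against the fraction for raw reads is the contrast the abstract uses to argue that the bias arises during peak calling.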

...............................................................................................................................

Poster: P40
A novel study of the scope and limitations of baker’s yeast as a model organism for human tissue-specific pathways


Shahin Mohammadi, Purdue University, United States
Baharak Saberidokht, Purdue University, United States
Shankar Subramaniam, University of California, San Diego, United States
Ananth Grama, Purdue University, United States

Budding yeast, S. cerevisiae, has been used extensively as a model organism for studying cellular processes in evolutionarily distant species, including humans. However, different human tissues, while inheriting a similar genetic code, exhibit distinct anatomical and physiological properties. The driving biochemical processes and associated biomolecules that mediate the differentiation of various tissues are not completely understood, nor is the extent to which a unicellular organism, such as yeast, can be used to model these processes within each tissue.

We propose a novel computational framework coupled with the corresponding statistical model to assess the suitability of yeast as a model organism for different human tissues. Using our method, we dissect the functional space of human tissue-specific networks according to their conservation both across species and among different tissues.

Using a case study of the GNF Gene Atlas dataset, we classify different tissues based on their similarity to yeast. In cases where the suitability of yeast can be established, through conservation of tissue-specific pathways in yeast, it can serve as an experimental model for further investigations of new biomarkers, as well as an unbiased phenotypic screen to assay pharmacological and genetic interventions. On the other hand, for tissues with missing functionality in yeast, we provide molecular constructs (gene insertions) for creating more appropriate, tissue-engineered humanized yeast models.

...............................................................................................................................

Poster: P41
Deciphering single-cell transcriptional heterogeneity to understand principles of neuronal phenotype organization and plasticity


James Park, University of Delaware, United States
Babatunde Ogunnaike, University of Delaware, United States
James Schwaber, Thomas Jefferson University, United States
Rajanikanth Vadigepalli, Thomas Jefferson University, United States

Reconciling a cell’s transcriptional state and its phenotypic function is confounded by the transcriptional heterogeneity observed at the single-cell scale. This transcriptional heterogeneity conflicts with the traditional expectation that a neuronal phenotype consists of functionally identical neurons that respond uniformly to synaptic and neuromodulatory inputs. Moreover, this transcriptional heterogeneity is prominent within and across post-mitotic neuronal populations throughout the brain, where neurons interact to form circuits that regulate physiological function. High-throughput “-omic” level analysis, however, suggests that a more complex molecular organization potentially underlies neuronal phenotypic function and emergent systems-level behavior that occurs in the brain. In order to understand the functional relevance of this transcriptional heterogeneity, we examined two distinct brain nuclei by analyzing the transcriptional responses of individual neurons responding to specific physiological perturbations. In the first case, we generated a high-dimensional gene expression data set from individual blood pressure-regulating neurons within the nucleus tractus solitarius (NTS) that were collected from rats undergoing an acute hypertensive challenge. In the second case, we analyzed the transcriptional states of hundreds of single neurons within the suprachiasmatic nucleus (SCN) from mice responding to a light-induced phase shift in circadian rhythms. Using a combination of multivariate analytical techniques, graph network theory, and a novel fuzzy logic-based regulatory network modeling methodology, we identified molecular organizational structures in which individual neurons from both brain nuclei form distinct transcriptional states that align with synaptic/neuromodulatory inputs.
Concomitantly, our quantitative regulatory network models and simulations of NTS neurons suggest that distinct networks correspond to these subtypes and drive heterogeneous gene expression behavior in a continuous fashion. Within the SCN, the presence of transcriptionally distinct neuronal subtypes provides insight into the organization and intercellular interactions underlying SCN regulation of circadian function. Having identified these SCN neuronal subtypes, we are now able to postulate a cellular interaction network in which specific neuronal subtypes fulfill specific functional roles in regulating circadian phase-shift behavior.

