20th Annual International Conference on
Intelligent Systems for Molecular Biology


Posters

Poster numbers will be assigned May 30th.
If you cannot find your poster below, you have probably not yet confirmed that you will be attending ISMB/ECCB 2015. To confirm your poster, find the poster acceptance email, which contains a confirmation link. Click on it and follow the instructions.

If you need further assistance please contact submissions@iscb.org and provide your poster title or submission ID.

Category R - ''
R01 - Prediction of an N-glycoprotein structure by using tandem mass spectrometry
Short Abstract: Glycoproteins influence many indispensable biological functions, and changes in protein glycosylation have been observed in various diseases. Mass spectrometry is now a versatile analytical tool for profiling glycan and glycopeptide structures. Since the identification and characterization of glycoproteins and glycosylation sites from mass spectrometry (MS) data remain challenging tasks, great efforts have been made toward developing proteome informatics tools to facilitate MS data analysis. We have developed gFinder, a new analytical tool that allows analysis of mixtures of native N-glycopeptides using tandem MS. Using gFinder, MS/MS spectra can be categorized into N-glycopeptide spectra and native peptide spectra that can be easily identified using the MASCOT search engine. With these data, peptide backbone sequences and possible N-glycan structures were characterized simultaneously with assigned scores. For N-glycan analysis, we used the GlycomeDB glycan structure database, which integrates the structural and taxonomic data of all major carbohydrate databases available in the public domain. Thus, we have developed an approach that provides a convenient and high-throughput method for interpreting tandem mass spectra of N-glycopeptides. (This study was supported by WCU grant R31-2008-000-10086-0 and by the National Project for Personalized Genomic Medicine [A111218-11-CP01], Ministry of Health, Welfare and Family Affairs, Republic of Korea.)
R02 - PIA - Protein Inference Algorithms
Short Abstract: Most search engines for protein identification in MS/MS experiments return protein lists, although the actual search results in a set of peptide spectrum matches (PSMs). The step from PSMs to proteins is called "protein inference". If the identified PSMs support the detection of more than one protein in the searched database ("protein ambiguity"), usually only one representative accession is reported. The reported accession may differ depending on the search engine and settings used. Thus, the protein lists of different search engines are generally not directly comparable. PSMs of complementary search engines are often combined to increase the number of reported proteins or to verify the evidence for a peptide, which is strengthened by detection with distinct algorithms.
We introduce an algorithm suite, including a web-interface, which combines PSMs from different experiments and/or search engines, and reports consistent and thus comparable results.
To represent the connections between proteins and identified peptides and to speed up any further analysis, a directed acyclic graph is created as the internal data structure and saved in an XML file. Via a web interface, the user may change the parameters for the actual protein inference and report generation, e.g. to not show all possible identified proteins and peptides, but to apply Occam’s Razor on a peptide basis or to show only proteins containing at least two identified peptides. These parameters for the inference are not fixed as in prior approaches, but kept as flexible as possible, to allow for any adjustments needed by the user.
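The two report options named in the R02 abstract (Occam's Razor on a peptide basis, and a minimum of two identified peptides per protein) can be illustrated with a minimal sketch. This is not PIA's implementation: the function names, the input format and the greedy set-cover heuristic used to approximate a parsimonious protein list are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(psms):
    """Map each protein accession to the set of peptides supporting it.

    `psms` is a hypothetical input format: (peptide, [accessions]) pairs.
    """
    protein_to_peptides = defaultdict(set)
    for peptide, accessions in psms:
        for accession in accessions:
            protein_to_peptides[accession].add(peptide)
    return dict(protein_to_peptides)


def min_two_peptides(protein_to_peptides):
    """Keep only proteins supported by at least two identified peptides."""
    return {p: peps for p, peps in protein_to_peptides.items() if len(peps) >= 2}


def occam_greedy(protein_to_peptides):
    """Greedy set cover: pick proteins until every peptide is explained.

    A common approximation of parsimony, assumed here for illustration.
    Ties are broken by accession order so the result is deterministic.
    """
    uncovered = set()
    for peptides in protein_to_peptides.values():
        uncovered |= peptides
    chosen = []
    while uncovered:
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & uncovered),
        )
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


# Toy example with made-up peptides and accessions.
example_psms = [
    ("PEPA", ["P1", "P2"]),
    ("PEPB", ["P1"]),
    ("PEPC", ["P2", "P3"]),
    ("PEPD", ["P3"]),
    ("PEPE", ["P4"]),
]
example_graph = build_graph(example_psms)
```

In this toy data, P2 is redundant because its peptides are already explained by P1 and P3, so the greedy pass drops it, while the two-peptide filter drops P4 instead.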
R03 - Signal Processing and Spatial Segmentation of DESI Imaging Mass Spectrometry Data with Regularized and Spatially-Aware Clustering
Short Abstract: Recent advances in matrix-assisted laser desorption/ionization (MALDI) and desorption electrospray ionization (DESI) have demonstrated the usefulness of these technologies in molecular imaging of biological samples. However, development of computational methods for the statistical interpretation and analysis of imaging mass spectrometry (IMS) data remains a challenge. We propose statistically-minded computational methods for analyzing DESI imaging experiments. Specifically, we present techniques for signal processing and unsupervised multivariate image segmentation, which are also applicable to other IMS methods such as MALDI.

Signal processing of DESI spectra typically involves binning to reduce dimensionality, but this is inefficient for downstream analysis as it retains empty regions of the mass spectrum. In our proposed processing step, we apply a novel peak picking algorithm based on windowed smoothing splines that allows adaptive resolution based on spectral profile. Peaks are aligned using a recursive dynamic programming algorithm which accounts for the heterogeneous nature of IMS data by making pairwise alignments between pixels based on their proximity. Peaks are then normalized using total ion count.

To segment the sample into sub-regions of homogeneous chemical composition in MALDI images, Alexandrov & Kobarg (2011) proposed two spatially-aware clustering techniques. We demonstrate that these approaches are also useful for DESI. Moreover, we extend one of these clustering methods using statistical regularization, enabling simultaneous feature selection of structurally-important peaks and facilitating interpretation.

We evaluate the performance of the proposed methods in both a biological and a non-biological example, and show that statistical regularization improves accuracy and interpretation of spatial segmentation over existing approaches.
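The total ion count (TIC) normalization mentioned in the R03 signal-processing step can be sketched as follows. This is a minimal illustration assuming each pixel's spectrum is a plain list of peak intensities; the function name is hypothetical and does not come from the authors' software.

```python
def tic_normalize(intensities):
    """Scale one pixel's peak intensities so that they sum to 1.

    An all-zero spectrum is returned unchanged to avoid division by zero.
    """
    total = sum(intensities)
    if total == 0:
        return list(intensities)
    return [value / total for value in intensities]


def tic_normalize_image(pixels):
    """Apply TIC normalization independently to every pixel spectrum."""
    return [tic_normalize(spectrum) for spectrum in pixels]
```

Normalizing each pixel separately makes relative peak intensities comparable across pixels whose overall signal differs.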
R04 - Naïve Bayes for hot spots predictions in protein interfaces combining graph features
Short Abstract: Protein–protein interactions occur when two or more proteins bind together, often to carry out their biological function. The small fraction of interface residues on the protein surface that provide major contributions to the binding free energy are referred to as hot spots. Identifying hot spots is important for examining the actions and properties occurring around the binding sites. However, experimental studies require significant effort, and computational methods still have limitations in prediction performance and feature interpretation. In this work we describe a Naïve Bayes model for predicting hot spot residues. Accessible surface area, propensity-scaled sequence conservation, inter-residue potentials, small-world structure characteristics, phi-psi interaction features and contact number are collected by combining protein sequence and structure information and are used as input features. The computational alanine scanning value is also used as one of the features. To demonstrate its effectiveness, the proposed method was applied to both the Alanine Scanning Energetics database (ASEdb) and the binding interface database (BID) benchmark datasets. Our prediction model achieved an accuracy of 0.738, F-score of 0.649 on training set with 10-fold cross-validation, and accuracy of 0.682, F-score of 0.653 on training set. Experimental results show that the additional graph feature can improve the prediction performance. We further performed an exhaustive comparison of our method with various machine learning based methods and with previously published prediction models in the literature. Empirical studies show that our method can yield significantly better prediction accuracy.
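For readers less familiar with the two metrics reported in the R04 abstract, the sketch below computes accuracy and F-score from binary labels (1 = hot spot, 0 = non-hot spot). It assumes the usual F1 definition (harmonic mean of precision and recall); the abstract does not state which F-score variant was used, and the function name is hypothetical.

```python
def accuracy_and_f_score(y_true, y_pred):
    """Return (accuracy, F1) for two equal-length lists of 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)
```

Because hot spots are a minority class, the F-score is often more informative than accuracy alone: a classifier that never predicts a hot spot can still reach high accuracy while its F-score is zero.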
R05 - Comparative Proteomics Reveals Unexpected Diversity of Signal Peptides
Short Abstract: Signal peptides are a cornerstone mechanism for protein localization, yet until now experimental determination of signal peptides has come from only a narrow taxonomic sampling. Proteolytic cleavage, such as signal peptide cleavage, creates a new protein N-terminus that is amenable to discovery via proteomics. We surveyed ~140 million tandem mass spectra to define the signal peptide sequences for 31 bacterial and archaeal organisms from nine phyla. Using the prokaryotic proteogenomic pipeline, the signal peptide signature was defined from N-terminal peptides of identified proteins.
The expected AxA motif was easily recapitulated with the identified sequences from Escherichia coli and most other organisms. However, in five organisms we observed for the first time a departure from this canonical motif. Two Alphaproteobacteria (Ehrlichia chaffeensis and Pelagibacter ubique) had a novel motif, for which alanine at –3 and –1 is marginalized. Eight other Alphaproteobacteria in the dataset retained the traditional motif. Both organisms from the Spirochaetes phylum (Leptospira interrogans and Borrelia burgdorferi) exhibited the same pattern of deviation from AxA. The only archaeon in the dataset, Methanospirillum hungatei, had a third distinct motif that was similar to that of humans and yeast, although computational predictions have previously suggested a more bacterial-like motif. Finally, although the variability of positions –3 and –1 was the most obvious deviation from the AxA rule, it appears that –2 is not an unrestricted position.
R06 - customProDB: an R package to generate customized protein database from RNA-Seq data for proteomics search
Short Abstract: Database search is the most widely used approach for peptide and protein identification in mass spectrometry based proteomics studies. We recently showed that a sample-specific protein database derived from RNA-Seq data can better approximate the real protein pool in the sample and thus improve protein identification. Because many research groups have started to apply RNA and protein profiling technologies in parallel to the same samples to gain a complete understanding of cellular systems, we have developed an R package, customProDB, that is dedicated to the generation of customized databases from RNA-Seq data. Based on the assumption that lowly expressed transcripts are less likely to produce proteins abundant enough to be detected by proteomics, the package allows users to filter out unexpressed or lowly expressed proteins according to RNA-Seq based transcript quantification. Functions are provided to either calculate the RPKM values or accept user input measurements from other sources, such as FPKM values from Cufflinks. customProDB also allows users to incorporate SNVs identified from RNA-Seq into the database. It screens all SNVs for non-synonymous coding variations, which are introduced into protein sequences to produce variant protein entries. A FASTA file is generated as the output of the package. In summary, customProDB enables proteomics researchers to easily generate sample-specific protein databases from RNA-Seq data. In addition to taking full advantage of sequencing data, this work also bridges genomics and proteomics studies and facilitates further cross-omics integrative data analysis.
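The expression-based filtering described in the R06 abstract can be sketched with the standard RPKM definition (reads per kilobase of transcript per million mapped reads). The snippet below is an illustration in Python, not the customProDB API, which is an R package; the function names, input layout and the cutoff value are assumptions.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def filter_by_expression(transcripts, total_mapped_reads, cutoff=1.0):
    """Return IDs of transcripts whose RPKM reaches the cutoff.

    `transcripts` maps a transcript ID to (read_count, length_bp).
    The default cutoff of 1.0 is an arbitrary illustrative value.
    """
    return [
        transcript_id
        for transcript_id, (reads, length_bp) in transcripts.items()
        if rpkm(reads, length_bp, total_mapped_reads) >= cutoff
    ]
```

Only the protein sequences of the retained transcripts would then be written to the sample-specific FASTA database.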
R07 - Proteomics Signature Profiling (PSP): A novel contextualization approach applied towards cancer proteomics
Short Abstract: We introduce here a novel hit-rate based analytical method that is capable of resolving consistency and coverage issues in proteomics. We illustrate this in a case study of liver cancer.
R08 - PRIDE Inspector: Supporting quality control in public proteomics data
Short Abstract: PRIDE Inspector tool (http://goo.gl/4i7bU) is an open source application for inspecting MS proteomics data. Experiments can be examined based on different views emphasising either metadata, identified proteins or peptides, mass spectra or quantification results. Another major strength is the possibility to perform a first assessment on data quality using the ‘Summary charts’.
R09 - Transfer learning based methods towards the discovery of protein-protein interactions for new pathogen-host organisms
Short Abstract: The last decade has seen a tremendous increase in protein-protein interaction (PPI) experiments for several organisms, first within single species, and more recently for several host-pathogen species. In our work on host-pathogen PPIs we use these known interaction data to infer and discover PPIs in new hosts or pathogens. Our techniques use a supervised machine learning model built for Salmonella-human PPI prediction and extend it to predict interactions between Salmonella and Arabidopsis proteins. We present methods to “transfer” the known Salmonella-human interactions along with our model’s predictions to the new host, Arabidopsis. This transfer is achieved using intra-host PPI networks of human and Arabidopsis, in addition to parallel transcriptomic data available from both hosts being infected with Salmonella. We also use a Transductive Support Vector Machine-like approach that builds a joint model combining data from the two hosts. Interactions from this ensemble of approaches form our initial set of predicted interactions to be validated experimentally. This set is biased by our knowledge of PPIs in human but will give us an informative start towards exploring the Salmonella-Arabidopsis interactome, for which no experimental data are available so far.
We perform Gene Ontology enrichment analysis on the Arabidopsis proteins from our predicted interactions and find an over-representation of terms related to abscisic acid and brassinosteroid mediated signaling pathways, histone kinase activity, and histone phosphorylation. Preliminary experimental validation of the predictions using mass spectrometry binding studies shows supporting results for some of the predicted plant proteins.
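The R09 abstract does not say which statistical test was used for the Gene Ontology enrichment analysis. A common choice, assumed here purely for illustration, is a one-sided hypergeometric test per GO term; the function and parameter names below are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """P(X >= hits) under a hypergeometric null.

    population: number of background proteins
    annotated:  background proteins carrying the GO term
    sample:     number of predicted proteins
    hits:       predicted proteins carrying the GO term
    """
    upper = min(annotated, sample)
    # Count the ways to draw `sample` proteins with at least `hits` annotated.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)
```

In practice the p-values for all tested terms would also need a multiple-testing correction, which is omitted from this sketch.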
R10 - Technology to identify global dynamics of protein interaction networks
Short Abstract: Cancer and other genetic diseases are mediated by a web of macromolecular interactions that are regulated dynamically (for example, through post-transcriptional modification). Thus, a technology that captures the regulated dynamics of a global-scale protein interaction network would be important to accelerate our understanding of complex diseases. In vivo assays such as affinity purification followed by mass spectrometry (AP-MS) capture interactions under one condition, while in vitro assays such as Y2H capture interactions that could occur under different conditions, so long as these interactions do not require a third co-factor or post-translational modifier. No current method can economically produce many “conditional interactome” maps, each in the presence of different co-factors or modifiers. Here we describe a new technology, BFG-Y2H (Barcode Fusion Genetics-Y2H), which exploits the efficiencies of deep short-read sequencing and offers the potential for one researcher to map dozens of genome-scale conditional interactomes for a given species within one year, at a cost of less than $1,000 per interactome.
R11 - Molecular architecture of the 26S proteasome holocomplex determined by an integrative approach
Short Abstract:
R12 - A three-dimensional map of protein networks within and between species
Short Abstract:
