ISMB/ECCB 2011 Posters

19th Annual International Conference on
Intelligent Systems for Molecular Biology and
10th European Conference on Computational Biology

Accepted Posters

Category 'R' - Proteomics

Poster R01

Statistical approach to absolute protein quantification

Sarah Gerster Eidgenössische Technische Hochschule Zürich

Peter Bühlmann (Eidgenössische Technische Hochschule Zürich, Seminar für Statistik (Mathematik));

Short Abstract: A major goal in proteomics is the comprehensive and accurate description of a proteome. Proteomics provides additional insights into biological systems that cannot be provided by genomic or transcriptomic approaches (Aebersold and Mann, 2003). In particular, proteomics holds great promise for the identification of biomarkers capable of accurately predicting disease already at a very early stage.

The method of choice for the analysis of complex protein mixtures is shotgun proteomics. Proteins are identified and quantified based on experimentally measured peptides. While several probabilistic models exist for the identification of proteins, label-free quantification is often done in a deterministic way.

We propose a statistical approach to protein quantification with three main advantages. (i) Peptide intensities are modeled as random quantities, which makes it possible to account for the uncertainty of these measurements. (ii) Our Markovian-type model for bipartite graphs ensures transparent propagation of the uncertainties and reproducible results. (iii) The problem of peptides mapping to several protein sequences (often neglected in other models) is addressed automatically by our statistical model.

The performance of our model is shown on two synthetic control datasets and compared to the results of two common approaches for protein quantification (Silva et al., 2006; Lu et al., 2007).

References:
[1] Aebersold and Mann, (2003), Mass spectrometry-based proteomics. Nature 422(6928):198-207.
[2] Silva et al., (2006), Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol. Cell. Proteomics 5:144–156.
[3] Lu et al., (2007), Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol. 25(1):117-24.
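
To make the shared-peptide problem from point (iii) concrete, the following minimal sketch builds the peptide-protein bipartite graph and lists the peptides that map to more than one protein. It does not implement the statistical model of the abstract; the function names and the toy protein and peptide identifiers are assumptions made for this example.

```python
from collections import defaultdict


def build_bipartite(protein_to_peptides):
    """Invert a protein -> peptides mapping into peptide -> set of proteins."""
    peptide_to_proteins = defaultdict(set)
    for protein, peptides in protein_to_peptides.items():
        for peptide in peptides:
            peptide_to_proteins[peptide].add(protein)
    return dict(peptide_to_proteins)


def shared_peptides(peptide_to_proteins):
    """Return the peptides that are connected to more than one protein."""
    return {
        peptide: sorted(proteins)
        for peptide, proteins in peptide_to_proteins.items()
        if len(proteins) > 1
    }


# Toy data (hypothetical identifiers, for illustration only).
proteins = {
    "protA": ["PEPTIDEK", "SHAREDR"],
    "protB": ["SHAREDR", "UNIQUEK"],
}
graph = build_bipartite(proteins)
shared = shared_peptides(graph)
print(shared)
```

A deterministic pipeline has to decide by rule how the intensity of a shared peptide is split between its proteins; a statistical model on this graph can treat that assignment as part of the inference.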

Poster R02

Accurate Retention Time Prediction for Post-translationally Modified Peptides

Luminita Moruz Stockholm University

An Staes (University of Ghent, Department of Medical Protein Research); Evy Timmerman (University of Ghent, Department of Medical Protein Research); Lennart Martens (University of Ghent, Department of Medical Protein Research); Lukas Käll (Stockholm University, Department of Biochemistry and Biophysics); Maria Hatzou (Stockholm University, Department of Biochemistry and Biophysics); Joe Foster (European Bioinformatics Institute, Wellcome Trust Genome Campus);

Short Abstract: Predictions of peptide retention times in liquid chromatography have proven to be a valuable tool in mass spectrometry-based proteomics. In shotgun experiments, differences between predicted and observed retention times are used to remove incorrect peptide identifications. In targeted proteomics, such predictions are employed for designing efficient mass spectrometry experiments. While extensive work has been put into developing accurate predictors for unmodified peptides, little effort has been directed towards post-translationally modified peptides. In a previous study we proposed ELUDE, a machine learning-based retention time predictor. We showed that ELUDE outperforms state-of-the-art predictors, while including a special workflow for dealing with small datasets. Here we extended our software to peptides carrying arbitrary post-translational modifications. We evaluated the new version of ELUDE by applying it to five types of modifications that shift the retention time of the peptides. Our results suggest that ELUDE performs equally well for modified and unmodified peptides. Furthermore, our predictor can handle peptides with more than one modification. ELUDE is fully portable to new chromatographic conditions and can easily be applied to other types of post-translational modifications. Our software can be downloaded under the Apache License at www.per-colator.com or used via a web interface at http://elude.sbc.su.se.
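
As an illustration of the shotgun use case described above, the following minimal sketch discards peptide identifications whose observed retention time deviates too far from the predicted one. It does not call ELUDE; the record layout, the function name, the retention times and the 5-minute tolerance are all assumptions made for this example.

```python
from typing import NamedTuple


class PeptideId(NamedTuple):
    sequence: str
    observed_rt: float   # minutes, as measured
    predicted_rt: float  # minutes, from any retention time predictor


def filter_by_rt(ids, tolerance=5.0):
    """Keep identifications with |observed - predicted| <= tolerance.

    The tolerance (in minutes) is a hypothetical value for illustration.
    """
    return [p for p in ids if abs(p.observed_rt - p.predicted_rt) <= tolerance]


# Toy identifications (made-up retention times).
ids = [
    PeptideId("LVNELTEFAK", 32.1, 30.8),
    PeptideId("AEFVEVTK", 18.4, 41.0),  # large deviation, likely incorrect
    PeptideId("YLYEIAR", 27.9, 28.3),
]
kept = filter_by_rt(ids)
print([p.sequence for p in kept])
```

In practice the tolerance would be derived from the prediction error observed on confidently identified peptides rather than fixed in advance.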

Poster R03

Chimeras knocking at the door

Milana Frenkel-Morgenstern Spanish National Cancer Research Centre (CNIO)

Alfonso Valencia (Spanish National Cancer Research Centre (CNIO), Structural Biology and BioComputing);

Short Abstract: Chimeric RNAs are distinct from conventional alternatively spliced isoforms; they may result from the trans-splicing of pre-mRNAs, or alternatively, may be the product of gene fusion following translocations or rearrangements. Chimeric transcripts can contribute to the complex completion of distinct cellular processes, permitting a combinatorial increase in the gene products available. So far, only a limited number of chimeric transcripts and/or their associated protein products have been characterized, most of which result from chromosomal translocations and many of which are associated with cancer. Therefore, it is important to extend these observations in order to catalog the chimeric transcripts that exist and to study the potential functions of their corresponding chimeric proteins.
Here, we used a dataset of more than 900 known chromosomal breakpoints, which are characterized by the production of chimeric or fusion proteins in cancers. Using this dataset, we identified important consequences of chromosomal translocations in the context of chromatin structure. Intriguing examples of specific translocations are discussed in detail. Finally, we found constitutive heterochromatin rearrangements, which can contribute to cancer progression by changing gene expression via long-range epigenetic mechanisms.

Poster R04

Bioinformatics for Proteomics: developing new tools for the analysis of experimental data

Angelo Facchiano Italian National Research Council

Anna Marabotti (Italian National Research Council, Institute of Biomedical Technologies, Segrate (MI)); Francesco Facchiano (Istituto Superiore di Sanità, Dept. of Haematology, Oncology and Molecular Biology);

Short Abstract: Over the last years, our labs have developed bioinformatics and computational tools to support the complex analysis of experimental data obtained in proteomics studies. A dedicated web site has been created to host these tools, and we plan to extend it with additional tools required by our present and future users. The main tools cover the analysis of mass spectrometry data in terms of spectra alignment and comparison; the search for specific protein features in the PubMed archive by means of an automatic search engine; the analysis of protein structure at different levels (primary, secondary and tertiary structure); and the analysis of lists of protein sequences, or their accession numbers, to simulate proteolysis with common proteases and predict features of the expected mixture of fragments, for comparison with experimental results from proteomics and peptidomics studies. Additional tools are available for editing lists and tables, so that experimental data can be arranged in a format suitable for the tools, although these editing features can also be used for other purposes. Links to our previous tools for protein and peptide investigation are also provided.
The web site for accessing the tools is at the URL: http://www.bioinformatics.org/bioinfo-af-cnr/proteomics_tools/
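
The simulated proteolysis mentioned in the abstract can be sketched as follows for one common protease. This is not the code behind the web site; it is a minimal example that assumes the usual trypsin rule (cleave after K or R unless the next residue is P), and the function name and the test sequence are hypothetical.

```python
def digest_trypsin(sequence, missed_cleavages=0):
    """Return tryptic peptides of a protein sequence.

    Cleaves C-terminal to K or R, except when the next residue is P.
    Peptides spanning up to `missed_cleavages` uncut sites are included.
    """
    if not sequence:
        return []
    # Positions where the chain is cut, including both termini.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for i in range(len(sites) - 1):
        last = min(i + 2 + missed_cleavages, len(sites))
        for j in range(i + 1, last):
            peptides.append(sequence[sites[i]:sites[j]])
    return peptides


peptides = digest_trypsin("AAKPGRCDEKFF")
print(peptides)
```

The resulting fragment list could then be annotated with features such as length or mass and compared with the peptides observed experimentally.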

Poster R05

Predict subcellular localization for proteins in all kingdoms

Tatyana Goldberg Technische Universität München

Burkhard Rost (Technische Universität München, Department for Bioinformatics and Computational Biology);

Short Abstract: Predicting the subcellular localization of a protein is an important step towards understanding its function. Here, we present a new approach to predict localization in all five taxonomic kingdoms, namely bacteria, archaea, fungi, plants and animals. We use a hierarchical system based on support vector machines (SVMs). The SVMs allow prediction of protein subcellular localization by mimicking the cascading mechanism of cellular sorting. The method is trained on a non-redundant data set of proteins of known localization from SWISS-PROT. We target more than 20 groups including cell wall, extracellular space, nucleus, cytosol, endoplasmic reticulum, Golgi apparatus and melanosome. The input is an amino acid sequence and a taxonomic classification. First, TMHMM (Krogh et al.) predicts transmembrane helices. Subsequently, the amino acid sequence is processed by SVMs in order to detect compartment-specific patterns. Finally, the protein is assigned to one of the subcellular classes and a confidence of the prediction is reported. Our program is useful for predicting the subcellular localization of newly discovered proteins, and its results may be applied to other problems in biology.

Poster R06

Quantitative mass spectrometry data based pathway analysis

Philipp Gormanns Max-Planck-Institut für Psychiatrie

Simone Wolf (Max-Planck-Institut für Psychiatrie, Proteomics und Biomarker); Chris Turck (Max-Planck-Institut für Psychiatrie, Proteomics und Biomarker); Ralf Zimmer (Ludwig-Maximilians-Universität München, Lehrstuhl für Praktische Informatik und Bioinformatik);

Short Abstract: Shotgun mass spectrometry in combination with stable isotope labeling methods enables the determination of relative protein abundance between two physiological conditions. Protein abundance ratios are estimated from the quantitative data of multiple peptides and can subsequently be analyzed by pathway enrichment methods to identify affected networks. We developed an approach that predicts interesting subnetworks using the relative peptide abundance ratios. Density estimation is applied to pathway modules, which consist of the corresponding peptides, to discover pathway hotspots. Using peptide instead of protein abundance ratios ensures that no information is lost between the experimental data and the pathway analysis. This yields a data model that is suitable for subnetwork inference. The inferred probabilities directly indicate potential dysregulations of subnetworks, which allows the detection of the most prominent subnetworks affected by the physiological condition under investigation. The method has been applied to datasets derived from trait anxiety mouse line specimens. Interesting candidate networks have been identified and will be validated further.

Poster R07

Protein Identification Using Top-Down Spectra

Yakov Sirotkin St. Petersburg Academic University, Algorithmic Biology Laboratory

Xiaowen Liu (University of California, San Diego); Yufeng Shen (PNNL, Biological Science Division); Gordon Anderson (PNNL, Biological Science Division); Yihsuan Tsai (University of Washington, Department of Medicinal Chemistry); Ying Ting (University of Washington, Department of Medicinal Chemistry); David Goodlett (University of Washington, Department of Medicinal Chemistry); Richard Smith (PNNL, Biological Science Division); Vineet Bafna (University of California, San Diego, Department of Computer Science and Engineering); Pavel Pevzner (University of California, San Diego, Department of Computer Science and Engineering);

Short Abstract: In the last two years, due to advances in mass spectrometry, top-down mass spectrometry has moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for searching top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications (PTMs). Given a top-down deconvoluted spectrum, we use a dynamic programming algorithm to find the best spectral alignment between the spectrum and each protein in the protein database and report the best-scoring protein. To speed up the protein identification process, we apply a spectral convolution approach to quickly filter out the proteins that cannot have good alignments. In addition, we estimate the E-value of each identified protein-spectrum match (PrSM) using a generating function approach. We developed a web-based software tool for MS-Align+. It has a user-friendly interface and interactive web pages that show identification results at the spectrum, protein species and protein levels.

We tested MS-Align+ on two top-down data sets from Saccharomyces cerevisiae (SC; 8,468 deconvoluted spectra) and Salmonella typhimurium (ST; 3,582 deconvoluted spectra). We identified 2,455 and 1,490 MS/MS spectra with an E-value threshold of 0.0001 from the SC and ST data sets, respectively. On the SC data set, MS-Align+ identified 123 new proteins compared to SEQUEST. On the ST data set, MS-Align+ identified 86 new proteins compared to ProSightPTM, and 35 proteins compared to PIITA.
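
The filtering idea described in the abstract can be illustrated with a simplified spectral convolution: count how many (peak, prefix mass) pairs share the same mass offset, and keep only proteins for which some offset is supported by enough pairs. This is a sketch under stated assumptions, not the MS-Align+ implementation; the function names, the bin width, the score threshold and the toy masses are all hypothetical.

```python
from collections import Counter


def convolution_score(spectrum_masses, prefix_masses, bin_width=0.1):
    """Largest number of (peak, prefix) pairs that share one mass offset."""
    offsets = Counter(
        round((s - t) / bin_width)  # bin the offset to tolerate small errors
        for s in spectrum_masses
        for t in prefix_masses
    )
    return max(offsets.values()) if offsets else 0


def filter_candidates(spectrum_masses, proteins, min_score=3):
    """Keep proteins with at least `min_score` pairs at a common offset."""
    return [
        name
        for name, prefix_masses in proteins.items()
        if convolution_score(spectrum_masses, prefix_masses) >= min_score
    ]


# Toy masses (made up for illustration, not real residue masses).
spectrum = [100.0, 250.0, 400.0, 580.0]
proteins = {
    "protein_A": [100.0, 250.0, 400.0, 500.0, 700.0],  # three peaks at offset 0
    "protein_B": [120.0, 333.0, 610.0, 905.0],         # no consistent offset
    "protein_C": [58.0, 208.0, 358.0, 538.0],          # all peaks at offset +42
}
candidates = filter_candidates(spectrum, proteins)
print(candidates)
```

A nonzero common offset, as for protein_C, is what one would expect when an unexpected modification shifts a run of fragment masses, so such proteins survive the filter and can be passed on to the more expensive alignment step.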

Poster R08

PRIDE, the PRoteomics IDEntifications database

Johannes Griss EMBL-European Bioinformatics Institute

Melih Birim (EMBL, European Bioinformatics Institute); Richard Côté (EMBL, European Bioinformatics Institute); Attila Csordas (EMBL, European Bioinformatics Institute); Joe Foster (EMBL, European Bioinformatics Institute); Gavin O'Kelly (EMBL, European Bioinformatics Institute); David Ovelleiro (EMBL, European Bioinformatics Institute); Daniel Ríos (EMBL, European Bioinformatics Institute); Florian Reisinger (EMBL, European Bioinformatics Institute); Rui Wang (EMBL, European Bioinformatics Institute); Juan A Vizcaino (EMBL, European Bioinformatics Institute); Henning Hermjakob (EMBL, European Bioinformatics Institute);

Short Abstract: The PRoteomics IDEntifications database (PRIDE, http://www.ebi.ac.uk/pride) at the European Bioinformatics Institute (EBI) has become one of the main repositories of mass spectrometry (MS) derived proteomics data. PRIDE stores three different kinds of information: MS and MS/MS mass spectra as peak lists, the derived peptide and protein identifications, and any associated metadata.
A major aim of PRIDE is to support the peer-review process in the proteomics community. Therefore, we developed a new open source application called PRIDE Inspector (http://code.google.com/p/pride-toolsuite/). It allows researchers to examine their own data sets before the actual submission to PRIDE, or to access data already in PRIDE for data mining purposes. It can also be used by journal editors and reviewers, since it facilitates the thorough review of submitted data at the pre-publication stage. Different views of the data are provided: metadata, spectrum, peptide and protein centric. A major strength of the tool is the ability to perform a first assessment of data quality, since a variety of charts based on the data are generated automatically.
Finally, PRIDE serves as the entry point for the ProteomeXchange consortium (http://www.proteomexchange.org), which aims at establishing a regular data exchange between major proteomics repositories.

Poster R09

Lost Data – The Consequences of Changing Protein Accession

Johannes Griss EMBL-European Bioinformatics Institute

Richard Côté (EMBL, European Bioinformatics Institute); Christopher Gerner (Medical University of Vienna, Department of Medicine I); Henning Hermjakob (EMBL, European Bioinformatics Institute); Juan A Vizcaino (EMBL, European Bioinformatics Institute);

Short Abstract: Proteomics data is produced in constantly growing quantities. The storage of digital data for long periods of time always comes with the question of how long it will be possible to read the data. Protein identifications are reported and stored using an unstable reference system: protein identifiers. These identifiers are created by every protein database individually and can change or even be deleted over time.
Therefore, we analyzed the changes of the reported protein identifiers from all public experiments in the PRIDE database (http://www.ebi.ac.uk/pride) as of November 2010. We mapped every protein identifier to a currently active entry using two independent approaches. The first one used the PICR service at the EBI, which is based on 100% sequence identity. The second one (termed the logical mapping algorithm) accessed the source databases and retrieved the current status of the reported identifier.
Our analysis showed great differences in identifier stability between the most commonly used protein databases (IPI, UniProtKB, NCBI nr database and Ensembl). In this analysis UniProtKB proved to be the most stable resource. On the other hand, 20% of the identifiers in experiments using IPI had already been deleted after only two years. In addition, we investigated the proportion of peptide identifications that no longer fit the current protein sequence. To our knowledge, this is the first analysis that quantifies the effect of changing protein sequences on the long-term storage of proteomics data.

Poster R10

Proteomics based genome annotation using conceptually translated ESTs

Paul Blakeley University of Manchester

Jennifer Siepen (University of Manchester, Faculty of Life Science); Simon Hubbard (University of Manchester, Faculty of Life Science);

Short Abstract: Mass Spectrometry (MS) based proteomics is a powerful approach to gain evidence for protein-coding genes and splice variants predicted from genome sequences. However, protein and peptide identification is critically dependent on searching the MS data against a database of amino acid sequences. Alternative strategies include searching against theoretical translations of the “raw” genome sequence, de novo gene predictions and transcriptome-derived data. These ‘proteogenomic’ approaches enable the high-throughput discovery of proteins that are not present in existing sequence datasets (e.g. Ensembl), as well as validation of those that are. Most proteogenomic studies to date have searched MS data against a 6-frame translation of a genome sequence and have facilitated improvements in gene models and gene discovery. However, this approach remains challenging owing to the size of the database, the high proportion of false positives and the difficulties associated with estimating error rates. As an alternative, we have investigated different methodologies that can be used to overcome these problems using transcriptome data, thereby reducing the size of the database, improving the conceptual translation of nucleotide sequences, and applying appropriate measures of statistical significance. We demonstrate this approach using the chicken genome as a model system, searching MS data against protein sequences predicted from assembled Expressed Sequence Tags (ESTs). This approach enabled the validation of 2526 known Ensembl proteins, as well as 145 single amino acid polymorphisms and 275 candidate novel genes, illustrated by the discovery of aldolase c, which was previously unannotated in the chicken genome.
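
For readers unfamiliar with the 6-frame translation mentioned in the abstract, the following minimal sketch translates a DNA sequence in all three forward and three reverse-complement reading frames using the standard genetic code. It is a generic illustration, not the pipeline used in the study; the function names, the frame labels and the example sequence are assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations for frames +1..+3 and -1..-3 ('*' marks a stop)."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


frames = six_frame_translation("ATGGCCAAGTAA")
for label, protein in sorted(frames.items()):
    print(label, protein)
```

Because every nucleotide position contributes to six candidate protein sequences, a search database built this way from a whole genome is very large, which is the size and false-positive problem the abstract describes.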

Poster R11

Data mining on a leukemia phosphoproteome dataset: insights on chemoresistant cell signaling

Renato Milani University of Campinas

Carmen Ferreira (University of Campinas, Department of Biochemistry); Eduardo Galembeck (University of Campinas, Department of Biochemistry);

Short Abstract: Biologists at the bench have been generating ever-growing amounts of data from high-throughput techniques. However, there is often a shortage of means to analyze these data. Reversible protein modification by phosphorylation is a ubiquitous mechanism to control signal transduction networks governing biological processes such as chemoresistance and cancer itself. Techniques like phosphoproteomics enable researchers to evaluate whether and where a phosphorylation reaction takes place in the cell. Here we present a leukemia chemoresistance phosphoproteome dataset generated by our own research group. Our objective was to establish, for each protein in the dataset, whether a phosphorylation event activates or inhibits the enzyme. We built an automated tool, called PhosphoActivity, to extract this information from trusted online databases and submitted 4,257 phosphorylation entries to it. With this information, we could distinguish between active and inhibited enzymes that may play a role in chemoresistance. Thus, we were able to propose novel signaling networks related to chemoresistance and to identify essential components of multidrug-resistant myeloid leukemia, as well as suggest new avenues for addressing this issue.

Poster R12

Identification of protein expression dynamics during germination in Streptomyces

Eva Strakova Institute of Microbiology, Czech Academy of Sciences

Alice Zikova (Institute of Microbiology, Czech Academy of Sciences, Laboratory of Bioinformatics); Pavel Rehulka (University of Defence, Institute of Molecular Pathology); Jiri Vohradsky (Institute of Microbiology, Czech Academy of Sciences, Laboratory of Bioinformatics);

Short Abstract: A comprehensive insight into the mechanisms controlling Streptomyces germination is of primary importance both for practical applications, as Streptomyces species are important producers of antibiotics and other secondary metabolites, and for fundamental studies of cell development. To understand the initiation of germination in Streptomyces we employed a time-resolved proteomics approach. Germination is a transition process in which dormant spores, under favourable conditions, trigger physiological processes essential for active growth and development into vegetative cells. We observed dynamic changes in the proteome at 13 time points in the initial 6 hours of germination of Streptomyces coelicolor, using 2D gel electrophoresis to examine both protein concentration level (fluorescence staining) and newly expressed proteins (in vivo radioisotope labelling) at each time point. Proteins with increased expression level were identified by mass spectrometry. Analysis of protein kinetic profiles revealed biochemical pathways associated with initiation and other phases of germination.

Poster R13

Tracing Evolutionary Events Over Ancestral Proteomes

Matt Oates University of Bristol

Julian Gough (University of Bristol, Computer Science);

Short Abstract: With an abundance of well covered proteomes in the public domain we can begin to explore possible ancestral objects relating them. Important historic events such as the rise of animals and vertebrates occurred in coincidence with large-scale domain shuffling, and novel architecture creation that is visible in protein domain annotations.
Presented is an online resource for exploring the evolution of the protein repertoire, from a protein domain architecture perspective; describing the functional relevance of domain level changes along an ancestral trace of the tree of life.
Using Dollo parsimony-based ancestral reconstructions of domain architecture from the SUPERFAMILY database, we create domain architecture networks for every internal node of a phylogenetic tree. These networks provide the basis of comparison along each edge of the tree. Using a domain centric GO annotation available per domain, a description is made for each branch using topic allocation, similar to TF-IDF from natural language processing. This provides a descriptive trace along each path of the tree.
A user can query based on a set of gene IDs or functional keywords, resulting in a map of relevant domains being generated automatically over the tree. This can be viewed in relation to the background events of each branch to find the evolutionary story that has given rise to the system of interest, in any available taxon.
This tool will be of use to biologists wishing to explore divergent protein systems over a span of phylogeny.
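
The abstract likens its branch descriptions to TF-IDF. As a rough illustration of that analogy, the sketch below scores GO terms per tree branch with plain TF-IDF, so that terms frequent on one branch but rare across branches rank highest. This is not the resource's topic allocation method; the function name, the branch labels and the term lists are hypothetical.

```python
import math
from collections import Counter


def tf_idf(branch_terms):
    """Score each GO term per branch: term frequency * log(N / branches with term)."""
    n_branches = len(branch_terms)
    branch_freq = Counter(
        term for terms in branch_terms.values() for term in set(terms)
    )
    scores = {}
    for branch, terms in branch_terms.items():
        counts = Counter(terms)
        total = sum(counts.values())
        scores[branch] = {
            term: (count / total) * math.log(n_branches / branch_freq[term])
            for term, count in counts.items()
        }
    return scores


# Toy annotation of domain changes on three branches (made up for illustration).
branch_terms = {
    "to_metazoa": ["cell adhesion", "cell adhesion",
                   "signal transduction", "metabolic process"],
    "to_vertebrata": ["immune response", "immune response", "metabolic process"],
    "to_fungi": ["cell wall organization", "metabolic process"],
}
scores = tf_idf(branch_terms)
top_metazoa = max(scores["to_metazoa"], key=scores["to_metazoa"].get)
print(top_metazoa)
```

A term that occurs on every branch, such as "metabolic process" in this toy data, receives a score of zero and therefore does not contribute to any branch description.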

Poster R14

Viral proteins acquired from a host converge to simplified domain architectures

Nadav Rappoport Hebrew University of Jerusalem

Michal Linial (Hebrew University of Jerusalem, Department of Biological Chemistry Institute of Life Sciences);

Short Abstract: The viral infection cycle creates multiple opportunities for the exchange of genetic material with the host. As part of their life cycle, some viruses may acquire host sequences through faulty excision. In most instances, such sequences will accumulate mutations or be deleted. However, in rare instances, sequences acquired from a host become beneficial for the virus. We searched for unexpected sequence similarity between the 900,000 viral proteins and all cellular organism sequences. Here we focus on viruses that specifically infect metazoa. Our analysis was carried out at two levels of sequence conservation. The high-conservation analysis considers (using UniRef90) highly conserved sequences found in viruses and their hosts. The low-conservation analysis utilizes the Pfam family collection. The high-conservation analysis yielded 187 viral-host sequences, of which 80% are viral sequences that were inserted into their host and 20% are host sequences that a virus hijacked and maintained. Of the 12,000 statistical models archived in Pfam, 670 families are composed of viral-metazoan proteins. We show that in about 75% of these families, the viral proteins are significantly shorter than their metazoan counterparts. The phenomena that lead to shorter viral proteins are a substantial elimination of domains and a reduction in the length of the protein tails and of the linkers between domains. We conclude that throughout viral evolution, host-originated sequences were shaped toward simplified domain compositions. We postulate that such shorter proteins may act by interfering with fundamental functions of the host. We demonstrate such functions in intracellular signaling, post-translational modification, protein-protein interaction and cellular trafficking.

Attention Poster Authors: Posters should be at most 1.30 m (130 cm) high x 0.90 m (90 cm) wide. Fasteners (Velcro / double-sided tape) will be provided on site; please DO NOT bring tape, tacks or pins.

Posters Display Schedule:

Odd Numbered posters:

Set-up timeframe: Sunday, July 17, 7:30 a.m. - 10:00 a.m.
Author poster presentations: Monday, July 18, 12:40 p.m. - 2:30 p.m.
Removal timeframe: Monday, July 18, 2:30 p.m. - 3:30 p.m.*

Even Numbered posters:

Set-up timeframe: Monday, July 18, 3:30 p.m. - 4:30 p.m.
Author poster presentations: Tuesday, July 19, 12:40 p.m. - 2:30 p.m.
Removal timeframe: Tuesday, July 19, 2:30 p.m. - 4:00 p.m.*

* Posters that are not removed by the designated time may be taken down by the organizers and discarded. Please be sure to remove your poster within the stated timeframe.

Delegate Posters Viewing Schedule

Odd Numbered posters:
On display Sunday, July 17, 10:00 a.m. through Monday, July 18, 2:30 p.m.
Author presentations will take place Monday, July 18: 12:40 p.m.-2:30 p.m.

Even Numbered posters:
On display Monday, July 18, 4:30 p.m. through Tuesday, July 19, 2:30 p.m.
Author presentations will take place Tuesday, July 19: 12:40 p.m.-2:30 p.m.

Want to print a poster in Vienna - try these options:

Repacopy, next to the congress venue

Also at Karlsplatz, in the Ring Center, Kärntner Str. 42

If you need your poster on a thicker material, you may also use a plotter service next to Karlsplatz: http://schiessling.at/portfolio/
