Sunday Poster Presentations: Protein Structure Prediction

77 - Clustering Protein Sequences--Structure Prediction by Transitive Homology
Eva Bolten, Alexander Schliep, Universität zu Köln; Sebastian Schneckener, Science Factory; Dietmar Schomburg, Rainer Schrader, Universität zu Köln
We investigated the limits on transitivity when inferring structural similarity of proteins based upon their sequence similarities. We developed a novel graph-based clustering algorithm capable of handling multi-domain proteins. We will present our algorithmic advances yielding a 24% improvement over pair-wise comparisons, statistics of the clusterings, and our general methodology.
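
As a rough illustration of transitive homology, the sketch below links sequences whose pairwise score meets a cutoff and reports the connected components of the resulting graph. It is not the authors' clustering algorithm and does not handle multi-domain proteins; the input format, the function name, and the threshold are assumptions made for this example only.

```python
from collections import defaultdict


def cluster_by_transitive_links(pairs, threshold):
    """Group sequences into connected components of a similarity graph.

    pairs: iterable of (seq_a, seq_b, score) tuples (hypothetical input format).
    threshold: minimum score for two sequences to be linked directly (assumed).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in pairs:
        root_a, root_b = find(a), find(b)  # registers both sequences
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# A and C never match directly but end up together through B.
pairs = [("A", "B", 0.9), ("B", "C", 0.8), ("C", "D", 0.2), ("D", "E", 0.7)]
clusters = cluster_by_transitive_links(pairs, threshold=0.5)
```

Pure connected-component clustering applies transitivity without any limit, which is exactly the behaviour whose limits the abstract sets out to investigate.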

78 - CELIAN: A Side-chain Modelling Program Using Structural Environment-Specific Substitution Tables and Energy Strategy
Lan Chen, Kenji Mizuguchi, Tom L. Blundell, University of Cambridge
CELIAN is a program for protein side-chain prediction using structural environment-specific substitution tables and an energy strategy. On a test set derived from HOMSTRAD, CELIAN built side chains on 102 structures with a chi1 accuracy of 72% in structurally conserved regions, compared with 66% for SCWRL.

79 - Conjoined Hidden Markov Models for Protein Secondary Structure Prediction
Christian A. Cumbaa, University of Waterloo
Patterns in protein sequence influence secondary structure. These influences can be modeled by probability distributions. The joint structural influence of two overlapping patterns is obtained by conjoining their probability distributions. Ab initio predictions are made using discovered protein sequence patterns and their associated structure models.

80 - Domain Parsing: Detecting Signals of Continuous Structural Domains from Protein Sequence Data
A. A. Dayanik, H. J. Yun, D. Zhang, G. Armhold, Y. Song, D. Snyder, C. Nevill-Manning, I. Muchnik, C. A. Kulikowski, G. T. Montelione, Rutgers University
HMMs were constructed from 1471 DDD domains and their NR sequence homologs. Detecting the structural domains from an independent testing subset of 347 protein sequences from corresponding SCOP families was most reliably achieved by combining HMM with BLAST results, yielding 94% correct predicted alignments, and 73.5% fold recognition assignments.

81 - Prediction of Coiled-coil Domains
Mauro Delorenzi, Terry Speed, The Walter and Eliza Hall Institute, Australia
The performance of coiled-coil predictors operating on primary protein sequence was analysed using curves of sensitivity versus specificity on a heterogeneous collection of test sequences. At specificity levels useful for genome-wide screening, an HMM algorithm (Marcoil) appears to give higher sensitivity than the programs Paircoil and Coils.
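
A sensitivity-versus-specificity curve of the kind mentioned above can be traced by sweeping a score threshold over labelled test sequences. The sketch below is a generic illustration, not the authors' evaluation code; the scores, labels, and function names are invented for the example.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = fn = tn = fp = 0
    for score, is_positive in zip(scores, labels):
        predicted = score >= threshold
        if is_positive and predicted:
            tp += 1
        elif is_positive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def curve(scores, labels):
    """One (threshold, sensitivity, specificity) point per distinct score."""
    return [(t,) + sensitivity_specificity(scores, labels, t)
            for t in sorted(set(scores))]


# Hypothetical predictor scores and true coiled-coil labels.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False]
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
points = curve(scores, labels)
```

Comparing predictors at a fixed, high specificity then amounts to reading off the sensitivity of each curve at that specificity.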

82 - Domain Prediction from Sequence Information Alone
Richard Anthony George, Jaap Heringa, National Institute for Medical Research, UK
The dissection of a protein into its structural units is essential in preparation for structure determination by NMR. Using individual domains to search a database for related sequences is often more successful than using the whole protein sequence. The work described here predicts structural domains from sequence by using several methods developed in our lab.

83 - Protein Fold Recognition by Combining Evolutionary, Structural, and Proximity Information
Igor V. Grigoriev, Chao Zhang, Sung-Hou Kim, Lawrence Berkeley National Laboratory
We propose a new method for detecting remote homologues on the basis of sequence-derived properties. Instead of the traditional comparison between single residues, local segments of the protein sequence are compared. The method is shown to substantially enhance the sensitivity of conventional sequence alignment methods and is applied to complete genomes.

84 - Multiple Protein Structure Alignment Using Monte Carlo Optimization
Chittibabu Guda, Philip E. Bourne, Ilya N. Shindyalov, University of California, San Diego
We have developed a new algorithm for the alignment of multiple protein structures based on a Monte Carlo simulation technique. The scoring function is based on inter-protein distances calculated for aligned and superimposed residues, with penalties for gaps. The algorithm improves the alignment for the majority of protein families when starting from pair-wise structural alignments.

Poster 85 withdrawn by author.

86 - Predicting the Structure of Membrane Porins
Irene Jacoboni, P. L. Martelli, P. Fariselli, R. Casadio, University of Bologna
We describe a neural network based predictor to locate putative beta strands adopting the TM beta barrel structure, starting from the protein sequence. The predictor is trained and cross-validated using porins from the PDB database. Network outputs are then filtered with a Hidden Markov Model procedure.

87 - Polypeptide Structure Prediction Via Multiple Copy Simulated Annealing in Torsional Space Based on Amber Energy Functions, Generalized Born Solvation and Solvent Accessible Surface Areas
Yongzing Liu, David L. Beveridge, Wesleyan University
A multiple-copy simulated annealing algorithm in torsional space, using the AMBER 5.0 force field and the Generalized Born model, is devised for molecular conformational optimization. An implementation of this algorithm using MPI (Message Passing Interface) on a Beowulf PC cluster shows linear scaling under parallelization.

Poster 88 withdrawn by author.

89 - PHAT: A Transmembrane-Specific Substitution Matrix
Pauline C. Ng, Jorja G. Henikoff, Steven Henikoff, University of Washington
Database searching algorithms for proteins use scoring matrices based on globular proteins that may be inappropriate for transmembrane regions. In searches with transmembrane queries, the PHAT matrix significantly outperforms generalized matrices. We conclude that a better matrix can be constructed by using background frequencies characteristic of the twilight zone rather than database frequencies. http://blocks.fhcrc.org/~pauline

90 - Prediction of Protein Secondary Structure at 80% Accuracy Using a Combination of Many Neural Networks
Thomas Nordahl Petersen, Claus Lundegaard, Morten Nielsen, Henrik Bohr, Jakob Bohr, Søren Brunak, Garry P. Gippert, Ole Lund, Structural Bioinformatics Advanced Technologies A/S, Denmark
A secondary structure prediction protocol involving up to 800 neural network predictions has been developed, by use of novel methods such as output expansion and a unique balloting procedure. An overall performance of 80.2% (80.6% mean per-chain) for three-state (helix, strand, coil) prediction was obtained.
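
To illustrate the general idea of combining many predictions by ballot, the sketch below takes a per-residue majority vote over three-state strings (H = helix, E = strand, C = coil). The abstract does not describe the authors' balloting procedure or output expansion, so this is only plain majority voting; the function name and the tie-breaking order are assumptions.

```python
from collections import Counter


def ballot(predictions):
    """Per-residue majority vote over equal-length three-state strings.

    predictions: list of strings over the alphabet H, E, C.
    Ties go to the first state in the order C, H, E (an arbitrary choice).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    if any(set(p) - set("HEC") for p in predictions):
        raise ValueError("predictions must use only H, E and C")

    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        # max() returns the first maximal item, which gives the C, H, E tie order.
        consensus.append(max("CHE", key=lambda state: counts[state]))
    return "".join(consensus)


# Three hypothetical network outputs for a seven-residue chain.
votes = ["HHHCCEE", "HHCCCEE", "HHHHCEC"]
consensus = ballot(votes)
```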

91 - SSpro: SS prediction using BRNN
Gianluca Pollastri, Pierre Baldi, University of California, Irvine
We present SSpro, a server for protein secondary structure prediction. The server is based on a set of Bidirectional Recurrent Neural Networks (BRNN) and currently achieves 76.7% correct classification on a set of 126 sequences with low similarity to the training set.

92 - Automated Protein Structure Prediction Using Templates from the CATH Protein Family Database
Adrian Shepherd, Christine Orengo, University College London; Nigel Martin, Roger Johnson, Birkbeck College, London
Important recent developments in CATH include: the addition of nearly 200,000 sequence relatives from Genbank; generating multiple structural alignments for around 400 CATH homologous families; and setting up a Dictionary of Homologous Superfamilies containing structure/function information. To manage this new data, a database is currently being developed using Oracle8i.

93 - FUGUE: A Fold Recognition Method Using Structural Environment-specific Substitution Tables and Structure-dependent Gap Penalties
Jiye Shi, Tom L. Blundell, Kenji Mizuguchi, University of Cambridge
FUGUE is a program for recognizing distant homologues by sequence-structure comparison using structural environment-specific amino acid substitution tables and structure-dependent gap penalties. By combining structural environment information with amino acid sequence, FUGUE achieved better performance in both fold recognition and alignment accuracy, compared with some widely used fold recognition algorithms.

94 - How Does It Fold? Searching for Folding Pathways Using A Motion Planning Approach
Guang Song, Nancy M. Amato, Texas A&M University
We present a framework for studying folding problems from a motion planning perspective. Our preliminary experimental results with traditional paper crafts and some small proteins are very encouraging. For the protein folding problems, we try to validate our folding pathways by comparing the order in which the secondary structures form on our pathways to known results from pulse labeling experiments.

95 - Learning Sequence-structure Affinity Using Neural Networks and a Probabilistic Representation of Protein Folding Motifs
Daniel St-Arnaud, Francois Major, Université de Montréal
A novel measure of protein sequence-structure affinity is introduced. Based on Bayesian network theory, this measure can be learned from data using artificial neural networks and should prove useful for protein fold recognition by optimal sequence-structure alignment (threading).

96 - Constraint Logic Programming and the Protein Side-chain Placement Problem
Martin T. Swain, Graham J. L. Kemp, University of Aberdeen, King’s College, Scotland
Residue positions may be formulated in finite domain CLP as variables, with side-chain rotamers as sets of possible values. Constraints are used to eliminate steric overlaps greater than a chosen threshold. Preliminary results indicate the accuracy of CLP is comparable to other side-chain placement methods.
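
The formulation above, with residues as variables, rotamers as finite domains, and steric clashes as constraints, can be sketched with plain backtracking search in place of a CLP system. The residue names, rotamer labels, and the precomputed clash set below are invented for illustration; in practice the clash set would hold the rotamer pairs whose overlap exceeds the chosen threshold.

```python
def place_side_chains(rotamers, clashes):
    """Assign one rotamer per residue so that no chosen pair clashes.

    rotamers: dict mapping residue -> list of candidate rotamers (finite domains).
    clashes: set of frozenset({(res_a, rot_a), (res_b, rot_b)}) pairs that are
             forbidden (hypothetical precomputed steric overlaps).
    Returns a dict residue -> rotamer, or None if no assignment exists.
    """
    residues = list(rotamers)
    assignment = {}

    def consistent(res, rot):
        # The candidate must not clash with any rotamer already chosen.
        return all(
            frozenset({(res, rot), (other, chosen)}) not in clashes
            for other, chosen in assignment.items()
        )

    def backtrack(index):
        if index == len(residues):
            return True
        res = residues[index]
        for rot in rotamers[res]:
            if consistent(res, rot):
                assignment[res] = rot
                if backtrack(index + 1):
                    return True
                del assignment[res]  # undo and try the next rotamer
        return False

    return dict(assignment) if backtrack(0) else None


rotamers = {"LEU10": ["t", "g+"], "PHE12": ["t", "g-"], "VAL15": ["t"]}
clashes = {
    frozenset({("LEU10", "t"), ("PHE12", "t")}),
    frozenset({("PHE12", "g-"), ("VAL15", "t")}),
}
placement = place_side_chains(rotamers, clashes)
```

A real CLP solver would also propagate constraints to prune domains before and during search; the backtracking here only checks each candidate against the choices already made.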

97 - Using ASTRAL for Protein Structure and Sequence Analysis
Nigel Walker, University of California, Berkeley; Patrice Koehl, Michael Levitt, Stanford University; Steven E. Brenner, University of California, Berkeley
ASTRAL is a compendium of tools and databases for the study of protein structure through sequence analysis. One component of ASTRAL is a sequence database of the structural domains defined by SCOP. By combining sequence information and crystallographic data, we provide low redundancy sequence subsets, useful for homology based structure prediction.

98 - A Bayesian Network Model for Protein Fold and Remote Homologue Recognition
D. L. Wild, A. Raval, Keck Graduate Institute of Applied Life Sciences, CA; Z. Ghahramani, University College London
We describe a Bayesian network that learns primary structure, secondary structure, and residue accessibility for proteins of known 3-D structure. In cross-validation tests using the SCOP database, the Bayesian network performs better at classifying proteins of known structural superfamily than a hidden Markov model trained on amino acid sequences.

99 - GeneAtlas™ - An Automatic High-throughput Pipeline for Structure Prediction and Function Assignment for Genomic Sequences
Lisa Yan, David Kitson, Zhan-Yang Zhu, Azat Badredinov, Krzysztof Olszewski, David Edward, Molecular Simulations, Inc.
GeneAtlas is an automated, high-throughput pipeline for the prediction of protein structure and function using sequence similarity detection, homology modeling, and fold recognition methods. Using a subset of structures from the SCOP database, we demonstrate that GeneAtlas detects additional functional relationships in comparison with the widely used sequence searching method PSI-BLAST.