An automated protocol for membrane protein prediction and annotation

Melissa J. Davis, Zheng Yuan, Shane Fashang Zhang and Rohan D. Teasdale, Institute of Molecular Biosciences

Membrane proteins constitute an important class of proteins that play critical roles in cell signaling, ion transport, and cell-cell communication. Because of their biological significance, these proteins are often pharmaceutical targets in drug design. However, few have known high-resolution structures, and computational techniques are increasingly used to predict their structures. To annotate the membrane organization of whole-proteome datasets, we have developed an annotation protocol that adopts a consensus approach to protein feature prediction and automates it for high-throughput analysis. This protocol predicts the presence of alpha-helical transmembrane (TM) domains, ER retention signal peptides and GPI anchor sites. These features are combined to generate a prediction of membrane organization that provides a biological context for any further functional annotation. To address the problem of N-terminal signal peptides being confused with transmembrane domains during prediction, three filters have been tested. The aim of these filters is to remove erroneous TM domain predictions and so improve the overall performance of the protocol. Correctly distinguishing signal peptides from transmembrane domains has also been shown to improve the accuracy of topology prediction. The prediction protocol assigns proteins to the following classes: Type I membrane protein, Type II membrane protein, Multi-span membrane protein, and GPI anchored membrane protein. The categories of soluble/secreted proteins and soluble/intracellular proteins are also identified by this protocol.
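The way predicted features might be combined into the classes listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the consensus rule, the filters, or the class decision rules. Everything here is a hypothetical assumption chosen for illustration, including the `Features` record, the simple majority vote in `consensus`, the single overlap filter in `filter_tm`, and the decision order in `classify`.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Features:
    """Hypothetical per-protein feature record (not from the paper)."""
    # 1-based position of the last signal peptide residue; 0 means none predicted.
    signal_peptide_end: int = 0
    # Predicted TM segments as (start, end) residue positions.
    tm_segments: List[Tuple[int, int]] = field(default_factory=list)
    gpi_anchor: bool = False


def consensus(votes: List[bool]) -> bool:
    """Assumed consensus rule: a feature is called if a strict majority
    of the individual predictors report it."""
    return sum(bool(v) for v in votes) > len(votes) / 2


def filter_tm(f: Features) -> List[Tuple[int, int]]:
    """Assumed filter: drop TM segments that start inside the predicted
    N-terminal signal peptide, treating them as erroneous predictions."""
    return [(s, e) for (s, e) in f.tm_segments if s > f.signal_peptide_end]


def classify(f: Features) -> str:
    """Assumed decision rules mapping features to membrane organization."""
    tm = filter_tm(f)
    has_sp = f.signal_peptide_end > 0
    if len(tm) >= 2:
        return "Multi-span membrane protein"
    if len(tm) == 1:
        # Single TM domain: a signal peptide suggests Type I, otherwise Type II.
        return "Type I membrane protein" if has_sp else "Type II membrane protein"
    if has_sp and f.gpi_anchor:
        return "GPI anchored membrane protein"
    if has_sp:
        return "Soluble/secreted protein"
    return "Soluble/intracellular protein"


if __name__ == "__main__":
    # A signal peptide mispredicted as a TM helix, plus one genuine TM domain.
    example = Features(signal_peptide_end=22, tm_segments=[(5, 20), (300, 320)])
    print(classify(example))
```

In this sketch the filter runs before classification, so a signal peptide that was also called as a TM helix does not push a single-span protein into the multi-span class. A real pipeline would need to handle further cases that this simplification ignores, such as the C-terminal hydrophobic region of a GPI anchored protein being predicted as a TM domain.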