Optimizing the location and number of maximal scoring subsequences with constrained segment lengths using MaxSubSeq

Piero Fariselli1, Pier Luigi Martelli2, Ivan Rossi and Rita Casadio
1piero@lipid.biocomp.unibo.it, Department of Biology CIRB, University of Bologna; 2 Department of Biology CIRB, University of Bologna

A problem in predicting the topography of transmembrane proteins is the optimal localization of the transmembrane segments along the protein sequence, given that each residue is associated with a propensity of being included or not in the transmembrane region. Previous work has shown that post-processing the propensity signals with suitable algorithms can greatly improve the quality and accuracy of the predictions. We developed a general dynamic programming-like algorithm (MaxSubSeq, Maximal SubSequence) specifically designed to optimise the number and length of segments, subject to length constraints, in a given protein sequence. Results from applying MaxSubSeq to both helical and beta-strand transmembrane segments show that our algorithm can increase prediction accuracy by 5-10 percentage points with respect to the original outputs of different methods (Bioinformatics 19:500-505 (2003)). The algorithm is designed to be used independently of the predictive method and is available through the web interface at http://gpcr.biocomp.unibo.it . A possible application of MaxSubSeq, related to the 3D modeling of voltage-dependent anion channels (VDAC) in eukaryotes (Casadio et al., FEBS Lett 2002, 520:1-7), will be presented.
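To illustrate the kind of optimisation described above, the following is a minimal sketch of a dynamic programming pass that selects non-overlapping segments of constrained length with maximal total score. It is not the published MaxSubSeq implementation. The function name max_scoring_segments, the min_gap parameter, and the convention that each per-residue score is a propensity shifted so that positive values favour membership in a segment are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def max_scoring_segments(
    scores: Sequence[float],
    min_len: int,
    max_len: int,
    min_gap: int = 1,  # hypothetical parameter: residues required between segments
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total score, segments) for the best set of non-overlapping
    segments whose lengths lie in [min_len, max_len].

    Segments are half-open index pairs (start, end). The number of
    segments is not fixed in advance; it follows from the optimisation.
    """
    n = len(scores)

    # prefix[i] holds the sum of scores[0:i], so any segment sum is O(1).
    prefix = [0.0] * (n + 1)
    for i, s in enumerate(scores):
        prefix[i + 1] = prefix[i] + s

    # best[i]: best total score using only the first i residues.
    # choice[i]: 0 if residue i-1 is left outside any segment,
    #            otherwise the length of the segment ending at i.
    best = [0.0] * (n + 1)
    choice = [0] * (n + 1)

    for i in range(1, n + 1):
        best[i] = best[i - 1]
        for length in range(min_len, min(max_len, i) + 1):
            start = i - length
            # The previous segment must end at least min_gap residues earlier.
            prev = best[max(0, start - min_gap)]
            candidate = prev + prefix[i] - prefix[start]
            if candidate > best[i]:
                best[i] = candidate
                choice[i] = length

    # Trace back from the end to recover the chosen segments.
    segments: List[Tuple[int, int]] = []
    i = n
    while i > 0:
        if choice[i] == 0:
            i -= 1
        else:
            start = i - choice[i]
            segments.append((start, i))
            i = max(0, start - min_gap)
    segments.reverse()
    return best[n], segments


if __name__ == "__main__":
    # Toy scores: positive values favour inclusion in a segment.
    toy = [-1, 2, 2, 2, -1, -1, 3, 3, -1]
    print(max_scoring_segments(toy, min_len=2, max_len=3))
```

Because the recurrence only uses per-residue scores, a sketch like this can be applied to the output of any predictor, which matches the independence from the predictive method stated above. The running time is proportional to the sequence length times the number of allowed segment lengths.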