Comparing Protein Structures with Constraints

Su-Hyun Lee1, Jin-Hong Kim2, Geon-Tae Ahn, and Myung-Joon Lee
1suhyun@sarim.changwon.ac.kr, Changwon National University, South Korea; 2avenue@ulsan.ac.kr, University of Ulsan, South Korea

S4E (Search Substructures of Secondary Structure Elements) is a protein structure comparison system based on constraint technology. It searches for common substructures between two proteins represented in PSAML [1] in two steps: (1) it constructs a compatibility graph from relational information between pairs of secondary structure elements, such as angle, length and distance; (2) it finds similar structures with a maximal common subgraph matching algorithm that uses constraint programming techniques.

The two input protein structures, given in PDB format, are converted to PSAML documents by a conversion tool we developed. PSAML (PSA Markup Language) is the XML representation of PSA, which defines a set of protein secondary structure elements and the relations between those elements. To describe these relations, PSA records the angle, distance, length, hydrogen bonding and direction. These features are calculated from two vectors obtained by projecting the two secondary structures onto a plane parallel to both of them.
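As a rough illustration of the geometric features, the sketch below computes an angle and a distance for two secondary structure elements, each approximated by an axis segment with a start and an end point. It relies on one geometric fact: a plane parallel to both axes has the normal d1 x d2, so projecting onto it preserves the angle between the axes, and the separation along that normal is the distance between the two axis lines. The function name `sse_relation`, the segment representation and the returned keys are assumptions made for this example; they are not taken from PSA or PSAML, and hydrogen bonding and direction are not modelled.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def sse_relation(s1, e1, s2, e2):
    """Angle (degrees) and line distance between two axis segments.

    s1/e1 and s2/e2 are hypothetical start/end points (x, y, z) of the
    axes of two secondary structure elements.
    """
    d1, d2 = sub(e1, s1), sub(e2, s2)
    # Angle between the axis directions; clamp against rounding error.
    cos_t = dot(d1, d2) / (norm(d1) * norm(d2))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    # Normal of a plane parallel to both axes.
    n = cross(d1, d2)
    offset = sub(s2, s1)
    if norm(n) > 1e-9:
        # Separation of the two axis lines measured along the normal.
        distance = abs(dot(offset, n)) / norm(n)
    else:
        # Parallel axes: fall back to point-to-line distance.
        distance = norm(cross(offset, d1)) / norm(d1)
    return {"angle": angle, "distance": distance,
            "length1": norm(d1), "length2": norm(d2)}

if __name__ == "__main__":
    print(sse_relation((0, 0, 0), (10, 0, 0), (0, 0, 5), (0, 10, 5)))
```

Measuring the distance between the axis lines rather than between segment midpoints is a choice made for this sketch only; the abstract does not say which definition PSA uses.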

A compatibility graph G=(V,E) for two proteins A and B is defined as follows. Each node in V is a pair of secondary structures, one from A and the other from B. Each edge in E expresses compatibility between two relations, one defined on A and the other on B. For example, an edge between nodes (ai, bk) and (aj, bl) denotes that the relation between ai and aj is compatible with the relation between bk and bl with respect to the compatibility measures. For fast comparison, we developed an efficient algorithm for constructing the compatibility graphs and applied constraint programming techniques to finding maximal common subgraphs.
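The graph definition above can be sketched in a few lines. In a compatibility graph, a clique corresponds to a set of element pairs whose relations all agree, that is, a common substructure. The example below builds the graph from relation tables and then searches for the largest clique with a plain Bron-Kerbosch enumeration. That search is a stand-in chosen for brevity, not the constraint programming method used in S4E, which this abstract does not detail. The tolerances, the rule that only elements of the same type are paired, and all function names are assumptions for illustration.

```python
from itertools import combinations

def compatible(rel_a, rel_b, angle_tol=20.0, dist_tol=2.0):
    """Hypothetical compatibility measure: angle and distance within tolerance."""
    return (abs(rel_a["angle"] - rel_b["angle"]) <= angle_tol
            and abs(rel_a["distance"] - rel_b["distance"]) <= dist_tol)

def build_compatibility_graph(types_a, rels_a, types_b, rels_b):
    """Nodes are pairs (i, k): element i of A with element k of B.

    rels_a / rels_b map frozenset({i, j}) to a relation dict.
    Assumption: only elements of the same type (e.g. helix, strand) are paired.
    """
    nodes = [(i, k)
             for i, ta in enumerate(types_a)
             for k, tb in enumerate(types_b) if ta == tb]
    adj = {v: set() for v in nodes}
    for (i, k), (j, l) in combinations(nodes, 2):
        if i == j or k == l:
            continue  # one element cannot be matched twice
        if compatible(rels_a[frozenset((i, j))], rels_b[frozenset((k, l))]):
            adj[(i, k)].add((j, l))
            adj[(j, l)].add((i, k))
    return adj

def max_clique(adj):
    """Largest clique found by basic Bron-Kerbosch (no pivoting)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best

if __name__ == "__main__":
    types_a = ["H", "E", "H"]
    rels_a = {frozenset((0, 1)): {"angle": 90.0, "distance": 5.0},
              frozenset((0, 2)): {"angle": 30.0, "distance": 10.0},
              frozenset((1, 2)): {"angle": 60.0, "distance": 8.0}}
    types_b = ["H", "E"]
    rels_b = {frozenset((0, 1)): {"angle": 85.0, "distance": 5.5}}
    graph = build_compatibility_graph(types_a, rels_a, types_b, rels_b)
    print(sorted(max_clique(graph)))
```

In the toy input, the helix-strand pair of A with a 90 degree angle matches the helix-strand pair of B, so the matched element pairs are (a0, b0) and (a1, b1). Exhaustive clique enumeration is exponential in the worst case, which is one reason a faster search strategy matters for real proteins.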

[1] Su-Hyun Lee, Jin-Hong Kim, Geon-Tae Ahn, and Myung-Joon Lee, "An XML Representation of Protein Data for Efficient Structure Comparison", Proc. of Second International Conference on Computer and Information Science, No. 1, pp. 313, 2002.

Acknowledgements
This work was supported by grant No. R01-2001-00535 from the Korea Science & Engineering Foundation.