Comparison of intra-molecular disulphide bonding arrangements between disulphide-rich and -poor proteins in the Protein Data Bank

Gerald Hartig1, Tran Trung Tran2, Mark Smythe
1g.hartig@imb.uq.edu.au, Institute for Molecular Bioscience; 2tran@doctor.com, Protagonist Pty Ltd

Disulphide-rich peptides are found in organisms ranging from bacteria to humans. They have wide-ranging biological roles, acting as toxins, pheromones, insecticides, hormones and defensins, among others. They also have favourable drug-like characteristics, including long-lasting action, which is primarily a consequence of their rigid structure. There has been significant interest in exploiting these molecules as potential drugs, and several are currently in clinical trials. The intramolecular disulphide bonds (IDSBs) in these molecules are an important determinant of their structure and stability. The aim of this work was to describe the differences in IDSB arrangements between disulphide-rich and -poor proteins. IDSB arrangements were analysed in 1202 IDSB-containing protein chains from the Protein Data Bank that were non-homologous in sequence. The proteins were partitioned into two groups, disulphide-rich and -poor, based on a naturally occurring division at 25.2 residues/IDSB. Disulphide-rich proteins (<25.2 residues/IDSB) were commonly toxins with a knottin fold, containing three IDSBs in an overlapping bonding pattern and no free cysteines. Disulphide-poor proteins (>25.2 residues/IDSB) were commonly hydrolases with either a trypsin-like serine protease fold or an immunoglobulin-like beta-sandwich fold, containing one IDSB. When several IDSBs were present in disulphide-poor proteins, they were arranged in disjoint bonding patterns. Two-fifths of disulphide-poor proteins had free cysteines. This work demonstrates that it is important to analyse disulphide-rich and -poor proteins separately, because they represent two distinct populations of IDSB-containing proteins with different characteristics. It also outlines several topological characteristics of disulphide-rich peptides that may be relevant to protein folding, protein engineering and protein structure.
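
The partitioning and bonding-pattern terms used in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a chain is described by its length and a list of bonded cysteine position pairs. It also assumes that "overlapping" means two bonds interleave along the sequence (a < c < b < d) and that "disjoint" means one bond closes before the next opens; the abstract does not define these terms, nor does it say how a chain at exactly 25.2 residues/IDSB would be assigned. All function names and example positions are hypothetical.

```python
from itertools import combinations

# Division between the two groups reported in the abstract (residues per IDSB).
THRESHOLD = 25.2


def residues_per_idsb(chain_length, bonds):
    """Chain length divided by the number of intramolecular disulphide bonds."""
    if not bonds:
        raise ValueError("chain has no intramolecular disulphide bonds")
    return chain_length / len(bonds)


def classify_chain(chain_length, bonds):
    """Assign a chain to the disulphide-rich or disulphide-poor group.

    Assumption: a chain at exactly THRESHOLD is treated as disulphide-poor;
    the abstract leaves this case undefined.
    """
    if residues_per_idsb(chain_length, bonds) < THRESHOLD:
        return "disulphide-rich"
    return "disulphide-poor"


def pair_relation(bond_a, bond_b):
    """Describe how two bonds sit relative to each other along the sequence.

    Each bond is a pair of cysteine positions. Assumed definitions:
      disjoint:    a1 < a2 < b1 < b2
      nested:      a1 < b1 < b2 < a2
      overlapping: a1 < b1 < a2 < b2
    """
    first, second = sorted([tuple(sorted(bond_a)), tuple(sorted(bond_b))])
    (a1, a2), (b1, b2) = first, second
    if a2 < b1:
        return "disjoint"
    if b2 < a2:
        return "nested"
    return "overlapping"


def chain_patterns(bonds):
    """Set of pairwise relations found among all bonds in a chain."""
    return {pair_relation(a, b) for a, b in combinations(bonds, 2)}


# Hypothetical 30-residue chain with a C1-C4, C2-C5, C3-C6 connectivity.
knot_like = [(2, 16), (9, 21), (15, 26)]
print(classify_chain(30, knot_like), chain_patterns(knot_like))

# Hypothetical 200-residue chain with two well-separated bonds.
separated = [(20, 40), (100, 130)]
print(classify_chain(200, separated), chain_patterns(separated))
```

Under these assumed definitions, the first example is disulphide-rich (10 residues/IDSB) with every pair of bonds overlapping, and the second is disulphide-poor (100 residues/IDSB) with a disjoint pair.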