Novel members of the C12/C19 cysteine proteases identified through human genome mining efforts: primary characterization of selected genes

Pierrat Benoit (1), Bruengger Adrian (2), Cai Richard, Gerhartz Bernd, Kossida Sophia, Nirmala Nanguniri, Worpenberg Susanne; (1) Novartis Institute of Biomedical Research; (2) Novartis Institute of Biomedical Research

Protein ubiquitination controls many intracellular processes, including protein turnover, cell cycle progression, transcriptional activation, and signal transduction. It is a highly regulated, dynamic process involving enzymes that add ubiquitin (ubiquitin-conjugating enzymes) and enzymes that remove it (deubiquitinating enzymes, DUBs) to balance the conjugation reaction. Removal of the ubiquitin modification (deubiquitination) is performed by two families of cysteine proteases: ubiquitin-specific proteases (USPs) and ubiquitin C-terminal hydrolases (UCHs). Whereas the UCH family comprises enzymes of about 25 kDa that preferentially cleave ubiquitin from small adducts such as peptides and amino acids, the USP family members are large proteins (800 to 2000 amino acid residues) that liberate ubiquitin from target proteins. The two families share two regions of similarity within a core domain. One region contains conserved asparagine residues and the catalytic cysteine (cysteine box); the other, located C-terminally, contains two conserved histidine residues, one of which is also implicated in the catalytic mechanism (histidine box). In addition, N-terminal to the core domain lies a region of variable length (from 10 up to 200-800 amino acids) that probably mediates recognition of the different ubiquitinated target proteins. DUBs were first characterized in yeast and belong to families C12 (UCHs) and C19 (USPs) in MEROPS. In contrast to the UCH family, the USP family is highly diverse, with more than 50 mammalian members. Because of the very large number of DUBs, it is often suggested that these enzymes recognize distinct substrates and are therefore involved in specific cellular processes. Using PSI-BLAST, hidden Markov model and SW tools, we conducted genome-wide searches for new human members of the C12/C19 cysteine protease families, leading to the identification of 11 new members.
With 68 members, this family therefore represents the largest family of enzymes in the ubiquitin system as well as the largest family of cysteine proteases. In parallel, an extended analysis of all predicted transcripts and their expression profiles was carried out, leading to the in silico characterization of all 68 members. Here we discuss the primary structure-function relationships and biochemical properties of selected new members isolated during our search.