Ligand specificity of proteases and kinases: an application to IC50 prediction on a large scale

Shandar Ahmad1, Koji Kitajima2, Akinori Sarai
1shandar@bse.kyutech.ac.jp, Department of Biochemical Engineering and Science, Kyushu Institute of Technology, Japan; 2,

Proteases and kinases are of crucial interest for understanding and controlling serious diseases such as various cancers and HIV infection. A large number of in vitro experiments on the inhibition of these enzymes by different ligands have been reported in the literature. The ProLINT database was developed to provide a single repository for this information. Using thousands of activity data points in ProLINT, we have developed a unified model to predict the half-maximal inhibitory concentration (IC50) for a given ligand and protein. Each protease and kinase in the database has been assigned a consensus ligand fragment signature, based on the available information about its interactions with different ligands. A set of 43 fragments efficiently represents every ligand and can therefore be used as a powerful tool for filtering candidate inhibitors in drug design. The only inputs required for a prediction are the SMILES string of the ligand and the choice of the protein for which the prediction is to be made. Neural networks have been trained and cross-validated to make these predictions. The predictions are promising and compare well with existing QSAR models for specific proteins.
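As a rough illustration of the input described above (a ligand SMILES string plus a protein choice), the following minimal sketch builds the kind of feature vector a neural network could consume. It is not the authors' implementation. The fragment list, the protein list, and the function names are hypothetical placeholders, since the abstract does not list the 43 fragments, and literal substring counting is only a crude stand-in for real substructure matching.

```python
# Hypothetical fragment vocabulary. The abstract's 43 fragments are not
# listed, so these SMILES substrings are placeholders for illustration only.
FRAGMENTS = ["C(=O)N", "C(=O)O", "c1ccccc1", "S(=O)(=O)", "N", "O"]

# Hypothetical protein choices, not taken from ProLINT.
PROTEINS = ["HIV-1 protease", "CDK2", "thrombin"]


def fragment_counts(smiles):
    """Count literal, non-overlapping occurrences of each fragment string.

    Assumption: plain substring matching. A real pipeline would use
    chemistry-aware substructure search instead.
    """
    return [smiles.count(fragment) for fragment in FRAGMENTS]


def one_hot(protein):
    """Encode the protein choice as a one-hot vector over PROTEINS."""
    if protein not in PROTEINS:
        raise ValueError(f"unknown protein: {protein!r}")
    return [1 if name == protein else 0 for name in PROTEINS]


def encode(smiles, protein):
    """Concatenate ligand fragment counts and the protein one-hot vector."""
    return fragment_counts(smiles) + one_hot(protein)


if __name__ == "__main__":
    # Acetanilide is used purely as an example ligand.
    print(encode("CC(=O)Nc1ccccc1", "CDK2"))
```

A vector of this form, paired with measured IC50 values, is what a regression network would be trained on. The encoding actually used with ProLINT may differ.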