ISCB-Asia/SCCG 2012 Technical Track Presentation

Dr. Kana Shimizu
Computational Biology Research Center, AIST, Japan

Privacy-preserving search for a chemical compound database


Abstract

Searching a database for similar compounds is among the most important approaches in drug discovery. Since a query compound is an important starting point for a new drug, it is usually treated as secret information. The most popular way for a client to avoid information leakage is to download the whole database and use it in a closed network. However, this naive approach is not appropriate if the database side does not want to give the client more information than the search result. A new method is therefore needed that enables a database search while both sides keep their privacy.

In this study, we address the problem of searching for similar compounds in a database in a privacy-preserving manner, and develop a novel protocol for solving it. During a run of the proposed protocol, the client sends the server an encrypted query, and the server searches the database with the encrypted query. The search results are calculated in encrypted form, and only the client can decrypt them. Thus the client's query is not leaked to the server, and the server does not send the client more information than the search results. The main technical contribution of this study is to design the protocol using additive homomorphic encryption only, which largely improves both computational cost and communication size compared to related studies. We implemented our protocol and compared it to general-purpose multi-party computation (MPC) on a simulated data set. We confirmed that the proposed protocol was around 1000 times faster than MPC in terms of CPU time. The proposed protocol can be used for searching any database in which each record is represented as a bit vector; therefore, we expect that it will be applied to other important problems in bioinformatics.
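
The idea of searching bit vectors under additive homomorphic encryption can be illustrated with a minimal sketch. It is not the protocol presented in the talk: the abstract names neither the encryption scheme nor the similarity measure, so the sketch assumes a toy Paillier cryptosystem and scores each record by the number of set bits it shares with the query. All function names are hypothetical, and the key size is far too small for real use.

```python
import math
import secrets

def keygen(p=2**31 - 1, q=2**61 - 1):
    # Assumption: two fixed Mersenne primes, for illustration only.
    # A real deployment needs large, randomly generated primes.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    # Paillier encryption with generator g = n + 1.
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n

def server_shared_bits(n, enc_query, record):
    # Server side: multiplying ciphertexts adds the hidden plaintexts,
    # so the product over positions where the record bit is 1 encrypts
    # the number of set bits shared with the query.
    n2 = n * n
    acc = encrypt(n, 0)  # fresh encryption of zero as the starting sum
    for c, bit in zip(enc_query, record):
        if bit:
            acc = acc * c % n2
    return acc

def private_search(query, database):
    n, priv = keygen()                                   # client
    enc_query = [encrypt(n, b) for b in query]           # client -> server
    enc_scores = [server_shared_bits(n, enc_query, r)    # server
                  for r in database]
    return [decrypt(n, priv, c) for c in enc_scores]     # client

if __name__ == "__main__":
    query = [1, 0, 1, 1, 0, 1, 0, 0]
    database = [
        [1, 0, 1, 1, 0, 1, 0, 0],
        [0, 1, 0, 0, 1, 0, 1, 1],
        [1, 1, 1, 0, 0, 0, 0, 1],
    ]
    print(private_search(query, database))
```

In this sketch the server only ever sees ciphertexts of the query bits, and the client receives one encrypted score per record. Note that the client still learns the exact shared-bit count for every record, which may be more than a full protocol would reveal.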

Biography

Dr. Kana Shimizu is a research scientist at the Computational Biology Research Center (CBRC) of the National Institute of Advanced Industrial Science and Technology (AIST), Japan. She received her doctoral degree in computer science from Waseda University in 2006 and joined CBRC the same year. She has also been a visiting associate professor at Waseda University since 2009. Her main research subject is biological sequence analysis, and her interests range from genomes to proteins. She developed POODLE, a series of tools for predicting intrinsically disordered regions in proteins. In her latest work, she developed a very fast algorithm named SlideSort for analyzing next-generation sequencing data. Her current interests also include privacy-preserving data mining for biological data.