Using XML-RPC for Distributed BLAST -Desterilizing idle resources-

Yong Wook Kim1, Keun Woo Lee, Hee Won, Yong-Ho In
1yongari@bioinfomatix.com, Bioinfomatix, Inc.

Searching for sequences homologous to an unknown sequence is a common task in biological data analysis, and BLAST is the most widely used tool for it. However, a single personal computer (a desktop PC or a server) cannot provide enough computing power to run BLAST on a large number of sequences. More computing capacity is needed to handle high-throughput data and obtain results faster. One option is to use a cluster or a distributed system, but the NCBI BLAST algorithm was not developed for distributed systems. Parallel BLAST programs that use the MPI or PVM libraries for message passing, such as mpi-BLAST, have been developed, but they run under the UNIX or Linux operating systems. Meanwhile, many powerful computers running MS Windows are used only for word processing, web browsing, or other minor personal tasks. This wastes the lab's resources, so the idle computing capacity of these machines can be put to work as nodes of a distributed BLAST system. Almost all previous parallel or distributed BLAST solutions require the same operating system on every node and, moreover, do not support MS Windows. To solve this problem, we turned to the XML-RPC and HTTP protocols, because they are easy to use and work well with almost all operating systems. Another advantage of XML-RPC is that programs written in different languages can communicate with each other, because almost all languages provide XML-RPC libraries. XML-RPC uses XML to encode calls to procedures on a remote host and HTTP to carry the communication between nodes. In this work, we show the efficiency of a distributed BLAST system that uses XML-RPC and runs on hosts with heterogeneous operating systems.
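
As an illustration of the XML-RPC mechanism described above, the following is a minimal sketch in Python using only the standard library. It is not the authors' implementation: the names search_chunk, start_worker, and distribute are hypothetical, the worker procedure is a placeholder that does not run BLAST, and the master contacts the workers one after another, whereas a real system would presumably dispatch the chunks concurrently.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def search_chunk(sequences):
    """Hypothetical worker procedure: a stand-in for running BLAST locally.

    A real worker would invoke a BLAST executable here; this placeholder
    only reports each sequence's length so the sketch runs anywhere.
    """
    return [{"query": seq, "length": len(seq)} for seq in sequences]


def start_worker(host="127.0.0.1", port=0):
    """Start one XML-RPC worker in a background thread; return (server, url).

    Port 0 asks the operating system for any free port.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(search_chunk, "search_chunk")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    actual_host, actual_port = server.server_address
    return server, f"http://{actual_host}:{actual_port}/"


def distribute(sequences, worker_urls):
    """Split the queries round-robin across workers and gather the results."""
    results = []
    for i, url in enumerate(worker_urls):
        chunk = sequences[i::len(worker_urls)]
        if chunk:
            # Each call is an XML-encoded request sent over HTTP.
            with ServerProxy(url) as proxy:
                results.extend(proxy.search_chunk(chunk))
    return results


if __name__ == "__main__":
    workers = [start_worker() for _ in range(2)]
    urls = [url for _, url in workers]
    print(distribute(["ACGT", "TTGACA", "GG"], urls))
    for server, _ in workers:
        server.shutdown()
        server.server_close()
```

Because the master only needs an HTTP URL per worker, the same distribute function could in principle address workers on any operating system, or workers written in another language that exposes a procedure with the same name and argument types.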