Parallel Implementation of Hmm-pfam on EARTH platform Using THREADED-C

Weirong Zhu1, Yanwei Niu2, Jizhu Lu, Guang R. Gao
1weirong@capsl.udel.edu, University of Delaware; 2niu@capsl.udel.edu, University of Delaware

Hmmpfam is a widely used bioinformatics program for sequence classification, provided by Sean Eddy's lab at Washington University in St. Louis (http://hmmer.wustl.edu/). Hmmpfam reads a sequence file, seqfile, and compares each sequence in it, one at a time, against all the HMMs in an HMM database, looking for significantly similar sequence matches. In practice, the program may need a few months to process large amounts of sequence data, so bioinformatics researchers urgently need a parallel version. HMMer 2.2g provides a parallel hmmpfam program based on PVM (Parallel Virtual Machine). However, this PVM version does not scale well and cannot take full advantage of current supercomputing clusters.

Using a state-of-the-art multi-threaded computing model, we implement a new parallel version of hmmpfam on EARTH (Efficient Architecture for Running Threads). EARTH is an event-driven, fine-grain multi-threaded program execution model that supports fine-grain, non-preemptive fibers. It was developed by CAPSL (Computer Architecture and Parallel System Laboratory) at the University of Delaware. In its current implementations, the EARTH multi-threaded execution model is built with off-the-shelf microprocessors in a distributed memory environment. The EARTH runtime system (version 2.5) performs fiber scheduling, inter-node communication, inter-fiber synchronization, global memory management, dynamic load balancing, and SMP node support. The EARTH architecture executes applications coded in Threaded-C, a multi-threaded extension of C.

To parallelize hmmpfam, we develop two different schemes. The first pre-determines the job distribution across all computing nodes with a round-robin algorithm. The second takes advantage of the dynamic load balancing support of the EARTH runtime system, which simplifies the programmer's coding work by making the job distribution completely transparent.
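
The difference between the two schemes can be illustrated with a minimal sketch. It is written in plain Python rather than Threaded-C, and it is not the implementation described in this poster: the function names (round_robin_assign, dynamic_assign), the per-job cost values, and the "first idle node takes the next job" simulation are all assumptions made for illustration. The dynamic version only approximates the effect of a runtime load balancer and does not model how the EARTH runtime system actually balances load.

```python
import heapq


def round_robin_assign(costs, num_nodes):
    """Static scheme: job i is fixed to node i % num_nodes before execution.

    Returns the total work (in arbitrary time units) assigned to each node.
    """
    loads = [0.0] * num_nodes
    for i, cost in enumerate(costs):
        loads[i % num_nodes] += cost
    return loads


def dynamic_assign(costs, num_nodes):
    """Simulated dynamic scheme: the node that becomes idle first takes the next job.

    This is a hypothetical stand-in for runtime load balancing, not EARTH's algorithm.
    """
    # Heap entries are (time at which the node becomes idle, node id).
    heap = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    for cost in costs:
        busy_until, node = heapq.heappop(heap)
        heapq.heappush(heap, (busy_until + cost, node))
    loads = [0.0] * num_nodes
    for busy_until, node in heap:
        loads[node] = busy_until
    return loads


def makespan(loads):
    """Parallel finishing time: the most heavily loaded node determines it."""
    return max(loads)


# Hypothetical per-sequence costs: two expensive jobs among many cheap ones.
example_costs = [8, 1, 1, 1, 8, 1, 1, 1]
static_loads = round_robin_assign(example_costs, 4)
dynamic_loads = dynamic_assign(example_costs, 4)
```

With these made-up costs, round-robin places both expensive jobs on the same node, which ends up with 16 units of work, while the simulated dynamic scheme finishes in 9 units. Real sequence costs vary with sequence length and HMM size, so the size of the gap in practice depends on the workload.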

In this poster, we present a detailed analysis of the hmmpfam program and of the different parallel schemes, along with some basic concepts of the multi-threaded parallelization of hmmpfam on EARTH RTS 2.5. We then present our test results on various computing environments and compare them with the PVM version of hmmpfam. When searching 250 sequences against a 585-family HMMer database on 18 dual-CPU computing nodes, the PVM version reaches an absolute speedup of 18.50, while the EARTH version reaches 30.91, a 40.1% improvement in execution time. We also test our program on the Chiba City supercomputing cluster at Argonne National Laboratory. When the seqfile contains 38192 sequences and the HMMer database has 50 families, the EARTH version achieves an absolute speedup of 222.8 on 128 dual-CPU nodes, which reduces the total execution time from 15.9 hours (serial program) to only 4.3 minutes.
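
The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is in Python and uses hypothetical helper names (improvement_in_time, parallel_minutes, efficiency). It assumes that both speedups in the 18-node test are measured against the same serial run time, and, for the efficiency values only, that both CPUs of every dual-CPU node are in use (36 and 256 CPUs respectively); the text above does not state the second point explicitly.

```python
def improvement_in_time(speedup_old, speedup_new):
    """Fraction of execution time saved when speedup rises from old to new.

    Assumes both speedups are relative to the same serial run time T:
    1 - (T / speedup_new) / (T / speedup_old) = 1 - speedup_old / speedup_new.
    """
    return 1.0 - speedup_old / speedup_new


def parallel_minutes(serial_hours, speedup):
    """Parallel run time in minutes for a given serial time and absolute speedup."""
    return serial_hours * 60.0 / speedup


def efficiency(speedup, cpus):
    """Parallel efficiency: speedup divided by the number of CPUs used."""
    return speedup / cpus


# 18 dual-CPU nodes: PVM speedup 18.50 versus EARTH speedup 30.91.
pvm_vs_earth = improvement_in_time(18.50, 30.91)   # about 0.401

# Chiba City: 15.9 serial hours at speedup 222.8.
chiba_minutes = parallel_minutes(15.9, 222.8)      # about 4.3 minutes

# Efficiency under the assumption that all CPUs of each node are used.
earth_eff_36 = efficiency(30.91, 36)               # about 0.86
earth_eff_256 = efficiency(222.8, 256)             # about 0.87
```

Both reported numbers are consistent with this reading: the time saved relative to the PVM version comes out at about 40.1%, and the Chiba City run time at about 4.3 minutes.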