Implementing the Smith-Waterman Algorithm on a Reconfigurable Computer

Gianpaolo Gioiosa1, David Kearney2, University of South Australia;, University of South Australia

Some large bioinformatics organisations currently use application-specific integrated circuits (ASICs) to accelerate biological sequence analysis. However, as these circuits are very expensive, smaller organisations cannot afford to use them. Furthermore, the circuits cannot be reconfigured to support new analysis techniques. Hence there is a niche in the market for more flexible and less expensive hardware implementations of bioinformatics algorithms. Reconfigurable computing has the potential to fill this niche.

Reconfigurable computers contain hardware logic that can be modified in response to changing requirements. A field programmable gate array (FPGA) is a type of reconfigurable computer. Software tools and high-level hardware specification languages are available to support application development on FPGAs. For example, Celoxica's Handel-C language enables programmers to use many of the constructs available in the C programming language, as well as high-level constructs for the specification of parallelism, to develop applications.

The Smith-Waterman algorithm is a dynamic-programming algorithm that finds the optimal alignment between two biological sequences. Its quadratic time complexity limits its use in large-scale sequence analysis. However, hardware implementations of the algorithm that exploit opportunities for parallelism in the dynamic-programming procedure can be used to accelerate its execution.

Our project focused on implementing the Smith-Waterman algorithm on an FPGA. The results provide insight into the suitability of FPGAs for computationally intensive sequence analysis. The Handel-C language was used to implement a parallelised version of the algorithm, which manages partitioning of the query and database sequences. In addition, a method of interaction between the FPGA and its host computer was designed in order to improve the performance of the implementation.