Parallel DNA Sequence Alignment on High-Performance Systems with CUDA and MPI
Abstract
Sequence alignment is a cornerstone of bioinformatics, widely used to identify similarities between DNA, RNA, and protein sequences and studying evolutionary relationships and functional properties. The Needleman-Wunsch algorithm remains a robust and accurate method for global sequence alignment. However, its computational complexity, O(mn), poses significant challenges when processing large-scale datasets or performing multiple sequence alignments. To address these limitations, a hybrid implementation of the Needleman-Wunsch algorithm that leverages CUDA for parallel execution on GPUs and MPI for distributed computation across multiple nodes on a supercomputer is proposed. CUDA efficiently offloads computationally intensive tasks to GPU cores, while MPI enables communication and workload distribution across nodes to handle large-scale alignments. This work details the implementation and performance evaluation of the Needleman-Wunsch algorithm in a massively parallel computing environment. Experimental results demonstrate significant acceleration of the alignment process compared to traditional CPU-based implementations, particularly for large input sizes and multiple sequence alignments. In summary, the combination of CUDA and MPI effectively overcomes the computational bottlenecks inherent to the Needleman-Wunsch algorithm without requiring substantial modifications to the underlying algorithm, highlighting the potential of high-performance computing in advancing sequence alignment workflows.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2024
- DOI:
- arXiv:
- arXiv:2412.21103
- Bibcode:
- 2024arXiv241221103Z
- Keywords:
-
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing