CUDASW++4.0 : ultra-fast GPU-based Smith–Waterman protein sequence database search

dc.contributor.author: Schmidt, Bertil
dc.contributor.author: Kallenborn, Felix
dc.contributor.author: Chacon, Alejandro
dc.contributor.author: Hundt, Christian
dc.date.accessioned: 2025-04-14T08:06:56Z
dc.date.available: 2025-04-14T08:06:56Z
dc.date.issued: 2024
dc.description.abstract: Background: The maximal sensitivity for local pairwise alignment makes the Smith-Waterman algorithm a popular choice for protein sequence database search. However, its quadratic time complexity makes it compute-intensive. Current state-of-the-art software tools cannot exploit the massively parallel processing capabilities of modern GPUs at close-to-peak performance. This motivates the need for more efficient implementations. Results: CUDASW++4.0 is a fast software tool for scanning protein sequence databases with the Smith-Waterman algorithm on CUDA-enabled GPUs. Our approach achieves high efficiency for dynamic programming-based alignment computation by minimizing memory accesses and instructions. We provide efficient matrix tiling and sequence database partitioning schemes, and exploit next-generation floating-point arithmetic and novel DPX instructions. This leads to close-to-peak performance on modern GPU generations (Ampere, Ada, Hopper), with throughput rates of up to 1.94, 5.01, and 5.71 tera cell updates per second (TCUPS) on an A100, L40S, and H100, respectively. Evaluation on the Swiss-Prot, UniRef50, and TrEMBL databases shows that CUDASW++4.0 achieves more than an order-of-magnitude performance improvement over previous GPU-based approaches (CUDASW++3.0, ADEPT, SW#DB). In addition, our algorithm demonstrates significant speedups over top-performing CPU-based tools (BLASTP, SWIPE, SWIMM2.0), can exploit multi-GPU nodes with linear scaling, and reaches an energy efficiency of up to 15.7 GCUPS/Watt. Conclusion: CUDASW++4.0 changes the standing of GPUs in protein sequence database search with Smith-Waterman alignment by providing close-to-peak performance on modern GPUs. It is freely available at https://github.com/asbschmidt/CUDASW4. (en)
dc.identifier.doi: https://doi.org/10.25358/openscience-12005
dc.identifier.uri: https://openscience.ub.uni-mainz.de/handle/20.500.12030/12026
dc.language.iso: eng
dc.rights: CC-BY-4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: 004 Informatik (de)
dc.subject.ddc: 004 Data processing (en)
dc.title: CUDASW++4.0 : ultra-fast GPU-based Smith–Waterman protein sequence database search (en)
dc.type: Journal article
jgu.journal.title: BMC Bioinformatics
jgu.journal.volume: 25
jgu.organisation.department: FB 08 Physik, Mathematik u. Informatik
jgu.organisation.name: Johannes Gutenberg-Universität Mainz
jgu.organisation.number: 7940
jgu.organisation.place: Mainz
jgu.organisation.ror: https://ror.org/023b0x485
jgu.pages.alternative: 342
jgu.publisher.doi: 10.1186/s12859-024-05965-6
jgu.publisher.issn: 1471-2105
jgu.publisher.name: BioMed Central
jgu.publisher.place: London
jgu.publisher.year: 2024
jgu.rights.accessrights: openAccess
jgu.subject.ddccode: 004
jgu.subject.dfg: Natural sciences
jgu.type.dinitype: Article (en_GB)
jgu.type.resource: Text
jgu.type.version: Published version
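
The abstract in this record refers to the quadratic time complexity of Smith-Waterman and reports throughput in cell updates per second (CUPS). As a rough illustration of both ideas, the following is a minimal CPU sketch, not the CUDASW++4.0 implementation. The function names, the fixed match/mismatch scores, and the linear gap penalty are assumptions made for brevity and are not taken from the record.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score under a simplified scoring scheme.

    Assumption: fixed match/mismatch values and a linear gap penalty,
    chosen only for illustration. One DP cell is computed per
    (query residue, target residue) pair, hence the quadratic cost.
    """
    prev = [0] * (len(target) + 1)  # previous DP row
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # clamp at 0: local alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def gcups(query_len, database_residues, seconds):
    """Giga cell updates per second: DP cells computed / time / 1e9."""
    return query_len * database_residues / seconds / 1e9
```

Under this definition, a hypothetical scan of a 1,000-residue query against a database of 2,000,000 residues that finishes in one second corresponds to `gcups(1000, 2_000_000, 1.0)`, that is, 2 GCUPS. The TCUPS figures quoted in the abstract use the same cell-count measure at a scale 1,000 times larger.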

Files

Original bundle

Name: cudasw40___ultrafast_gpubased-2025041410065665641.pdf
Size: 2.05 MB
Format: Adobe Portable Document Format
