Please use this identifier to cite or link to this item: http://doi.org/10.25358/openscience-7418
Authors: González-Domínguez, Jorge
Liu, Yongchao
Schmidt, Bertil
Title: Parallel and scalable short-read alignment on multi-core clusters using UPC++
Online publication date: 14-Jul-2022
Language: English
Abstract: The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet accurate software solutions is of high importance to research, since availability and size of NGS datasets continue to increase. In this work we present an efficient parallelization approach for NGS short-read alignment on multi-core clusters. Our approach takes advantage of a distributed shared memory programming model based on the new UPC++ language. Experimental results using the CUSHAW3 aligner show that our implementation based on dynamic scheduling obtains good scalability on multi-core clusters. Through our evaluation, we are able to complete the single-end and paired-end alignments of 246 million reads of length 150 base-pairs in 11.54 and 16.64 minutes, respectively, using 32 nodes with four AMD Opteron 6272 16-core CPUs per node. In contrast, the multi-threaded original tool needs 2.77 and 5.54 hours to perform the same alignments on the 64 cores of one node. The source code of our parallel implementation is publicly available at the CUSHAW3 homepage (http://cushaw3.sourceforge.net).
DDC: 004 Computer science (data processing)
Institution: Johannes Gutenberg-Universität Mainz
Department: FB 08 Physik, Mathematik u. Informatik
Place: Mainz
ROR: https://ror.org/023b0x485
DOI: http://doi.org/10.25358/openscience-7418
Version: Published version
Publication type: Journal article
License: CC BY
Information on rights of use: https://creativecommons.org/licenses/by/4.0/
Journal: PLoS ONE
Volume: 11
Issue: 1
Pages or article number: e0145490
Publisher: PLoS
Publisher place: Lawrence, Kan.
Issue date: 2016
ISSN: 1932-6203
Publisher URL: http://dx.doi.org/10.1371/journal.pone.0145490
Publisher DOI: 10.1371/journal.pone.0145490
Appears in collections: DFG-OA-Publizieren (2012 - 2017)

Files in This Item:
  File: parallel_and_scalable_shortre-20220712210247913.pdf
  Size: 1.29 MB
  Format: Adobe PDF
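
Note on the abstract's timings: the reported figures imply a speedup that can be checked with simple arithmetic. The sketch below is not taken from the paper. It only divides the single-node times (2.77 and 5.54 hours) by the 32-node times (11.54 and 16.64 minutes) quoted in the abstract. The helper names `speedup` and `efficiency` are hypothetical, and measuring efficiency against the 32x increase in node count is an assumption about how one might read the numbers.

```python
# Illustrative arithmetic only: all timings are the ones quoted in the abstract.
# The helper names and the node-count baseline are assumptions made for this sketch.

def speedup(baseline_hours, parallel_minutes):
    """Ratio of the one-node runtime to the multi-node runtime."""
    return baseline_hours * 60.0 / parallel_minutes

def efficiency(speedup_value, node_ratio):
    """Speedup divided by the factor by which the node count grew."""
    return speedup_value / node_ratio

NODES = 32  # the cluster run used 32 nodes; the baseline used 1 node (64 cores)

# mode -> (one-node runtime in hours, 32-node runtime in minutes)
runs = {
    "single-end": (2.77, 11.54),
    "paired-end": (5.54, 16.64),
}

results = {}
for mode, (base_h, par_min) in runs.items():
    s = speedup(base_h, par_min)
    results[mode] = (s, efficiency(s, NODES))
    print(f"{mode}: speedup {s:.1f}x, efficiency {results[mode][1]:.0%}")
```

Under these assumptions the single-end run is about 14.4 times faster and the paired-end run about 20.0 times faster than the one-node baseline, which corresponds to roughly 45% and 62% of an ideal 32-fold gain.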