GPU based Eulerian Assembly of Genomes

Date

2013-02-18

Authors

Mahmood, Syed Faraz

Abstract

Advances in sequencing technologies have revolutionized the field of genomics by providing cost-effective and high-throughput solutions. In this paper, we develop a parallel sequence assembler implemented on general-purpose graphics processing units (GPUs). Our work was largely motivated by a growing need in the genomics community for sequence assemblers and the increasing use of GPUs for general-purpose computing applications. We investigated the implementation challenges of, and possible solutions for, a data-parallel approach to sequence assembly. We implemented an Eulerian-based sequence assembler (GPU-Euler) on NVIDIA GPUs using the CUDA programming interface. GPU-Euler was benchmarked on three bacterial genomes using input reads representing the new generation of sequencing approaches. Our empirical evaluation showed that GPU-Euler produced lower run times, and comparable performance in terms of contig length statistics, relative to serial assemblers. We were able to demonstrate the promise of using GPUs for genome assembly, a computationally intensive task. An error correction step was also incorporated into GPU-Euler so that it can process reads containing some errors. Error correction output was benchmarked on simulated reads from three bacterial genomes with different read lengths.
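
For readers unfamiliar with the Eulerian approach named in the abstract, the following is a minimal serial sketch of the general idea: build a de Bruijn graph from the k-mers in the reads, then spell a contig by walking an Eulerian path. It is not the GPU-Euler CUDA implementation. The function names (`build_de_bruijn`, `eulerian_path`, `assemble`) are hypothetical, and the sketch assumes error-free reads whose graph admits a single Eulerian path.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the traversal deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm. Assumes a non-empty graph that has an Eulerian path."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge, if there is one;
    # otherwise (Eulerian cycle) start at an arbitrary node.
    start = next(
        (u for u in graph if len(graph[u]) - in_deg[u] == 1),
        next(iter(graph)),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: emit the node
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path (assumption: a single contig)."""
    path = eulerian_path(build_de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Toy example with three overlapping, error-free reads.
print(assemble(["ATGGCG", "GGCGTG", "CGTGCA"], 4))  # prints ATGGCGTGCA
```

Real read sets contain errors and repeats, so the graph rarely reduces to a single path; that is why assemblers report multiple contigs and why an error correction step precedes graph construction.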

Keywords

Sequence assembly, Euler Tour, GPU, Error correction, CUDA, Spectral alignment
