cfaed Publications

ALPHA: A Novel Algorithm-Hardware Co-design for Accelerating DNA Seed Location Filtering

Reference

Fazal Hameed, Asif Ali Khan, Jeronimo Castrillon, "ALPHA: A Novel Algorithm-Hardware Co-design for Accelerating DNA Seed Location Filtering", In IEEE Transactions on Emerging Topics in Computing (IEEE TETC), 12 pp., Jun 2021. doi:10.1109/TETC.2021.3093840

Abstract

Sequence alignment is a fundamental operation in genomic analysis where DNA fragments called reads are mapped to a long reference DNA sequence. There exist a number of (in)exact alignment algorithms with varying sensitivity for both local and global alignments; however, they are all computationally expensive. With the advent of high-throughput sequencing (HTS) technologies that generate a mammoth amount of data, there is increased pressure on improving the performance and capacity of the analysis algorithms in general and the mapping algorithms in particular. While many works focus on improving the performance of the aligners themselves, it has recently been demonstrated that restricting the mapping space for input reads and filtering out mapping positions that will result in a poor match can significantly improve the performance of the alignment operation. However, this holds only if the filtering operation itself is guaranteed to run significantly faster than the alignment work it avoids; otherwise, the cost of filtering can easily outweigh its benefits. Among other approaches to expediting this pre-alignment filtering, the recently proposed GRIM-Filter uses highly parallel processing-in-memory operations benefiting from lightweight computational units on the logic-in-memory layer. However, the significant amount of data transferred between the memory and logic-in-memory layers quickly becomes a performance and energy bottleneck for the memory subsystem and ultimately for the overall system. By analyzing input genomes, we found that there are unexpected data-reuse opportunities in the filtering operation. We propose an algorithm-hardware co-design that exploits this data reuse in the seed location filtering operation and, compared to the GRIM-Filter, cuts the number of memory accesses by 22-54%. This reduction in memory accesses improves the overall performance and energy consumption by 19-44% and 21-49%, respectively.

Bibtex

@Article{hameed_tetc21,
author = {Fazal Hameed and Asif Ali Khan and Jeronimo Castrillon},
journal = {IEEE Transactions on Emerging Topics in Computing (IEEE TETC)},
title = {{ALPHA}: A Novel Algorithm-Hardware Co-design for Accelerating {DNA} Seed Location Filtering},
pages = {12 pp.},
abstract = {Sequence alignment is a fundamental operation in genomic analysis where DNA fragments called reads are mapped to a long reference DNA sequence. There exist a number of (in)exact alignment algorithms with varying sensitivity for both local and global alignments; however, they are all computationally expensive. With the advent of high-throughput sequencing (HTS) technologies that generate a mammoth amount of data, there is increased pressure on improving the performance and capacity of the analysis algorithms in general and the mapping algorithms in particular. While many works focus on improving the performance of the aligners themselves, it has recently been demonstrated that restricting the mapping space for input reads and filtering out mapping positions that will result in a poor match can significantly improve the performance of the alignment operation. However, this holds only if the filtering operation itself is guaranteed to run significantly faster than the alignment work it avoids; otherwise, the cost of filtering can easily outweigh its benefits. Among other approaches to expediting this pre-alignment filtering, the recently proposed GRIM-Filter uses highly parallel processing-in-memory operations benefiting from lightweight computational units on the logic-in-memory layer. However, the significant amount of data transferred between the memory and logic-in-memory layers quickly becomes a performance and energy bottleneck for the memory subsystem and ultimately for the overall system. By analyzing input genomes, we found that there are unexpected data-reuse opportunities in the filtering operation. We propose an algorithm-hardware co-design that exploits this data reuse in the seed location filtering operation and, compared to the GRIM-Filter, cuts the number of memory accesses by 22--54\%. This reduction in memory accesses improves the overall performance and energy consumption by 19--44\% and 21--49\%, respectively.},
month = jun,
year = {2021},
doi = {10.1109/TETC.2021.3093840},
issn = {2168-6750},
}

Permalink

https://cfaed.tu-dresden.de/publications?pubId=3116

