
GAM-Cluster: Accelerating MetaCluster5.0 with GPU

Advisor: 唐傳義




MetaCluster5.0 is a metagenomic binning program that groups similar reads (DNA fragments) from a metagenomic sample into clusters. Because a typical sample contains up to millions of reads, and determining similarity requires pairwise comparisons between reads, the running time is long. The goal of this thesis is to accelerate MetaCluster5.0. Profiling MetaCluster5.0 showed that the performance bottleneck lies in two of its component functions, USMerge and GetNeighbor. Both functions are good candidates for GPU parallelization. We propose several techniques, including data partitioning, use of shared memory, table lookup, output buffering, and randomization, and find them to be effective. Our experiments show speedups of 3.1 times for USMerge and 8.1 times for GetNeighbor over the original 40-thread parallel version, or 64.4 times and 178.3 times, respectively, over the original single-thread version.
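The two sets of speedup figures can be cross-checked against each other. The short sketch below only divides the reported numbers to estimate how much faster the original 40-thread version was than the single-thread version. The function and variable names are illustrative and not taken from the thesis, and because the reported figures are rounded, the results are approximate.

```python
# GPU speedups as reported in the abstract, relative to each CPU baseline.
reported = {
    "USMerge": {"vs_40_threads": 3.1, "vs_single_thread": 64.4},
    "GetNeighbor": {"vs_40_threads": 8.1, "vs_single_thread": 178.3},
}


def implied_cpu_scaling(vs_single_thread, vs_40_threads):
    """Speedup of the 40-thread CPU version over the single-thread version.

    Follows from T1/T40 = (T1/Tgpu) / (T40/Tgpu), where T1, T40 and Tgpu are
    the running times of the single-thread, 40-thread and GPU versions.
    """
    return vs_single_thread / vs_40_threads


implied = {name: implied_cpu_scaling(**s) for name, s in reported.items()}

for name, scaling in implied.items():
    print(f"{name}: 40-thread version is about {scaling:.1f}x the single-thread version")
```

Under these figures the 40-thread version comes out at roughly 20.8 times the single-thread version for USMerge and roughly 22.0 times for GetNeighbor.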


