  • Thesis

GAM-Cluster: Accelerating MetaCluster5.0 with GPU

Advisor: 唐傳義

Abstract


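In metagenomics, MetaCluster5.0 is a program that clusters gene fragments sampled from the environment. As the number of fragments to cluster grows, comparing the similarity of every pair of fragments takes a very long time. In this thesis, our goal is to accelerate MetaCluster5.0. We first analyze the time MetaCluster5.0 spends in each stage to locate its performance bottleneck, and find that the two functions USMerge and GetNeighbor take the most time. We also find that both functions are well suited to GPU parallelization. We therefore propose a GPU-accelerated MetaCluster5.0, named GAM-Cluster, and apply further techniques to improve the speedup: splitting large data into many blocks to ease computation, using GPU shared memory to speed up data access, using table lookup, and creating a buffer to collect the results to be output. Our experiments show that, compared with the CPU version using 40 threads, GAM-Cluster achieves a speedup of 3.1 times on USMerge and 8.1 times on GetNeighbor; compared with the single-threaded CPU version, the speedups on the two functions are 64.4 times and 178.3 times, respectively.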
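The techniques listed above (processing the data block by block, looking values up in a precomputed table, and collecting output in a buffer) can be illustrated with a small CPU-side sketch. This is not the thesis's GPU implementation: the function names, the shared k-mer similarity measure, and all parameters (`tile`, `min_shared`, `buffer_size`) are hypothetical and chosen only to show the shape of a tiled all-pairs comparison.

```python
def kmers(read, k=4):
    """Return the set of k-mers in a read (hypothetical similarity feature)."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def neighbor_pairs(reads, tile=2, min_shared=3, buffer_size=4):
    """Compare all read pairs tile by tile and collect similar pairs.

    Assumptions for illustration only: two reads count as neighbors when
    they share at least `min_shared` k-mers; the precomputed `feats` list
    stands in for a lookup table; `buffer` stands in for an output buffer
    that is flushed whenever it fills up.
    """
    feats = [kmers(r) for r in reads]  # computed once, then only looked up
    results, buffer = [], []
    n = len(reads)
    for bi in range(0, n, tile):            # row tile
        for bj in range(bi, n, tile):       # column tile (upper triangle only)
            for i in range(bi, min(bi + tile, n)):
                # start at i + 1 inside the diagonal tile so each pair is seen once
                for j in range(max(bj, i + 1), min(bj + tile, n)):
                    if len(feats[i] & feats[j]) >= min_shared:
                        buffer.append((i, j))
                        if len(buffer) == buffer_size:
                            results.extend(buffer)  # flush the full buffer
                            buffer.clear()
    results.extend(buffer)  # flush whatever is left
    return results
```

On a GPU, each tile pair would typically be handled by one thread block that first stages its tile in shared memory; here the tiles only fix the order in which pairs are visited.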

Parallel Abstract


MetaCluster5.0 is a program for metagenomic binning: it groups similar reads (DNA fragments) in a metagenomic sample into clusters. Because reads come in large numbers (up to millions in a typical sample) and pairwise comparisons between reads are needed to determine their similarity, the running time is long. In this thesis, our goal is to accelerate MetaCluster5.0. We profiled MetaCluster5.0 and found that the performance bottleneck lies in its component functions USMerge and GetNeighbor. These two functions are also good candidates for GPU parallelization. We propose several techniques (data partitioning, use of shared memory, table lookup, an output buffer, and randomization) and find them effective. Our experimental results show a speedup of 3.1 times in USMerge and 8.1 times in GetNeighbor over the original 40-thread parallel version, or 64.4 times and 178.3 times, respectively, over the original single-thread version.
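The two pairs of speedup figures can be related by simple arithmetic. The sketch below assumes that both baselines were timed against the same GPU run on the same input, which the abstract does not state explicitly; under that assumption, dividing the single-thread speedup by the 40-thread speedup gives the scaling that the 40-thread CPU version itself achieves over one thread. The function name is hypothetical.

```python
def implied_cpu_scaling(gpu_vs_single, gpu_vs_multi):
    """Speedup of the multi-thread CPU run over the single-thread CPU run.

    Assumes both GPU speedups refer to the same GPU timing, so that
    (T_single / T_gpu) / (T_multi / T_gpu) = T_single / T_multi.
    """
    return gpu_vs_single / gpu_vs_multi


THREADS = 40
# Speedup figures taken from the abstract: (vs. single thread, vs. 40 threads)
reported = {"USMerge": (64.4, 3.1), "GetNeighbor": (178.3, 8.1)}

for name, (vs_single, vs_multi) in reported.items():
    scaling = implied_cpu_scaling(vs_single, vs_multi)
    efficiency = scaling / THREADS  # fraction of ideal 40x scaling
    print(f"{name}: CPU scaling {scaling:.1f}x, efficiency {efficiency:.0%}")
```

Under that assumption, the 40-thread CPU version runs roughly 20.8 times (USMerge) and 22.0 times (GetNeighbor) faster than the single-thread version, about 52% and 55% of ideal 40-thread scaling.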

