平行簡化群體演算法與混合梯度下降法於CUDA架構之實現

隨著圖形處理器（GPU）的取得成本逐年降低，運算能力的大幅進步，跨領域支援性的提升，需要大量平行化處理的最佳化問題在現今得以由個人電腦來處理。最佳化計算領域中，智能群體演算法（SIAs）以及梯度下降法的資料特性適用於平行化處理，但過去尚無學者提出運行於GPU上的簡化群體演算法（SSO）以及簡單實現的混合梯度下降法；因此本研究在考慮計算資源有有限，演算法泛用性、實現難度的條件下，提出基於統一計算架構（CUDA）上運行的平行簡化群體演算法（PSSO），以及基於梯度的混合梯度下降法G-SOA。在PSSO方面，參考其他SIAs由CPU轉換為GPU運行的研究中，有兩個共通的缺點，其一：適合度函數的時間複雜度理論值為O (tNm)，其中有t次迭代數目，N個適合度函數，需要兩兩比較m次；其二：pBests與gBest在更新時會有資源搶佔的問題。本研究提出PSSO以改善上述缺點，研究結果顯示時間複雜度降低了N的指數量級，且完全避免了資源搶佔的問題。在G-SOA方面，梯度下降法能夠確保收斂到區域最佳解，但其無法保證收斂到全域最佳解，本研究因此利用了ea-mutation以及混合梯度的Search方法，讓已經掉入區域最佳解的粒子能夠逃出。經由研究結果顯示G-SOA相較於原本的梯度下降法，所找到的解更爲優良。最終本研究將以不同的最佳化問題來驗證本研究所提出方法的求解品質以及運行速度。

關鍵字

圖形處理器；平行運算；智能群體演算法；簡化群體演算法；梯度下降法；統一計算架構

並列摘要

As the acquisition cost of the graphics processing unit (GPU) has decreased, personal computers (PC) can handle optimization problems nowadays. In optimization computing, intelligent swarm algorithm (SIAs) and the gradient descent method are suitable for parallelization. However, there neither had a GPU-based simplified swarm algorithm (SSO) nor a simple-implemented hybrid gradient method been proposed. Accordingly, this dissertation proposed PSSO and G-SOA based on CUDA platform considering computational ability and versatility. In PSSO, first, the theoretical value of time complexity of fitness function is O (tNm). There are t iterations and N fitness function, each of which required pair comparisons m times. Second, pBests and gBest have the resource preemption when updating. As the experiments showed, the time complexity has successfully reduced by order of magnitude of N, and the problem of resource preemption was avoided entirely. In terms of G-SOA, gradient descent method can guarantee to local optimum convergence, but cannot ensure to global optimum. Thus, this study used ea-mutation and a hybrid-gradient method-Search to enable particles to get rid of local optimum. The results showed that G-SOA performed better than the original gradient descent method.

並列關鍵字

Graphics Processing Unit (GPU) ； Parallelism ； Intelligent Swarm Algorithm (SIAs) ； Simplified Swarm Optimization (SSO) ； Gradient Descent (GD) ； Compute Unified Device Architecture (CUDA)

參考文獻

[1] R. Shams, P. Sadeghi, R. A. Kennedy, and R. I. Hartley, “A survey of medical image registration on multicore and the gpu,” IEEE Signal Processing Magazine, vol. 27, no. 2, pp. 50–60, 2010.

Google Scholar

[2] S. Mittal and J. S. Vetter, “A survey of methods for analyzing and improving gpu energy efficiency,” ACM Computing Surveys (CSUR), vol. 47, no. 2, pp. 1–23, 2014.

Google Scholar

[3] B. Deschizeaux and J.-Y. Blanc, “Imaging earth s subsurface using cuda,” GPU Gems, vol. 3, pp. 831–850, 2007.

Google Scholar

[4] G. Hager, T. Zeiser, and G. Wellein, “Data access optimizations for highly threaded multi-core cpus with multiple memory controllers,” in 2008 IEEE International Symposium on Parallel and Distributed Processing, pp. 1–7, IEEE, 2008.

Google Scholar

[5] C. T. Documentation, “v10. 1.168,” 2019.

Google Scholar

國際替代計量

平行簡化群體演算法與混合梯度下降法於CUDA架構之實現

全文下載

主題瀏覽