Fast Object Recognition Based on GPU and CPU

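Parallel algorithms are commonly run on distributed computers to improve performance, but the failure of a single machine affects the whole system, which is an unavoidable weakness of distributed computing. With the GPGPU technology developed by NVIDIA, the many cores of a graphics processor (GPU) work simultaneously and replace the traditional distributed computers. A GPU is far more stable than a distributed cluster, and its memory bandwidth is much higher than that of a CPU, so it has an advantage over the CPU when processing large volumes of data and large mathematical structures. Object recognition is a computation-intensive task. Most object recognition research targets clean backgrounds, or recognizes objects that have already been extracted, in order to reduce the computational load. This thesis instead works with complex backgrounds and designs a high-performance cooperation scheme between the GPU and the CPU to speed up object recognition. The visual recognition system of this thesis is divided into (1) an off-line phase and (2) an on-line recognition phase. The off-line phase builds the sample database in the following steps: corner detection of the target on the CPU, parallel multi-resolution SIFT description of the corners on the GPU, and parallel PCA dimensionality reduction on the GPU. Finally, the CPU uses SMO (Sequential Minimal Optimization) to solve the support vector machine optimization problem, and the resulting parameters are stored for use during on-line recognition. The on-line recognition phase consists of corner detection on the input image on the CPU, parallel SIFT description on the GPU, parallel support vector machine classification on the GPU, and the RANSAC algorithm. The experimental results show that the proposed algorithm effectively detects the target in real environments and that the hybrid GPU and CPU algorithm effectively increases computation speed.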
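
The on-line classification step lends itself to data parallelism because each descriptor is scored independently against the stored support vectors. The following is a minimal sketch of that decision function in plain Python, not the thesis implementation. The RBF kernel, the `gamma` value, the toy support vectors, and all function names are assumptions made for illustration; the abstract does not state which kernel was used.

```python
import math

def rbf_kernel(u, v, gamma):
    """Gaussian (RBF) kernel. The kernel choice is an assumption, not taken from the thesis."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, coeffs, bias, gamma):
    """f(x) = sum_i coeff_i * K(sv_i, x) + bias, where coeff_i = alpha_i * y_i from training."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + bias

def classify_descriptors(descriptors, support_vectors, coeffs, bias, gamma=0.5):
    """Label every descriptor independently.

    Each iteration touches only its own descriptor, so on a GPU one thread
    could evaluate one descriptor (hypothetical mapping).
    """
    return [1 if svm_decision(d, support_vectors, coeffs, bias, gamma) >= 0 else -1
            for d in descriptors]

# Toy parameters standing in for the values stored after off-line training.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [-1.0, 1.0]   # alpha_i * y_i (made-up values)
bias = 0.0

labels = classify_descriptors([[0.1, 0.0], [0.9, 1.0]], support_vectors, coeffs, bias)
print(labels)
```

Because the support vectors and coefficients are read-only during classification, they can be shared by all workers while only the descriptors differ per worker.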

Keywords

Modified intuitive corner detection; RANSAC; support vector machine; GPU; CUDA; parallel processing

Parallel Abstract

Parallel algorithms have usually been run on networked personal computers to improve computing performance. However, the whole system fails when one of the networked computers breaks down. NVIDIA has developed GPGPU technology to replace networked personal computers because it is more stable and has a higher memory access bandwidth than networked personal computers. Therefore, parallel algorithms with massive datasets are increasingly executed on GPGPUs. The computational load of object recognition is massive. Hence, this thesis proposes a hybrid architecture that combines a Graphics Processing Unit (GPU) and a Central Processing Unit (CPU) to achieve high-speed object recognition. In this thesis, NVIDIA CUDA and an Intel Pentium(R) Dual-Core are adopted as the GPU and CPU platforms, respectively. The algorithm architecture of this thesis consists of two phases: (1) the off-line training phase and (2) the on-line operating phase. The GPU handles all the heavy computation in both phases, including parallel multi-resolution SIFT description in the off-line training phase, and parallel SIFT description, parallel PCA transformation, and the parallel Support Vector Machine (SVM) classifier in the on-line operating phase. The CPU handles the proposed corner detection, which quickly extracts the features in both phases, the solution of the optimal SVM parameters with SMO (Sequential Minimal Optimization) and the PCA transformation in the off-line training phase, and the RANSAC criterion applied to reject the remaining outliers in the on-line operating phase. Finally, experimental results show that the proposed algorithms can rapidly detect the target even in a real environment, and that the hybrid architecture of GPU and CPU can dramatically improve computing efficiency.
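
The RANSAC step that rejects remaining outliers can be illustrated with a small sketch. The abstract does not say which geometric model the thesis fits, so the example below assumes the simplest one, a pure 2D translation between matched keypoints. The function name, the threshold, the iteration count, and the toy matches are all hypothetical.

```python
import random

def ransac_translation(matches, threshold=3.0, iterations=100, seed=0):
    """Estimate a 2D translation from keypoint matches and return (model, inlier_indices).

    matches: list of ((x1, y1), (x2, y2)) pairs.
    The translation-only model is an assumption made for illustration.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # One match is the minimal sample needed to define a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # A match is an inlier if it agrees with the hypothesis within the threshold.
        inliers = [i for i, ((ax, ay), (bx, by)) in enumerate(matches)
                   if (bx - ax - dx) ** 2 + (by - ay - dy) ** 2 <= threshold ** 2]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers

# Four matches consistent with a shift of (10, 5) and one gross mismatch.
matches = [((0, 0), (10, 5)), ((1, 2), (11, 7)), ((3, 1), (13, 6)),
           ((4, 4), (14, 9)), ((2, 2), (40, -3))]
model, inliers = ransac_translation(matches)
print(model, inliers)
```

Matches whose indices are not in `inliers` would be discarded as outliers before the target location is reported.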

Parallel Keywords

Modified intuitive corner detection for cluster; Support Vector Machine; RANSAC; SMO; CUDA; GPU

