Abstract In high-precision automatic surface defect detection for large-scale panels, a complete image of the inspected object necessarily contains a very large number of pixels. Such applications currently rely mostly on an off-line approach: a single visual inspection module sequentially captures all the partial images covering the panel, and image processing is then performed to detect defects. This thesis presents a heterogeneous parallel computing system for high-speed, high-precision in-line surface defect detection. In terms of hardware, the system consists of multiple CPUs (Central Processing Units) and GPUs (Graphics Processing Units); each CPU-GPU pair constitutes one visual inspection module. In terms of software, the system uses two programming models, Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA), to implement the defect detection algorithms. It is the first heterogeneous parallel computing system proposed for high-speed, high-precision in-line detection of surface defects. In the thesis, the simulated object to be inspected is 28 cm long and 8.6 cm wide. The precision requirement is that each pixel represents a 3.5 μm × 3.5 μm area, and the speed requirement is that defect detection for the object be completed within 4 seconds. Experimental results show that the proposed system can process an image containing more than […] pixels within 4 seconds, accurately determining the defect regions with an average overestimation rate of […] in the number of defects.
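The stated requirements imply a rough workload that can be worked out directly. The following is a minimal sketch of that arithmetic, assuming square pixels tiled over the full 28 cm × 8.6 cm area with partial pixels rounded up. The function names and the module count `n_modules` are hypothetical and chosen only for illustration, since the abstract does not say how many CPU-GPU modules are used.

```python
import math

# Figures stated in the abstract, expressed in micrometres and seconds.
LENGTH_UM = 280_000   # 28 cm
WIDTH_UM = 86_000     # 8.6 cm
PIXEL_UM = 3.5        # each pixel covers 3.5 um x 3.5 um
TIME_LIMIT_S = 4.0    # defect detection must finish within 4 seconds


def image_size_px(length_um, width_um, pixel_um):
    """Return (columns, rows) needed to cover the object, rounding up."""
    cols = math.ceil(length_um / pixel_um)
    rows = math.ceil(width_um / pixel_um)
    return cols, rows


def pixels_per_module(total_px, n_modules):
    """Even share of pixels per inspection module (hypothetical split)."""
    return -(-total_px // n_modules)  # integer ceiling division


cols, rows = image_size_px(LENGTH_UM, WIDTH_UM, PIXEL_UM)
total_px = cols * rows
required_rate = total_px / TIME_LIMIT_S  # pixels per second, whole system

print(cols, rows)       # 80000 24572
print(total_px)         # 1965760000
print(required_rate)    # 491440000.0

# Illustrative only: the number of modules is an assumption, not from the thesis.
print(pixels_per_module(total_px, n_modules=4))  # 491440000
```

Under these assumptions the full image holds roughly 1.97 × 10^9 pixels, so the system as a whole must sustain about 4.9 × 10^8 pixels per second. This is an estimate derived from the stated dimensions, not the figure reported in the experiments.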