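Stereo vision is one of the popular research topics in computer vision. It uses a left and a right camera to mimic human binocular vision and thereby obtain depth information about three-dimensional space. The depth of a pixel must be derived from the geometric relationship between corresponding pixels in an image pair, and in that relationship depth is inversely proportional to disparity. This thesis therefore takes disparity as the main object of analysis and builds a stereo matching algorithm (SMA) based on cost aggregation with adaptive support weight to recover the depth information of image pixels.

A stereo matching algorithm finds, for each pixel in the target image (the left image), its corresponding point in the reference image (the right image). The corresponding point is determined by computing the similarity between the target pixel and candidate reference pixels: within a support window, costs are computed for neighboring pixels and then aggregated, and the most similar candidate is taken as the corresponding point. The positional offset between the two corresponding points is called the disparity.

The first step of the proposed method is cost computation within this window using the truncated absolute difference (TAD). Cost aggregation with adaptive support weight follows, and winner-take-all (WTA) selection then locates the minimum aggregated cost to obtain the initial disparity. Because a disparity is continuous with its neighboring disparities, the thesis refines the result by building a histogram of the disparities of the pixels in the neighboring support window and replacing each disparity value with the mode, which reduces the error rate.

The proposed cost aggregation with adaptive support weight consists of two parts. The first part is CABSW, which binarizes the target image and the reference image separately and takes the intersection of the binarized regions to form an irregular adaptive aggregation window. The second part applies CAASW to that aggregation window. CAASW denotes cost aggregation with adaptive support weight: it forms adaptive aggregation weights by using the color similarity and the proximity between the reference region and the target region as weighting factors.

To quantify the accuracy of the proposed method, it is tested on the standard stereo image sets and the standard stereo matching evaluation provided by Middlebury, and it is compared both visually and numerically with stereo matching algorithms proposed by other researchers. The experimental results show that the proposed method produces more accurate disparity results. In the future, the method can be applied to robot navigation, industrial measurement, human-machine interfaces, and three-dimensional object reconstruction to improve the intelligence of computers.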
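The matching pipeline described above (TAD cost computation, aggregation over a support window, WTA selection) can be illustrated with a minimal sketch on a single scanline. This is not the thesis implementation: the function names, the truncation threshold `tau`, the window `radius`, and the uniform (box) averaging used in place of the adaptive support weights are all assumptions made for illustration. It assumes the usual rectified convention in which pixel `x` in the left (target) scanline matches pixel `x - d` in the right (reference) scanline.

```python
def tad_cost(a, b, tau):
    """Truncated absolute difference between two intensities (tau is assumed)."""
    return min(abs(a - b), tau)


def wta_disparity(left, right, max_disp, tau=20, radius=1):
    """Disparity per pixel of one scanline: TAD cost, box aggregation, WTA.

    Uniform averaging stands in for the adaptive support weights here.
    """
    n = len(left)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            total, count = 0.0, 0
            for q in range(x - radius, x + radius + 1):
                # Skip window positions that fall outside either scanline.
                if 0 <= q < n and 0 <= q - d < n:
                    total += tad_cost(left[q], right[q - d], tau)
                    count += 1
            if count == 0:
                continue
            cost = total / count  # aggregated (averaged) matching cost
            if cost < best_cost:  # winner-take-all: keep the minimum cost
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities
```

Truncating the absolute difference at `tau` limits how much a single outlier pixel (for example, one near an occlusion) can contribute to the aggregated cost.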
Stereo vision is a popular research area in computer vision. It uses a pair of images from left and right cameras to mimic human binocular vision and to recover depth and three-dimensional information. The depth of a pixel is obtained from the geometric relationship between corresponding pixels in the image pair. The main purpose of this project is to develop an accurate stereo vision algorithm based on cost aggregation with adaptive support weight. Because depth is inversely proportional to disparity in this geometric relationship, this paper takes disparity as the main object of analysis. In this study, we use a pair of images (from left and right cameras) to find corresponding points. First, we use the truncated absolute difference (TAD) for cost computation and then perform cost aggregation with adaptive support weight. Next, we use the local winner-take-all (WTA) method to find the location of the minimum aggregated cost and obtain the initial disparity. To improve accuracy, we exploit the continuity between a disparity and its neighboring disparities: a histogram-based disparity refinement step reduces the error rate of the disparity map. The proposed cost aggregation with adaptive support weight has two parts. The first part is CABSW, which binarizes the target and reference images and takes the intersection of the binarized regions to form an irregular adaptive support window. The second part is CAASW, which uses similarity and proximity as weighting features within the adaptive support window produced by CABSW. To evaluate the accuracy of the method, we test it on the Middlebury database and compare it experimentally with other methods; the results show a lower percentage of bad matching pixels.
In the future, this research can be applied to robot navigation, industrial measurement, human-machine interfaces, and 3-D reconstruction, and can improve computer intelligence capabilities.
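The histogram-based refinement step can be illustrated with a short sketch that replaces each disparity with the most frequent value in its neighborhood. The function name, the square window of half-width `radius`, and the tie-breaking rule (prefer the smaller disparity) are assumptions for illustration; the abstract does not specify them, and the thesis collects the histogram over its own support window rather than a fixed square.

```python
from collections import Counter


def refine_by_mode(disp, radius=1):
    """Replace each disparity with the mode of its square neighborhood.

    disp is a list of rows of integer disparities. Ties are broken in
    favor of the smaller disparity (an assumed rule).
    """
    h, w = len(disp), len(disp[0])
    out = [row[:] for row in disp]
    for y in range(h):
        for x in range(w):
            hist = Counter()
            # Build the disparity histogram over the clipped window.
            for j in range(max(0, y - radius), min(h, y + radius + 1)):
                for i in range(max(0, x - radius), min(w, x + radius + 1)):
                    hist[disp[j][i]] += 1
            # Highest count wins; among equal counts, the smaller disparity.
            out[y][x] = max(hist.items(), key=lambda kv: (kv[1], -kv[0]))[0]
    return out
```

Reading from `disp` and writing to a separate `out` keeps each pixel's result independent of the order in which pixels are visited.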