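Object tracking is an important research topic in visual servoing, machine vision, and related applications. However, most object-tracking studies assume a clean background. This thesis develops a vision-based tracking system for three-dimensional objects in real environments: it recognizes a specific target object, estimates its depth, and then tracks it. In the off-line phase, the system builds a sample database and calibrates the cameras in advance. The database is built by extracting corners from the target object, sampling them at multiple resolutions, describing them, and building a coarse and fine matching structure. In the on-line operating phase, the system performs corner extraction and description on the stereo input images, followed by object recognition, depth computation, and tracking. Because the system must balance stability against real-time performance, recognition is performed on the left CCD image. Corners are extracted with a modified intuitive corner detector for speed, and they are described with SIFT descriptors whose dimensionality is reduced by PCA. For object recognition, two-stage corner matching finds the initial set of correspondences, a conditional RANSAC algorithm removes the mismatched ones, and the remaining correct correspondences are used to compute the planar transformation (homography) parameters and estimate the object's outline in the left CCD image. To recognize the object in the right CCD image, the correct correspondences between the left CCD image and the sample are matched against the corners of the right CCD image, and a fast planar geometric constraint then removes the mismatches between the left and right CCD images before the object's depth is estimated. Next, motion vectors are used to estimate the object search range in both the left and right images. Finally, the motion vector of the object's centroid between consecutive left-image frames, together with a servo-delay compensation, is issued as a motion command to the motors, which carry out the tracking.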
Object tracking is an important issue in visual servoing and robot vision applications. This thesis proposes a stereo-vision-based system for recognizing and tracking 3D objects in real environments. The system is divided into an off-line training phase and an on-line operating phase. In the off-line training phase, data corresponding to the object are collected, corners are detected, multi-resolution patches are sampled, corner descriptions are computed, and a hierarchical database is constructed. In the operating phase, the test images are processed in the same manner but with single-resolution patches, and the corners are then matched against the hierarchical database. For corner detection and description, this thesis adopts a modified intuitive corner detector to detect corners and SIFT to describe them. Two-stage corner matching is adopted: coarse and fine matching based on the hierarchical structure of the corner descriptions narrows the range of candidate patches and improves matching performance, and a conditional RANSAC criterion is then applied to reject the remaining outliers. The correctly matched corners are used to estimate the outline of the object through a homography. For stereo vision, the proposed fast homography constraint rejects outliers after corner matching between the two images captured by the parallel cameras. The inliers in these two images are then used to estimate the object depth. Finally, the parallel camera mechanism is driven by the estimated motion vector to track the object.
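As a rough illustration of the depth-estimation step, the sketch below assumes an ideal, rectified parallel-camera geometry, in which the depth of a matched point is Z = f * B / d (f: focal length in pixels, B: baseline, d: horizontal disparity between the left and right images). The function names, the numeric values, and the use of a median over the inlier matches are assumptions made for illustration only; the abstract does not state how the thesis combines the inliers into a single depth value.

```python
from statistics import median


def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth of one matched point for ideal parallel cameras: Z = f * B / d."""
    disparity = x_left - x_right  # horizontal shift in pixels
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def object_depth(inlier_matches, focal_px, baseline_m):
    """Combine the inlier matches into one depth estimate.

    inlier_matches: list of (x_left, x_right) pixel columns of matched corners.
    The median is an assumed aggregation rule, chosen here because it
    tolerates a few remaining bad matches.
    """
    depths = [
        depth_from_disparity(xl, xr, focal_px, baseline_m)
        for xl, xr in inlier_matches
        if xl - xr > 0  # skip degenerate matches instead of failing
    ]
    if not depths:
        raise ValueError("no usable matches")
    return median(depths)


if __name__ == "__main__":
    # Hypothetical values: 800 px focal length, 0.1 m baseline.
    matches = [(320.0, 300.0), (410.0, 390.0), (250.0, 231.0)]
    print(object_depth(matches, focal_px=800.0, baseline_m=0.1))
```

With real cameras the focal length and baseline would come from the off-line calibration step, and lens distortion and rectification would have to be handled before this formula applies.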