In this thesis, an object grasping system based on image detection is proposed. The YOLO (You Only Look Once) algorithm is first used to detect the category and location of an object, and a point cloud method is then used to obtain the grasping point and normal vector of the object, so that the robot manipulator can grasp the object effectively at this grasping point along this normal vector. The system consists of two main parts: (1) object detection and (2) object normal vector estimation. For object detection, an RGB-D camera is first used to capture depth images of the objects, and labeled photographs of the objects of interest are prepared as training data for the YOLO algorithm; experiments then confirm that the trained network can accurately detect the category and location of various objects. For object normal vector estimation, the bounding box detected by the YOLO algorithm is first used to limit the extent of the point cloud, and a center point and a radius are determined within this range; the average of the normal vectors of the points within the radius is then taken as the normal vector of the center point. This center point and its normal vector are used as the grasping point and normal vector for grasping the object. The experimental results show that the proposed method can effectively obtain the normal vector needed to grasp the object.
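As an illustration of the normal vector estimation step only (not the implementation used in this thesis), the following Python sketch shows how the bounding-box crop and the radius-based averaging of normals could be realized with the Open3D library; the use of Open3D itself and the function names crop_to_bbox and estimate_grasp_normal are assumptions made for this example.

```python
# Illustrative sketch only: Open3D is assumed; function names are hypothetical.
import numpy as np
import open3d as o3d

def crop_to_bbox(pcd: o3d.geometry.PointCloud,
                 min_bound: np.ndarray,
                 max_bound: np.ndarray) -> o3d.geometry.PointCloud:
    """Limit the point cloud to the 3-D region spanned by the detected bounding box."""
    box = o3d.geometry.AxisAlignedBoundingBox(min_bound, max_bound)
    return pcd.crop(box)

def estimate_grasp_normal(pcd: o3d.geometry.PointCloud,
                          center: np.ndarray,
                          radius: float) -> np.ndarray:
    """Average the normals of the points within `radius` of `center`, then re-normalize."""
    # Estimate per-point normals on the (already cropped) cloud.
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamRadius(radius=radius))
    # Gather the points lying inside the sphere around the chosen center point
    # (the center is assumed to lie within the cropped cloud).
    kdtree = o3d.geometry.KDTreeFlann(pcd)
    _, idx, _ = kdtree.search_radius_vector_3d(center, radius)
    normals = np.asarray(pcd.normals)[idx]
    # The mean of these normals serves as the normal vector of the center point,
    # i.e. the grasp normal; the center point itself serves as the grasp point.
    mean_normal = normals.mean(axis=0)
    return mean_normal / np.linalg.norm(mean_normal)
```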