  • Thesis


Analyzing Product Semantic Labels as Cue for Placing Action - Product Semantic Dataset and Cooperative Dual-Arm Active Manipulation

Advisor: 王學誠


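Amid the current wave of artificial intelligence, more and more warehouses and stores are becoming unmanned. For example, Amazon Go, an unmanned store in Seattle, lets customers pick out products and simply leave with them, without handing them to a clerk for checkout; the selected products are recognized automatically and charged to the customer's credit card, which eliminates time spent waiting in line. Online shopping is similarly convenient: the vendor collects the goods for a customer's online order from the warehouse, and a freight company delivers them to the customer. Many vendors have already begun automating their warehouses to make order collection more efficient. In these scenarios, robots need to perform automatic picking and placing tasks based on the semantic labels on products. Robotic picking has seen major breakthroughs in the Amazon Robotics Challenge, but placing has rarely been studied. For example, an unmanned store needs products arranged neatly on the shelves so that customers can find them easily. In addition, when goods are collected in a warehouse, the barcode on a product must face the barcode scanner before the system can sort the product and send it by conveyor belt to the designated box for packing. Placing an object in a specific pose involves several difficulties: 1) the semantic labels on an object and the object's geometry must be considered together; 2) in cluttered environments objects occlude one another, which makes them hard to recognize and manipulate. Moreover, there is currently no unified method for evaluating the performance of a robotic manipulation system. To address these problems, I improve on [1] and validate the result. The main contributions of this thesis are 1) an open-source product semantic dataset containing images and annotations for objects, brand names, and barcodes; 2) a cooperative dual-arm active manipulation system. The product semantic dataset is used to show that this system can effectively solve the problems described above, and failure cases are presented and analyzed as a basis for future improvement.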


With the rise of artificial intelligence, more and more warehouses and stores are being automated. For example, Amazon Go, an automated store in Seattle, is known for its convenience: customers can pick products and leave directly without checking out with a store clerk. The picked products are detected automatically and charged to the customer’s credit card, which saves the time otherwise spent standing in line. Another example is online shopping: a company receives a customer’s online order and collects the products in its warehouse, and a shipping company delivers the goods to the customer. Many companies have started to automate their warehouses to collect products more efficiently. In these scenarios, robots need to rely on the semantic labels on products to execute picking and placing tasks. Picking has seen important progress in the Amazon Picking Challenge, but placing remains challenging. For example, in an automated store, products need to be placed neatly on the shelf so that customers can find them easily. Moreover, when products are sorted in a warehouse, the barcode on each product needs to face the barcode scanner so that the product can be classified, carried on the conveyor, and packed into the designated box. Pose-aware placing poses two main challenges: 1) the semantic labels and the geometry of a product need to be considered jointly; 2) occlusion in cluttered environments makes detection and manipulation hard. In addition, there is no unified method for evaluating the performance of robotic manipulation. To address these problems, I improve on [1] and evaluate the resulting system. The contributions of this thesis are: 1) an open-source dataset that includes images and labels for objects, barcodes, and brand names; 2) a cooperative dual-arm active manipulation system, together with a demonstration of its capability to solve the problems above. Finally, I analyze the failure cases to guide future improvement.


[1] H.-M. Huang, “Object pose estimation based on text recognition for automated pick-and-place of cashierless stores,” Master’s thesis, National Chiao Tung University, Nov. 2018.
[2] A. Zeng, K.-T. Yu, S. Song, D. Suo, E. Walker Jr, A. Rodriguez, and J. Xiao, “Multi-view self-supervised deep learning for 6d pose estimation in the amazon picking challenge,” in ICRA, 2017.
[3] N. Sünderhauf, O. Brock, W. Scheirer, R. Hadsell, D. Fox, J. Leitner, B. Upcroft, P. Abbeel, W. Burgard, M. Milford et al., “The limits and potentials of deep learning for robotics,” The International Journal of Robotics Research, vol. 37, no. 4-5, pp. 405–420, 2018.
[4] N. Atanasov, B. Sankaran, J. Le Ny, G. J. Pappas, and K. Daniilidis, “Nonmyopic view planning for active object classification and pose estimation,” IEEE Transactions on Robotics, vol. 30, no. 5, pp. 1078–1090, 2014.
[5] A. Doumanoglou, R. Kouskouridas, S. Malassiotis, and T.-K. Kim, “Recovering 6d object pose and predicting next-best-view in the crowd,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3583–3592.
