Single versus multiple neural network models for robot control on composite tasks

Advisor: 金仲達

Abstract


The vast advances of neural networks (NNs) in recent years have led to their widespread adoption in robot control. As increasingly complex robot operations are attempted, the neural networks for robot control become very complicated, making them difficult to train and requiring large computing resources to run. While researchers pursue the boundary of how complex an operation a single neural network can perform, a more intuitive and practical approach is to divide the robot operation into subtasks, each controlled by a smaller neural network. In this thesis, we compare the two approaches quantitatively, using drink pouring learned from demonstration as an example. The drink-pouring operation subdivides naturally into three subtasks: reaching for, grasping, and pouring a bottle of drink. The robot learns the operation from human demonstrations via behavior cloning. The two approaches are compared in terms of their success rate in accomplishing the whole operation and the number of parameters in their neural network models. Our experiments show that under similar success rates, the multiple-NN models require up to 53.5% fewer parameters than the single NN model, and under similar numbers of model parameters, the multiple-NN models achieve a success rate up to 25.6% higher than the single NN model.
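The comparison above can be made concrete with a short behavior-cloning sketch. The code below is a minimal illustration only, not the models used in the thesis: the PyTorch MLP architecture, the observation/action dimensions (OBS_DIM, ACT_DIM), the hidden widths, and the synthetic demonstration tensors are all assumptions made for the sake of a runnable example.

import torch
import torch.nn as nn

# Hypothetical dimensions; the thesis's actual observation/action spaces are not given here.
OBS_DIM, ACT_DIM = 32, 7

def make_policy(hidden=64):
    # A small MLP policy mapping an observation vector to joint commands.
    return nn.Sequential(
        nn.Linear(OBS_DIM, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, ACT_DIM),
    )

def behavior_clone(policy, demos, epochs=10, lr=1e-3):
    # Behavior cloning: supervised regression of demonstrated actions onto observations.
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for obs, act in demos:
            opt.zero_grad()
            loss_fn(policy(obs), act).backward()
            opt.step()
    return policy

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Synthetic stand-in demonstrations (random tensors) so the sketch runs end to end.
# In practice each subtask would be trained on its own segment of the human demonstrations.
demos = [(torch.randn(16, OBS_DIM), torch.randn(16, ACT_DIM)) for _ in range(8)]

# Multiple-NN approach: one small policy per subtask, executed in sequence at run time.
subtask_policies = {name: behavior_clone(make_policy(hidden=64), demos)
                    for name in ("reach", "grasp", "pour")}

# Single-NN approach: one larger policy trained on the whole composite operation.
single_policy = behavior_clone(make_policy(hidden=160), demos)

print("multiple NNs:", sum(n_params(p) for p in subtask_policies.values()), "parameters")
print("single NN:  ", n_params(single_policy), "parameters")

Counting parameters this way mirrors the comparison in the abstract: three narrow policies can together hold fewer weights than one wide policy sized to cover the whole composite task, which is the trade-off the thesis quantifies.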
