  • Thesis

基於深度學習及遷移式學習之機器人操作平板電腦虛擬鍵盤的視覺與動作協調系統

A Vision and Motion Coordination System Based on Deep Learning and Transfer Learning for a Robot to Type Virtual Keyboards on a Tablet Computer

Advisor: 鄭士康

Abstract


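In this thesis, we propose a novel vision and motion coordination system in which a robot serves as a physical agent that operates a keyboard on behalf of elders in long-term care. Learning how to use today's wide variety of applications is a very difficult challenge for the elderly; if a robot could operate these smart devices for them, the burden on caregivers would be greatly reduced. The proposed system uses convolutional neural network (CNN) object detection to perceive the position of the target button and a deep neural network (DNN) to control the robot's motion. We designed a virtual agent, NAOgym, that manages the exchange of information between the robot's perception and motion models. A CNN-based vision model detects the target key displayed on the tablet computer and the stylus that appears in view, and computes their relative position and distance as the observed high-level semantic information. The DNN-based motion model then produces the next action through its policy, using state information that combines this relative position with the sensor readings of the physical agent. In addition, we apply an attention mechanism to the motion control model and treat the degree of attention as the movement speed of each joint, in order to speed up training of the reinforcement learning algorithm. In a virtual arm environment, we designed an arm similar to NAO's to evaluate the training process and performance, the effect of feature selection on performance, and the algorithm's assumption that no prior calibration is needed. Experiments were carried out with the virtual arm to evaluate the proposed system, and the results validate the concept we propose.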

Parallel Abstract


In this work, we propose a novel vision and motion coordination system that lets a robot act as a physical agent typing on a keyboard for elders in long-term care. Learning to use modern, diverse applications is a challenge for older people; if robots can operate these smart devices for them, the burden on caregivers can be substantially reduced. The proposed system uses convolutional neural network (CNN) object detection to sense the position of the target button and a deep neural network (DNN) to control the robot's motion. We designed a cyber-agent, NAOgym, which manages the exchange of information between the robot's perception and motion models. A CNN-based model detects the target button displayed on the tablet computer and the stylus pen in the field of view, and calculates their relative position and distance as the observed high-level semantic information. The DNN-based actor model generates the next action through its policy, based on state information that combines the relative position with the physical agent's sensor readings. In addition, we apply an attention mechanism to the motion control model and use the degree of attention as the joint speed to accelerate the training of the reinforcement learning algorithm. In a virtual arm environment, we design an arm like NAO's to evaluate the training process and performance, the effect of feature selection on performance, and the calibration-free assumption of the algorithm. Experiments conducted in the virtual arm environment evaluate the proposed system, and the experimental results validate the proposed concept.
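The observation and action steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the `(x_min, y_min, x_max, y_max)` bounding-box format, the use of box centers, and the per-joint attention weights in the range 0 to 1 are all assumptions made for illustration.

```python
import math


def box_center(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max). Format is assumed."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def relative_observation(key_box, stylus_box):
    """Offset (dx, dy) from the stylus center to the key center, plus their distance."""
    kx, ky = box_center(key_box)
    sx, sy = box_center(stylus_box)
    dx, dy = kx - sx, ky - sy
    return dx, dy, math.hypot(dx, dy)


def build_state(key_box, stylus_box, joint_angles):
    """Combine the visual observation with joint sensor readings into one state vector."""
    dx, dy, dist = relative_observation(key_box, stylus_box)
    return [dx, dy, dist, *joint_angles]


def scale_by_attention(joint_deltas, attention):
    """Scale each joint's commanded change by its attention weight (hypothetical scheme)."""
    if len(joint_deltas) != len(attention):
        raise ValueError("one attention weight is required per joint")
    return [delta * weight for delta, weight in zip(joint_deltas, attention)]


if __name__ == "__main__":
    key_box = (100, 40, 140, 80)     # detected target key, in pixels (made-up values)
    stylus_box = (80, 10, 100, 30)   # detected stylus tip, in pixels (made-up values)
    print(build_state(key_box, stylus_box, [0.1, -0.2]))
    print(scale_by_attention([0.2, -0.4], [0.5, 0.25]))
```

In this sketch a joint with a low attention weight moves only a little in a given step, which is one plausible reading of "attention as joint speed"; the actual policy network and attention architecture are not specified in the abstract.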

