
Detailed Record

Author (Chinese): 邱庭毅
Author (English): Chiu, Ting-Yi
Title (Chinese): 無人機環境感知與階層式手勢控制於人機協作任務應用
Title (English): UAV Environment Perception and Hierarchical Gesture Control in Human-Robot Collaboration Applications
Advisor (Chinese): 劉吉軒
Advisor (English): Liu, Jyi-Shane
Committee members (Chinese): 劉吉軒、廖文宏、唐政元
Committee members (English): Liu, Jyi-Shane; Liao, Wen-Hung; Tang, Cheng-Yuan
Degree: Master's
Institution: National Chengchi University
Department: Department of Computer Science
Year of publication: 2021
Academic year of graduation: 109 (2020-2021)
Language: Chinese
Number of pages: 76
Keywords (Chinese): 無人機、手勢辨識、人機協作、基於視覺的即時定位與地圖構建、實例分割
Keywords (English): UAV; Gesture Recognition; Human-Robot Collaboration; v-SLAM; Instance Segmentation
DOI URL: http://doi.org/10.6814/NCCU202101454
Usage statistics:
  • Recommendations: 1
  • Views: 133
  • Downloads: 52
  • Favorites: 0
Abstract (Chinese): UAV applications have gradually expanded from early military missions to present-day civilian services. Because UAVs are easy to deploy for a project, cheap to maintain, and highly maneuverable, they have been widely adopted across many fields. However, physical joystick control is not operator-friendly: it is a form of human-machine interaction with a high learning threshold that requires professional training to master. In addition, fully automated UAV control is difficult to apply effectively to real-world tasks, mainly because real environments are often unstructured and may contain exceptional situations that the automation has not defined, or cannot define accurately.
To establish a natural and intuitive way of operating a UAV, this study proposes a human-robot collaboration method that combines UAV environment perception with hierarchical gesture control, using a hierarchical framework in which gestures regulate semi-automated flight control. Based on MediaPipe hand tracking and localization, we propose a gesture recognition method that computes finger open/closed states and pointing direction from geometric vectors. In addition, building on ORB-SLAM2 simultaneous localization and mapping and Detectron2 instance segmentation, we provide perception of specific targets defined by a custom dataset, estimating the volume and coordinates of 3D objects from image-level instance segmentation. Finally, analysis of experimental data from several participants confirms that the proposed control method outperforms physical joystick control: tasks are completed faster and more efficiently, and the view of the target remains more stable during orbiting flight.
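As an illustration of the geometric-vector idea described above, the following is a minimal sketch assuming MediaPipe Hands' 21-landmark output; the joint indices follow MediaPipe's published convention, but the bend-angle threshold, the helper names (finger_states, pointing_direction), and the four-way direction classification are illustrative assumptions rather than the thesis's exact parameters.

```python
# Minimal sketch (not the thesis's exact method): per-finger open/closed states
# and the index finger's pointing direction from MediaPipe Hands' 21 landmarks,
# using simple geometric vectors. Landmark indices follow the MediaPipe
# convention (0 = wrist; 4, 8, 12, 16, 20 = fingertips).
import numpy as np

# (base joint, middle joint, fingertip) landmark indices per finger
# (thumb, index, middle, ring, pinky).
FINGER_JOINTS = [(1, 2, 4), (5, 6, 8), (9, 10, 12), (13, 14, 16), (17, 18, 20)]


def finger_states(landmarks, bend_thresh_deg=30.0):
    """Return True (extended) / False (flexed) for each of the five fingers.

    landmarks: (21, 3) array of x, y, z hand landmark coordinates.
    A finger counts as extended when the bend angle at its middle joint is small.
    """
    states = []
    for base, mid, tip in FINGER_JOINTS:
        v1 = landmarks[base] - landmarks[mid]  # middle joint -> base joint
        v2 = landmarks[tip] - landmarks[mid]   # middle joint -> fingertip
        cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        bend = 180.0 - np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
        states.append(bend < bend_thresh_deg)
    return states


def pointing_direction(landmarks):
    """Classify the index finger's pointing direction in the image plane."""
    vec = landmarks[8, :2] - landmarks[5, :2]  # index knuckle -> index fingertip
    if abs(vec[0]) > abs(vec[1]):
        return "right" if vec[0] > 0 else "left"
    return "down" if vec[1] > 0 else "up"      # image y axis points downward
```

In a live pipeline the landmark array would typically come from MediaPipe's hand-tracking solution (mp.solutions.hands), and the resulting finger states and pointing direction would then be mapped to commands in the hierarchical control framework.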
Abstract (English): The application of UAVs has gradually shifted from military missions to civilian services. UAVs are popular in various fields due to their convenient deployment, low maintenance cost, and high maneuverability. However, physical joystick control is not friendly to the operator: it is a human-computer interaction mode with a high learning threshold and requires professional training to master. In addition, since real-world conditions are usually unstructured and may contain undefined or inaccurately defined exceptions, automated UAV control is difficult to apply to real-world tasks.
To create an intuitive drone control method, we propose a human-robot collaboration method combining UAV environment perception and hierarchical gesture control, in which a hierarchical framework adjusts flight procedures through gestures to achieve semi-automatic control. For hierarchical gesture control, we use the flexion state of the fingers and the pointing direction of the hand as gesture recognition features, based on MediaPipe hand tracking. Furthermore, we provide customizable target perception by combining ORB-SLAM2 and Detectron2, which estimates the volume and coordinates of 3D objects through instance segmentation. Finally, analysis of the participants' experimental results shows that our control method completes the tasks more efficiently and provides a more stable image during surround inspection, confirming that it outperforms physical joystick control.
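To make the perception step above concrete, here is a hedged sketch of one plausible way to combine an instance mask with sparse SLAM map points: project the points into the current frame, keep those that fall inside the mask, and summarize them as a centroid and an axis-aligned box. The function and argument names (object_from_mask, T_cw, K, mask) are hypothetical, and the thesis's actual estimation procedure may differ.

```python
# Hedged sketch: estimate a target's 3D coordinates and a rough volume by
# projecting sparse SLAM map points into the current camera frame and keeping
# those whose projection lies inside an instance-segmentation mask.
import numpy as np


def object_from_mask(points_w, T_cw, K, mask):
    """points_w: (N, 3) map points in world coordinates (e.g., from ORB-SLAM2).
    T_cw:      (4, 4) world-to-camera transform of the current frame.
    K:         (3, 3) camera intrinsic matrix.
    mask:      (H, W) boolean instance mask (e.g., from Detectron2).
    Returns (centroid_xyz, aabb_volume) of the masked points, or (None, 0.0).
    """
    h, w = mask.shape

    # Transform map points into the camera frame; drop points behind the camera.
    pts_h = np.hstack([points_w, np.ones((len(points_w), 1))])
    pts_c = (T_cw @ pts_h.T).T[:, :3]
    in_front = pts_c[:, 2] > 1e-6
    points_w, pts_c = points_w[in_front], pts_c[in_front]

    # Pinhole projection onto the image plane.
    uv = (K @ pts_c.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    in_image = (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Keep points whose projection lands inside the instance mask.
    sel = np.zeros(len(points_w), dtype=bool)
    sel[in_image] = mask[v[in_image], u[in_image]]
    obj_pts = points_w[sel]
    if len(obj_pts) == 0:
        return None, 0.0

    centroid = obj_pts.mean(axis=0)                     # object coordinates
    extent = obj_pts.max(axis=0) - obj_pts.min(axis=0)  # axis-aligned box
    return centroid, float(np.prod(extent))             # rough volume estimate
```

The centroid could then serve as the target coordinate for higher-level (macro) flight commands such as orbiting, while the box extent gives a coarse volume estimate; in practice the SLAM map scale would need to be calibrated first (cf. Section 4.1.2 in the table of contents).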
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
1.4 Research Results and Contributions
Chapter 2 Literature Review
2.1 Human-Computer Interaction
2.1.1 Gesture Recognition
2.1.2 Human-Robot Collaboration
2.2 Point Cloud Classification of 3D Objects
2.3 Visual Simultaneous Localization and Mapping (v-SLAM)
2.4 Image Segmentation
Chapter 3 Research Method
3.1 Research Process Design
3.2 Experimental Equipment and Implementation
3.3 Gesture Recognition
3.3.1 Static Gestures
3.3.2 Dynamic Gestures
3.4 UAV Environment Perception
3.4.1 Ray Analysis for Exploring Traversable Space
3.4.2 Instance Segmentation of 3D Objects
3.5 Hierarchical Gesture Control Architecture
3.5.1 Level 1: Basic Flight Control
3.5.2 Level 2: Environment Perception and Target Selection
3.5.3 Level 3: Macro Flight Control
Chapter 4 Experimental Design and Results Analysis
4.1 Experimental Design
4.1.1 Participant Data Collection
4.1.2 Map Scale Calibration
4.1.3 System Module Unit Testing
4.2 Evaluation Metrics
4.2.1 Time Required to Complete All Tasks
4.2.2 Ratio of Actual Flight Path to Shortest Distance
4.2.3 Pixel Displacement Speed of the Target During Surround Inspection
4.3 Experimental Results and Analysis
4.3.1 Analysis of the Time Required to Complete All Tasks
4.3.2 Analysis of the Ratio of Actual Flight Path to Shortest Distance
4.3.3 Analysis of the Pixel Displacement Speed of the Target During Surround Inspection
4.4 Summary
Chapter 5 Conclusion and Future Work
5.1 Research Conclusions
5.2 Future Work
References
Appendix
[1] R. Austin, Unmanned Aircraft Systems: UAVS Design, Development and Deployment, John Wiley & Sons, 2011.
[2] S. G. Gupta, M. M. Ghonge and P. M. Jawandhiya, "Review of Unmanned Aircraft System (UAS)," International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, pp. 1646-1658, 2013.
[3] A. P. Cracknell, "UAVs: regulations and law enforcement," International Journal of Remote Sensing, vol. 38, no. 8-10, pp. 3054-3067, 2017.
[4] PwC, "Global market for commercial applications of drone technology valued at over $127bn," 2016. [Online]. Available: https://pwc.blogs.com/press_room/2016/05/global-market-for-commercial-applications-of-drone-technology-valued-at-over-127bn.html. [Accessed Nov 2020].
[5] N. Smolyanskiy and M. Gonzalez-Franco, "Stereoscopic First Person View System for Drone Navigation," Frontiers in Robotics and AI, vol. 4, no. 11, 2017.
[6] D. A. Schoenwald, "AUVs: In space, air, water, and on the ground," IEEE Control Systems Magazine, vol. 20, no. 6, pp. 15-18, 2000.
[7] K. W. Williams, "A summary of unmanned aircraft accident/incident data: Human factors implications," 2004.
[8] E. Peshkova, M. Hitz and B. Kaufmann, "Natural interaction techniques for an unmanned aerial vehicle system," IEEE Pervasive Computing, vol. 16, no. 1, pp. 34-42, 2017.
[9] F. F. Mueller and M. Muirhead, "Jogging with a Quadcopter," in Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 2015.
[10] M. Stokkeland, K. Klausen and T. A. Johansen, "Autonomous visual navigation of unmanned aerial vehicle for wind turbine inspection," in 2015 International Conference on Unmanned Aircraft Systems (ICUAS), 2015.
[11] A. Loquercio, A. I. Maqueda, C. R. del-Blanco and D. Scaramuzza, "DroNet: Learning to Fly by Driving," IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 1088-1095, 2018.
[12] A. Giusti, J. Guzzi, D. C. Cireşan, F.-L. He, J. P. Rodriguez, F. Fontana, M. Fässler, C. Forster, J. Schmidhuber, G. Di Caro, D. Scaramuzza and L. M. Gambardella, "A machine learning approach to visual perception of forest trails for mobile robots," IEEE Robotics and Automation Letters, vol. 1, no. 2, pp. 661-667, 2015.
[13] P. Tsarouchi, S. Makris and G. Chryssolouris, "Human–robot interaction review and challenges on task planning and programming," International Journal of Computer Integrated Manufacturing, vol. 29, no. 8, pp. 916-931, 2016.
[14] H. Liu and L. Wang, "Gesture recognition for human-robot collaboration: A review," International Journal of Industrial Ergonomics, vol. 69, pp. 355-367, 2018.
[15] Google, "Pixel Phone," 2019. [Online]. Available: https://support.google.com/pixelphone/answer/9517454?hl=zh-Hant. [Accessed 22 Nov 2020].
[16] BMW, "BMW ConnectedDrive 智慧互聯駕駛," 2017. [Online]. Available: https://www.bmw.com.tw/zh/all-models/x-series/X3/2017/connectivity-driver-assistance.html. [Accessed 22 Nov 2020].
[17] M. Karam, "A framework for research and design of gesture-based human-computer interactions," PhD Thesis, University of Southampton, 2006.
[18] P. K. Pisharady and M. Saerbeck, "Recent methods and databases in vision-based hand gesture recognition: A review," Computer Vision and Image Understanding, vol. 141, pp. 152-165, 2015.
[19] W. Zeng, "Microsoft kinect sensor and its effect," IEEE multimedia, vol. 19, no. 2, pp. 4-10, 2012.
[20] F. Weichert, D. Bachmann, B. Rudak and D. Fisseler, "Analysis of the accuracy and robustness of the leap motion controller," Sensors, vol. 13, no. 5, pp. 6380-6393, 2013.
[21] P. Hong, M. Turk and T. S. Huang, "Gesture modeling and recognition using finite state machines," in Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition, 2000.
[22] L. Bretzner, I. Laptev and T. Lindeberg, "Hand Gesture Recognition using Multi-Scale Colour Features, Hierarchical Models and Particle Filtering," in Proceedings of fifth IEEE international conference on automatic face gesture recognition, 2002.
[23] I. Oikonomidis, N. Kyriazis and A. A. Argyros, "Efficient Model-based 3D Tracking of Hand Articulations using Kinect," in BmVC, 2011.
[24] Z. Ren, J. Yuan, J. Meng and Z. Zhang, "Robust Part-Based Hand Gesture Recognition Using Kinect Sensor," IEEE transactions on multimedia, vol. 15, no. 5, pp. 1110-1120, 2013.
[25] O. Kopuklu, N. Kose and G. Rigoll, "Motion fused frames: Data level fusion strategy for hand gesture recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018.
[26] O. Köpüklü, A. Gunduz, N. Kose and G. Rigoll, "Online Dynamic Hand Gesture Recognition Including Efficiency Analysis," IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 2, no. 2, pp. 85-97, 2020.
[27] G. Michalos, S. Makris, P. Tsarouchi, T. Guasch, D. Kontovrakis and G. Chryssolouris, "Design considerations for safe human-robot collaborative workplaces," Procedia CIrP, vol. 37, pp. 248-253, 2015.
[28] A. Bauer, D. Wollherr and M. Buss, "Human-Robot Collaboration: A Survey," International Journal of Humanoid Robotics, vol. 5, no. 1, pp. 47-66, 2008.
[29] V. Villani, F. Pini, F. Leali and C. Secchi, "Survey on human–robot collaboration in industrial settings: Safety, intuitive interfaces and applications," Mechatronics, vol. 55, pp. 248-266, 2018.
[30] Intuitive Surgical, "Da Vinci Surgical Systems," 2017. [Online]. Available: https://www.intuitive.com/en-us/products-and-services/da-vinci/systems. [Accessed Nov 2020].
[31] T. Fong, A. Abercromby, M. G. Bualat, M. C. Deans, K. V. Hodges, J. M. Hurtado Jr, R. Landis, P. Lee and D. Schreckenghost, "Assessment of robotic recon for human exploration of the Moon," Acta Astronautica, vol. 67, no. 9-10, pp. 1176-1188, 2010.
[32] M. Ester, H.-P. Kriegel, J. Sander and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Kdd, 1996.
[33] Q.-Y. Zhou, J. Park and V. Koltun, "Open3D: A Modern Library for 3D Data Processing," arXiv preprint arXiv:1801.09847, 2018.
[34] I. Bogoslavskyi and C. Stachniss, "Fast range image-based segmentation of sparse 3D laser scans for online operation," in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016.
[35] S. Song and J. Xiao, "Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[36] C. R. Qi, O. Litany, K. He and L. J. Guibas, "Deep Hough Voting for 3D Object Detection in Point Clouds," in Proceedings of the IEEE International Conference on Computer Vision, 2019.
[37] S. Pillai and J. J. Leonard, "Monocular slam supported object recognition," arXiv preprint arXiv:1506.01732, 2015.
[38] L. Zhang, L. Wei, P. Shen, W. Wei, G. Zhu and J. Song, "Semantic SLAM based on object detection and improved octomap," IEEE Access, vol. 6, pp. 75545-75559, 2018.
[39] T. Taketomi, H. Uchiyama and S. Ikeda, "Visual SLAM algorithms: a survey from 2010 to 2016," IPSJ Transactions on Computer Vision and Applications, vol. 9, no. 1, 2017.
[40] A. J. Davison, I. D. Reid, N. D. Molton and O. Stasse, "MonoSLAM: Real-Time Single Camera SLAM," IEEE transactions on pattern analysis and machine intelligence, vol. 29, no. 6, pp. 1052-1067, 2007.
[41] G. Klein and D. Murray, "Parallel tracking and mapping for small AR workspaces," in 2007 6th IEEE and ACM international symposium on mixed and augmented reality, 2007.
[42] R. Mur-Artal and J. D. Tardós, "ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras," IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255-1262, 2017.
[43] R. Sun and B. A. Giuseppe, "3D reconstruction of real environment from images taken from UAV (SLAM approach)," PhD Thesis, Politecnico di Torino, 2018.
[44] J. Engel, T. Schops and D. Cremers, "LSD-SLAM: Large-Scale Direct Monocular SLAM," in European conference on computer vision, 2014.
[45] J. Engel, V. Koltun and D. Cremers, "Direct Sparse Odometry," IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 3, pp. 611-625, 2017.
[46] C. Forster, Z. Zhang, M. Gassner, M. Werlberger and D. Scaramuzza, "SVO: Semidirect Visual Odometry for Monocular and Multicamera Systems," IEEE Transactions on Robotics, vol. 33, no. 2, pp. 249-265, 2016.
[47] N. Yang, R. Wang, X. Gao and D. Cremers, "Challenges in Monocular Visual Odometry: Photometric Calibration, Motion Bias, and Rolling Shutter Effect," IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 2878-2885, 2018.
[48] R. Mur-Artal, J. M. M. Montiel and J. D. Tardos, "ORB-SLAM: a versatile and accurate monocular SLAM system," IEEE transactions on robotics, vol. 31, no. 5, pp. 1147-1163, 2015.
[49] M. Everingham, L. V. Gool, C. K. I. Williams, J. Winn and A. Zisserman, "The pascal visual object classes (voc) challenge," International journal of computer vision, vol. 88, no. 2, pp. 303-338, 2010.
[50] A. Geiger, P. Lenz and R. Urtasun, "Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite," in 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
[51] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg and L. Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," International journal of computer vision, vol. 115, no. 3, pp. 211-252, 2015.
[52] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth and B. Schiele, "The cityscapes dataset for semantic urban scene understanding," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016.
[53] A. Kuznetsova, H. Rom, N. Alldrin, J. Uijlings, I. Krasin, J. Pont-Tuset, S. Kamali, S. Popov, M. Malloci, A. Kolesnikov, T. Duerig and V. Ferrari, "The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale," 2018.
[54] V. Badrinarayanan, A. Kendall and R. Cipolla, "Segnet: A deep convolutional encoder- decoder architecture for image segmentation," IEEE transactions on pattern analysis and machine intelligence, vol. 39, no. 12, pp. 2481-2495, 2017.
[55] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille, "Semantic image segmentation with deep convolutional nets and fully connected crfs," arXiv preprint arXiv:1412.7062, 2014.
[56] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille, "Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs," IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 4, pp. 834-848, 2017.
[57] L.-C. Chen, G. Papandreou, F. Schroff and H. Adam, "Rethinking atrous convolution for semantic image segmentation," arXiv preprint arXiv:1706.05587, 2017.
[58] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff and H. Adam, "Encoder-decoder with atrous separable convolution for semantic image segmentation," in Proceedings of the European conference on computer vision (ECCV), 2018.
[59] K. He, G. Gkioxari, P. Dollar and R. Girshick, "Mask R-CNN," in Proceedings of the IEEE international conference on computer vision, 2017.
[60] Y. Wu, A. Kirillov, F. Massa, W.-Y. Lo and R. Girshick, "Detectron2," Facebook, 2019. [Online]. Available: https://github.com/facebookresearch/detectron2. [Accessed 22 Nov 2020].
[61] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler and A. Ng, "ROS: an open-source Robot Operating System," in ICRA workshop on open source software, 2009.
[62] DJI, "DJI Mavic2," 2018. [Online]. Available: https://www.dji.com/tw/mavic-2. [Accessed Nov 2020].
[63] F. Zhang, V. Bazarevsky, A. Vakunov, A. Tkachenka, G. Sung, C.-L. Chang and M. Grundmann, "MediaPipe Hands: On-device Real-time Hand Tracking," arXiv preprint arXiv:2006.10214, 2020.