Author (English): Lee, Hsuan-I
Thesis Title (English): Visual Analysis for drone with Reinforcement Learning in Virtual Environment
Advisor (English): Chi, Ming-Te
Oral Examination Committee (English): Lin, Shih-Syun
Wang, Ko-Chih
Peng, Yan-Tsung
English Keywords: Deep reinforcement learning; Drone racing; Virtual environment; Visual analytics
Doi Url:http://doi.org/10.6814/NCCU202200384
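Fully autonomous drone racing has become very popular in recent years. In 2019, Microsoft's AirSim team held
a drone gate-passing competition in a virtual environment at the NeurIPS conference, whose main
the ROS system often used by drones as the communication architecture for command transmission, narrowing the virtual and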
Autonomous drone racing has become very popular in recent years. In 2019, Microsoft's AirSim team held a drone gate-passing competition in a virtual environment at the NeurIPS conference, with the main goal of surpassing the performance of human players. None of the contestants designed a DRL (deep reinforcement learning) method specifically for this competition. This research uses DRL to train a model for this virtual race and adopts ROS, which is often used by real drones, as the communication architecture for command transmission in order to reduce the gap between the virtual environment and reality. DRL is widely regarded as a black box: the user does not know what the model has learned. This research therefore designs a visual interface that lets users analyze the model's performance, along with a chart of the selection probability of each action so that users can judge whether the model's reasoning in the current state matches common intuition. Finally, neural network visualization techniques are used to identify and fix causes of the model's poor performance, and to find that in some cases the model behaves similarly to humans, which greatly increases trust in DRL and the possibility of real-world applications.
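Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Problem Description
1.3 Contributions
1.4 Thesis Organization
Chapter 2 Related Work
2.1 Deep Reinforcement Learning
2.1.1 Deep Learning
2.1.2 Reinforcement Learning
2.1.3 Development of Deep Reinforcement Learning
2.2 Deep Reinforcement Learning and Drone Applications
2.3 Visual Analytics and Techniques
Chapter 3 Methodology
3.1 System Architecture
3.2 Environment Setup
3.3 Controlling the Drone with Deep Reinforcement Learning
3.3.1 ACKTR and Model Architecture
3.3.2 Object Detection
3.3.3 Drone Control
3.4 Reward Function Design
3.5 Data Collection
Chapter 4 Visualization Design
4.1 Design Motivation and Goals
4.2 Dashboard Overview
4.3 Neural Network Visualization
4.3.1 Backpropagation Methods
4.3.2 Perturbation-Based Saliency Maps
4.3.3 Using Saliency Maps to Identify and Fix Problems
4.4 Grad-CAM++ Analysis Visualization
Chapter 5 Experimental Results and Discussion
5.1 Implementation and Experimental Environment
5.2 Model Test Results
5.3 Visual Analysis of the Model
5.3.1 Data Analysis
5.3.2 Analysis of the Model's Behavior and Reasoning
5.3.3 Analyzing the Model's Knowledge with Perturbation-Based Saliency Maps
5.3.4 Grad-CAM++ Result Analysis
5.4 Limitations
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References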
