
Visual Analysis for Drone with Reinforcement Learning in Virtual Environment

Advisor: 紀明德

Abstract


Autonomous drone racing has become very popular in recent years. In 2019, Microsoft's AirSim team held a drone gate-passing competition in a virtual environment at the NeurIPS conference, with the main goal of surpassing the performance of human players. None of the prize-winning entrants designed a deep reinforcement learning (DRL) method specifically for this competition. This research therefore uses DRL to train a model that passes the gates and completes the race in this virtual competition, and adopts ROS, which is often used on real drones, as the communication architecture for command transmission, in order to narrow the gap between the virtual and real environments.

DRL is widely regarded as a black box: the user does not know what the model has actually learned. This research therefore designs a visual interface that lets users analyze the model's performance, along with a chart of the probability of each action choice, so that users can see whether the model's reasoning in the current state matches common expectations. Finally, neural network visualization techniques are used to identify the causes of poor model performance and to improve the model. The analysis also finds that in some situations the model behaves similarly to humans, which greatly increases trust in DRL and the prospects for real-world application.

