Offensive Strategy and Dynamic Obstacle Avoidance for Soccer Robots Based on Deep Reinforcement Learning

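This thesis designs and implements a method for the offensive strategy and dynamic obstacle avoidance of a soccer robot based on Deep Reinforcement Learning (DRL). A training method based on Soft Actor-Critic (SAC) is proposed, with which the soccer robot learns by itself an optimal offensive strategy that evades interception by the opposing robot and raises the goal rate. A dynamic simulation environment for training the neural network is built with the Gazebo simulator of the Robot Operating System (ROS), following the rules of the RoboCup (Robot World Cup Initiative) Middle Size League. Given information about the field and the opposing robot as input, the proposed method selects the best action for the soccer robot in its current state. For the experimental results, four scenarios are designed to train the neural network, and the goal rate and the number of training episodes required in the four scenarios are compared to demonstrate the effectiveness of the proposed method.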

Keywords

Deep Reinforcement Learning; Robot Operating System; Soccer Robot; Gazebo Simulator; Soft Actor-Critic

Parallel Abstract

In this thesis, an offensive strategy and a dynamic obstacle avoidance method for soccer robots are designed and implemented based on Deep Reinforcement Learning (DRL). A training method based on Soft Actor-Critic (SAC) is proposed so that the soccer robot can learn by itself an optimal offensive strategy that avoids interception by the opponent’s robot and increases the goal rate. The Gazebo simulator of the Robot Operating System (ROS) is used to construct a dynamic simulation environment for training the neural networks. The environment is implemented according to the rules of the RoboCup (Robot World Cup Initiative) Middle Size League. Given information about the field and the opponent’s robot as input, the proposed method determines the best action for the soccer robot in the current state. For the experimental results, four scenarios are designed to train the neural networks, and the goal rate and the number of episodes required for training in these four scenarios are compared to illustrate the effectiveness of the proposed method.

Parallel Keywords

Deep Reinforcement Learning; Robot Operating System (ROS); Soccer Robot; Gazebo Simulator; Soft Actor-Critic (SAC)
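
For readers unfamiliar with Soft Actor-Critic, the training method named in the abstract, the block below writes out the standard maximum-entropy objective that SAC optimizes in general. It is a reference sketch only: the symbols (state s_t, action a_t, reward r, temperature α, policy π, horizon T) follow the usual SAC notation and are assumptions here, since the abstract does not give the thesis's own reward design, state representation, or hyperparameters.

```latex
% Standard SAC objective (general form, not taken from this thesis):
% expected reward plus an entropy bonus weighted by the temperature \alpha.
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \Big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \big[ \log \pi(a \mid s_t) \big]
```

In a setting like the one described, s_t would plausibly encode the field and opponent information and a_t the robot's motion command, but that mapping is an illustration rather than something the abstract specifies.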
