A Novel Dual-Actor Proximal Policy Optimization Algorithm for Single and Multi-Agent Humanoid Robot

Akbar Ilham (李安民)

doi:10.6345/NTNU202400949

Thesis

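Advisors: 包傑奇; 薩義德

National Taiwan Normal University / College of Technology and Engineering / Department of Electrical Engineering / Master's thesis (2024)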

https://doi.org/10.6345/NTNU202400949

Abstract

Single-agent and multi-agent systems are integral to reinforcement learning in dynamic environments for advanced humanoid robot applications. This thesis introduces the Dual-Actor Proximal Policy Optimization (DA-PPO) algorithm and its extension, Independent Dual-Actor Proximal Policy Optimization (IDA-PPO), designed for robotic navigation and cooperative tasks using the ROBOTIS-OP3 humanoid robot. The study validates DA-PPO and IDA-PPO across various scenarios and reports significant improvements in both single-agent and multi-agent environments. DA-PPO excels in robotic navigation and movement tasks, outperforming established reinforcement learning methods in both complex environments and basic walking tasks. This success is attributed to its innovative architecture, efficient use of hardware resources such as the NVIDIA GeForce RTX 3050, and an effective reward function strategy. IDA-PPO, with its decentralized training and dual-actor policy network, achieves higher mean rewards and faster learning than Independent PPO (IPPO) and Multi-Agent PPO (MAPPO): it is 5.49 times faster than MAPPO and 8.22 times faster than IPPO, highlighting its efficiency and adaptability in multi-agent tasks. These findings underscore the importance of algorithmic innovation and hardware capabilities in advancing robotic performance, positioning DA-PPO and IDA-PPO as significant advancements in robotic learning.
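The abstract does not describe the dual-actor architecture itself, so the sketch below illustrates only the standard PPO clipped surrogate objective, which is the common starting point for the PPO family that includes IPPO and MAPPO. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration and are not taken from the thesis.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    This is generic PPO, not the thesis's dual-actor variant.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage; when the new policy moves far from the old one, the clip limits how much a positive advantage can contribute.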

Keywords

DA-PPO; IDA-PPO; single agent; multi-agent; reinforcement learning; cooperative tasks; humanoid robots; robotic navigation
