The Study on the Learning of Walking Gaits for Biped Robots

Advisor: 黃國勝

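Abstract


In humanoid robotics, building an 18-dimensional model of a biped robot and applying that model to walking balance requires an enormous amount of mathematical derivation and computation. The aim of this thesis is to use reinforcement learning to control the walking and balance of a biped robot. Without prior knowledge of the dynamic model, the robot learns to walk forward stably, and it achieves stable walking solely by changing the posture of its two legs, which frees its hands for other uses. Reinforcement learning is then used to learn how to shorten the gait so that walking speed increases steadily.

Q-learning can train each posture to be stable while also taking the correlation between gaits into account. We therefore design a learning architecture in which every decision is based on the previous gait, to handle the complex problem of walking stably and keeping balance in a real, continuous action space. A second architecture is designed to address walking speed: we only need to design the robot's actions very densely, and the proposed method lets the robot walk faster while still attending to direction and stability. Because a walking robot tends to tilt forward or fall backward, the states are not partitioned uniformly; instead, forward and backward falls are judged more strictly.

In this thesis, the agent's learning incorporates the intuitive human reaction for keeping balance, together with an evaluation of walking, as the basis for learning. Since the physical model of the biped robot already embodies knowledge of how to walk, the method proposed here improves on it in terms of walking speed and stability.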
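The non-uniform state partition mentioned above can be illustrated with a minimal sketch. It assumes the state is built from two body-tilt readings, pitch (forward/backward) and roll (sideways), and that "stricter" means finer bins along the pitch axis. The bin edges, the names `PITCH_EDGES`, `ROLL_EDGES`, and `discretize`, and the use of degrees are all hypothetical and are not taken from the thesis.

```python
from bisect import bisect_right

# Hypothetical bin edges in degrees (assumptions, not values from the thesis).
# Pitch (forward/backward tilt) is partitioned finely; roll is partitioned coarsely.
PITCH_EDGES = [-10.0, -5.0, -2.0, 2.0, 5.0, 10.0]  # 7 pitch bins
ROLL_EDGES = [-10.0, 10.0]                         # 3 roll bins


def discretize(pitch_deg: float, roll_deg: float) -> int:
    """Map a (pitch, roll) reading to a single discrete state index."""
    pitch_bin = bisect_right(PITCH_EDGES, pitch_deg)
    roll_bin = bisect_right(ROLL_EDGES, roll_deg)
    n_roll_bins = len(ROLL_EDGES) + 1
    return pitch_bin * n_roll_bins + roll_bin


if __name__ == "__main__":
    # A 3-degree forward tilt changes the state; a 3-degree sideways tilt does not.
    print(discretize(0.0, 0.0), discretize(3.0, 0.0), discretize(0.0, 3.0))
```

With these assumed edges, a small forward lean moves the agent into a different state while the same lean sideways does not, which is one way to make the learner more sensitive to forward and backward falls.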

Parallel Abstract


In the context of humanoid biped robots, building a robot model with 18 dimensions and applying it to balance the robot's behavior requires a large amount of mathematical derivation and calculation. This thesis presents a study on biped walking and balance control using reinforcement learning. The algorithm lets a robot learn how to walk without prior knowledge of any explicit dynamics model. At the same time, stable walking is achieved using only changes of gait, which leaves the robot's hands free for other tasks. We only need to design the robot's actions very densely. Reinforcement learning is then used to discover how to shorten the partitions between gaits so that walking speed improves steadily. Q-learning can not only train each gait to be stable but also train the correlation between gaits in a continuous domain. A learning architecture is developed for this purpose; it spans the basic discrete actions to construct a continuous action policy. A second architecture is developed to address walking speed by reducing the poses between gaits. It not only increases walking speed but also takes the stability of direction into account. When the robot is walking, it tends to tilt forward or backward, so the states are not partitioned uniformly. In this thesis, the agent incorporates intuitive human balancing knowledge and walking-evaluation knowledge during walking. The biped robot can perform its basic walking skill with a priori knowledge and then learns to improve its behavior in terms of walking speed and restricted positions of the center.
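For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update rule with epsilon-greedy action selection. It is not the thesis's learning architecture: the state could be an index for the previous gait and the action an index for the next pose, but that encoding, the hyperparameter values, and the names `make_q_table`, `choose_action`, and `q_update` are assumptions made only for illustration.

```python
import random
from collections import defaultdict

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)


def make_q_table():
    """Q-values keyed by (state, action); unseen pairs default to 0.0."""
    return defaultdict(float)


def choose_action(q, state, actions, rng=random):
    """Epsilon-greedy: explore with probability EPSILON, otherwise act greedily."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = make_q_table()
    actions = [0, 1]  # hypothetical indices of candidate next poses
    print(q_update(q, state=0, action=1, reward=1.0, next_state=2, actions=actions))
```

Because the update bootstraps from the best value in the next state, a reward for a stable pose propagates back to the gaits that led to it, which is the sense in which Q-learning can account for the correlation between successive gaits.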

