
Self-Organizing Reinforcement Learning Model (類神經網路自組織增強式學習模型)

Advisor: 劉長遠

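Abstract

This thesis proposes a reinforcement learning (RL) model for motor behavior control. The model is inspired by the organizational principles of the cerebral cortex and simulates the functions of its sensory and motor areas. Self-Organizing Maps (SOM) have been shown to be very effective at modeling the topological function of the cortex. The model exploits this property by using a SOM as a sensory intermediate layer through which states of the external environment stimulate the model, and likewise as an intermediate layer for motor output. Inside the model, a SARSA Q-learning algorithm with a neighborhood function is used. With the SOMs as intermediaries, the problem of excessively large lookup tables that plain reinforcement learning runs into on continuous spaces is resolved, and the resulting model can map states in a continuous space onto a continuous motor action space.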
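As a rough illustration of the sensory intermediate layer, the following Python sketch shows how a SOM can reduce a continuous state to a single discrete unit index, which is what keeps the lookup table small. It is not the thesis's implementation: the one-dimensional chain topology, the map size, the learning rate, the Gaussian neighborhood, and the names `best_matching_unit` and `som_update` are all assumptions made for this example.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - xi) ** 2 for w, xi in zip(weights[i], x)))

def som_update(weights, x, lr=0.1, sigma=1.0):
    """One Kohonen step on a 1-D chain of units; returns the winner index."""
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        # Gaussian neighborhood over the distance between chain positions.
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        weights[i] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

rng = random.Random(0)
# 10 units on a chain, each with a 2-D weight vector (sizes are illustrative).
state_som = [[rng.random(), rng.random()] for _ in range(10)]
for _ in range(2000):
    som_update(state_som, [rng.random(), rng.random()])

# A continuous state is now summarized by one discrete unit index.
unit = best_matching_unit(state_som, [0.25, 0.75])
```

A second map of the same kind could play the motor role, with each unit's weight vector read out as a continuous action.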

Parallel Abstract


In this thesis, we propose a motor control model based on reinforcement learning (RL). The model is inspired by organizational principles of the cerebral cortex, specifically the cortical maps and functional hierarchy in the sensory and motor areas of the brain. Self-Organizing Maps (SOM) have proven useful in modeling cortical topological maps. One SOM maps the input space in response to real-valued state information, and a second SOM represents the action space. We use a neighborhood-update version of the SARSA Q-learning algorithm. The SOMs give a practical way to represent the Q-function without a large lookup table when the state or action space is continuous or very large. The final model can map a continuous input space to a continuous action space.
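The neighborhood update can be sketched as follows. The abstract does not give the exact rule, so this is one plausible reading, not the thesis's algorithm: a standard SARSA target is applied to every entry of a Q-table indexed by state unit and action unit, scaled by a Gaussian neighborhood around the winning pair. The function names, the Gaussian form, and the values of `alpha`, `gamma`, and `sigma` are assumptions.

```python
import math

def neighborhood(i, winner, sigma):
    """Gaussian neighborhood weight between unit i and the winning unit."""
    return math.exp(-((i - winner) ** 2) / (2.0 * sigma ** 2))

def sarsa_neighborhood_update(q, s, a, reward, s_next, a_next,
                              alpha=0.5, gamma=0.9, sigma=1.0):
    """Spread one SARSA update over the neighbors of (s, a).

    q is a table q[state_unit][action_unit]; s, a, s_next, a_next are
    unit indices (for example, winners of a state SOM and an action SOM).
    """
    # On-policy SARSA target: uses the action actually chosen in s_next.
    target = reward + gamma * q[s_next][a_next]
    for i in range(len(q)):
        for j in range(len(q[i])):
            h = neighborhood(i, s, sigma) * neighborhood(j, a, sigma)
            q[i][j] += alpha * h * (target - q[i][j])

# Toy table: 5 state units by 3 action units, all values start at zero.
q = [[0.0] * 3 for _ in range(5)]
sarsa_neighborhood_update(q, s=2, a=1, reward=1.0, s_next=3, a_next=0)
```

In this sketch the winning entry `q[2][1]` receives the full step and nearby entries receive progressively smaller ones, so experience at one state-action pair generalizes to its topological neighbors.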

