Discover Monte Carlo Algorithm on Spin Ice Model Using Reinforcement Learning

Advisor: 高英哲

Abstract


Reinforcement learning, with its outstanding capability for exploration and decision-making in dynamic environments, has become a fast-growing field of machine learning research. Inspired by theories of psychological learning, its framework contains a machine agent equipped with an improvable policy. The agent acts on the environment according to its current policy and refines that policy based on the feedback it receives, reaching its goal through repeated trial and correction.

In this thesis, we use reinforcement learning to let the agent invent, by itself, a Monte Carlo algorithm for the spin ice model. Spin ice is a frustrated magnetic system whose low-energy configurations are subject to strong topological constraints. In physics, the loop Monte Carlo algorithm is known to update such a system efficiently without violating its local topological constraint.

However, efficient update algorithms tend to be problem-dependent, and a new system typically requires a newly designed algorithm. We therefore develop a reinforcement-learning framework that models the Monte Carlo transition operator with a deep neural network and generalizes the Markov chain to a Markov decision process, so that the machine agent can create efficient update policies through its interaction with the physical system. We believe this algorithm can serve as a general framework for the search of Monte Carlo update schemes.
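To make the loop update concrete, here is a minimal Python sketch of a short-loop move on square ice (the six-vertex representation on a periodic lattice). It works at infinite temperature, so Boltzmann weights and acceptance steps are omitted, and the encoding (the `h`/`v` arrow arrays, `outgoing`, `loop_update`) is an illustrative choice of ours rather than the implementation used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 8  # linear size of the periodic square lattice

# Square-ice (six-vertex) representation: h[x, y] = +1 if the arrow on the
# horizontal edge from vertex (x, y) to ((x+1) % L, y) points right, -1 if left;
# v[x, y] = +1 if the arrow from (x, y) to (x, (y+1) % L) points up, -1 if down.
# The all-right / all-up state satisfies the two-in-two-out ice rule everywhere.
h = np.ones((L, L), dtype=int)
v = np.ones((L, L), dtype=int)

def outgoing(x, y):
    """Arrows leaving vertex (x, y), as (edge id, destination vertex) pairs."""
    out = []
    xm, ym = (x - 1) % L, (y - 1) % L
    if h[x, y] == +1:
        out.append((('h', x, y), ((x + 1) % L, y)))
    if h[xm, y] == -1:
        out.append((('h', xm, y), (xm, y)))
    if v[x, y] == +1:
        out.append((('v', x, y), (x, (y + 1) % L)))
    if v[x, ym] == -1:
        out.append((('v', x, ym), (x, ym)))
    return out

def loop_update():
    """Walk along outgoing arrows until some vertex repeats, then reverse
    every arrow on the closed directed loop that was found."""
    x, y = int(rng.integers(L)), int(rng.integers(L))
    path = []              # edges traversed so far, in order
    seen = {(x, y): 0}     # vertex -> index of its first departure in `path`
    while True:
        edge, (x, y) = outgoing(x, y)[rng.integers(2)]
        path.append(edge)
        if (x, y) in seen:
            for kind, ex, ey in path[seen[(x, y)]:]:  # the closed cycle
                if kind == 'h':
                    h[ex, ey] *= -1
                else:
                    v[ex, ey] *= -1
            return
        seen[(x, y)] = len(path)

for _ in range(1000):
    loop_update()
# Sanity check: every vertex still has exactly two outgoing arrows.
assert all(len(outgoing(x, y)) == 2 for x in range(L) for y in range(L))
```

Because a directed cycle contributes exactly one incoming and one outgoing arrow to every vertex it passes through, reversing the whole cycle preserves the two-in-two-out ice rule at every site, which is the constraint-preserving property the abstract refers to.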

Abstract (English)


Reinforcement learning is a fast-growing research field owing to its outstanding exploration capability in dynamic environments. Inspired by psychological learning theories, the reinforcement learning framework contains a software agent with improvable policies that takes actions on the environment and attempts to achieve a goal according to the rewards it receives. A policy is a stochastic rule that governs the decision-making process of the agent and is updated based on the response of the environment. In this work, we apply the reinforcement learning framework to the spin ice model. Spin ice is a frustrated magnetic system with strong topological constraints on its low-energy configurations. In the physics community, it is well known that the loop Monte Carlo algorithm can update the system efficiently without breaking its local constraint. However, from a broader perspective, global update schemes can be problem-dependent and require customized algorithm design. We therefore exploit a reinforcement learning method that parameterizes the transition operator with neural networks. By extending the Markov chain to a Markov decision process, the algorithm can adaptively search for a global update policy through its interactions with the physical model. It may serve as a general framework for the search of update patterns.
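The step from a Markov chain to a Markov decision process can be sketched as follows. This is a REINFORCE-style skeleton in PyTorch; the `TransitionPolicy` network, the two-way action space, the `ToyWalkEnv` stand-in environment, and its reward are all hypothetical placeholders, since the abstract does not specify the thesis's actual architecture, state encoding, or reward design:

```python
import torch
import torch.nn as nn

class TransitionPolicy(nn.Module):
    """A learnable Monte Carlo transition operator: maps a feature vector
    describing the current state to a distribution over candidate moves."""
    def __init__(self, n_features, n_actions=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

class ToyWalkEnv:
    """Stand-in MDP so the skeleton runs end to end: random feature vectors
    for states, a fixed horizon, and reward 1 for finishing an episode."""
    def __init__(self, n_features=16, horizon=10):
        self.n_features, self.horizon = n_features, horizon

    def reset(self):
        self.t = 0
        return torch.randn(self.n_features)

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return torch.randn(self.n_features), (1.0 if done else 0.0), done

def reinforce_episode(policy, env, optimizer, gamma=0.99):
    """Roll out one episode of the MDP and take one REINFORCE step:
    ascend the policy gradient weighted by discounted returns."""
    state, done = env.reset(), False
    log_probs, rewards = [], []
    while not done:
        dist = policy(state)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, reward, done = env.step(int(action))
        rewards.append(reward)
    g, returns = 0.0, []
    for r in reversed(rewards):          # discounted return-to-go
        g = r + gamma * g
        returns.append(g)
    returns = torch.tensor(list(reversed(returns)))
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

env = ToyWalkEnv()
policy = TransitionPolicy(env.n_features)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(50):                      # a few illustrative training episodes
    reinforce_episode(policy, env, optimizer)
```

In a real setting, `ToyWalkEnv` would be replaced by the spin-ice system itself: the state would encode the current configuration and walker position, the actions would extend the loop, and the reward would favor constraint-preserving, efficient global updates.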
