In recent years, many reinforcement learning (RL) methods have been proposed and applied to a variety of problems, in which agents acquire policies that maximize their total reward. Observing that an agent's policy can be improved effectively by a supervised learning mechanism trained on stored records of the agent's behavior and rewards, we previously proposed a system that improves an RL agent's policy with a mixture model of Bayesian networks. This paper employs two types of mixture models and introduces a new technique that allows agents to adapt to dynamic environments. We investigated the adaptability of our system to environmental changes and compared the properties of the new technique with those of the previous one.
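The core idea of policy improvement by supervised learning on stored behavior and reward data can be illustrated with a minimal sketch. The snippet below is not the paper's method: in place of a mixture of Bayesian networks it fits the simplest possible model, a single state-to-action conditional table estimated from the highest-reward stored episodes, and all function names and the toy environment are hypothetical.

```python
import random
from collections import defaultdict

def collect_episode(policy, env_step, start_state, length=10):
    """Roll out one episode; return its (state, action) pairs and total reward."""
    state, transitions, total = start_state, [], 0.0
    for _ in range(length):
        action = policy(state)
        next_state, reward = env_step(state, action)
        transitions.append((state, action))
        total += reward
        state = next_state
    return transitions, total

def fit_policy(episodes, top_frac=0.5):
    """Supervised step: estimate P(action | state) by counting over the
    best-rewarded fraction of the stored episodes (a stand-in for the
    mixture-of-Bayesian-networks model), then act greedily on it."""
    episodes = sorted(episodes, key=lambda e: e[1], reverse=True)
    keep = episodes[: max(1, int(len(episodes) * top_frac))]
    counts = defaultdict(lambda: defaultdict(int))
    for transitions, _ in keep:
        for state, action in transitions:
            counts[state][action] += 1
    def policy(state):
        if state in counts:  # greedy action under the learned conditional
            return max(counts[state], key=counts[state].get)
        return random.choice([0, 1])  # unseen state: fall back to random
    return policy

# Toy two-state environment: reward 1 when the action matches the state.
random.seed(0)
env_step = lambda s, a: (random.randint(0, 1), 1.0 if a == s else 0.0)
rand_policy = lambda s: random.randint(0, 1)
stored = [collect_episode(rand_policy, env_step, random.randint(0, 1))
          for _ in range(200)]
improved = fit_policy(stored)
```

In this toy setting the improved policy learns to echo the state, because matching actions dominate the high-reward episodes it was fitted on; adapting to a dynamic environment would amount to refitting on a sliding window of recent episodes.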