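Disease self-diagnosis has attracted growing attention in recent years, and it matters in regions where medical resources are plentiful as well as in regions where they are scarce. In an era of abundant online resources, patients can look up medical information on the internet at any time, but the accuracy and relevance of that information are often questionable. A good disease diagnosis system must diagnose both correctly and quickly, yet these two properties conflict with each other, so finding an inquiry strategy that balances them is an important problem. In this work, we treat disease diagnosis as a sequential decision-making process and propose a deep reinforcement learning algorithm, RE^2, to improve online disease diagnosis systems. In medical data, a disease usually presents with only a few symptoms; that is, the symptoms (features) are very sparse. Previous methods neither mention this problem nor address it specifically, so their diagnostic results are poor when the number of diseases is large. In RE^2, we therefore combine reward shaping and feature reconstruction to solve this problem. Reward shaping guides the agent toward more opportunities to explore symptoms the patient actually has, while feature reconstruction guides the agent to learn the dependencies between features. Combining the two gives the agent a higher chance of asking about the patient's symptoms, substantially improves the diagnostic results, and achieves the best results across different experimental settings.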
Self-diagnosis of disease has become increasingly important in recent years, both in areas with sufficient medical resources and in areas without them. Although patients can conveniently search for medical information on the internet, the results are often inaccurate or irrelevant. A robust self-diagnosis system must provide both accurate and fast diagnosis, but these two properties conflict with each other, so the inquiry strategy has to balance them. In addition, a disease typically presents with only a few symptoms in any given patient; that is, the feature space is sparse. Previous methods do not consider this sparse-feature problem and therefore perform poorly when the number of possible diseases is large. In this work, we formulate disease diagnosis as a sequential decision-making process and propose a reinforcement learning algorithm, RE^2, to improve the performance of self-diagnosis. To overcome the sparse-feature problem, RE^2 combines a reward shaping technique with a reconstruction technique. Reward shaping guides the search toward symptoms that the patient actually has, and reconstruction guides the agent to learn the correlations between symptoms. Together, they yield symptom queries that elicit key positive responses from a patient with high probability. Consequently, the agent obtains substantially better diagnoses and achieves state-of-the-art results in different experimental settings.
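To make the two ideas concrete, the following is a minimal Python sketch, not the RE^2 implementation. It assumes a three-valued state encoding (unknown, negative, positive), a potential-based form of reward shaping whose potential counts the positive symptoms found so far, and a binary cross-entropy reconstruction loss. The names `potential`, `shaped_reward`, `query`, and `reconstruction_loss`, and the values of `gamma` and `lam`, are hypothetical choices made for illustration only.

```python
import math
from typing import List, Sequence

# Assumed state encoding: one entry per symptom.
UNKNOWN, NEGATIVE, POSITIVE = 0, -1, 1


def potential(state: Sequence[int], lam: float) -> float:
    """Assumed potential: lam times the number of symptoms known to be positive."""
    return lam * sum(1 for s in state if s == POSITIVE)


def shaped_reward(state: Sequence[int], next_state: Sequence[int],
                  env_reward: float, gamma: float = 0.99,
                  lam: float = 0.25) -> float:
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return env_reward + gamma * potential(next_state, lam) - potential(state, lam)


def query(state: Sequence[int], patient: Sequence[int], symptom: int) -> List[int]:
    """Ask about one symptom and record the patient's answer in a new state."""
    next_state = list(state)
    next_state[symptom] = POSITIVE if patient[symptom] else NEGATIVE
    return next_state


def reconstruction_loss(predicted: Sequence[float], patient: Sequence[int]) -> float:
    """Mean binary cross-entropy between predicted symptom probabilities
    and the patient's full (mostly zero) symptom vector."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(predicted, patient):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(patient)


# Hypothetical patient with 8 possible symptoms, only 2 of them present.
patient = [0, 1, 0, 0, 0, 1, 0, 0]
start = [UNKNOWN] * len(patient)

hit = query(start, patient, symptom=1)    # the patient has symptom 1
miss = query(start, patient, symptom=0)   # the patient lacks symptom 0

bonus_hit = shaped_reward(start, hit, env_reward=0.0)    # positive bonus
bonus_miss = shaped_reward(start, miss, env_reward=0.0)  # no bonus
```

Under these assumptions, a query that receives a positive answer earns a small bonus even before any diagnosis is made, which offsets how rarely positive answers occur when symptoms are sparse. The reconstruction loss would be added to the agent's training objective as an auxiliary term, so that a partially observed state is pushed to predict the patient's remaining symptoms.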