Interactive Spoken Content Retrieval Using Deep Reinforcement Learning and a Trainable User Simulator

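This thesis focuses on the interactive retrieval of spoken content and on improving the user simulator used in interactive retrieval systems. Because spoken content is difficult to browse quickly and speech recognition errors introduce a high degree of uncertainty, interaction between the user and the system has a critical effect on a spoken content retrieval system. In an interactive retrieval system, the system chooses among different actions to interact with the user and obtain more information, so it is crucial that the system select the most efficient action given the current state. In previous work, the interactive retrieval system was modeled as a Markov decision process (MDP) and trained with the Deep Q-Network (DQN) algorithm, using a rule-based user simulator whose rules were set heuristically. However, building a reliable user simulator that closely matches real user behavior is a major challenge. This thesis proposes a user simulator that can be trained jointly with the interactive retrieval system, replacing the rule-based simulator and improving the performance of the interactive spoken content retrieval system. Experiments show that the jointly trained user simulator not only obtains larger rewards than the rule-based one but also behaves more like a real user in human evaluation.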

Keywords

Deep reinforcement learning; spoken content retrieval; interactive information retrieval

Parallel Abstract

User-machine interaction is crucial for information retrieval, especially for spoken content retrieval, because spoken content is difficult to browse and speech recognition has a high degree of uncertainty. In interactive retrieval, the machine takes different actions to interact with the user to obtain better retrieval results, so selecting the most efficient action is critical. In previous work, deep Q-learning techniques were proposed to train an interactive retrieval system, but they relied on a hand-crafted user simulator, and building a reliable user simulator is difficult. In this thesis, we further improve the interactive spoken content retrieval framework by proposing a learnable user simulator that is jointly trained with the interactive retrieval system, making the hand-crafted user simulator unnecessary. The experimental results show that the learned simulated users not only achieve larger rewards than the hand-crafted ones but also act more like real users.
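As a rough illustration of the deep Q-learning idea mentioned in the abstract, the sketch below shows epsilon-greedy action selection and the one-step Q-learning target for a retrieval agent. It is a minimal tabular stand-in, not the thesis's implementation: the action names, function names, and hyperparameters are all hypothetical, and the actual system uses a neural network and a learned user simulator that are not specified here.

```python
import random

# Hypothetical action set; the abstract does not list the system's actual actions.
ACTIONS = ["ask_for_keyword", "ask_for_clarification", "show_results"]


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise pick the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_values, action, target, lr):
    """Move Q(s, a) toward the target; a tabular stand-in for a DQN gradient step."""
    updated = list(q_values)
    updated[action] += lr * (target - updated[action])
    return updated
```

In this sketch the reward would come from the (simulated) user's response to the chosen action, for example an improvement in retrieval quality minus a cost for the extra interaction turn; that reward design is likewise an assumption.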

Parallel Keywords

Deep Reinforcement Learning; Spoken Content Retrieval; Interactive Information Retrieval
