
Spoken Document Retrieval Using Neural Network Techniques

On the Use of Neural Network Modeling Techniques for Spoken Document Retrieval

Abstract


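In recent years, as multimedia content containing speech information has continued to grow, spoken document retrieval has become a popular topic that attracts research from many academics and practitioners. Beyond developing robust indexing mechanisms and effective retrieval models, modeling the query content accurately and efficiently also plays a critical role in improving spoken document retrieval performance. In view of this, this paper proposes a novel neural-network-based relevance-aware model that obtains a better query representation while avoiding the conventional, more time-consuming pseudo-relevance feedback procedure. Furthermore, we attempt to incorporate the concept of query intent classification into the proposed model framework to obtain more refined query representations. Preliminary experiments conducted on the TDT-2 spoken document collection demonstrate the utility of the proposed methods.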

Parallel Abstract


Due to the ever-increasing amount of publicly available multimedia associated with speech information, spoken document retrieval (SDR) has been an active area of research that attracts significant interest from both the academic and industrial communities. Beyond the continuing effort to develop robust indexing and effective retrieval methods that quantify the degree of relevance between a query and a spoken document, accurately and efficiently modeling the query content plays a vital role in improving SDR performance. In view of this, we present in this paper a novel neural relevance-aware model (NRM) that infers an enhanced query representation while avoiding the conventional, time-consuming pseudo-relevance feedback (PRF) process. In addition, we incorporate the notion of query intent classification into our proposed NRM modeling framework to obtain more sophisticated query representations. Preliminary experiments conducted on the TDT-2 collection confirm the utility of our methods relative to a few state-of-the-art methods.
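To make concrete why conventional PRF is considered time-consuming, the sketch below shows a generic Rocchio-style PRF loop over bag-of-words vectors. It is not the NRM proposed in the paper, whose architecture the abstract does not specify. The function names, the toy documents, and the `alpha`, `beta`, and `top_k` values are all illustrative assumptions. The point to notice is that PRF needs a first retrieval pass just to build the expanded query, and then a second pass to produce the final ranking.

```python
from collections import Counter
import math


def tf_vector(text):
    """Bag-of-words term-frequency vector (whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices sorted by decreasing similarity to the query."""
    return sorted(range(len(doc_vecs)),
                  key=lambda i: cosine(query_vec, doc_vecs[i]),
                  reverse=True)


def prf_expand(query_vec, doc_vecs, top_k=2, alpha=1.0, beta=0.5):
    """Rocchio-style expansion (illustrative parameter values).

    First pass: retrieve the top_k documents and assume they are relevant.
    Then add their averaged term weights to the original query.
    """
    first_pass = rank(query_vec, doc_vecs)[:top_k]
    expanded = Counter({t: alpha * w for t, w in query_vec.items()})
    if not first_pass:
        return expanded
    for i in first_pass:
        for term, weight in doc_vecs[i].items():
            expanded[term] += beta * weight / len(first_pass)
    return expanded


# Toy collection (hypothetical text standing in for recognized transcripts).
docs = [
    "election results announced by the president",
    "president speaks about election reform and voting",
    "storm damages coastal towns",
]
doc_vecs = [tf_vector(d) for d in docs]

query = tf_vector("election")
expanded = prf_expand(query, doc_vecs, top_k=2)   # first retrieval pass
second_pass = rank(expanded, doc_vecs)            # second retrieval pass
```

In this toy run the expanded query picks up terms such as `president` that were absent from the original one-word query. A model that infers the enhanced representation directly from the query, as the abstract describes for NRM, would avoid the first retrieval pass at query time.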

