A Study on the Use of Machine Learning Methods for Spoken Document Retrieval

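This thesis presents a preliminary discussion of machine-learning methods applied to information retrieval, known as learning to rank. It analyzes and experiments with the machine-learning models and concepts used in information retrieval in recent years, together with the features they rely on, including lexical features, proximity features, and probabilistic features. The thesis also extends these methods to spoken document retrieval. The experiments use the Chinese portion of the TDT (Topic Detection and Tracking) corpora, one of the standard benchmark corpora formerly used at TREC (Text REtrieval Conference) for public evaluation of spoken document retrieval systems. The collection comprises two corpora, TDT-2 and TDT-3, and provides a large amount of news material with rich annotations, such as topic labels and transcripts, for research on spoken document retrieval. To develop informative spoken-document features more effectively, the thesis also uses the National Taiwan Normal University (NTNU) large vocabulary continuous speech recognition (LVCSR) system for mainland-accented Chinese as the transcription platform; the word graphs it produces serve as the main basis for extracting features unique to spoken documents. In addition, we consider the problem of unbalanced training data in information retrieval and propose a countermeasure. Finally, preliminary experimental results show that the model trained with the pairwise method RankNet achieves better retrieval performance than the model trained with the pointwise method SVM.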

Keywords

Information Retrieval; Learning to Rank; Speech Recognition

Parallel Abstract

This thesis investigates machine-learning approaches to information retrieval (IR), namely learning-to-rank algorithms, with emphasis on their theoretical foundations and the features they use, such as lexical, proximity, and probabilistic features. We also consider the application of these approaches to spoken document retrieval (SDR). All experiments were conducted on the Topic Detection and Tracking corpora (specifically TDT-2 and TDT-3), benchmark collections widely adopted for SDR evaluations because they contain tens of hours of mainland-accented Chinese broadcast news documents with topic labels and orthographic transcripts. To discover more useful speech-related features for SDR and to analyze the problems caused by speech recognition errors, we built a large vocabulary continuous speech recognition (LVCSR) system that outputs, for each broadcast news document, a word lattice consisting of multiple recognition hypotheses. We also address the problem of training machine-learning retrieval models on unbalanced training data and propose a remedy for it. Finally, preliminary experimental results suggest that the RankNet-based retrieval model outperforms the support vector machine (SVM) based retrieval model on the SDR task studied in this thesis.
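To make the pairwise idea behind RankNet concrete, the following is a minimal sketch of its standard pairwise cross-entropy loss, not the thesis's actual implementation. It assumes binary relevance labels (1 = relevant, 0 = non-relevant) and that each document already has a real-valued score from some scoring model; the function names `ranknet_pair_loss` and `ranknet_query_loss` are hypothetical and chosen only for illustration.

```python
import math


def ranknet_pair_loss(s_rel, s_non):
    """Loss for one (relevant, non-relevant) pair with target probability 1.

    Computes log(1 + exp(-(s_rel - s_non))) in a numerically stable way.
    """
    diff = s_rel - s_non
    if diff > 0:
        return math.log1p(math.exp(-diff))
    # For diff <= 0, rewrite as -diff + log(1 + exp(diff)) to avoid overflow.
    return -diff + math.log1p(math.exp(diff))


def ranknet_query_loss(scores, labels):
    """Average pairwise loss over all (relevant, non-relevant) pairs of one query."""
    rel = [s for s, y in zip(scores, labels) if y == 1]
    non = [s for s, y in zip(scores, labels) if y == 0]
    pairs = [(r, n) for r in rel for n in non]
    if not pairs:
        # No comparable pair exists, so the query contributes no loss.
        return 0.0
    return sum(ranknet_pair_loss(r, n) for r, n in pairs) / len(pairs)


if __name__ == "__main__":
    # One relevant document scored above two non-relevant ones.
    print(ranknet_query_loss([2.0, 0.5, -1.0], [1, 0, 0]))
```

A pointwise method, by contrast, would fit each document's label independently of the other documents retrieved for the same query. The pairwise loss depends only on score differences within a query, which is one reason such methods are often discussed alongside the unbalanced ratio of relevant to non-relevant documents.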

Parallel Keywords

Information Retrieval; Learning to Rank; Speech Recognition

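Cited By

張鈺玫 (2010). A Study on the Use of Multiple Discriminative Models and Feature Information for Spoken Document Summarization [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315212968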
