

Robust Unsupervised Topic-Based Language Model Adaptation

Advisor: Lin-shan Lee (李琳山)


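The main contribution of this thesis is a topic-analysis-based language model adaptation method built on Latent Dirichlet Allocation (LDA). We use Probabilistic Latent Semantic Analysis (PLSA) to automatically cluster a heterogeneous text corpus into a number of latent topics, and use the result to initialize our LDA model. With the final LDA model, we build topic-specific text corpora sentence by sentence, and these corpora are used to estimate topic language models. When applying language model adaptation in N-best rescoring, we combine these topic language models with a background (that is, non-topic) language model by interpolation. The thesis proposes several mechanisms that make topic inference more robust and less easily distorted by recognition errors; we also use metadata for segmentation and perform show-level language model adaptation. Finally, results on multi-source Mandarin Chinese data from the U.S. Department of Defense GALE program show that the method is more effective than other state-of-the-art language model adaptation approaches.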


We present a novel topic mixture-based language model adaptation approach that uses Latent Dirichlet Allocation (LDA). We use Probabilistic Latent Semantic Analysis (PLSA) to automatically cluster a heterogeneous training corpus, and then train an LDA model using the resulting topic-document assignments. Using this LDA model, we construct fine-grained topic-specific corpora at the utterance level, which we use to train topic language models (LMs). During adaptation, these topic LMs are interpolated with a background language model under an N-best rescoring framework. We describe several techniques for hardening LDA topic inference against first-pass recognition errors, and demonstrate the effectiveness of metadata-based segmentation when combined with show-level language model adaptation. In experiments on multi-genre Mandarin Chinese data from the GALE project, the approach improves on state-of-the-art adaptation schemes.
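The interpolation and N-best rescoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`interpolated_prob`, `rescore_nbest`), the toy unigram tables, the fixed background weight of 0.5, and the topic weights are all assumptions chosen for illustration. In practice the topic weights would come from LDA inference on the first-pass hypotheses, and the component models would be full n-gram LMs.

```python
import math

def unigram_lm(table, floor=1e-6):
    """Wrap a word->probability table as an LM callable (history is ignored here)."""
    return lambda word, history: table.get(word, floor)

def interpolated_prob(word, history, background_lm, topic_lms,
                      topic_weights, background_weight):
    """P(w|h) = lam * P_bg(w|h) + (1 - lam) * sum_k gamma_k * P_k(w|h)."""
    topic_part = sum(g * lm(word, history)
                     for g, lm in zip(topic_weights, topic_lms))
    return (background_weight * background_lm(word, history)
            + (1.0 - background_weight) * topic_part)

def rescore_nbest(nbest, background_lm, topic_lms, topic_weights,
                  background_weight=0.5, lm_scale=1.0):
    """Rescore (words, acoustic_logprob) pairs; return (score, words) best first."""
    scored = []
    for words, acoustic_logprob in nbest:
        lm_logprob = 0.0
        for i, word in enumerate(words):
            history = tuple(words[max(0, i - 1):i])  # bigram-style history
            lm_logprob += math.log(interpolated_prob(
                word, history, background_lm, topic_lms,
                topic_weights, background_weight))
        scored.append((acoustic_logprob + lm_scale * lm_logprob, words))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored

# Toy models (hypothetical numbers): one background LM and two topic LMs.
background_lm = unigram_lm({"the": 0.5, "stock": 0.1, "stalk": 0.1, "rose": 0.3})
finance_lm = unigram_lm({"the": 0.4, "stock": 0.4, "stalk": 0.01, "rose": 0.19})
garden_lm = unigram_lm({"the": 0.4, "stock": 0.01, "stalk": 0.4, "rose": 0.19})
topic_lms = [finance_lm, garden_lm]

# Two competing hypotheses; the first has the better acoustic score.
nbest = [(["the", "stalk", "rose"], -10.0),
         (["the", "stock", "rose"], -10.2)]

# Assumed topic posterior for this utterance: mostly "finance".
finance_weights = [0.9, 0.1]
best_score, best_words = rescore_nbest(nbest, background_lm, topic_lms,
                                       finance_weights)[0]
print(" ".join(best_words))
```

With the topic weights leaning towards the finance topic, the adapted LM score outweighs the small acoustic advantage of the competing hypothesis. Swapping the weights to favour the garden topic leaves the acoustically preferred hypothesis on top.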


