
使用鑑別式語言模型於語音辨識結果重新排序

Applying Discriminative Language Models to Reranking of M-best Speech Recognition Results

Advisor: 陳柏琳

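Abstract


A language model captures the regularities of a language. In speech recognition, it helps mitigate the problems caused by acoustic confusability, guides the recognizer's search among multiple candidate word strings, and quantifies the acceptability of the final word string the recognizer outputs. However, language varies across time and domain, so a fixed language model cannot meet practical needs. Language model adaptation offers a solution: a small amount of contemporaneous or in-domain adaptation data is used to adjust the language model and improve its performance. The discriminative language model is one such adaptation method. It first obtains a set of features, each with a corresponding feature weight, to represent the sentences or word strings of a language. On the basis of these features and their weights, it builds a scoring mechanism that reranks the multiple recognition hypotheses produced by a baseline recognizer, in the hope that the most correct word sequence becomes the final recognition result. This work proposes augmenting the features of the discriminative language model with the output of an automatic keyword extraction method. The method extracts keywords by counting how often characters or words repeatedly co-occur in a corpus. Its advantage is that it does not depend on a lexicon, so it can extract newly coined terms and terms absent from the lexicon. This property might benefit discriminative training, but the experimental results show no significant improvement.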
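The reranking step described in the abstract can be illustrated with a minimal sketch. It assumes a linear scoring function over n-gram count features plus the baseline recognizer's score, and it uses a perceptron-style update to adjust the feature weights. The abstract does not state which features or which training algorithm the thesis uses, so the function names, the n-gram features, and the perceptron update below are all illustrative assumptions.

```python
from collections import Counter


def ngram_features(words, max_n=2):
    """Count the unigrams and bigrams of a word sequence (assumed feature set)."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            feats[tuple(words[i:i + n])] += 1
    return feats


def score(hyp, weights, base_weight=1.0):
    """Linear score: scaled baseline score plus weighted feature counts."""
    words, base_score = hyp
    feats = ngram_features(words)
    return base_weight * base_score + sum(
        weights.get(f, 0.0) * c for f, c in feats.items()
    )


def rerank(m_best, weights, base_weight=1.0):
    """Sort the M-best hypotheses from highest to lowest score."""
    return sorted(m_best, key=lambda h: score(h, weights, base_weight), reverse=True)


def perceptron_update(m_best, oracle, weights, base_weight=1.0, lr=1.0):
    """One perceptron step (an assumed training rule, not taken from the thesis):
    if the top-ranked hypothesis is not the oracle, reward the oracle's
    features and penalize the features of the current top hypothesis."""
    top = rerank(m_best, weights, base_weight)[0]
    if top[0] != oracle[0]:
        for f, c in ngram_features(oracle[0]).items():
            weights[f] = weights.get(f, 0.0) + lr * c
        for f, c in ngram_features(top[0]).items():
            weights[f] = weights.get(f, 0.0) - lr * c
    return weights


# Toy 2-best list: (word sequence, baseline recognizer log-score).
m_best = [
    (["recognize", "speech"], -10.0),
    (["wreck", "a", "nice", "beach"], -9.5),
]
weights = perceptron_update(m_best, oracle=m_best[0], weights={})
best_words, _ = rerank(m_best, weights)[0]
print(" ".join(best_words))
```

In this toy case the baseline prefers the second hypothesis, and a single update is enough to move the oracle hypothesis to the top of the list. A real system would iterate over many utterances and typically average the weights.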

Parallel Abstract


A language model (LM) is designed to represent the regularity of a given language. When applied to speech recognition, it can be used to constrain the acoustic analysis, guide the search through multiple candidate word strings, and quantify the acceptability of the final word string output by a recognizer. However, the regularity of a language changes over time and across domains, so a static language model cannot meet realistic demands. Language model adaptation offers a solution: a small amount of contemporaneous or in-domain data is used to adapt the original language model for better performance. The discriminative language model is one of the representative approaches to language model adaptation in speech recognition. It first derives a set of indicative features, each with its own weight, to characterize sentences or word strings in a language, and then builds a sentence scoring mechanism on the basis of these features and their associated weights. This mechanism is used to rerank the M-best recognition results so that the most correct candidate word string is expected to be ranked first. This paper proposes an approach that takes the results of a fast keyword extraction method as additional features for the discriminative model. The method extracts keywords by counting repeated co-occurrences of characters or words in the speech corpus, so that the keywords may capture regularities of the language being used. A useful property is that it extracts keywords without a lexicon, so it can find new keywords as well as keywords that are absent from the lexicon or that contain lexicon words. This property may be useful for discriminative language modeling, but the experiments show that it yields no significant improvement.
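As a rough illustration of lexicon-free keyword extraction by counting repeated co-occurrences, the sketch below counts how often each run of adjacent characters recurs in a corpus and keeps the runs seen at least `min_count` times. This is a simplified stand-in and not the thesis's actual extraction method, which the abstract does not specify. The function names, the length bounds, the threshold, and the `("KW", keyword)` feature naming are all assumptions made for the example.

```python
from collections import Counter


def repeated_ngrams(sentences, min_n=2, max_n=4, min_count=2):
    """Return character n-grams that occur at least `min_count` times.

    No lexicon is consulted: candidates are simply runs of adjacent
    characters, so unseen or newly coined terms can surface as well.
    """
    counts = Counter()
    for s in sentences:
        for n in range(min_n, max_n + 1):
            for i in range(len(s) - n + 1):
                counts[s[i:i + n]] += 1
    return {g: c for g, c in counts.items() if c >= min_count}


def keyword_features(hyp_text, keywords):
    """Count (non-overlapping) occurrences of each keyword in a hypothesis.

    The resulting counts could be appended to the feature vector of a
    discriminative reranker as extra features (hypothetical naming).
    """
    feats = {}
    for k in keywords:
        c = hyp_text.count(k)
        if c:
            feats[("KW", k)] = c
    return feats


# Toy corpus of character strings; real input would be transcribed text.
corpus = ["abcx", "yabc"]
keywords = repeated_ngrams(corpus, min_n=2, max_n=3)
print(keyword_features("zabcabc", keywords))
```

Plain frequency counting tends to return many overlapping fragments of the same term, so a practical extractor would need some filtering or boundary criterion on top of this.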


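Cited By


劉家妏 (2010). A Study of Various Discriminative Language Models Applied to Speech Recognition [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315213184
陳冠宇 (2010). Improvements to the Use of Topic Models in Speech Recognition [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315213186
賴敏軒 (2011). An Empirical Study of Various Discriminative Language Models for Speech Recognition [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315254524
黃邦烜 (2012). A Study of Recurrent Neural Network Language Models Using Additional Information for Speech Recognition [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315300315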
