A Study on Margin-Based Discriminative Training of Acoustic Models (使用邊際資訊於鑑別式聲學模型訓練)

Advisor: Berlin Chen (陳柏琳)

Abstract


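This thesis investigates representative discriminative training methods for acoustic models developed in recent years and the consistency underlying them. It also extends a range of margin-based data selection methods to improve discriminative acoustic model training for Mandarin large vocabulary continuous speech recognition. First, to examine recent discriminative training methods more closely, we summarize the consistency underlying their objective functions. Second, we discuss various ways of applying margin information to discriminative training, which effectively reduce the recognition error rate in large vocabulary continuous speech recognition. Third, we combine the soft-margin and boosting-based methods so that the range of data selection is both more clearly defined and more flexible, yielding statistics that carry more discriminative information. In the implementation, we take utterance-level data selection as an example to better understand how different statistics affect the effectiveness of discriminative training. Finally, using a Public Television Service news corpus (公視新聞語料) as the experimental platform, the results give preliminary confirmation that the proposed approaches can, to some extent, alleviate the over-training problem faced by earlier methods.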

Parallel Abstract (English)


This thesis investigates the consistency properties underlying the most popular algorithms for discriminative training of acoustic models. Various margin- and boosting-based training data selection methods are also explored extensively in conjunction with these algorithms for Mandarin large vocabulary continuous speech recognition (LVCSR). First, to provide an in-depth evaluation of recently developed discriminative acoustic model training algorithms, we attempt to derive the consistency properties from their individual training objectives. Second, we compare different margin- and boosting-based methods that make acoustic model training concentrate on the more discriminative training data, so as to improve LVCSR performance. Furthermore, we pair the soft-margin method with the boosting-based methods to make better use of discriminative statistics, with utterance-level data selection as the concrete implementation. All experiments are conducted on a Mandarin broadcast news corpus compiled in Taiwan, and the results suggest that the proposed approaches can relieve the over-training problem to a certain extent.


