
Acoustic Modeling of Hidden Conditional Random Field for Speech Recognition

Advisor: 洪維廷

Abstract


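This thesis examines the Hidden Conditional Random Field (HCRF) as an acoustic model for speech recognition, compares it analytically with the conventional Hidden Markov Model (HMM), and proposes a novel HCRF training method that incorporates a discriminative criterion. In continuous syllable recognition experiments on the TEST500 speech database, the HCRF achieves a higher recognition rate and a much shorter recognition response time than the HMM, which makes it better suited to real-time recognition. For HCRF training, we also compare the discriminative criterion with the conventional maximum likelihood criterion and find that HCRF models trained with the discriminative criterion have greater discriminative power. We train an HMM to convergence under the discriminative criterion, convert its parameters into initial HCRF parameters, and then continue training the HCRF under the discriminative criterion. This yields our best HCRF acoustic model: compared with an HMM trained under the maximum likelihood criterion, it improves syllable accuracy by 10.7% relative. We also examine the case in which both the feature parameters and the acoustic model are converted to fixed point. There, the HCRF again outperforms the HMM in both response time and syllable accuracy, and it also performs well in a personal-name recognition experiment when combined with beam search.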
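The abstract does not spell out how the converged HMM parameters are turned into initial HCRF parameters. As a minimal sketch, assuming each HMM state has a single diagonal-covariance Gaussian and the HCRF uses state-occupancy, first-moment, second-moment, and transition features, the commonly used mapping can be written as follows. The function names and the feature set are illustrative assumptions, not taken from the thesis.

```python
import math


def hmm_state_to_hcrf(mean, var):
    """Map one diagonal-Gaussian HMM state to HCRF weights (illustrative)."""
    # Weight on the first-moment feature x_d.
    lam_m1 = [m / v for m, v in zip(mean, var)]
    # Weight on the second-moment feature x_d ** 2.
    lam_m2 = [-0.5 / v for v in var]
    # Weight on the state-occupancy feature: all terms independent of x.
    lam_occ = -0.5 * sum(
        math.log(2.0 * math.pi * v) + m * m / v for m, v in zip(mean, var)
    )
    return lam_occ, lam_m1, lam_m2


def transition_weights(trans_probs):
    """Transition weights are the log transition probabilities."""
    return [
        [math.log(p) if p > 0.0 else float("-inf") for p in row]
        for row in trans_probs
    ]


def hcrf_frame_score(x, lam_occ, lam_m1, lam_m2):
    """Linear score of one frame x under the converted state weights."""
    return lam_occ + sum(
        a * xd + b * xd * xd for a, b, xd in zip(lam_m1, lam_m2, x)
    )


def gaussian_log_likelihood(x, mean, var):
    """Log density of a diagonal-covariance Gaussian, for comparison."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xd - m) ** 2 / v)
        for xd, m, v in zip(x, mean, var)
    )
```

With this initialization, the HCRF frame score equals the HMM state's Gaussian log-likelihood, so the converted HCRF starts from the same scores as the HMM before further discriminative training changes the weights.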

Parallel Abstract


In this thesis, we adopt a Hidden Conditional Random Field (HCRF)-based approach to acoustic modeling for speech recognition and compare its performance with that of a traditional Hidden Markov Model (HMM) of the same structure. We propose a novel HCRF training algorithm that incorporates a discriminative training criterion. In continuous Mandarin syllable recognition on the TEST500 database, the HCRF-based approach outperforms the HMM in both accuracy and response time. Based on a series of related experiments, we consider the HCRF more suitable for real-time speech recognition systems. Next, we compare two methods for training the HCRF: one based on the maximum likelihood criterion and the other on a discriminative criterion. The results indicate that the discriminative approach outperforms maximum likelihood training. Finally, we investigate our HCRF-based system under fixed-point arithmetic and a limited beam size. The experimental results again show the advantages of the HCRF-based approach.

Keywords

HCRF; ASR

Cited By


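曾家宏 (2012). A study of thousand-speaker recognition based on hidden conditional random field models [Master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2012.00244
許順翔 (2009). A voice command system for resource-constrained devices based on hidden conditional random field acoustic models [Master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2009.00304
李秋芬 (2009). A robust training algorithm based on hidden conditional random field acoustic models [Master's thesis, Yuan Ze University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0009-2807200913023300
邢凱婷 (2009). A speaker identification algorithm based on hidden conditional random field speaker models [Master's thesis, Yuan Ze University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0009-2807200914245700
劉維宸 (2011). A speaker identification algorithm based on hidden conditional random field model adaptation [Master's thesis, Yuan Ze University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0009-2801201414583635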
