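This thesis examines the use of the Hidden Conditional Random Field (HCRF) as the acoustic model for speech recognition, compares it analytically with the conventional Hidden Markov Model (HMM), and proposes a novel HCRF training method that incorporates a discriminative criterion. In continuous syllable recognition experiments on the TEST500 speech database, HCRF achieved a higher recognition rate, and its recognition response time was far shorter than that of HMM, making it better suited to real-time recognition. For HCRF training, a comparison of the discriminative criterion with the conventional maximum likelihood criterion shows that HCRF models trained with the discriminative criterion are more discriminative. We train an HMM to convergence with the discriminative criterion, convert its parameters into initial HCRF parameters, and then continue training the HCRF with the discriminative criterion to obtain the best HCRF acoustic model; compared with the HMM trained with the maximum likelihood criterion, it yields a 10.7% relative improvement in syllable accuracy. The thesis also examines the case in which the feature parameters and acoustic models are converted to fixed-point representation, where HCRF outperforms HMM in both response time and syllable accuracy. In a personal-name recognition experiment combined with beam search, HCRF also achieves good results.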
In this thesis, we adopt a Hidden Conditional Random Field (HCRF)-based approach to acoustic modeling for speech recognition and compare its performance with that of the traditional Hidden Markov Model (HMM) with the same structure. We propose a novel HCRF training algorithm that incorporates a discriminative training criterion. In continuous Mandarin syllable recognition on the TEST500 database, the HCRF-based approach outperforms the HMM in both accuracy and response time. A series of related experiments suggests that HCRF is more suitable for real-time speech recognition systems. Next, we compare two methods for training HCRF: one based on the maximum likelihood criterion, the other on a discriminative criterion. The results indicate that the discriminative approach outperforms training under the maximum likelihood criterion. Finally, we investigate our HCRF-based system under fixed-point arithmetic and limited beam-size conditions. The related experimental results again show the advantages of the HCRF-based approach proposed in this thesis.
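The abstract mentions converting the parameters of a converged HMM into initial HCRF parameters. The following is a minimal sketch of one common way such a conversion can be done, and it is not necessarily the procedure used in this thesis. It assumes a single diagonal-covariance Gaussian per HMM state and HCRF feature functions consisting of a state-occupancy term plus first- and second-order statistics of the observation. All function names are hypothetical.

```python
import math


def gaussian_to_hcrf_weights(mean, var):
    """Map one diagonal-Gaussian HMM state to HCRF-style weights.

    Assumption: the HCRF state score is
        occ + sum_d(first[d] * x[d] + second[d] * x[d] ** 2),
    which equals the Gaussian log density when the weights are set as below.
    """
    occ = -0.5 * sum(
        math.log(2.0 * math.pi * v) + m * m / v for m, v in zip(mean, var)
    )
    first = [m / v for m, v in zip(mean, var)]   # weight on x[d]
    second = [-0.5 / v for v in var]             # weight on x[d] ** 2
    return occ, first, second


def hcrf_state_score(x, weights):
    """Linear score of observation x under the converted weights."""
    occ, first, second = weights
    return occ + sum(f * xi + s * xi * xi for f, s, xi in zip(first, second, x))


def gaussian_log_density(x, mean, var):
    """Reference log density of a diagonal Gaussian, for comparison."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
        for xi, m, v in zip(x, mean, var)
    )


if __name__ == "__main__":
    mean, var, x = [0.5, -1.0], [2.0, 0.25], [1.0, -0.5]
    weights = gaussian_to_hcrf_weights(mean, var)
    # The two values should agree up to floating-point rounding.
    print(hcrf_state_score(x, weights), gaussian_log_density(x, mean, var))
```

With weights initialized this way, the HCRF starts out scoring each frame exactly as the HMM state would, so further discriminative training begins from the HMM's operating point. Under the same assumptions, the transition weights would be initialized to the logarithms of the HMM transition probabilities.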