Minimum Classification Error Training of Hidden Conditional Random Fields for Speech and Speaker Recognition

Hidden conditional random fields (HCRFs) extend conditional random fields with hidden state variables and directly model the conditional probability of a label sequence given the observations. Compared with hidden Markov models, HCRFs offer several benefits for the acoustic modeling of speech signals. Prior work trained HCRFs with gradient-descent-based algorithms under the conditional maximum likelihood criterion. In this paper, we extend that methodology by training HCRFs under the minimum classification error (MCE) criterion. Specifically, we apply a generalized probabilistic descent (GPD)-based training algorithm within the HCRF framework to improve the discriminative capability of acoustic models for speech and speaker recognition. We evaluate the proposed approach on two tasks: speaker identification and Mandarin continuous syllable recognition. Results on the MAT2000 database confirm that the HCRF/GPD approach performs well in both speech recognition and speaker identification, regardless of the length of the test and training speech or the presence of noise. These results suggest that HCRF/GPD has potential for further development in acoustic modeling.
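
For orientation, the MCE/GPD recipe named in the abstract is commonly written as shown here. This is a sketch of the textbook formulation, not necessarily the exact one used in the paper; all symbols ($\lambda$, $f$, $g_k$, $d_k$, $\eta$, $\gamma$, $\theta$, $\epsilon_t$, $K$) are assumptions introduced for illustration.

```latex
\begin{align}
% HCRF posterior of class w given observations o, summing over hidden state sequences s
p(w \mid \mathbf{o}; \lambda) &= \frac{1}{Z(\mathbf{o}; \lambda)} \sum_{\mathbf{s}} \exp\{\lambda^{\top} f(w, \mathbf{s}, \mathbf{o})\},
\qquad
Z(\mathbf{o}; \lambda) = \sum_{w'} \sum_{\mathbf{s}} \exp\{\lambda^{\top} f(w', \mathbf{s}, \mathbf{o})\} \\
% Discriminant function for class k (assumed here to be the log posterior)
g_k(\mathbf{o}; \lambda) &= \log p(k \mid \mathbf{o}; \lambda) \\
% Misclassification measure: correct class k against a soft maximum over the K-1 competitors
d_k(\mathbf{o}; \lambda) &= -g_k(\mathbf{o}; \lambda) + \frac{1}{\eta} \log\!\left[ \frac{1}{K-1} \sum_{j \neq k} \exp\{\eta\, g_j(\mathbf{o}; \lambda)\} \right] \\
% Smoothed 0-1 loss (sigmoid with slope gamma and offset theta)
\ell_k(\mathbf{o}; \lambda) &= \frac{1}{1 + \exp\{-\gamma\, d_k(\mathbf{o}; \lambda) + \theta\}} \\
% GPD update: one stochastic gradient step per training token o_t with step size epsilon_t
\lambda_{t+1} &= \lambda_t - \epsilon_t \, \nabla_{\lambda}\, \ell_k(\mathbf{o}_t; \lambda) \big|_{\lambda = \lambda_t}
\end{align}
```

Under these assumptions, $d_k > 0$ indicates a misclassification of a token from class $k$, so driving the sigmoid loss down approximates minimizing the empirical error count rather than maximizing the conditional likelihood.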

Keywords

speech recognition; speaker recognition; hidden conditional random field; discriminative training algorithm; Mandarin syllable recognition

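Cited by

魏輔辰 (2012). A study of speech recognition using cepstral-domain microphone array beamforming [Master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2012.00278

曾家宏 (2012). A study of thousand-speaker identification based on hidden conditional random field models [Master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2012.00244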
