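Traditionally, when microphone array beamforming is combined with adaptive filtering to handle reverberation, the filters are optimized with respect to the signal waveform rather than with respect to the speech recognizer. This thesis proposes a cepstral-domain microphone array beamforming algorithm. Instead of adjusting the adaptive filters to optimize the signal waveform, it adjusts them to maximize the likelihood of the input signal under the speech model, using the speech recognition criterion to adapt the filter parameters. The speech models used in the adaptation phase and the testing phase are unified in the cepstral domain, so that the mismatch between the array filters and the speech recognizer is minimized, with the aim of obtaining the best recognition performance when the speech signal is corrupted by reverberation. In the experiments, under the same convergence stopping criterion, the error rate of HCRF in the cepstral domain is up to 8.16% lower than that of HMM. In terms of average time per iteration, HCRF in the cepstral domain requires up to 3.89% less time than HMM.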
Generally, microphone arrays with adaptive beamforming are used to suppress reverberation by optimizing the signal waveform, rather than by optimizing the parameters used for recognition. In this thesis, we propose a Cepstral Domain Array Beamformer for Speech Recognition, which changes the system goal from optimizing the signal waveform to optimizing the likelihood of the signal under the speech model. We use the speech recognition criterion to adapt the filter parameters, and we use a single cepstral-domain speech model for both the training phase and the testing phase. The purpose of this thesis is to investigate the improvement obtained by using cepstral-domain models in both the training and testing phases.