Robust Speech Recognition Based on Interaural Time Difference

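This thesis proposes a method that uses the maximum confidence measure to estimate the interaural time difference threshold for dual-microphone robust speech recognition. The method tightly couples the speech recognizer with microphone-array noise masking. Its main features are: (1) the confidence measure computed from the speech models and the filler models serves as an indicator of how well speech and noise have been separated, and (2) based on this criterion, an expectation-maximization algorithm automatically finds the dual-microphone array noise-masking parameters that correspond to the maximum confidence measure. We evaluate the method on a voice command experiment. The results show a marked improvement in recognition rate both at low signal-to-noise ratios and when the interfering noise source is close to the microphones.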
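As a rough illustration of the noise-masking step, the following Python sketch builds a binary time-frequency mask from the phase difference between two microphone channels. It is a sketch under stated assumptions, not the thesis's implementation: the function name `itd_mask`, the STFT-domain inputs, and the hand-set threshold `itd_threshold_s` are all hypothetical, and the thesis estimates the threshold automatically rather than fixing it.

```python
import numpy as np


def itd_mask(left_spec, right_spec, freqs_hz, itd_threshold_s):
    """Binary mask that keeps bins whose estimated |ITD| is within a threshold.

    left_spec, right_spec: complex STFT arrays of identical shape (bins, frames).
    freqs_hz: centre frequency of each bin, shape (bins,).
    itd_threshold_s: hypothetical threshold in seconds (an assumption here).
    """
    # Interaural phase difference per bin, wrapped to (-pi, pi].
    ipd = np.angle(left_spec * np.conj(right_spec))

    # Convert phase to time delay: itd = ipd / (2 * pi * f).
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)[:, None]
    itd = np.zeros_like(ipd)
    nonzero = omega[:, 0] > 0  # the DC bin carries no delay information
    itd[nonzero] = ipd[nonzero] / omega[nonzero]

    # Bins with a small delay are treated as coming from the target direction.
    # The DC bin keeps itd = 0 and is therefore always retained.
    return (np.abs(itd) <= itd_threshold_s).astype(float)
```

Multiplying either channel's spectrogram by the returned mask suppresses bins attributed to off-axis sources. Phase wrapping makes the delay estimate ambiguous at high frequencies for widely spaced microphones, which this sketch does not address.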

Keywords

Robust speech recognition; microphone array; interaural time difference

Parallel Abstract

A new one-stage maximum confidence measure (MCM) based interaural phase difference estimation framework for noise masking is proposed to closely integrate the underlying speech models into dual-microphone array noise filtering for robust speech recognition. The main ideas are: (1) utilizing both the speech and filler models of the recognizer to feed back confidence measures (CMs) that indicate the degree of separation between filtered speech and interference noise, and (2) automatically optimizing the parameters of the microphone array with an expectation-maximization (EM) algorithm based on the proposed MCM criterion. Experimental results on a Mandarin voice command task show that the proposed approach significantly improves the final speech recognition rates. Moreover, the observed performance degradation is usually graceful under low signal-to-noise ratio (SNR) and close-range interference noise conditions.
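The confidence-driven parameter search can be sketched in Python as follows. Two points are assumptions rather than details from the abstract: the confidence measure is taken to be a frame-normalised log-likelihood ratio between the speech and filler models, and a plain grid search stands in for the EM optimisation that the thesis actually uses. The `score` callback is a hypothetical hook that would apply the mask for a given threshold, run the recognizer, and return both model scores.

```python
from typing import Callable, Sequence, Tuple


def confidence(loglik_speech: float, loglik_filler: float, n_frames: int) -> float:
    """Frame-normalised log-likelihood ratio (an assumed form of the CM)."""
    return (loglik_speech - loglik_filler) / n_frames


def select_threshold(
    candidates: Sequence[float],
    score: Callable[[float], Tuple[float, float, int]],
) -> Tuple[float, float]:
    """Return (best_threshold, best_confidence) over a candidate grid.

    score(th) is a hypothetical hook returning
    (loglik_speech, loglik_filler, n_frames) for the input masked with th.
    Grid search is used here only for illustration, in place of EM.
    """
    best_th, best_cm = None, float("-inf")
    for th in candidates:
        cm = confidence(*score(th))
        if cm > best_cm:
            best_th, best_cm = th, cm
    return best_th, best_cm
```

A higher confidence value means the masked signal matches the speech models better than the filler models, so the threshold that maximises it is taken as the one that best separates speech from interference.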

Parallel Keywords

Robust Speech Recognition; Microphone Array; Interaural Phase Difference; ITD
