Speech Recognition System Using Articulatory Electromyographic Signals

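This paper presents a speech recognition system that uses electromyographic (EMG) signals from the articulatory muscles. The system uses a phone-based speech recognizer combined with a feature extraction module designed specifically for articulatory EMG signals. The experimental corpus consists of microphone speech and articulatory EMG signals recorded in parallel as two modalities. The experimental results show that the acoustic speech signal occurs 0.05-0.06 seconds after the articulatory EMG signal is produced. The experiments also use articulatory feature classifiers to provide additional information that differs from conventional speech features. The feature extraction method designed for articulatory EMG signals also improves articulatory feature classification accuracy. The system uses a multi-stream decoder to combine conventional EMG features with articulatory EMG features. In a continuous English speech recognition experiment with a 100-word vocabulary, the system achieves a word error rate of 29.9%.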

Keywords

Speech recognition; electromyographic signal; articulatory feature

Parallel Abstract

This paper presents an automatic speech recognition system based on electromyographic biosignals captured from the articulatory muscles in the face using surface electrodes. We develop a phone-based speech recognizer and describe how its performance improves when the extraction of relevant speech features is carefully designed and tailored to electromyographic signals. Our experimental design includes the collection of audibly spoken speech, recorded simultaneously as acoustic data using a close-speaking microphone and as electromyographic signals using electrodes. Our experiments indicate that electromyographic signals precede the acoustic signal by about 0.05-0.06 seconds. Furthermore, we introduce articulatory feature classifiers, which have recently been shown to improve classical speech recognition significantly. We show that the classification accuracy of articulatory features clearly benefits from the tailored feature extraction. Finally, these classifiers are integrated into the overall decoding framework using a stream architecture. Our final system achieves a word error rate of 29.9% on a 100-word recognition task.
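The reported 0.05-0.06 second lead of the electromyographic signal over the acoustic signal suggests that frame labels derived from the audio channel need to be shifted earlier before they are used with EMG frames. The following is a minimal sketch of that idea, not the authors' implementation. The function name `shift_labels_for_emg`, the 10 ms frame shift, and the `"SIL"` padding label are all assumptions made for illustration.

```python
def shift_labels_for_emg(audio_labels, delay_s=0.05, frame_shift_s=0.01, pad_label="SIL"):
    """Shift audio-derived frame labels earlier so they line up with EMG frames.

    delay_s: assumed lead of the EMG signal over the audio (0.05-0.06 s in the abstract).
    frame_shift_s: assumed frame shift; 10 ms is a hypothetical choice.
    pad_label: hypothetical label used to fill the frames left empty at the end.
    """
    n_shift = round(delay_s / frame_shift_s)
    if n_shift <= 0:
        return list(audio_labels)
    # EMG activity for audio frame t is assumed to occur around frame t - n_shift,
    # so drop the first n_shift labels and pad the tail to keep the length.
    shifted = list(audio_labels[n_shift:])
    shifted += [pad_label] * (len(audio_labels) - len(shifted))
    return shifted


# Hypothetical audio alignment: 5 frames of silence, 3 frames of a phone, 2 of silence.
audio_labels = ["SIL"] * 5 + ["AH"] * 3 + ["SIL"] * 2
emg_labels = shift_labels_for_emg(audio_labels, delay_s=0.05)
```

With these assumed settings the phone label moves five frames earlier, so it starts at the first EMG frame instead of the sixth.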

Parallel Keywords

Speech Recognition; Electromyographic Signal; Articulatory Feature
