透過您的圖書館登入
IP:3.144.233.150
  • 學位論文

探究有效偵測及修正語音辨識錯誤技術之研究

A Study on Effective Detection and Correction Techniques for Speech Recognition Errors

指導教授 : 陳柏琳
若您是本文的作者,可授權文章由華藝線上圖書館中協助推廣。

摘要


本論文著重在研究語音辨識錯誤相關的幾個重要面向,尤其是當一般的語音辨識系統應用於特殊領域下所產生的未知詞問題。為此目的,我們提出一個兩階段的方法,包括了語音錯誤偵測和錯誤內容修補。在錯誤偵測階段,我們嘗試比較多種序列標記方法去偵測不同型態的錯誤。更進一步,在錯誤修正階段,藉由上一階段所偵測的結果作為依據,利用音素比對方法以特殊領域的關鍵詞表來修正錯誤。在四種應用領域,包括教育議題、工業技術相關訪談、語音記事及會議錄音,所進行的一系列實驗。由實驗結果顯示,我們提出的方法可以使得一般語音辨識系統在上述應用領域中有某種程度上的提升。

並列摘要


This paper sets out to study several important aspects pertaining to speech recognition errors, especially the out-of-vocabulary (OOV) word problem that is caused by using generic speech recognition systems for a specific application domain. To this end, a two-stage processing method, involving error detection and error correction, is proposed. For error detection, we explore and compare disparate sequence labeling methods to detect possible errors of different types. Further, in the error correction stage, an effective phone-level matching mechanism along with a domain-specific keyword list is exploited to correct errors of different types detected by the previous stage. Extensive experiments conducted on four application domains, including educational issues, industrial technology-related interviews and speech memos and meeting recordings, show that our proposed methods can boot the performance of a given general speech recognition system on the aforementioned application domains to some extent.

參考文獻


[1] Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal processing magazine 29.6 (2012): 82-97.
[2] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.
[3] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." nature 521.7553 (2015): 436-444.
[4] Schmidhuber, Jürgen. "Deep learning in neural networks: An overview." Neural networks 61 (2015): 85-117.
[5] Ogawa, Atsunori, and Takaaki Hori. "Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks." Speech Communication 89 (2017): 70-83.

延伸閱讀