Mandarin Speech Recognition Using Deep Neural Networks

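Deep neural networks have become a popular research topic in speech recognition. This thesis uses the Kaldi speech recognition toolkit to build a real-time large-vocabulary Mandarin speech recognition system, replaces the Gaussian mixture model (GMM) of the conventional acoustic model with a deep neural network (DNN), implements the recognizer with weighted finite-state transducers (WFST), and analyzes the factors that affect recognition rate and recognition time. The choice between a GMM and a DNN acoustic model and the size of the language model both affect the overall size of the recognition system, and many decoding parameters also affect its performance. By tuning these parameters, we can therefore find the best operating point and obtain a recognizer that is both real-time and accurate. In the experiments, the TCC300 corpus provides the training and test data, and an 80,000-word pronunciation lexicon is built.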

Keywords

Deep neural network; Mandarin speech recognition; real-time

Parallel Abstract

Deep neural networks have become a popular research area in automatic speech recognition. In this dissertation, we focus on implementing a real-time large-vocabulary Mandarin speech recognition system using the Kaldi speech recognition toolkit. In the proposed system, we develop a deep neural network acoustic model and compare it with the conventional Gaussian mixture model (GMM). We also use weighted finite-state transducers (WFST) to realize the decoder. The main goal is to find the best operating point of the proposed system by tuning the parameters that affect recognition speed and recognition rate. The experiments use the TCC300 speech corpus and an 80k-word lexicon.

Parallel Keywords

Deep neural network; Mandarin speech recognition; real-time

