Cubic Spline Digital Circuit Design for the Decomposition of Emotional Spontaneous Speech

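Current speech recognition systems reach recognition rates as high as 90% in ideal environments, but in more complex environments the rate drops because of emotional human speech and background sound, which has kept the technology from being widely adopted. Our laboratory therefore uses empirical mode decomposition (EMD) to decompose the speech signal and extract the necessary components, so that the recognition system is less affected by emotional speech and background sound. EMD is computationally slow, however, and cannot be used where a real-time response is required. We therefore propose exploiting the parallelism and high processing speed of a field-programmable gate array (FPGA) to accelerate the EMD computation, so that EMD, which is not inherently real-time, can be used in speech recognition systems with real-time online requirements. The recognition framework for emotional spontaneous speech has four main parts: speech signal processing, empirical mode decomposition, feature extraction, and dual-mode recognition. The distinguishing feature of the system is that the EMD process uses cubic spline interpolation together with parallel digital circuits to speed up decomposition, and the resulting data are used to build a speaker acoustic model and a speaker vocabulary model. This design is an important step forward for digital home technologies that demand a high degree of personalization and intelligence.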
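To make the role of cubic spline interpolation in EMD concrete, the following is a minimal software sketch of a single sifting step: local maxima and minima are joined by cubic spline envelopes, and the mean of the two envelopes is subtracted from the signal. It is not the digital circuit described in this work. The function names (`natural_cubic_spline`, `local_extrema`, `sift_once`), the natural boundary conditions, and the choice to pin both envelopes to the first and last samples are all assumptions made for illustration.

```python
import bisect
import math


def natural_cubic_spline(xs, ys):
    """Return a callable evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two knots.
    """
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    m = [0.0] * n  # second derivatives; natural ends keep m[0] = m[-1] = 0
    if n > 2:
        # Tridiagonal system for the interior second derivatives.
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n - 1)]
        # Thomas algorithm: forward elimination, then back substitution.
        for k in range(1, len(diag)):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        m[n - 2] = rhs[-1] / diag[-1]
        for k in range(len(diag) - 2, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] containing x (clamped at the ends).
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


def local_extrema(signal):
    """Return (maxima, minima) as lists of interior sample indices."""
    maxima, minima = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] >= signal[i + 1]:
            maxima.append(i)
        elif signal[i - 1] > signal[i] <= signal[i + 1]:
            minima.append(i)
    return maxima, minima


def sift_once(signal):
    """One sifting step: subtract the mean of the upper and lower envelopes.

    Returns None when the signal has too few extrema to build envelopes.
    """
    maxima, minima = local_extrema(signal)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    last = len(signal) - 1
    # Assumption: both envelopes are pinned to the end samples.
    upper_idx = [0] + maxima + [last]
    lower_idx = [0] + minima + [last]
    upper = natural_cubic_spline([float(i) for i in upper_idx],
                                 [signal[i] for i in upper_idx])
    lower = natural_cubic_spline([float(i) for i in lower_idx],
                                 [signal[i] for i in lower_idx])
    return [signal[t] - 0.5 * (upper(t) + lower(t)) for t in range(len(signal))]


# Illustrative input: a fast tone riding on a slow tone (synthetic, not speech data).
signal = [math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 64)
          for t in range(256)]
candidate = sift_once(signal)
```

In a full EMD this step would be repeated until the candidate meets a stopping criterion, and the resulting component would then be subtracted before extracting the next one. The two envelope fits and the per-sample evaluations are independent of each other, which is the kind of work that parallel hardware can overlap.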

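Keywords

cubic spline; emotional spontaneous speech processing; empirical mode decomposition; digital parallel computing circuit design; field-programmable gate array speech recognition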

Parallel Abstract

Current speech recognition systems achieve a recognition rate of about 90% in ideal environments, but the rate falls in more complex environments because of emotional human speech and background sound, and for this reason the technology is not yet widely used. Our laboratory therefore applies Empirical Mode Decomposition (EMD) to the speech signal and extracts the necessary components, so that the recognition system is less affected by emotional speech and background sound. Because EMD is slow to compute, we propose using the parallel processing and high speed of a field-programmable gate array (FPGA) to accelerate it, so that EMD, which is not inherently real-time, can be applied to speech recognition systems with real-time online requirements. The recognition technique for emotional spontaneous speech has four major parts: speech signal processing, Empirical Mode Decomposition, feature extraction, and dual-mode recognition. The distinguishing feature of this system is that it uses cubic spline interpolation with parallel digital circuits to speed up the EMD process, and then builds a speaker acoustic model and a speaker vocabulary model from the resulting data. This design makes voice command recognition more accurate, and the stored vocabulary speech model also retains the personal characteristics of a specified speaker.

Parallel Keywords

Cubic spline; Emotional spontaneous speech; Empirical Mode Decomposition; Field-programmable gate array
