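The wavelet transform has been one of the most popular research topics in recent years, and wavelet theory provides a unified framework for many different signal processing applications. Wavelet transforms are now widely applied in research areas such as communication systems, signal processing, and image and audio processing. Because the wavelet transform offers excellent time-domain and frequency-domain analysis together with multiresolution properties, it is well suited to highly time-varying speech signals. This thesis studies the application of the wavelet transform to speech signal processing and is divided into three parts: speech signal detection, consonant/vowel segmentation, and pitch period estimation. First, for voice activity detection, a short-time voice activity decision method is developed on the basis of the wavelet transform. Experimental results show that the proposed method still achieves a very high speech detection rate in a high-noise environment (SNR = 0 dB) and outperforms the voice activity detection of the GSM Enhanced Full Rate communication system. Second, for consonant/vowel segmentation, the work builds on the algorithm of Chen and Wang, which combines the wavelet transform with a product function, and adds a new algorithm for deciding the segmentation point. Compared with conventional methods, the proposed decision algorithm improves the accuracy of consonant/vowel segmentation and also yields precise segmentation points in noisy environments. Finally, for pitch period estimation, to strengthen the noise robustness of conventional algorithms, this thesis combines the wavelet transform with the circular average magnitude difference function (CAMDF) to estimate the pitch period. Experimental results show that the proposed algorithm performs well in both clean and noisy environments.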
The wavelet transform is one of the most exciting developments of the last decade. Wavelet theory provides a unified framework for a number of techniques that had been developed independently for various signal processing applications. Because the wavelet representation offers efficient time-frequency localization and multiresolution analysis, wavelet transforms are well suited to processing non-stationary signals such as speech. Based on the wavelet framework, this thesis develops three wavelet-based speech signal processing algorithms: voice activity detection (VAD), consonant/vowel (C/V) segmentation, and pitch detection. The first is a wavelet-based VAD algorithm that operates on a frame-by-frame basis. Experimental results show that the proposed VAD algorithm outperforms the VAD of the GSM Enhanced Full Rate system and operates reliably in noisy environments (SNR = 0 dB). Next, this thesis uses the wavelet transform and the energy profile to locate the C/V segmentation point without requiring any predetermined threshold. It is shown that the C/V segmentation point can be located accurately with low computational complexity. Finally, drawing on the properties of the wavelet transform and the circular average magnitude difference function (CAMDF), a new pitch detection algorithm is proposed. Simulation results show that the new method detects the pitch period accurately at an SNR of 0 dB, where other methods fail.
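As a rough illustration of the CAMDF step only, the following Python sketch estimates a pitch period from a single frame. It assumes the commonly used definition D(τ) = Σ |x((n + τ) mod N) − x(n)| and picks the lag with the deepest valley. The wavelet stage of the thesis is not reproduced here, and the function names, the 60 to 400 Hz search range, and the synthetic test tone are illustrative assumptions rather than details taken from the thesis.

```python
import math


def camdf(frame):
    """Circular magnitude difference of a frame for every lag 0..N-1.

    Assumed definition: D(lag) = sum_n |x[(n + lag) mod N] - x[n]|.
    The sum is left unnormalized; dividing by N would not move the minimum.
    """
    n = len(frame)
    return [
        sum(abs(frame[(i + lag) % n] - frame[i]) for i in range(n))
        for lag in range(n)
    ]


def estimate_pitch_period(frame, fs, f_min=60.0, f_max=400.0):
    """Return the lag in samples with the smallest CAMDF value.

    The search range (f_min, f_max) is a hypothetical choice. The upper lag
    is capped at N/2 because the circular function is symmetric: D(lag) = D(N - lag).
    """
    d = camdf(frame)
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) // 2)
    return min(range(lag_min, lag_max + 1), key=lambda lag: d[lag])


# Synthetic example: a clean 100 Hz tone sampled at 8 kHz, 40 ms frame.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * i / fs) for i in range(320)]
period = estimate_pitch_period(frame, fs)
pitch_hz = fs / period
```

On real speech the frame would normally be preprocessed first (in the thesis, by a wavelet-based stage), and voiced/unvoiced decisions would be needed before trusting the estimated lag.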