
Improvement and Discussion of Pitch Contour Algorithms in the Time Domain

On the Modified Algorithm of Pitch Contour Detection in Time Domain

Abstract


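Chinese is a tonal language, and the differences between tones can be determined from the pitch contour. When the pitch contour is extracted, half-frequency or double-frequency values are often picked up instead of the fundamental, which makes the contour discontinuous and leads to tone recognition errors. Building on the traditional methods (Auto-Correlation Function (ACF), Average Magnitude Difference Function (AMDF), and Correlation Function (CF)), this thesis proposes a time-domain pitch contour algorithm to improve Chinese tone recognition. The method applies two filters, de-emphasis and first-order difference, to make the speech waveform more periodic and to reduce the influence of noise. It then uses the frequency relationships among the half frequency, the fundamental frequency, and the double frequency to select the most likely fundamental frequency for each frame, and applies clustering and linear regression to correct and smooth the frames. The method was tested on a corpus recorded in the authors' laboratory, consisting of 1331 single-character Chinese words in tones 1 through 4, with the neutral tone excluded. The experimental results show that the modified method reaches a tone recognition rate of up to 95.54%, about 20% higher than the traditional methods and also above the highest recognition rate of the UPDUDP method, 95.03%. Although the two recognition rates differ little, the half-frequency and double-frequency error rate of UPDUDP is 3.04%, about 3% higher than that of the modified method (0.19%). In particular, for tone 1 the double-frequency error rate of UPDUDP reaches 8.28%, whereas that of the modified method is 0.1%. The modified method therefore effectively improves Chinese tone recognition.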
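As a rough illustration of the traditional time-domain baselines named above, the following Python sketch estimates the fundamental frequency of a single frame with ACF and with AMDF. It is not the thesis's implementation: the function names `acf_pitch` and `amdf_pitch`, the 60-500 Hz search range, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np

def acf_pitch(frame, fs, fmin=60.0, fmax=500.0):
    """Estimate F0 (Hz) of one frame from the peak of its autocorrelation.

    fmin/fmax are an assumed search range, not values from the thesis.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()              # remove DC offset
    lag_min = int(fs / fmax)                  # shortest period considered
    lag_max = min(int(fs / fmin), len(frame) - 1)
    # Autocorrelation for non-negative lags 0..len(frame)-1
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag

def amdf_pitch(frame, fs, fmin=60.0, fmax=500.0):
    """Estimate F0 (Hz) of one frame from the deepest valley of its AMDF."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), n - 1)
    # Mean absolute difference between the frame and its shifted copy
    amdf = [np.mean(np.abs(frame[lag:] - frame[:n - lag]))
            for lag in range(lag_min, lag_max + 1)]
    lag = lag_min + int(np.argmin(amdf))
    return fs / lag
```

Both functions look at only one frame, so nothing stops them from locking onto a lag of twice or half the true period. On a perfectly periodic frame, for instance, AMDF is close to zero at every multiple of the period, which is one way half-frequency errors of the kind described in the abstract can arise.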

Keywords

Tonal language, Pitch contour, Frame

Parallel Abstract


Chinese is a tonal language, and the difference between tones can be determined from the pitch contour. When the pitch contour is extracted, the pitch is frequently detected at half or double the true frequency, which makes the contour discontinuous and causes errors in tone recognition. Based on the traditional methods (Auto-Correlation Function (ACF), Average Magnitude Difference Function (AMDF), and Correlation Function (CF)), this study aims to remedy this deficiency of pitch contour detection. We propose a modified method that reduces the impact of noise in speech and finds the most likely fundamental frequency for each extracted frame. We then use clustering and a linear regression model to correct and smooth pitch values detected at half or double frequency. The test corpus consists of 1331 Chinese words in tones 1 to 4, excluding tone 5 (the neutral tone). The experimental results show that the modified method raises the recognition rate by about 20% compared with the traditional methods, with a highest recognition rate of 95.54%. This is also higher than the highest recognition rate of 95.03% obtained with Unbroken Pitch Determination Using Dynamic Programming (UPDUDP), although the difference between the two rates is small. Compared with UPDUDP, the modified method lowers the half-frequency and double-frequency error rate by about 3%. For tone 1, the modified method has an error rate of only 0.1%, far below the 8.28% double-frequency error rate of UPDUDP. The modified method is thus shown to effectively improve pitch contour detection.
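The abstract does not spell out how clustering and linear regression are combined, so the following Python sketch is only one plausible reading, not the authors' procedure. It assumes a per-frame F0 contour is already available, treats frames within a fixed ratio of the median as the main cluster (a crude stand-in for clustering), fits a straight line to that cluster, and then replaces each frame by whichever of half, unchanged, or doubled F0 lies closest to the line. The function name `fix_octave_errors` and the 0.75-1.5 ratio band are hypothetical.

```python
import numpy as np

def fix_octave_errors(f0):
    """Correct half/double-frequency frames in an F0 contour (Hz).

    Assumed procedure for illustration only; the thresholds are not
    taken from the thesis.
    """
    f0 = np.asarray(f0, dtype=float)
    idx = np.arange(len(f0))
    # Crude one-cluster grouping: frames close to the median F0
    ratio = f0 / np.median(f0)
    inliers = (ratio > 0.75) & (ratio < 1.5)
    # Linear regression of F0 against frame index over the main cluster
    slope, intercept = np.polyfit(idx[inliers], f0[inliers], 1)
    trend = slope * idx + intercept
    # For every frame, pick the candidate nearest to the fitted line
    candidates = np.stack([0.5 * f0, f0, 2.0 * f0])
    best = np.argmin(np.abs(candidates - trend), axis=0)
    return candidates[best, idx]
```

For a rising contour such as `[200, 202, 204, 103, 208, 420, 212]`, the fourth frame (a half-frequency error) and the sixth (a double-frequency error) are pulled back onto the line through the remaining frames. A single straight line suits the roughly linear contours of tones 1, 2, and 4 better than the dipping contour of tone 3, which would need a more flexible fit.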

Parallel Keywords

Tone language, Pitch contour, Frame, ACF, AMDF, CF, UPDUDP

