This thesis studies the detection of retroflex and non-retroflex consonants in Mandarin Chinese. The objective is to determine accurately whether a Mandarin initial (consonant) segment, obtained by forced alignment, has the characteristics of a retroflex. The detection method used in this thesis is similar to speaker verification. First, Gaussian mixture models (GMMs) are trained for the retroflex and non-retroflex classes. Second, the threshold on the likelihood ratio is adjusted until the two types of detection error occur at the same rate, which gives the equal error rate (EER). In addition to Mel-frequency cepstral coefficients (MFCCs), spectral moments and formants are used as speech features. The experimental results indicate that both spectral moments and formants improve retroflex detection. The best equal error rate obtained in our experiments is 17.69%.
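The decision rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the GMM parameters are assumed to be already trained, feature extraction (MFCC, spectral moments, formants) is not shown, the models are assumed to have diagonal covariances, and all function names (`gmm_log_likelihood`, `segment_llr`, `equal_error_rate`) are hypothetical. The EER is approximated at the threshold where the miss rate and the false-alarm rate are closest, which is one common convention.

```python
import math


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    (Hypothetical data layout, chosen for illustration.)
    """
    log_terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_p += -0.5 * (math.log(2.0 * math.pi * var) + (xi - mu) ** 2 / var)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def segment_llr(frames, retroflex_gmm, non_retroflex_gmm):
    """Average per-frame log-likelihood ratio for one consonant segment."""
    total = sum(
        gmm_log_likelihood(f, retroflex_gmm) - gmm_log_likelihood(f, non_retroflex_gmm)
        for f in frames
    )
    return total / len(frames)


def equal_error_rate(retroflex_scores, non_retroflex_scores):
    """Sweep the decision threshold and return (eer, threshold).

    A segment is labelled retroflex when its score >= threshold.
    The EER is taken as the mean of the miss rate and the false-alarm
    rate at the threshold where the two rates are closest.
    """
    best = None
    for threshold in sorted(set(retroflex_scores) | set(non_retroflex_scores)):
        miss = sum(s < threshold for s in retroflex_scores) / len(retroflex_scores)
        false_alarm = sum(s >= threshold for s in non_retroflex_scores) / len(
            non_retroflex_scores
        )
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (miss + false_alarm) / 2.0, threshold)
    return best[1], best[2]
```

In use, each test segment would be scored with `segment_llr`, the scores would be split by their true label, and `equal_error_rate` would report the operating point; the toy numbers in any such run are illustrative and unrelated to the 17.69% reported above.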