Tri-Phone-Based Mandarin Speech Recognition (以三連音素為單位之中文語音辨識)

Abstract


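This paper uses acoustic models built on tri-phone units in place of the traditional acoustic models built from the 411 Mandarin syllables, in order to improve recognition accuracy in large-vocabulary speech recognition. With cross-syllable tri-phone models, however, the number of models is so large that the training data becomes insufficient, which gives rise to the problem of so-called unseen tri-phone models (unseen models). To address this problem, we enlarge the training corpus with Mandarin speech data from mainland China and apply decision-tree-based state tying, so that the tri-phone models achieve better recognition results.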
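To make the unseen-model problem concrete, the following is a minimal sketch, not the authors' implementation, of how a phone sequence expands into tri-phone labels and how the labels missing from a training set can be listed. The `left-centre+right` label notation, the `sil` boundary symbol, the function names, and the toy phone sequences are all assumptions introduced for illustration.

```python
def to_triphones(phones, boundary="sil"):
    """Expand a phone sequence into context-dependent tri-phone labels.

    Labels use an assumed "left-centre+right" notation; the sequence is
    padded with a boundary symbol so the first and last phones get a context.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def unseen_triphones(train_utts, test_utts):
    """Return the tri-phones needed by test_utts that never occur in train_utts."""
    seen = {t for utt in train_utts for t in to_triphones(utt)}
    needed = {t for utt in test_utts for t in to_triphones(utt)}
    return sorted(needed - seen)


# Toy phone sequences (hypothetical, not taken from the paper's corpora).
train = [["n", "i", "h", "ao"]]
test = [["n", "i", "m", "a"]]
print(unseen_triphones(train, test))
```

Because the number of possible labels grows roughly with the cube of the phone inventory, even a large corpus leaves many labels in the `unseen` list, which is why the abstract turns to more data and parameter sharing.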

Parallel Abstract


In this paper, a Mandarin speech recognition system based on tri-phone models is constructed. However, several practical problems arise when tri-phone models are applied in a speech recognition system. First, many tri-phone models occur only a few times in the training data, so there is insufficient data for robust parameter estimation of these rarely seen models. Second, a large number of tri-phone models are missing from the training corpus altogether; such unseen tri-phone models are unavoidable when building cross-word tri-phone systems. We use decision trees and additional training data to solve these problems.
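As an illustration of why a decision tree helps with unseen tri-phones, the sketch below routes any tri-phone, seen or not, to a shared (tied) state by answering yes/no questions about its left and right context. It is a simplified sketch under stated assumptions: the tree is written by hand, whereas real systems grow it from training data; the phone class `NASALS`, the leaf names such as `i_s2_1`, and the `left-centre+right` label notation are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    tied_state: str  # name of the shared state (hypothetical naming)


@dataclass
class Node:
    side: str              # which context the question asks about: "left" or "right"
    phone_set: frozenset   # the question: "is that context phone in this set?"
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def parse(triphone):
    """Split an assumed 'left-centre+right' label into its three phones."""
    left, rest = triphone.split("-", 1)
    centre, right = rest.split("+", 1)
    return left, centre, right


def tied_state(tree, triphone):
    """Walk the tree by answering context questions until a leaf is reached."""
    left, _, right = parse(triphone)
    node = tree
    while isinstance(node, Node):
        context = left if node.side == "left" else right
        node = node.yes if context in node.phone_set else node.no
    return node.tied_state


# Hand-written toy tree for one state of centre phone "i" (illustrative only).
NASALS = frozenset({"n", "m"})
tree_i = Node(
    side="left",
    phone_set=NASALS,
    yes=Node(side="right", phone_set=NASALS, yes=Leaf("i_s2_1"), no=Leaf("i_s2_2")),
    no=Leaf("i_s2_3"),
)

# A tri-phone absent from training still reaches a leaf and reuses its parameters.
print(tied_state(tree_i, "n-i+m"))
```

Since every possible context answers each question one way or the other, no tri-phone is left without a model, and rarely seen tri-phones share the data of the other tri-phones in the same leaf.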
