Design and Implementation of a 1.4Kbps Glottal Excitation Linear Prediction (GELP) Vocoder

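This paper proposes a glottal excitation model, paired with an LPC speech coder, that allows speech signals to be coded effectively at 1400 bps. The spectral parameters are encoded with a switched linear-predictive neural network together with multi-stage vector quantization. The excitation is divided into two classes: unvoiced excitation is searched from a stochastic codebook, while voiced excitation is selected from a glottal codebook. The analysis and synthesis of the excitation signals and the codebook construction procedures are all described in detail. In a speech-quality evaluation of this 1.4 Kbps codec, the mean score was 2.993, while the corresponding scores for the 2.4 Kbps LPC coder and the 4.8 Kbps CELP coder were 2.272 and 3.314, respectively. We also attempted a simplified version that runs on the ADSP-2181, but algorithm reduction and limited memory lowered the resulting speech quality. This seems to imply that a complete implementation of the codec still has to rely on a floating-point DSP chip with large memory.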

Keywords

glottal excitation; LPC speech codec; switched-predictive neural network

Parallel Abstract

This paper presents a glottal excitation model that works with the LPC vocoder to code speech signals at 1400 bps. We encode the spectral parameters using a switched-predictive neural network along with multi-stage vector quantization. The unvoiced excitation is retrieved from a stochastic codebook, while a glottal codebook is used to characterize the voiced excitation. Procedures are described for the analysis and synthesis of the excitation signals as well as for codebook construction. In the MOS test, the 1.4 Kbps GELP coder scores 2.993, compared with 2.272 and 3.314 for the 2.4 Kbps LPC and 4.8 Kbps CELP coders, respectively. A simplified version is developed to run on the ADSP-2181 processor, but it suffers quality degradation due to algorithm truncation and memory restrictions. This suggests that a full implementation of the proposed coder may have to rely on a floating-point DSP chip with large memory.

Parallel Keywords

glottal excitation; LPC vocoder; switched-predictive neural network
