Audio Coding Using Dynamic Significance Trees

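This thesis uses dynamic significance trees to quantize the modified discrete cosine transform (MDCT) coefficients of the Advanced Audio Coding (AAC) encoder, performing bit allocation in the same pass. In addition to dynamic significance tree quantization (DSTQ), we use set partitioning in hierarchical trees (SPIHT), combined significance tree quantization (CSTQ), MPEG-1 Layer 3 (MP3), and the open-source audio codec Ogg Vorbis to encode MDCT coefficients both with and without psychoacoustic processing. The experimental database contains 12 categories of audio: male voice, female voice, guitar, pipa (lute), yangqin (dulcimer), suona, piano, symphony, erhu, jazz, saxophone, and flute. The tested bit rates are 32 kbps, 48 kbps, 64 kbps, and 96 kbps. The results show that at these bit rates, all audio encoded by quantizing the MDCT coefficients with a significance tree model retains quality quite close to the original. Among the tree models, the DSTQ model with psychoacoustics (psy-DSTQ-512) performs best in the simulated perceptual audio quality measurement.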

Keywords

Advanced audio coding; low bit-rate quantization; significance tree

Parallel Abstract

In this thesis, we employ dynamic significance tree quantization (DSTQ) to quantize the modified discrete cosine transform (MDCT) coefficients of audio signals while performing bit allocation during the encoding process. We compare DSTQ with other well-known encoding methods, including set partitioning in hierarchical trees (SPIHT), combined significance tree quantization (CSTQ), MPEG-1 Layer 3 (MP3), and the open-source audio codec Ogg Vorbis. For each significance tree method, we also investigate the encoder both with and without a psychoacoustic model. The experimental database contains 12 categories of audio signals: male voices, female voices, guitar, lute, dulcimer, suona, piano, symphony, erhu, jazz, saxophone, and flute. The encoded bit rates are 32 kbps, 48 kbps, 64 kbps, and 96 kbps. The experimental results show that all of the significance tree models maintain nearly transparent audio quality at these bit rates. Among them, DSTQ with a psychoacoustic model (psy-DSTQ-512) performs best in our simulation based on the perceptual evaluation of audio quality (PEAQ) measure.

Parallel Keywords

Advanced audio coding; low bit-rate quantization; significance tree models

