A Study on Extracting the Signal from a Specific Position among Multiple Sound Sources

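In recent years, microphone arrays have been applied to video conferencing and distance teaching, making it possible to select and enhance the sound signal arriving from a specific position without altering the hardware installation. In this thesis, we combine a microphone array with the back-projection reconstruction method from computerized tomography to extract the audio from a specific position in a multi-source environment. The usual way to extract sound from a specific position among multiple sources is the Delay and Sum Method, which applies appropriate delays to the signals received by the microphone array and sums them to obtain an enhanced sound signal. In Computerized Tomography, back-projection reconstruction is commonly used to recover the image signal. This thesis applies that method by treating the process of a microphone receiving sound as a kind of non-straight-line projection, so that back-projection reconstruction can be used to recover the sound signal. We use MATLAB to simulate a reflection-free conference room in which the microphone array is arranged at equal spacing along one wall. We first test signal recovery with pure sine waves, then examine how the number of microphones in the array and the relative positions of two sources affect the recovery. The simulation results show that back-projection reconstruction can extract the desired sound signal from two suitably positioned sources.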

Keywords

Computerized Tomography; Microphone Array; Sound Extraction

Parallel Abstract

Recently, microphone array techniques have been used in teleconferencing and distance education. Without modifying the hardware arrangement, a microphone array can enhance the sound signal coming from a selected position. This paper presents a new method of extracting a sound signal, based on the back-projection method commonly used in Computerized Tomography (CT). The typical method for enhancing the signal from a focus location in a multiple-source environment is the Delay-and-Sum technique, which is based on delay and shift operations. In CT, the back-projection method is usually used to reconstruct the image signal. We treat the process by which a microphone receives a signal as a kind of non-straight projection, so the back-projection method can be used to reconstruct the sound signal. The system is simulated in MATLAB and models a non-reverberant conference room, with the microphones arranged at equal spacing along one wall. First, a simple sine wave is used to verify the extraction efficiency. Then, we vary the positions of two sources and the size of the microphone array. The results show that the back-projection method can extract the sound coming from the source we specify.
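
As an illustration of the conventional Delay-and-Sum baseline mentioned above (not the back-projection method proposed in this work), the following is a minimal Python sketch under assumed conditions: a free-field room with no reflections, 16 microphones spaced 0.2 m apart along the wall y = 0, a 16 kHz sample rate, and two sinusoidal point sources. The geometry, frequencies, and all function names are hypothetical choices made for this example and are not taken from the thesis.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)
SAMPLE_RATE = 16000     # Hz (assumed)


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def simulate_mics(mics, sources, n_samples):
    """Free-field microphone signals for sinusoidal point sources.

    sources is a list of (position, frequency_hz) pairs. Each source
    reaches a microphone after d / c seconds with 1 / d attenuation.
    """
    signals = []
    for mic in mics:
        sig = [0.0] * n_samples
        for pos, freq in sources:
            d = distance(mic, pos)
            delay = d / SPEED_OF_SOUND
            for n in range(n_samples):
                t = n / SAMPLE_RATE - delay
                if t >= 0.0:  # nothing is received before the wave arrives
                    sig[n] += math.sin(2 * math.pi * freq * t) / d
        signals.append(sig)
    return signals


def delay_and_sum(signals, mics, focus):
    """Time-align every microphone signal to the focus point and average."""
    n_samples = len(signals[0])
    out = [0.0] * n_samples
    for sig, mic in zip(signals, mics):
        d = distance(mic, focus)
        shift = d / SPEED_OF_SOUND * SAMPLE_RATE  # fractional delay in samples
        for n in range(n_samples):
            pos = n + shift
            i = int(pos)
            if i + 1 < n_samples:
                frac = pos - i
                # Linear interpolation between samples; multiplying by d
                # undoes the 1 / d spreading loss for the focus point.
                out[n] += d * ((1 - frac) * sig[i] + frac * sig[i + 1])
    return [v / len(signals) for v in out]


def tone_amplitude(signal, freq):
    """Amplitude of one frequency component (single-bin DFT)."""
    n = len(signal)
    re = sum(v * math.cos(2 * math.pi * freq * k / SAMPLE_RATE)
             for k, v in enumerate(signal))
    im = sum(v * math.sin(2 * math.pi * freq * k / SAMPLE_RATE)
             for k, v in enumerate(signal))
    return 2 * math.hypot(re, im) / n


# Hypothetical scene: 16 microphones on the wall y = 0, two sources 2 m away.
mics = [(0.5 + 0.2 * m, 0.0) for m in range(16)]
target = ((1.0, 2.0), 500.0)      # source we want to extract
interferer = ((3.0, 2.0), 800.0)  # competing source

signals = simulate_mics(mics, [target, interferer], n_samples=1600)
# Keep the first 1280 samples: both tones complete whole cycles there, and
# every delayed read in delay_and_sum still falls inside the recording.
enhanced = delay_and_sum(signals, mics, focus=target[0])[:1280]

target_level = tone_amplitude(enhanced, target[1])
interferer_level = tone_amplitude(enhanced, interferer[1])
print("target tone amplitude:    ", round(target_level, 3))
print("interferer tone amplitude:", round(interferer_level, 3))
```

In this sketch the contributions of the focused source add in phase across the array, while those of the other source arrive with mismatched delays and partly cancel. How much they cancel depends on the array aperture, the number of microphones, and the relative source positions, which are the same factors the simulations described above vary.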

Parallel Keywords

Computerized Tomography; Microphone Array; Sound Extraction
