Probabilistic Microphone Array Calibration and Sound Source Localization System

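In robotics, auditory perception is a very important capability. Microphone arrays are widely used in auditory perception applications, and in these applications the spatial coordinates of the microphones are usually known. The Structure from Sound (SFS) algorithm provides a flexible way to calibrate microphone arrays of different structures: it simultaneously localizes multiple microphones and multiple sound sources. However, existing algorithms neither take measurement uncertainty into account nor provide uncertainty estimates for the SFS results. In this thesis, we propose a Probabilistic Structure from Sound (PSFS) algorithm. In addition, we propose a Probabilistic Sound Source Localization (PSSL) algorithm, which uses the PSFS results to improve sound source localization accuracy. We use low-cost microphones. Extensive simulation and experimental results demonstrate the performance of the PSFS and PSSL algorithms.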

Keywords

microphone arrays; structure from sound; sound source localization

Parallel Abstract

Auditory perception is one of the most important functions for robotics applications. Microphone arrays are widely used for auditory perception, and in these applications the spatial structure of the microphones is usually known. This thesis first describes the affine Structure from Sound (SFS) algorithm. Structure from sound is the problem of simultaneously localizing microphones and sound sources. However, the existing method neither takes measurement uncertainty into account nor provides uncertainty estimates for the SFS results. In this thesis, we propose a probabilistic structure from sound (PSFS) approach using the unscented transform. The PSFS algorithm not only localizes microphones and sound sources but also estimates the uncertainties of the SFS results. In addition, a probabilistic sound source localization (PSSL) approach that uses the PSFS results is provided to improve sound source localization accuracy. Extensive simulation and experimental results using low-cost, off-the-shelf microphones demonstrate the feasibility and performance of the proposed PSFS and PSSL approaches.

Parallel Keywords

microphone arrays; structure from sound; sound source localization
