
Using a Microphone Array to Detect the Locations of Unusual Sound Sources in an Outdoor Area

Original title: 利用麥克風陣列做戶外異常聲源之定位

Advisor: 石勝文

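Abstract

This thesis studies the localization of unusual sound sources in outdoor areas. The accuracy of outdoor sound source localization is affected by factors such as wind direction, ambient noise, and temperature. To suppress noise interference and improve localization accuracy, and to extend sound source localization to multiple sources whose spectra do not fully overlap, this work uses the perturbation that the noise in two signals induces on their phase difference to build a probabilistic model of the time delay, and from this model derives the distribution of possible source locations. The mean-shift algorithm then clusters these candidate locations, and once low-weight clusters are filtered out, the most probable source locations remain. In the experiments, four microphones arranged in a uniform linear array collect outdoor sound signals. Under ordinary conditions, the single-source detection accuracy is on par with the Phase Transform (PHAT), currently the most widely used high-accuracy localization method. For outdoor ambient noise that mixes different frequencies (such as insect and bird calls), however, the method is more stable than PHAT. It also achieves a reasonable level of accuracy when detecting two simultaneous sources whose frequencies differ widely.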
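The clustering step described above can be illustrated with a minimal sketch. It assumes one-dimensional candidate positions, a Gaussian kernel, and made-up weights; the thesis does not specify these details here, and the names `mean_shift_1d`, `cluster_modes`, `bandwidth`, `merge_dist`, and `min_weight` are hypothetical.

```python
import math

def mean_shift_1d(points, weights, bandwidth, iters=50, tol=1e-6):
    """Move each point uphill on a weighted Gaussian kernel density."""
    modes = []
    for p in points:
        x = p
        for _ in range(iters):
            k = [w * math.exp(-0.5 * ((x - q) / bandwidth) ** 2)
                 for q, w in zip(points, weights)]
            new_x = sum(ki * q for ki, q in zip(k, points)) / sum(k)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes

def cluster_modes(modes, weights, merge_dist, min_weight):
    """Merge nearby modes, then drop clusters whose total weight is low."""
    clusters = []  # each entry: [weighted sum of modes, total weight]
    for m, w in sorted(zip(modes, weights)):
        if clusters and abs(m - clusters[-1][0] / clusters[-1][1]) < merge_dist:
            clusters[-1][0] += m * w
            clusters[-1][1] += w
        else:
            clusters.append([m * w, w])
    return [(s / w, w) for s, w in clusters if w >= min_weight]

# Illustrative data only: candidate positions (metres) with assumed weights.
candidates = [1.9, 2.0, 2.05, 2.1, 5.0, 7.9, 8.0, 8.1]
weights = [1.0, 1.0, 1.0, 1.0, 0.2, 1.0, 1.0, 1.0]

modes = mean_shift_1d(candidates, weights, bandwidth=0.3)
sources = cluster_modes(modes, weights, merge_dist=0.5, min_weight=1.0)
print(sources)
```

With this data the isolated, low-weight candidate at 5.0 forms its own cluster and is filtered out, leaving two cluster centers near 2 and 8 as the estimated source positions.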

Parallel Abstract

In this thesis, we study the problem of localizing unusual sound sources in an outdoor area. In an outdoor environment, the accuracy of sound source localization is influenced by wind velocity, air temperature, and background noise. This work aims to develop a sound source localization method that is accurate, robust against noise, and able to localize multiple sound sources with non-overlapping spectra in the frequency domain. The probability density function (PDF) of the time difference of arrival (TDOA) between two signals is derived from the PDF of the phase angle between the two signals, which in turn is derived from a noise model of the input signals. From this probabilistic model, the possible locations of the sound sources can be computed. The mean-shift algorithm is used to find clusters of possible locations. Clusters that are too small are discarded, and the centers of the remaining clusters represent the estimated locations of the sound sources. To test the proposed method, a uniform linear microphone array consisting of four microphones was constructed to collect sound signals in an outdoor area. The experimental results show that, in a strictly single-source scenario, the accuracy of the proposed method is comparable to that of the widely used phase transform (PHAT) technique. Furthermore, when background noise, such as the sounds of insects and birds, is not negligible, the proposed method outperforms PHAT. Additionally, the experimental results for two simultaneous sound sources show that the proposed method can also localize multiple sound sources with considerable stability, provided that the sources have non-overlapping spectra.
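For context, the PHAT baseline mentioned above estimates the TDOA by whitening the cross-power spectrum of two microphone signals and locating the peak of the resulting correlation. The following is a minimal sketch of that baseline only, not of the proposed probabilistic method. It uses a naive DFT from the standard library so that it runs anywhere (an FFT library would be the usual choice), and the function names, frame length, and 16 kHz sample rate are assumptions made for illustration.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat_delay(x1, x2, max_lag):
    """Return the lag in samples by which x2 trails x1 (PHAT weighting)."""
    n = len(x1) + len(x2)  # zero-pad so circular correlation acts as linear
    X1 = dft(list(x1) + [0.0] * (n - len(x1)))
    X2 = dft(list(x2) + [0.0] * (n - len(x2)))
    cross = []
    for a, b in zip(X1, X2):
        c = b * a.conjugate()
        # PHAT: keep only the phase of the cross-power spectrum.
        cross.append(c / (abs(c) + 1e-12))
    corr = [v.real for v in dft(cross, inverse=True)]
    # Negative lags wrap around to the end of the correlation array.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: corr[lag % n])

# Synthetic example: the same noise burst reaches mic2 five samples later.
random.seed(0)
burst = [random.gauss(0.0, 1.0) for _ in range(48)]
true_delay = 5
mic1 = burst + [0.0] * 16
mic2 = [0.0] * true_delay + burst + [0.0] * (16 - true_delay)

lag = gcc_phat_delay(mic1, mic2, max_lag=10)
sample_rate = 16000  # Hz, assumed for illustration
print(lag, lag / sample_rate)
```

Because the PHAT weighting discards magnitude and treats every frequency bin equally, a narrowband interferer such as a bird call can dominate part of the phase spectrum, which is the situation in which the abstract reports the proposed method to be more stable.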

