Location Recognition through Audio Fingerprints Using a Bayesian Method

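In recent years, using the acoustic background spectrum to identify a user's location on mobile phones and other mobile devices has become increasingly popular. Audio is used for recognition because it is compact, simple to compute, robust to brief transient sounds, and distinctive. In general, acoustic spectrum analysis uses Mel-frequency cepstral coefficients (MFCCs) to transform one-dimensional time-domain audio data into high-dimensional frequency-domain data, which are then modeled with a Gaussian mixture model (GMM). The parameters of the Gaussian mixture model are commonly estimated with the expectation-maximization algorithm (EM algorithm). In this thesis, a Bayesian method is used instead to estimate the Gaussian mixture model, and the results are compared with those of the EM algorithm. In both the simulations and the analysis of real data, the Bayesian method is more accurate than the EM algorithm.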

Keywords

Acoustic background spectrum; MFCC; Gaussian mixture model; EM algorithm; Bayesian method

Parallel Abstract

Location recognition has attracted considerable research attention in the past few years. Recently, the use of the acoustic background spectrum for indoor localization on mobile devices, such as smartphones, has been discussed in the literature. Audio is well suited to location recognition because it is compact, easy to compute, robust to transient sounds, and distinctive. In general, acoustic spectrum analysis uses Mel-frequency cepstral coefficients (MFCCs) to transform audio data into high-dimensional data, which can then be fitted by Gaussian mixture models (GMMs). Given the GMMs fitted to the training data, the audio fingerprint of a location is assigned to the GMM with the highest likelihood value. The parameters of the GMMs are commonly estimated with the expectation-maximization algorithm (EM algorithm). In this thesis, we apply the generalized Bayes method to estimate the parameters of the GMMs and compare it with the EM algorithm. In a simulation study and a real data example, the generalized Bayes method performs better than the EM algorithm.
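The classification step described above (fit one GMM per location, then assign a clip to the model with the highest likelihood) can be sketched as follows. This is a minimal illustration of the EM baseline only, not the thesis's generalized Bayes estimator or its actual code. Several things here are assumptions made for the example: MFCC extraction is omitted and synthetic 2-D vectors stand in for MFCC frames, covariances are taken to be diagonal, and all names (`fit_gmm_em`, `classify`, `synth_frames`, the "office" and "cafe" locations) are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


def component_logs(x, gmm):
    """log(weight_j) + log N(x | mean_j, var_j) for every component j."""
    weights, means, variances = gmm
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def avg_loglik(data, gmm):
    """Average per-frame log-likelihood of the data under the GMM."""
    return sum(logsumexp(component_logs(x, gmm)) for x in data) / len(data)


def fit_gmm_em(data, k, iters=50, seed=0, var_floor=1e-3):
    """Fit a k-component diagonal GMM with plain EM (assumed setup)."""
    rng = random.Random(seed)
    dim = len(data[0])
    means = [list(x) for x in rng.sample(data, k)]  # init from data points
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in data:
            logs = component_logs(x, (weights, means, variances))
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            weights[j] = nj / len(data)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [
                max(var_floor,
                    sum(r[j] * (x[d] - means[j][d]) ** 2
                        for r, x in zip(resp, data)) / nj)
                for d in range(dim)
            ]
    return weights, means, variances


def classify(clip, models):
    """Return the location whose GMM gives the clip the highest likelihood."""
    return max(models, key=lambda name: avg_loglik(clip, models[name]))


def synth_frames(centers, n, rng, spread=0.5):
    """Synthetic stand-in for MFCC frames: n points drawn around the centers."""
    return [[rng.gauss(c, spread) for c in rng.choice(centers)]
            for _ in range(n)]


rng = random.Random(1)
layout = {"office": [(0.0, 0.0), (3.0, 3.0)], "cafe": [(8.0, 0.0), (8.0, 5.0)]}
models = {name: fit_gmm_em(synth_frames(c, 200, rng), k=2)
          for name, c in layout.items()}
test_clip = synth_frames(layout["cafe"], 50, rng)
predicted = classify(test_clip, models)
print(predicted)
```

In a real pipeline the synthetic frames would be replaced by MFCC vectors computed from recordings, and a Bayesian estimator could be substituted for `fit_gmm_em` while keeping `classify` unchanged, since the decision rule depends only on the fitted mixture parameters.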

Parallel Keywords

Acoustic background spectrum; MFCC; GMM; EM algorithm; Generalized Bayes method

