Improving Landmark-Based Audio Fingerprinting

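Abstract

Audio fingerprinting is a fast approach to music retrieval: a song is recorded through a microphone, the recording is sent to a recognition system for processing, and the best-matching result is returned to the user. This paper surveys several audio fingerprinting methods, proposes improvements to preprocessing, and uses a new feature together with machine learning to improve the original re-ranking method. In preprocessing, we set power-spectrum values with energy below 0 to 0, because these components are not robust to noise, and we add a high-pass filter along the frequency direction. To improve re-ranking, we extract a new feature using the method of Haitsma [8] and compare its similarity. We then use machine learning to find a weighted sum of the similarities of the new feature and the two original features, which improves the recognition rate of the original re-ranking. The three machine learning methods used are Pranking, Ranking SVM, and a genetic algorithm, of which the genetic algorithm performs best. The final result raises the recognition rate from 81.21% to 86.04%.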
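As a rough illustration of the two preprocessing steps, the following Python sketch clamps negative values to zero and then applies a high-pass filter across frequency bins. The abstract does not specify the filter, so the first-order difference used here is an assumption, as are the function names and the representation of a spectrogram as a list of frames.

```python
def clamp_negative(frame):
    """Set every value below 0 to 0 (these bins are treated as unreliable)."""
    return [max(value, 0.0) for value in frame]


def highpass_along_frequency(frame):
    """Assumed filter: difference between adjacent frequency bins.

    y[0] = 0 and y[k] = x[k] - x[k-1] for k >= 1. The actual filter in the
    work is not given in the abstract; this is only a stand-in.
    """
    if not frame:
        return []
    return [0.0] + [frame[k] - frame[k - 1] for k in range(1, len(frame))]


def preprocess(spectrogram):
    """Apply both steps to each frame (one list of bin values per frame)."""
    return [highpass_along_frequency(clamp_negative(f)) for f in spectrogram]
```

For example, `preprocess([[-3.0, 2.0, 5.0, 5.0]])` first clamps the frame to `[0.0, 2.0, 5.0, 5.0]` and then differences it along frequency.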

Keywords

music retrieval; audio fingerprinting; re-ranking; Pranking; Ranking SVM; genetic algorithm

Parallel Abstract

Audio fingerprinting (AFP) is a fast method of music retrieval. A segment of music is first recorded through the microphone of a cellphone or tablet and sent to a server for AFP computation. This paper describes several audio fingerprinting methods, proposes improved preprocessing, and uses a new feature with machine learning to improve the original re-ranking method. In the preprocessing phase, we set power-spectrum values with energy below 0 to 0, because these values are not robust to noise, and we add a high-pass filter along the frequency direction. To improve the original re-ranking method, we extract a new feature with the method proposed by Haitsma [8] and compare the similarity of the new feature. We then use machine learning to find a weighted sum of the similarities of the new feature and the two original features. The machine learning methods used are Pranking, Ranking SVM, and a genetic algorithm; the genetic algorithm performs best of the three. As a result, the recognition rate rises from 81.21% to 86.04%.
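The re-ranking idea can be sketched in Python as follows. The bit rule follows the sign-of-energy-difference scheme of Haitsma and Kalker [8]. The function names, the candidate layout, and the example weights are hypothetical, and in the work itself the weights would come from Pranking, Ranking SVM, or a genetic algorithm rather than being set by hand.

```python
def subfingerprint_bits(energy):
    """energy: list of frames, each a list of band energies.

    Bit (n, m) is 1 when the difference between bands m and m+1 is larger
    in frame n than in frame n-1, and 0 otherwise.
    """
    bits = []
    for n in range(1, len(energy)):
        prev, cur = energy[n - 1], energy[n]
        bits.append([
            1 if (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1]) > 0 else 0
            for m in range(len(cur) - 1)
        ])
    return bits


def bit_similarity(a, b):
    """Return 1 minus the bit error rate of two equally shaped bit matrices."""
    total = sum(len(row) for row in a)
    errors = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return 1.0 - errors / total


def rerank(candidates, weights):
    """candidates: list of (song_id, (s1, s2, s3)) similarity triples.

    Sort by the weighted sum of the three similarities, best first.
    """
    def score(item):
        return sum(w * s for w, s in zip(weights, item[1]))
    return sorted(candidates, key=score, reverse=True)
```

Here `s1` and `s2` stand for the two original similarity scores and `s3` for the similarity of the new feature. A learned weight vector changes only the ordering, not the candidate set.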

Parallel Keywords

music retrieval; audio fingerprinting; re-ranking; Pranking; Ranking SVM; genetic algorithm

References

[5] C. C. Wang, M. H. Lin, J. S. R. Jang, W. Liou, An Effective Re-ranking Method Based on Learning to Rank for Improving Audio Fingerprinting, APSIPA, 2014.

[6] Christopher J. C. Burges, John C. Platt, and Soumya Jana, Distortion Discriminant Analysis for Audio Fingerprinting, IEEE Transactions on Speech and Audio Processing, 2003.

[7] Xavier Anguera, Antonio Garzon and Tomasz Adamek, MASK: Robust Local Features for Audio Fingerprinting, IEEE International Conference on Multimedia and Expo, 2012.

[8] Jaap Haitsma, Ton Kalker, A Highly Robust Audio Fingerprinting System, ISMIR, 2002.

[10] Henrique Malvar, A Modulated Complex Lapped Transform and its Applications to Audio Processing, IEEE International Conference on Acoustics, Speech, and Signal Processing, 1999.
