A Music Retrieval and Recommendation System Based on MPEG-7 Audio

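This thesis presents a music retrieval system and an automatic music recommendation system built on MPEG-7 Audio. The author first introduces the parts of MPEG-7 Audio that describe music: Melody and Audio Signature. The retrieval system extracts the MPEG-7 representation defined by the Melody Description Scheme from input MIDI files and searches for the passages most similar to a query clip. When determining the melody line, it applies the highest pitch method to handle chords that occur within the melody, and it compares similarity with local alignment, a dynamic programming algorithm. MPEG-7 files are parsed with the Xerces XML parser. Finally, the system produces a score table with which users can examine system performance. The recommendation system adopts content-based filtering, one of the established recommendation approaches. For the music a user prefers, it extracts the MPEG-7 Audio signature and clusters the signatures with vector quantization. A new song can then be assigned to one of the groups and given a score that decides whether it is recommended. In the retrieval experiments on 1,242 MIDI songs, the rate at which the first returned candidate is the matching song reaches 65% at a melody line length of 10, and the rate rises as the melody line length increases. In the recommendation experiments, two people rated their preference for 571 WAV and MP3 songs, and their ratings were compared with the scores given by the system; the average recommendation accuracy reaches 75%.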
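The two retrieval steps named in the abstract, the highest pitch method and local alignment, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the note format (onset time, MIDI pitch), the use of raw semitone intervals as the contour, and the scoring parameters `match`, `mismatch`, and `gap` are all assumptions made for illustration.

```python
def highest_pitch_melody(notes):
    """Reduce polyphonic notes to one melody line.

    notes: iterable of (onset_time, midi_pitch) pairs (assumed format).
    For every onset only the highest pitch is kept, so a chord
    collapses to its top note.
    """
    top = {}
    for onset, pitch in notes:
        if onset not in top or pitch > top[onset]:
            top[onset] = pitch
    return [top[t] for t in sorted(top)]


def contour(pitches):
    """Successive pitch differences in semitones.

    Working on differences makes the comparison independent of key.
    Using unquantized intervals is an assumption of this sketch.
    """
    return [b - a for a, b in zip(pitches, pitches[1:])]


def local_alignment_score(query, target, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences.

    Standard dynamic-programming recurrence in which a cell never
    drops below zero, so the best-matching sub-passage is found.
    The scoring values are illustrative, not taken from the thesis.
    """
    cols = len(target) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(query) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            step = match if query[i - 1] == target[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + step, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    # A C major chord at time 0 followed by a descending line.
    notes = [(0, 60), (0, 64), (0, 67), (1, 62), (1, 65), (2, 64), (3, 60)]
    melody = highest_pitch_melody(notes)
    print(melody, contour(melody))
    print(local_alignment_score(contour(melody), [3, -2, -1, -4, 5]))
```

Ranking the database by this score and returning the highest-scoring songs would give a candidate list of the kind the abstract evaluates.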

Keywords

content-based retrieval; music information retrieval; recommendation system; MPEG-7 Audio; melody; audio signature

Parallel Abstract

In this thesis, the author proposes a music information retrieval system and an audio recommendation system using MPEG-7 Audio. First, the Melody and Audio Signature Description Schemes, which are relevant to music, are introduced. For the music information retrieval system, the goal is to extract the MPEG-7 Melody Description Scheme from the MIDI input and use it to find the music clips in the database that are most similar to the queries. During melodic contour extraction, the system applies the highest pitch method to overcome the problem of chords within the melody; local alignment, a dynamic programming algorithm, is used to calculate the similarity. The Xerces C++ XML parser is adopted to parse the MPEG-7 files. Finally, a rating result is constructed for system performance evaluation. The audio recommendation system, on the other hand, uses the content-based filtering approach. From the music that the user prefers, the system extracts the corresponding MPEG-7 audio signatures and employs LBG vector quantization for classification. A new music input can then be classified and given a score that decides whether the song is recommended. For the music information retrieval system, the recognition rate reaches 65% when only the first similar candidate is considered, at contour length 10, among 1,242 MIDI files; the recognition rate rises as the contour length increases. For the audio recommendation system, two people rated 571 WAV and MP3 songs according to their preferences, and the recommendation system also rated songs from this set. Comparing the human ratings with the system ratings shows that the average correct recommendation ratio reaches 75%.
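As a rough illustration of the LBG vector quantization step, the following Python sketch builds a codebook by repeated splitting and refinement. It assumes that each song is already represented by fixed-length feature vectors standing in for the MPEG-7 audio signature. The `recommend` rule, which compares distortion under a "liked" and a "disliked" codebook, is a hypothetical scoring scheme; the abstract does not say how the score is actually computed.

```python
def _mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda k: _dist2(codebook[k], v))


def lbg(vectors, size, eps=0.01, iters=20):
    """LBG codebook training; size is assumed to be a power of two.

    Start from the global centroid, split every codeword into two
    slightly perturbed copies, then refine with nearest-neighbour
    assignment and centroid updates. eps and iters are illustrative.
    """
    codebook = [_mean(vectors)]
    while len(codebook) < size:
        codebook = [[c + s * eps for c in cw]
                    for cw in codebook for s in (1, -1)]
        for _ in range(iters):
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [_mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def distortion(codebook, vectors):
    """Average squared distance from each vector to its nearest codeword."""
    return sum(_dist2(codebook[nearest(codebook, v)], v)
               for v in vectors) / len(vectors)


def recommend(liked_codebook, disliked_codebook, song_vectors):
    """Hypothetical rule: recommend when the song fits the liked codebook better."""
    return (distortion(liked_codebook, song_vectors)
            < distortion(disliked_codebook, song_vectors))


if __name__ == "__main__":
    liked = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    disliked = [[10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]
    print(lbg(liked + disliked, 2))
    print(recommend(lbg(liked, 2), lbg(disliked, 2), [[0.4, 0.6]]))
```

The additive perturbation in the split step is chosen so that a centroid at the origin still separates into two distinct codewords.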

Parallel Keywords

content-based retrieval ； music information retriev

