

Querying By Humming System: Improved Onset Detection and Modified Melody Matching

Advisor: 丁建均


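In recent years, music retrieval techniques have come under intensive study. Traditionally, we retrieve music with text-based cues such as the song title or the singer's name. But what if we have forgotten the title, the singer, and every other textual cue? We therefore introduce a novel concept: retrieval based on the melodic content of a song. Query by humming (QBH) is an interactive concept in which people search for music with a computer over the Internet, and with a large music database the search must return results quickly.

Broadly speaking, we extract the pitches of a hummed melody and compare them against the music database to find the greatest similarity. Highly similar songs in the database are listed in descending order of similarity. This thesis presents algorithms for onset detection, pitch detection, and melody matching. However, singing produced by people is always imperfect, so many difficulties remain to be solved. Everyone sings differently, which leads to many different waveform shapes: pitch height, intonation accuracy, the way note endings are hummed, singing style, and so on. These factors also make onset detection difficult.

Therefore, in this thesis we propose three main ways to improve QBH: one targets onset detection and the other two target melody matching. In addition, we adopt our own method for pitch detection.

Finally, future research should work further on improving QBH performance, making it more adaptable and reliable across different singing styles. Melody matching also calls for faster search times and lower complexity, especially with a large music database.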


Music retrieval techniques have been investigated extensively in recent years. Typically, we retrieve music by the name of the singer or the song. However, how do we search for a song when we have forgotten both? Here we introduce a novel concept for music retrieval: searching by any melodic passage of a song. ‘Querying by humming’ (QBH) is an interactive concept in which people search for music with a computer over the Internet, and the results are returned quickly and in order by comparing the sung input with a large database of known songs. Generally speaking, we extract a series of pitches from the humming input of a single individual and compare them with the pitch intervals of the known musical database. Melodies (themes) in the database that are similar to the sung input are retrieved and listed in order of similarity score. This thesis presents algorithms for note onset detection (event detection), pitch detection, pitch quantization, melody encoding, and melody matching (similarity matching or pattern matching). However, human reproduction of melodies is always imperfect, so many difficulties remain to be overcome. Every individual has a different singing style, which results in a variety of sung-input patterns that make note onset detection more difficult. Moreover, the accuracy of onset detection influences the performance of melody matching. Therefore, in this thesis we propose three methods to improve the QBH system: one improves onset detection and the other two improve melody matching. In addition, we use our own method to extract the fundamental frequency. Future research should make an effort to improve event detection, making it more adaptive and reliable across various situations. Furthermore, research on measuring the similarity between melodies in the database and the sung theme needs to be pursued further to reduce computation time and complexity, as the size of music databases keeps growing rapidly.
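
As an illustration of the pipeline outlined above (pitch quantization, melody encoding as pitch intervals, and similarity-ranked matching), the following is a minimal Python sketch. It is not the method proposed in the thesis: the function names, the semitone (MIDI) quantization, and the use of plain edit distance as the similarity measure are all assumptions chosen for brevity, and onset and pitch detection are assumed to have already produced one frequency per note.

```python
import math


def hz_to_midi(freq_hz):
    """Quantize a frequency in Hz to the nearest MIDI note (A4 = 440 Hz = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def to_intervals(midi_notes):
    """Encode a melody as semitone differences between consecutive notes.

    Intervals are key-invariant, so a query hummed in a different key
    still matches the stored melody.
    """
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


def edit_distance(a, b):
    """Levenshtein distance between two interval sequences (assumed measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]


def rank_songs(query_hz, database):
    """Return song titles ordered from most to least similar to the query.

    `database` is a hypothetical dict mapping a title to its MIDI note list.
    """
    query = to_intervals([hz_to_midi(f) for f in query_hz])
    scored = [(edit_distance(query, to_intervals(notes)), title)
              for title, notes in database.items()]
    return [title for _, title in sorted(scored)]


# Hypothetical two-song database and a query hummed a fifth higher.
songs = {
    "twinkle": [60, 60, 67, 67, 69, 69, 67],
    "ode": [64, 64, 65, 67, 67, 65, 64],
}
hummed = [392.0, 392.0, 587.33, 587.33, 659.26, 659.26, 587.33]
ranking = rank_songs(hummed, songs)
```

Encoding intervals rather than absolute pitches makes the match independent of the key in which the user hums, and an edit-distance measure tolerates some inserted, dropped, or wrong notes. A linear scan over the database, as here, is exactly the cost that grows with database size.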


