A Study on Finding Singers with Similar Singing Voices in Popular Music

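Popular music is the form of music most widely accepted by the general public, and also the most abundant. Many convenient query methods have been provided to meet people's music retrieval needs. However, when a user remembers only the name of a singer whose voice resembles the target's, and has forgotten the target singer's name or the song title, the desired music often still cannot be found. Among current related research topics, no approach yet supports music queries by similar singing voice. In view of this, this thesis proposes a study on finding singers with similar singing voices, mining such singers from the timbre of the singing voice. We use features matched to the frequencies to which human hearing is sensitive: spectrum statistics, obtained via the discrete Fourier transform, from which changes in singing style and formants can be observed. These features characterize a singer's voice, and Euclidean distance is used to compute the similarity between singers' voices. Based on the similarity comparison, music can be queried by the name of another singer with a similar voice, or classified on the basis of similar singing voices. The experiments in this thesis show that the adopted feature extraction method is effective at identifying songs whose singers have similar singing voices.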
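As a minimal sketch of the pipeline outlined above, assuming details the abstract does not specify, the following Python code computes a magnitude spectrum with a direct discrete Fourier transform, summarizes it with two spectrum statistics, and ranks candidate singers by Euclidean distance. The choice of spectral centroid and spread as the statistics, the synthetic harmonic frames, and all function and singer names are illustrative assumptions rather than the thesis's actual features or data; formant estimation is omitted.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum (bins 0..n//2) via a direct O(n^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_stats(frame, sample_rate):
    """Return (centroid_hz, spread_hz); these two statistics are an assumption."""
    mags = dft_magnitudes(frame)
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    total = sum(mags)
    if total == 0:
        return (0.0, 0.0)
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    variance = sum((f - centroid) ** 2 * m for f, m in zip(freqs, mags)) / total
    return (centroid, math.sqrt(variance))


def singer_distance(features_a, features_b):
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.dist(features_a, features_b)


def synthetic_voice(weights, f0=250.0, sample_rate=8000, n=256):
    """Hypothetical stand-in for a sung frame: harmonics of f0 with given weights.

    f0 is chosen so every harmonic falls exactly on a DFT bin, which keeps
    the example free of spectral leakage without a window function.
    """
    return [
        sum(w * math.sin(2 * math.pi * f0 * (h + 1) * t / sample_rate)
            for h, w in enumerate(weights))
        for t in range(n)
    ]


SAMPLE_RATE = 8000
query = spectral_stats(synthetic_voice([1.0, 0.5, 0.25]), SAMPLE_RATE)
candidates = {
    "singer_a": spectral_stats(synthetic_voice([1.0, 0.6, 0.25]), SAMPLE_RATE),
    "singer_b": spectral_stats(synthetic_voice([0.2, 0.4, 1.0]), SAMPLE_RATE),
}

# Rank candidates from most to least similar to the query voice.
ranked = sorted(candidates, key=lambda name: singer_distance(query, candidates[name]))
print(ranked)  # ['singer_a', 'singer_b']
```

Here `singer_a`, whose harmonic weights are close to the query's, ranks ahead of the brighter `singer_b`. Real recordings would additionally need windowing and averaging over many frames, which the sketch does not attempt.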

Keywords

Discrete Fourier transform; formant; Euclidean distance; singing-voice similarity

Parallel Abstract

Pop music is the style most favored by the public, and also the most abundant. Although existing music retrieval systems provide various convenient query methods, none can retrieve music when the user remembers only the vocal timbre. If the user knows the name of another singer with a similar timbre, however, a music retrieval system should support querying by that similar singer's name. This research therefore presents a way to find similar singers based on vocal timbre, i.e., using the sound of a singing voice to find a forgotten singer with a similar sound. We adopt two kinds of features, matched to the sensitivity of human hearing, to evaluate the differences between singers: statistics of the spectrum obtained from the discrete Fourier transform, and formants. Euclidean distance is then used to measure the difference between the features of vocal sounds. According to the experimental results, our method can serve as the basis for a novel retrieval method and for music classification by vocal sound.

Parallel Keywords

Discrete Fourier Transform; Formant; Euclidean Distance; Vocal Similarity

