A Study of Automatic Singing Scoring Methods

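This study aims to give computers the ability to judge how well a person sings. Although many karaoke systems claim to offer automatic singing scoring, their scores often differ greatly from actual singing quality, mainly because most systems rely only on a crude measure of singing volume. This study therefore attempts to develop a better singing-scoring technique. Most karaoke systems use dual-track songs, in which each song consists of an accompaniment-only track and a track containing vocals with accompaniment. We therefore try to recover the solo (unaccompanied) vocal from the accompanied vocal track and use it as the reference for scoring the test singing. To extract the solo vocal, we examine four known signal-processing methods. The first assumes that the accompaniment signals in the two tracks differ only in volume and in slight timing variations; it optimally adjusts the volume and timing of one track's signal and then subtracts it from the other track's signal to obtain an approximate solo vocal. The second uses spectral subtraction to obtain an approximate spectrum of the solo vocal. The third uses a Wiener filter to remove the background music. The fourth uses the LMS and RLS algorithms from adaptive signal processing to estimate the solo vocal. Once the reference vocal sample has been built, the system compares the test singing with the reference singing on three measures (dynamic volume, pitch, and note duration) and converts the degree of similarity into a score. To assess the system's reliability, we invited several people of differing singing ability to record test singing, and hired people whose singing ability rivals that of professional singers to score the test recordings manually; these manual scores serve as the reference answers for judging the correctness of the system's scores. Experiments show that the scores of the proposed system are very close to the human scores.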
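The first extraction method can be illustrated with a small sketch. The abstract only states that one track is adjusted in volume and timing and then subtracted from the other; the details below are assumptions made for illustration: the timing adjustment is modelled as a single integer sample shift, the gain is chosen by least squares, and the names `extract_vocal`, `mixed`, `accomp`, and `max_shift` are hypothetical.

```python
def extract_vocal(mixed, accomp, max_shift=8):
    """Estimate the solo vocal by subtracting a gain- and shift-adjusted
    accompaniment track from the vocal-plus-accompaniment track.

    Assumption: the two accompaniments differ only by one gain factor and
    one integer sample shift (a simplification of "slight timing variations").
    """
    n = len(mixed)
    best = None
    for shift in range(-max_shift, max_shift + 1):
        # shifted[i] = accomp[i - shift], with zeros outside the track
        shifted = [accomp[i - shift] if 0 <= i - shift < len(accomp) else 0.0
                   for i in range(n)]
        energy = sum(a * a for a in shifted)
        if energy == 0.0:
            continue
        # Least-squares gain that minimises the residual energy for this shift
        gain = sum(m * a for m, a in zip(mixed, shifted)) / energy
        residual = [m - gain * a for m, a in zip(mixed, shifted)]
        err = sum(r * r for r in residual)
        if best is None or err < best[0]:
            best = (err, shift, gain, residual)
    if best is None:
        raise ValueError("accompaniment track has no energy")
    _, shift, gain, residual = best
    # The residual is the approximate solo vocal
    return residual, shift, gain
```

Calling `extract_vocal(mixed, accomp)` returns the estimated vocal together with the chosen shift and gain. Because the gain is fitted while the vocal is still present in `mixed`, the estimate is slightly biased whenever the vocal correlates with the accompaniment, which is one reason such a subtraction only yields an approximate solo vocal.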

Keywords

solo vocal extraction; pitch contour; singing performance evaluation

Parallel Abstract

This work aims to develop an automated system capable of evaluating the singing skill of a Karaoke user. Although most Karaoke apparatuses claim to offer automatic scoring, their capabilities are far from satisfactory. The major failing is that only singing energy is used as a cue for evaluating singing performances. To boost the value of Karaoke apparatuses, this work attempts to provide a better solution. First, we study how to extract vocal signals from Karaoke music, so that singing performance can be evaluated by comparing users' singing signals with the extracted vocal signals. Karaoke music typically comprises two distinct channels: one is a mixture of the lead vocals and background accompaniment, and the other consists of accompaniment only, which sounds similar to the background accompaniment in the first channel. We investigate four approaches for vocal extraction. The first approach removes the background accompaniment by subtracting one channel's signal from the other's in the time domain, after optimized amplitude scaling and time shifting. The second approach applies spectral subtraction to remove the background accompaniment. The third approach performs Wiener filtering to estimate the vocal signals from the accompanied vocals. The last approach uses adaptive filtering techniques, namely LMS and RLS. After the vocal information is obtained, its energy variations, pitch contour, and tempo are used as reference patterns against which users' singing is compared. To evaluate the proposed system, we invited several amateur singers with a variety of singing capabilities to contribute test singing samples, and employed persons with professional-level singing capabilities to score the collected samples. Preliminary experimental results show that the scores of our proposed system match most of the human scores.
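The adaptive-filtering approach can be sketched as a textbook LMS noise canceller, in which the accompaniment-only channel serves as the reference input and the filter's error signal is taken as the vocal estimate. This is a generic illustration, not the configuration used in the thesis: the function name `lms_cancel`, the filter length `taps`, and the step size `mu` are hypothetical choices.

```python
def lms_cancel(mixed, accomp, taps=4, mu=0.05):
    """Generic LMS noise canceller (illustrative sketch).

    mixed  : samples of the vocal-plus-accompaniment channel
    accomp : samples of the accompaniment-only channel (reference input)
    Returns the error signal (the vocal estimate) and the final weights.
    """
    w = [0.0] * taps          # adaptive filter weights
    vocal_est = []
    for n in range(len(mixed)):
        # Most recent `taps` reference samples, newest first, zero-padded
        x = [accomp[n - k] if 0 <= n - k < len(accomp) else 0.0
             for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))   # predicted accompaniment
        e = mixed[n] - y                           # residual = vocal estimate
        # LMS update: step along the instantaneous error gradient
        w = [wk + mu * e * xk for wk, xk in zip(w, x)]
        vocal_est.append(e)
    return vocal_est, w
```

The step size trades convergence speed against stability: plain LMS needs `mu` small relative to the reference signal's power, whereas RLS, the other algorithm named above, typically converges faster at a higher computational cost per sample.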

Parallel Keywords

vocal extraction; pitch contour; singing performance evaluation


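Cited by

李育瑋 (2014). MPEG-7-based singer identification and singing scoring system [Master's thesis, National Chiao Tung University]. Airiti Library. https://doi.org/10.6842/NCTU.2014.00284

張馨文 (2011). A vocal-range-based automatic song recommendation system [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2011.01199

許銘凱 (2013). A study of methods for automatically determining whether sung lyrics are correct [Master's thesis, National Taipei University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0006-0402201321402100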
