
Automatic Music Score Recognition (自動樂譜辨識)

Advisor: 劉奕汶

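Abstract


Music scores were originally invented as a convenient way to record musical information graphically, and optical music recognition (OMR) aims to design an algorithmic pipeline that lets a computer automatically read scores that were designed for human readers. Scores stored electronically are mostly image files, so the goal of OMR is to extract musical information from an image. This dissertation addresses two aspects: score preprocessing and the recognition algorithm for a single staff. A score first goes through preprocessing, which splits it into smaller units that can be processed independently and handles noise and defects introduced by printing, so that the subsequent recognition stage receives the best possible input image. Recognition is the core of the dissertation: the recognition algorithm is implemented with template matching and a support vector machine, and it gives good results on real score images. The algorithm design also differs from earlier work in two ways. First, preprocessing uses random sample consensus (RANSAC), which makes its output random: each run on the same image gives a different result, so repeated runs become meaningful. The results of different runs can be combined into a better one, so symbols that a deterministic algorithm fails to recognize get a chance to be recognized. Second, the algorithm follows the divide-and-conquer approach, meaning the subproblems it produces are almost completely independent, which makes the implementation well suited to parallel processing for faster computation.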

Parallel Abstract


The purpose of optical music recognition is to develop a computer program that can read musical scores, which were invented for human beings to notate music. A score is usually stored as an image, so a recognition system must retrieve musical information from a set of pixels. This dissertation deals with two major issues: preprocessing and recognition. Preprocessing divides the input image into several slices that can be processed independently and handles defects introduced in the printing step; its goal is to simplify the subsequent recognition stage. Recognition on a single staff image is the core of this dissertation. The implementation is based on template matching and the support vector machine, and the algorithm works well on real score images. The design of the algorithm also brings a different perspective to optical music recognition. First, the preprocessing uses random sample consensus (RANSAC) as a part of staff detection. This randomness makes it meaningful to repeat the same operation: by comparing the results of different runs, consensus-based correction makes it possible to find symbols that existing deterministic algorithms cannot find. Second, the algorithm is based on the divide-and-conquer concept, which means the subtasks have little correlation, and hence the algorithm can be readily parallelized.
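The abstract names RANSAC as part of staff detection and says that repeated runs can be combined by consensus, but it gives no implementation details. The following is a minimal sketch of that general idea, not the dissertation's method: it fits one near-horizontal line to synthetic "dark pixel" coordinates and then repeats the fit with different seeds. The function names `ransac_line` and `consensus_intercept`, the parameters `n_iters` and `tol`, the synthetic data, and the use of a median as the consensus rule are all assumptions made for illustration.

```python
import random


def ransac_line(points, n_iters=200, tol=1.0, rng=None):
    """Fit y = m*x + b to 2-D points with RANSAC; return (m, b, inliers)."""
    rng = rng or random.Random()
    best = (0.0, 0.0, [])
    for _ in range(n_iters):
        # Draw a minimal sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # skip vertical pairs; staff lines are near-horizontal
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # Points within `tol` (vertical distance) support the candidate.
        inliers = [(x, y) for x, y in points if abs(y - (m * x + b)) <= tol]
        if len(inliers) > len(best[2]):
            best = (m, b, inliers)
    return best


def consensus_intercept(points, runs=5, seed=0):
    """Repeat RANSAC with different seeds and return the median intercept.

    The median is an assumed consensus rule, chosen only for this sketch.
    """
    fits = [ransac_line(points, rng=random.Random(seed + i)) for i in range(runs)]
    intercepts = sorted(b for _, b, _ in fits)
    return intercepts[len(intercepts) // 2]


# Synthetic data (hypothetical): pixels near y = 0.02*x + 50 stand in for a
# slightly skewed staff line; uniform outliers stand in for other symbols.
gen = random.Random(42)
line_pts = [(x, 0.02 * x + 50 + gen.uniform(-0.5, 0.5)) for x in range(0, 400, 4)]
outliers = [(gen.uniform(0, 400), gen.uniform(0, 100)) for _ in range(30)]
points = line_pts + outliers

m, b, inliers = ransac_line(points, rng=random.Random(1))
b_consensus = consensus_intercept(points)
print(f"slope={m:.4f} intercept={b:.2f} inliers={len(inliers)}")
print(f"consensus intercept over 5 runs: {b_consensus:.2f}")
```

Because each run draws different random samples, individual fits differ slightly, which is what makes combining several runs meaningful. Each run is also independent of the others, so the runs could be executed in parallel.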

