
Automatic Music Transcription of Guitar Performance Using Multitask Learning

Advisors: Berlin Chen, Li Su

Abstract


Automatic Music Transcription (AMT) is the task of converting an acoustic music signal into musical notation. Most previous AMT studies have targeted solo piano or multi-instrument recordings, and few systems have addressed music played on the guitar. A guitar piece is performed on six strings with varied fingerings, strumming, chord progressions, and other techniques, and may mix single notes with chords. An AMT model must therefore identify the notes played in a guitar performance from the rich harmonics produced by six distinct strings. Moreover, a single note in a song is very likely a chord tone, and most notes tend to fall on a beat or at a beat-related position such as the off-beat. In this research, we transcribe guitar performances using not only notes as output labels but also two kinds of side information: chords and beats. Among AMT subtasks, note-level transcription has typically used only note labels as output. We conducted several multitask-learning experiments that output note, chord, and beat labels simultaneously, aiming to improve note transcription performance on guitar audio, and we also report the performance of chord recognition and beat tracking within this system.

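To make the multitask setup concrete, the following is a minimal sketch of the idea described above: a shared encoder over a time-frequency input feeding three task-specific heads for notes, chords, and beats, trained with a weighted joint loss. This is an illustrative PyTorch sketch, not the thesis's actual architecture; the layer sizes, label vocabularies (n_pitches, n_chords), and loss weights are all assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultitaskGuitarAMT(nn.Module):
    """Illustrative multitask AMT model: one shared encoder, three task heads.

    Input: a time-frequency representation of the guitar recording,
    shaped (batch, 1, time, n_bins). All sizes are placeholders.
    """

    def __init__(self, n_bins=352, n_pitches=88, n_chords=25):
        super().__init__()
        # Shared front end; padding keeps the time/frequency dimensions intact.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Task-specific heads predicting per-frame labels.
        self.note_head = nn.Linear(64 * n_bins, n_pitches)  # multi-label pitch activations
        self.chord_head = nn.Linear(64 * n_bins, n_chords)  # one chord class per frame
        self.beat_head = nn.Linear(64 * n_bins, 1)          # beat activation per frame

    def forward(self, spec):
        h = self.encoder(spec)                 # (batch, 64, time, n_bins)
        h = h.permute(0, 2, 1, 3).flatten(2)   # (batch, time, 64 * n_bins)
        return self.note_head(h), self.chord_head(h), self.beat_head(h)

def multitask_loss(outputs, targets, weights=(1.0, 0.5, 0.5)):
    """Weighted sum of the three task losses; the weights are placeholders."""
    note_out, chord_out, beat_out = outputs
    note_t, chord_t, beat_t = targets  # float, long, float tensors respectively
    return (weights[0] * F.binary_cross_entropy_with_logits(note_out, note_t)
            + weights[1] * F.cross_entropy(chord_out.transpose(1, 2), chord_t)
            + weights[2] * F.binary_cross_entropy_with_logits(beat_out, beat_t))
```

In this sketch the auxiliary chord and beat losses simply regularize the shared encoder with fixed weights; how the thesis actually balances the three tasks is not specified in the abstract.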

