Learning to Compose Music of Mixed Genres with a Genre Fusion Game

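Music composition is a creative, evolutionary art that produces new pieces of music distinct from earlier ones. In this work, we view the composition process as a genre fusion game, in which a new genre evolves from old ones by fusing them in a subtle way through style transfer, yielding music that is both musical and a fusion of the earlier genres. In other words, we treat composition as two games: a musicality game and a genre fusion game. We propose an ALAE (Adversarial Latent Autoencoder), a GAN-like autoencoder deep learning architecture, to implement this idea. We present two variants of our system: 1) a musicality-only system and 2) a musicality-genre-fusion system. In the latter, we adopt a triplet loss function to train the genre fusion game. The full musicality-genre-fusion system is compared with two baselines: a random sampling fusion model and the musicality-only model. The musicality-only model uses the same ALAE architecture but focuses only on matching the real music distribution. We conducted both objective and subjective evaluation experiments. In the objective experiments, we evaluated our system against the baselines using reconstruction accuracy, empty bar ratio, polyphony, tonal distance, and a genre classifier. We found that the proposed musicality-genre-fusion system learned to generate music in a style characterized by a fusion of two genres. Its polyphony rate and empty bar ratio were close to the values of the real data distribution. The increased tonal distance, however, indicates that the system did explore regions outside the real music data distribution in order to satisfy the additional constraints imposed by the genre fusion game. In the subjective experiments, listeners quickly identified pop and rock samples when these came from their respective datasets. When asked to characterize samples from the musicality-genre-fusion system, however, they struggled to choose a specific genre and preferred "a mixture of pop and rock". We also found that both the full musicality-genre-fusion system and the musicality-only system were able to outperform the baselines used in our work.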
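As an illustration of the triplet loss mentioned in the abstract, the following is a minimal sketch of the standard margin-based formulation, max(0, d(a, p) - d(a, n) + margin). The abstract does not say how anchors, positives, and negatives are chosen for the genre fusion game, so the function names, the Euclidean distance, the margin value, and the toy embeddings below are all assumptions made for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (hypothetical helper).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (made up for illustration, not taken from the thesis).
anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

satisfied = triplet_loss(anchor, near, far)   # constraint already met
violated = triplet_loss(anchor, far, near)    # positive is farther than negative
```

Here `satisfied` is zero because the positive is already much closer than the negative, while `violated` is positive and would produce a gradient in a real training setup.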

Keywords

Music Generation; Style Transfer; Genre Fusion; Adversarial Latent Autoencoders

Parallel Abstract

Music composition is an evolutionary art of creativity that can produce new pieces of music in contrast to previous ones. In this work, we view the music composition process as a genre fusion game in which a new genre evolves from old ones by fusing them together in a subtle way using style transfer, resulting in music that is both musical and a fusion of previous genres. In other words, we view music composition as two games: a musicality game and a genre fusion game. We propose an ALAE (Adversarial Latent Autoencoder), a GAN-like autoencoder deep learning architecture, to implement this novel idea of music composition. We present two variants of our system: 1) a musicality-only system and 2) a musicality-genre-fusion system. In the latter, we adopt a triplet loss function to train the genre fusion game process. Our full musicality-genre-fusion system is compared with two baselines: a random sampling fusion model and the musicality-only model. The musicality-only model uses the same ALAE architecture but focuses only on matching the real music distribution. We conducted both objective and subjective evaluation experiments. In the objective experiments, we used various metrics, such as reconstruction accuracy, empty bar ratio, polyphony, tonal distance, and a genre classifier, to evaluate our system against the baselines. We found that our proposed musicality-genre-fusion system was able to learn to generate music in a style characterized by a fusion of two genres. The polyphony rate and empty bar ratio were close to the real data distribution's values. The increased tonal distances, however, showed that our system did explore regions outside the real music data distribution in order to satisfy the additional constraints imposed by the genre fusion game. In the subjective experiments, we found that listeners were able to quickly identify pop and rock samples when they came from their respective datasets. However, they struggled to choose a specific genre and preferred "a mixture of pop and rock" when asked to characterize samples from the musicality-genre-fusion system. We also found that both our full musicality-genre-fusion system and the musicality-only system were able to outperform the baselines used in our work.
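Two of the objective metrics named above, empty bar ratio and polyphony rate, can be sketched on a toy piano-roll representation. The abstract does not define these metrics, so the definitions here are assumptions: a bar counts as empty if no pitch sounds in any of its time steps, and a time step counts as polyphonic if at least `min_notes` pitches sound at once. The data layout (a list of bars, each a list of pitch sets) and the function names are likewise hypothetical.

```python
def empty_bar_ratio(bars):
    """Fraction of bars in which no pitch is active at any time step."""
    if not bars:
        raise ValueError("need at least one bar")
    empty = sum(1 for bar in bars if all(len(step) == 0 for step in bar))
    return empty / len(bars)


def polyphony_rate(bars, min_notes=2):
    """Fraction of time steps with at least `min_notes` simultaneous pitches.

    The threshold is an assumption; published definitions differ.
    """
    steps = [step for bar in bars for step in bar]
    if not steps:
        raise ValueError("need at least one time step")
    polyphonic = sum(1 for step in steps if len(step) >= min_notes)
    return polyphonic / len(steps)


# Toy example: two bars of four time steps each; each step is a set of
# active MIDI pitch numbers (values made up for illustration).
bars = [
    [{60, 64}, {60}, set(), set()],   # one chord, one single note
    [set(), set(), set(), set()],     # a completely silent bar
]

ebr = empty_bar_ratio(bars)   # 1 of 2 bars is empty
pr = polyphony_rate(bars)     # 1 of 8 time steps has two or more pitches
```

Comparing such values between generated samples and the training data is one plausible way to check how close a model stays to the real data distribution.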

Parallel Keywords

Music Generation; Style Transfer; Genre Fusion; Adversarial Latent Autoencoders

