
Variational Autoencoders for Polyphonic Music Interpolation

Advisor: 蘇豐文

Abstract

This thesis uses machine learning techniques to address the novel problem of music interpolation composition. Two models based on Variational Autoencoders (VAEs) are proposed to generate a suitable polyphonic harmonic bridge between two given songs, smoothly changing the pitches and dynamics across the interpolation. The interpolations generated by the first model surpass a random-data baseline and a bidirectional LSTM approach, and its performance is comparable to the current state of the art. The second model, a novel architecture, outperforms state-of-the-art interpolation approaches in terms of reconstruction loss by using an additional neural network to estimate the encoded vector of the interpolation directly. Furthermore, the Hsinchu Interpolation MIDI Dataset was created, making both models proposed in this thesis more efficient than previous approaches in the literature in terms of computational and time requirements during training. Finally, a quantitative user study was conducted to assess the validity of the results.

