
Using Vector Quantization to Achieve Voice Conversion

Advisor: 呂育道
Co-advisor: 李宏毅 (Hung-yi Lee)

Abstract


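This thesis proposes a deep voice conversion model based on vector quantization. It also compares the method against other existing methods through a range of objective and subjective evaluations. The results show that the proposed method outperforms existing methods in both fluency and speaker similarity.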

Parallel Abstract


This thesis proposes a vector quantization-based voice conversion approach. Objective and subjective evaluations show that the proposed method outperforms existing approaches in both audio naturalness and speaker similarity.

