Face image semantically decomposing with variational autoencoder and generative adversarial network

Advisor: 王勝德

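Abstract


The semantically decomposed generative adversarial network is a new architecture derived from the generative adversarial network. It semantically decomposes a face image into an "identity", which represents a person's unique facial features, and an "observation", which represents all remaining facial characteristics such as pose angle, lighting, and hairstyle. Building on this concept, we further combine the generative adversarial network with a variational autoencoder and propose a new network model. In this model, the variational autoencoder encodes a face image into a Gaussian-distributed vector space during the encoding phase, decomposing it into identity and observation parts at the same time, and then reconstructs the original image from these two parts; the generative adversarial network uses its adversarial training to give the images produced by the decoder a quality close to that of real photographs. This architecture lets the model generate new face images more intuitively and more efficiently, either from random distributions or from the identity and observation of existing images. To validate the model, we show how identity and observation affect the generated images and demonstrate that the semantic decomposition succeeds.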

Parallel Abstract


As one of the state-of-the-art generative adversarial models, SDGANs can generate face images from two semantic components, "identity" and "observation". In their formulation, identity stands for a person's unique facial features, and observation stands for all other features, including pose, lighting, hair color, etc. We extend SDGANs by combining them with a variational autoencoder, introducing a new network architecture. In this model, the variational autoencoder part encodes a face image into a Gaussian-distributed vector space, decomposes it into an identity part and an observation part in the encoding phase, and reconstructs the image in the decoding phase. The generative adversarial network part enhances the quality of the images generated by the decoder, making them photo-realistic. This architecture enables the model to generate new images, either from a random distribution or from the identity or observation component of an existing image, more intuitively and more efficiently. To verify our model, we demonstrate how identity and observation affect the generated images and show that the semantic decomposition succeeds.
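
The latent-space handling described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the encoder, decoder, and discriminator networks are omitted, and the dimensions (`ID_DIM`, `OBS_DIM`), the function names, and the numeric encoder outputs are hypothetical placeholders. It only shows the standard VAE reparameterization and KL term, plus how a latent vector split into identity and observation parts could be recombined across images.

```python
import math
import random

ID_DIM = 4   # hypothetical size of the identity part
OBS_DIM = 4  # hypothetical size of the observation part


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def split_latent(z):
    """Split a latent vector into (identity, observation) parts."""
    return z[:ID_DIM], z[ID_DIM:ID_DIM + OBS_DIM]


def recombine(identity, observation):
    """Concatenate the two parts into the vector a decoder would consume."""
    return list(identity) + list(observation)


rng = random.Random(0)

# Stand-ins for the encoder outputs of two face images (made-up numbers).
mu_a, log_var_a = [0.5] * 8, [-2.0] * 8
mu_b, log_var_b = [-0.5] * 8, [-2.0] * 8

z_a = reparameterize(mu_a, log_var_a, rng)
z_b = reparameterize(mu_b, log_var_b, rng)

id_a, obs_a = split_latent(z_a)
id_b, obs_b = split_latent(z_b)

# Identity of image A combined with the observation of image B.
z_mix = recombine(id_a, obs_b)

# Identity of image A with a fresh observation drawn from the prior.
z_new_obs = recombine(id_a, [rng.gauss(0.0, 1.0) for _ in range(OBS_DIM)])
```

In a full model, `z_mix` and `z_new_obs` would be passed to a decoder, and a discriminator would push the decoded images toward photo-realism; both are outside the scope of this sketch.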

