The semantically decomposed generative adversarial network (SD-GAN) is a new architecture derived from the generative adversarial network. It semantically decomposes a face image into an "identity" component, representing a person's unique facial features, and an "observation" component, representing all remaining facial attributes such as pose, lighting, and hairstyle. Building on this idea, we further combine the generative adversarial network with a variational autoencoder and propose a new network model. In this model, the variational autoencoder encodes a face image into a Gaussian-distributed latent space in the encoding phase, simultaneously decomposing it into identity and observation parts, and then reconstructs the original image from these two parts in the decoding phase; the generative adversarial network uses its adversarial training to push the decoder's outputs toward photo-realistic quality. This architecture lets the model generate new face images more intuitively and efficiently, either from a random distribution or from the identity and observation of existing images. To validate the model, we demonstrate how identity and observation affect the generated images and show that the semantic decomposition succeeds.
As one of the state-of-the-art generative adversarial models, SDGANs can generate face images from two semantic components, "identity" and "observation". In this formulation, identity stands for a person's unique facial features, while observation stands for all other attributes, including pose, lighting, hair color, etc. We extend SDGANs by combining them with a variational autoencoder, introducing a new network architecture. In this model, the variational autoencoder part encodes a face image into the latent space of a Gaussian distribution, decomposes it into identity and observation parts in the encoding phase, and reconstructs the image in the decoding phase. The generative adversarial network part enhances the quality of the images produced by the decoder, making them photo-realistic. This architecture enables the model to generate new images, either from a random distribution or from the identity or observation component of an existing image, more intuitively and more efficiently. To verify our model, we demonstrate how identity and observation affect the generated images and show that the semantic decomposition succeeds.
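The latent manipulation described above can be illustrated with a minimal sketch. All names and dimensions below (`LATENT_DIM`, `ID_DIM`, the encoder statistics) are hypothetical placeholders, not the thesis's actual implementation; the sketch only shows the VAE reparameterization, the identity/observation split of the latent code, and how swapping parts between two images would yield a new combination for the decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 64                 # hypothetical total latent size
ID_DIM = 16                     # hypothetical "identity" sub-vector size
OBS_DIM = LATENT_DIM - ID_DIM   # remaining "observation" sub-vector

def reparameterize(mu, logvar):
    """VAE reparameterization trick: sample z ~ N(mu, sigma^2)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def split_latent(z):
    """Semantically decompose a latent code into (identity, observation)."""
    return z[..., :ID_DIM], z[..., ID_DIM:]

def combine(identity, observation):
    """Recombine the two parts; feeding the result to the decoder would
    render `identity`'s facial features under `observation`'s attributes."""
    return np.concatenate([identity, observation], axis=-1)

# Pretend the encoder produced these statistics for two different photos.
mu_a, logvar_a = rng.standard_normal(LATENT_DIM), np.zeros(LATENT_DIM)
mu_b, logvar_b = rng.standard_normal(LATENT_DIM), np.zeros(LATENT_DIM)

z_a = reparameterize(mu_a, logvar_a)
z_b = reparameterize(mu_b, logvar_b)

id_a, _obs_a = split_latent(z_a)
_id_b, obs_b = split_latent(z_b)

# Person A's identity rendered with person B's pose/lighting/hairstyle.
z_mixed = combine(id_a, obs_b)
print(z_mixed.shape)  # (64,)
```

In the full model, `z_mixed` would be passed to the decoder, and the adversarial discriminator would push that decoded image toward photo-realism.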