
結合頻率域損失之生成對抗網路影像合成機制

Image Synthesis Using Generative Adversarial Network with Frequency Domain Constraints

Advisor: 廖文宏

Abstract


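Generative adversarial network (GAN) techniques keep improving, and the images they produce are often impossible for the human eye to tell apart from real ones. However, because GANs have difficulty reconstructing high-frequency information during learning, artifacts can be observed in the frequency domain, and detection models can therefore identify the generated images easily. Other studies have also shown that high-frequency components hinder GAN learning, so accounting for the frequency domain while generating images has become a major challenge. This thesis approaches the problem from the frequency-domain perspective. Besides verifying that removing part of the high-frequency noise does help GANs learn more effectively, it proposes adding a frequency loss to improve training. Experiments show that losses based on either the discrete Fourier transform or the discrete wavelet transform help GANs produce images of better quality. On the CelebA face dataset, images generated with the added discrete wavelet loss reach a best FID of 6.53, a substantial improvement over the FID of 16.53 obtained by SNGAN, and models trained with a frequency loss are also more stable during training. The thesis further tests the results with a general-purpose real/fake classification model: images produced by the improved model markedly reduce its detection accuracy, indicating that the images generated by the improved model are more realistic. This confirms that supplying frequency information to GANs benefits the training process, and it offers additional directions for future GAN research.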
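The abstract does not spell out the exact form of the frequency loss, so the following is only a minimal NumPy sketch of one plausible DFT-based variant. It compares the batch-averaged log-amplitude spectra of real and generated images. The function names `dft_log_amplitude` and `dft_loss`, the log scaling, and the L1 distance are all assumptions made for illustration, not the thesis's definition.

```python
import numpy as np


def dft_log_amplitude(images):
    """Return the centered log-amplitude spectrum of a batch shaped (N, H, W)."""
    spectrum = np.fft.fft2(images, axes=(-2, -1))
    # Move the zero-frequency (DC) term to the center of each spectrum.
    spectrum = np.fft.fftshift(spectrum, axes=(-2, -1))
    # log1p compresses the large dynamic range of the amplitudes (assumed choice).
    return np.log1p(np.abs(spectrum))


def dft_loss(real_batch, fake_batch):
    """Mean absolute difference between batch-averaged log-amplitude spectra.

    Hypothetical loss form: the source only states that a DFT-based
    loss is added, not how it is computed.
    """
    real_spec = dft_log_amplitude(real_batch).mean(axis=0)
    fake_spec = dft_log_amplitude(fake_batch).mean(axis=0)
    return float(np.mean(np.abs(real_spec - fake_spec)))
```

In actual GAN training, a term like this would have to be written in a differentiable framework and added to the adversarial loss with some weighting factor; NumPy is used here only to keep the idea readable.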

Parallel Abstract


Generative adversarial networks (GANs) have evolved rapidly since their introduction in 2014. The quality of synthesized images has improved significantly, making it difficult for human observers to tell real and GAN-created images apart. Because GANs cannot faithfully reconstruct the high-frequency components of a signal, however, artifacts can be observed in the frequency-domain representation, and these are easily detected by simple classification models. Researchers have also studied the adverse effects of high-frequency components on the training process. It is thus a challenging task to synthesize visually realistic images while maintaining fidelity in the frequency domain. This thesis attempts to enhance the quality of images generated by GANs by incorporating frequency-domain constraints. To begin with, we observe that the overall training process becomes more stable when high-frequency noise is filtered out. We then propose including frequency-domain losses in the generator and discriminator networks and investigate their effects on the generated images. Experimental results indicate that both discrete Fourier transform (DFT) and discrete wavelet transform (DWT) losses are effective in improving the quality of the generated images, and the training processes turn out to be more stable. We verify our results using a classification model designed to detect fake images. Its accuracy is significantly reduced on images generated by our modified GAN model, demonstrating the advantages of incorporating frequency-domain constraints in generative adversarial networks.
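To make the DWT loss mentioned above more concrete, here is a minimal NumPy sketch that uses a single-level 2D Haar transform. The abstract names neither the wavelet, the number of decomposition levels, nor the distance measure, so the Haar wavelet, the single level, the L1 distance, and the names `haar_dwt2` and `dwt_loss` are all assumptions for illustration.

```python
import numpy as np


def haar_dwt2(images):
    """Single-level orthonormal 2D Haar DWT over the last two axes.

    Expects arrays shaped (..., H, W) with even H and W and returns four
    sub-bands of shape (..., H/2, W/2): one approximation band and three
    detail bands. Sub-band naming conventions vary between libraries.
    """
    a = images[..., 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = images[..., 0::2, 1::2]  # top-right
    c = images[..., 1::2, 0::2]  # bottom-left
    d = images[..., 1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0
    detail_rows = (a + b - c - d) / 2.0   # difference between rows
    detail_cols = (a - b + c - d) / 2.0   # difference between columns
    detail_diag = (a - b - c + d) / 2.0   # diagonal difference
    return approx, detail_rows, detail_cols, detail_diag


def dwt_loss(real_batch, fake_batch):
    """Average L1 distance over the four Haar sub-bands (hypothetical form)."""
    real_bands = haar_dwt2(real_batch)
    fake_bands = haar_dwt2(fake_batch)
    per_band = [np.mean(np.abs(r - f)) for r, f in zip(real_bands, fake_bands)]
    return float(np.mean(per_band))
```

The three detail bands carry the high-frequency content, so a loss of this kind penalizes mismatches in exactly the components that the abstract says GANs struggle to reconstruct. A training implementation would need a differentiable version of the same transform.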

