
對齊和非對齊之影像轉換學習

Learning Aligned and Misaligned Image-to-Image Translation

Advisor: 陳煥宗

Abstract


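This thesis discusses image-to-image translation for aligned and misaligned image pairs. For misaligned image pairs, we pose the corresponding problem of Chinese handwriting synthesis. Methodologically, we use a cascaded generative adversarial network to resolve the conflict between misaligned images and the U-shaped generator, and we show that this network also effectively resolves the mode collapse that misaligned image pairs cause in ordinary generative adversarial networks. For aligned image translation, we examine how a deep learning model should cope when it has seen only a single training image. Here we propose a two-step training procedure that compensates for the image blur caused by the shortage of training data; in feature extraction, it further captures the grouping relationships within an image. We also show that the images synthesized under our training procedure are the ones people prefer most.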

Parallel Abstract


This research addresses aligned and misaligned image-to-image translation. For misaligned image-to-image translation, we study the corresponding Chinese handwriting synthesis problem. We introduce the Cascaded-GAN to handle the incompatibility between U-Net and misaligned training image pairs. Cascaded-GAN also effectively solves the mode collapse problem. For aligned image-to-image translation, we discuss how a deep learning model may tackle the one-shot learning scenario in image translation. We propose a two-step training strategy to remedy the blurry results caused by the lack of training data. Furthermore, the feature extraction successfully captures grouping information. Finally, we show that most people prefer the images synthesized by our model.

