
A Study of an Intelligent Face Beautification System

A Study of Face Image to Beautiful Face Image Translation Based on Cycle Generative Adversarial Networks

Advisor: 王周珍
Co-advisor: 周培廉 (Pei-Lien Chou)

Abstract


Face beautification is a very common function in image processing, but most methods require the user to tune algorithm parameters manually to achieve the best result, which makes the development of an intelligent face beautification system an important research direction. In 2017, Zhu et al. proposed cycle-consistent generative adversarial networks (CycleGAN) [4] for unpaired image-to-image translation. Applied to face beautification, CycleGAN does beautify faces, but it suffers from loss of contours and edges. To address this problem, Chang et al. proposed a paired CycleGAN (PairedCycleGAN) in 2018 [10]: they divide the image into regions, use image processing to produce better positive samples, and then train a separate regional network in a paired fashion. Their experiments show that PairedCycleGAN indeed achieves better face beautification, but producing the positive samples increases system complexity, and training the separate regional networks greatly increases memory usage.

This thesis therefore combines the CycleGAN and PairedCycleGAN architectures into a modified CycleGAN (MCycleGAN). The proposed MCycleGAN retains unpaired training while using Sobel and Gaussian filters on the Western face dataset, and Sobel and bilateral filters on the Asian face dataset, to extract image features and produce positive samples, yielding better face beautification. The proposed MCycleGAN beautification system increases neither system complexity nor memory usage, and it also resolves the loss of edge detail.

Experimental results show that, measured by the intersection over union (IOU) of semantic segmentation, MCycleGAN outperforms CycleGAN by 0.07 when Sobel and Gaussian filters produce the Western face positive samples, and by 0.065 when Sobel and bilateral filters produce the Asian face positive samples. We also conducted a subjective evaluation of MCycleGAN and CycleGAN with professional image-processing practitioners: when the original image was available for reference, their preference for MCycleGAN was on average 16.5% higher than for CycleGAN; when only the beautified images were compared with each other, it was on average 16% higher.
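The abstract describes producing positive samples by filtering face images with Sobel and Gaussian (or bilateral) filters to preserve edge detail. As a rough, numpy-only illustration, the sketch below smooths an image with a Gaussian kernel and re-injects Sobel edge magnitude; the `make_positive_sample` combination and its `alpha` weight are hypothetical, since the abstract does not specify how the filter outputs are merged.

```python
import numpy as np

def filter2d(img, kernel):
    """2D cross-correlation with edge padding, output same size as input."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(img):
    """Gradient magnitude from horizontal and vertical Sobel responses."""
    gx = filter2d(img, SOBEL_X)
    gy = filter2d(img, SOBEL_Y)
    return np.hypot(gx, gy)

def gaussian_kernel(sigma=1.0, radius=2):
    """Normalized (2*radius+1)^2 Gaussian kernel."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def make_positive_sample(face, sigma=1.0, alpha=0.5):
    """Hypothetical positive sample: smoothed skin plus re-injected edges.

    `alpha` (illustrative) weights how strongly Sobel edges are added back
    to the Gaussian-smoothed image; pixel values are assumed in [0, 1].
    """
    smooth = filter2d(face, gaussian_kernel(sigma))
    edges = sobel_magnitude(face)
    return np.clip(smooth + alpha * edges, 0.0, 1.0)
```

For the Asian face dataset the thesis substitutes a bilateral filter (edge-preserving smoothing) for the Gaussian step; that variant is omitted here for brevity.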


Parallel Abstract


Face beautification is a very common function in image processing, but most methods require the user to adjust parameters manually to achieve the best beautified result. How to develop an intelligent face beautification system is therefore an important research topic. In 2017, Zhu et al. proposed cycle-consistent generative adversarial networks (CycleGAN) for unpaired image-to-image translation [4]. CycleGAN can be applied successfully to face beautification, but it suffers from blurred contours and edges. To improve on this, Chang et al. proposed a paired CycleGAN (PairedCycleGAN) system in 2018 [10]. They first divide the image into regions and use image filtering to produce better positive samples, then perform regional network learning with the paired system. Their experiments show that PairedCycleGAN indeed achieves better beautification, but producing the positive samples increases system complexity, and training the separate regional networks occupies a large amount of memory. In this thesis we therefore propose a modified CycleGAN (MCycleGAN) that obtains better beautification by supplying suitable positive samples to the CycleGAN system. The proposed MCycleGAN keeps the unpaired training architecture while employing Sobel and Gaussian filters to extract image features as positive samples, so it overcomes the contour and edge problems by combining the strengths of CycleGAN and PairedCycleGAN. Simulation results show that the proposed MCycleGAN obtains higher intersection over union (IOU) values than CycleGAN: on average about 0.07 higher on the Western face set and about 0.065 higher on the Asian face set. A subjective evaluation by image-processing experts also shows that MCycleGAN achieves a chosen rate about 16.5% higher than CycleGAN when an original face image is available as a reference, and about 16% higher when it is not.
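The IOU figures quoted above compare semantic-segmentation masks of the beautified outputs. For reference, a minimal IOU computation over boolean masks might look like the following sketch; the convention of returning 1.0 for two empty masks is an assumption, not something the thesis specifies.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over union of two boolean segmentation masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(a, b).sum() / union
```

An IOU of 1.0 means the masks coincide exactly, 0.0 means they are disjoint, so a gain of 0.07 on a [0, 1] scale is a meaningful segmentation improvement.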


References


[1] S. J. Russell and P. Norvig, "Artificial Intelligence: A Modern Approach," Pearson Education Limited, Malaysia, 2016.
[2] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Sixth International Conference on Computer Vision (ICCV), 1998, doi: 10.1109/ICCV.1998.710815.
[3] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative Adversarial Nets," in Advances in Neural Information Processing Systems 27 (NIPS 2014).
[4] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks," in 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 2242-2251, doi: 10.1109/ICCV.2017.244.
[5] J. Johnson, A. Alahi, and F.-F. Li, "Perceptual Losses for Real-Time Style Transfer and Super-Resolution," in Computer Vision – ECCV 2016, pp. 694-711.
