In this thesis, I propose a novel method that applies adversarial attacks to image-translation Generative Adversarial Networks (GANs), including CycleGAN, pix2pix, and pix2pixHD, and achieves satisfactory results on all three. Image-translation GANs are powerful models that translate images between different image domains. Used with malicious intent, however, these techniques can serve as deepfake algorithms that could, for example, remove the clothes from a person in a photograph. Given this potential threat, this thesis proposes applying adversarial attacks against image-translation GANs so that images perturbed by the adversarial method cannot easily be counterfeited by an image-translation GAN model; that is, feeding such a GAN an image slightly perturbed by the proposed method no longer yields the intended translation. Among the candidate adversarial loss functions for the attack procedure, this work finds that naively using the Discriminator of the GAN model does not lead to a successful attack, whereas using distance functions proves very effective. This work hopes to provide a guideline for future efforts to protect personal images from malicious use of image-translation GANs.
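The core idea — perturbing an input so that a distance function between the model's clean and perturbed outputs is maximized — can be sketched with a toy example. This is a minimal illustration, not the thesis's actual implementation: the generator is stood in for by a fixed linear map (real image-translation GANs are deep networks), and a PGD-style loop with a random start maximizes the L2 distance between outputs under an L-infinity budget.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an image-translation generator: G(x) = W x.
W = rng.standard_normal((8, 8))

def G(x):
    return W @ x

def distance_attack(x, steps=50, eps=0.1, lr=0.02):
    """PGD-style sketch: maximize ||G(x + delta) - G(x)||^2
    subject to ||delta||_inf <= eps."""
    y_clean = G(x)
    # Random start inside the eps-ball (at delta = 0 the gradient vanishes).
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(steps):
        # For the linear toy G, the gradient of the squared L2 distance
        # w.r.t. delta is 2 W^T (G(x + delta) - G(x)).
        grad = 2.0 * W.T @ (G(x + delta) - y_clean)
        delta = delta + lr * np.sign(grad)   # signed gradient ascent step
        delta = np.clip(delta, -eps, eps)    # project back into the eps-ball
    return delta

x = rng.standard_normal(8)
delta = distance_attack(x)
distortion = np.linalg.norm(G(x + delta) - G(x))
```

A real attack would replace the hand-derived gradient with automatic differentiation through the generator, but the loop structure — ascend on a distance loss, then project onto the perturbation budget — is the same.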