Enhancing the Robustness of Deep Learning Based Fingerprinting to Improve Deepfake Attribution

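Artificial fingerprinting (that is, digital watermarking) is a technique for tracing the source of Deepfakes through multimedia authentication. However, artificial fingerprinting has not prioritized robustness against different kinds of distortion, so watermarks embedded in images are easily destroyed by common image processing operations, and this lack of robustness limits the practical value of digital watermarking. In this thesis, we integrate an existing distortion agnostic watermarking technique with artificial fingerprinting to strengthen its robustness. The distortion agnostic watermarking technique consists of an attacker, an encoder, and a decoder. The attacker is a convolutional neural network (CNN) that distorts images to make the watermark harder to decode. The encoder and decoder are trained adversarially against the attacker: the encoder embeds the watermark into an image, and the decoder recovers, as faithfully as possible, the embedded watermark from the image distorted by the attack. We observed that for certain distortions, such as Gaussian blur and horizontal flipping, attacking images with the CNN alone does not improve the robustness of the encoder and decoder. Moreover, when Gaussian blur is applied, the presence of the attacker even degrades the quality of the decoded watermark. To address this problem, we designed an attack booster that applies specific distortions to images, so that during training the encoder and decoder are hardened against those distortions in addition to competing with the attacker, which improves the robustness of the overall artificial fingerprinting system. Experimental results show that the proposed method improves the decoding quality of artificial fingerprints. Even when images undergo Gaussian blur or horizontal flipping, which training with the attacker alone could not withstand, the proposed method still raises the bit accuracy of fingerprint decoding. Compared with the previous artificial fingerprinting system, it raises the bit accuracy of fingerprint decoding by 36%, taking Deepfake attribution one step further.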
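As a point of reference for the bit accuracy figures quoted in the abstract, the metric can be sketched as the fraction of fingerprint bits that the decoder recovers correctly. The function name `bitwise_accuracy`, the 0.5 decision threshold, and the sample values are assumptions made for illustration only; they are not taken from the thesis.

```python
from typing import Sequence


def bitwise_accuracy(
    embedded: Sequence[int],
    decoded: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off for turning scores into bits
) -> float:
    """Return the fraction of embedded bits that the decoder recovered."""
    if len(embedded) != len(decoded):
        raise ValueError("embedded and decoded must have the same length")
    if not embedded:
        raise ValueError("fingerprint must contain at least one bit")
    # Turn the decoder's per-bit scores into hard 0/1 decisions.
    recovered = [1 if score >= threshold else 0 for score in decoded]
    matches = sum(1 for e, r in zip(embedded, recovered) if e == r)
    return matches / len(embedded)


# Hypothetical 8-bit fingerprint and decoder scores after a distortion.
fingerprint = [1, 0, 1, 1, 0, 0, 1, 0]
decoder_output = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]
print(bitwise_accuracy(fingerprint, decoder_output))  # 0.75 (6 of 8 bits match)
```

A value near 0.5 means the decoder does no better than guessing on a random binary fingerprint, which is why robustness results are usually read against that baseline.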

Keywords

Deepfakes; artificial fingerprinting; digital watermarking; steganography; generative adversarial networks; Gaussian blur; JPEG compression

Parallel Abstract

Artificial Fingerprinting (AF), also known as digital watermarking, is a technique for Deepfake attribution through media authentication. However, AF does not prioritize robustness to certain kinds of distortion, which leaves the embedded watermarks vulnerable to commonly applied image processing operations. Insufficient robustness reduces the practicality of digital watermarking techniques. In this work, we extend an existing distortion agnostic watermarking method to enhance the robustness of AF; our method comprises an attacker and an encoder-decoder pipeline. The attacker is a convolutional neural network (CNN) that distorts the images, while the encoder-decoder pipeline embeds watermarks into images and decodes the watermarks from the attacked ones. We observed that simply using a CNN as the distortion agnostic attacker does not improve the bitwise accuracy of the retrieved watermark when some common image processing operations, such as Center Cropping and Horizontal Flipping, are applied. Furthermore, the CNN attacker is less effective at improving the robustness of watermarking under Gaussian Blur distortion. To address this shortcoming, we designed an attack booster that applies a set of differentiable image distortions before the CNN attacker to enhance the robustness of the whole fingerprinting system. Experimental results show that the proposed approach improves the quality of the extracted fingerprints even when the aforementioned image processing operations are applied. Quantitatively, our method improves the bitwise accuracy by up to 36%, taking another step forward in Deepfake attribution.
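The data flow described above, in which an attack booster applies a fixed distortion before the learned CNN attacker, can be sketched roughly as follows. This is a minimal illustration in plain Python on a small grayscale image, not the thesis implementation: the names `attack_booster` and `attack_pipeline`, the 3x3 blur kernel, the choice of one random distortion per call, and the identity stand-in for the CNN attacker are all assumptions. A real training setup would express these operations in an automatic differentiation framework so that gradients can reach the encoder.

```python
import random
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities
Distortion = Callable[[Image], Image]


def horizontal_flip(img: Image) -> Image:
    """Mirror every row from left to right."""
    return [row[::-1] for row in img]


def gaussian_blur_3x3(img: Image) -> Image:
    """Blur with a 3x3 binomial kernel, replicating pixels at the border."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the image
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def attack_booster(img: Image, distortions: Sequence[Distortion],
                   rng: random.Random) -> Image:
    """Apply one randomly chosen fixed distortion (an assumed policy)."""
    return rng.choice(distortions)(img)


def attack_pipeline(img: Image, cnn_attacker: Distortion,
                    distortions: Sequence[Distortion],
                    rng: random.Random) -> Image:
    """Run the booster first, then hand the result to the learned attacker."""
    return cnn_attacker(attack_booster(img, distortions, rng))


# Hypothetical usage: an identity function stands in for the CNN attacker.
image = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
attacked = attack_pipeline(image, lambda x: x,
                           [horizontal_flip, gaussian_blur_3x3],
                           random.Random(0))
```

Placing the fixed distortions ahead of the learned attacker means the decoder sees them during training regardless of what the attacker learns, which matches the motivation given in the abstract.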

Parallel Keywords

Deepfakes; Artificial Fingerprinting; Watermarking; Steganography; Generative Adversarial Networks; Gaussian Blur; JPEG Compression

