
xCos: An Explainable Cosine Metric for Face Verification Task

Advisor: 徐宏民

Abstract




We study XAI (explainable AI) for the face recognition task, particularly face verification. Face verification has become a crucial task in recent years and has been deployed in plenty of applications, such as access control, surveillance, and automatic personal log-on for mobile devices. With the increasing amount of data, deep convolutional neural networks can achieve very high accuracy on the face verification task. Beyond exceptional performance, however, deep face verification models need more interpretability so that we can trust the results they generate. In this paper, we propose a novel similarity metric, called explainable cosine (xCos), that comes with a learnable module which can be plugged into most verification models to provide meaningful explanations. With the help of xCos, we can see which parts of the two input faces are similar, where the model pays attention, and how the local similarities are weighted to form the output xCos score. We demonstrate the effectiveness of the proposed method on LFW and various competitive benchmarks: it not only provides novel and desirable model interpretability for face verification but also preserves accuracy when plugged into existing face recognition models.
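The core idea described above, local similarities between the two face feature maps combined through an attention weighting, can be sketched as follows. This is a minimal NumPy illustration only; the function name, the feature-map shapes, and the `1e-8` stabilizer are assumptions for the sketch, not the thesis's exact formulation:

```python
import numpy as np

def xcos_score(feat_a, feat_b, attention):
    """Sketch of the xCos idea: patch-wise cosine similarities
    weighted by an attention map to produce one scalar score.

    feat_a, feat_b: (H, W, C) convolutional feature maps of the two faces
    attention:      (H, W) non-negative weights that sum to 1
    Returns (score, local_cos) so the (H, W) similarity map can be
    visualized alongside the attention map for explanation.
    """
    # Cosine similarity between the C-dim vectors at each spatial position
    num = (feat_a * feat_b).sum(axis=-1)
    denom = (np.linalg.norm(feat_a, axis=-1)
             * np.linalg.norm(feat_b, axis=-1) + 1e-8)  # avoid divide-by-zero
    local_cos = num / denom            # (H, W) map of local similarities
    # Attention-weighted sum of local similarities gives the xCos score
    score = float((attention * local_cos).sum())
    return score, local_cos
```

Inspecting `local_cos` shows which facial regions the model finds similar, while `attention` shows where the model focuses; their weighted sum is the single verification score.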

