  • Degree thesis

Improving ResNet-based Feature Extractor for Face Recognition via Re-ranking and Approximate Nearest Neighbor Search

Advisor: 張智星

Abstract


The deep residual network (ResNet) is one of the state-of-the-art architectures for image classification and object detection, and it can serve as a robust feature extractor when its last layer is removed. Once faces are embedded as feature vectors, tasks such as face verification and face identification can be implemented with a distance measure. This thesis proposes a face recognition framework built on a ResNet-based feature extractor, together with additional steps that improve its performance: face detection, face alignment, face verification/identification, and re-ranking via Approximate Nearest Neighbor Search (ANNS). First, we evaluate two face detection algorithms, MTCNN and FaceBoxes, on three common face detection benchmarks and summarize the best usage scenario for each. Second, with certain preprocessing and postprocessing, our system selects a ResNet-based feature extractor that achieves 99.33% verification accuracy on the LFW benchmark. Third, we use the penalty curve to determine the best configuration and obtain improved face verification results. Lastly, with the proposed re-ranking policy, our method improves rank-1 accuracy both on datasets with large inter-class variation (by 1.47% on CASIA-faceV5 and 2.28% on CASIA-WebFace) and on datasets with large intra-class variation (by 1.3% on FG-NET and 2.43% on CACD).
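
The abstract notes that, once faces are embedded as feature vectors, verification and identification reduce to distance measurements. A minimal sketch of that idea follows. It assumes cosine similarity as the measure, and it uses toy 3-dimensional vectors, a threshold of 0.5, and the function names `verify` and `identify`; all of these are illustrative choices and are not taken from the thesis. The sketch does not show the thesis's re-ranking policy or its ANNS step.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def verify(a, b, threshold=0.5):
    """Face verification: decide whether two embeddings share an identity.

    The threshold value is hypothetical; in practice it is tuned on
    held-out pairs.
    """
    return cosine_similarity(a, b) >= threshold


def identify(probe, gallery):
    """Face identification: rank gallery labels by similarity, best first.

    The first element of the returned list is the rank-1 prediction.
    """
    scored = [(cosine_similarity(probe, vec), label)
              for label, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored]


# Toy embeddings standing in for the output of a feature extractor.
gallery = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.1, 0.9, 0.2],
    "carol": [0.0, 0.2, 0.9],
}
probe = [0.8, 0.2, 0.1]

ranking = identify(probe, gallery)
same_as_alice = verify(probe, gallery["alice"])
same_as_bob = verify(probe, gallery["bob"])
```

Rank-1 accuracy, the metric quoted in the abstract, is the fraction of probes whose top-ranked gallery label matches the true identity. This exhaustive scan compares the probe against every gallery vector; an approximate nearest neighbor index would replace that scan when the gallery is large.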

