Facial Landmark Detection via Face Region Rectification and Cascaded Deep Networks

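Facial landmark detection has been widely studied in recent years and achieves good results in controlled environments. Images captured under real-world conditions, however, are easily affected by external factors (such as illumination changes, partial occlusion, insufficient resolution, expression changes, or pose deviation), and detection accuracy then drops sharply. Related methods often need to predict the location of the face region before detecting landmarks, so they also depend on the accuracy of face detection. The goal of this work is to detect landmarks under such environmental factors without being affected by the accuracy of face detection. We therefore propose a two-stage deep network that performs progressive facial landmark detection: the first stage obtains rough landmark locations, and the second stage refines them based on local patches. We also add multi-task learning to each level of the network so that it takes in more facial information, which helps landmark detection accuracy. To handle inaccurate face detection, we further propose a convolutional neural network that rectifies the face region. Experiments show that our method achieves better results on the AFLW and AFW datasets while using fewer models.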

Keywords

facial landmark detection; deep learning; convolutional neural network; cascade

Parallel abstract

Facial landmark detection has been widely studied in recent years and achieves good performance in controlled environments. However, performance decreases significantly when face images are taken in the wild (e.g., with varying illumination, occlusion, low resolution, and different expressions and poses). Moreover, many methods need to determine the face region before landmark detection, so their performance depends on the accuracy of the face detector. The purpose of this work is to handle these environmental variations and to maintain detection accuracy even with an unstable face detector. To this end, we propose a two-level deep network that implements coarse-to-fine estimation: the first level predicts rough landmark locations, and the second level refines the results locally. We also incorporate multi-task learning into each level to exploit more information from the face. Furthermore, we propose a CNN model that rectifies inaccurate face regions. Experimental results show that our approach obtains more accurate results on the AFLW and AFW datasets while using fewer models.
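The coarse-to-fine flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three models are passed in as plain callables (`rectifier`, `coarse_net`, `refine_net`), and their signatures, the relative box-correction encoding, the grayscale input, and the patch size are all assumptions made for illustration. Multi-task learning is not shown.

```python
import numpy as np

def rectify_face_box(box, delta):
    """Apply an assumed relative correction (dx, dy, dw, dh) to a box (x, y, w, h)."""
    x, y, w, h = box
    dx, dy, dw, dh = delta
    return np.array([x + dx * w, y + dy * h, w * (1 + dw), h * (1 + dh)])

def to_image_coords(norm_pts, box):
    """Map landmarks normalised to [0, 1] inside the box to image pixel coordinates."""
    x, y, w, h = box
    return np.asarray(norm_pts) * np.array([w, h]) + np.array([x, y])

def crop_patch(image, center, size):
    """Crop a size x size patch (size odd) centred on (x, y), zero-padded at borders."""
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    # Clamp the centre to the image so the slice below is always size x size.
    cx = int(np.clip(np.rint(center[0]), 0, image.shape[1] - 1))
    cy = int(np.clip(np.rint(center[1]), 0, image.shape[0] - 1))
    return padded[cy:cy + size, cx:cx + size]

def detect_landmarks(image, box, rectifier, coarse_net, refine_net, patch_size=15):
    """Hypothetical pipeline: rectify the face box, predict coarse points, refine locally."""
    # Step 0: correct the (possibly inaccurate) face detector output.
    box = rectify_face_box(box, rectifier(image, box))
    # Level 1: rough landmark locations predicted from the whole face region.
    coarse = to_image_coords(coarse_net(image, box), box)
    # Level 2: each landmark is adjusted by an offset predicted from its local patch.
    refined = np.array(
        [p + refine_net(crop_patch(image, p, patch_size)) for p in coarse]
    )
    return box, coarse, refined
```

Any trained models with matching inputs and outputs could be plugged in for the three callables; passing them as arguments keeps the control flow separate from the network definitions.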

Parallel keywords

facial landmark detection; deep learning; convolutional neural network; cascade

