  • Thesis

Age Estimation via Modern Convolutional Neural Networks

Advisor: 劉冠顯

Abstract


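Estimating age from faces has many applications. For example, advertising walls in department stores or hypermarkets can display advertisements matched to the age of the customers present in order to attract purchases, and in public places such as parks, an age estimation system can detect unaccompanied children or elderly people who need help so that assistance can be provided in time. This study uses convolutional neural networks to estimate age from faces in photographs. First, the face and the positions of both eyes are detected in the photograph, the face region is cropped, and the face is precisely aligned according to the eye positions. The cropped photograph is then fed into a convolutional neural network to extract age features, and the age of the face is finally estimated from the network's output values. This study uses Xception's depthwise separable convolution to reduce computational complexity and multi-path convolution to strengthen feature extraction; label distribution age encoding together with a KLD (Kullback-Leibler divergence) loss function to strengthen the network's learning ability; dropout to avoid overfitting; data augmentation to increase the amount and diversity of the training data; and three ways of computing the age estimate: highest probability, expected value, and KLD similarity. This thesis uses the IMDB-WIKI dataset as training data and APPA-REAL, FG-NET, MORPH-II, and LAP-2016 as target datasets. The mean absolute error (MAE) on the APPA-REAL test set, the FG-NET dataset, and the MORPH-II dataset is 3.385, 2.78, and 2.88, respectively, and the ϵ-error on the LAP-2016 test set is 0.2589. This study achieves the best result on the APPA-REAL test set.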
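
As a rough illustration of the label distribution encoding, the KLD loss, and the three estimation rules mentioned in the abstract, the following Python sketch may help. The age range (0 to 100), the Gaussian shape and width (sigma = 2.0) of the label distribution, and all function names are assumptions made for illustration; the abstract does not specify them. Reading "KLD similarity" as choosing the age whose encoded distribution is closest in KL divergence to the network output is also an interpretation, not the thesis's confirmed method.

```python
import math

# Assumed age bins 0..100; the abstract does not state the range.
AGES = list(range(0, 101))


def encode_age(age, sigma=2.0):
    """Encode a scalar age as a discrete Gaussian label distribution over AGES.

    The Gaussian shape and sigma are illustrative assumptions.
    """
    weights = [math.exp(-((a - age) ** 2) / (2.0 * sigma ** 2)) for a in AGES]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def decode_argmax(pred):
    """Highest probability: the age bin with the largest predicted mass."""
    best_index = max(range(len(pred)), key=lambda i: pred[i])
    return AGES[best_index]


def decode_expectation(pred):
    """Expected value: probability-weighted mean of the age bins."""
    return sum(a * p for a, p in zip(AGES, pred))


def decode_kld(pred, sigma=2.0):
    """KLD similarity (assumed reading): the age whose encoded label
    distribution has the smallest KL divergence to the prediction."""
    return min(AGES, key=lambda a: kl_divergence(encode_age(a, sigma), pred))
```

During training, `kl_divergence(encode_age(true_age), network_output)` would play the role of the loss under these assumptions. For a symmetric, single-peaked prediction the three decoding rules tend to agree; they can differ when the predicted distribution is skewed or has several peaks.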

Parallel Abstract


Age estimation has many applications. For example, department stores could show age-targeted advertisements to passing customers to stimulate purchases, and surveillance systems in public areas could identify unattended children or elderly people and provide assistance. This study uses convolutional neural networks for age estimation. First, the system detects faces in the image and locates the eyes, then crops the face region and aligns the face. The cropped image is then fed into a convolutional neural network, which extracts features and estimates the age of the face. The study uses Xception's depthwise separable convolution to reduce computational complexity and multi-path convolution to strengthen feature extraction; label distribution age encoding with a Kullback-Leibler divergence (KLD) loss function to improve the network's learning ability; and the highest probability, the expected value, and KLD similarity to compute the final age estimate. The IMDB-WIKI dataset is used for pre-training, and APPA-REAL (training and validation sets) is used for fine-tuning. APPA-REAL, FG-NET, and MORPH-II, which have become new benchmarks for age estimation, are the target datasets. We achieve state-of-the-art results, with mean absolute errors (MAE) of 3.385, 2.78, and 2.88 years on the APPA-REAL, FG-NET, and MORPH-II test sets, respectively, and an ϵ-error of 0.2589 on the LAP-2016 test set.
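
For readers unfamiliar with the two reported metrics, here is a minimal Python sketch of how they could be computed. The abstract does not define the ϵ-error; the sketch assumes the ChaLearn LAP challenge definition, 1 - exp(-(x - mu)^2 / (2 sigma^2)) averaged over test images, where mu and sigma are the mean and standard deviation of the annotators' apparent-age votes for each image. The function and parameter names are hypothetical.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average of |predicted age - ground-truth age| over all samples."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def epsilon_error(predicted, mean_ages, std_ages):
    """Mean epsilon-error, assuming the ChaLearn LAP definition.

    Each image contributes 1 - exp(-(x - mu)^2 / (2 * sigma^2)), so a
    prediction equal to the annotators' mean scores 0 and the score
    approaches 1 as the prediction moves away from it.
    """
    if not predicted or not (len(predicted) == len(mean_ages) == len(std_ages)):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [
        1.0 - math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for x, mu, sigma in zip(predicted, mean_ages, std_ages)
    ]
    return sum(errors) / len(errors)
```

Under this definition, images on which annotators disagree more (larger sigma) penalize a given absolute error less, which is why the ϵ-error is reported separately from MAE.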

