  • Thesis


Accurate and robust face recognition from RGB-D images with a deep learning approach

Advisor: 賴尚宏


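In face recognition, depth images and color images are two complementary kinds of visual data that capture different aspects of a face, and using them together yields more accurate recognition. In this thesis, we propose a deep-learning-based face recognition system that performs face verification and identification on color and depth images captured by a consumer-grade RGB-D camera.  To draw on both color and depth information for recognition, our system has three main parts: depth image recovery; deep learning, used for feature extraction; and joint classification, which integrates the color and depth information.  So that depth images achieve recognition performance close to that of color images, we propose a series of methods based on image processing and computer graphics that use multiple consecutive depth frames to recover and enhance a depth image and then reconstruct a high-quality face model. With multi-view resampling, we can simulate the depth images that would be captured from various angles of a single face model. To mitigate the risk that limited RGB-D data poses for deep learning, we introduce transfer learning. Our deep network incorporates components and architectures that have become popular in recent years; it is first trained on color and grayscale face images and then fine-tuned on depth face images. The deep network is mainly used to extract discriminative deep features from color and depth face images. Beyond these features, we also compute statistics on the relation between each image and the other images in the database to design the final classifier, achieving higher accuracy and better robustness.  In our experiments, we show that this approach achieves highly accurate recognition rates on public datasets and is highly tolerant of head rotation and illumination changes.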


Face recognition from RGB-D images uses two complementary types of image data, color and depth images, to achieve more accurate recognition. In this thesis, we propose a face recognition system based on deep learning that can verify and identify a subject from the color and depth face images captured with a consumer-level RGB-D camera (e.g., Microsoft Kinect). To recognize faces from color and depth information, our system contains three parts: depth image recovery, deep learning for feature extraction, and joint classification. To improve recognition performance on depth face images, we propose a series of image processing techniques that recover and enhance a depth image from its neighboring depth frames, thus reconstructing a precise 3D facial model. With multi-view resampling, we can compute the depth images corresponding to various viewing angles of a single 3D face model. To alleviate the problem of the limited amount of RGB-D data available for deep learning, we apply transfer learning. Our deep network architecture contains recently popular components. We first train the deep network on a color face dataset and then fine-tune it on depth images. The deep networks are used to extract discriminative features (deep representations) from color and depth images. In addition to these deep representations, we analyze the relation between each image and the other images in the database to design our classifier, which yields higher recognition accuracy and better robustness. Our experiments show that the proposed system provides highly accurate face recognition results on public datasets and is robust to variations in head pose and illumination.
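
As a rough illustration of how color and depth information might be combined at the classification stage, the following minimal sketch fuses per-modality cosine similarities with a weighted sum and returns the best-matching gallery subject. It is not the classifier designed in this thesis, which also analyzes the relation between each image and the other images in the database. The function names, the `color_weight` parameter, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(probe, gallery, color_weight=0.5):
    """Return (subject, fused_score) for the best gallery match.

    probe:        (color_feature, depth_feature) of the query face
    gallery:      dict mapping subject -> (color_feature, depth_feature)
    color_weight: hypothetical fusion weight; depth gets 1 - color_weight
    """
    best = None
    for subject, (g_color, g_depth) in gallery.items():
        # Weighted sum of the two per-modality similarities.
        score = (color_weight * cosine(probe[0], g_color)
                 + (1.0 - color_weight) * cosine(probe[1], g_depth))
        if best is None or score > best[1]:
            best = (subject, score)
    return best


# Toy 2-D features standing in for deep representations (illustrative only).
gallery = {
    "subject_a": ([1.0, 0.0], [0.0, 1.0]),
    "subject_b": ([0.0, 1.0], [1.0, 0.0]),
}
probe = ([0.9, 0.1], [0.2, 0.8])
match, score = identify(probe, gallery)
```

In this sketch, verification could be handled by thresholding the fused score for a claimed identity. Both the weight and any such threshold would need to be tuned on held-out data.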


