Facial action unit detection, which aims to detect facial muscle activities from face images, is an important task for enabling emotion recognition from facial movements. By coding facial muscle activities into a system of facial Action Units (AUs), facial expressions can be described precisely. However, predicting AUs from fine-grained facial appearances remains challenging, and one of the difficulties lies in handling the varied appearances of different subjects. In this thesis, we address this problem by introducing an auxiliary neutral face image to produce person-specific transformations for each subject. With the help of neutral faces, our method extracts effective features of facial muscle activities despite divergent individual appearances. We propose to combine an additional face clustering task with the AU detection task to form multi-task network cascades and to train the cascades jointly. First, to train the face clustering networks that produce the person-specific transformations, we utilize identity-annotated datasets containing numerous subjects, which alleviates the common problem that existing AU-annotated datasets cover only a few subjects. Second, we transform the facial features with the person-specific transformations to reduce individual differences before predicting AU labels. As a result, the proposed network cascades exploit not only visual but also identity information and thus detect AUs more effectively through personalized appearance normalization. Experimental results on the BP4D dataset show that our method outperforms state-of-the-art methods, and experiments under cross-dataset and cross-group scenarios further demonstrate the robustness of our approach.
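The pipeline described above — extracting features, deriving a person-specific transformation from a neutral face, normalizing the target features, and predicting per-AU probabilities — can be illustrated with a minimal sketch. This is not the thesis architecture: the feature extractor, the affine form of the transformation, and the linear AU head are all hypothetical stand-ins for the learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(image):
    # Hypothetical feature extractor: a fixed linear projection of the
    # flattened image, standing in for a CNN backbone.
    flat = image.reshape(-1)
    W = np.linspace(-1.0, 1.0, flat.size * 16).reshape(16, flat.size)
    return np.tanh(W @ flat)

def person_specific_transform(neutral_feat):
    # Hypothetical person-specific transformation derived from the
    # neutral-face feature: an affine map that shifts away the neutral
    # appearance and rescales each dimension.
    scale = 1.0 / (1.0 + np.abs(neutral_feat))
    shift = -neutral_feat
    return lambda feat: scale * (feat + shift)

def detect_aus(image, neutral_image, n_aus=12):
    # Normalize the target-face features with the subject's transform,
    # then apply a (hypothetical) linear AU head with sigmoid outputs.
    feat = extract_features(image)
    transform = person_specific_transform(extract_features(neutral_image))
    normalized = transform(feat)
    W_au = np.linspace(-0.5, 0.5, n_aus * normalized.size).reshape(n_aus, -1)
    logits = W_au @ normalized
    return 1.0 / (1.0 + np.exp(-logits))

face = rng.random((8, 8))
neutral = rng.random((8, 8))
probs = detect_aus(face, neutral)  # one probability per AU
```

Note that feeding the neutral face as its own reference zeroes the normalized features, so every AU probability collapses to 0.5 in this toy version — the normalization removes exactly the subject-specific appearance, which is the intuition behind using neutral faces as a reference.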