Facial Emotion Recognition Based on Machine Learning (應用機器學習的人臉情緒辨識)

Advisor: 丁肇隆
The full text will be available for download on 2030/01/14.

Abstract


This study addresses facial expression recognition (FER), targeting three weaknesses of current methods: overfitting, insufficient data quality, and limited generalization. FER is valuable in artificial intelligence and human-computer interaction, but inconsistent annotation, imbalanced expression classes, and weak performance in complex scenes still limit its accuracy and robustness. Using the RAF-DB dataset as a basis, the study applies several rounds of filtering and processing to address annotation errors and low-quality samples. It combines multi-size input, a feature fusion architecture, and a stage-wise training strategy, with AlexNet as the core model, and integrates YOLO detection to localize the face region more precisely, thereby improving expression recognition.

In the experiments, the filtered dataset significantly improved the model's ability to recognize subtle expression changes, confirming that data quality is critical to model performance. The multi-size input strategy shows that image resolution has a non-negligible effect on learning. The stage-wise training strategy effectively optimized the learning of the convolutional and fully connected layers, and the feature fusion architecture, by integrating local and global features, further improved recognition accuracy and robustness.

The study offers a practical approach to the data quality problem in FER and advances expression recognition through its model design and training strategies. The results are relevant to applications in education, healthcare, and intelligent customer service, and provide a reference for the design and application of future FER systems.
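
As a rough illustration of how a detected face box could feed a multi-size input pipeline, the sketch below crops a face region and resizes it to several resolutions. It is not the thesis's implementation: the `(x1, y1, x2, y2)` pixel box format, the margin, the target sizes, and all function names are assumptions made for this example, and a nearest-neighbour resize on nested lists stands in for a real detector and image library.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values (illustrative)
Box = Tuple[int, int, int, int]  # assumed detector output: (x1, y1, x2, y2) in pixels


def expand_box(box: Box, width: int, height: int, margin: float = 0.1) -> Box:
    """Grow the box by `margin` of its size on each side, clamped to the image."""
    x1, y1, x2, y2 = box
    dx = int((x2 - x1) * margin)
    dy = int((y2 - y1) * margin)
    return (max(0, x1 - dx), max(0, y1 - dy),
            min(width, x2 + dx), min(height, y2 + dy))


def crop(image: Image, box: Box) -> Image:
    """Cut the box out of the image (x2 and y2 are exclusive)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def resize_nearest(image: Image, size: int) -> Image:
    """Nearest-neighbour resize of a non-empty image to size x size."""
    src_h, src_w = len(image), len(image[0])
    return [[image[r * src_h // size][c * src_w // size] for c in range(size)]
            for r in range(size)]


def multi_size_inputs(image: Image, box: Box,
                      sizes: Sequence[int] = (64, 128, 224),  # hypothetical sizes
                      margin: float = 0.1) -> Dict[int, Image]:
    """Return one square face crop per requested input size."""
    height, width = len(image), len(image[0])
    face = crop(image, expand_box(box, width, height, margin))
    return {size: resize_nearest(face, size) for size in sizes}
```

Each resized crop could then be passed to a classifier configured for that input size, which is one way to compare how resolution affects learning.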

