

Instance Appraisable Deep Learning Model for Sequence-level Pain Intensity Estimation via Facial Videos

Advisor: Li-Chen Fu (傅立成)


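In frequently overcrowded hospital emergency departments, time is one of the most important resources. To make effective use of the golden time, the triage process that determines each patient's urgency becomes a key step. Speeding up triage while keeping judgments accurate and objective has long been a difficult problem. In this research we therefore focus on tasks related to electronic triage, in the hope that a well-developed automated system can ultimately improve the efficiency with which medical resources are used.

In Taiwan's emergency triage process, the pain level is one of the important indicators, and in most cases people reflect pain in their facial expressions, so we turn to computer vision methods and build a deep learning model based on facial videos. Under the current medical system, the pain level is self-assessed by the patient using a visual analog scale. The resulting value is highly subjective, and it is a sequence-level label. For video analysis tasks, this means the annotation is inexact compared with frame-by-frame labels. More specifically, a patient's expression does not stay constant during triage. For patients with intermittent pain, for example, finding the segments of the triage video that actually contain expressions of pain is critical. In our experiments, we propose a deep learning model with an instance appraisal mechanism. We refer to short video clips as instances and feed them to the model. Through multiple-instance learning, the model is trained to produce appraisal scores, which are used to locate likely key frames and ultimately improve both model performance and interpretability.

In summary, with the goal of building an AI-assisted system that can be used in real clinical settings, we design a pain level estimation system that supports online prediction.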


In hospital emergency departments (EDs), which are often overcrowded, time is one of the most valuable resources. To use the golden time effectively, the triage process that estimates each patient's urgency becomes crucial. Accelerating triage while maintaining accurate and objective judgment has always been a challenge. In this research, we therefore focus on tasks related to an automatic triage system, in the hope that such a system can improve the efficiency with which medical resources are used.

Because pain level is one of the major indicators in the Taiwan triage process, and people usually reflect pain in their facial expressions, we build a deep learning model that works on facial videos using computer vision methods. In the current medical system, the commonly used pain metric is the Visual Analog Scale (VAS), which is typically obtained through patient self-report. VAS is therefore a subjective, sequence-level metric. Compared with frame-level labels, sequence-level (video-level) annotations are inexact, because a patient's facial expression may change dramatically during triage. For patients suffering from intermittent pain in particular, recognizing when painful expressions occur and how long they last is essential for pain intensity estimation. In this thesis, short video clips are treated as instances and fed to the model. Through our proposed multiple-instance learning approaches, the model learns to appraise the value of each instance. Using the resulting instance scores, we improve both the performance and the interpretability of our pain level assessment model.

To sum up, this thesis pursues the goal of deploying our system in real clinical settings, and to that end it provides an online inference framework for pain level estimation.
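The instance idea described above can be illustrated with a minimal sketch. It is not the thesis's model: the abstract does not say how clips are cut or how instance scores are combined, so the function names (`split_into_instances`, `aggregate`), the sliding-window parameters, and the softmax weighting are all assumptions made for illustration. The per-clip predictions and appraisal scores, which a trained network would produce, are passed in as plain numbers.

```python
import math
from typing import List, Sequence, Tuple


def split_into_instances(frames: Sequence, clip_len: int, stride: int) -> List[Sequence]:
    """Cut a frame sequence into short, possibly overlapping clips (instances).

    clip_len and stride are hypothetical parameters, not values from the thesis.
    A sequence shorter than clip_len yields a single, shorter clip.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    last_start = max(len(frames) - clip_len, 0)
    return [frames[s:s + clip_len] for s in range(0, last_start + 1, stride)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(instance_preds: Sequence[float],
              appraisal_scores: Sequence[float]) -> Tuple[float, List[float]]:
    """Combine per-instance pain predictions into one sequence-level estimate.

    Assumption: appraisal scores are turned into weights with a softmax, so
    clips appraised as more informative contribute more to the final estimate.
    The weights are returned as well, since they indicate which clips mattered.
    """
    if len(instance_preds) != len(appraisal_scores):
        raise ValueError("one appraisal score is required per instance")
    weights = softmax(appraisal_scores)
    sequence_pred = sum(w * p for w, p in zip(weights, instance_preds))
    return sequence_pred, weights


if __name__ == "__main__":
    clips = split_into_instances(list(range(10)), clip_len=4, stride=2)
    print(len(clips), "clips")
    # Placeholder numbers standing in for model outputs on each clip.
    pred, weights = aggregate([1.0, 7.0, 2.0, 1.5], [0.1, 2.0, 0.3, 0.0])
    print(round(pred, 2), [round(w, 2) for w in weights])
```

Under this weighting, a video in which only one clip shows a painful expression can still receive a high sequence-level estimate, provided that clip receives a high appraisal score. The same weights offer a simple way to point at candidate key segments.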


