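Sentiment analysis is a popular research topic in natural language processing (NLP). Most studies focus on analyzing the emotions themselves, yet the causes behind those emotions also deserve in-depth analysis. Emotion-cause pair extraction aims to extract, from a document, the clauses that describe an emotion and the clauses that describe its cause, and to pair the two into emotion-cause pairs. Such pairs make emotion analysis more comprehensive and are valuable in applications such as user experience and counseling psychology. In this study, we explore the use of a text comprehension model for this task. Most previous papers treat emotion-cause pair extraction as a clause classification task. Our main idea is to convert the clause classification task into a question answering task in order to understand the document more deeply, and we adopt a two-stage approach: extraction of emotion and cause clauses, followed by filtering of the candidate pairs. In addition, we propose a data augmentation strategy to mitigate the poor accuracy caused by insufficient training data. In validation experiments on a standard dataset, our method achieves the best F1 score among related studies to date.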
Sentiment analysis has been a popular topic in Natural Language Processing (NLP), and most studies focus on analyzing emotions. However, beyond the emotion itself, the reasons that cause it also deserve attention. Emotion-cause pair extraction (ECPE) aims to extract the emotion clauses in an excerpt and pair them with their corresponding cause clauses, helping people understand human sentiment. The extracted causes are also valuable for many applications, such as user experience and counseling psychology. In this thesis, we study the ECPE problem by employing a machine reading comprehension model to extract emotions and causes from an excerpt. In the past, ECPE was modeled as a clause-level classification task. In contrast, our idea is to transform the clause-level classification task into a question answering task for a machine reading comprehension model. Based on this idea, we propose a two-stage approach with a specific data augmentation strategy. The first stage leverages a machine reading comprehension model to extract emotion and cause clauses from a given excerpt, and the second stage pairs the extracted emotion clauses and cause clauses to produce the final emotion-cause pairs. Evaluation on a benchmark dataset demonstrates the effectiveness of the proposed approach: we improve the state-of-the-art F1 score from 61% to 65%.
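To make the two-stage formulation concrete, the following is a minimal sketch in Python, not the implementation used in the thesis. It assumes clauses are indexed from 1, and the question wording, the `keep` filter, and the function names `build_qa_examples`, `pair_candidates`, and `pair_f1` are all hypothetical. The reading comprehension model itself is omitted; the sketch only shows how a clause list could be recast as question answering inputs, how extracted clauses could be paired and filtered, and how pair-level F1 is computed.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Set, Tuple

# (emotion clause index, cause clause index), 1-based by assumption
Pair = Tuple[int, int]


def build_qa_examples(clauses: List[str]) -> List[Dict[str, str]]:
    """Recast clause-level classification as question answering inputs.

    The question wording is a hypothetical placeholder, not the thesis's template.
    """
    context = " ".join(f"[{i}] {c}" for i, c in enumerate(clauses, start=1))
    questions = {
        "emotion": "Which clauses express an emotion?",
        "cause": "Which clauses describe the cause of an emotion?",
    }
    return [
        {"role": role, "question": question, "context": context}
        for role, question in questions.items()
    ]


def pair_candidates(
    emotions: Iterable[int],
    causes: Iterable[int],
    keep: Callable[[int, int], bool],
) -> Set[Pair]:
    """Stage two: form every emotion-cause candidate and keep those passing the filter."""
    return {(e, c) for e, c in product(emotions, causes) if keep(e, c)}


def pair_f1(predicted: Set[Pair], gold: Set[Pair]) -> float:
    """Pair-level F1: harmonic mean of precision and recall over exact pair matches."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy usage with made-up clauses; stage-one outputs are written by hand here
# because the reading comprehension model is outside the scope of this sketch.
clauses = ["The flight was cancelled", "nobody told us why", "so I was furious"]
qa_inputs = build_qa_examples(clauses)
emotion_ids, cause_ids = [3], [1, 2]

# Hypothetical filter: keep pairs whose clauses are at most one position apart.
pairs = pair_candidates(emotion_ids, cause_ids, keep=lambda e, c: abs(e - c) <= 1)
score = pair_f1(pairs, gold={(3, 2)})
```

In practice the `keep` filter would be a learned classifier rather than a distance rule; the distance rule is used here only so that the example runs without a trained model.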