
領域相關詞彙極性分析及文件情緒分類之研究

Domain Dependent Word Polarity Analysis for Sentiment Classification

Abstract


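Sentiment analysis is a popular research field that has developed rapidly in recent years. It aims to explore authors' opinion tendencies and emotional states through text analysis techniques, and methods based on sentiment words and sentiment dictionaries are especially well known. However, the sentiment orientation and behavior of sentiment words are not necessarily the same across texts from different domains. This study focuses on the behavior of sentiment words across domains by analyzing texts from three domains: real estate, hotels, and restaurants. It finds that the sentiment orientations of some sentiment words not only differ across domains but even conflict with one another. In addition, some "non-sentiment words" that are not included in sentiment dictionaries may become "domain-dependent" words in a particular domain and affect sentiment classification. This study then proposes several word weighting schemes and incorporates this information into an existing sentiment classification system. The real estate, hotel, and restaurant corpora are classified using LIBSVM with a linear kernel under 5-fold cross-validation. Experimental results show that the proposed TF-S-S-IDF classification method combines TF-IDF, the National Taiwan University Sentiment Dictionary, and the domain-polarity sentiment orientation (SO) computed from the corpus to strengthen the weights of both domain-dependent and domain-independent sentiment words. According to t-tests, it effectively improves document classification performance in each domain.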

Parallel Abstract


Research in sentiment analysis aims to explore the emotional state of writers. The analysis depends heavily on the application domain: analyzing the sentiment of articles in different domains may yield different results. In this study, we focus on corpora in Traditional and Simplified Chinese from three domains (real estate, hotels, and restaurants), examine the polarity degrees of words in these domains, and propose methods to capture the sentiment differences. Finally, we apply the results to sentiment classification with LIBSVM (linear kernel). The experiments show that the proposed method, TF-S-S-IDF, which integrates TF-IDF, the NTU Sentiment Dictionary, and the sentiment orientation degree of words in each specific domain, can effectively improve sentiment classification performance.
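
The abstract does not give the TF-S-S-IDF formula, so the following is only a minimal sketch of the general idea: scale a TF-IDF weight by a word's domain-specific sentiment orientation (SO) and by its membership in a sentiment dictionary. The function names, the log-odds definition of SO, the `boost` factor, and the toy hotel corpus are all assumptions made for illustration and are not taken from the paper. The resulting sparse vectors could then be passed to a linear-kernel SVM such as LIBSVM, which is not shown here.

```python
import math
from collections import Counter

def sentiment_orientation(docs, labels, smoothing=1.0):
    """Hypothetical SO: smoothed log-odds of a word occurring in
    positive versus negative documents of one domain corpus."""
    pos, neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (pos if label > 0 else neg).update(set(tokens))
    n_pos = sum(1 for label in labels if label > 0)
    n_neg = len(labels) - n_pos
    so = {}
    for word in set(pos) | set(neg):
        p_pos = (pos[word] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg[word] + smoothing) / (n_neg + 2 * smoothing)
        so[word] = math.log(p_pos) - math.log(p_neg)
    return so

def tfidf_so_vectors(docs, so, dictionary=frozenset(), boost=2.0):
    """Hypothetical weighting: TF * smoothed IDF * (1 + |SO|),
    with an extra boost for words found in a sentiment dictionary."""
    n_docs = len(docs)
    df = Counter(word for tokens in docs for word in set(tokens))
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vec = {}
        for word, count in tf.items():
            idf = math.log((1 + n_docs) / (1 + df[word])) + 1.0
            scale = 1.0 + abs(so.get(word, 0.0))
            if word in dictionary:
                scale *= boost
            vec[word] = (count / len(tokens)) * idf * scale
        vectors.append(vec)
    return vectors

# Toy, made-up hotel reviews (already tokenized); +1 = positive, -1 = negative.
docs = [
    ["room", "clean", "quiet"],
    ["room", "noisy", "dirty"],
    ["staff", "clean", "friendly"],
    ["staff", "noisy", "rude"],
]
labels = [1, -1, 1, -1]
dictionary = frozenset({"clean", "dirty"})  # stand-in for dictionary entries

so = sentiment_orientation(docs, labels)
vectors = tfidf_so_vectors(docs, so, dictionary)
```

In this toy corpus, "room" appears equally often in positive and negative reviews, so its SO is zero and it keeps its plain TF-IDF weight, while "clean" and "noisy" receive larger weights because they occur in only one class. Computing SO separately for each domain corpus is what would let the same word carry different weights in real estate, hotel, and restaurant texts.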


