  • Thesis


Emotion Detection for Unbalanced Indonesian Tweets

Advisor: 陳宜欣


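A Study of Indonesian Emotion Detection on Unbalanced Microblog Data
In recent years, Twitter data mining has become a popular research topic, and emotion analysis on microblogs is one of its many branches. Recently, researchers proposed a graph-based technique for extracting emotion patterns, which has achieved good results across several languages. This study aims to improve the precision of emotion analysis on Indonesian tweets, covering eight emotions: joy (senang), sadness (sedih), fear (takut), surprise (terkejut), disgust (jijik), anticipation (antisipasi), trust (percaya), and anger (marah). In previous work, the precision of emotion analysis for Indonesian was unsatisfactory, mainly because the emotion distribution in Indonesian tweets is unbalanced. This study therefore proposes a method that adjusts emotion pattern weights to address the unbalanced distribution. Experimental results show that the method can (significantly) improve the precision of emotion analysis on Indonesian tweets.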


ABSTRACT
Twitter mining has become an active research topic in recent years. Emotion detection is one research area that uses microblogs such as Twitter to discover emotions in textual data. Recently, a novel graph-based technique was proposed to extract emotion-bearing patterns, and the system has achieved good performance in different languages. By adopting this system, we aim to improve the accuracy of emotion detection for Indonesian, covering eight emotions: joy (senang), sadness (sedih), fear (takut), surprise (terkejut), disgust (jijik), anticipation (antisipasi), trust (percaya), and anger (marah). The data distribution among the emotions is highly unbalanced, which lowers the system's precision for Indonesian. In this study, we propose a pattern weight adjustment to address the unbalanced data problem for Indonesian. The experimental results show that the proposed approach can improve precision on unbalanced Indonesian data.
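The abstract does not spell out how pattern weights are adjusted, so the following is only an illustrative sketch of one common way to counter class imbalance, not the thesis's actual method. It assumes each tweet has already been reduced to a set of patterns, and it replaces raw pattern counts with per-emotion rates (count divided by the number of tweets of that emotion) before normalizing. The function names `adjusted_pattern_weights` and `classify`, and the toy patterns, are hypothetical.

```python
from collections import Counter, defaultdict


def adjusted_pattern_weights(samples):
    """Compute class-size-adjusted pattern weights (illustrative assumption).

    samples: iterable of (emotion, patterns) pairs, one pair per tweet.
    Returns {pattern: {emotion: weight}}, where each pattern's weights sum to 1.
    """
    samples = list(samples)
    # Number of tweets per emotion, used to offset the imbalance.
    class_sizes = Counter(emotion for emotion, _ in samples)

    # Number of tweets of each emotion that contain each pattern.
    counts = defaultdict(Counter)
    for emotion, patterns in samples:
        for pattern in set(patterns):
            counts[pattern][emotion] += 1

    weights = {}
    for pattern, per_emotion in counts.items():
        # Rate: fraction of the emotion's tweets that contain the pattern.
        rates = {e: c / class_sizes[e] for e, c in per_emotion.items()}
        total = sum(rates.values())
        weights[pattern] = {e: r / total for e, r in rates.items()}
    return weights


def classify(patterns, weights):
    """Return the emotion with the highest summed weight, or None if no pattern is known."""
    scores = Counter()
    for pattern in set(patterns):
        for emotion, weight in weights.get(pattern, {}).items():
            scores[emotion] += weight
    return scores.most_common(1)[0][0] if scores else None
```

With raw counts, a pattern seen in four of eight "senang" tweets and in both of two "takut" tweets would favor "senang"; after the adjustment it favors "takut", because the pattern covers a larger share of the smaller class.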


