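To extract evaluative information from review articles in the movie domain, this study implements an opinion mining system. Review articles are first collected from the movie board of the PTT BBS at National Taiwan University, and opinion units are obtained through sentence segmentation, word segmentation, and lexicon matching. A seed-word expansion method is used to extend the lexicons of positive and negative sentiment words. In addition to the orientation obtained from seed-word expansion, idiom orientation determination and sentence grammar-rule orientation determination are applied to produce the final opinion units with their opinion orientation. The system's determination of positive and negative orientation reaches 71.74% and 66.42%, respectively, under the F1 measure. The sentiment lexicon produced by the seed-word expansion method is also compared experimentally with the NTU Sentiment Dictionary (NTUSD) on opinion unit orientation determination; the lexicon from this study, with fewer words, achieves higher precision and recall.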
To find positive and negative opinions in movie comments, we implement an opinion mining system in this study. First, we crawl movie reviews from the PTT BBS forum and obtain opinion units through sentence segmentation, term segmentation, and term matching. Second, we use a seed-word expansion method that extends the seed words with their synonyms and antonyms, which yields a sentiment dictionary used to decide the orientation of the opinion units. In addition, an idiom orientation determination module and a sentence grammar orientation determination module contribute to the final opinion orientation. The orientation determination in this system reaches an F1 measure of 71.74% and 66.42% for positive and negative comments, respectively. Furthermore, we compare the sentiment dictionary generated in this study with the NTUSD dictionary; the results show that our sentiment dictionary uses fewer sentiment words while achieving higher precision and recall than NTUSD.
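The seed-word expansion step can be illustrated with a minimal sketch. It is not the implementation used in this study: the toy thesaurus (`SYNONYMS`, `ANTONYMS`), the function names, the breadth-first strategy, the `max_depth` limit, and the sum-based scoring of an opinion unit are all assumptions made for illustration. The only idea taken from the abstract is that synonyms of a seed word inherit its orientation while antonyms receive the opposite one.

```python
from collections import deque

# Hypothetical toy thesaurus; the lexical resource actually used in the
# study is not named in the abstract.
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["wonderful"],
    "bad": ["awful", "poor"],
    "awful": ["terrible"],
}
ANTONYMS = {
    "good": ["bad"],
    "wonderful": ["terrible"],
}


def expand_seeds(seeds, synonyms, antonyms, max_depth=2):
    """Grow a sentiment lexicon from seed words (assumed strategy).

    seeds maps a word to +1 (positive) or -1 (negative). Synonyms keep
    the polarity of the word they were reached from; antonyms flip it.
    The search is breadth-first and stops expanding at max_depth.
    """
    polarity = dict(seeds)
    queue = deque((word, 0) for word in seeds)
    while queue:
        word, depth = queue.popleft()
        if depth >= max_depth:
            continue
        neighbours = [(s, polarity[word]) for s in synonyms.get(word, [])]
        neighbours += [(a, -polarity[word]) for a in antonyms.get(word, [])]
        for nxt, pol in neighbours:
            if nxt not in polarity:  # the first label assigned is kept
                polarity[nxt] = pol
                queue.append((nxt, depth + 1))
    return polarity


def unit_orientation(terms, lexicon):
    """Label an opinion unit by summing the polarities of its terms."""
    score = sum(lexicon.get(term, 0) for term in terms)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = expand_seeds({"good": 1}, SYNONYMS, ANTONYMS, max_depth=2)
print(sorted(lexicon.items()))
print(unit_orientation(["plot", "great"], lexicon))
```

Limiting the depth is one simple way to keep the lexicon small, since polarity tends to drift as the expansion moves further from the seeds. The idiom and grammar-rule modules described above would adjust the orientation after this lookup and are not modelled here.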