A Study on Improving the Sentiment Classification Performance of Textual Data

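With the rapid growth of text-based communication platforms such as blogs, microblogs, and Twitter, the opinions and reviews that Internet users post about products or services they have purchased can spread quickly across cyberspace and directly affect other customers' purchase intentions and brand impressions. Given the rapidly growing volume of online reviews, how to effectively detect users' sentiment, especially negative sentiment, has become an emerging research issue. To address the sentiment classification problem and improve classification performance, this study proposes two feature selection methods: the "Modified categorical proportional difference (MCPD)" method and the "Feature category orientated (FCO)" index method. Online product reviews serve as the case study, and support vector machines (SVM) are used as the classifier to validate the proposed methods. Experimental results show that the proposed MCPD method achieves better classification performance when classifying with fewer feature dimensions, and that the FCO index method can improve sentiment classification performance.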

Keywords

Support vector machines; Feature selection; Sentiment classification

Parallel Abstract

With the rapid development of text-based communication platforms such as blogs, microblogs, and Twitter, the opinions and comments that users express about products and services can spread quickly in cyberspace and affect other consumers' purchase intentions and brand impressions. Faced with a rapidly increasing number of reviews on the Web, how to effectively detect users' sentiment, especially negative sentiment, has become an emerging research issue. To tackle this task and improve sentiment classification performance, this study proposes two feature selection methods for dimension reduction, called "Modified categorical proportional difference (MCPD)" and "Feature category orientated (FCO)". Several actual cases of online user comments are used to illustrate the effectiveness of the proposed methods, and support vector machines (SVM) are employed to construct the classifiers. Experimental results indicate that the proposed MCPD method achieves better classification performance when fewer features are used, and that the FCO approach can improve the sentiment classification performance of textual data.

Parallel Keywords

Support vector machines; Sentiment classification; Feature selection
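
The abstract names the MCPD method but does not define it, so the modification itself cannot be reproduced here. As a rough illustration of the kind of score it builds on, the sketch below computes the plain categorical proportional difference (CPD) as it is commonly described in the feature-selection literature: for a term and a class, (A - B) / (A + B), where A is the number of documents of that class containing the term and B is the number of documents of other classes containing it, with the term's score taken as the maximum over classes. The function names `cpd_scores` and `select_top_k` and the toy reviews are hypothetical, and nothing here should be read as the authors' MCPD or FCO procedure.

```python
from collections import Counter, defaultdict


def cpd_scores(docs, labels):
    """Plain CPD per term (an assumption; not the paper's MCPD).

    docs   : list of token lists, one per review
    labels : list of class labels, aligned with docs
    Returns {term: max over classes of (A - B) / (A + B)}.
    """
    df_by_class = defaultdict(Counter)  # class -> term -> document frequency
    df_total = Counter()                # term -> document frequency overall
    for tokens, label in zip(docs, labels):
        for term in set(tokens):        # count each term once per document
            df_by_class[label][term] += 1
            df_total[term] += 1

    scores = {}
    for term, total in df_total.items():
        best = -1.0
        for label in df_by_class:
            a = df_by_class[label][term]  # docs of this class with the term
            b = total - a                 # docs of other classes with the term
            best = max(best, (a - b) / total)
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy reviews, for illustration only.
docs = [
    ["great", "battery", "love"],
    ["terrible", "battery", "broke"],
    ["great", "screen"],
    ["terrible", "service"],
]
labels = ["pos", "neg", "pos", "neg"]

scores = cpd_scores(docs, labels)
selected = select_top_k(scores, 6)
```

In this toy data "battery" appears equally often in both classes, so it scores 0 and is the one term dropped when six features are kept; the retained terms could then be used as the reduced feature set passed to a classifier such as an SVM.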

