Method of Short Text Classification based on TF‐IWF Feature Selection

Abstract


[Objective] The TF-IDF algorithm removes the dependence on an external corpus in short text classification, but the feature weights it computes are concentrated in a narrow range and discriminate poorly between texts. A short text classification method based on chi-square statistics and the TF-IWF algorithm is therefore proposed. [Method] Feature words are extracted from the training data set using chi-square statistics, weighted with the TF-IWF algorithm, and then classified with an SVM classifier. [Results] Experimental results show that combining chi-square statistics with TF-IWF improves text classification accuracy by 3.1%, recall by 5.2%, and the F1 value by 3.7%. [Conclusion] The method widens the range of feature-word weights, increases the variance of the weights across the text set, and alleviates the sparsity of short text content to a certain extent, thereby improving short text classification performance.
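
The first two stages of the pipeline described above (chi-square feature selection, then TF-IWF weighting) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the formulas, so the sketch assumes the common definitions, namely a per-class 2x2 chi-square statistic (taking the maximum over classes) and IWF computed as the logarithm of the total token count of the corpus divided by the corpus-wide count of the word. The function names `chi_square_scores`, `select_features`, and `tf_iwf_vectors` and the toy data are hypothetical, and documents are assumed to be tokenized already.

```python
import math
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term by its largest chi-square statistic over all classes.

    Assumption: document-level presence/absence counts in a 2x2 table.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    vocab = set().union(*doc_sets)
    scores = {}
    for term in vocab:
        best = 0.0
        for label in set(labels):
            pairs = list(zip(doc_sets, labels))
            n11 = sum(1 for s, y in pairs if term in s and y == label)
            n10 = sum(1 for s, y in pairs if term in s and y != label)
            n01 = sum(1 for s, y in pairs if term not in s and y == label)
            n00 = sum(1 for s, y in pairs if term not in s and y != label)
            denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
            if denom:
                best = max(best, n * (n11 * n00 - n10 * n01) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def tf_iwf_vectors(docs, features):
    """Weight each selected feature by TF * IWF.

    Assumption: IWF(t) = log(total tokens in corpus / corpus count of t).
    """
    corpus_counts = Counter(w for d in docs for w in d)
    total = sum(corpus_counts.values())
    iwf = {
        t: math.log(total / corpus_counts[t]) if corpus_counts[t] else 0.0
        for t in features
    }
    vectors = []
    for d in docs:
        counts = Counter(d)
        length = len(d) or 1  # avoid division by zero on empty documents
        vectors.append([counts[t] / length * iwf[t] for t in features])
    return vectors


# Toy usage with hypothetical, pre-tokenized short texts.
docs = [
    ["cheap", "phone", "sale"],
    ["phone", "battery", "sale"],
    ["match", "goal", "team"],
    ["team", "win", "match"],
]
labels = ["shop", "shop", "sport", "sport"]

features = select_features(docs, labels, k=4)
vectors = tf_iwf_vectors(docs, features)
print(features)
print(vectors[0])
```

The resulting vectors could then be passed to any SVM implementation for the final classification step, for example scikit-learn's `LinearSVC`; that choice is an assumption here, since the abstract only states that an SVM classifier is used.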
