Research on Text Classification of Cyber Violence Speech Based on LSTM

With the development of technology, network communication has gradually become one of the most important ways for people to obtain and exchange information. However, because online speech is anonymous, cyber violence has become commonplace. In particular, teenagers have made up a growing share of Internet users in recent years, so supervising comments on social media is both important and necessary. This research analyzes comments on several controversial topics on Weibo, one of the most popular social media platforms in China, and distinguishes normal comments from harsh comments that attack other people. After data crawling, cleaning, and labeling, an LSTM model is trained on the resulting dataset. The model identifies not only comments with obviously rude remarks but also speech that is clearly a personal attack even though it contains no dirty words. After training, the study evaluates how well the LSTM model classifies the different types of speech. The results show that it is feasible to supervise violent speech by training an LSTM model. Compared with the previous method of speech supervision, which simply blocks impolite words, this method also shows high accuracy and sensitivity to some novel Internet terms and abbreviations.
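To make the comparison with word blocking concrete, the following is a minimal sketch of a keyword-blocklist baseline and of how accuracy and sensitivity could be computed for it. It is not the authors' code: the blocklist, the English toy comments, and the function names `keyword_filter` and `accuracy_and_sensitivity` are all hypothetical, and "sensitivity" is assumed here to mean recall on the violent class.

```python
# Toy illustration only: the blocklist and comments are hypothetical and are
# not taken from the paper's Weibo dataset.
BLOCKLIST = {"idiot", "trash"}  # hypothetical impolite words


def keyword_filter(comment: str) -> int:
    """Return 1 (violent) if any blocklisted word appears, else 0 (normal)."""
    words = comment.lower().split()
    return int(any(word.strip(".,!?") in BLOCKLIST for word in words))


def accuracy_and_sensitivity(y_true, y_pred):
    """Accuracy = correct / total; sensitivity (recall) = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# (comment, label) pairs; label 1 = violent speech, 0 = normal speech
samples = [
    ("You are an idiot!", 1),                          # attack with a rude word
    ("Nobody would miss you if you disappeared.", 1),  # attack, no rude word
    ("I disagree with this policy.", 0),
    ("Thanks for sharing the report.", 0),
]

y_true = [label for _, label in samples]
y_pred = [keyword_filter(text) for text, _ in samples]
accuracy, sensitivity = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, sensitivity={sensitivity:.2f}")
```

On this toy set the blocklist misses the second comment, a personal attack that contains no blocklisted word. That is the kind of case the abstract says the trained LSTM model is meant to catch.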

Keywords

LSTM; Social media; Cyber violence; Speech supervision
