Applying Deep Learning Techniques to Sentiment Analysis of Movie Reviews

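As modern information technology advances and the volume of online information grows rapidly, the combination of word vectors and deep learning models proposed in this study can help movie information service platforms assist film fans and general audiences in making consumption decisions more quickly. Representing text with the word2vec word vector model reduces dimensionality and captures the semantic relationships between words that appear in similar contexts. This study uses IMDb English-language movie reviews as its corpus. It pairs self-trained CBOW and Skip-gram word vector models and their weights with deep learning models (RNN, LSTM, and GRU) under different parameter settings (number of neurons, Dropout), and adds a word vector weight matrix generated by Keras as a control group, in order to examine the effect on prediction accuracy. The experiments show that the highest prediction accuracy, 86.92%, is achieved by the combination of CBOW and GRU (16 neurons, Dropout of 0.2). Second, Skip-gram outperforms both CBOW and the Keras-generated word vectors in average prediction accuracy. Finally, among the deep learning models, GRU and LSTM perform similarly and best, while RNN performs worst.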

Keywords

Sentiment Analysis; Word Vector Model; Deep Learning; IMDb

Parallel Abstract

With the development of modern information technology and the rapid growth of online information, the word vector and deep learning models proposed in this study can help film information service platforms assist movie fans and general audiences in making consumer decisions more quickly. The word2vec word vector model is used to represent text; it has the advantage of dimensionality reduction and can capture the semantic relationships between words that occur in similar contexts. In this study, IMDb English-language film reviews were used as the corpus. Self-trained CBOW and Skip-gram word vector models and their weights were combined with deep learning models (RNN, LSTM, and GRU) under different parameter settings (number of neurons, Dropout), and a word vector weight matrix generated by Keras was added as a control group, to explore the impact on prediction accuracy. The experimental results show that the highest prediction accuracy, 86.92%, was obtained with the combination of CBOW and GRU (16 neurons, Dropout of 0.2). Second, Skip-gram outperformed both CBOW and the Keras-generated word vectors in average prediction accuracy. In addition, among the deep learning models, GRU and LSTM performed similarly and better than the RNN, which performed worst.

Parallel Keywords

Sentiment Analysis; Word Vector Model; Deep Learning; IMDb
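
The abstract compares two word2vec training schemes, CBOW and Skip-gram. As a rough illustration of how they differ, the following minimal sketch builds the training examples each scheme would draw from one tokenized review: CBOW predicts a centre word from its surrounding context, while Skip-gram predicts each context word from the centre word. The function names, the window size, and the sample sentence are assumptions made for this illustration and do not come from the study; the actual word2vec training step (for example, negative sampling) is not shown.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW view: (context words, centre word) examples.

    Hypothetical helper for illustration; not the study's code.
    """
    pairs = []
    for i, target in enumerate(tokens):
        # Up to `window` words on each side of position i.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram view: (centre word, one context word) examples.

    Hypothetical helper for illustration; not the study's code.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Invented sample sentence; the window size of 1 is an assumption.
review = "this movie was surprisingly good".split()

for context, target in cbow_pairs(review, window=1):
    print("CBOW      ", context, "->", target)

for center, context_word in skipgram_pairs(review, window=1):
    print("Skip-gram ", center, "->", context_word)
```

With the same window, Skip-gram produces one example per (centre, context) pair, whereas CBOW produces one example per centre word, so Skip-gram generates more training examples from the same text.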

