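In recent years, the rise of social networks has turned them into providers of big data as people share information with one another, and the resulting information overload makes it harder for users to analyze and digest that information. This study implements a system that classifies product reviews as positive or negative, with the goal of analyzing the massive volume of product reviews on social networks, extracting the information that is actually useful, and classifying it by sentiment polarity. The system takes product-related reviews from social networking sites as its data source, uses natural language processing techniques to convert the text into word vectors that a computer can process, and then applies deep learning and sentiment analysis techniques to classify each review as positive or negative. This study uses Jieba for Chinese word segmentation and word2vec to train the word vector model and extract text features, then applies PCA to reduce the dimensionality of the data. Finally, three machine learning and deep learning methods (SVM, MLP, and LSTM) are used to classify the reviews as positive or negative. During the experiments, the parameters were adjusted over many training runs, and the results were cross-compared to identify the best method and parameter settings and thereby train a more effective model. The experimental results show that LSTM achieves higher text classification accuracy than the other two methods and is less sensitive to parameter changes, making it a comparatively stable classification model.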
In recent years, with the growing popularity of social networking, the way people share information with one another has turned social networks into providers of big data. This information overload also makes it harder for users to analyze information. This study aims to analyze a large number of product reviews on social networks and to extract useful information by classifying the reviews as positive or negative. The dataset consists of product reviews from Taobao.com; we convert the text into word vectors using natural language processing techniques and then use machine learning and deep learning techniques to classify the reviews as positive or negative. The study uses the Jieba tool for Chinese word segmentation and word2vec to train the word vector model and extract text features. Finally, we classify the reviews using three machine learning and deep learning methods: SVM, MLP, and LSTM. In the experiments, we adjust the parameters and cross-compare the results of numerous runs in order to find the best method and parameter settings and to train a more effective model.
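As a rough illustration of the feature-extraction stage described above (segmented text, then word vectors, then PCA), the following minimal sketch may help. It rests on several assumptions that are not taken from the study: the reviews are toy examples already split into tokens in the way a segmenter such as Jieba might return them, random vectors stand in for a trained word2vec model, each review is represented by the mean of its word vectors (the abstract does not say how word vectors are pooled), and PCA is computed directly with an SVD. The names `review_vector` and `pca_reduce` are hypothetical helpers.

```python
import numpy as np

# Hypothetical pre-segmented reviews, as a segmenter such as Jieba might return them.
reviews = [
    ["质量", "很", "好"],
    ["物流", "太", "慢"],
    ["很", "满意"],
    ["质量", "太", "差"],
]

# Stand-in embedding table: random vectors replace a trained word2vec model.
rng = np.random.default_rng(0)
vocab = sorted({tok for review in reviews for tok in review})
dim = 8
embeddings = {tok: rng.normal(size=dim) for tok in vocab}


def review_vector(tokens, table, dim):
    """Average the vectors of known tokens (assumed pooling; zeros if none are known)."""
    vecs = [table[t] for t in tokens if t in table]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# One fixed-length feature vector per review, then a lower-dimensional projection.
X = np.vstack([review_vector(r, embeddings, dim) for r in reviews])
X_reduced = pca_reduce(X, n_components=2)
print(X.shape, X_reduced.shape)
```

The reduced matrix `X_reduced` is the kind of input that would then be passed to a classifier such as an SVM or MLP; an LSTM would more typically consume the per-token vector sequence rather than a pooled vector.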