Sentiment Analysis of Internet Public Opinions After Introducing Distance-based Electronic Toll Collection on Taiwan's Freeway

Advisors: 陶治中, 蕭瑞祥


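People in Taiwan depend heavily on transportation systems in their daily lives. As Internet technology has matured and become widespread, the public has grown accustomed to posting and sharing experiences of using transportation systems on online platforms. If this user feedback can be collected and analyzed effectively, managers can identify important issues in real time and formulate corresponding strategies.
This study applies data mining techniques to the analysis of traffic information. Drawing on a literature review and discussions with experts, it derives an Internet public opinion analysis model consisting of four steps: text gathering, text processing, text classification, and sentiment analysis. The model is applied to a major transportation issue in Taiwan: it is used to analyze online comments posted after distance-based toll collection was implemented, in order to understand the sentiment trends of the topics discussed in those comments. Online comment texts were gathered and processed with a web crawler system and programs; the CKIP word segmentation system and Microsoft Excel were used for word segmentation and word frequency statistics; and Weka was then used for text classification and subsequent applications. To make the results reliable, the comment texts were read manually and assigned to the corresponding categories, and the feature term lexicon and the sentiment lexicon were built in consultation with experts. These serve as the foundation for text classification and sentiment analysis.
The empirical results show that the proposed model for analyzing online comment texts on major transportation issues has good analytical capability, indicating that the feature term lexicon was well constructed and the classifier well chosen. In the analysis of Internet public opinion after distance-based toll collection was implemented, the topics that drew the most attention were the toll collectors' protests, freeway congestion, differential rates, and distance-based tolling and toll fees. This shows that the public is most concerned with policy-related matters, product usage experience, and problems actually encountered on the road. For the sentiment analysis, time-sentiment curves were drawn by accumulating daily sentiment values. The sentiment trends of individual topics tend to be negative, which indicates that popular topics contain more complaints. Overall, however, the sentiment trend gradually turns positive, suggesting that the public has gradually come to accept the distance-based toll collection policy.
Taken together, the results show that the feature term lexicon and the sentiment lexicon have a decisive influence on text and sentiment classification in the proposed model, while the sentiment curves require long-term historical data before the overall direction of a topic can be identified. The model analyzes historical information and provides an analysis workflow. Its results can serve as a reference for researchers and decision makers in related industries, and the model itself can serve as a basis for future automated systems for analyzing Internet public opinion. In addition, by modifying the feature term lexicon and the sentiment lexicon and retraining the classifier, the model can serve as a reference for analyzing comments in other fields.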


People in Taiwan rely more and more on transportation systems in their daily lives. With Internet technologies now ubiquitous, they have become used to sharing and expressing their views on their experience of transportation systems via online platforms. If managers can obtain these public opinions in real time, they can formulate corresponding strategies effectively. This study applies data mining technology to traffic information analysis through an approach that consists of text gathering, text processing, text classification, and sentiment analysis. An empirical study of public opinion and sentiment regarding distance-based Electronic Toll Collection (ETC) on Taiwan's freeways is conducted. ETC-related text data are gathered and processed with the help of web crawler systems. Word segmentation and keyword frequency statistics are then completed using the CKIP system and Microsoft Excel. Finally, Weka is used for text classification and further applications. To make the results more reliable, the texts are classified by manual reading. In addition, the feature and sentiment word databases are constructed in consultation with experts and serve as the foundation for text classification and sentiment analysis. The results verify that the proposed model is valid for analyzing public opinion on transportation issues and that the text classifier and feature items need no adjustment. People pay the most attention to topics such as toll collector protests, freeway congestion, differential rates, and distance-based toll collection. For the sentiment analysis, daily accumulated sentiment values are used to draw curves that show how public opinion trends over time. The results show that the public has gradually come to accept distance-based Electronic Toll Collection on Taiwan's freeways.
In conclusion, the feature items used in the Internet public opinion analysis have a great influence on text classification, and the sentiment curves need long-term historical data to identify evolving trends. The approach in this study can also be applied to other disciplines by modifying the text feature items and the classifier.
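The abstract describes drawing sentiment curves from daily accumulated sentiment values. A minimal sketch of that accumulation step follows, assuming a simple word-to-polarity lexicon and comments that are already segmented into tokens. The lexicon entries, scores, dates, and function names here are hypothetical illustrations, not the lexicon or code used in the study.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# Hypothetical sentiment lexicon (illustrative words and scores only).
SENTIMENT_LEXICON = {"convenient": 1, "smooth": 1, "congested": -1, "overcharged": -1}


def score_tokens(tokens, lexicon=SENTIMENT_LEXICON):
    """Sum the polarity of every token found in the lexicon; unknown tokens count as 0."""
    return sum(lexicon.get(tok, 0) for tok in tokens)


def cumulative_daily_sentiment(comments, lexicon=SENTIMENT_LEXICON):
    """Turn (day, tokens) pairs into a date-sorted list of (day, running sentiment total)."""
    daily = defaultdict(int)
    for day, tokens in comments:
        daily[day] += score_tokens(tokens, lexicon)
    days = sorted(daily)
    running = accumulate(daily[d] for d in days)
    return list(zip(days, running))


# Made-up sample comments, already segmented into tokens.
comments = [
    (date(2020, 1, 1), ["congested", "overcharged"]),
    (date(2020, 1, 1), ["smooth"]),
    (date(2020, 1, 2), ["convenient", "smooth"]),
]
curve = cumulative_daily_sentiment(comments)
```

Plotting the second element of each pair in `curve` against the first gives a time-sentiment curve. Because the values are running totals, a curve that starts negative and rises reflects later days contributing more positive than negative terms, which is why a long history is needed before the overall direction becomes clear.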




