Building a Recommendation System for Motorcycle Rider Safety Gear Using Semantic Analysis Techniques

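In an era of exploding online content and information, big data has become a very popular topic in data analysis. Its sources are broad: government open data, social networking sites, bulletin board systems, and many other platforms where users interact. The messages users post on these platforms form a huge social database that can support analysis on many different topics. In the past, however, gathering user opinions for analysis usually relied on manual market surveys, which take time and yield too few samples, so the expected results could not be achieved. Moreover, if data collection is slow and fails to match customer needs accurately, important market opportunities are missed. How to surface the important information in such a large social database is therefore a pressing question. Taking data from the Jorsindo Motor Club discussion forum as an example, this study builds a recommendation system for rider personal safety gear. It captures users' discussion posts and redefines the unstructured data as structured data. After data pre-processing, text mining methods are used to extract information. In the text mining workflow, each post and each reply first goes through data capture and word segmentation; keywords are then analyzed, and a lexicon of positive and negative word meanings is built as the basis for data analysis. Next, semantic analysis applies this lexicon to determine whether the sentiment of a user's post or reply is positive or negative. Users can also define their own vocabulary to improve the accuracy of the sentiment analysis. Besides showing the tendency of users' post content, the system also provides analytical information as a reference for decision-making.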

Keywords

Big data; semantic analysis; recommendation system

Parallel Abstract

In the current age of information explosion, big data has drawn wide attention and become one of the most popular topics in data analysis. This is partly due to its broad range of sources, for instance government open data, social media websites, and bulletin board systems (BBS). These platforms share one trait: many users post messages through them, which together form a very large social database that can be used for analysis on diverse themes. In the past, however, collecting user opinions for data analysis usually relied on manual market research. This is time-consuming, and the number of samples is often insufficient, so results fall below expectations. In addition, if collection takes too long and does not match the client's requirements, there is a risk of missing market opportunities. How to surface the crucial information in a massive social database has therefore become a significant question. This study takes Jorsindo Motor Club as an example and builds a recommendation system for rider safety equipment. Articles are captured from the forum and organized into structured data. After pre-processing, information is extracted by text mining. In this procedure, data capture and word segmentation are performed on every post and reply. Keywords are then analyzed, and a lexicon of positive and negative word meanings is built as the basis for data analysis. The next step is semantic analysis, which uses this lexicon to determine whether the sentiment of users' posts and replies on the forum is positive or negative. Users can also define their own vocabulary to improve the accuracy of the sentiment analysis. The system not only reports the tendency of users' content but also provides analytical information as a reference for decision-making.
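The lexicon-based step described above can be illustrated with a minimal sketch. It is not the study's implementation: the lexicon entries, the sample posts, and the function names (`tokenize`, `score_post`, `label`, `summarize`) are all hypothetical, and simple whitespace tokenization of English text stands in for the word-segmentation step that Chinese forum posts would require.

```python
from collections import Counter

# Hypothetical seed lexicon: word -> polarity (+1 positive, -1 negative).
BASE_LEXICON = {
    "comfortable": 1,
    "sturdy": 1,
    "recommend": 1,
    "fogs": -1,
    "cracked": -1,
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation.

    Stand-in for word segmentation; Chinese text would need a real segmenter.
    """
    stripped = (word.strip(".,!?;:\"'()") for word in text.lower().split())
    return [token for token in stripped if token]


def score_post(text, lexicon):
    """Sum the polarities of all lexicon words that occur in the text."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def label(score):
    """Map a numeric score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(posts, lexicon):
    """Count how many posts or replies fall under each sentiment label."""
    return Counter(label(score_post(post, lexicon)) for post in posts)


# Made-up sample posts, for illustration only.
posts = [
    "This helmet is comfortable and sturdy, I recommend it.",
    "The visor fogs up and the shell cracked.",
    "The jacket is pricey.",
]

base_summary = summarize(posts, BASE_LEXICON)

# A user-defined entry extends the lexicon without modifying the base one.
custom_lexicon = {**BASE_LEXICON, "pricey": -1}
custom_summary = summarize(posts, custom_lexicon)

print(dict(base_summary))
print(dict(custom_summary))
```

With the base lexicon the third post contains no known words and is labeled neutral; once the user adds "pricey" as a negative term, the same post is labeled negative. This shows how a user-defined vocabulary can change the result of the analysis.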

Parallel Keywords

Big Data; Semantic Analysis; Recommendation System

