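Big data is currently one of the most popular topics in information science, and some have likened it to the "new oil" driving the development of human society. With the arrival of the big data era, more and more people are engaging in data mining and extraction. Data mining tools use data to build models that simulate the real world, and these models describe the patterns and relations in the data. Such models serve two purposes. First, understanding the patterns and relations in the data provides the information needed for decision making; for example, an association model can help a supermarket or department store plan how to arrange its goods. Second, the patterns in the data support sales forecasting; for example, customer profiles can be used to predict which customers are most likely to respond to a promotion, so that mail-order promotions are sent only to targeted recipients rather than wasting printing and postage costs on a campaign that draws few responses. This has also fueled the rapid growth of e-commerce. This paper takes centrality analysis algorithms as a reference and improves on them to identify the important nodes (central nodes) of big data within a specific range; these nodes help us understand what they represent in relation to other nodes under an association model. Finally, the results are visualized using the R language.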
Big data has become one of the most popular topics in information science. Some even regard big data as the "new oil" that drives the development of human society. Accordingly, more and more people are developing data mining tools that build models simulating the real world in order to describe the patterns and relationships in data. Applying these models can benefit businesses in two ways. First, uncovering useful characteristics and relationships in the data helps enterprises make decisions; for example, managers can use association models of goods to arrange product displays or plan promotions. Second, the characteristics of the data help enterprises forecast sales. For example, managers can use customer profiles to predict which types of customers are most likely to respond to a marketing strategy, and then conduct customized marketing. To enhance social computing on big data, this paper develops a centrality analysis algorithm and incorporates it into an information system that can be applied to uncover interesting patterns in social networks. The paper also devotes considerable effort to data visualization using the R language, helping users comprehend the meaning of the system's output.