  • Degree thesis

The Complex Network of Paper Citation: Single-Topic Citation Network Analysis

Advisor: 翁昭旼

Abstract


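We collected from the web the citations of papers on the NOD2 gene, formatted the references of these papers, and stored them in a database to support fast searching and network reconstruction. Once the database was built, we constructed the citation network and analyzed it. In these analyses we used the Poisson distribution to compute the probability distribution of node degree, and we computed several network statistics, including the clustering coefficient (C) and the average shortest path length (L). The data show that C in this single-topic citation network is larger than in the random network used for validation, which indicates clustering, and that L is short, which indicates the presence of hub nodes. From this we conclude that the network is a scale-free network, and we propose a hub-node search algorithm to find its hub nodes. We also want to predict the growth and decline in citations of the paper that each node represents. We separate the nodes with two methods: a threshold network and an ordered probit model. The threshold network separates growing nodes from declining nodes that are otherwise mixed together; the ordered probit model then divides the separated regions into regions of higher and next-higher growth. As a check, we applied the approach to the KDD 2003 citation dataset to verify that it is feasible. The results show that it does reduce the prediction error, so we also applied it to our own citation network.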
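The quantities named in this abstract have standard textbook forms, restated here as a reading aid. The symbols ($N$, $k_i$, $E_i$, $d_{ij}$, $\lambda$, $\gamma$) are assumptions chosen for illustration; the thesis itself may define or normalize these measures differently.

```latex
% Poisson degree distribution of a random network with mean degree \lambda
P(k) = \frac{e^{-\lambda}\,\lambda^{k}}{k!}

% Clustering coefficient: E_i is the number of edges among the k_i neighbors of node i
C = \frac{1}{N}\sum_{i=1}^{N} \frac{2E_i}{k_i\,(k_i - 1)}

% Average shortest path length: d_{ij} is the shortest-path distance between nodes i and j
L = \frac{1}{N(N-1)}\sum_{i \neq j} d_{ij}

% Scale-free networks are commonly characterized by a power-law degree distribution
P(k) \sim k^{-\gamma}
```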

Parallel Abstract


To analyze a single-topic literature citation network, we focus on the NOD2 gene. Our research examines clustering in the network and the rise and fall of paper citations. After identifying the important hub nodes in the network, we try to predict citation counts. First, we search for the topic NOD2 in the ISI Web of Knowledge. Second, we use each paper's references to build the citation network. After constructing the network, we analyze its characteristics. We first use the Poisson distribution to generate a node degree distribution and use it to build a random network, which we then compare with our network. From the clustering coefficient (C), the average shortest path length (L), and the citation distribution of our network, we conclude that it is a scale-free network. A larger C than in the random network means that our network is far more clustered than the random network. A small L compared with the random network means that our network contains hub nodes. We therefore provide a hub-node search algorithm to find them. We also want to predict the rise and fall of paper citations, and we provide two methods to distinguish rising nodes from falling ones: a threshold network and an ordered probit model. The goal of the threshold network is to distinguish rising trends from falling ones. The ordered probit model is used to distinguish the significance of growth. We verified these methods on the KDD 2003 citation network, where the results show that the approach can reduce the prediction error rate. We therefore apply it to our own network and retrieve the network's characteristics.
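The comparison of C and L against a random network can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the toy edge list, the helper names (`from_edges`, `clustering_coefficient`, `average_shortest_path_length`, `random_graph`), and the use of a uniformly sampled random graph with the same node and edge counts as the baseline are all hypothetical choices, and the citation graph is treated as undirected.

```python
import random
from collections import deque
from itertools import combinations


def from_edges(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Mean local clustering coefficient C; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among this node's neighbors.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path_length(adj):
    """Mean BFS distance L over ordered pairs of distinct, mutually reachable nodes."""
    dist_sum, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else 0.0


def random_graph(n, m, seed=0):
    """Baseline (assumption): n nodes, m edges sampled uniformly without replacement."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for u, v in rng.sample(list(combinations(range(n), 2)), m):
        adj[u].add(v)
        adj[v].add(u)
    return adj


# Hypothetical toy network: node 0 acts as a hub joining three triangles.
toy = from_edges([(0, i) for i in range(1, 7)] + [(1, 2), (3, 4), (5, 6)])
c_toy = clustering_coefficient(toy)
l_toy = average_shortest_path_length(toy)

# Random baseline with the same number of nodes (7) and edges (9).
rand = random_graph(7, 9)
c_rand = clustering_coefficient(rand)
l_rand = average_shortest_path_length(rand)

print(f"toy:    C={c_toy:.3f}  L={l_toy:.3f}")
print(f"random: C={c_rand:.3f}  L={l_rand:.3f}")
```

For the toy graph, C is 6.2/7 (about 0.886) and L is 66/42 (about 1.571); the baseline values depend on the seed. On a real citation dataset the same two functions would be applied to the adjacency map built from each paper's reference list.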

