透過您的圖書館登入
IP:18.226.169.94
  • 學位論文

利用基因演算法及圖型探勘來產生圖型原型以利圖型分類

Graph Prototype Generation for Graph Classification Using Genetic Algorithms and Graph Mining

指導教授 : 蘇豐文

摘要


近年來,圖形被廣泛地運用在表示結構化的事物上。這裡所用的「圖形」是指一群有標號或沒有標號的點以及一群有標號或沒有標號,有向或無向的邊所形成的組合。舉例來說,一個圖形可以表示化合物的結構,網頁的連結狀態,以及其他數種結構化的物件。 雖然基於圖形的資料表示型態在近年來越來越流行,然而缺少強大的分析工具是它的最重大的弱點。為了改善這個弱點,有很多的研究都嘗試著把圖形化的資料映射至特徵向量上,如此就可以套用數值化的分類器,例如SVM、Boosting等等,在這些資料上。 在這篇論文中,我們提供了另一種觀點去處理圖形分類的問題。那就是原型產生的方法。透過直接運用圖形的結構特徵,利用基因演算法去產生能夠很好的表達類別間之差異的圖形原型。在這個方法中,我們提出了一種基於圖形的基因運算子來實作圖形間的交配繁殖以產生下一子代圖形的方法。另外,我們也利用gSpan來找出圖形之間的最大相同子圖(Maximal Common Subgraph),透過最大相同子圖算出圖形之間的距離,來作為圖形分類的依據以及圖形原型演化的合適度計算。 用我們的方法來輔助圖形分類的結果,可以一定程度的提高分類的正確率。在跟其他方法的比較中,是趨近於最佳的。而且,只要有一個演化的指標,這個圖形基因演算的方法就可以套用在任何其他的方法上。

關鍵字

圖型 原型 分類 基因演算法

並列摘要


Recently, graphs are widely used to represent structured objects. The term ‘graphs’ used here means a combination of labeled/unlabeled vertices and directed/undirected labeled/unlabeled edges. For example, a graph can represent the structure of chemical compounds , web page linking , and many other kinds of structured data. Although graph-based data is becoming more and more popular lately, the lack of powerful analytic tools is its major weakness and bottleneck. To cope with the disadvantage, various approaches attempted to map graph-based data structure into feature vectors so that they can apply statistical classifiers such as Support Vector Machine (SVM) , Boosting, etc. to classify the graphs. In this thesis, we propose a new point of view to deal with graph classification problems, i.e. prototype generation approach. That is, directly use the features of graphs (node labels, edge labels, connected components, subgraphs) to generate prototypes for each class that maximize the difference between intra-class similarity and inter-class similarity. In this approach, a graph-based genetic algorithm which includes genetic operators is used to generate offsprings, and gSpan (graph-based Substructure pattern mining) [Yan & Han 2002] is used to mining subgraphs to compute the fitness of selected prototypes. The classification accuracy of our method is near the best compared with other approaches with statistical classifiers. And it can be applied on almost every approach if a precise objective is given.

並列關鍵字

graph prototype classification genetic algorithm

參考文獻


Simpler Patterns. 10th European Conference on Principles and Practice of Knowledge Discovery in Databases. 55-66, 2006.
Bunke, H. Error Correcting Graph Matching: On The Influence of The Underlying Cost
Bunke, H., Shearer, K., 1999. A Graph Distance Metric Based on The Maximal Common
International Workshop on Mining and Learning with Graphs, 2008.
Cheng, H., Yan, X., Han, J., Hsu, H. Discriminative Frequent Pattern Analysis for

延伸閱讀