Constructing Classification Trees Using Genetic Algorithms

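Constructing classification trees with a genetic algorithm is a novel experimental study. We use the genetic algorithm not only to help the decision tree choose the best attributes, but also in combination with information theory, taking entropy minimization as the splitting criterion of the classification tree. In our approach, the genetic algorithm starts from a heuristically generated initial population and uses the minimized entropy and the real-time online test error as its measures for finding the best split point in the classification tree. It determines the split attribute and the split value at the same time, so the tree is derived step by step from the best nodes. We run experiments on five representative classification data sets and compare the proposed approach with other decision tree algorithms. The experimental results show that the proposed hybrid heuristic genetic classification tree algorithm effectively reduces the complexity of the splits in the classification tree and builds smaller trees. In addition, we developed a prototype system that assists in constructing classification trees and automatically generates rules to support decision makers. We attempt to use genetic algorithms to design a knowledge acquisition system that constructs classification trees for further data mining applications.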
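The abstract does not spell out the entropy criterion, so the following is only the standard information-theoretic form, written under the assumption that each node makes a binary split on one attribute $a$ at a threshold $t$. The symbols $S$ (the samples at a node), $p_k$ (the fraction of $S$ in class $k$), $S_L$, and $S_R$ are introduced here for illustration and do not come from the source.

```latex
H(S) = -\sum_{k=1}^{K} p_k \log_2 p_k,
\qquad
H(S \mid a, t) = \frac{|S_L|}{|S|}\, H(S_L) + \frac{|S_R|}{|S|}\, H(S_R),
\qquad
S_L = \{x \in S : x_a \le t\}, \quad S_R = S \setminus S_L
```

Under this reading, a candidate pair $(a, t)$ is better the smaller $H(S \mid a, t)$ is, and a value of 0 means both child nodes are pure.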

Keywords

Genetic Algorithm; Classification Tree; Decision Tree; Entropy; Information Theory

Parallel Abstract

This paper presents a novel implementation study on using genetic algorithms to construct classification trees. The genetic algorithm not only helps select the optimal features as tree nodes but is also combined with information theory, which supplies the splitting criterion of the classification tree: minimizing entropy. The genetic algorithm serves as the method for splitting the subsets of individuals associated with the nodes. The stopping criterion for tree induction is based on a heuristic that recognizes whether the set of individuals associated with a node forms a sub-population. Information theory is thus combined with genetic-algorithm-based computation to induce the classification tree. Experimental results for five data mining classification problems are presented and compared with those of other decision tree algorithms. The results indicate that the Hybrid GA-Heuristic Classification Tree algorithm reduces the complexity of the splitting methods used and constructs small trees. A prototype is presented that assists classification tree construction and produces a set of rules to support decision makers. We attempt to design a knowledge acquisition system that uses genetic algorithms to construct classification trees for data mining applications.
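As a rough illustration of the split search described in the abstract, the sketch below evolves (attribute, threshold) pairs for a single node and scores each pair by the weighted entropy of the two resulting subsets. It is not the thesis's algorithm: the function names, the truncation selection, the threshold-averaging crossover, the 20% mutation rate, and the toy data are all assumptions made for the example, and the online test error term mentioned in the abstract is left out.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_entropy(rows, labels, attr, threshold):
    """Weighted entropy after splitting on rows[i][attr] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[attr] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[attr] > threshold]
    n = len(labels)
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)


def ga_best_split(rows, labels, pop_size=20, generations=30, seed=0):
    """Evolve (attribute, threshold) pairs; lower split entropy is fitter."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])

    def random_candidate():
        # Heuristic seeding (an assumption): thresholds come from observed values.
        attr = rng.randrange(n_attrs)
        return attr, rng.choice(rows)[attr]

    def fitness(cand):
        return split_entropy(rows, labels, *cand)

    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # truncation selection, keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            (a1, t1), (a2, t2) = rng.sample(parents, 2)
            # Crossover: average the thresholds when both parents use the same attribute.
            child = (a1, (t1 + t2) / 2) if a1 == a2 else (a1, t1)
            if rng.random() < 0.2:  # mutation: replace with a fresh candidate
                child = random_candidate()
            children.append(child)
        population = parents + children
    best = min(population, key=fitness)
    return best, fitness(best)


# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 does not.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 2.0],
        [7.0, 5.5], [8.0, 2.5], [9.0, 4.0], [10.0, 3.5]]
labels = ["a"] * 4 + ["b"] * 4

best, score = ga_best_split(rows, labels)
print("best (attribute, threshold):", best, "weighted entropy:", score)
```

Because the best half of each generation is carried over unchanged, the best score never gets worse from one generation to the next. Building a whole tree would mean applying the same search recursively to the two subsets until some stopping heuristic fires, which this sketch does not attempt.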

Parallel Keywords

Genetic Algorithm; Classification Tree; Decision Tree; Entropy; Information Theory

