  • Thesis

以基因規劃法自動建構多元節點迴歸樹

The Automatic Construction of Multivariate Split Point Regression Trees: A Genetic Programming Approach

Advisor: 邱昭彰

Abstract


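Data mining is a general term for an emerging family of knowledge-extraction techniques that use algorithms to automatically search for useful and interesting relationships among data attributes. Many data mining techniques (such as neural networks) struggle to provide explicit rule descriptions of these relationships. In recent years, genetic programming has been applied successfully in many fields and has shown the pattern-recognition ability to uncover implicit knowledge in large volumes of data. This study proposes an algorithm that automatically constructs multivariate split point regression trees with genetic programming. The algorithm improves on the earlier work "Constructing Regression Trees with Genetic Programming" and offers an alternative to existing regression tree algorithms. The study focuses on using genetic programming to find nodes that involve multiple attributes, together with their split values, combines this with local linear regression, and then applies tree pruning and evaluation to construct the best multivariate split point regression tree. Experiments on two regression data sets compare the proposed algorithm with other regression algorithms. The results show that the proposed algorithm is considerably more accurate than the previous work, and that its advantage over the previous work is even greater in the testing phase. In addition, a prototype system was developed to assist in constructing multivariate split point regression trees and to automatically generate rules that support decision makers.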

Parallel Abstract


Data mining is the automated search for interesting and useful relationships between attributes in a database. Many of the best techniques (such as neural networks) yield little in the way of usable rules. In recent years, there has been considerable success in the use of genetic programming (GP) to evolve pattern recognizers. In this article we present a GP-based multivariate split point regression tree algorithm (called GPMRT) as an alternative to existing regression tree approaches. It is a revised version of an algorithm from previous research (called GPRT). It uses genetic programming and local linear regression to construct regression trees by genetic selection of features and univariate split points, then applies tree pruning and evaluation to search for an optimized regression tree. In our research, we focus on multivariate split points and introduce a minimum description length (MDL) principle to balance accuracy and parsimony. We aim to show that a regression tree with multi-dimensional splits is more effective than one with single-dimensional splits. The experimental results show that GPMRT outperforms GPRT in both the training and testing phases.
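To make the contrast between univariate and multivariate split points concrete, the following is a minimal Python sketch, not the thesis's implementation. It assumes a split of the linear form sum(w_i * x_i) <= threshold, a least-squares line on a single attribute as the "local linear regression" in each leaf, and a BIC-like penalty standing in for the MDL criterion. The abstract does not specify any of these forms, and the names `MultivariateSplit`, `fit_line`, `split_sse`, and `mdl_style_cost` are hypothetical. In GPMRT the split itself would be evolved by genetic programming; here two fixed splits are simply compared on toy data.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class MultivariateSplit:
    """Hypothetical split node: a sample goes left when sum(w_i * x_i) <= threshold."""
    weights: Sequence[float]
    threshold: float

    def goes_left(self, x: Sequence[float]) -> bool:
        return sum(w * xi for w, xi in zip(self.weights, x)) <= self.threshold


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a + b * x on one attribute."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # attribute is constant in this leaf: fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def split_sse(split: MultivariateSplit, X: List[Sequence[float]],
              y: Sequence[float], leaf_attr: int = 0) -> float:
    """Sum of squared errors of one split with a fitted line in each leaf."""
    sse = 0.0
    for side in (True, False):
        rows = [(x[leaf_attr], t) for x, t in zip(X, y) if split.goes_left(x) == side]
        if not rows:
            continue
        xs, ys = zip(*rows)
        a, b = fit_line(xs, ys)
        sse += sum((t - (a + b * x)) ** 2 for x, t in rows)
    return sse


def mdl_style_cost(sse: float, n_samples: int, n_params: int) -> float:
    """Assumed BIC-like trade-off between error and model size (not the thesis's formula)."""
    return (0.5 * n_samples * math.log(sse / n_samples + 1e-12)
            + 0.5 * n_params * math.log(n_samples))


# Toy data: the regime depends on x0 + x1, so no single-attribute threshold separates it.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
X = [(x0, x1) for x0 in grid for x1 in grid]
y = [2.0 * x0 if x0 + x1 <= 1.0 else 5.0 - x0 for x0, x1 in X]

multi = MultivariateSplit(weights=(1.0, 1.0), threshold=1.0)  # uses both attributes
uni = MultivariateSplit(weights=(1.0, 0.0), threshold=0.5)    # uses x0 only

multi_sse, uni_sse = split_sse(multi, X, y), split_sse(uni, X, y)
multi_cost = mdl_style_cost(multi_sse, len(X), n_params=7)  # 2 weights + threshold + 2 lines
uni_cost = mdl_style_cost(uni_sse, len(X), n_params=6)      # 1 weight + threshold + 2 lines
print("multivariate:", multi_sse, multi_cost)
print("univariate:  ", uni_sse, uni_cost)
```

On this toy data the multivariate split fits each side almost exactly, while the single-attribute split mixes both regimes in its leaves. The penalty term is included only to show how an extra weight in a node could be charged against the accuracy it buys.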

