
利用MaxF篩選變數組合之迴歸樹 (Regression Trees Using MaxF to Select Attribute Combinations)

Enhanced Sample-Efficient Regression Trees with MaxF Selection Criterion and Attribute Combination Selection

Advisor: 陳正剛

Abstract


No data available

Keywords

Regression trees

Parallel Abstract


Conventional regression trees use variance reduction as the measure for selecting attributes and splitting the data set to build a decision tree model. Conventional tree splitting, however, depletes the sample size rapidly after a few levels of splitting, so later splitting decisions are made on small samples and become unreliable. To overcome this sample-depletion problem, sample-efficient regression trees (SERT) were proposed to avoid unnecessary splits. But when a great number of interaction effects exist, the select-and-split construction of SERT is still not efficient at stopping the depletion of the sample size. In this research, we propose Enhanced Sample-Efficient Regression Trees (ESERT), which extend SERT with attribute combination selection and the MaxF selection criterion. We first show how to apply the MaxF selection criterion to attribute selection in regression trees and to stopping tree construction. With the MaxF selection criterion in place, methodologies for attribute combination selection are introduced. A complete select-and-split tree construction and model estimation procedure is described, and ESERT procedures are developed for both binary and continuous attributes. Using three different simulation scenarios, we demonstrate the contributions of the MaxF selection criterion, the sample-efficient method, and attribute combination selection to tree construction. Two real cases, semiconductor bad tool selection and differentially expressed gene selection, are also used to illustrate and validate the proposed ESERT.
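
As a rough illustration of the conventional variance-reduction measure mentioned in the abstract, the sketch below scores binary attributes by how much a split on each one lowers the response variance. It is not the thesis's MaxF criterion or the SERT/ESERT procedure; the function names `variance_reduction` and `best_attribute` and the toy data are assumptions made for this example only.

```python
from statistics import pvariance


def variance_reduction(y, x):
    """Variance reduction from splitting responses y on a binary attribute x (0/1)."""
    left = [yi for yi, xi in zip(y, x) if xi == 0]
    right = [yi for yi, xi in zip(y, x) if xi == 1]
    if not left or not right:
        return 0.0  # the attribute does not actually split the data
    n = len(y)
    # Size-weighted average of the within-child (population) variances.
    within = (len(left) * pvariance(left) + len(right) * pvariance(right)) / n
    return pvariance(y) - within


def best_attribute(y, attributes):
    """Return the name of the attribute with the largest variance reduction."""
    return max(attributes, key=lambda name: variance_reduction(y, attributes[name]))


# Hypothetical toy data: attribute "a" separates low and high responses, "b" does not.
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
attributes = {
    "a": [0, 0, 0, 1, 1, 1],
    "b": [0, 1, 0, 1, 0, 1],
}
print(best_attribute(y, attributes))
```

Each split divides the samples between two child nodes, so the variances at deeper levels are estimated from fewer observations; this is the sample-depletion problem the abstract describes.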

Parallel Keywords

Regression Trees


Cited By


Lu, Y. P. (2008). 整合統計分析與知識推論系統的貝氏架構設計 -以半導體良率分析為例 [master's thesis, Yuan Ze University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0009-2307200816125100
