A Study of Selecting Explanatory Variables in Linear Models Using Lasso-Cp

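When a linear regression model has a very large number of predictors, regularization is a commonly used way to reduce the complexity of the selected model. The Lasso (Tibshirani, 1996) is regarded as a regularization method that yields a parsimonious set of model parameters. This thesis studies the operating characteristics of selecting important predictors with the Lasso combined with Cp when the predictors are orthonormal and the number of predictors is close to the sample size. The operating characteristics considered include the number of selected predictors and the proportion of selected predictors that are true predictors. These conclusions also apply when the Lasso with Cp is used as a multiple hypothesis testing procedure.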

Keywords

Least angle regression

Parallel abstract

When the number of predictors in a linear regression model is large, regularization is commonly used to reduce the complexity of the fitted model. The LASSO (Tibshirani, 1996) has been advocated as a useful regularization method for achieving sparsity, or parsimony, in the fitted model. In this thesis, we study the operating characteristics of the LASSO coupled with Mallows' Cp in identifying orthonormal predictor variables in linear regression when the number of predictors and the number of observations are of the same magnitude. The characteristics include the number of predictors chosen and the proportion of chosen predictors that are correctly identified. These results can also be useful in multiple testing.
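The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's own procedure. It relies on two standard facts: with orthonormal predictors, the Lasso solution for the objective (1/2)||y - Xb||^2 + lam * ||b||_1 is soft thresholding of X'y, and the number of nonzero coefficients can serve as the degrees of freedom in Mallows' Cp. The function names, the sample sizes, the signal strength, and the assumption that the noise variance is known are all illustrative choices.

```python
import numpy as np

def soft_threshold(z, lam):
    """Lasso solution for orthonormal predictors: shrink each z_j toward 0 by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cp_orthonormal(X, y, sigma2):
    """Pick the Lasso penalty minimizing Mallows' Cp, assuming X.T @ X = I.

    sigma2 is treated as known here (an assumption of this sketch).
    Returns (cp, lam, beta) at the minimizing penalty.
    """
    n = len(y)
    z = X.T @ y  # least-squares coefficients under orthonormality
    # Between consecutive knots |z_j| the active set is fixed and the RSS
    # grows with lam, so it suffices to evaluate Cp at lam = 0 and the knots.
    lams = np.concatenate(([0.0], np.sort(np.abs(z))))
    best = None
    for lam in lams:
        beta = soft_threshold(z, lam)
        rss = np.sum((y - X @ beta) ** 2)
        df = np.count_nonzero(beta)  # degrees of freedom of the Lasso fit
        cp = rss / sigma2 - n + 2 * df
        if best is None or cp < best[0]:
            best = (cp, lam, beta)
    return best

def simulate(seed, n=200, p=100, k=10, signal=5.0, sigma=1.0):
    """Hypothetical data: orthonormal X with the first k predictors truly active."""
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # orthonormal columns
    beta = np.zeros(p)
    beta[:k] = signal
    y = X @ beta + sigma * rng.standard_normal(n)
    return X, y, beta != 0

def summarize(beta_hat, truth):
    """Number of selected predictors and share of them that are true predictors."""
    selected = beta_hat != 0
    n_selected = int(selected.sum())
    prop_true = (selected & truth).sum() / n_selected if n_selected else float("nan")
    return n_selected, prop_true

X, y, truth = simulate(seed=0)
cp, lam, beta_hat = lasso_cp_orthonormal(X, y, sigma2=1.0)
n_selected, prop_true = summarize(beta_hat, truth)
print(f"lambda={lam:.3f}  selected={n_selected}  proportion true={prop_true:.3f}")
```

Repeating the run over many seeds gives an empirical view of the two quantities the abstract names: the number of predictors chosen and the proportion of chosen predictors that are true ones.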

Parallel keywords

Least angle regression; Forward selection

References

[1] Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control 19, 716-723.

[2] Balkema, A.A. and de Haan, L. (1972). On R. von Mises' condition for the domain of attraction of exp(−e^{−x}). Annals of Mathematical Statistics 43, 1352-1354.

[5] de Haan, L. (1970). On regular variation and its application to the weak convergence of sample extremes. Thesis, University of Amsterdam, Mathematical Centre tract, 32, 296, 299, 301.

[6] Donoho, D. and Johnstone, I. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81, 425-455.

[8] Gnedenko, B. (1943). Sur la distribution limite du terme maximum d'une série aléatoire. Annals of Mathematics 44, 423-453.
