
A Study of Variable Selection in Regression

Variable selection methods

Advisor: 洪慧念

Abstract


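Variable selection is an important problem in statistics. In big-data settings the number of variables p often exceeds the sample size n, and this thesis analyzes the problems that arise when p > n. The discussion of variable selection starts from the linear model. Two methods that can handle this setting are used. The first is the Bayesian Lasso [Park, T. and Casella, G. 2008], an extension of the Lasso, which is widely known in variable selection; it combines the Lasso with a Bayesian framework. The second is stochastic search variable selection (SSVS) [George, E. I. and McCulloch, R. E. 1993], which adopts a Bayesian framework, places a prior distribution on the parameters to be estimated, and then samples those parameters with MCMC. Both methods can be used when p exceeds n, and both give good results. This thesis, however, proposes an entirely different method, called the "grouping method", whose idea is to perform grouping first and then select variables. The grouping method is compared with the two methods above, and the strengths and weaknesses of the three are discussed.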
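To make the p > n difficulty concrete, the following minimal sketch (not taken from the thesis) uses a hypothetical toy design with n = 2 observations and p = 3 predictors. The matrix `X`, the response `y`, and the three coefficient vectors are illustrative assumptions. The sketch shows that several different coefficient vectors fit the data exactly, so ordinary least squares alone cannot say which variables matter.

```python
# Toy illustration (hypothetical data, not from the thesis):
# n = 2 observations, p = 3 predictors, so p > n.
X = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
]
y = [1.0, 1.0]


def predict(X, beta):
    """Return the fitted values X @ beta for a list-of-rows design matrix."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]


def residual_sum_of_squares(X, y, beta):
    """Sum of squared residuals of the linear fit with coefficients beta."""
    return sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, predict(X, beta)))


def support(beta, tol=1e-12):
    """Indices of the coefficients treated as nonzero (the selected variables)."""
    return [j for j, b in enumerate(beta) if abs(b) > tol]


# Three different coefficient vectors that all reproduce y exactly.
beta_dense = [1.0, 1.0, 0.0]   # uses predictors 0 and 1
beta_sparse = [0.0, 0.0, 1.0]  # uses predictor 2 only
beta_mix = [0.5, 0.5, 0.5]     # uses all three predictors

for name, beta in [("dense", beta_dense), ("sparse", beta_sparse), ("mix", beta_mix)]:
    rss = residual_sum_of_squares(X, y, beta)
    print(name, "RSS =", rss, "selected variables =", support(beta))
```

All three candidates have zero residual sum of squares but select different variables. Selection methods for p > n data therefore need something beyond the fit itself, such as a penalty or a prior, to prefer one of these solutions over the others.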

Parallel Abstract


Variable selection is an important problem in statistics. In this thesis we compare several existing methods, including the Bayesian Lasso [Park, T. and Casella, G. 2008] and stochastic search variable selection (SSVS) [George, E. I. and McCulloch, R. E. 1993]. We also propose a new method, called the "grouping method". The comparison is carried out on "large p, small n" data sets.

References


[2] Bien, J., Taylor, J. and Tibshirani, R. (2013). A LASSO for hierarchical interactions. Ann. Statist. 41 1111–1141.
[3] Jiang, B. and Liu, J. S. (2014). Variable selection for general index models via sliced inverse regression.
[5] Clyde, M. A. and Parmigiani, G. (1994). Bayesian variable selection and prediction with mixtures. J. Biopharm. Statist.
[6] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407–499.
[7] Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n.
