共線性資料下不同準則對PLS中決定最佳因子數的模擬研究

The Simulation Study on Determining the Optimal Number of Factors of PLS Under Multi-Collinearity Data

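Abstract


Partial least squares (PLS) regression is useful for handling collinearity in data. The PLS method converts the information in X and y into new factors: the first factor carries the most information from the original data, the second carries the next most, and so on through all the factors. Among the new factors, those carrying only weak information are dropped, and the useful ones are retained in the model. How many of the leading factors to retain is the question this study addresses. Five criteria for selecting the optimal number of factors are considered: cross-validation, estimated mean squared error, external validation, calibration residuals, and variable transformation. A simulation study compares the five criteria, with data generated at controlled degrees of collinearity. When collinearity is very severe, cross-validation can be used to choose the number of factors that retain useful information. In summary, external validation and cross-validation both estimate the optimal number of factors well, and cross-validation gives the most accurate estimate across all conditions.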

Parallel Abstract


Partial least squares (PLS) regression helps address multicollinearity in data. The PLS method transforms the information in X and y into new factors. The first factor carries the most information from the original data, the second factor the next most, and so on. Factors that carry little information are deleted, and the useful factors are kept. How many factors to keep is the question this research addresses. Five criteria for choosing the number of fitted factors are considered: MSECV, MSEP, MSEE, MSERSS, and H-ERROR. The criteria were compared in a simulation study in which the data were generated with controlled degrees of collinearity. When collinearity was very severe, the MSECV criterion could be used to retain the useful factors. In conclusion, both the MSEP and MSECV criteria give good results in choosing the number of fitted factors, and the MSECV criterion gives the most accurate estimate under all conditions.
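
To make the cross-validation idea concrete, the following is a minimal sketch of choosing the number of PLS factors by minimizing a cross-validated mean squared error on simulated collinear data. It is not the design used in the study. The data generator (`simulate`, six predictors driven by two latent variables), the 5-fold split, the single-response NIPALS variant of PLS, and the reading of MSECV as "mean squared error of cross-validation" are all assumptions made for illustration, and every function name is hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simulate(n, noise, seed):
    """Assumed design: six predictors built from two latent variables.
    A small `noise` makes the columns nearly collinear."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 2), rng.gauss(0, 1)
        X.append([(z1 if j % 2 == 0 else z2) + noise * rng.gauss(0, 1)
                  for j in range(6)])
        y.append(z1 + z2 + 0.1 * rng.gauss(0, 1))
    return X, y


def fit_pls1(X, y, max_factors):
    """PLS for a single response via NIPALS-style deflation.
    Returns the training means and one (w, p, q) triple per factor."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    factors = []
    for _ in range(max_factors):
        w = [dot([row[j] for row in E], f) for j in range(m)]  # X'y direction
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in E], t) / tt for j in range(m)]  # X loadings
        q = dot(f, t) / tt                                     # y loading
        E = [[row[j] - t[i] * p[j] for j in range(m)] for i, row in enumerate(E)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def predict_pls1(model, x, n_factors):
    """Predict one observation using only the first n_factors factors."""
    x_mean, y_mean, factors = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors[:n_factors]:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]  # same deflation as in training
    return y_hat


def msecv(X, y, max_factors, n_folds=5):
    """Cross-validated mean squared error for 1..max_factors factors."""
    n = len(X)
    sq_err = [0.0] * max_factors
    for k in range(n_folds):
        test_idx = set(range(k, n, n_folds))  # interleaved folds
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        model = fit_pls1(X_tr, y_tr, max_factors)  # factors are nested: fit once
        for i in test_idx:
            for a in range(1, max_factors + 1):
                sq_err[a - 1] += (y[i] - predict_pls1(model, X[i], a)) ** 2
    return [s / n for s in sq_err]


X, y = simulate(n=60, noise=0.05, seed=1)
scores = msecv(X, y, max_factors=6)
best = scores.index(min(scores)) + 1  # number of factors with the lowest MSECV
print("MSECV by number of factors:", [round(s, 4) for s in scores])
print("selected number of factors:", best)
```

Because PLS factors are extracted sequentially, each fold is fitted once with the maximum number of factors, and predictions for smaller models reuse the leading factors. Lowering `noise` in `simulate` makes the predictors more collinear, which is one simple way to vary the degree of collinearity in a sketch like this.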
