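Model selection has long been one of the most active topics in statistics. When the data are independent, commonly used selection criteria include Mallows's Cp (Mallows, 1973), AIC (Akaike, 1974), and BIC (Schwarz, 1978). Longitudinal data are a common data type in many fields; because repeated measurements are correlated, Mallows's Cp, AIC, and BIC are not appropriate for analyzing them. Lv and Liu (2014) therefore proposed the generalized AIC (GAIC) and generalized BIC (GBIC), which extend the two criteria to settings with model misspecification. In addition, Shen and Chen (2012) proposed the missing longitudinal information criterion (MLIC) for model selection when longitudinal data are subject to monotone missingness. This study addresses the choice of model selection method for longitudinal data. Through simulation studies, it compares the strengths and weaknesses of three methods, GAIC (generalized Akaike information criterion), GBIC (generalized Bayesian information criterion), and MLIC, under different scenarios. The three methods are also applied to the analysis of real data.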
Model selection has long been an active topic in statistics. When the data are independent, Mallows's Cp (Mallows, 1973), AIC (Akaike, 1974), and BIC (Schwarz, 1978) are commonly used to select models. Longitudinal data are common in many fields; however, these criteria are not appropriate for such data because repeated observations on the same subject are correlated. Lv and Liu (2014) therefore proposed GAIC and GBIC as model selection principles for misspecified models. Moreover, Shen and Chen (2012) proposed the missing longitudinal information criterion (MLIC) for GEE analysis when the outcome data are subject to dropout. This study compares the advantages and disadvantages of the GAIC, GBIC, and MLIC model selection methods through simulation studies, and also applies these methods to the analysis of real data.