Exploring Factors Affecting Parameter Estimates in Hierarchical Linear Models with Binary Responses

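Abstract


Data in many empirical studies have a hierarchical, nested structure. Traditionally, such data have been analyzed with regression methods that use aggregation or disaggregation to treat the hierarchical data as uncorrelated individual observations. The independence assumption of regression, however, ignores the dependence among observations within the same level, which makes the results less precise. Hierarchical Linear Models (HLM), which have emerged in recent years, help in handling hierarchical data. HLM builds a regression model for each level and accounts for within-group and between-group sources of variation separately, which can improve the precision of parameter estimates.

This study explores the factors that affect parameter estimation for binary response variables in HLM. The model studied is the intercepts and slopes as outcomes model, and the factors considered are the within-group sample size, the number of groups, the variance of the slope random effect, the variance of the intercept random effect, and the correlation between slope and intercept. Previous literature has mostly used continuous dependent variables; this study considers a binary response. We obtained parameter estimates from simulated data, computed the estimation bias and mean squared error (MSE), and then used regression analysis to identify which factors matter most for the precision of the parameter estimates.

The results show that the main factors affecting the precision of the parameter estimates are the variances and the correlation coefficient. The larger the variance and the smaller the correlation, the lower the precision of the estimates, and the effect is most pronounced for the variance of the slope random effect. The within-group sample size and the number of groups made no noticeable difference to precision. A possible reason is that the sample sizes in this study's design did not differ much, so the sample size effect was not significant.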
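The abstract names the model and the accuracy measures without writing them out. One common way to write a two-level intercepts and slopes as outcomes model for a binary response is sketched below. The logit link, the single level-1 predictor $x_{ij}$, the single level-2 predictor $w_j$, and all symbol names are assumptions made for illustration; the study's exact specification is not given here.

```latex
% Level 1 (observation i in group j), assuming a logit link
\operatorname{logit}\bigl(\Pr(y_{ij}=1)\bigr) = \beta_{0j} + \beta_{1j}\,x_{ij}

% Level 2: intercepts and slopes as outcomes
\beta_{0j} = \gamma_{00} + \gamma_{01}\,w_j + u_{0j}, \qquad
\beta_{1j} = \gamma_{10} + \gamma_{11}\,w_j + u_{1j}

% Random effects: intercept variance, slope variance, and their correlation
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
\right), \qquad
\rho = \frac{\tau_{01}}{\sqrt{\tau_{00}\,\tau_{11}}}

% Accuracy measures over R simulated replications of an estimate \hat{\theta}_r
\operatorname{Bias}(\hat{\theta}) = \frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_r - \theta, \qquad
\operatorname{MSE}(\hat{\theta}) = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr)^2
```

In this notation, the five design factors in the abstract correspond to the group size $n_j$, the number of groups $J$, $\tau_{11}$, $\tau_{00}$, and $\rho$.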

Parallel Abstract


Nested data structures are common in empirical research. Such data are traditionally handled with regression techniques that aggregate or disaggregate the data, so that the observations are treated as if they were uncorrelated. However, because the data are in fact correlated within each level of the hierarchy, these techniques ignore that dependence, and biased estimates result. Hierarchical Linear Models (HLM) are designed for nested data. HLM builds a separate regression model for each level, accounts for variation both within and between groups, and thereby improves the accuracy of parameter estimates. The objective of this study is to explore the accuracy of parameter estimation under different conditions. The model studied is the "intercepts and slopes as outcomes" model, and we investigated the effects on parameter estimation of sample size (number of clusters and cluster size), the variances of the slope and intercept random effects, and the correlation between slopes and intercepts. We considered binary responses, in contrast to previous research, which has focused mainly on continuous responses. We simulated data for combinations of the five factors above and calculated the bias and mean squared error (MSE) of the estimates. We then conducted a multiple regression analysis to study the impact of the factors. The results showed that the most influential factors for the accuracy of estimation are the variances of the intercept and slope random effects and the correlation between slopes and intercepts. The larger the variances of the intercepts and slopes, or the smaller the correlation, the less accurate the estimation. Among the variances, that of the random slope effect has the strongest influence. The sample sizes did not have a substantial impact: the accuracy obtained with the smaller samples did not differ much from that obtained with the larger samples. Our explanation is that the sample sizes in our design were not different enough, so the difference in estimation accuracy was not significant.
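The simulation design described in the abstract can be illustrated with a minimal Python sketch, assuming a logit link, one standard-normal level-1 predictor, and one standard-normal level-2 predictor. The function names `simulate_two_level_binary` and `bias_and_mse`, the default coefficient values, and the predictor distributions are hypothetical and are not taken from the study.

```python
import numpy as np


def simulate_two_level_binary(n_groups, group_size, tau00, tau11, rho,
                              gamma=(0.0, 0.5, 1.0, 0.5), rng=None):
    """Draw one data set from a two-level logistic model (illustrative only).

    tau00 / tau11 are the intercept / slope random-effect variances and rho
    is their correlation. gamma = (g00, g01, g10, g11) are assumed fixed
    effects, not values from the study.
    """
    rng = np.random.default_rng() if rng is None else rng
    g00, g01, g10, g11 = gamma

    # Covariance matrix of the random intercept and slope.
    tau01 = rho * np.sqrt(tau00 * tau11)
    cov = np.array([[tau00, tau01], [tau01, tau11]])
    u = rng.multivariate_normal(np.zeros(2), cov, size=n_groups)

    w = rng.normal(size=(n_groups, 1))           # level-2 predictor
    x = rng.normal(size=(n_groups, group_size))  # level-1 predictor

    # Intercepts and slopes as outcomes of the level-2 predictor.
    beta0 = g00 + g01 * w + u[:, [0]]
    beta1 = g10 + g11 * w + u[:, [1]]

    # Inverse logit gives the success probability; then draw 0/1 responses.
    p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
    y = rng.binomial(1, p)

    group = np.repeat(np.arange(n_groups), group_size)
    return group, x.ravel(), np.repeat(w.ravel(), group_size), y.ravel()


def bias_and_mse(estimates, true_value):
    """Bias and mean squared error of replicated estimates of one parameter."""
    errors = np.asarray(estimates, dtype=float) - true_value
    return errors.mean(), np.mean(errors ** 2)
```

In a full study, each simulated data set would be passed to a multilevel logistic estimator of the reader's choice, and the resulting estimates across replications would be summarized with `bias_and_mse` for every combination of the five design factors. The estimation step is left out because the abstract does not say which software or method was used.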

