Regression analysis plays an important role in decision-making across many fields of research, and missing covariate data are a frequently encountered problem. The simplest approach, complete-case analysis, discards every record with a missing covariate and fits the regression to the remaining complete data, at the cost of lost information and reduced power. Over the last few decades, methods that retain the incomplete records have been developed, including imputation, weighting, likelihood-based, and nonparametric methods. Among these, multiple imputation is the most widely applied. However, multiple imputation is not always valid: under some missing-data situations it can yield biased estimates. In this note, we demonstrate by simulation that multiple imputation generally has small bias, but that under some situations the bias can be large. We compare complete-case analysis, the inverse selection probability estimator, and multiple imputation through extensive simulation studies, examining how the missing-data mechanism and the missing rate affect the performance of each method. The results provide cautionary notes on the use of multiple imputation in applications.
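As a rough illustration of the kind of simulation comparison described above, the following minimal sketch contrasts complete-case analysis with an inverse selection probability weighted fit in a simple linear regression. Everything in it is an assumption made for illustration and is not taken from the paper's simulation design: the model y = 1 + x + e, the logistic selection probability that depends only on the always-observed outcome, the sample size, and the helper names `fit_weighted_ls` and `compare_estimators`. The selection probabilities are treated as known, and multiple imputation is omitted for brevity.

```python
import numpy as np


def fit_weighted_ls(x, y, w):
    """Weighted least squares of y on (1, x); returns [intercept, slope]."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w  # multiply each observation by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)


def compare_estimators(n=50_000, seed=0):
    """Hypothetical setup: the covariate x is missing with a probability
    that depends only on the fully observed outcome y."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 1.0 * x + rng.normal(size=n)  # assumed true intercept 1, slope 1

    # Assumed selection model: P(x observed | y) = expit(y), treated as known.
    pi = 1.0 / (1.0 + np.exp(-y))
    observed = rng.random(n) < pi
    xo, yo = x[observed], y[observed]

    # Complete-case analysis: unweighted fit on records with x observed.
    cc = fit_weighted_ls(xo, yo, np.ones(xo.size))
    # Inverse selection probability weighting: weight each complete record by 1/pi.
    ipw = fit_weighted_ls(xo, yo, 1.0 / pi[observed])
    return {"cc": cc, "ipw": ipw}


if __name__ == "__main__":
    for name, coef in compare_estimators().items():
        print(name, "intercept=%.3f slope=%.3f" % (coef[0], coef[1]))
```

Because missingness here depends on the outcome, the complete-case slope should be attenuated, while the weighted fit should land close to the assumed true slope of 1. Under other mechanisms, for example missingness that depends only on the covariates, the ranking of methods may differ.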