

A Comparative Analysis among Regression Estimators under Complex Sampling Survey with Missing Data

Advisor: 許玉雪


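This thesis examines regression analysis methods for data with missing values under complex sampling, and seeks to propose a regression parameter estimator, combined with imputation, that is unbiased and has smaller variance. Through simulation, it compares regression analysis combined with imputation against the model-assisted regression analysis method of Skinner and Coker (1996) in terms of estimation performance on incomplete data with missing values under a complex sampling design. The complex sampling scheme studied is probability proportional to size (PPS) sampling, and the missing-data mechanism is assumed to be missing at random (MAR). The imputation-based regression methods fill in the missing values by hot-deck imputation and by regression imputation; the regression parameters are then estimated from the completed data by ordinary least squares and by probability-weighted least squares, which yields the proposed unbiased, smaller-variance estimator. This estimator is then compared by simulation with the maximum likelihood regression estimator of Skinner and Coker (1996), which is based on the model-adjustment method. The simulation results show that (1) when the variable with missing values has small variance, the estimator that combines regression imputation with probability-weighted least squares performs best in terms of unbiasedness; (2) when the variable with missing values has large variance, the maximum likelihood regression estimator based on the model-adjustment method performs better in terms of both variance and unbiasedness; (3) provided that the missing data are first imputed to completeness, the regression estimator combined with regression imputation is a feasible way to handle incomplete data with missing values under a complex sampling design.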


This thesis uses a Monte Carlo approach to compare estimators of regression coefficients under a complex survey design with missing data. Regression analysis of complex survey data has become increasingly important, and handling missing values effectively is an important issue in its own right. The thesis proposes an alternative estimation method for regression coefficients under probability proportional to size (PPS) sampling with missing data and compares it with the model-based pseudo maximum likelihood approach of Skinner and Coker (1996). The proposed method combines imputation with regression coefficient estimation. The missing-data mechanism is assumed to be missing at random (MAR); the imputation methods considered are hot-deck imputation and regression imputation; and the estimators applied to the imputed data are the ordinary least squares estimator and the probability-weighted least squares estimator. The simulation results show that (1) the proposed method that combines regression imputation with the probability-weighted least squares estimator gives an unbiased estimator of the regression coefficients when the variable with missing values has small variance; (2) the pseudo maximum likelihood estimator is unbiased and has a smaller asymptotic variance than the other methods when the variable with missing values has large variance; (3) the proposed method that combines regression imputation with the probability-weighted least squares estimator is an appropriate method for regression estimation under a complex survey design with missing data.
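The combination described above (regression imputation followed by probability-weighted least squares under PPS sampling) can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's simulation design: the population model, the size measure, the choice of the response y as the variable with missing values, the logistic MAR mechanism, with-replacement PPS draws, and all function names are assumptions made for the example.

```python
import numpy as np

def pps_sample(size_measure, n, rng):
    """Draw n units with replacement, probability proportional to size.
    Returns sampled indices and design weights 1 / (n * p_i)."""
    p = size_measure / size_measure.sum()
    idx = rng.choice(size_measure.size, size=n, replace=True, p=p)
    return idx, 1.0 / (n * p[idx])

def regression_impute(X, y, observed):
    """Fill missing y with predictions from an unweighted least squares
    fit of y on X, using the observed cases only."""
    coef, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    y_filled = y.copy()
    y_filled[~observed] = X[~observed] @ coef
    return y_filled

def weighted_least_squares(X, y, w):
    """Solve (X'WX) b = X'Wy with W = diag(w)."""
    XtW = X.T * w  # scales column i of X.T (unit i) by w[i]
    return np.linalg.solve(XtW @ X, XtW @ y)

def simulate_once(seed=0, N=5000, n=500):
    rng = np.random.default_rng(seed)
    # Hypothetical finite population: y = 1 + 2x + noise.
    x_pop = rng.normal(size=N)
    size_pop = rng.uniform(1.0, 10.0, size=N)  # assumed size measure
    y_pop = 1.0 + 2.0 * x_pop + rng.normal(size=N)

    idx, w = pps_sample(size_pop, n, rng)
    x, y = x_pop[idx], y_pop[idx].copy()

    # Assumed MAR mechanism: the chance that y is missing depends
    # only on the fully observed x.
    p_miss = 1.0 / (1.0 + np.exp(-(x - 1.0)))
    observed = rng.random(n) >= p_miss
    y[~observed] = np.nan

    X = np.column_stack([np.ones(n), x])
    y_filled = regression_impute(X, y, observed)
    return weighted_least_squares(X, y_filled, w)

if __name__ == "__main__":
    print(simulate_once(seed=0))  # estimated (intercept, slope)
```

Repeating `simulate_once` over many seeds and averaging the estimates would give a rough Monte Carlo estimate of bias and variance, which is the kind of comparison the abstract reports. A hot-deck variant would replace `regression_impute` with a draw from observed donor values.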


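李秋慧 (2010). An Empirical Comparison of Multiple Regression Parameter Estimators under Complex Sampling Designs. Master's thesis, Graduate Institute of Statistics, National Taipei University.
陳建銘 (2008). A Comparative Analysis of Regression Parameter Estimation under Stratified Unequal Probability Sampling. Master's thesis, Graduate Institute of Statistics, National Taipei University.
Fellegi, I. P. and Holt, D. (1976). A Systematic Approach to Automatic Edit and Imputation. Journal of the American Statistical Association, 71, 17-35.
Hansen, M. H. and Hurwitz, W. N. (1943). On the Theory of Sampling from Finite Populations. Annals of Mathematical Statistics, 14, 333-362.
Kalton, G. and Kasprzyk, D. (1986). The Treatment of Missing Survey Data. Survey Methodology, 12, 1-16.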


