A Comparative Analysis of Sample Weighting Methods and Their Estimators for Sample Survey Data

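Sample surveys have long faced the question of whether the collected data are representative of the population, and the most common remedy for avoiding biased population estimates is weighting. This study considers survey data that fail to be representative for various reasons and compares the precision of population proportion estimates produced by commonly used weighting methods under different sample structures. In practice, factors such as the timing of the survey or the content of the questionnaire can leave some categories of a variable with very few respondents. In a survey of intentions to take part in cultural activities, for example, the share of younger respondents is severely low, so the sample structure differs markedly from the population structure and the data are not representative. To adjust for the difference between the sample and population structures, this study weights the sample by two methods, multivariable iterative weighting (raking) and post-stratification, and applies two estimation approaches, that of Oh and Scheuren (1987) and that commonly used by market research companies. Simulations under different sample structures combine each weighting method with each estimation approach to estimate the population proportion parameter p, and the estimators are compared on unbiasedness, variance, and MSE. The results show that (1) under raking, the market research estimator and the estimator from the literature give very similar simulation results, but the more variables whose sample distribution disagrees with the population, the better the estimator from the literature performs; and (2) the larger the difference between the sample and population structures, the larger the bias and variance, whether raking or post-stratification is used.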
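The two weighting schemes named above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: the function names `post_stratify` and `rake`, the two-variable layout, and all counts are hypothetical. Post-stratification assigns each stratum the ratio of its population count to its sample count, while raking rescales a cross-tabulated sample alternately to the row and column population margins until both are matched.

```python
def post_stratify(sample_counts, population_counts):
    """Weight per respondent in stratum h = N_h / n_h (hypothetical helper)."""
    return {h: population_counts[h] / n for h, n in sample_counts.items() if n > 0}


def rake(sample_counts, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rake a two-way table of sample counts to known population margins."""
    table = [[float(v) for v in row] for row in sample_counts]
    for _ in range(max_iter):
        # Scale each row so that its total equals the population row margin.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Scale each column so that its total equals the population column margin.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match exactly, so convergence is judged on the rows.
        gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    # Weight per respondent in cell (i, j) = raked cell total / sample count.
    weights = [
        [table[i][j] / n if n else 0.0 for j, n in enumerate(row)]
        for i, row in enumerate(sample_counts)
    ]
    return table, weights


# Made-up example: rows are two age groups, columns are two genders.
raked_table, cell_weights = rake([[10, 40], [30, 20]], [500, 500], [400, 600])
print(cell_weights)
```

Raking needs only the marginal population totals of each variable, whereas post-stratification needs the population count of every stratum it uses, which is why raking is often preferred when several variables are adjusted at once.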

Keywords

stratified sampling; raking; post-stratification; sample representativeness; calibration estimation

Parallel Abstract

Survey data are sometimes not representative of the population, for various reasons. For a non-representative sample, weighting methods are usually used to adjust the data. This article compares the precision of common weighting methods and their corresponding estimators under different sample distributions. In practice, iterative raking and post-stratification are commonly used to adjust for the difference between the sample distribution and the population distribution. The methods compared in the simulation study include those proposed by Oh and Scheuren (1987) and those used by market research companies in Taiwan. The simulation treats an empirical data set as the population and compares the precision of the different weighting methods and corresponding estimators under different sample distributions. Precision is assessed by bias, variance, and MSE. The simulation results show that (1) in general, there is no large difference between the methods from the literature and those of the market research companies; (2) if more than one variable has a sample distribution that is inconsistent with the population distribution, the raking estimator from the literature is more precise than that of the market research companies; and (3) for both raking and post-stratification, the bias and variance increase as the difference between the sample distribution and the population distribution increases.
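The three precision criteria can be computed from repeated simulated estimates as sketched below. This is an assumption-laden illustration rather than the study's code: `precision_summary` is a hypothetical helper, the replicate estimates are made up, and the specific estimators compared in the study are not reproduced here.

```python
def precision_summary(estimates, true_p):
    """Return (bias, variance, MSE) of simulated estimates of a known proportion."""
    r = len(estimates)
    mean = sum(estimates) / r
    bias = mean - true_p
    # Divisor r (not r - 1), so that MSE = variance + bias**2 holds exactly.
    variance = sum((e - mean) ** 2 for e in estimates) / r
    mse = sum((e - true_p) ** 2 for e in estimates) / r
    return bias, variance, mse


# Made-up replicate estimates of a population proportion p = 0.45.
bias, variance, mse = precision_summary([0.40, 0.50, 0.60], 0.45)
print(bias, variance, mse)
```

Because MSE decomposes into variance plus squared bias, a weighting method that removes bias can still lose on MSE if its weights inflate the variance.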

Parallel Keywords

stratified sampling; raking; post-stratification; representative sample; calibration

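References

洪永泰 (1996). 抽樣調查中樣本代表性的問題 [The problem of sample representativeness in sample surveys]. 調查研究 [Survey Research], No. 1, pp. 7-37.

黃紀、張佑宗 (2003). 樣本代表性檢定與最小差異加權：以2001年台灣選舉與民主化調查為例 [Testing sample representativeness and minimum-discrepancy weighting: The case of the 2001 Taiwan's Election and Democratization Study]. 選舉研究 [Journal of Electoral Studies], Vol. 10, No. 2, pp. 1-37.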

Binder, D. A., and Theberge, A. (1988). Estimating the Variance of Raking Ratio Estimators. Canadian Journal of Statistics, Vol. 16, pp. 47-55.

Battaglia, M. P., Izrael, D., Hoaglin, D. C., and Frankel, M. R. (2004). Tips and Tricks for Raking Survey Data (A.K.A. Sample Balancing). Paper presented at the annual meeting of the American Association for Public Opinion Research.

Deming, W. E., and Stephan, F. F. (1940). On a Least Squares Adjustment of a Sampled Frequency Table When the Expected Marginal Totals Are Known. Annals of Mathematical Statistics, Vol. 11, No. 4, pp. 427-444.
