Factor analysis is often used to explore the underlying structure of test or questionnaire items. When the raw data contain many missing values and the missingness mechanism is missing at random (MAR), analyzing only the complete cases tends to bias the results and distort the correlations among variables. This study addresses missing data arising under the MAR mechanism and seeks suitable imputed values to replace them. Using the student sample from the second wave of the Taiwan Education Panel Survey (TEPS), we constructed five datasets with one to five times the original missing rate, based on the original missing pattern. We then filled in the missing values with traditional (single) imputation methods and with multiple imputation, ran factor analysis on each imputed dataset, and compared the results with those from the baseline dataset, examining how the principal eigenvalues and factor loadings differ across imputation methods. The results show that, for datasets filled in by multiple imputation, the differences from the baseline in eigenvalues and factor loadings are not noticeably affected by the missing rate, and performance is reasonably good throughout. In particular, multiple imputation with the imputed values rounded off and multiple imputation with the imputed values rounded up yield the smallest differences from the baseline. These findings indicate that, for this database, multiple imputation outperforms the traditional imputation methods.
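The comparison described above can be illustrated with a minimal Python sketch. It is not the study's procedure: the data are simulated rather than taken from TEPS, the 1-4 response scale, the base missing rate of 5%, and all function names are hypothetical, and the "multiple imputation" step is simplified to repeated stochastic regression draws that are rounded off and pooled by averaging the eigenvalues. It only shows the shape of the workflow: amplify a MAR missing pattern, impute by a single method and by repeated draws, and compare the leading eigenvalue of the correlation matrix with the baseline.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_likert(n=2000, p=6, loading=0.7):
    """Hypothetical one-factor data, discretized to a 1-4 response scale."""
    factor = rng.normal(size=(n, 1))
    noise = rng.normal(size=(n, p))
    latent = loading * factor + np.sqrt(1 - loading**2) * noise
    return np.digitize(latent, [-1.0, 0.0, 1.0]) + 1.0

def make_mar(data, rate):
    """Blank out items 1..p-1 with a probability that rises with item 0 (MAR)."""
    out = data.copy()
    weight = data[:, 0] / data[:, 0].mean()   # fully observed driver, mean 1
    prob = np.clip(rate * weight, 0.0, 0.95)
    mask = rng.random(data[:, 1:].shape) < prob[:, None]
    out[:, 1:][mask] = np.nan
    return out

def mean_impute(data):
    """Single imputation: replace each missing value by its column mean."""
    filled = data.copy()
    col_means = np.nanmean(data, axis=0)
    rows, cols = np.where(np.isnan(data))
    filled[rows, cols] = col_means[cols]
    return filled

def stochastic_impute(data, low=1, high=4):
    """One imputed copy: regression prediction plus residual noise, rounded off."""
    start = mean_impute(data)                 # mean-filled predictors (simplification)
    filled = data.copy()
    n, p = data.shape
    for j in range(p):
        miss = np.isnan(data[:, j])
        if not miss.any():
            continue
        X = np.column_stack([np.ones(n), np.delete(start, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X[~miss], data[~miss, j], rcond=None)
        resid = data[~miss, j] - X[~miss] @ beta
        draws = X[miss] @ beta + rng.normal(scale=resid.std(), size=miss.sum())
        filled[miss, j] = np.clip(np.round(draws), low, high)
    return filled

def eigenvalues(data):
    """Eigenvalues of the item correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

baseline = simulate_likert()
base_eig = eigenvalues(baseline)
for k in (1, 3, 5):                           # multiples of an assumed 5% base rate
    holed = make_mar(baseline, rate=0.05 * k)
    single = eigenvalues(mean_impute(holed))
    multi = np.mean([eigenvalues(stochastic_impute(holed)) for _ in range(5)], axis=0)
    print(k, abs(single[0] - base_eig[0]), abs(multi[0] - base_eig[0]))
```

Each printed row gives the missing-rate multiple followed by the absolute difference in the first eigenvalue from the baseline for mean imputation and for the pooled stochastic draws. A full multiple imputation procedure would also draw the regression parameters from their posterior distribution, which this sketch omits.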