Missing values are common in survey data. When the missingness mechanism is missing at random (MAR) or missing not at random (MNAR), deleting the incomplete cases and analyzing only the complete cases can bias statistical results. Using the wave 1 and wave 2 junior high school samples of the Taiwan Education Panel Survey (TEPS), this study examines how missing data affect factor analysis of the mental health items. Each missing data set is built by combining two samples: a sample drawn from the complete data (students observed in both waves), allocated in proportion to the urban/rural and public/private school structure, and the dropout data set (students who dropped out in wave 2). Varying the number of complete cases drawn, and then adding the dropout data set, yields data sets with different missing ratios. We consider five missing ratios (1/6, 1/5, 1/4, 1/3, and 1/2) and randomly construct 50 missing data sets for each, in order to compare the stability of the missing-data treatments. Eight treatments are studied, including list-wise deletion (LD) and imputation methods that use the wave 1 background variables and mental health variables: stepwise logistic regression imputation, the MCMC method, and the two-stage method, among others. The factor analysis results under each treatment are compared with those obtained from the wave 1 mental health variables, and we propose guidelines for choosing the most suitable missing-data treatment for exploratory factor analysis (EFA) at each missing ratio.
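The construction of the missing data sets can be illustrated with a minimal Python sketch. It is not the study's actual code. It assumes that the missing ratio is the share of dropouts in the combined data set, so that with D dropouts and a target ratio r the number of complete cases to draw is n = D(1 - r)/r. The record fields `urban` and `private` and all function names are hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict
from fractions import Fraction


def completers_needed(n_dropout, missing_ratio):
    """Complete cases needed so that dropouts form `missing_ratio` of the
    combined set (assumption: ratio = dropouts / (completers + dropouts))."""
    r = Fraction(missing_ratio)
    return round(n_dropout * (1 - r) / r)


def stratified_sample(records, strata_key, n, rng):
    """Draw about n records, allocating to each stratum in proportion to
    its share of `records` (rounding can shift the total by a few cases)."""
    groups = defaultdict(list)
    for rec in records:
        groups[strata_key(rec)].append(rec)
    total = len(records)
    sample = []
    for key in sorted(groups):  # sorted so results are reproducible
        members = groups[key]
        k = min(len(members), round(n * len(members) / total))
        sample.extend(rng.sample(members, k))
    return sample


def build_missing_sets(completers, dropouts, ratios, n_sets=50, seed=0):
    """For each missing ratio, build `n_sets` data sets, each a stratified
    sample of completers merged with the full dropout set."""
    rng = random.Random(seed)
    # Hypothetical strata: urban/rural crossed with public/private school.
    key = lambda rec: (rec["urban"], rec["private"])
    result = {}
    for r in ratios:
        n = completers_needed(len(dropouts), r)
        result[r] = [
            stratified_sample(completers, key, n, rng) + list(dropouts)
            for _ in range(n_sets)
        ]
    return result
```

Under this assumption, a missing ratio of 1/6 requires five complete cases per dropout and a ratio of 1/2 requires one. Each treatment (list-wise deletion or an imputation method) would then be applied to every generated data set before the factor analysis is run.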