
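Factor Analysis of Missing Data Handled by Multiple Imputation: The Case of the Senior High School, Vocational High School, and Five-Year Junior College Student Samples from the Second Wave of the Taiwan Education Panel Survey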

Application of the multiple imputation of Incomplete Data for Factor Analysis: the Example of Student Data of the Second Wave Survey in TEPS

Advisor: 王鴻龍

Abstract


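Factor analysis is often used to explore the data structure of test items or questionnaire items. When the raw data contain too many missing values and the missingness is missing at random (MAR), analyzing only the complete cases tends to bias the results and distort the correlations among variables. This study addresses missing data produced by the MAR mechanism and looks for suitable imputed values to replace them. Using the second-wave student sample of the Taiwan Education Panel Survey (TEPS), we took the original missing pattern as the basis and constructed data files with one to five times the original missing rate. We then filled in the missing values with traditional (single) imputation methods and with multiple imputation, and compared the factor analysis results with those from the baseline dataset, examining how the leading eigenvalues and factor loadings differ across imputation methods. The results show that, for datasets filled in by multiple imputation, the differences from the baseline in eigenvalues and factor loadings are not affected by the missing rate, and performance is reasonably good throughout. In particular, multiple imputation with imputed values rounded to the nearest integer and multiple imputation with imputed values rounded up yield the smallest differences from the baseline, which indicates that multiple imputation does outperform the traditional imputation methods.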

Parallel Abstract


Factor analysis is commonly used to examine the underlying structure of survey data. When the raw data contain too many missing values and the missingness mechanism is missing at random (MAR), analyzing only the complete part of the data without dealing with the missing values easily leads to biased inference and distorts the correlations between variables. This study looks for a way to handle missing values that arise under the MAR mechanism and suggests suitable methods for imputing them. The sample data come from the student data of the second wave of the Taiwan Education Panel Survey (TEPS). Based on the original missing pattern, we created five datasets whose missing rates are one to five times the original missing rate. We then applied several traditional imputation methods and several multiple imputation methods to fill in the missing data, and examined how eigenvalues and factor loadings differ among the imputation methods by comparing them with the results obtained from the baseline dataset. According to the findings, the multiple imputation methods are not noticeably affected by the missing rate when compared with the baseline on both eigenvalues and factor loadings. They perform especially well when the imputed values are either rounded to the nearest integer or rounded up. We therefore conclude that, for our database, multiple imputation is better than the traditional imputation methods.
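The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis procedure: it assumes simulated 1 to 4 scale items instead of TEPS data, removes cells completely at random instead of following the original missing pattern, uses mean imputation as the single-imputation stand-in, and approximates multiple imputation by repeating a rounded stochastic regression imputation and averaging the eigenvalues. All function names, the 5% base missing rate, and the number of imputations are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_items(n=500, p=6):
    """Hypothetical one-factor questionnaire data on a 1-4 scale (not TEPS)."""
    factor = rng.normal(size=(n, 1))
    latent = 0.8 * factor + 0.6 * rng.normal(size=(n, p))
    return np.clip(np.round(2.5 + latent), 1, 4)

def add_missing(data, rate):
    """Blank out cells completely at random (a simplification of the MAR setup)."""
    out = data.copy()
    out[rng.random(data.shape) < rate] = np.nan
    return out

def mean_impute(data):
    """Single imputation: replace each missing cell with its column mean."""
    return np.where(np.isnan(data), np.nanmean(data, axis=0), data)

def stochastic_impute(data):
    """One imputation draw: regression prediction plus noise, rounded to the scale."""
    filled = mean_impute(data)
    out = filled.copy()
    for j in range(data.shape[1]):
        miss = np.isnan(data[:, j])
        if not miss.any():
            continue
        # Predict item j from the other (mean-filled) items.
        X = np.column_stack([np.ones(len(filled)), np.delete(filled, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X[~miss], data[~miss, j], rcond=None)
        resid = data[~miss, j] - X[~miss] @ beta
        draw = X[miss] @ beta + rng.normal(scale=resid.std(), size=miss.sum())
        out[miss, j] = np.clip(np.round(draw), 1, 4)
    return out

def eigenvalues(data):
    """Eigenvalues of the item correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

def multiple_impute_eigenvalues(data, m=5):
    """Average the eigenvalues over m imputed datasets (an assumed pooling rule)."""
    return np.mean([eigenvalues(stochastic_impute(data)) for _ in range(m)], axis=0)

baseline = simulate_items()
base_eigs = eigenvalues(baseline)
gaps = []
for multiplier in range(1, 6):  # one to five times an assumed 5% base missing rate
    incomplete = add_missing(baseline, 0.05 * multiplier)
    gap_single = np.abs(eigenvalues(mean_impute(incomplete)) - base_eigs).sum()
    gap_multi = np.abs(multiple_impute_eigenvalues(incomplete) - base_eigs).sum()
    gaps.append((multiplier, gap_single, gap_multi))
    print(f"{multiplier}x missing: single={gap_single:.3f}, multiple={gap_multi:.3f}")
```

Each printed row gives the total absolute eigenvalue gap from the baseline for the two approaches. Note that `np.round` rounds halves to the nearest even integer; swapping it for `np.ceil` would mimic the round-up variant mentioned in the abstract.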


Cited By


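程于庭 (2013). Application and comparison of missing-data imputation methods in optimal asset allocation portfolios: the case of the Taiwan securities market [Master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2013.00602
張嘉晏 (2011). The effect of missing-data handling on factor analysis: the case of cross-wave mental health items from the junior high school sample of the Taiwan Education Panel Survey [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-2707201111022200
林欣潔 (2012). The effect of missing-data handling methods on nested regression analysis [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-3007201214432700
洪靜茹 (2013). The effect of missing-data handling on multivariate analysis of variance (MANOVA) [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-0602201318131300
吳雅慧 (2015). A study of missing-data handling methods applied to "don't know" responses: the case of self-concept among students with disabilities [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-1005201615092036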
