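Factor Analysis of Missing Data Handled by Multiple Imputation: The Second-Wave Senior High School, Vocational High School, and Five-Year Junior College Student Sample of the Taiwan Education Panel Survey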

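Factor analysis is often used to examine the data structure of test or questionnaire items. When the raw data contain too many missing values and the missingness is missing at random (MAR), analyzing only the complete cases tends to bias the results and distort the correlations among variables. This study addresses missing data arising from the MAR mechanism by looking for suitable imputed values to replace them. Using the second-wave student sample of the Taiwan Education Panel Survey, we took the original missing pattern as the basis for constructing data files with one to five times the original missing rate, filled in the missing values with traditional (single) imputation methods and with multiple imputation, and then compared the factor analysis results against a baseline dataset, examining how the leading eigenvalues and factor loadings differ across imputation methods. The results show that, for datasets filled in by multiple imputation, the differences from the baseline in eigenvalues and factor loadings are not affected by the missing rate, and performance is reasonably good throughout. In particular, multiple imputation with imputed values rounded to the nearest integer and multiple imputation with imputed values rounded up yield the smallest differences from the baseline, indicating that multiple imputation does outperform the traditional imputation methods.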

Keywords

Factor analysis; missing at random; multiple imputation; traditional imputation; missing pattern

Parallel Abstract

Factor analysis is commonly used to examine the underlying structure of survey data. When the raw data contain too many missing values and the missingness mechanism is missing at random, analyzing only the complete part of the data without dealing with the missing values easily leads to biased inference and distorts the correlations between variables. This study looks for a way to handle such missing values and suggests suitable methods for imputing them. The sample data come from the student data of the second-wave survey of the Taiwan Education Panel Survey (TEPS). Based on the original missing pattern, we created five datasets with one to five times the original missing rate. We then applied several traditional imputation methods and several multiple imputation methods to fill in the missing data, and compared the eigenvalues and factor loadings obtained under each method with those obtained from the baseline. According to the findings, the multiple imputation methods are not noticeably affected by the missing rate when compared with the baseline in both eigenvalues and factor loadings. They perform especially well when the imputed values are rounded off or rounded up. We therefore conclude that multiple imputation is better than traditional imputation for our database.

Parallel Keywords

Factor analysis; Missing at random; Multiple imputation; Traditional imputation; Missing data pattern
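
The comparison described in the abstract (raise the missing rate from one to five times, impute, then compare eigenvalues with the baseline) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the thesis's actual procedure: the data are simulated rather than taken from TEPS, the 4-point scale, the 5% base missing rate, and the function names are hypothetical, column-mean imputation stands in for the traditional methods, and a simple stochastic regression imputation repeated five times stands in for multiple imputation. It also omits the parameter draws a proper multiple imputation would include.

```python
import numpy as np

ITEM_MIN, ITEM_MAX = 1, 4  # assumed 4-point response scale (hypothetical)


def make_items(n, p, rng):
    """Simulate n respondents on p one-factor items scored 1..4 (toy data)."""
    factor = rng.normal(size=(n, 1))
    latent = 0.7 * factor + rng.normal(scale=0.7, size=(n, p))
    return np.digitize(latent, [-0.7, 0.0, 0.7]) + 1.0


def add_missing(data, rate, rng):
    """Delete cells in items 1..p-1; low scorers on item 0 skip more often (MAR)."""
    out = data.copy()
    weight = np.where(data[:, 0] <= 2, 1.5, 0.5)
    mask = rng.random((data.shape[0], data.shape[1] - 1)) < rate * weight[:, None]
    out[:, 1:][mask] = np.nan
    return out


def mean_impute(data):
    """Single imputation: replace each missing cell with its column mean."""
    return np.where(np.isnan(data), np.nanmean(data, axis=0), data)


def stochastic_impute(data, rng, rounder=np.rint):
    """One imputed copy: regression prediction plus noise, rounded and clipped."""
    filled = mean_impute(data)  # mean-filled predictors keep the sketch short
    out = data.copy()
    for j in range(data.shape[1]):
        miss = np.isnan(data[:, j])
        if not miss.any():
            continue
        X = np.column_stack([np.ones(len(data)), np.delete(filled, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X[~miss], data[~miss, j], rcond=None)
        resid = data[~miss, j] - X[~miss] @ beta
        draw = X[miss] @ beta + rng.normal(scale=resid.std(), size=miss.sum())
        out[miss, j] = np.clip(rounder(draw), ITEM_MIN, ITEM_MAX)
    return out


def eigenvalues(data):
    """Eigenvalues of the item correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]


rng = np.random.default_rng(0)
baseline = make_items(2000, 6, rng)
base_eig = eigenvalues(baseline)

results = {}
for k in range(1, 6):  # one to five times an assumed 5% base missing rate
    holed = add_missing(baseline, 0.05 * k, rng)
    single = eigenvalues(mean_impute(holed))
    # Average the eigenvalues over m = 5 imputed copies.
    multiple = np.mean(
        [eigenvalues(stochastic_impute(holed, rng)) for _ in range(5)], axis=0
    )
    results[k] = (
        float(np.abs(single - base_eig).sum()),
        float(np.abs(multiple - base_eig).sum()),
    )
    print(f"{k}x missing rate: mean={results[k][0]:.3f}  multiple={results[k][1]:.3f}")
```

Passing `rounder=np.ceil` gives a round-up variant in the spirit of the one mentioned in the abstract. Note that `np.rint` rounds exact halves to the nearest even integer, which rarely matters for continuous draws. The printed numbers depend entirely on the simulated data and say nothing about the TEPS results.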
