
Detecting differential item functioning in a framework of cognitive diagnostic measurement

Advisors: 陳柏熹, 陳學志

Abstract


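Testing for differential item functioning (DIF) is regarded as an important procedure in test development. As cognitive diagnostic assessment continues to draw attention in both applied and methodological research, DIF issues within the cognitive diagnostic measurement (CDM) framework cannot be ignored. This study has three aims. First, it proposes model-based DIF detection methods for compensatory and non-compensatory data under the cognitive diagnostic assessment framework. Second, it addresses an issue overlooked by earlier DIF research in the CDM framework: tests that are contaminated by biased items. Third, it examines more systematically the factors that may affect the performance of DIF detection methods and builds those factors into the simulation design. The parameters of the two proposed models were estimated with a Markov chain Monte Carlo (MCMC) algorithm, and their parameter recovery was compared. The Type I error rates and statistical power of the model-based method and of the nonparametric MH and LR methods were examined under different testing conditions. In addition, a purification procedure was added to the MH and LR methods to test whether item purification improves DIF detection. Finally, the fourth-grade mathematics assessment from the 2007 Trends in International Mathematics and Science Study (TIMSS) was used as an example of how the proposed methods can be applied in practice. The results showed very good parameter recovery for both proposed model-based methods. Under the same testing conditions, the model-based method outperformed MH and LR in both Type I error control and statistical power. The simulations also showed that, with cognitive diagnostic data, running DIF detection on a contaminated test without purification impairs detection and leads to wrong conclusions. Adding purification improved the Type I error control and power of MH and LR under specific conditions. Even with purification, however, these two methods did not resolve the Type I error inflation that arises when the groups' mean abilities differ greatly. Finally, compared with MH and LR, the proposed model-based method allows a finer-grained interpretation of DIF results and offers the prospect of identifying possible causes of DIF through model extensions.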

Parallel Abstract


Detection of differential item functioning (DIF) has been recognized as an important procedure, especially in test development. As cognitive diagnostic measurements (CDMs) continue to receive attention in both applied and methodological studies, DIF-related issues in the CDM framework remain a concern. This study had three objectives: first, to propose a model-based DIF detection method for compensatory and non-compensatory cognitive diagnostic data; second, to address the contaminated matching criterion, an issue that has been overlooked in past DIF studies within the CDM framework; third, to investigate additional factors that may affect DIF detection methods and to build them into the simulation design. An MCMC algorithm employing Gibbs sampling was used to estimate the two proposed models, and a simulation study examined model recovery, Type I error rates, and power under different testing conditions. For DIF detection, the model-based method was compared with the MH and LR methods. Furthermore, a purification procedure was applied to the MH and LR methods, and the purified methods were compared with the model-based method to investigate the effectiveness of the DIF detection methods. Finally, the TIMSS 2007 fourth-grade mathematics assessment was used to illustrate the implementation of the new method. Parameter recovery for the proposed models was good. The simulation results indicated that the model-based method outperformed the MH and LR methods in Type I error control and power under comparable testing conditions. Moreover, the results revealed that a biased matching criterion can affect the effectiveness of DIF detection in a framework of cognitive diagnostic measurement. The purification procedure improved the Type I error rates and power of MH and LR under specific circumstances. Finally, the model-based method had the strength of allowing a more detailed interpretation of results than the other DIF methods.
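The purification idea mentioned above can be illustrated with a minimal sketch. It assumes that "MH" refers to the Mantel-Haenszel procedure, that items are scored 0/1, and that examinees are matched on a summed score. The function names (`mh_chi2`, `mh_dif_with_purification`), the 0.05 critical value, and the stopping rule are illustrative assumptions, not the procedure actually used in the thesis, and the sketch does not implement the proposed model-based method.

```python
import numpy as np

CHI2_CRIT_05 = 3.841  # 95th percentile of the chi-square distribution, 1 df


def mh_chi2(item, group, matching):
    """Mantel-Haenszel chi-square (continuity-corrected) for one 0/1 item.

    item: 0/1 responses; group: 0 = reference, 1 = focal;
    matching: integer matching score used to form strata.
    """
    sum_a = sum_e = sum_v = 0.0
    for s in np.unique(matching):
        in_s = matching == s
        n = int(in_s.sum())
        n_ref = int((in_s & (group == 0)).sum())
        n_foc = n - n_ref
        m1 = int(item[in_s].sum())  # correct responses in the stratum
        m0 = n - m1                 # incorrect responses in the stratum
        # Strata with an empty margin carry no information and are skipped.
        if n < 2 or n_ref == 0 or n_foc == 0 or m1 == 0 or m0 == 0:
            continue
        sum_a += int(item[in_s & (group == 0)].sum())  # reference correct
        sum_e += n_ref * m1 / n                        # expected under no DIF
        sum_v += n_ref * n_foc * m1 * m0 / (n ** 2 * (n - 1))
    if sum_v == 0:
        return 0.0
    return (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v


def mh_dif_with_purification(responses, group, max_iter=10):
    """Flag DIF items, re-forming the matching score without flagged items.

    responses: (examinees x items) 0/1 array. Returns a boolean array of flags.
    """
    n_items = responses.shape[1]
    flagged = np.zeros(n_items, dtype=bool)
    for _ in range(max_iter):
        new_flagged = np.zeros(n_items, dtype=bool)
        for j in range(n_items):
            keep = ~flagged
            keep[j] = True  # the studied item stays in its own matching score
            matching = responses[:, keep].sum(axis=1)
            stat = mh_chi2(responses[:, j], group, matching)
            new_flagged[j] = stat > CHI2_CRIT_05
        if np.array_equal(new_flagged, flagged):
            break  # the flagged set is stable, so purification has converged
        flagged = new_flagged
    return flagged
```

On the first pass no item is flagged yet, so the matching score is the ordinary total score; later passes drop the items flagged so far, which is what keeps biased items from contaminating the matching criterion.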

