
A Study of the Type I Error Rates and DIF Severity Classification Results of the Mantel-Haenszel DIF Procedure

An Investigation of the Type I Error Rates and ETS DIF Classification Results for the Mantel-Haenszel DIF Procedure



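This study examined the empirical Type I error rates of the Mantel-Haenszel (MH) DIF procedure under several simulated conditions, together with its classification results for each DIF category in the ETS DIF classification system. The study used the Monte Carlo method. The manipulated factors were the sample sizes of the reference and focal groups (2 levels), the ability distributions of the two groups (3 levels), test length (3 levels), and the discrimination (6 levels) and difficulty (5 levels) of the studied items. The factors were fully crossed, giving 540 simulated conditions, each replicated 100 times, and all items were simulated as non-DIF items. The main findings were as follows. In most simulated conditions, the MH DIF procedure controlled the Type I error rate satisfactorily. However, when the ability distributions of the two groups differed markedly and test reliability was low, items with more extreme discrimination tended to produce inflated empirical Type I error rates; the inflation was especially severe for highly discriminating items and was particularly pronounced with large samples. With respect to ETS DIF severity classification, the MH procedure performed satisfactorily overall: in every condition simulated in this study, Category B and C DIF occurred at rates below 5%. Finally, based on these results, the researchers offer recommendations for practitioners to consider when interpreting and applying MH DIF analysis results.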

Parallel abstracts

This study investigated the empirical Type I error rates and ETS DIF classification results of the Mantel-Haenszel (MH) DIF procedure under various simulated conditions. Monte Carlo simulations were conducted to examine the effects of several manipulated factors on the performance of the MH procedure. The manipulated factors were sample size (2 levels) and ability distribution (3 levels) for the reference and focal groups, test length (3 levels), and the discrimination (6 levels) and difficulty (5 levels) of the studied items. Crossing these factors yielded 540 simulated conditions, with 100 replications in each condition. All studied items were simulated as non-DIF items. The results indicated that the MH procedure's control of the Type I error rate was generally satisfactory, except in a few conditions. Specifically, when the ability difference between the two comparison groups was substantial, the test was less reliable, and the sample sizes of the two groups were large, the MH procedure tended to produce inflated Type I error rates for items with very high or very low discrimination. In terms of the ETS DIF classification results, the MH procedure generally performed well: the empirical rates of Category B and C items were below 5% across the simulated conditions. Finally, implications of the findings for the interpretation and practical use of MH DIF results are discussed.
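The fully crossed design described in the abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' simulation code: the abstract gives the number of levels per factor but not their values, so levels are represented by indices, and the names `FACTOR_LEVELS`, `build_conditions`, and `empirical_type1_rate` are hypothetical. The nominal significance level of 0.05 is also an assumption, since the abstract does not state it.

```python
from itertools import product

# Number of levels per factor, as stated in the abstract.
# The actual level values are not given there, so each level is
# represented only by its index (0, 1, ...).
FACTOR_LEVELS = {
    "sample_size": 2,
    "ability_distribution": 3,
    "test_length": 3,
    "discrimination": 6,
    "difficulty": 5,
}
REPLICATIONS = 100  # replications per condition, from the abstract


def build_conditions(factor_levels):
    """Return every combination of factor-level indices as a list of dicts."""
    names = list(factor_levels)
    ranges = [range(factor_levels[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*ranges)]


def empirical_type1_rate(p_values, alpha=0.05):
    """Proportion of replications in which a non-DIF item is flagged.

    alpha=0.05 is an assumed nominal level, not taken from the abstract.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p < alpha for p in p_values) / len(p_values)


conditions = build_conditions(FACTOR_LEVELS)
print(len(conditions))                 # 2 * 3 * 3 * 6 * 5 = 540
print(len(conditions) * REPLICATIONS)  # 54000 simulated data sets
```

Because every studied item is simulated without DIF, any flag raised by the MH test in a replication is a false positive, so the rejection proportion over the 100 replications of a condition estimates that condition's empirical Type I error rate.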


