A Study of the Type I Error Rates and DIF Severity Classification Results of the Mantel-Haenszel DIF Procedure

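This study examined the empirical Type I error rates of the Mantel-Haenszel (MH) DIF procedure under several simulated conditions, along with its classification results for each category of the ETS DIF classification system. The study used the Monte Carlo method. The manipulated factors were the sample sizes of the reference and focal groups (2 levels), the ability distributions of the two groups (3 levels), test length (3 levels), and the discrimination (6 levels) and difficulty (5 levels) of the studied items. The factors were fully crossed, yielding 540 simulated conditions, each replicated 100 times, and all items were simulated as DIF-free. The main findings were as follows. In most simulated conditions, the MH DIF procedure controlled the Type I error rate satisfactorily. However, when the ability distributions of the two groups differed markedly and test reliability was low, items with more extreme discrimination tended to produce inflated empirical Type I error rates; the inflation was most severe for highly discriminating items and was especially pronounced with large samples. As for the ETS DIF severity classification, the MH procedure performed satisfactorily overall: across the simulated conditions, Category B and C classifications occurred at rates below 5%. Finally, based on these results, the author offers recommendations for practitioners to consult when interpreting and applying MH DIF results.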
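The size of the crossed design can be checked with a short sketch. The factor names below are illustrative labels, and the levels are represented only by index because the abstract does not state their values.

```python
from itertools import product

# Number of levels per manipulated factor, as stated in the abstract.
# The dictionary keys are hypothetical labels; the actual level values are not given.
levels = {
    "sample_size": 2,
    "ability_distribution": 3,
    "test_length": 3,
    "item_discrimination": 6,
    "item_difficulty": 5,
}

# Fully crossed design: one condition per combination of level indices.
conditions = list(product(*(range(n) for n in levels.values())))
n_conditions = len(conditions)

replications_per_condition = 100
total_runs = n_conditions * replications_per_condition

print(n_conditions)  # 540
print(total_runs)    # 54000
```

With 100 replications per condition, the design implies 54,000 simulated data sets in total.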

Keywords

differential item functioning (DIF); Mantel-Haenszel procedure; Type I error rate; ETS DIF classification system

Parallel Abstract

This study investigated the empirical Type I error rates and ETS DIF classification results of the Mantel-Haenszel (MH) DIF procedure under various simulated conditions. Monte Carlo simulations were conducted to examine the effects of several manipulated factors on the performance of the MH procedure. The factors were sample size (2 levels) and ability distribution (3 levels) for the reference and focal groups, test length (3 levels), and item discrimination (6 levels) and difficulty (5 levels) for the studied items. Crossing these factors produced 540 simulated conditions, with 100 replications in each. All studied items were simulated as non-DIF items. The results indicated that the MH procedure generally controlled the Type I error rate satisfactorily, except in a few conditions. Specifically, when the ability difference between the two comparison groups was substantial, the test was less reliable, and the sample sizes of the two groups were large, the MH procedure tended to produce inflated Type I error rates for items with very high or very low discrimination. In terms of the ETS DIF classification results, the MH procedure generally performed well: the empirical rates of Category B and C classifications were below 5% across the simulated conditions. Finally, implications of these findings for the interpretation and practical use of MH DIF results are discussed.
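For readers unfamiliar with the statistics involved, the following is a minimal sketch of the standard MH computations for one studied item: the common odds ratio, the MH D-DIF value on the ETS delta scale, and the continuity-corrected MH chi-square. It is not the authors' simulation code. The function names and the input layout are assumptions made for illustration, and the category rule is deliberately simplified: the full ETS rule also tests whether |MH D-DIF| is significantly greater than 1 before assigning Category C, which requires a standard error that is omitted here.

```python
import math

def mantel_haenszel(strata):
    """Compute MH statistics for one item.

    strata: list of (a, b, c, d) counts, one tuple per matched score level:
      a = reference correct, b = reference incorrect,
      c = focal correct,     d = focal incorrect.
    Returns (alpha_mh, mh_d_dif, mh_chi2).
    """
    num = den = sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in strata:
        t = a + b + c + d
        if t < 2:
            continue  # a stratum this small carries no information
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d
        num += a * d / t
        den += b * c / t
        sum_a += a
        sum_e += n_ref * m1 / t  # expected value of a under no DIF
        sum_v += n_ref * n_foc * m1 * m0 / (t * t * (t - 1))
    alpha = num / den
    d_dif = -2.35 * math.log(alpha)  # ETS delta metric
    chi2 = max(0.0, abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
    return alpha, d_dif, chi2

def ets_category(d_dif, chi2, critical=3.841):
    """Simplified ETS A/B/C rule (assumption: magnitude thresholds plus
    a single chi-square test at the .05 level with 1 degree of freedom)."""
    significant = chi2 > critical
    if abs(d_dif) < 1.0 or not significant:
        return "A"
    if abs(d_dif) >= 1.5:
        return "C"
    return "B"

# Example: equal odds in both groups at every score level, so no DIF.
no_dif = [(20, 20, 10, 10), (30, 10, 15, 5)]
alpha, d_dif, chi2 = mantel_haenszel(no_dif)
print(ets_category(d_dif, chi2))  # A
```

In a Type I error study such as this one, every studied item is DIF-free, so the empirical Type I error rate for a condition is the proportion of replications in which the chi-square exceeds the critical value.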

Parallel Keywords

differential item functioning (DIF); Mantel-Haenszel procedure; Type I error; ETS DIF classification


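Cited By

蘇旭琳 (2006). The effectiveness of DIF analysis in small-sample settings: DIF between visually impaired and general students on the mathematics section of the Basic Competence Test for Junior High School Students [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-0712200716110807

王嘉寧 (2006). Item characteristics that affect differential item functioning: The geography items of the 2001-2006 (ROC years 90-95) Basic Competence Test for Junior High School Students [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-0712200716102268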
