
The Influence of Impact on DIF Detection with the MIMIC Method for Dichotomous Items

Advisor: 蘇雅蕙

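Abstract


Differential item functioning (DIF) occurs when examinees from different groups who have the same ability perform differently on the same item. Most previous DIF studies manipulated only differences in the means of the groups' ability distributions and overlooked the possibility that, in real settings, the variances of those distributions may also differ. This study therefore examines how well the MIMIC method detects DIF when the ability distributions differ between groups. It proposes the M-STPA method, which first uses the iterative MIMIC (M-IT) method to select a set of anchor items and then tests each of the remaining items one at a time. This differs from the pure-anchor MIMIC (M-PA) method, which tests the studied items simultaneously. Because M-STPA tests items for DIF one at a time, it was expected to outperform M-PA when the groups' ability distributions differ. M-STPA was compared with methods used in previous research: the standard MIMIC (M-ST) method, the MIMIC method with scale purification (M-SP), and M-PA. The results showed the following. (1) When the two groups' ability distributions had equal variances (both equal to 1) but different means (a mean difference of 1), the Type I error rate of M-ST was severely inflated once the test contained more than 10% DIF items, and that of M-SP was severely inflated once the test contained more than 30% DIF items, whereas M-PA and M-STPA kept the Type I error rate well controlled even with as many as 40% DIF items. (2) When the two groups' ability distributions had equal means but different variances (variance differences of 0.5 and -0.5), the Type I error rates of all four methods were severely inflated. (3) When both the means and the variances differed: (i) with a mean difference of 1 (focal group mean lower than the reference group mean) and a variance difference of 0.5 (focal group variance smaller than the reference group variance), M-STPA still controlled the Type I error rate stably even when the test contained more than 40% DIF items; (ii) with a mean difference of 1 and a variance difference of -0.5, all four methods showed inflated Type I error rates, and the results of M-STPA were clearly affected by test length.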
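The simulation conditions described above (a focal group whose ability mean is lower by 1 and whose ability variance differs by 0.5 or -0.5, with a given percentage of DIF items) can be illustrated with a small data generator. This is only a minimal sketch and not the thesis's actual design: the Rasch response model, the reference distribution N(0, 1), the sample sizes, the test length, the uniform DIF size of 0.5 logits, and the function name `simulate_responses` are all assumptions made for illustration.

```python
import numpy as np

def simulate_responses(n_ref=500, n_focal=500, n_items=20, dif_fraction=0.2,
                       mean_diff=1.0, var_diff=0.5, dif_size=0.5, seed=0):
    """Generate dichotomous responses for a reference and a focal group.

    Assumptions (not taken from the source): Rasch model, reference
    ability ~ N(0, 1), uniform DIF that makes DIF items harder for the
    focal group by `dif_size` logits.
    """
    rng = np.random.default_rng(seed)

    # Focal mean is lower by mean_diff; focal variance is 1 - var_diff,
    # so var_diff=0.5 gives a smaller and var_diff=-0.5 a larger variance.
    theta_ref = rng.normal(0.0, 1.0, n_ref)
    theta_foc = rng.normal(-mean_diff, np.sqrt(1.0 - var_diff), n_focal)
    theta = np.concatenate([theta_ref, theta_foc])
    group = np.concatenate([np.zeros(n_ref, dtype=int),
                            np.ones(n_focal, dtype=int)])

    # Item difficulties and the set of DIF items (first n_dif items).
    b = rng.normal(0.0, 1.0, n_items)
    n_dif = int(round(dif_fraction * n_items))
    is_dif = np.zeros(n_items, dtype=bool)
    is_dif[:n_dif] = True

    # Difficulty shift applies only to focal examinees on DIF items.
    shift = np.outer(group, is_dif * dif_size)
    logits = theta[:, None] - b[None, :] - shift
    prob = 1.0 / (1.0 + np.exp(-logits))
    responses = (rng.random(prob.shape) < prob).astype(int)
    return responses, group, is_dif

responses, group, is_dif = simulate_responses()
```

The returned `group` vector (0 = reference, 1 = focal) is the covariate a MIMIC model would regress the latent trait and the studied item on; fitting that model is outside the scope of this sketch.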

Parallel Abstract


Differential item functioning (DIF) occurs when subgroups of test takers have equal trait levels but differ in their probabilities of a correct response. Many simulation studies have examined how well DIF detection methods flag DIF items. However, these studies have paid little attention to how a difference in ability variance between two groups affects DIF detection. The aim of this study is therefore to examine how different combinations of ability variance and ability mean in the reference and focal groups affect four multiple indicators-multiple causes (MIMIC) methods: the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), the MIMIC method with a pure anchor (M-PA), and the standard MIMIC method with a pure anchor (M-STPA). A series of simulations showed the following. (1) Under a mean difference in ability, all four methods yielded a well-controlled Type I error rate when tests did not contain any DIF items. M-ST and M-SP began to yield an inflated Type I error rate and deflated power when tests contained 20% and 40% DIF items, respectively. M-PA and M-STPA maintained the expected Type I error rate and high power even when tests contained as many as 40% DIF items. (2) A difference in ability variance inflated the Type I error rate of all four DIF detection methods. (3) When both a mean difference and a variance difference in ability existed: (i) M-STPA maintained the expected Type I error rate when the focal group had the smaller ability variance; (ii) all four DIF detection methods yielded an inflated Type I error rate when the focal group had the larger ability variance. Test length appeared to affect the performance of M-STPA.
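The two outcome measures reported above, Type I error rate and power, are typically computed from the items each method flags across simulation replications. The following is a minimal sketch under that assumption; the function name `error_rates` and the layout of the `flags` array (replications by items) are hypothetical and are not taken from the thesis.

```python
import numpy as np

def error_rates(flags, is_dif):
    """Empirical Type I error rate and power from DIF flags.

    flags  : (n_replications, n_items) array, True where an item was flagged.
    is_dif : (n_items,) array, True for items simulated with DIF.
    """
    flags = np.asarray(flags, dtype=bool)
    is_dif = np.asarray(is_dif, dtype=bool)

    # Type I error: share of DIF-free items that were flagged anyway.
    type1 = flags[:, ~is_dif].mean() if (~is_dif).any() else float("nan")
    # Power: share of true DIF items that were flagged.
    power = flags[:, is_dif].mean() if is_dif.any() else float("nan")
    return type1, power

# Two replications of a four-item test in which only item 0 has DIF.
flags = [[1, 0, 0, 1],
         [1, 0, 0, 0]]
type1, power = error_rates(flags, [1, 0, 0, 0])
```

In this toy input, one of the six DIF-free item checks is a false flag and both checks of the DIF item are hits, so `type1` is 1/6 and `power` is 1.0.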

